Sat, Sep 14
Fri, Sep 13
P.S. to sum up -- Wiktionary needs just a single Lua function for the minimum viable product: getEntity('L100000') that simply returns the whole Lexeme JSON. Everything else is optional.
I have imported some Russian nouns (~20,000 so far, with more to come), plus added a link from Wiktionary to the corresponding Lexeme. I think the simplest use case for Lexemes would be to allow a Wiktionary Lua script to load a Lexeme by its ID. This would instantly make Lexemes useful to Wiktionary, because the Lua script would be able to:
- generate a table of the word forms
- generate etymology and pronunciation sections
- do the above for every lexeme if more than one is used on the page.
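To make the use case concrete, here is a rough sketch of building a word-forms table from the raw Lexeme JSON that getEntity would return. It is in Python rather than Lua only because the Lua module does not exist yet; the sample lexeme values and the grammatical-feature item IDs are illustrative placeholders, though the JSON shape (forms / representations / grammaticalFeatures) matches the Lexeme data model:

```python
# Trimmed-down Lexeme JSON, mirroring what getEntity('L100000') would return.
# Values and feature item IDs below are illustrative, not real data.
sample_lexeme = {
    "id": "L100000",
    "lemmas": {"ru": {"language": "ru", "value": "стол"}},
    "forms": [
        {
            "id": "L100000-F1",
            "representations": {"ru": {"language": "ru", "value": "стол"}},
            "grammaticalFeatures": ["Q-case", "Q-singular"],  # placeholder QIDs
        },
        {
            "id": "L100000-F2",
            "representations": {"ru": {"language": "ru", "value": "столы"}},
            "grammaticalFeatures": ["Q-case", "Q-plural"],  # placeholder QIDs
        },
    ],
}

def forms_table(lexeme, lang):
    """Build (representation, features) rows for a word-forms table."""
    return [
        (f["representations"][lang]["value"], f["grammaticalFeatures"])
        for f in lexeme["forms"]
    ]

rows = forms_table(sample_lexeme, "ru")
```

A Lua module would do the same walk over `entity.forms`, then emit wikitext table markup instead of tuples.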
Wed, Sep 11
@Anomie thx for the explanation. Several weeks ago my bot was banned for a short time because it didn't send the maxlag param. Are you saying that was a mistake, because WMF MediaWiki doesn't actually pay any attention to it? Also, would it be possible to update the documentation to indicate what a well-behaved bot should do when running against WMF servers? Thanks!
Tue, Sep 3
In theory it should be fairly straightforward to create a <graph> that outputs a single number, but that would still be an image, not text (and it might look slightly off - e.g. fuzzier, or in a different font).
Wed, Aug 21
Thanks, closing for now while waiting for the Vega team and the students.
Aug 18 2019
@Catrope thanks for tackling it! I always thought the parser cache was not persisted, so if a page doesn't get any edits in 2 months, the relevant data might not be there?
Aug 16 2019
This is awesome, thank you @TheDJ and @JeanFred! One kinda important issue -- it breaks on localized columns, e.g. Data:I18n/No_globals.tab -- the CSV outputs empty values, and Excel shows English (I think).
Aug 7 2019
Aug 4 2019
Recently someone also raised this as an issue with graphs. Has anything changed that could destabilize the mw.ext.data.get() method?
Aug 1 2019
Vega1 doesn't need it - it doesn't support external URLs. Vega2 approach sounds correct. Thx for digging into it!
@Pchelolo Graphoid first calls action=graph to get the data, but then it should also call WDQS directly using that query. You can see what that request looks like if you go to the wiki page with a graph, click edit source, and do a page preview -- your browser should make a very similar request to WDQS, except that unlike Graphoid, the browser forces a few headers like the user agent (IIRC).
Jul 29 2019
@Lea_Lacroix_WMDE not from my side - I'm a bit overbooked at the moment with my main job (elastic.co) and family. It will take me some effort to get the system running again on my laptop to see what Graphoid sends to the servers. It might be easier to track it from the server side logs if anyone has that access.
Jul 24 2019
vegaErr: Error: Load failed with response code 403. -- Vega attempts to call the Wikidata API to get the needed data, and I suspect that the API returns 403. I would look at the HTTP request Vega makes (it should be very similar to the query stored in the graph on the wiki page), and try to find it in the WDQS logs. Perhaps WDQS now blocks some HTTP requests that do not appear to originate from a browser (i.e. have fewer headers than expected)?
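For illustration, here is roughly what such a request could look like when built by hand -- a Python sketch assuming the standard query.wikidata.org/sparql endpoint; the User-Agent string is a placeholder, and a missing or generic one is exactly the kind of difference that can trigger a 403 server-side:

```python
import urllib.parse
import urllib.request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

def build_wdqs_request(sparql, user_agent="ExampleGraphService/1.0 (contact page)"):
    """Build a WDQS GET request similar to what a browser-rendered graph makes,
    but with an explicit User-Agent (placeholder value; services must set their own)."""
    url = WDQS_ENDPOINT + "?" + urllib.parse.urlencode(
        {"query": sparql, "format": "json"}
    )
    return urllib.request.Request(url, headers={
        # Browsers always send these; a bare service-side client often doesn't.
        "User-Agent": user_agent,
        "Accept": "application/sparql-results+json",
    })

req = build_wdqs_request("SELECT ?x WHERE { ?x ?p ?o } LIMIT 1")
```

Comparing these headers against what Graphoid actually sends (via server logs or a local run) would be the first thing I'd check.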
Jul 15 2019
Rest in Peace Zero, it was fun... 😢
Jul 2 2019
Thanks, that makes sense. I was only suggesting to search the logs for the same query as the one run from the in-browser page preview.
@Smalyshev the query should be identical to the one issued from the browser when you do a "page preview" with a graph that uses a WDQS query. Graphoid does exactly the same steps as the browser - essentially making all the requests and putting together a resulting image.
@Smalyshev i would love to help, but it is a bit hard to say without access to the logs or the servers
Jun 25 2019
Jun 19 2019
Jun 15 2019
@stjn I don't think there is any reliable way to compute the image height/width ahead of time without going through the full graph rendering. PHP could call out to Graphoid during the post-save, but that may require even more work than simply updating to Vega 3+ and implementing the above logic (defaulting to the autosize=fit model).
@stjn this is a fairly complex issue. I have done a lot of work with the Vega lib since then, integrating it as part of Kibana (Elasticsearch), so I can elaborate on a "better" way to address it. I will try to be brief :) Also, take a look here on how I did it in Kibana.
Jun 11 2019
sounds good, thanks @sbassett !
@sbassett sorry, unsure what you mean. Some code has already been completed -- https://github.com/nyurik/mw-graph-shared/commit/b97fd309897f701bd0db6b1d60635d0786d84887 -- making it possible to integrate the future v3+. The next step would be to create a patch for graphoid & graph ext using that shared code.
May 24 2019
May 21 2019
May 17 2019
May 12 2019
@Bawolff I was re-reading our notes in light of the resumed interest in this project. Assuming we are not yet rolling out the sandbox approach per our discussion, I have begun looking closely at the issues you raised:
@Xiaoyanghaitao if you want to target Vega 5.4.0 initially, you might as well put that version in the subject (or you could just say "latest production version")
May 9 2019
May 8 2019
Agree this would be great to have. I think this should be an API-level rather than an extension-level feature, because in many cases an extension would need to bridge the communication between inside and outside the frame. The system should allow:
- setting up a separate initial configuration blob
- setting up separate resource loading for the iframe
- designating some resources as only available via the iframe loading (to prevent accidental loading by the primary site)
- establishing a clear usage pattern for extensions to follow (this is more documentation than coding)
Do we want a separate ticket for Vega-Lite? Vega-Lite can be thought of as an "add-on" converter library that simply converts one JSON into a different JSON, without any other functionality (e.g. no XHR calls, no UI, etc). This way users can write in the much simpler Vega-Lite language, and it will be dynamically converted to full Vega.
@Bawolff is there anything in this ticket that is sensitive? There is a discussion about GSOC student tackling the Vega 3 upgrade, and they would need access to this discussion.
@Milimetric has any team at WMF decided to adopt it yet? From what I was told, there is already a GSOC student about to get started on this, and we could ask them to do some restructuring work as well. Which restructuring steps did you have in mind? BTW, it would be great if you could co-mentor this person as well (we already have @domoritz and myself signed up as mentors)
May 4 2019
Apr 19 2019
Apr 12 2019
A bot has been implemented and documented for this functionality. It needs a whole bunch of bot approvals, or a global bot flag. For now I am running it by hand for a few pages. For anyone interested in this functionality: we can now start building cross-site templates and Lua modules...
Apr 10 2019
@Milimetric I think the removal of Graphoid will be far more difficult than just keeping it. If you remove it, you will have to support multiple Vega versions on the client - not as trivial as it is for the Graphoid service (which was built with that support in mind).
It's not about "allowing", it's just a matter of me (or someone else) sitting down and implementing it :) Afterwards, I will have to document it in detail, and apply for a bot flag for my YurikBot (it has been dormant for a while now), preferably one valid on all wikis at once.
Apr 9 2019
Another, not yet mentioned consideration: there was a significant syntax change between Vega 1.5, Vega 2, and 3+. The Graph extension & Graphoid support both 1.5 and 2, but the Graph ext does it in an imperfect way: the browser only loads the latest version required, so if a page has both 1.5 and 2, it will just load 2. This only affects dynamic graphs (v2 only), but it really breaks during page preview -- only the v2 graphs are previewed if both are present. When graphs are used as images from Graphoid (regular page viewing), there are no issues.
@mobrovac @dr0ptp4kt there is a much larger issue -- performance, and it is identical to maps: when a page loads, do you want to show a map / graph right away, e.g. if it is at the top, or do you want to load a large library, load all the data for it, and only then show the page? Or show an empty box that eventually gets filled in? A live map requires Leaflet, tile images, could get data from Commons, could run a SPARQL query (potentially overloading the server), etc. The exact same issues exist for graphs - load the Vega lib, load data, run SPARQL queries, etc. So if maps need it, graphs need it, and if the servers can take both - sure. I would much rather have interactive content from the start.
Apr 8 2019
@Anomie you are right that the documentation does not say it will handle all of ISO 8601, but I do think MW should handle at least the very common 2019-04-08T04:12:45+00:00-style format (only +00:00 / -00:00, not other timezones). It is, for better or worse, quite common with Python, and supporting it shouldn't be a significant undertaking.
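For example, Python's own stdlib produces exactly this format, and the normalization for the limited +00:00 / -00:00 case is trivial (a sketch; to_mw_timestamp is a hypothetical helper, not an existing MW function):

```python
from datetime import datetime, timezone

# Python's isoformat() emits the "+00:00" suffix in question:
ts = datetime(2019, 4, 8, 4, 12, 45, tzinfo=timezone.utc).isoformat()
# ts == "2019-04-08T04:12:45+00:00"

def to_mw_timestamp(iso):
    """Normalize only the UTC offsets +00:00 / -00:00 to the 'Z' form;
    any other offset is passed through unchanged."""
    for suffix in ("+00:00", "-00:00"):
        if iso.endswith(suffix):
            return iso[: -len(suffix)] + "Z"
    return iso
```

So a bot author currently has to apply this kind of rewrite by hand before every API call; accepting the offset form server-side would remove that papercut.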
Apr 5 2019
Mar 21 2019
Mar 20 2019
Mar 16 2019
Here's a working implementation using the browser's prompt text box, based on the original idea by @Ricordisamoa. Copy it into your https://wiki.openstreetmap.org/wiki/User:___username___/common.js page. Note that the save will happen even if the user clicks Cancel, because there is no good way to abort saving without crashing. This code can be used directly as a gadget.
Mar 12 2019
Mar 8 2019
This feature has been requested several times when adding Wikibase to the OpenStreetMap wiki. I think the very first step we can already take is to make it possible for gadgets to add an edit summary string during the "save" command. This way we can experiment with the UI implementation.
Mar 3 2019
@A2093064 I think a much more intuitive behavior would be to use the "root" page, e.g. MediaWiki:talkpageheader as a fallback when a language-specific page does not exist, and use /en to override just English -- thus treating /en the same as any other language. Yes, it is possible to run a bot to generate all such pages, but ... ew :)
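A sketch of the proposed lookup order (resolve_message and the sample page store are hypothetical, purely to illustrate the fallback):

```python
def resolve_message(pages, base, lang):
    """Proposed lookup order: try the language subpage first, then fall back
    to the root page. English is treated like any other language via /en."""
    return pages.get(f"{base}/{lang}", pages.get(base))

# Hypothetical wiki page contents:
pages = {
    "MediaWiki:talkpageheader": "default header for all languages",
    "MediaWiki:talkpageheader/de": "deutscher Header",
}
```

With this scheme, a wiki only edits the root page once to cover every language, and adds subpages just for the languages it actually wants to override -- no bot-generated per-language pages needed.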
@A2093064 The talkpageheader message is set to a "-" by default - see code. This seems like a very strange behavior, especially because it makes this message useless for non-WP installations, or for any multi-lingual sites like Commons.
Mar 1 2019
Feb 28 2019
Feb 21 2019
Note that Graphoid already supports a POST approach (see the v2 Graphoid API). You can POST a graph spec to it, and it will return an image. But v2 was never actually used by any services - it is in production, just not exposed to end users. The issue with v2 is that we don't want to make the MW parser (HTML rendering) dependent on an external service because of potential (unverified) speed concerns - we need to generate some graph image URL without waiting for Graphoid to actually generate that image.
Feb 13 2019
@Umherirrender that's an interesting idea, thx. It is not exactly the same because this wouldn't give me redirect-to-redirect chains - e.g. A->B->C would only give me B->C, without A->C.
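In other words, the chains still need to be resolved transitively; a quick sketch of what I mean (final_targets is a hypothetical helper, with basic loop protection):

```python
def final_targets(redirects):
    """Given a {source: target} redirect map, resolve chains so that
    A->B->C yields A->C as well as B->C (the redirect-to-redirect case)."""
    def resolve(page, seen=()):
        # Stop on non-redirect pages and on redirect loops.
        if page in seen or page not in redirects:
            return page
        return resolve(redirects[page], seen + (page,))
    return {src: resolve(dst) for src, dst in redirects.items()}

chains = final_targets({"A": "B", "B": "C"})
```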
Feb 11 2019
@Pchelolo @Gehel I agree that Node 8 upgrade is urgent and must happen. This ticket was discussing Node 10, and mentioned that it was blocked on some dependencies, so I was proposing how to help with the difficulties of dependency upgrades, and k8s might be a good solution to that. I will be happy to chat on IRC. Thanks!!!
@Gehel is there a doc/procedure describing limits to docker deployment? E.g. would WMF have a docker repository and a well established build process that uploads to it, does it limit to the base images like "must use base Debian but not base Alpine", etc.
@MSantos and @Jhernandez, would it perhaps make things far simpler to migrate to Kubernetes/Docker? Each maps service would be able to use any version of Node, any native package, or even a different version/distribution of Linux as needed, and it would not affect any other services even if they run on the same physical hardware. Even the migration itself would be easier, because different instances of Tilerator/Kartotherian could use different versions of Node on the same machine.
Feb 7 2019
Feb 4 2019
Note that this is also possible with Sophox -- e.g. a SPARQL query can use Wikidata via federation together with all of OSM data itself (tags), plus attach the OSM geometry shapes to each wikidata ID it sees in the query result. Click "run query" under the example.
Feb 1 2019
FYI, there have been a number of discussions at OSM on how to document disputed territories. See the latest proposal.
Jan 27 2019
Jan 22 2019
Jan 19 2019
@akosiaris at the moment, yes, graphs are shown only if they are in the page-props db - i.e. generated from the last page revision. If you try to view an older version of the page, you will only see the graph image if it hasn't changed since then; otherwise you will see a broken image (because the hash would be different, and that hash would not be stored in the page props). On the other hand, you can view the older variant if you use page preview -- graphs will be rendered on the client, without using page props.
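To illustrate why old revisions break: the image URL embeds a hash of the graph spec, so any edit to the spec produces a new hash, and only the latest hash survives in page props (a sketch -- the real extension's hashing details may differ):

```python
import hashlib
import json

def graph_hash(spec):
    """Stable content hash of a graph spec (illustrative; the Graph
    extension's actual algorithm and canonicalization may differ)."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

# An edit to the spec changes the hash, so the old revision's image URL
# no longer matches anything stored in page_props:
old_spec = {"mark": "bar", "width": 200}
new_spec = {"mark": "bar", "width": 300}
```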
Jan 18 2019
The actual data table is present. Were there any related code changes in MW?
I think the first step is to save the HTML (preserving template/module parsing results). Next step - when snapshotting, switch to image permalinks. Lastly, implement the "computed blobs storage" this ticket describes, also using permalinks when snapshotting.
These things don't need to happen at the same time. Even making it possible to view proper text of the older version of an article is a good first step.
Jan 17 2019
Jan 16 2019
Most of the time, Vega is used via a template, because otherwise you have massive copy/paste of code without any benefit, and no way to fix issues or improve the appearance of all graphs en masse. Thus, per what @Anomie said - MCR is (in its current form) orthogonal to the generated content. This actually has more in common with the image thumbnail service than with MCR: content is generated from the "master" - the wiki markup - and cached for use both by the rendering service (Graphoid) and directly by the client via dynamic graph loading.
Jan 3 2019
Jan 2 2019
Are there any updates/progress on this issue? The OpenStreetMap Wikibase supports both local images and Commons, which means every OSM Wikibase "image" property is actually two properties - one of the "image" type, and another being a manually copy/pasted string naming the file on the local (OSM) wiki, for the case when the file is not on Commons. It would greatly simplify things for the OSM community if an image property could be a single "media" property, regardless of where the file is actually stored. What could be done to solve this?
Dec 20 2018
I agree with @Legoktm -- storing data blobs in page props was a hack. But to my knowledge, there is no good alternative storage for parser-generated blobs. Any system that needs independent access to those blobs would require something like this, essentially solving T119043
See related approach by Module:TNT -- it stores templatedata as a table (example). In this case <templatedata> will be dynamically generated during the parse time, and is available to every language/every wiki.