Wed, Mar 22
Sat, Mar 18
Wed, Mar 15
@Pnorman are you adding an index on the OSM ID for updates? Also, why an index on the geometry? I don't see a use case for that yet (it can always be added later if needed)
I suggest you use a Python script to get the data from overpass-turbo, like this one (I will upload the new version today, which includes nodes and ways). I use that script to validate that OSM's wikidata IDs match Wikidata's instance-of and possibly other properties.
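A rough sketch of what such a validation fetcher might look like, using only the standard library. The Overpass endpoint URL, area name, and function names here are illustrative assumptions, not the actual script:

```python
import json
import urllib.parse
import urllib.request

# Public Overpass API endpoint (assumed; the real script may use another instance).
OVERPASS_URL = "https://overpass-api.de/api/interpreter"

def build_query(area_name):
    """Build an Overpass QL query for nodes and ways tagged with wikidata=*."""
    return f"""
    [out:json][timeout:60];
    area["name"="{area_name}"]->.a;
    (
      node(area.a)["wikidata"];
      way(area.a)["wikidata"];
    );
    out tags;
    """

def extract_wikidata_ids(overpass_json):
    """Collect (osm_type, osm_id, wikidata_id) triples from an Overpass response."""
    return [
        (el["type"], el["id"], el["tags"]["wikidata"])
        for el in overpass_json.get("elements", [])
        if "wikidata" in el.get("tags", {})
    ]

def fetch(area_name):
    """POST the query to the Overpass API (network access required)."""
    data = urllib.parse.urlencode({"data": build_query(area_name)}).encode()
    with urllib.request.urlopen(OVERPASS_URL, data) as resp:
        return json.load(resp)
```

The extracted Q-IDs could then be checked against each item's instance-of (P31) values via the Wikidata API.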
Tue, Mar 14
@Nemo_bis, what alternative to template translation are you suggesting?
Sat, Mar 11
Sounds great. The source loader can consume multiple source files, so Tilerator could use the production file plus some additional sources.
Fri, Mar 10
In case it's needed: Kartotherian has a module to combine multiple sources based on zoom level, e.g. data for zooms 0..4 can come from one source, and 5+ from another.
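Kartotherian's actual module is configured in its source definitions; as a language-neutral illustration of the routing idea only (the source names and zoom ranges below are hypothetical), the dispatch logic amounts to:

```python
def pick_source(zoom, ranges):
    """Return the name of the source whose (min_zoom, max_zoom) range
    contains `zoom`. Ranges are inclusive on both ends.
    """
    for name, (lo, hi) in ranges.items():
        if lo <= zoom <= hi:
            return name
    raise ValueError(f"no source configured for zoom {zoom}")

# Hypothetical setup: low zooms come from a pre-generated source,
# higher zooms from a second, dynamically rendered source.
SOURCES = {
    "lowzoom": (0, 4),
    "dynamic": (5, 18),
}
```

So a request for a zoom-3 tile would be served from `lowzoom`, and a zoom-9 tile from `dynamic`.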
Thu, Mar 9
Wed, Mar 8
@Smalyshev you are right that it shouldn't be random. Instead, we could establish a well-known list of fallback languages. I would argue that Latin-based languages should come first in that list, followed by scripts ordered by their "closeness" to the Latin alphabet - e.g. if no Latin-script label is available, use the script that is closest to Latin among those with the highest number of speakers or Wikipedia readers. E.g. Russian probably before Greek, but Greek before Chinese. Or something along those lines. It really doesn't matter which order we choose, as long as there is a way to get something. Having nothing is always the worst.
Tue, Mar 7
For those who work with the data extensively, could we have an easy way to copy Wikidata IDs without navigating to them? Goal: when viewing an item, be able to quickly copy the Pnnn and Qnnn values of any statement. This means that when showing that Q or P number, it should not be a link (links are much harder to select). Thanks!
@MaxSem that extra blob of JSON can be added to the sources.prod.yaml file - it supports metadata injection.
Quite a few users have been requesting this. The Vega graphs already support this boxing mode; it just requires an extra parameter in the spec. @JGirault, what would happen if the actual image is bigger than the size you auto-detect? Will it auto-grow? I think the best way to give this option to users (literally two people asked me about it last night) is to make it opt-in instead of on by default. That way graph template authors can easily use this functionality when they design graphs in a way that works with it.
@Lydia_Pintscher let's not close it, but reassign it to hovercards as one of the requirements? Is there a tag for it?
Mon, Mar 6
Sun, Mar 5
Wed, Mar 1
Sure, all existing tech can be used for this. I would suggest first creating a table as a .tab page on Commons. That table should probably have a countryId (string) column (values like "US", "FR", ...), and you can add all sorts of other fun columns there, like the number of images uploaded, or Organizer1,2,3. Basically, think of a spreadsheet: whatever fits into a table structure, you can add there. Once the data is figured out, you can create both a graph and a table (wiki markup) from it. The graph would use the table for the list of countries to highlight (and possibly make it proportional if you want some sort of a competitive map), and the Lua modules could use that same data to generate the list of participating countries, etc.
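A sketch of what such a Data:*.tab page might contain - the column names and values here are only examples, not a decided schema:

```json
{
    "license": "CC0-1.0",
    "description": { "en": "Contest participation by country" },
    "schema": {
        "fields": [
            { "name": "countryId", "type": "string", "title": { "en": "Country code" } },
            { "name": "uploads",   "type": "number", "title": { "en": "Images uploaded" } },
            { "name": "organizer", "type": "string", "title": { "en": "Main organizer" } }
        ]
    },
    "data": [
        [ "US", 12345, "Example Org" ],
        [ "FR", 6789,  "Example Org" ]
    ]
}
```

Both the graph spec and the Lua module would then read this one page, so the country list stays in a single place.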
Tue, Feb 28
@Lydia_Pintscher, showing the description assumes that one is given for each item, which is never the case. Any time I search in Wikidata, it shows me a useless Qnnn, or at most a label, because the search does not use language fallbacks. P31/P279 have a much higher chance of having an informative label/description than the item itself, especially in the language I'm searching in.
Mon, Feb 27
Fri, Feb 24
Feb 22 2017
@Smalyshev the issue here is really about where coordinates are located. Commons datasets and mapframe/maplink tags may contain all tags, points, and shapes. Wikidata cannot contain shapes, but can contain the rest. OSM cannot contain anything that is outside its scope (like historical features, zip code areas, animal migration paths, etc). So the question is: should Wikipedia allow point coordinates to be retrieved from OSM - in other words, treat a single [longitude, latitude] coordinate pair as an object that can be referenced by a Wikidata ID - or should that pair be stored in all the other places? Geoshapes cannot be stored in Wikidata, hence it's natural for them to be stored everywhere else. In a way: should we normalize or denormalize point data? Shapes clearly should be normalized.
Feb 21 2017
This is easy enough to fix by adding the data to memcached when saving, just like we do in Graphoid. Moreover, this can be done at the JsonConfig level.
I'm a bit unsure whether there should be node coordinate support in maps. Our main use case is to prevent significant data duplication by reducing complex geometries (e.g. the outline of a country, city, or river) to a single Wikidata ID, or even better, to a SPARQL query that gets that ID. So we prevent duplication by getting the geometry from OSM. In short: OSM and .map datasets store geometries, while Wikidata and .map datasets can store data points. Note that we already have a limitation: OSM can only provide geometries, not the associated tags like names, population, etc. - all that data can only come from Wikidata. I think we should continue this split: simple data comes from wiki sources (Wikidata, .map datasets, or directly in <mapframe>/<maplink>), while complex geometries should not be duplicated and should come from OSM if available.
Feb 17 2017
@JGirault, looks awesome, thanks!
Feb 14 2017
Feb 10 2017
I just posted a question to community on how to handle language fallbacks. Also, got it to run on my machine. :)
Feb 9 2017
@Pnorman the geoshape service accesses both the line and polygon tables. If we can generate an alternative data source for shapes, that would be good (because we could also fix the bug with non-closed relations like roads and rivers).
Feb 7 2017
Feb 4 2017
Feb 3 2017
Feb 1 2017
Jan 27 2017
Jan 26 2017
Jan 25 2017
Jan 24 2017
@Pnorman I think it makes sense to add all names even if they won't be used immediately, simply because the tiles then wouldn't need to be regenerated later (we seem to have a lot of problems with that, due to disk space and the overall complexity of the switchover involved). Also, I suspect that road names might benefit as well, e.g. when roads are labeled in a non-Latin script.
@Pnorman, thanks for the in-depth notes. I do plan to work on implementing Kartotherian support for an unlimited number of languages relatively soon. How difficult would it be for you to already store that data in some JSON field for now, even though it would not be used yet? How much extra storage would it take? @TheDJ thanks for your support! @Deskana, when has the Interactive team not met the communities' announcement expectations? Have there been any complaints from the community?
Jan 23 2017
@Deskana, T154071 is actually a very simple task - it just requires an extra string to be allowed in the dataset, plus a few localization strings. Tiny patch. CC-BY might be easier than ODbL for legal reasons though - CCing @Slaporte - so maybe just do CC-BY to solve 50% of the problem.
@Legoktm I think you would have to set the priority of all maps, graphs, and datasets bugs to lowest in this case, which I suspect is not the intention. The priority of this bug is still much higher than most other map bugs; it's just that they are not being scheduled for implementation.
@Pnorman what would you recommend as the best way for OSM editors to create such objects? Or should there be a software solution on the server side?
Jan 18 2017
Jan 17 2017
We could probably even include half of January as well :)
Looks awesome! Could we add the more recent data to it too?
Jan 15 2017
I'm on the fence about whether "languages using the same script as the requested one" is the best path forward. For example, if a Russian user requests a label that exists only in Ukrainian and English, which should be preferred? I would argue that in the general case, Latin-based scripts have a greater universal appeal (at least in Russian schools the Latin script is taught very early on), but I don't know the situation in other countries.
Jan 14 2017
Jan 13 2017
It actually appears this is caused by the first-letter casing being different and not being normalized by the JsonConfig title parser.
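A simplified sketch of the kind of normalization the title parser could apply. MediaWiki's real title normalization handles locale-specific casing, namespaces, and more, so this is only an approximation of the idea:

```python
def normalize_title(title):
    """Approximate MediaWiki-style title normalization for
    first-letter-insensitive pages: convert underscores to spaces,
    trim whitespace, and uppercase only the first character.
    """
    title = title.replace("_", " ").strip()
    if not title:
        return title
    return title[0].upper() + title[1:]
```

With this, `my data.tab` and `My_data.tab` would resolve to the same page title instead of being treated as two different datasets.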
Jan 11 2017
It seems that the MW API that Graphoid uses to fetch datasets from Commons returns stale data.