I think wikipedias have high enough traffic to always use the static service. Wikivoyage and private wikis should never use it. For all others, I think either way is fine.
I think only wikivoyage and private wikis were set up that way. Private wikis would have security issues and are much harder to set up for the service, so it was not worth the trouble.
I don't think this would be a very good idea for Wikivoyage - their usage is highly map-oriented, so adding a static page with an additional "click to load" might not be palatable to the community.
Sun, Sep 24
Thu, Sep 21
I think both of these are from the service template. CC @services group.
Wed, Sep 20
@Lydia_Pintscher I found that I can specify the list of fallbacks, but can I specify a list + "anything", which doesn't even have to be deterministic? Without it, one would have to generate a full list of all sites with every link - just like we currently have in the WDQS examples page. You don't want that :)
This is awesome, sorry I didn't know about it! Is fallback documented anywhere? (I did try it, and it does work with comma-separated site values)
Tue, Sep 19
So it seems the sources & variables file specified in /etc/tilerator/config.yaml incorrectly specifies the username/password, most likely for the postgres db. I remember @Gehel was doing some cleanup to get various test and prod boxes in sync for that - double check with him.
@Pnorman which sources config file are you using?
I think all 4 are good, but I would like @MaxSem to sign off on it too :)
@Gehel, they are unused at this point - I used data and vem to test some data stuff. I think we can delete them at this point. If I need to, I will recreate a test instance.
Should we also show this warning for embed? I have raised the timeout to 3 minutes, so it would be a bad user experience to wait out the full query timeout when the real reason the query takes so long is that the server is hanging. Something like lastModified is a very quick query, so if that fails too, it would be a good indicator that the server is down.
Agree, it should not prevent editing at all. I think you meant T=10s, N=3
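A minimal sketch of such a liveness probe, using the T=10s / N=3 parameters mentioned above. The URL and the exact request are hypothetical placeholders - the real check would hit something cheap like the lastModified query, not an arbitrary endpoint:

```python
import urllib.request
import urllib.error

def service_alive(url, timeout=10, retries=3):
    """Cheap liveness probe: if even a trivial request can't complete
    within `timeout` seconds after `retries` attempts, assume the
    server is down instead of making the user wait out the full
    3-minute query timeout. URL and parameters are illustrative."""
    for _ in range(retries):
        try:
            with urllib.request.urlopen(url, timeout=timeout):
                return True
        except (urllib.error.URLError, OSError):
            continue
    return False
```

The point is only to fail fast on an obviously dead server; a slow-but-alive server still gets the full query timeout.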
Sun, Sep 17
@Smalyshev, I force-killed it, and it dumped this (I couldn't copy all of it - just some parts):
@Smalyshev & @Gehel I'm not sure if this is the same error or a different one. Today, the service froze in a peculiar way: all queries would time out (both from clients and update ones), and Blazegraph wouldn't quit with Ctrl+C. htop shows a single blazegraph process using 100% of a single CPU, but about once every few minutes almost all CPUs would jump to 100% for about 5-10 seconds, and then go back to 0 except for that single process. The last errors in the log might be unrelated:
Fri, Sep 15
Thu, Sep 7
Wed, Sep 6
On the same topic, the language code for schema:inLanguage "en" has the same issue - there are also 60 million of them, with about 10 million being English - so that's another 0.5 GB, plus some unknown perf benefit. I wonder if it would make sense to pre-declare only the top 10 languages and top 10 wikis - I suspect it would be almost as beneficial, without having to maintain an ever-changing list.
There are 60 million isPartOf statements, and I assume that all of them have their objects as the root of a wiki. The space saving would be 9-2 = 7 bytes each, or ~0.5 GB total - not very significant, but it also eliminates a lookup for each value. I wonder how we can measure the performance benefit. I couldn't run the count(distinct ?obj) query due to timeout.
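The back-of-the-envelope arithmetic behind that ~0.5 GB figure (the 9-byte and 2-byte sizes are the rough per-object costs quoted above, not measured values):

```python
# ~60 million isPartOf statements, each object shrinking from
# ~9 bytes (generic value reference) to ~2 bytes (pre-declared id).
statements = 60_000_000
saved_bytes = statements * (9 - 2)  # 7 bytes saved per statement
saved_gb = saved_bytes / 1e9        # ~0.42 GB, i.e. roughly 0.5 GB
```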
I would like to solicit more community feedback on how useful this would be. Perhaps this is not needed at all, or not worth the hassle. As an already working example on a test server, here is a query that lists Wikidata items without French labels but with French articles, ordered by the popularity of the French articles.
Tue, Sep 5
@Mike_Peel <mapframe> is currently not enabled on enwp. You need to create a new Phabricator ticket to request it, and gather community consensus that it is needed - WMF has been enabling it on a per-request basis.
@Lydia_Pintscher, having a built-in ranking system is awesome, but that's a problem of search optimization - just like the other ticket suggests, it will be part of the search drop-down.
Sat, Sep 2
I just finished a full re-import with -Xmx=16GB, and it worked fine. I am not sure what caused the original issue. For future reference, here are the import stats (importing both Wikidata and OSM data from the same dir: 33GB total, 683 files of about the same size). Total time was about 36 hours. The first 329 files were OSM, the remaining 354 Wikidata (note how the graph starts jumping, possibly due to the per-file statement distribution in WD; OSM data is much more uniform in statements per OSM object).
Thu, Aug 31
I'm re-running the import with the default settings (-Xmx=16GB), almost 700 gz files (127GB). This time, all of the OSM data uses integer storage instead of the original strings (in "osmnode:123", the 123 is stored as a prefix + an int). Will know the results in 2 days.
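The idea of the prefix + int storage can be sketched as below. This is only an illustration of the encoding trick, not the actual Blazegraph inline-value machinery; the function names are made up:

```python
import struct

PREFIX = "osmnode:"

def encode(iri):
    """Store 'osmnode:123' as a fixed-size integer instead of the
    original string: the shared prefix is implied, and only the
    numeric suffix is kept (here as an 8-byte big-endian int)."""
    assert iri.startswith(PREFIX)
    return struct.pack('>q', int(iri[len(PREFIX):]))

def decode(blob):
    """Reconstruct the original IRI from the stored integer."""
    return PREFIX + str(struct.unpack('>q', blob)[0])
```

Long ids like "osmnode:123456789" (17 bytes as a string) shrink to a fixed 8 bytes, and comparisons become integer comparisons.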
Wed, Aug 30
@Smalyshev I agree we shouldn't implement this without a story.
@Smalyshev those functions only get the components in WGS 84 - latitude and longitude. In order to calculate the center of a set of points, I would need to convert each point to X,Y,Z - geocentric Cartesian coordinates. Afterwards, I would need to average each component, and convert the resulting avgX, avgY, and avgZ back to WGS 84's latitude & longitude.
@Smalyshev would it be possible to at least provide a few "helper" functions like convert(point) -> ?x, ?y, ?z and the reverse? This way the user could bind x,y,z variables, do the average aggregation, and use the reverse x,y,z -> point.
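The proposed conversion, aggregation, and reverse conversion can be sketched as follows. This is a unit-sphere approximation (good enough for a centroid, not a full geodetic conversion), and the function names are hypothetical, not an existing WDQS API:

```python
import math

def to_cartesian(lat_deg, lon_deg):
    """Convert WGS 84 lat/lon (degrees) to unit-sphere X, Y, Z."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

def to_latlon(x, y, z):
    """Convert Cartesian X, Y, Z back to lat/lon in degrees."""
    lon = math.atan2(y, x)
    lat = math.atan2(z, math.hypot(x, y))
    return math.degrees(lat), math.degrees(lon)

def centroid(points):
    """Average a set of (lat, lon) points via Cartesian coordinates,
    which avoids the wrap-around problems of averaging raw degrees."""
    xs, ys, zs = zip(*(to_cartesian(lat, lon) for lat, lon in points))
    n = len(points)
    return to_latlon(sum(xs) / n, sum(ys) / n, sum(zs) / n)
```

With the two helper functions exposed, the averaging step in the middle is exactly what a user could express with bind + avg aggregation in a query.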
Tue, Aug 29
Float is good enough to store longitude and latitude - precision 1-2 meters at worst. The calculations on the other hand would need to be done with doubles - most trigonometry functions use them anyway.
Mon, Aug 28
Sun, Aug 27
Further reading into the code:
Aug 27 2017
Aug 24 2017
@TheDJ I would call this a low signal-to-noise ratio - tile servers are mostly used by non-Wikipedia sites, so that little blip in internal usage is not consequential. Besides, we already saw that blip before, when GeoHack enabled tiles for enwiki. I'm still not sure why this is still being discussed; per @Gehel above, there is no performance problem there. The fact that these servers benefit unrelated 3rd-party sites but do not benefit most of Wikipedia itself is a major misuse of donated funds, and should be investigated further.
Aug 23 2017
@Smalyshev we don't need to support timezones. The problem here is that in some languages like Python, UTC time (Zulu) is produced as +00:00 instead of Z. Both forms are fine according to ISO 8601, so I think Wikidata should simply accept both to mean the same thing. Per this post and the WP article.
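The equivalence of the two spellings can be handled with a one-line normalization. Note the version caveat: Python's datetime.fromisoformat() only accepts the 'Z' suffix from 3.11 on, which is exactly the kind of asymmetry this ticket is about:

```python
from datetime import datetime, timezone

def parse_utc(ts):
    """Parse an ISO 8601 timestamp, accepting both the 'Z' and
    '+00:00' spellings of UTC (both are valid per ISO 8601).
    Normalizing 'Z' keeps this working on Python < 3.11, where
    fromisoformat() rejects the 'Z' suffix."""
    if ts.endswith('Z'):
        ts = ts[:-1] + '+00:00'
    return datetime.fromisoformat(ts)

# Both forms parse to the same aware datetime:
a = parse_utc('2017-08-23T12:00:00Z')
b = parse_utc('2017-08-23T12:00:00+00:00')
```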
@Smalyshev i added the query to the description above
Aug 22 2017
@Jonas and @Smalyshev, just FYI - I'm running the service with a few minor modifications at http://tinyurl.com/y7faq2xj (embed version). All my modifications are in github. So far I had to (code diff):
- replace hardcoded query.wikidata.org
- add a few extra namespaces
- adjust copyright text
- add a custom formatter to generate an "edit" link to OSM
Aug 21 2017
@Jdlrobson, the tile servers were ready for this (and much higher) level of traffic a year ago per multiple chats with ops. The disbanding of the team was not due to technology. This feature is actually very minor from the tile server perspective.