Fri, Mar 2
Looks like it was just cache corruption. This is weird; I don't know how it could happen short of actual filesystem corruption. I deleted the cache files and regenerated them. Looks fine now. I could add a "force purge" option, but I'm a little worried that could be abused as a DoS attack vector.
Yikes, that looks very messed up. I can rebuild the stack on labs and see if that fixes it.
Feb 3 2018
All righty! I deployed a new version that uses jsub to deploy the processing tasks on the grid. Unfortunately the -once parameter is still unreliable so I might have to add my own locking if it turns out to be a problem.
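A minimal sketch of what "my own locking" could look like, assuming a lock file on a filesystem shared by the grid nodes that honors exclusive creation. The `job_lock` name and the lock path are hypothetical and not taken from the deployed code.

```python
import os
from contextlib import contextmanager


@contextmanager
def job_lock(path):
    """Yield True if this process got the lock, False if someone else holds it.

    Hypothetical helper: relies on O_CREAT | O_EXCL failing when the
    lock file already exists.
    """
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        # Another job created the lock file first; do not run.
        yield False
        return
    try:
        # Record the owner's PID to make stale locks easier to diagnose.
        os.write(fd, str(os.getpid()).encode())
        os.close(fd)
        yield True
    finally:
        # Release the lock when the processing task is done.
        os.remove(path)
```

A task would wrap its work in `with job_lock("/path/to/task.lock") as acquired:` and return early when `acquired` is false. A job that is killed hard leaves a stale lock file behind, so a real version would need some expiry rule.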
Feb 1 2018
Hello @chasemp as a matter of fact I can. I had written code for the panoramic image viewer reprojection that utilizes the grid. I should be able to apply the same to the zoomviewer. I'll work on it over the weekend if that's fine.
Dec 28 2017
@Dispenser, backup is fine, but have you thought about making GHEL work with the new setup?
Dec 14 2017
Death blow for GHEL coordinate extraction and WikiMiniAtlas. 🙁
Sep 19 2017
I have written a script that checks the tif file integrity (using imagemagick). It has already weeded out dozens of broken files (including the Napoleon). I will put that into the crontab to run weekly.
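A rough sketch of such an integrity check, assuming ImageMagick's `identify` command is on the PATH and that a non-zero exit status marks a file as broken. The actual script's invocation is not shown here, and the function names are made up for illustration.

```python
import subprocess


def is_readable_tiff(path, run=subprocess.run):
    """Return True if ImageMagick's `identify` exits cleanly on the file.

    `run` is injectable so the check can be exercised without ImageMagick.
    """
    result = run(
        ["identify", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def find_broken(paths, check=is_readable_tiff):
    """Return the subset of paths that fail the integrity check."""
    return [p for p in paths if not check(p)]
```

The list returned by `find_broken` could then be fed to whatever purges the cache. A clean exit from `identify` does not prove that every tile decodes, so this only catches files that are damaged badly enough to fail header parsing.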
Sep 18 2017
@Shonagon, thanks for the heads up on the Napoleon image. Let me see if I can identify broken images and purge them automatically. I should probably use the method I developed for tiled 360 degree panoramics (image processing on the grid infrastructure) for the Zoomviewer, too. I think that would make it more robust.
I have deleted the cache file. Seems to work now.
Sep 14 2017
Which IE version are you using?
Sigh, zoomviewer used to be a _lot_ faster. I wonder what changed. I'll follow up on this. I see that somebody is trying to pull up the ordnance map which still needs to be preprocessed (the script should do that automatically). Let me know how that works.
Ok, running again. The webservice was hung:
Alright, I'll take a look. On mobile right now.
Please create a new task for this. A constantly open "Zoomviewer is down" task is misleading and counterproductive. You are welcome to reopen this when the zoomviewer is down again and needs attention from me (and has not been taken over by WMF yet).
I'm not away.
Sep 13 2017
Aug 29 2017
I merged and deployed the patch by @TheDJ, thanks!
Jul 7 2017
Lame. The webservice was down. A simple webservice start brought it back up. Do I really have to put in a cronjob that kicks the service once in a while?
Ok, noted. I'll investigate.
Jun 23 2017
but at least @dschwen can comment directly in this task now
Mar 20 2017
Andrew, I managed to get my existing VM running again. You can lower the quota again.
Mar 17 2017
Ok, up till now I had no pressure to get on horizon, but I need to rebuild an instance now, and being unable to log in is becoming a major showstopper for me. I'd really appreciate some help on this.
Mar 16 2017
Yeah, no change
I'll try that, but I seriously doubt this is the issue here. I use GA for a whole bunch of services and horizon is the only one that gives me grief. (and compared to https://time.is/ my phone is within one second)
Yes, it is still happening.
Ok, I think I'm back in business! Testing a bit more now.
Ok, next issue is that suddenly the column the_geom does not exist anymore in the land_polygons and coastlines tables....
Looks like the OSM data uses SRID 3857 and I compare it to a bounding box with SRID 900913.
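For illustration, a small sketch of building the bounding-box test so that both sides carry the same SRID. It assumes PostGIS's `ST_MakeEnvelope(xmin, ymin, xmax, ymax, srid)` and a geometry column stored in 3857; the `bbox_filter` helper is hypothetical and is not the code in jsontile.php.

```python
def bbox_filter(column, xmin, ymin, xmax, ymax, srid=3857):
    """Return (sql_fragment, params) for a bounding-box overlap test.

    `column` is interpolated directly, so it must be a trusted identifier.
    The coordinates and SRID are passed as query parameters.
    """
    sql = f"{column} && ST_MakeEnvelope(%s, %s, %s, %s, %s)"
    return sql, (xmin, ymin, xmax, ymax, srid)
```

The fragment goes into the WHERE clause of the tile query, with the parameters handed to the database driver. Passing the SRID explicitly keeps the envelope tagged with the same code as the imported data.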
Sure the query is here: https://github.com/dschwen/wikiminiatlas/blob/master/tiles/jsontile.php#L115
After fixing those to ST_SetSRID my PostGIS query now fails with
Nope. My stuff fails now with:
Feb 23 2017
@chasemp yes a few days downtime should be OK. I have a cache layer that should serve most of the requests.
Yes, it is used by me! I'm pulling data from that server for the client-side rendered tiles and 3D buildings in WikiMiniAtlas.
Feb 20 2017
Yes, shut down now, delete later.
Looking closer at the apache2 config, it looks like this is a debug/experimental server.
Hmmm, Yeah. I did /etc/init.d/renderd stop and apache2ctl stop and the OSM widget tile service on de.WP still functioned just as good/bad (lots of 404 at high zooms either way).
Ok, there is an apache2 with modtile, a renderd, and a (small, 34MB) postgres 9.1 database running on there.
modtile and renderd are installed from http://ppa.launchpad.net/kakrueger/osm-unstable/ubuntu/
For modtile and renderd there are trusty packages on that PPA.
I personally am not using this instance. It might be in use live for the OSM gadget on the German Wikipedia (does https://tiles.wmflabs.org/ point to that machine?)
If nobody else steps up I could try to log in and see if do-release-upgrade works there, too (it did for my instances). This will most likely require a rebuild of the map stack on that machine, though. But if the machine is lost otherwise it might just be worth the risk.
Feb 13 2017
Yeah, the pruning confused the system. I'll try to fix this.
Hey all! The _corruption_, i.e. missing tifs, is probably a result of a requested cache purge that I performed a while ago. I'll take a look.
Feb 2 2017
Done. I put the find command into a small script and added it to the crontab (via jsub).
Jan 10 2017
@Andrew I upgraded the other two instances to Xenial. Given that the upgrade was rather painless (so far... I hope all the puppet stuff is still working as intended. Do I need to let puppet know I upgraded?!) I will just continue upgrading those instances when the time comes.
I have upgraded my remaining instances.
Jan 9 2017
Phew! Thanks, yes, much better :-)
NONONONO!!!!! CAN NOT BE DELETED. I upgraded it!!!!!!
fastcci-master was successfully upgraded to 14.04LTS, please take it off the list. I'll work on fastcci-worker1 next!
Yeah, well, still not working for me. Maybe somebody could take a look.
Jan 8 2017
I tried disabling and re-enabling 2fa. Still the same error.
Jan 6 2017
I'm upgrading them to trusty, will that work?
Jan 5 2017
Ok, stupid(?) question: Can't I just do a release upgrade (do-release-upgrade) and be in the clear?
Please do not remove the fastcci or maps-wma1 instances! They are being used.
Sep 2 2016
Jul 26 2016
Yes! Many thanks!
Jul 25 2016
Uuuuaaahhhh, now I'm getting ERROR: permission denied for relation coastlines
Jul 13 2016
We are currently importing OSM data without the -K|--keep-coastlines switch. I.e. the main tables do NOT contain any coastline data. Instead we are using postprocessed coastline data in a special table. HOWEVER this data is not automatically updated, and probably hasn't been updated in about two years at all!
Jun 30 2016
What we need is called a "gnomonic" projection. Here's an example: https://mycodingwrongs.wordpress.com/2010/07/24/reprojecting-blue-marble/
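As a quick sketch of the forward gnomonic mapping on a unit sphere, using the standard textbook formulas. The function name and the degree-based interface are my own choices, not taken from the linked post.

```python
import math


def gnomonic(lat, lon, lat0, lon0):
    """Project (lat, lon), in degrees, onto the plane tangent at (lat0, lon0).

    Returns (x, y) on a unit sphere. Only points less than 90 degrees
    from the projection center can be projected.
    """
    phi, lam = math.radians(lat), math.radians(lon)
    phi0, lam0 = math.radians(lat0), math.radians(lon0)
    dlam = lam - lam0
    # Cosine of the angular distance between the point and the center.
    cos_c = (math.sin(phi0) * math.sin(phi)
             + math.cos(phi0) * math.cos(phi) * math.cos(dlam))
    if cos_c <= 1e-12:
        raise ValueError("point is 90 degrees or more from the center")
    x = math.cos(phi) * math.sin(dlam) / cos_c
    y = (math.cos(phi0) * math.sin(phi)
         - math.sin(phi0) * math.cos(phi) * math.cos(dlam)) / cos_c
    return x, y
```

A point 45 degrees east of the center on the equator lands at x = tan(45°) = 1, y = 0. Great circles map to straight lines under this projection.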
Jun 29 2016
Sounds good, what do you need from me?
Jun 17 2016
It would really be nice if somebody kept me in the loop when analytics are done to study the use of WMA. If I only find out about it because it breaks the WMA for some users, that kind of... sucks.
Looks like somebody added some logging https://meta.wikimedia.org/wiki/Schema:GeoFeatures
Hmm, I have no idea what schema.geoFeatures is. I don't think I added that. In Chrome the map still loads though.
Jun 11 2016
@notconfusing rebooting the instances took care of it. Sorry about the slow answer. I rebooted a single instance first and wanted to make sure it helped before rebooting all of them.
May 25 2016
Hey Chase, I went ahead and deleted cache entries older than 90 days.
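For reference, a sketch of an age-based purge like this one. It assumes "older than 90 days" is judged by file modification time and that the cache is a plain directory tree; both are guesses, and `purge_older_than` is a hypothetical name.

```python
import os
import time


def purge_older_than(cache_dir, max_age_days=90, now=None):
    """Delete files under cache_dir whose mtime is older than max_age_days.

    Returns the list of removed paths. `now` can be passed in for testing.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.getmtime(path) < cutoff:
                os.remove(path)
                removed.append(path)
    return removed
```

Run from cron, this keeps the cache from growing without bound. Anything that gets purged has to be regenerated on the next request.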
May 22 2016
running manually it seems to hang after
Apr 17 2016
Ok, we're back.
Ah, right libtiff.so.5: cannot open shared object file: No such file or directory, so where the frack did that go now?
I don't know what the F$^&* is wrong with labs. It is very frustrating that the level of stability it provides is rather low. It ends up creating a burden that increases with the number of projects one has. So apparently this has been broken for 4 days and only now I hear about it (might be time to invest in some better monitoring). Service is out for no good reason.
Apr 5 2016
Do you have a link/example?
Mar 30 2016
I'd love to see mobile support for Gadgets. On Commons we have a bunch of user interface enhancements that would translate to the mobile version (like the "Good Pictures" button: https://commons.wikimedia.org/wiki/Help:FastCCI).
Mar 22 2016
No further emails received. Closing. Thanks!
Mar 13 2016
Thanks, I upgraded puppet. Let's see if that makes me compliant again :-)
Just got another one
Mar 9 2016
Last nag: ~13h ago. Fixed three days ago.
Mar 3 2016
P.S.: You may need to clear your browser cache to see the new version properly...
Ok, it is live!
Man, that image is quite large. Nice. But it is pushing the envelope for the conversion process. I may need to work more on the backend to support images of that size more efficiently. I have the automatic refresh system going (the BoschTheCrucifixionOfStJulia.jpg image is still building the pyramidal TIFF). It is available by replacing index.php with index2.php (it needs more testing before I move it over).
Fae, I'm working on something. The logic that refreshes the image cache is borked. I'm reimplementing this completely. Sorry it is taking so long.
Mar 2 2016
Will take a look. Thanks for the bug report.
Mar 1 2016
Feb 29 2016
Restarted the webserver 45mins ago. The rest of the time I spent figuring out my LDAP password to reply :-)
Dec 15 2015
Jheald, the ZoomViewer currently has 86574 cached images; I would guess that quite a few of your 50000 artworks are already among them. Either way, it is not a mere drop, but not a flood either ;-)
Details are extracted on the fly and do not require any further storage.
I found and fixed that bug.
Jheald, the code is already in proxy.php to create the cached tile representation (multiresolution TIFF pyramid). Apparently it has a bug that prevents it from working :o)
Thanks Andrew. I was not aware that I disabled puppet. I'll check out what I did there.