
Automate bulk export of vector tiles
Closed, Declined (Public)

Description

All of our maps vector data (currently stored in Cassandra) needs to be exported into large files and made available for download. This will prevent dumb page-scraping, and will simplify uploading the data to a Mapbox cluster so that their GL Studio can be used for style creation.

TBD: We might want to export it as mbtiles or as zip files
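For reference, a minimal sketch of what the mbtiles option could look like. An MBTiles file is an SQLite database with a `metadata` table and a `tiles` table, with rows numbered in the TMS scheme (y flipped relative to the usual XYZ addressing). The function name `write_mbtiles`, the input shape (an iterable of `(zoom, x, y, data)` tuples), and the assumption that tiles are stored as `pbf` blobs are illustrative only; reading the tiles out of Cassandra is not shown.

```python
import sqlite3


def write_mbtiles(path, tiles, name="export"):
    """Write (zoom, x, y, data) tuples, with y in XYZ convention, to an MBTiles file.

    Hypothetical helper: `tiles` is assumed to come from whatever reads
    the tile store; that part is out of scope here.
    """
    db = sqlite3.connect(path)
    with db:  # commits on success, rolls back on error
        db.execute("CREATE TABLE IF NOT EXISTS metadata (name TEXT, value TEXT)")
        db.execute(
            "CREATE TABLE IF NOT EXISTS tiles ("
            "zoom_level INTEGER, tile_column INTEGER, "
            "tile_row INTEGER, tile_data BLOB)"
        )
        db.execute(
            "CREATE UNIQUE INDEX IF NOT EXISTS tile_index "
            "ON tiles (zoom_level, tile_column, tile_row)"
        )
        # Assumption: the tiles are protobuf-encoded vector tiles.
        db.executemany(
            "INSERT INTO metadata VALUES (?, ?)",
            [("name", name), ("format", "pbf")],
        )
        # MBTiles uses TMS row numbering, so flip y for each zoom level.
        db.executemany(
            "INSERT OR REPLACE INTO tiles VALUES (?, ?, ?, ?)",
            ((z, x, (1 << z) - 1 - y, data) for z, x, y, data in tiles),
        )
    db.close()
```

A zip export would instead store one file per tile; the single-file SQLite layout is mainly convenient for tools that already read mbtiles.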

  • Only the last 1 or 2 copies are needed. History can be discarded
  • Estimated total download size is around 60GB
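The retention rule above could be sketched roughly as follows. This is only an illustration: the function name `prune_exports`, the `*.mbtiles` file pattern, and the use of modification time to decide which copies are newest are all assumptions, not part of any existing tooling.

```python
from pathlib import Path


def prune_exports(directory, keep=2, pattern="*.mbtiles"):
    """Delete all but the `keep` most recently modified files matching `pattern`.

    Returns the list of paths that were removed.
    """
    files = sorted(
        Path(directory).glob(pattern),
        key=lambda p: p.stat().st_mtime,
        reverse=True,  # newest first
    )
    removed = files[keep:]
    for old in removed:
        old.unlink()
    return removed
```

With `keep=2` and roughly 60GB per export, the directory would hold about 120GB at most once pruning has run.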

Event Timeline

Just adding my .02€: what sort of storage requirements are we talking about here, and what hosts would be used for generating these files?

The description is very minimal at this point. @Yurik, could you add more details? What is the expected size of those dumps, and what constraints are you aware of? What needs to happen to make this work?

Once those are exported to some location we can grab, I can set up an rsync to pick them up and plop them on the dataset hosts for public access. 60GB, or even 120GB if we keep the last couple, is fine.

Those exports do not exist yet, so all solutions are possible. There is enough space on the maps servers at this point, so we can certainly export locally on maps1001.eqiad.wmnet and rsync them from there. If that's the solution that makes sense from @ArielGlenn's point of view, let's go for it.

I'm still not entirely sure I understand the use case here. If the cost of hosting those 120GB is small enough, let's just do it and see how the use case develops. If more justification is needed, let's dig...

Mholloway subscribed.

Per discussion with @Gehel, not a big enough problem to expend limited resources on right now.