Jun 21 2021

bennofs added a comment to T222985: Provide wikidata JSON dumps compressed with zstd.

lbzip2 decompresses in parallel as well. We use that for compression of the SQL/XML dumps.

Jun 21 2021, 6:01 AM · [DEPRECATED] wdwb-tech, Dumps-Generation, Wikidata

Jun 16 2021

bennofs added a comment to T247449: wdumps custom generated dumps storage space.

I've moved the dump files to scratch for now, hope this helps. I also manually cleaned up some big dumps.

Jun 16 2021, 3:30 PM · Data-Services, cloud-services-team (Kanban)

Mar 12 2020

bennofs added a comment to T247449: wdumps custom generated dumps storage space.

Sorry for that, I'll look into automating the cleanup.

Mar 12 2020, 12:48 AM · Data-Services, cloud-services-team (Kanban)

Mar 11 2020

Losangelosgenetics awarded T246562: 'wdumps' tool does not contain the expected www/python/src/app.py entrypoint script a Dislike token.
Mar 11 2020, 8:46 PM · Toolforge, Tools

Mar 3 2020

bennofs added a comment to T246562: 'wdumps' tool does not contain the expected www/python/src/app.py entrypoint script.

Your legacy URL ingress is also not including the same rewrite rules as webservice would create.

Are these rewrite rules documented somewhere? This was one of the troubles I had getting started with using webservice directly, in that I have no idea how ingress works and what is rewritten (is the /$TOOL name removed from the URL before passing the request to my app or is it not?). The only reference I could find to this was https://wikitech.wikimedia.org/wiki/Help:Toolforge/Web#Other_/_generic_web_servers, but that is at the end of a very long article and also mixes old grid engine and kubernetes parts.

Mar 3 2020, 1:18 AM · Toolforge, Tools

Mar 1 2020

bennofs closed T246562: 'wdumps' tool does not contain the expected www/python/src/app.py entrypoint script as Resolved.
Mar 1 2020, 8:47 PM · Toolforge, Tools
bennofs added a comment to T246562: 'wdumps' tool does not contain the expected www/python/src/app.py entrypoint script.

Thanks, I cleaned up the k8s configs and it works fine again now. I might consider switching to the webservice tooling later; however, I also like using raw k8s, since it makes it easier to understand what is going on and to modify things as necessary.

Mar 1 2020, 8:47 PM · Toolforge, Tools
bennofs added a comment to T246562: 'wdumps' tool does not contain the expected www/python/src/app.py entrypoint script.

The tool never used the Webservice command launcher. It starts its own uwsgi process using k8s. I will look into what's wrong. It definitely used to work.

Mar 1 2020, 5:02 PM · Toolforge, Tools

Sep 21 2019

bennofs added a comment to T147577: NotMaterializedException (Vocab(2):<various>) on combination of subquery, limit, triple, and label service.

This does not seem fully fixed yet: https://www.wikidata.org/wiki/Wikidata_talk:SPARQL_query_service#Possible_bug. Example from that post:

Sep 21 2019, 11:42 AM · Upstream, Discovery-ARCHIVED, Wikidata, Wikidata-Query-Service

Aug 19 2019

bennofs added a comment to T230588: Wikidata Query Service is swapping items and properties.

This query https://query.wikidata.org/#SELECT%20%3Fprop%20%3Ftype%20WHERE%20%7B%20%3Fprop%20wikibase%3ApropertyType%20%3Ftype%20FILTER%20%28CONTAINS%28STR%28%3Fprop%29%2C%22Q%22%29%20%26%26%203%21%3D1%29%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%7D still returns wrong results. Has WDQS not updated yet?
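For readability, the percent-encoded fragment of the query URL above can be decoded back into SPARQL. The following is only a minimal sketch using Python's standard library; the variable names are illustrative and not part of the original comment, and the whitespace collapsing is done purely for display.

```python
from urllib.parse import unquote

# Fragment copied from the query URL in the comment above (everything after '#').
encoded = (
    "SELECT%20%3Fprop%20%3Ftype%20WHERE%20%7B%20%3Fprop%20wikibase%3ApropertyType"
    "%20%3Ftype%20FILTER%20%28CONTAINS%28STR%28%3Fprop%29%2C%22Q%22%29%20%26%26"
    "%203%21%3D1%29%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20"
    "%20%20%20%20%20%20%7D"
)

# Percent-decode the fragment into the raw SPARQL text.
query = unquote(encoded)

# Collapse the newline and the run of spaces before the closing brace (display only).
readable = " ".join(query.split())
print(readable)
```

Decoded, the query selects every `?prop` that has a `wikibase:propertyType` and whose IRI contains the letter "Q", which is the item/property mix-up the task describes.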

Aug 19 2019, 6:12 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service

May 15 2019

bennofs added a comment to T222985: Provide wikidata JSON dumps compressed with zstd.

$ time zstdcat -v -d wikidata-20190506-all.json.bz2 | zstd > /dev/null

May 15 2019, 11:13 AM · [DEPRECATED] wdwb-tech, Dumps-Generation, Wikidata
bennofs added a comment to T222985: Provide wikidata JSON dumps compressed with zstd.

But I can do a zstd decompression -> zstd compression test.

May 15 2019, 11:08 AM · [DEPRECATED] wdwb-tech, Dumps-Generation, Wikidata
bennofs added a comment to T222985: Provide wikidata JSON dumps compressed with zstd.

I don't have enough disk space for a compression test, that's correct.

May 15 2019, 11:07 AM · [DEPRECATED] wdwb-tech, Dumps-Generation, Wikidata
bennofs added a comment to T222985: Provide wikidata JSON dumps compressed with zstd.

Now the same with zstd:

May 15 2019, 9:04 AM · [DEPRECATED] wdwb-tech, Dumps-Generation, Wikidata

May 14 2019

bennofs added a comment to T222985: Provide wikidata JSON dumps compressed with zstd.

So I tried lbzip2; here's the result (on a VM server with 2 cores at 2.1 GHz; the decompression is CPU-bound):

May 14 2019, 5:12 PM · [DEPRECATED] wdwb-tech, Dumps-Generation, Wikidata

May 10 2019

bennofs created T222985: Provide wikidata JSON dumps compressed with zstd.
May 10 2019, 10:06 PM · [DEPRECATED] wdwb-tech, Dumps-Generation, Wikidata