This script will be useful for generating large amounts of data so that we can run performance tests and take measurements of different implementations of anything we have performance questions about. Also remember to allow a high degree of duplication in term texts across languages, types and entity types.
Description
Details
Project | Branch | Lines +/- | Subject
---|---|---|---
mediawiki/extensions/Wikibase | master | +301 -0 | Add random entities and terms generator maintenance script to repo.
Status | Assigned | Task
---|---|---
Resolved | Addshore | T208425 [EPIC] Kill the wb_terms table
Resolved | ArielGlenn | T226167 audit public tables and make sure we dump them all
Resolved | Addshore | T219175 [Mega] - Migrate data from wb_terms to new schema
Resolved | Addshore | T219120 [Checkpoint 1] Create Schema, Migration plan and Doctrine DBAL connection
Resolved | alaa_wmde | T220210 Create a script to generate lots of Items/Properties with lots of Terms
Event Timeline
While not the best from a design or flexibility perspective, I suspect the most pragmatic approach here is to just create a MW maintenance script in Wikibase (Repo?).
@JeroenDeDauw yup, that's what I did. Will push it in a moment .. thought I did already 😓 It still needs a final few lines of code.
Change 503359 had a related patch set uploaded (by Alaa Sarhan; owner: Alaa Sarhan):
[mediawiki/extensions/Wikibase@master] Add random entities and terms generator maintenance script to repo.
Yesterday, while thinking about design stuff, I randomly realized that we might not need a script like this. Can't we just use https://github.com/Wikidata/WikibaseImport to import a bunch of real entities? If that is too slow, then perhaps we can use https://github.com/JeroenDeDauw/Replicator to import JSON dumps.
@JeroenDeDauw sure we can use that too. even better actually as we will be testing with production data
@Ladsgroup if you agree with just using WikibaseImport or Replicator or whatever that already exists, please feel free to close this one as Declined ;)
WikibaseImport contains a limited number of items and properties which is good for testing but not enough. I think we should keep this maintenance script.
> WikibaseImport contains a limited number of items and properties
What does this mean? I thought WikibaseImport gets items and properties from Wikidata. How does it contain a limited number?
@JeroenDeDauw I'm testing WikibaseImport in the meanwhile .. it isn't really limited technically, and one seems to be able to import all properties and entities in given ranges (from its readme) .. though it seems to be: 1) quite slow, 2) importing everything (incl. statements and linked entities) with no way to disable it (adding to the slowness), and 3) requiring a separate Wikibase instance to import from (a dependency that one might not want to have, esp. locally).
This script generates entities with only terms attached to them (no statements yet, but could be added later with an option, say --with-statements). Those generated random entities can be used for stress tests, and maybe as fixtures for integration tests.
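To illustrate the idea (this is not the merged maintenance script, which lives in Wikibase and is not reproduced here), a minimal Python sketch of a terms-only generator might look like the following. Every name in it (`generate_entities`, `make_text_pool`, `pool_size`, the dictionary layout) is hypothetical and chosen only for illustration. It assumes duplication is achieved by drawing all term texts from a small shared pool, so the same text recurs across languages, term types and entity types.

```python
import random

# Assumed term types for illustration; not taken from the merged script.
TERM_TYPES = ("label", "description", "alias")


def make_text_pool(rng, size, word_len=8):
    """Build a small pool of random lowercase strings to draw term texts from."""
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    return [
        "".join(rng.choice(alphabet) for _ in range(word_len))
        for _ in range(size)
    ]


def generate_entities(count, languages, pool_size=50,
                      entity_types=("item", "property"), seed=0):
    """Generate `count` entities that carry only terms (no statements).

    A smaller `pool_size` means more duplicated term texts across
    languages, term types and entity types.
    """
    rng = random.Random(seed)  # seeded so a run can be reproduced
    pool = make_text_pool(rng, pool_size)
    entities = []
    for n in range(1, count + 1):
        entity_type = rng.choice(entity_types)
        prefix = "Q" if entity_type == "item" else "P"
        # One term per (language, term type) pair, text drawn from the pool.
        terms = [
            {"language": lang, "type": term_type, "text": rng.choice(pool)}
            for lang in languages
            for term_type in TERM_TYPES
        ]
        entities.append({"id": f"{prefix}{n}", "type": entity_type, "terms": terms})
    return entities
```

For example, `generate_entities(1000, ["en", "de", "ar"], pool_size=20)` would produce 1000 entities sharing at most 20 distinct term texts, which is the kind of heavy duplication the task description asks for.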
Change 503359 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Add random entities and terms generator maintenance script to repo.