Wikidata Analytics: codebase modularization
Closed, Resolved (Public)

Description

There is still some modularization that should happen in the Wikidata Analytics core codebase:

  • functions that work with the Wikibase API (e.g. to fetch Wikidata labels);
  • functions that run SPARQL/GAS programs against WDQS;
  • OS operations for HDFS I/O and large file re-compositions;
  • some hand-crafted batch operations that produce large co-occurrence matrices;
  • and some other miscellaneous things.

All of the above, plus anything else found along the way, needs to be refactored so that all Wikidata Analytics components use one and the same set of functions.
This implies developing an internal R package to be installed and used on the Analytics Clients (the stat100* machines).

Event Timeline

Current status:

Functions that run SPARQL/GAS programs against WDQS:

  • Develop an R function that runs SPARQL/GAS queries against WDQS and retries up to N times (N is arbitrary) to guard against timeout errors.
  • Deploy {WMDEData} with renv::install() across the WMF Analytics Clients (stat1004, stat1005, stat1006, stat1007, stat1008).
  • {WMDEData} (with {renv}) is now deployed across the WMF Analytics Clients.
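
The retry-bounded query described in the first item can be sketched roughly as follows. This is only an illustration of the idea, not the {WMDEData} implementation: the actual function is written in R and is not shown in this task, so Python is used here purely for a self-contained example. The names query_with_retries, run_query, max_attempts and wait_seconds are all hypothetical, and the endpoint is replaced by a stub that simulates timeouts so the sketch runs without network access.

```python
import time
from typing import Callable, Optional


def query_with_retries(
    run_query: Callable[[str], str],
    sparql: str,
    max_attempts: int = 5,
    wait_seconds: float = 0.0,
) -> Optional[str]:
    """Call run_query(sparql) at most max_attempts times.

    Returns the first successful result, or None if every attempt
    raised TimeoutError. All names here are hypothetical.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return run_query(sparql)
        except TimeoutError:
            # Pause before the next attempt, but not after the last one.
            if attempt < max_attempts and wait_seconds > 0:
                time.sleep(wait_seconds)
    return None


def make_flaky_endpoint(failures: int):
    """Build a stand-in for a WDQS call that times out `failures` times
    and then succeeds (an assumption made for demonstration only)."""
    state = {"calls": 0}

    def run_query(sparql: str) -> str:
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TimeoutError("simulated WDQS timeout")
        return '{"results": {"bindings": []}}'

    return run_query, state


if __name__ == "__main__":
    endpoint, state = make_flaky_endpoint(failures=2)
    result = query_with_retries(endpoint, "SELECT * WHERE { ?s ?p ?o } LIMIT 1",
                                max_attempts=3)
    print(result is not None, state["calls"])
```

Passing the query runner in as an argument keeps the retry logic independent of the HTTP client, which is one way a shared package could reuse the same guard for both plain SPARQL and GAS programs.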

Change 714433 had a related patch set uploaded (by GoranSMilovanovic; author: GoranSMilovanovic):

[analytics/wmde/WD/WikidataAnalytics@master] T283575

https://gerrit.wikimedia.org/r/714433

Change 714433 merged by GoranSMilovanovic:

[analytics/wmde/WD/WikidataAnalytics@master] T283575

https://gerrit.wikimedia.org/r/714433

Change 714540 had a related patch set uploaded (by GoranSMilovanovic; author: GoranSMilovanovic):

[analytics/wmde/WD/WikidataAnalytics@master] T283575

https://gerrit.wikimedia.org/r/714540

Change 714540 merged by GoranSMilovanovic:

[analytics/wmde/WD/WikidataAnalytics@master] T283575

https://gerrit.wikimedia.org/r/714540

Change 719229 had a related patch set uploaded (by GoranSMilovanovic; author: GoranSMilovanovic):

[analytics/wmde/WD/WikidataAnalytics@master] T283575

https://gerrit.wikimedia.org/r/719229

Change 719229 merged by GoranSMilovanovic:

[analytics/wmde/WD/WikidataAnalytics@master] T283575

https://gerrit.wikimedia.org/r/719229

Change 719340 had a related patch set uploaded (by GoranSMilovanovic; author: GoranSMilovanovic):

[analytics/wmde/WD/WikidataAnalytics@master] T283575

https://gerrit.wikimedia.org/r/719340

Change 719340 merged by GoranSMilovanovic:

[analytics/wmde/WD/WikidataAnalytics@master] T283575

https://gerrit.wikimedia.org/r/719340