Maniphest T283575

Wikidata Analytics: codebase modularization
Closed, Resolved (Public)

Assigned To

GoranSMilovanovic

Authored By

	GoranSMilovanovic
	May 25 2021, 10:31 AM

Referenced Files

None

Subscribers

GoranSMilovanovic

Lydia_Pintscher

Description

There is still some modularization that should happen in the Wikidata Analytics core codebase:

functions that work with the Wikibase API (e.g. to fetch Wikidata labels);
functions that run SPARQL/GAS programs against WDQS;
OS operations for HDFS I/O and re-composition of large files;
some hand-crafted batch operations that produce large co-occurrence matrices;
and some other miscellaneous things.

All of the above, plus anything else that is found, needs to be refactored so that all Wikidata Analytics components use one and the same set of functions.
This implies developing an internal R package to be installed and used on the Analytics Clients (the stat100* machines).
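As an illustration of the kind of shared helper the first item in the list refers to, here is a minimal sketch of batching entity IDs for label lookups through the Wikibase API's `wbgetentities` module. It is written in Python for illustration only; the task itself calls for an R package. The function name `label_request_params` and the default batch size are assumptions made for this sketch, not part of any existing Wikidata Analytics code. The sketch only builds request parameters and does not contact the API.

```python
from typing import Dict, Iterator, List

# Assumption: 50 IDs per request, the usual wbgetentities cap for
# standard (non-bot) clients. Adjust if the account has higher limits.
DEFAULT_BATCH_SIZE = 50


def label_request_params(
    ids: List[str],
    language: str = "en",
    batch_size: int = DEFAULT_BATCH_SIZE,
) -> Iterator[Dict[str, str]]:
    """Yield one wbgetentities parameter set per batch of entity IDs.

    Hypothetical helper: it only prepares the query parameters, so the
    caller decides how to send them (and how to handle errors).
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        yield {
            "action": "wbgetentities",
            "ids": "|".join(batch),   # the API expects pipe-separated IDs
            "props": "labels",        # labels only, to keep responses small
            "languages": language,
            "format": "json",
        }


if __name__ == "__main__":
    example_ids = [f"Q{i}" for i in range(1, 121)]
    for params in label_request_params(example_ids):
        print(len(params["ids"].split("|")), "IDs in this request")
```

Keeping the batching logic in a single function means every component that needs labels splits its ID lists the same way, which is the point of the refactoring described above.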

Related Objects
		Status	Subtype	Assigned	Task
		Open		None	T283568 [Epic] Wikidata Analytics Core Codebase Maintenance
		Resolved		GoranSMilovanovic	T283575 Wikidata Analytics: codebase modularization

Event Timeline

GoranSMilovanovic created this task. May 25 2021, 10:31 AM

GoranSMilovanovic moved this task from Technical Wishlist to Wikidata Analytics on the User-GoranSMilovanovic board. May 25 2021, 10:34 AM

GoranSMilovanovic moved this task from Wikidata Analytics to Prioritized on the User-GoranSMilovanovic board. Jun 1 2021, 11:55 PM

Current status:

functions that run SPARQL/GAS programs against WDQS;

Develop an R function that runs SPARQL/GAS queries against WDQS and retries up to N times (N is arbitrary) to guard against timeout errors.
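The retry idea in this status note can be sketched as follows. This is an illustration in Python, not the R function the note describes and not the {WMDEData} implementation. The names `query_with_retries`, `run_query`, `max_attempts` and `wait_seconds` are hypothetical, and the sketch assumes the caller's query function signals a timeout by raising `TimeoutError`.

```python
import time
from typing import Callable, Optional


def query_with_retries(
    run_query: Callable[[str], str],
    query: str,
    max_attempts: int = 5,
    wait_seconds: float = 0.0,
) -> Optional[str]:
    """Run a query, retrying on timeout up to max_attempts times.

    run_query is a caller-supplied function (hypothetical here) that
    sends the SPARQL/GAS program to the endpoint and raises
    TimeoutError when the request times out.
    Returns the first successful result, or None if every attempt
    timed out.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return run_query(query)
        except TimeoutError:
            # Pause before the next attempt, but not after the last one.
            if attempt < max_attempts:
                time.sleep(wait_seconds)
    return None


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky_endpoint(q: str) -> str:
        # Simulated endpoint: times out twice, then answers.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise TimeoutError("simulated timeout")
        return "result for: " + q

    print(query_with_retries(flaky_endpoint, "SELECT * WHERE { ?s ?p ?o } LIMIT 1"))
```

Returning `None` after the final failed attempt, rather than raising, lets a scheduled job log the failure and move on to its next query; whether that is the right trade-off depends on the calling component.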

Manuel unsubscribed. Jun 17 2021, 10:14 AM

GoranSMilovanovic moved this task from Prioritized to Current/Deprioritized on the User-GoranSMilovanovic board. Jul 5 2021, 10:42 PM

GoranSMilovanovic moved this task from Current/Deprioritized to Prioritized on the User-GoranSMilovanovic board. Jul 28 2021, 12:24 AM

GoranSMilovanovic mentioned this in T283568: [Epic] Wikidata Analytics Core Codebase Maintenance. Aug 22 2021, 9:03 PM

Deploying {WMDEData} with renv::install() across the WMF Analytics Clients (stat1004, stat1005, stat1006, stat1007, stat1008).

{WMDEData} (with {renv}) is now deployed across the WMF Analytics Clients.

Change 714433 had a related patch set uploaded (by GoranSMilovanovic; author: GoranSMilovanovic):

[analytics/wmde/WD/WikidataAnalytics@master] T283575

https://gerrit.wikimedia.org/r/714433

Change 714433 merged by GoranSMilovanovic:

[analytics/wmde/WD/WikidataAnalytics@master] T283575

https://gerrit.wikimedia.org/r/714433

GoranSMilovanovic mentioned this in rAWWA94c500cd8761: T283575. Aug 23 2021, 11:16 PM

The WD_percentUsage_PRODUCTION.R component of the WD Usage and Coverage system is tested and running on a regular schedule from stat1008.

Maintenance_bot removed a project: Patch-For-Review. Aug 24 2021, 12:10 AM

Change 714540 had a related patch set uploaded (by GoranSMilovanovic; author: GoranSMilovanovic):

[analytics/wmde/WD/WikidataAnalytics@master] T283575

https://gerrit.wikimedia.org/r/714540

gerritbot added a project: Patch-For-Review. Aug 24 2021, 9:28 AM

Change 714540 merged by GoranSMilovanovic:

[analytics/wmde/WD/WikidataAnalytics@master] T283575

https://gerrit.wikimedia.org/r/714540

GoranSMilovanovic mentioned this in rAWWA1ca702781cf2: T283575. Aug 26 2021, 11:21 AM

Maintenance_bot removed a project: Patch-For-Review. Aug 26 2021, 12:10 PM

Change 719229 had a related patch set uploaded (by GoranSMilovanovic; author: GoranSMilovanovic):

[analytics/wmde/WD/WikidataAnalytics@master] T283575

https://gerrit.wikimedia.org/r/719229

Change 719229 merged by GoranSMilovanovic:

[analytics/wmde/WD/WikidataAnalytics@master] T283575

https://gerrit.wikimedia.org/r/719229

GoranSMilovanovic mentioned this in rAWWAa8d2540f08db: T283575. Sep 7 2021, 9:29 AM

Maintenance_bot removed a project: Patch-For-Review. Sep 7 2021, 10:10 AM

Change 719340 had a related patch set uploaded (by GoranSMilovanovic; author: GoranSMilovanovic):

[analytics/wmde/WD/WikidataAnalytics@master] T283575

https://gerrit.wikimedia.org/r/719340

Change 719340 merged by GoranSMilovanovic:

[analytics/wmde/WD/WikidataAnalytics@master] T283575

https://gerrit.wikimedia.org/r/719340

GoranSMilovanovic mentioned this in rAWWA1dde9facaa21: T283575. Sep 7 2021, 7:24 PM

GoranSMilovanovic closed this task as Resolved. Sep 7 2021, 9:58 PM