Add all prefixes defined in Blazegraph
Closed, Resolved (Public)

Description

Currently, the Jena parser fails on any query that uses a prefix it has no definition for.

We would like to include a list of all prefixes defined in Blazegraph by reusing those declared in other parts of the code, instead of listing them separately for the parser.
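A minimal sketch of the idea, not the actual patch: keep one shared prefix table and prepend a PREFIX line for every entry the query does not already declare, so the parser always sees a definition. The table `SHARED_PREFIXES` and the helper `with_shared_prefixes` are hypothetical names used only for illustration; the real change lives in wikidata/query/rdf and reuses the declarations already present in that code base.

```python
import re

# Hypothetical shared prefix table. In the real code base this would be the
# single list of prefixes reused from the Blazegraph-related code, not a copy.
SHARED_PREFIXES = {
    "wd": "http://www.wikidata.org/entity/",
    "wdt": "http://www.wikidata.org/prop/direct/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}

# Matches "PREFIX name:" declarations already present in a query.
_DECLARED = re.compile(r"\bPREFIX\s+(\w*):", re.IGNORECASE)


def with_shared_prefixes(query, prefixes=SHARED_PREFIXES):
    """Prepend a PREFIX line for each shared prefix the query does not declare."""
    declared = set(_DECLARED.findall(query))
    header = [
        f"PREFIX {name}: <{uri}>"
        for name, uri in prefixes.items()
        if name not in declared  # keep the query's own declaration if it has one
    ]
    return "\n".join(header + [query])


if __name__ == "__main__":
    print(with_shared_prefixes("SELECT ?x WHERE { ?x wdt:P31 wd:Q5 }"))
```

Skipping prefixes the query already declares keeps a user's own definition in place rather than shadowing it with the shared one.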

Event Timeline

Change 710567 had a related patch set uploaded (by AKhatun; author: AKhatun):

[wikidata/query/rdf@master] Add missing prefixes to prevent SPARQL parsing errors

https://gerrit.wikimedia.org/r/710567

Change 710567 merged by jenkins-bot:

[wikidata/query/rdf@master] Add missing prefixes to prevent SPARQL parsing errors

https://gerrit.wikimedia.org/r/710567

Change 711157 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[wikimedia/discovery/analytics@master] Bump rdf-spark-tools 0.3.77 -> 0.3.81

https://gerrit.wikimedia.org/r/711157

Change 711157 merged by jenkins-bot:

[wikimedia/discovery/analytics@master] Bump rdf-spark-tools 0.3.77 -> 0.3.81

https://gerrit.wikimedia.org/r/711157

Mentioned in SAL (#wikimedia-operations) [2021-08-10T16:34:47Z] <ebernhardson@deploy1002> Started deploy [wikimedia/discovery/analytics@d3c5363]: T287225: Bump rdf-spark-tools to 0.3.81

Mentioned in SAL (#wikimedia-operations) [2021-08-10T16:36:57Z] <ebernhardson@deploy1002> Finished deploy [wikimedia/discovery/analytics@d3c5363]: T287225: Bump rdf-spark-tools to 0.3.81 (duration: 02m 10s)

This is now deployed. The first hour of processing it applies to should be 2021-08-10T14:00Z.

@EBernhardson: Thanks for the deploy!
Can we re-run the previous jobs? All of them, preferably, since the analysis will require the earlier data.

Edit: catchup is set to true, so maybe just deleting the existing runs would be enough?

I checked the inputs, and it looks like we have data going back to 2021-05-12T00:00Z. I've cleared the relevant time frame and the jobs have started rerunning.

sudo -u analytics-search airflow clear --start_date 2021-05-12T00:00Z --end_date 2021-08-10T23:00Z process_sparql_query_hourly
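For a rough sense of how much work that rerun covers, the cleared window can be counted in hourly slots. This is only a sketch: it assumes one run per hour and that both endpoints of the range are included, which may not match exactly how Airflow treats the end date. The helper `hourly_slots` is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone


def hourly_slots(start, end):
    """Return every hourly timestamp from start to end, both inclusive."""
    slots = []
    current = start
    while current <= end:
        slots.append(current)
        current += timedelta(hours=1)
    return slots


# The window passed to `airflow clear` in the command above.
start = datetime(2021, 5, 12, 0, tzinfo=timezone.utc)
end = datetime(2021, 8, 10, 23, tzinfo=timezone.utc)

slots = hourly_slots(start, end)
print(len(slots))  # prints 2184 (91 days x 24 hours)
```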