
Jan 6 2022

AKhatun_WMF moved T293631: Get estimates for splitting other large subgraphs from Wikidata from Analysis to Current work on the Wikidata-Query-Service board.
Jan 6 2022, 5:45 AM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
AKhatun_WMF moved T288257: Get estimates for size of astronomical objects and queries in Wikidata graph from Analysis to Current work on the Wikidata-Query-Service board.
Jan 6 2022, 5:44 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF moved T293631: Get estimates for splitting other large subgraphs from Wikidata from Incoming to Needs Reporting on the Discovery-Search (Current work) board.
Jan 6 2022, 5:39 AM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
AKhatun_WMF added a project to T293631: Get estimates for splitting other large subgraphs from Wikidata: Discovery-Search (Current work).

With the completion of T293632 and T293636, this task is complete.

Jan 6 2022, 5:39 AM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
AKhatun_WMF moved T293628: Get baseline measurements/expectations for splitting various subgraphs from Wikidata from Incoming to Needs Reporting on the Discovery-Search (Current work) board.
Jan 6 2022, 5:37 AM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
AKhatun_WMF added a project to T293628: Get baseline measurements/expectations for splitting various subgraphs from Wikidata: Discovery-Search (Current work).
Jan 6 2022, 5:37 AM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
AKhatun_WMF moved T293628: Get baseline measurements/expectations for splitting various subgraphs from Wikidata from Analysis to Current work on the Wikidata-Query-Service board.
Jan 6 2022, 5:36 AM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
AKhatun_WMF moved T293628: Get baseline measurements/expectations for splitting various subgraphs from Wikidata from incoming to in progress on the Wikidata board.

With the completion of all subtasks, this task is complete.

Jan 6 2022, 5:35 AM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
AKhatun_WMF moved T293636: Identify and analyze queries that touch on various large subgraphs from In Progress to Needs Reporting on the Discovery-Search (Current work) board.

The analysis was completed and documented here: Wikidata_Subgraph_Query_Analysis

Jan 6 2022, 5:33 AM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata

Nov 15 2021

AKhatun_WMF claimed T258834: Create a Commons equivalent of the wikidata_entity table in the Data Lake.
Nov 15 2021, 4:15 AM · Data-Engineering-Kanban, Patch-For-Review, Data-Engineering, Wikidata, Wikidata-Query-Service, Structured-Data-Backlog, Product-Analytics

Nov 11 2021

AKhatun_WMF moved T258834: Create a Commons equivalent of the wikidata_entity table in the Data Lake from Analysis to Current work on the Wikidata-Query-Service board.
Nov 11 2021, 9:21 AM · Data-Engineering-Kanban, Patch-For-Review, Data-Engineering, Wikidata, Wikidata-Query-Service, Structured-Data-Backlog, Product-Analytics
AKhatun_WMF moved T293636: Identify and analyze queries that touch on various large subgraphs from Analysis to Current work on the Wikidata-Query-Service board.
Nov 11 2021, 9:21 AM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata

Nov 9 2021

AKhatun_WMF moved T291205: Analysis: Property usage by items' P31 from Incoming to Needs Reporting on the Discovery-Search (Current work) board.
Nov 9 2021, 1:27 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF added a project to T291205: Analysis: Property usage by items' P31: Discovery-Search (Current work).

Some analysis was done here:

Nov 9 2021, 1:27 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF moved T293632: Analysis of large subgraphs in Wikidata from Incoming to Needs Reporting on the Discovery-Search (Current work) board.
Nov 9 2021, 1:07 AM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
AKhatun_WMF added a comment to T293632: Analysis of large subgraphs in Wikidata.

The analysis was completed and documented here: https://wikitech.wikimedia.org/wiki/User:AKhatun/Wikidata_Subgraph_Analysis

Nov 9 2021, 1:06 AM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata

Nov 8 2021

AKhatun_WMF added a comment to T295188: Create aggregate list of potential Blazegraph data deletion sources in case of catastrophic failure.
Nov 8 2021, 9:30 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF added a comment to T295188: Create aggregate list of potential Blazegraph data deletion sources in case of catastrophic failure.
Nov 8 2021, 8:48 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service

Oct 19 2021

AKhatun_WMF added a comment to T288264: Get estimates for all Wikidata statements of a specific datatype.

Basically Wikidata's Properties have a datatype.

Ah, datatype of properties.

I am not seeing that in the analysis you linked but maybe I am overlooking something.

The one I listed is for datatype of objects, so you didn't miss anything.
Thank you for clarifying! It should be fairly easy to find out as well :)

Oct 19 2021, 4:20 PM · Wikidata, Wikidata-Query-Service

Oct 18 2021

AKhatun_WMF updated subscribers of T288264: Get estimates for all Wikidata statements of a specific datatype.

@Lydia_Pintscher
Is this ticket asking for counts of the various datatypes used in Wikidata, both URIs and literals?
Does wikitech:User:AKhatun/Wikidata_Basic_Analysis#Object help?

Oct 18 2021, 5:11 PM · Wikidata, Wikidata-Query-Service
AKhatun_WMF moved T293632: Analysis of large subgraphs in Wikidata from Analysis to Current work on the Wikidata-Query-Service board.
Oct 18 2021, 2:40 PM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
AKhatun_WMF moved T293628: Get baseline measurements/expectations for splitting various subgraphs from Wikidata from Incoming to Analysis on the Wikidata-Query-Service board.
Oct 18 2021, 2:39 PM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
AKhatun_WMF moved T293631: Get estimates for splitting other large subgraphs from Wikidata from Incoming to Analysis on the Wikidata-Query-Service board.
Oct 18 2021, 2:38 PM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
AKhatun_WMF moved T293632: Analysis of large subgraphs in Wikidata from Incoming to Analysis on the Wikidata-Query-Service board.
Oct 18 2021, 2:38 PM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
AKhatun_WMF moved T293636: Identify and analyze queries that touch on various large subgraphs from Incoming to Analysis on the Wikidata-Query-Service board.
Oct 18 2021, 2:38 PM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
AKhatun_WMF created T293636: Identify and analyze queries that touch on various large subgraphs.
Oct 18 2021, 2:23 PM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
AKhatun_WMF updated the task description for T293632: Analysis of large subgraphs in Wikidata.
Oct 18 2021, 2:20 PM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
AKhatun_WMF created T293632: Analysis of large subgraphs in Wikidata.
Oct 18 2021, 2:18 PM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
AKhatun_WMF created T293631: Get estimates for splitting other large subgraphs from Wikidata.
Oct 18 2021, 2:12 PM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
AKhatun_WMF updated the task description for T293628: Get baseline measurements/expectations for splitting various subgraphs from Wikidata.
Oct 18 2021, 2:01 PM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
AKhatun_WMF removed a subtask for T282790: [EPIC] Get estimates for dropping data from Wikidata in case of Blazegraph catastrophic failure: T288257: Get estimates for size of astronomical objects and queries in Wikidata graph.
Oct 18 2021, 1:58 PM · Epic, Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF removed a parent task for T288257: Get estimates for size of astronomical objects and queries in Wikidata graph: T282790: [EPIC] Get estimates for dropping data from Wikidata in case of Blazegraph catastrophic failure.
Oct 18 2021, 1:58 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF removed a subtask for T282790: [EPIC] Get estimates for dropping data from Wikidata in case of Blazegraph catastrophic failure: T281854: Get baseline measurements/expectations for splitting scholarly articles from Wikidata.
Oct 18 2021, 1:58 PM · Epic, Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF removed a parent task for T281854: Get baseline measurements/expectations for splitting scholarly articles from Wikidata: T282790: [EPIC] Get estimates for dropping data from Wikidata in case of Blazegraph catastrophic failure.
Oct 18 2021, 1:58 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF added a subtask for T293628: Get baseline measurements/expectations for splitting various subgraphs from Wikidata: T288257: Get estimates for size of astronomical objects and queries in Wikidata graph.
Oct 18 2021, 1:58 PM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
AKhatun_WMF added a parent task for T288257: Get estimates for size of astronomical objects and queries in Wikidata graph: T293628: Get baseline measurements/expectations for splitting various subgraphs from Wikidata.
Oct 18 2021, 1:58 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF added a subtask for T293628: Get baseline measurements/expectations for splitting various subgraphs from Wikidata: T281854: Get baseline measurements/expectations for splitting scholarly articles from Wikidata.
Oct 18 2021, 1:57 PM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
AKhatun_WMF added a parent task for T281854: Get baseline measurements/expectations for splitting scholarly articles from Wikidata: T293628: Get baseline measurements/expectations for splitting various subgraphs from Wikidata.
Oct 18 2021, 1:57 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF added a subtask for T293628: Get baseline measurements/expectations for splitting various subgraphs from Wikidata: T291205: Analysis: Property usage by items' P31.
Oct 18 2021, 1:54 PM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
AKhatun_WMF added a parent task for T291205: Analysis: Property usage by items' P31: T293628: Get baseline measurements/expectations for splitting various subgraphs from Wikidata.
Oct 18 2021, 1:54 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF added a subtask for T282790: [EPIC] Get estimates for dropping data from Wikidata in case of Blazegraph catastrophic failure: T293628: Get baseline measurements/expectations for splitting various subgraphs from Wikidata.
Oct 18 2021, 1:52 PM · Epic, Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF added a parent task for T293628: Get baseline measurements/expectations for splitting various subgraphs from Wikidata: T282790: [EPIC] Get estimates for dropping data from Wikidata in case of Blazegraph catastrophic failure.
Oct 18 2021, 1:52 PM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
AKhatun_WMF created T293628: Get baseline measurements/expectations for splitting various subgraphs from Wikidata.
Oct 18 2021, 1:51 PM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata

Oct 4 2021

AKhatun_WMF added a comment to T292306: [DSE Hackathon] Sounds of the Commons: Neural Audio Mashups.

Interested in playing with autoencoders.

write a script that will randomly combine these audio files and sample the latent spaces of their combined embeddings to create new machine-generated audio files

Does this entail that we train the autoencoder on the dataset we curated from Commons and then have it generate a sample audio file from random numbers? Maybe I'm a bit confused about what 'randomly combining' audio files means here.

Oct 4 2021, 11:47 AM · Machine-Learning-Team

Sep 27 2021

AKhatun_WMF moved T291205: Analysis: Property usage by items' P31 from Analysis to Current work on the Wikidata-Query-Service board.
Sep 27 2021, 10:28 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF claimed T291205: Analysis: Property usage by items' P31.
Sep 27 2021, 10:27 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service

Sep 24 2021

AKhatun_WMF added a comment to T288257: Get estimates for size of astronomical objects and queries in Wikidata graph.

Astronomical objects are structured hierarchically, so not everything is a direct instance of Q6999 (unlike scholarly articles).

Sep 24 2021, 12:08 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF updated the task description for T291205: Analysis: Property usage by items' P31.
Sep 24 2021, 11:56 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF moved T291190: Determine cost-benefit of doing vertical data slicing on WDQS from Incoming to Needs Reporting on the Discovery-Search (Current work) board.
Sep 24 2021, 11:44 AM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata Analytics, Wikidata
AKhatun_WMF edited projects for T291190: Determine cost-benefit of doing vertical data slicing on WDQS, added: Discovery-Search (Current work); removed Discovery-Search.
Sep 24 2021, 11:43 AM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata Analytics, Wikidata
AKhatun_WMF added a project to T291190: Determine cost-benefit of doing vertical data slicing on WDQS: Discovery-Search.
Sep 24 2021, 11:40 AM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata Analytics, Wikidata
AKhatun_WMF moved T291190: Determine cost-benefit of doing vertical data slicing on WDQS from Analysis to Current work on the Wikidata-Query-Service board.
Sep 24 2021, 11:32 AM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata Analytics, Wikidata
AKhatun_WMF added a comment to T291190: Determine cost-benefit of doing vertical data slicing on WDQS.

Query analysis report for some vertical slices of Wikidata: Wikidata_Vertical_Analysis#Query_Analysis
Summary: Wikidata_Vertical_Analysis#TL;DR

Sep 24 2021, 11:31 AM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata Analytics, Wikidata
AKhatun_WMF moved T281854: Get baseline measurements/expectations for splitting scholarly articles from Wikidata from In Progress to Needs Reporting on the Discovery-Search (Current work) board.
Sep 24 2021, 11:25 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF added a comment to T281854: Get baseline measurements/expectations for splitting scholarly articles from Wikidata.

Here is the analysis done on scholarly articles in Wikidata and WDQS queries related to them: https://wikitech.wikimedia.org/wiki/User:AKhatun/Wikidata_Scholarly_Articles_Subgraph_Analysis

Sep 24 2021, 11:23 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF updated the task description for T281854: Get baseline measurements/expectations for splitting scholarly articles from Wikidata.
Sep 24 2021, 11:14 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service

Sep 17 2021

AKhatun_WMF added a subtask for T282790: [EPIC] Get estimates for dropping data from Wikidata in case of Blazegraph catastrophic failure: T291190: Determine cost-benefit of doing vertical data slicing on WDQS.
Sep 17 2021, 7:31 AM · Epic, Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF added a parent task for T291190: Determine cost-benefit of doing vertical data slicing on WDQS: T282790: [EPIC] Get estimates for dropping data from Wikidata in case of Blazegraph catastrophic failure.
Sep 17 2021, 7:31 AM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata Analytics, Wikidata

Aug 26 2021

AKhatun_WMF created T289754: Triple level deduplication.
Aug 26 2021, 6:02 AM · Wikidata, Wikidata-Query-Service
AKhatun_WMF created T289753: Optimize deduplication of triples when loading into wikibase RDF dumps.
Aug 26 2021, 5:25 AM · Wikidata, Wikidata-Query-Service

Aug 10 2021

AKhatun_WMF added a comment to T287225: Add all prefixes defined in Blazegraph.

This is now deployed, the first hour of processing it applies to should be 2021-08-10T14:00Z

Aug 10 2021, 4:52 PM · Patch-For-Review, Discovery-Search (Current work), Wikidata, Wikidata-Query-Service

Aug 9 2021

So9q awarded T281854: Get baseline measurements/expectations for splitting scholarly articles from Wikidata a Burninate token.
Aug 9 2021, 12:10 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service

Aug 6 2021

AKhatun_WMF updated the task description for T281854: Get baseline measurements/expectations for splitting scholarly articles from Wikidata.
Aug 6 2021, 1:18 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF added a comment to T281854: Get baseline measurements/expectations for splitting scholarly articles from Wikidata.

@AKhatun_WMF, when you write "authors connected to other subgraphs", do you mean subgraphs within Wikidata (so, excluding external identifiers), or also graphs from other resources part of, for example, the Linked Open Data Cloud?

Aug 6 2021, 1:18 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF updated the task description for T281854: Get baseline measurements/expectations for splitting scholarly articles from Wikidata.
Aug 6 2021, 12:49 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF moved T287225: Add all prefixes defined in Blazegraph from Analysis to Current work on the Wikidata-Query-Service board.
Aug 6 2021, 6:35 AM · Patch-For-Review, Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF moved T281854: Get baseline measurements/expectations for splitting scholarly articles from Wikidata from Analysis to Current work on the Wikidata-Query-Service board.
Aug 6 2021, 6:35 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service

Jul 26 2021

AKhatun_WMF moved T287225: Add all prefixes defined in Blazegraph from Incoming to Analysis on the Wikidata-Query-Service board.
Jul 26 2021, 11:25 AM · Patch-For-Review, Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF claimed T286436: Deduplicate triples when loading the wikibase RDF dumps into hive.
Jul 26 2021, 11:24 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF added a comment to T286436: Deduplicate triples when loading the wikibase RDF dumps into hive.

Joseph will suggest an optimization for this task when he is back. For now, a simple .distinct() has been applied to the Spark DataFrame to facilitate analysis of the Wikidata dumps.
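The effect of that row-level deduplication can be illustrated on a tiny in-memory sample. This is a plain-Python sketch, not the Spark job itself: the `(subject, predicate, object)` tuple layout and the sample values are assumptions made for illustration, and unlike this sketch, a distributed `.distinct()` does not promise to preserve row order.

```python
from typing import Iterable, List, Tuple

# Hypothetical layout: one RDF triple as (subject, predicate, object).
Triple = Tuple[str, str, str]


def distinct_triples(triples: Iterable[Triple]) -> List[Triple]:
    """Drop exact duplicate triples, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for triple in triples:
        if triple not in seen:
            seen.add(triple)
            unique.append(triple)
    return unique


# Illustrative sample only; not taken from the real dumps.
sample = [
    ("wd:Q42", "wdt:P31", "wd:Q5"),
    ("wd:Q42", "wdt:P31", "wd:Q5"),  # exact duplicate, dropped
    ("wd:Q42", "rdfs:label", '"Douglas Adams"@en'),
]
deduplicated = distinct_triples(sample)
```

Only triples that match in all three positions are treated as duplicates here, which is the same criterion a whole-row distinct uses.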

Jul 26 2021, 11:23 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service

Jul 24 2021

AKhatun_WMF added a comment to T281854: Get baseline measurements/expectations for splitting scholarly articles from Wikidata.

Some of the wanted statistics are listed on Scholia, currently on the front page: https://scholia.toolforge.org/ (UPDATE: now here: https://scholia.toolforge.org/statistics)

"percentage, number of Wikidata entities that are scholarly article":
37,246,721 scholarly articles, so 37/97 ≈ 38% are scholarly articles.

Jul 24 2021, 10:24 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service

Jul 23 2021

AKhatun_WMF created T287225: Add all prefixes defined in Blazegraph.
Jul 23 2021, 4:26 AM · Patch-For-Review, Discovery-Search (Current work), Wikidata, Wikidata-Query-Service

Jul 19 2021

AKhatun_WMF added a comment to T285465: Document and analyze the number of parsing errors for parsed WDQS queries.

@dcausse: Yes, just adding the prefix declaration in Jena parser is what we want to do.
@JAllemandou: Should I add the other prefixes as well?

Jul 19 2021, 2:04 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service

Jul 16 2021

AKhatun_WMF updated subscribers of T285465: Document and analyze the number of parsing errors for parsed WDQS queries.
Jul 16 2021, 1:35 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF added a comment to T285465: Document and analyze the number of parsing errors for parsed WDQS queries.

@JAllemandou @dcausse

  • For June, the average daily successful parsing rate was ~85%, ranging from 75% to 90%. Note that this only includes queries with status 200 and 500.
  • 11% of the distinct queries ran into errors related to prefixes. The number of distinct queries affected by each prefix is shown below. After adding the first 4 prefixes (mwapi, geof, foaf, gas) to the query processor's prefix list, the average daily successful parsing rate rose to ~95% (93% to 97%). A few prefixes were slightly off (data instead of wdata, ref instead of wdref). These account for very few queries, but I fixed them nevertheless.
prefix_name    count
mwapi          7419357
geof           54183
foaf           17198
gas            13753
wds            2761
wdv            216
fn             62
dc             50
mediawiki      23
wdref          22
wdata          3

Total distinct queries: 67467327

  • Other errors included:
    • Variable used when already in-scope. This happened when the same variable was reused in a query. Such queries return results without problems when tested in WDQS. These form 2% of the errors in distinct queries.
    • Another notable error involves the WITH clause. Although it runs well in WDQS, the parser doesn't accept it. These form 2.5% of the distinct queries.
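The 11% figure can be cross-checked against the per-prefix counts listed in the comment. The sketch below copies those counts and the distinct-query total as given; summing them assumes each distinct query is counted under only one prefix, which the comment does not state.

```python
# Counts copied from the per-prefix list in the comment.
prefix_error_counts = {
    "mwapi": 7419357,
    "geof": 54183,
    "foaf": 17198,
    "gas": 13753,
    "wds": 2761,
    "wdv": 216,
    "fn": 62,
    "dc": 50,
    "mediawiki": 23,
    "wdref": 22,
    "wdata": 3,
}
total_distinct_queries = 67467327

# Assumption: a distinct query appears under at most one prefix.
prefix_errors = sum(prefix_error_counts.values())
top_four = sum(prefix_error_counts[p] for p in ("mwapi", "geof", "foaf", "gas"))

prefix_share = prefix_errors / total_distinct_queries
top_four_share = top_four / total_distinct_queries

print(f"prefix-related errors: {prefix_errors} ({prefix_share:.1%} of distinct queries)")
print(f"first four prefixes:   {top_four} ({top_four_share:.1%} of distinct queries)")
```

Under that assumption, the first four prefixes account for nearly all prefix-related errors, which is consistent with adding only those four recovering most of the failed parses.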
Jul 16 2021, 1:34 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service

Jul 13 2021

AKhatun_WMF moved T285465: Document and analyze the number of parsing errors for parsed WDQS queries from Analysis to Current work on the Wikidata-Query-Service board.
Jul 13 2021, 10:22 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF claimed T285465: Document and analyze the number of parsing errors for parsed WDQS queries.
Jul 13 2021, 10:22 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service

Jul 11 2021

AKhatun_WMF updated the task description for T286410: Requesting update to SSH key for Aisha Khatun.
Jul 11 2021, 11:29 AM · SRE, SRE-Access-Requests
AKhatun_WMF created T286410: Requesting update to SSH key for Aisha Khatun.
Jul 11 2021, 11:25 AM · SRE, SRE-Access-Requests
AKhatun_WMF added a comment to T280967: Requesting access to Wikimedia Analytics Data for Aisha Khatun.

Thanks!

Jul 11 2021, 11:05 AM · SRE, SRE-Access-Requests
AKhatun_WMF added a comment to T280967: Requesting access to Wikimedia Analytics Data for Aisha Khatun.

Hi @akosiaris, I had to do a fresh OS install and lost my SSH keys. Is it possible to change the key so I can regain access? Should I post a new public key here?

Jul 11 2021, 9:02 AM · SRE, SRE-Access-Requests
AKhatun_WMF reopened T280967: Requesting access to Wikimedia Analytics Data for Aisha Khatun as "Open".
Jul 11 2021, 9:00 AM · SRE, SRE-Access-Requests

Jun 23 2021

AKhatun_WMF added a comment to T282790: [EPIC] Get estimates for dropping data from Wikidata in case of Blazegraph catastrophic failure.

Some of the vertical analyses were done as part of familiarizing myself with Wikidata. See the findings in Wikidata_Vertical_Analysis. Will get back to this ticket when done with T282139.

Jun 23 2021, 9:16 AM · Epic, Discovery-Search (Current work), Wikidata, Wikidata-Query-Service

Jun 22 2021

AKhatun_WMF moved T282139: Provide a quantitative description of the Wikidata-triples dataset from Incoming to In Progress on the Discovery-Search (Current work) board.
Jun 22 2021, 8:23 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF claimed T282139: Provide a quantitative description of the Wikidata-triples dataset.
Jun 22 2021, 7:47 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF moved T273854: Automate regular WDQS query parsing and data-extraction from Current work to Analysis on the Wikidata-Query-Service board.
Jun 22 2021, 7:47 AM · Discovery-Search (Current work), Wikidata-Query-Service, Analytics, Wikidata
AKhatun_WMF moved T282139: Provide a quantitative description of the Wikidata-triples dataset from Analysis to Current work on the Wikidata-Query-Service board.
Jun 22 2021, 7:46 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
AKhatun_WMF moved T283258: Provide a job regularly deleting wdqs processed query after 90 days from Current work to Analysis on the Wikidata-Query-Service board.
Jun 22 2021, 7:46 AM · Discovery-Search (Current work), Patch-For-Review, Wikidata, Wikidata-Query-Service

Jun 21 2021

AKhatun_WMF committed rWDAN58fc22bef150: Airflow dag to extract and process sparql queries.
Jun 21 2021, 5:43 PM

Jun 4 2021

AKhatun_WMF triaged T283256: Extract operator/nodes/triples/paths/exprs list from queries as Low priority.
Jun 4 2021, 7:24 AM · Wikidata, Wikidata-Query-Service
AKhatun_WMF claimed T273854: Automate regular WDQS query parsing and data-extraction.
Jun 4 2021, 7:22 AM · Discovery-Search (Current work), Wikidata-Query-Service, Analytics, Wikidata
AKhatun_WMF closed T283255: Create CLI job extracting info from wdqs queries as Resolved.
Jun 4 2021, 7:21 AM · Wikidata, Wikidata-Query-Service
AKhatun_WMF closed T283255: Create CLI job extracting info from wdqs queries, a subtask of T280640: [EPIC] Refine WDQS queries analysis, as Resolved.
Jun 4 2021, 7:20 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service

Jun 3 2021

AKhatun_WMF added a comment to T282139: Provide a quantitative description of the Wikidata-triples dataset.

Some of the information suggested for extraction in this analysis:

  • Top items
  • Top properties
  • Top subject, object types
  • Top property types
  • Top wikidata vs other predicates
  • Number of S, P, O that don't involve wikidata
    • The aim is to find the size of the subgraph not concerning Wikidata, i.e., the size of the leaves. They are leaves because once they point to something outside of Wikidata, they are not expanded within Wikidata. Some things, like literals, are not even expandable. If we have too many leaves, we may consider using property graphs (where leaves will be listed as properties of a node).
Jun 3 2021, 6:53 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service

Jun 1 2021

AKhatun_WMF added a comment to T283256: Extract operator/nodes/triples/paths/exprs list from queries.

Update 1 June 2021:

Jun 1 2021, 9:59 AM · Wikidata, Wikidata-Query-Service

May 27 2021

AKhatun_WMF moved T273854: Automate regular WDQS query parsing and data-extraction from Watching / Waiting to Analysis on the Wikidata-Query-Service board.
May 27 2021, 2:07 PM · Discovery-Search (Current work), Wikidata-Query-Service, Analytics, Wikidata

May 25 2021

AKhatun_WMF removed a project from T280640: [EPIC] Refine WDQS queries analysis: Patch-For-Review.
May 25 2021, 8:47 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service

May 24 2021

AKhatun_WMF added a comment to T283256: Extract operator/nodes/triples/paths/exprs list from queries.

Idea on how to store the SPARQL query as a list:
Let's make a list of instances of a generic custom class, QueryElem[T]. QueryElem contains elemType: String and elem: T.
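A minimal sketch of that idea, written in Python rather than the Scala-style notation used in the comment. The field names follow the comment (`elemType` becomes `elem_type`); the element kinds and payloads in the sample list are hypothetical and only show that differently typed elements can share one list.

```python
from dataclasses import dataclass
from typing import Any, Generic, List, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class QueryElem(Generic[T]):
    """One element of a parsed query: a type tag plus a typed payload."""
    elem_type: str  # "elemType" in the comment
    elem: T


# Hypothetical element kinds and payloads, for illustration only.
query_elems: List[QueryElem[Any]] = [
    QueryElem("operator", "SELECT"),
    QueryElem("node", "?item"),
    QueryElem("triple", ("?item", "wdt:P31", "wd:Q5")),
]

# The type tag lets a consumer filter or dispatch without inspecting the payload.
element_types = [e.elem_type for e in query_elems]
triples_only = [e.elem for e in query_elems if e.elem_type == "triple"]
```

Keeping the type as a plain string next to the payload should make the list straightforward to flatten into columns later, though how the actual job serializes it is not described here.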

May 24 2021, 10:53 AM · Wikidata, Wikidata-Query-Service

May 21 2021

dcausse awarded T282129: Test triple-analysis functions over a large dataset with Spark a Love token.
May 21 2021, 7:19 AM · Wikidata, Wikidata-Query-Service

May 20 2021

AKhatun_WMF added a comment to T282130: Provide a way to save extracted query-information in parquet format.

@AKhatun_WMF That's great! could you please provide some info on expected data-size in parquet (for daily data for instance)? Many thanks.

May 20 2021, 9:24 AM · Wikidata, Wikidata-Query-Service