
Pfps (Peter F. Patel-Schneider)
User

User Details

User Since
Aug 23 2016, 11:49 PM (490 w, 3 d)
Availability
Available
LDAP User
Unknown
MediaWiki User
Pfps [ Global Accounts ]

Recent Activity

Wed, Jan 14

Pfps added a comment to T409781: request for access to Wikidata Query logs.

How can this be escalated?

Wed, Jan 14, 2:45 PM · Wikidata, Wikidata-Query-Service

Mon, Jan 12

Pfps added a comment to T206560: [Epic] Evaluate alternatives to Blazegraph.

OK, there is a newer newsletter. But that's not a newer version of the information in the November newsletter, as far as I can tell. The wording in the inactive banner contains: "Either the page is no longer relevant or consensus on its purpose has become unclear." I don't think that either of these are the case and those who see the wording are likely to be misled.

Mon, Jan 12, 5:39 PM · Wikidata, Epic, Wikidata-Query-Service
Pfps updated the task description for T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.
Mon, Jan 12, 2:50 PM · Google-Summer-of-Code (Google Summer of Code (2026)), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps renamed T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata from GSoC 2025: Gamifying constraint violation fixes on Wikidata to GSoC 2026: Gamifying constraint violation fixes on Wikidata.
Mon, Jan 12, 2:46 PM · Google-Summer-of-Code (Google Summer of Code (2026)), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps added a comment to T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.

This project is being revived for the 2026 GSoC.

Mon, Jan 12, 2:37 PM · Google-Summer-of-Code (Google Summer of Code (2026)), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps added a project to T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata: Google-Summer-of-Code (Google Summer of Code (2026)).
Mon, Jan 12, 2:34 PM · Google-Summer-of-Code (Google Summer of Code (2026)), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps added a comment to T206560: [Epic] Evaluate alternatives to Blazegraph.

https://www.wikidata.org/wiki/Wikidata:Wikidata_Platform_team/Newsletter_November_2025 is marked as inactive, with rather strong warnings about not being relevant. Is this really the case?

Mon, Jan 12, 1:10 PM · Wikidata, Epic, Wikidata-Query-Service
Pfps added a comment to T206560: [Epic] Evaluate alternatives to Blazegraph.

https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_backend_update mentions this task so maybe posting this request here will be effective.

Mon, Jan 12, 1:08 PM · Wikidata, Epic, Wikidata-Query-Service

Sat, Jan 10

Pfps added a comment to T414266: Feature request: Export Wikidata JSON as JSONL.

From https://www.wikidata.org/wiki/Wikidata:Database_download

Sat, Jan 10, 8:36 PM · Data-Engineering, Dumps-Generation

Fri, Dec 19

Pfps added a comment to T413097: Raise quota on wikiqlever so that an instance with 256 GB RAM and 3 x 4 TB SSD can be launched.

Having a non-split query service available to interested users is going to be useful during the period from the end of the legacy service to the time that the new WDQS is available. This alternative service probably doesn't need the same uptime characteristics as even the WDQS.

Fri, Dec 19, 4:12 PM · cloud-services-team, Cloud-VPS (Quota-requests), WikiCite, Wikidata, Wikidata-Query-Service

Dec 5 2025

Pfps added a comment to T409781: request for access to Wikidata Query logs.

It's well past 11/21. Has there been any progress?

Dec 5 2025, 5:37 PM · Wikidata, Wikidata-Query-Service

Nov 10 2025

Pfps renamed T409781: request for access to Wikidata Query logs from reqeust for access to Wikidata Query logs to request for access to Wikidata Query logs.
Nov 10 2025, 8:39 PM · Wikidata, Wikidata-Query-Service
Pfps created T409781: request for access to Wikidata Query logs.
Nov 10 2025, 8:39 PM · Wikidata, Wikidata-Query-Service

Oct 7 2025

Pfps added a comment to T405395: DPE SRE work to enable testing of Blazegraph alternatives.

We are looking into AWS as a possible way to bootstrap experiments while we wait for on-prem hardware. Given that we are memory-bound, we are considering these two high-memory instances as a baseline:

  • r8i.4xlarge: 16 vCPUs, 128 GiB RAM.
  • r8i.12xlarge: 48 vCPUs, 384 GiB RAM.
Oct 7 2025, 2:17 PM · Essential-Work, Data-Platform-SRE (2025.09.26 - 2025.10.17), Wikidata-Query-Service, Wikidata
Pfps added a comment to T405395: DPE SRE work to enable testing of Blazegraph alternatives.

The evaluation of QLever on doubled Wikidata has some decent data to report on a preliminary basis. See https://www.wikidata.org/wiki/Wikidata:Scaling_Wikidata/Benchmarking/Phase_2_Preliminary_Report for the report.

Oct 7 2025, 2:12 PM · Essential-Work, Data-Platform-SRE (2025.09.26 - 2025.10.17), Wikidata-Query-Service, Wikidata

Oct 2 2025

Pfps added a comment to T405395: DPE SRE work to enable testing of Blazegraph alternatives.

My benchmarking used a machine with 16 cores (Ryzen 9950X) and 192 GB of main memory, but I only ran one query at a time. Having lots of main memory is useful for measuring throughput with multiple queries running in parallel.

Oct 2 2025, 10:38 PM · Essential-Work, Data-Platform-SRE (2025.09.26 - 2025.10.17), Wikidata-Query-Service, Wikidata
Pfps added a comment to T405395: DPE SRE work to enable testing of Blazegraph alternatives.

If you want to play around with loading Wikidata into QLever, a 16-core machine is very useful as it can considerably cut down loading time.

Oct 2 2025, 3:37 PM · Essential-Work, Data-Platform-SRE (2025.09.26 - 2025.10.17), Wikidata-Query-Service, Wikidata
Pfps added a comment to T405395: DPE SRE work to enable testing of Blazegraph alternatives.

The main constraint is that qlever is designed for Ubuntu, not Debian, which presents some challenges.

I think that this is probably not as much of a barrier as you might think.

It's true that the Dockerfiles that are distributed with qlever are built using Ubuntu, but I'm fairly confident that we could make our own qlever image fairly easily, based on Debian.

We could set up a blubber/kokkuri pipeline that replicates the actions of the Dockerfiles, but with a Debian base image.

Oct 2 2025, 3:31 PM · Essential-Work, Data-Platform-SRE (2025.09.26 - 2025.10.17), Wikidata-Query-Service, Wikidata

Oct 1 2025

Pfps added a comment to T406140: problems with https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format.

FWIW, in T384344: Wikibase/Wikidata and WDQS disagree about statement, reference and value namespace prefixes we held that changing the prefixes in the TTL dump files (without changing the resulting URIs) was a significant change, not a breaking change.

Oct 1 2025, 3:55 PM · Wikidata data dumps, Documentation, Wikidata, Wikidata-Query-Service
Pfps added a comment to T406140: problems with https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format.

I'm looking at the dumps from 20241028 and thereabouts (because those are the ones that I have benchmark data about and I'm doing some more benchmarking). Maybe some of the prefixes have changed since then and only data: is problematic.

Oct 1 2025, 3:52 PM · Wikidata data dumps, Documentation, Wikidata, Wikidata-Query-Service
Pfps created T406140: problems with https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format.
Oct 1 2025, 3:28 PM · Wikidata data dumps, Documentation, Wikidata, Wikidata-Query-Service

Aug 29 2025

Pfps created T403249: Example queries on both WDQS endpoints have problems due to split.
Aug 29 2025, 10:29 AM · Wikidata Omega Product, Wikidata Query UI, Wikidata

Feb 27 2025

Pfps added a comment to T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.

I added links to the Phabricator pages of the mentors.
I added pointers to several Phabricator tasks related to the Distributed Game. These links can be used to find games implemented in the Distributed Game and other information that would be useful in the microtasks and throughout the project.

Feb 27 2025, 3:46 PM · Google-Summer-of-Code (Google Summer of Code (2026)), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps updated the task description for T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.
Feb 27 2025, 3:43 PM · Google-Summer-of-Code (Google Summer of Code (2026)), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps updated the task description for T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.
Feb 27 2025, 3:43 PM · Google-Summer-of-Code (Google Summer of Code (2026)), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps updated the task description for T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.
Feb 27 2025, 3:24 PM · Google-Summer-of-Code (Google Summer of Code (2026)), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps updated the task description for T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.
Feb 27 2025, 3:23 PM · Google-Summer-of-Code (Google Summer of Code (2026)), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025

Feb 25 2025

Pfps created T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.
Feb 25 2025, 9:51 PM · Google-Summer-of-Code (Google Summer of Code (2026)), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025

Feb 21 2025

Pfps added a comment to T330525: Migrate Wikidata off of Blazegraph.

@Hanna_Bast Thanks for the detailed comments. I have updated the benchmarking code, which does output TSV files that are later analyzed to produce statistics. Many of the benchmarks are run in three variants - as-is, with only counts returned, and with DISTINCT added. The benchmarking code also records a bit of information about the output - counts for multiple results and a single value for single results. The latter provided the first indication that different engines produce different results for numeric and GeoSPARQL values.

Feb 21 2025, 4:45 PM · Wikidata, Wikidata-Query-Service
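The three benchmark variants mentioned in the comment above (as-is, counts only, and with DISTINCT added) can be produced by a simple query rewrite. This is a hypothetical sketch of that idea, not the actual benchmarking code; it assumes a single top-level SELECT clause with no subqueries.

```python
import re

def query_variants(query: str) -> dict:
    """Build three benchmark variants of a simple SPARQL SELECT query:
    the query as-is, a count-only version, and a DISTINCT version.
    Naive string rewriting; assumes one top-level SELECT clause."""
    m = re.match(r"(?is)\s*SELECT\s+(DISTINCT\s+)?(.*?)\s+WHERE", query)
    if not m:
        raise ValueError("not a simple SELECT query")
    body = query[m.end(0) - len("WHERE"):]  # everything from WHERE onward
    projection = m.group(2)
    return {
        "as-is": query,
        "count": f"SELECT (COUNT(*) AS ?count) {body}",
        "distinct": f"SELECT DISTINCT {projection} {body}",
    }
```

Comparing single-value results across the variants is what surfaced the engine-to-engine differences for numeric and GeoSPARQL values noted above.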

Jan 20 2025

Pfps added a comment to T349512: [Analytics] Collect multiple sets of SPARQL queries.

Is there an easy way to get the resultant query sets as plain files?

Jan 20 2025, 1:50 PM · Wikidata Analytics (Kanban), Discovery-Search (Current work), Wikidata, Wikidata-Query-Service

Oct 11 2024

Pfps created T377001: LDF service outputs illegal language-tagged strings.
Oct 11 2024, 2:27 PM · Wikidata, Wikidata-Query-Service

Jan 17 2024

Pfps created T355235: better display of ontology-related information when editing Wikidata.
Jan 17 2024, 3:32 PM · Wikidata-WikiProject-Ontology
Pfps added a member for Wikidata-WikiProject-Ontology: Pfps.
Jan 17 2024, 3:24 PM
Pfps added a comment to T97566: Provide another way to surface usage instructions besides tacking them onto descriptions.

More and more I run into incorrect information in Wikidata that I think would happen less if there were a good way of presenting usage instructions to users. The most recent example was for music (Q638), where Dmitry had to go in and clean out multiple incorrect subclasses. There is already Wikidata usage instructions (P2559) that can be used to hold these instructions, so the only missing part is showing usage instructions more prominently. Couldn't it just be possible to put this property at the top of the displayed list of properties? That's not a great solution but it would be a good start. Even better would be to display more information when the item is used as a value, but that's a more complicated change to the Wikidata user interface.

Jan 17 2024, 3:23 PM · User-Smalyshev, User-Daniel, Mobile-Apps, MediaWiki-extensions-Wikibase-Repo, Wikidata

Dec 30 2023

Pfps added a comment to T353964: Change maximum size for project to be pinged.

I don't think that this is a duplicate of https://phabricator.wikimedia.org/T148154

Dec 30 2023, 5:49 PM

Dec 22 2023

Pfps created T353964: Change maximum size for project to be pinged.
Dec 22 2023, 6:25 PM

Dec 1 2023

Pfps added a comment to T341405: An "improved autofix" needed to replace non-standard statements in Wikidata, in order to keep the data model coherent.

How can I contribute to turning this vision into reality?

Dec 1 2023, 11:58 AM · Wikidata data quality and trust, Wikidata

Sep 6 2023

Pfps created T345748: wikidata redirects cause problems with queries.
Sep 6 2023, 3:54 PM · Wikidata, Wikidata-Query-Service

May 12 2020

Pfps added a comment to T244341: Stop using blank nodes for encoding SomeValue and OWL constraints in WDQS.

Based on a quick look at various Phabricator tickets and other information, it appears to me that the only connection between the WDQS and Wikidata edit throttling is that a slowness parameter for the WDQS is used to modify a Wikidata parameter that is supposed to be checked by bots before they make edits. Further, it appears that the only reason for this connection is to slow down Wikidata edits so that the WDQS can keep up - the WDQS does not feed back into Wikidata edits, even edits by bots.

So this connection could be severed by a trivial change to Wikidata, and the only effect would be that the WDQS KB might lag behind Wikidata, either temporarily or permanently, and queries to the WDQS might become slow or even impossible without improvements to the WDQS infrastructure. I thus view it as misleading to state in this Phabricator ticket that "performance issues [of the WDQS] cause edits on wikidata to be throttled", which gives the impression that the WDQS forms a part of the Wikidata editing process or some other essential part of Wikidata itself.

May 12 2020, 1:48 PM · Community-consensus-needed, Wikidata-Query-Service, Wikidata
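The mechanism described above matches MediaWiki's `maxlag` convention: bots attach a `maxlag` parameter to API requests, and the server refuses the write with a `maxlag` error when lag exceeds it. A minimal sketch of the client-side logic, assuming the standard error shape (the function names here are illustrative, not from any particular bot framework):

```python
def maxlag_params(params: dict, maxlag: int = 5) -> dict:
    """Attach the conventional `maxlag` parameter to a MediaWiki API
    request: the server refuses the action when lag (replication lag,
    or, as described above, WDQS-derived lag) exceeds `maxlag` seconds."""
    return {**params, "maxlag": maxlag}

def should_retry(response: dict) -> bool:
    """A well-behaved bot backs off and retries later when the API
    answers with a `maxlag` error instead of performing the edit."""
    return response.get("error", {}).get("code") == "maxlag"
```

Severing the connection would amount to no longer folding the WDQS lag into the value that this check sees.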

May 11 2020

Pfps added a comment to T244341: Stop using blank nodes for encoding SomeValue and OWL constraints in WDQS.

I was completely unaware that WDQS is so integrated into the inner workings of Wikidata. Where is this described? Was this mentioned in the announcement of the proposed change?

May 11 2020, 2:06 PM · Community-consensus-needed, Wikidata-Query-Service, Wikidata
Pfps added a comment to T244341: Stop using blank nodes for encoding SomeValue and OWL constraints in WDQS.

If this change is simply a change to make WDQS work faster and 'unskolemizing' is a trivial step, then it should be implemented by WDQS instead of being pushed onto every consumer (including indirect consumers) of Wikidata information.

May 11 2020, 1:02 PM · Community-consensus-needed, Wikidata-Query-Service, Wikidata
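To illustrate why unskolemizing is arguably trivial: RDF 1.1 reserves the `/.well-known/genid/` path for Skolem IRIs, so assuming the replacement IRIs follow that pattern, mapping them back to blank node labels is a one-line rewrite per N-Triples line. A hedged sketch:

```python
import re

# RDF 1.1 reserves the /.well-known/genid/ path for Skolem IRIs.
# Assuming the replacement IRIs follow that pattern, they can be
# mapped back to blank node labels with a single substitution.
GENID = re.compile(r"<[^<>]*/\.well-known/genid/([^<>/]+)>")

def unskolemize(ntriples_line: str) -> str:
    """Turn Skolem IRIs in an N-Triples line back into blank node
    labels (_:label)."""
    return GENID.sub(r"_:\1", ntriples_line)
```

Whether WDQS applies this on output or every downstream consumer applies it on input is exactly the question raised in the comment above.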

May 6 2020

Pfps added a comment to T244341: Stop using blank nodes for encoding SomeValue and OWL constraints in WDQS.
May 6 2020, 4:32 PM · Community-consensus-needed, Wikidata-Query-Service, Wikidata
Pfps added a comment to T244341: Stop using blank nodes for encoding SomeValue and OWL constraints in WDQS.

The difference is not with other SPARQL queries in the WDQS but against SPARQL queries in general (including SPARQL queries that use Wikidata URLs).

May 6 2020, 3:56 PM · Community-consensus-needed, Wikidata-Query-Service, Wikidata
Pfps added a comment to T244341: Stop using blank nodes for encoding SomeValue and OWL constraints in WDQS.

Is anyone proposing a change to Wikibase (or Wikidata)?

May 6 2020, 3:49 PM · Community-consensus-needed, Wikidata-Query-Service, Wikidata
Pfps added a comment to T244341: Stop using blank nodes for encoding SomeValue and OWL constraints in WDQS.

If divergence between Wikidata and WDQS is bad, then this proposed change has another bad feature as it turns the some value snaks into something that is less like an existential. And this proposed change is for both the RDF dump and the WDQS.
And then there is the problem of the proposed change requiring changes to SPARQL queries - not just a change, but a change from how SPARQL queries are written in just about any other context.

May 6 2020, 3:44 PM · Community-consensus-needed, Wikidata-Query-Service, Wikidata

Apr 30 2020

Pfps added a comment to T244341: Stop using blank nodes for encoding SomeValue and OWL constraints in WDQS.

My view is that fewer breaking changes are to be preferred, and breaking changes in fewer "products" are to be preferred even more. So, again, I wonder why there is a breaking change proposed for the RDF dump instead of no breaking changes or limiting breaking changes to the WDQS only.

Apr 30 2020, 6:29 PM · Community-consensus-needed, Wikidata-Query-Service, Wikidata
Pfps added a comment to T244341: Stop using blank nodes for encoding SomeValue and OWL constraints in WDQS.

I don't understand why it was considered necessary to make a breaking change to the RDF dump to improve WDQS performance when there is a solution that does not make a breaking change to the dump.

Apr 30 2020, 2:41 PM · Community-consensus-needed, Wikidata-Query-Service, Wikidata

Apr 17 2020

Pfps added a comment to T244341: Stop using blank nodes for encoding SomeValue and OWL constraints in WDQS.

I added some technical content on this issue to https://www.wikidata.org/wiki/Wikidata:Contact_the_development_team/Query_Service_and_search#Blank_node_deprecation_in_WDQS_&_Wikibase_RDF_model

Apr 17 2020, 4:26 PM · Community-consensus-needed, Wikidata-Query-Service, Wikidata

Oct 26 2019

Pfps added a comment to T97566: Provide another way to surface usage instructions besides tacking them onto descriptions.

What's happening to finish this task? P2559 has about 1790 uses, which is not an insignificant number, but P2559 is insignificant because it isn't prominently shown.

Oct 26 2019, 5:45 PM · User-Smalyshev, User-Daniel, Mobile-Apps, MediaWiki-extensions-Wikibase-Repo, Wikidata

Aug 24 2016

Pfps added a comment to T92961: [Story] Versioning in JSON output.

Of course it would be a breaking change. There is no formal spec of the JSON dump beyond the spec for the individual entities, but we have always said that the dump is a set (an array) of entities. Putting something in there that is not an entity will break consumers.

Aug 24 2016, 3:58 PM · Story, Wikidata, MediaWiki-extensions-Wikibase-Repo

Aug 23 2016

Pfps added a comment to T92961: [Story] Versioning in JSON output.

Right now, the JSON dump format is a sequence of JSON objects. Each of these JSON objects is a Wikidata entity. There is nothing preventing the dump format from having the first JSON object be information about the dump, including the version of the dump format, the version of the Wikidata format, the time of the dump, etc.

Aug 23 2016, 11:56 PM · Story, Wikidata, MediaWiki-extensions-Wikibase-Repo
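The layout described above (a JSON array with one entity object per line) lends itself to a streaming reader that could tolerate a leading metadata object. A hypothetical sketch, assuming the metadata object is distinguishable by lacking an "id" key (that marker is an illustration, not anything the dump actually defines):

```python
import json

def read_dump(lines):
    """Stream objects from a dump laid out as a JSON array with one
    object per line (lines ending in commas).  A hypothetical leading
    metadata object - here, any first object without an "id" key - is
    yielded as ("meta", obj); entities are yielded as ("entity", obj)."""
    first = True
    for raw in lines:
        raw = raw.strip().rstrip(",")
        if raw in ("[", "]", ""):
            continue  # array delimiters carry no data
        obj = json.loads(raw)
        if first and "id" not in obj:
            yield ("meta", obj)
        else:
            yield ("entity", obj)
        first = False
```

As the later comment in this thread points out, existing consumers that assume every array element is an entity would still break on such a header, which is why it counts as a breaking change.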