Pfps (Peter F. Patel-Schneider)
User

User Details

User Since
Aug 23 2016, 11:49 PM (504 w, 21 h)
Availability
Available
LDAP User
Unknown
MediaWiki User
Pfps [ Global Accounts ]

Recent Activity

Wed, Apr 15

Pfps added a comment to T423425: finding out whether a change to Wikidata affect systems that use Wikidata.

The accesses should be appropriately anonymized before being made available, of course.

Wed, Apr 15, 2:25 PM · Wikidata-Query-Service, Wikidata
Pfps created T423425: finding out whether a change to Wikidata affect systems that use Wikidata.
Wed, Apr 15, 2:22 PM · Wikidata-Query-Service, Wikidata

Wed, Apr 8

Pfps added a comment to T422097: Create the capability to send queries to 2 backends and compare results.

A slightly different take on this is whether the answers that are being diff'ed are correct/definitive.

Wed, Apr 8, 8:36 PM · Wikidata Platform Team (Sprint 04 (2026/04/08)), OKR-Work

Sun, Mar 29

Pfps added a comment to T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.

@Pfps I'm not sure why you ask me; I have nothing to do with running GSoC :)

Sun, Mar 29, 4:58 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps added a comment to T420355: GSoC 2026 - Proposal: Gamifying constraint violation fixes on Wikidata.

If your name on the GSoC site is different from your ID in Phabricator you need to say what your GSoC ID is. Otherwise your interaction here will not be associated with your proposal and it will be rejected.

Sun, Mar 29, 12:02 PM · Google-Summer-of-Code (2026)
Pfps added a comment to T420355: GSoC 2026 - Proposal: Gamifying constraint violation fixes on Wikidata.

A reminder that you need to submit your proposals to Google for them to be considered, in addition to anything done in Phabricator.

Sun, Mar 29, 11:53 AM · Google-Summer-of-Code (2026)

Sat, Mar 28

Pfps added a comment to T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.

This project is medium difficulty and large (350 hours) in scale.

Sat, Mar 28, 10:34 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps added a comment to T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.

This project is medium difficulty and large size.

Sat, Mar 28, 10:29 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025

Fri, Mar 27

Pfps added a comment to T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.

Hi Meghana: If you email me at pfpschneider@gmail.com we can set up a meeting in the afternoon, US East Coast time.

Fri, Mar 27, 3:19 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps added a comment to T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.

Hi Arina: By this time you should have been interacting with us, the potential mentors, for quite some time and have done some of the microtasks. We will evaluate your proposal even so, but iteration at this late date is not likely.

Fri, Mar 27, 2:35 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025

Thu, Mar 26

Pfps added a comment to T421298: [NEEDS GROOMING] implement a test harness for data quality and query validation.

That's good. The next determination is whether to do a complete comparison or an incomplete one. Then there is the issue of whether to include a third or fourth engine so that compliance can be estimated.

Thu, Mar 26, 1:42 PM · OKR-Work, Wikidata Platform Team (Sprint 04 (2026/04/08))

Wed, Mar 25

Pfps added a comment to T421298: [NEEDS GROOMING] implement a test harness for data quality and query validation.

I don't think that the problem is solvable in general. You can't even rerun queries and always expect the same results because of updates.

Wed, Mar 25, 11:27 PM · OKR-Work, Wikidata Platform Team (Sprint 04 (2026/04/08))

Mar 23 2026

Pfps added a comment to T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.

Hi Meghana: The deadline for proposals is next week so you are starting very late in the process. By now you should have tried out some of the suggestions in the initial comment. If you want to proceed we can set up a call as described in my earlier comment.

Mar 23 2026, 5:16 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025

Mar 19 2026

Pfps added a comment to T418167: [Epic] [CV] TBD Quality Constraint Project.

@Olea That's interesting (to me). Feel free to contact me at pfpschneider@gmail.com

Mar 19 2026, 5:54 PM · Wikidata, Wikidata-Omega (Radar/Epics/Stalled), Epic

Mar 18 2026

Pfps added a comment to T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.

I'm not looking for a daily or even weekly timeline in your proposal. What I am looking for is a breakdown of the overall task into a few pieces that can be tracked. I am also looking for one or more pieces that can be done by the midterm of the coding period. If your proposal is selected we will be using the familiarization period to further refine the work to be done.

Mar 18 2026, 12:14 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps added a comment to T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.

As far as I am concerned, the minimal deliverable at the end of the project is a playable game that implements fixes to some kinds of constraint violations. It is in your interest to have a proposal that can support this minimal deliverable, and also has significant optional parts.

Mar 18 2026, 12:06 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps added a comment to T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.

The guidance from Google in https://google.github.io/gsocguides/student/writing-a-proposal is to have deliverables and timelines in your proposals. This is a good idea, but it is possible to go too far in this area. What we are looking for is a sense that you can break down the overall project into several pieces, each probably including design, coding, and documentation. What is important is a plan to have something that can be evaluated at the midpoint. That doesn't have to be a full system, but there should be some coding involved.
This project has areas where you can decide how much or how little to do and still have a working result. That may make it different from other GSOC projects. It is a good idea to make pieces of your proposal optional, so that at the end you (and we) can claim victory even if everything in the proposal is not implemented.
In the end, a big part of GSOC is to get people interested in open-source projects. In my view it's a win for GSOC if you end up doing significant open-source work in the future, even if not all of your proposal ends up being implemented.

Mar 18 2026, 2:40 AM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps added a comment to T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.
Mar 18 2026, 12:58 AM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps added a comment to T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.

Hello,

I would really like to be a contributor for this project. I believe that this sort of game is what really helps the community join in efforts to do systematic fixes.

My approach would be like the following:

  1. The user first sees explanations and examples of the situation that needs resolution.
  2. The user then sees dummy tests that have no effect, and needs to choose the correct options.
  3. After this process ends, the user starts giving actual answers that are recorded somewhere.
  4. In order not to bloat Wikidata with bad answers, the answers are only committed if a few people independently reached the same results. (Things that were controversial are sent to a different set, in which experts deal with those issues.)
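
The agreement rule in step 4 could be sketched roughly as follows. This is only an illustration under assumptions, not part of the proposal: the function name `triage`, the threshold `min_agree`, and the task and answer identifiers are all hypothetical.

```python
from collections import Counter

def triage(answers_by_task, min_agree=3):
    """Split tasks into commit / expert-review / pending buckets.

    answers_by_task maps a task id to {user: answer}. Each user contributes
    at most one answer per task, so agreement comes from distinct people.
    min_agree is an assumed threshold, not a value from the proposal.
    """
    commit, experts, pending = {}, [], []
    for task, answers in answers_by_task.items():
        counts = Counter(answers.values())
        if len(counts) > 1:
            # Users disagree: route the task to the expert set.
            experts.append(task)
        elif counts and next(iter(counts.values())) >= min_agree:
            # Enough independent users gave the same answer: safe to commit.
            commit[task] = next(iter(counts))
        else:
            # Not enough answers yet: keep collecting.
            pending.append(task)
    return commit, experts, pending

votes = {
    "task-1": {"ann": "fix", "bob": "fix", "cy": "fix"},
    "task-2": {"ann": "fix", "bob": "skip", "cy": "fix"},
    "task-3": {"ann": "fix"},
}
commit, experts, pending = triage(votes)
```

Treating any disagreement as controversial is the strictest reading of the rule; a majority threshold would be an equally plausible design.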

So this type of game could not only be useful for mundane tasks, but even more complicated tasks. I hope to be a contributor to this project so that I can implement this system.

Mar 18 2026, 12:53 AM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps added a comment to T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.

@Pfps , I've drafted an initial idea list document for this project, could you please take a look ?

I'd appreciate any feedback you have regarding any ideas I outlined

Document link

Mar 18 2026, 12:46 AM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps added a comment to T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.

Hi @Pfps, @LGoto and @DMartin-WMF,

I'm a student developer and I've made contributions to Scribe-Data. During my work there, I identified some issues in their codebase which resulted in duplication of data and unreliability. After discussion with the maintainers, I migrated it from static Wikidata dump to the live SPARQL query service to solve it.
I'm really interested in this project and the challenge of reliability analysis. Since it was originally drafted for GSoC 2025, I wanted to check if you’re planning to bring it back as a 350h project for GSoC 2026?
In the meantime, I'm setting up the Distributed Game environment to start working on the microtasks.

Edit: Fixed the bugs in T210635 (was the only open ticket) Code is at Github

Mar 18 2026, 12:23 AM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025

Mar 17 2026

Pfps added a comment to T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.

Sorry all. Due to some other issues, I have not been adequately responsive. But there is light at the end of the tunnel. I'll get through all the backlog today.

Mar 17 2026, 2:27 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025

Mar 11 2026

Pfps added a comment to T414443: Setup WDQS instances on test eqiad nodes..

I can check triple counts on my benchmark machine when the current benchmark run finishes, which may take another day or so.

Mar 11 2026, 2:33 PM · Wikidata Platform Team (Sprint 03 (2026/03/03)), OKR-Work, Wikidata-Query-Service, Wikidata
Pfps added a comment to T414443: Setup WDQS instances on test eqiad nodes..

Also, I never used munged files with QLever. I don't know whether that would slow down or speed up QLever.

Mar 11 2026, 2:29 PM · Wikidata Platform Team (Sprint 03 (2026/03/03)), OKR-Work, Wikidata-Query-Service, Wikidata
Pfps added a comment to T414443: Setup WDQS instances on test eqiad nodes..

The parsing part of the QLever ingestion process can be heavily parallelized if the input file or files are well-behaved. Because of this need for well-behaved input files a special flag must be set. The RDF dump files are well-behaved.

https://phabricator.wikimedia.org/P89833

2026-03-09 21:42:33.394 - INFO: You specified "parallel-parsing = true", which enables faster parsing for TTL files with a well-behaved use of newlines
2026-03-09 21:42:33.394 - INFO: You specified "num-triples-per-batch = 10,000,000", choose a lower value if the index builder runs out of memory

I think this might indeed be an issue with file format (we indexed nt files), but it needs validation. We also need to look at tunables (num-triples-per-batch) vs. input size. We'll monitor more closely at the next indexing run. The runtime is still good.

Mar 11 2026, 2:27 PM · Wikidata Platform Team (Sprint 03 (2026/03/03)), OKR-Work, Wikidata-Query-Service, Wikidata
Pfps added a comment to T414443: Setup WDQS instances on test eqiad nodes..

https://www.wikidata.org/wiki/Wikidata:Scaling_Wikidata/Benchmarking/Virtuoso#Run_the_Virtuoso_server_and_load_the_Wikidata_files_into_the_server already notes the slowdown in ingestion rate with Virtuoso as the process proceeds, and postulates a cause.

The slowdown I understand. I don't know the internals of the ingestion process, but your assumption seems consistent with memory allocation patterns we saw in https://phabricator.wikimedia.org/T414559#11537749 . Now, we can't compare because the hardware is different, the wdqs1028 host had flaky drives, and the Virtuoso config is not properly tuned (especially regarding memory, which is key here) for 20B triples (and, for reference, QLever was reading from an NFS share).

That said, I am surprised by how little time (2.5 hours) it took to ingest 8.5B triples. @Pfps did you run any experiment to measure ingestion time as a function of split file size (and thus number of splits)?

Mar 11 2026, 2:14 PM · Wikidata Platform Team (Sprint 03 (2026/03/03)), OKR-Work, Wikidata-Query-Service, Wikidata

Mar 10 2026

Pfps added a comment to T414443: Setup WDQS instances on test eqiad nodes..

The parsing part of the QLever ingestion process can be heavily parallelized if the input file or files are well-behaved. Because of this need for well-behaved input files a special flag must be set. The RDF dump files are well-behaved.

Mar 10 2026, 1:14 PM · Wikidata Platform Team (Sprint 03 (2026/03/03)), OKR-Work, Wikidata-Query-Service, Wikidata
Pfps added a comment to T414443: Setup WDQS instances on test eqiad nodes..

https://www.wikidata.org/wiki/Wikidata:Scaling_Wikidata/Benchmarking/Virtuoso#Run_the_Virtuoso_server_and_load_the_Wikidata_files_into_the_server already notes the slowdown in ingestion rate with Virtuoso as the process proceeds, and postulates a cause.

Mar 10 2026, 1:05 PM · Wikidata Platform Team (Sprint 03 (2026/03/03)), OKR-Work, Wikidata-Query-Service, Wikidata

Mar 9 2026

Pfps added a comment to T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.

@Pfps, just checking in on the ideas list doc. I wanted to make sure you have everything needed from me. Happy to clarify anything. No rush at all!

Mar 9 2026, 10:05 AM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025

Mar 6 2026

Pfps added a comment to T417504: better information on recent benchmarking + some comments.

The reason that I opened this ticket is that the team has been poor at responding to comments in the Wikidata wiki. So I looked in https://www.mediawiki.org/wiki/Wikidata_Platform#How_to_contact_us and the only contact methods there say to open Phabricator tickets. So I did.

Mar 6 2026, 10:09 AM · Essential-Work, Wikidata Platform Team (Sprint 03 (2026/03/03)), Wikidata, Wikidata-Query-Service

Mar 5 2026

Pfps added a comment to T409781: request for access to Wikidata Query logs.

So who can I ask to see the interaction with the Privacy and Security team?

Mar 5 2026, 9:12 PM · Wikidata Platform Team (Sprint 03 (2026/03/03)), Essential-Work, Wikidata
Pfps added a comment to T409781: request for access to Wikidata Query logs.

One problem I have here is that I haven't seen any of the interaction with the privacy and security people so I don't know what their requirements for anonymization are.

Mar 5 2026, 1:55 PM · Wikidata Platform Team (Sprint 03 (2026/03/03)), Essential-Work, Wikidata
Pfps added a comment to T168973: Checking if a book is an instance of work is slow without explicit gearing hint.

My belief is that this is a result of slow processing of transitive closure operations in Blazegraph and is likely to not be a problem in optimized SPARQL engines.

Mar 5 2026, 10:47 AM · WDQS-Optimizer, Upstream, Discovery-ARCHIVED, Wikidata-Query-Service, Wikibase-Quality-Constraints, Wikibase-Quality, Wikidata
Pfps added a comment to T409781: request for access to Wikidata Query logs.

I just learned that the initial work to anonymize the queries was supported by the Wikimedia Foundation, https://meta.wikimedia.org/wiki/Research:Understanding_Wikidata_Queries

Mar 5 2026, 2:13 AM · Wikidata Platform Team (Sprint 03 (2026/03/03)), Essential-Work, Wikidata
Pfps added a comment to T409781: request for access to Wikidata Query logs.

I am disappointed in this abrupt ending, particularly after nearly four months.

Mar 5 2026, 1:49 AM · Wikidata Platform Team (Sprint 03 (2026/03/03)), Essential-Work, Wikidata

Mar 4 2026

Pfps added a comment to T418595: WDQS Label service proxy performance tests for Qlever and Virtuoso.

This is the difference between join and left join. Regular joins (butting two triple patterns together) are associative and commutative, so the join order doesn't matter. Left joins (OPTIONAL) are associative (I think) but not commutative, so the join order does matter. So the order of the OPTIONALs matters. For official information you need to dig deeply into https://www.w3.org/TR/sparql11-query/. To extract information from that document sometimes requires a good grounding in the theory of querying.
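
The effect of OPTIONAL order can be seen in a small simulation. The sketch below models solution mappings as Python dicts and is only an approximation of SPARQL evaluation (no filters, no duplicates handling); the helper names and the sample data are assumptions made for illustration.

```python
def compatible(a, b):
    """Two solution mappings are compatible if they agree on shared variables."""
    return all(a[k] == b[k] for k in a.keys() & b.keys())

def join(left, right):
    """Plain join: merge every compatible pair."""
    return [{**a, **b} for a in left for b in right if compatible(a, b)]

def left_join(left, right):
    """OPTIONAL: extend each left mapping if possible, otherwise keep it as is."""
    out = []
    for a in left:
        matches = [{**a, **b} for b in right if compatible(a, b)]
        out.extend(matches if matches else [a])
    return out

def as_set(rows):
    """Order-insensitive view of a result, for comparing two evaluations."""
    return {tuple(sorted(r.items())) for r in rows}

items = [{"item": "Q1"}, {"item": "Q2"}]
de = [{"item": "Q1", "label": "Berlin (de)"}]
en = [{"item": "Q1", "label": "Berlin (en)"}, {"item": "Q2", "label": "Paris (en)"}]

# Swapping the operands of a plain join gives the same set of solutions.
same_join = as_set(join(items, en)) == as_set(join(en, items))

# Swapping two OPTIONALs on the same variable changes which label wins.
de_first = left_join(left_join(items, de), en)
en_first = left_join(left_join(items, en), de)
```

With this data, `de_first` keeps the German label for Q1 and falls back to English only for Q2, while `en_first` never shows the German label.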

Mar 4 2026, 2:08 AM · Wikidata Platform Team (Sprint 04 (2026/04/08)), OKR-Work

Mar 3 2026

Pfps added a comment to T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.

@Harikrshnaa I'll take a look this week.

Mar 3 2026, 6:36 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025

Mar 2 2026

Pfps added a comment to T418595: WDQS Label service proxy performance tests for Qlever and Virtuoso.

For a more realistic example, try https://qlever.dev/wikidata/xAiP0y then https://w.wiki/J5kE then https://w.wiki/J5kT then https://w.wiki/J5kH

Mar 2 2026, 2:53 PM · Wikidata Platform Team (Sprint 04 (2026/04/08)), OKR-Work
Pfps added a comment to T418595: WDQS Label service proxy performance tests for Qlever and Virtuoso.

The SCHOLIA replacement also works better if the label variable already has a value.

Mar 2 2026, 2:28 PM · Wikidata Platform Team (Sprint 04 (2026/04/08)), OKR-Work
Pfps added a comment to T418595: WDQS Label service proxy performance tests for Qlever and Virtuoso.

SPARQL defines what the result of evaluation is, just like in a programming language. Implementations are free to do anything so long as the specified result is reached. The above SPARQL must produce "only" de labels if there are any, according to the definition of SPARQL.

Mar 2 2026, 2:15 PM · Wikidata Platform Team (Sprint 04 (2026/04/08)), OKR-Work
Pfps added a comment to T418595: WDQS Label service proxy performance tests for Qlever and Virtuoso.

Moreover, the SCHOLIA work was already reported to the team in https://phabricator.wikimedia.org/T414453

Mar 2 2026, 1:18 PM · Wikidata Platform Team (Sprint 04 (2026/04/08)), OKR-Work
Pfps added a comment to T418595: WDQS Label service proxy performance tests for Qlever and Virtuoso.

It would be worthwhile for you to have more knowledge of the community activities in this area. Your team knows that the SCHOLIA queries have been rewritten into QLever so your team should have either reached out to them to find out what they did or looked at the changes that they made. There you would have seen a better transformation which instead of several OPTIONALs on different variables followed by a BIND uses a sequence of OPTIONALs on the same variable. Further, in some SCHOLIA queries there is an initial BIND that eliminates the problems that happen if the variable is unbound.
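
As a rough illustration of the shape described here (a sequence of OPTIONALs that all target the same label variable), the following sketch generates such a pattern as SPARQL text. It is not the SCHOLIA code; the function name `label_fallback`, the variable naming scheme, and the language list are assumptions.

```python
def label_fallback(var, languages):
    """Return SPARQL text that binds ?<var>Label to the first available label.

    Every OPTIONAL uses the same label variable, so once one language has
    bound it, a later OPTIONAL cannot attach a label in another language.
    """
    label = f"?{var}Label"
    lines = [
        f'OPTIONAL {{ ?{var} rdfs:label {label} . FILTER(LANG({label}) = "{lang}") }}'
        for lang in languages
    ]
    return "\n".join(lines)

# Preference order: German first, then the "mul" pseudo-language, then English.
pattern = label_fallback("item", ["de", "mul", "en"])
print(pattern)
```

The generated text is meant to be placed inside a larger query that already binds `?item`; no trailing BIND is needed because the preferred language is encoded in the order of the OPTIONALs.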

Mar 2 2026, 11:43 AM · Wikidata Platform Team (Sprint 04 (2026/04/08)), OKR-Work

Mar 1 2026

Pfps added a comment to T418595: WDQS Label service proxy performance tests for Qlever and Virtuoso.

I strongly suggest reaching out to the community to find out the current best practices for replacing the label service.
It appears from https://gitlab.wikimedia.org/repos/wikidata-platform/wdqs-query-proxy/-/blob/main/src/test/java/org/wikimedia/wdqs/WikibaseLabelParserTest.java?ref_type=heads that the replacement being evaluated is much less than ideal.

Mar 1 2026, 10:11 PM · Wikidata Platform Team (Sprint 04 (2026/04/08)), OKR-Work
Pfps added a comment to T418595: WDQS Label service proxy performance tests for Qlever and Virtuoso.

It would be useful to also rewrite named subqueries, as other SPARQL systems may not handle them either. And also remove Blazegraph optimization hints, which will interfere with results in other systems.

Mar 1 2026, 10:07 PM · Wikidata Platform Team (Sprint 04 (2026/04/08)), OKR-Work
Pfps added a comment to T418595: WDQS Label service proxy performance tests for Qlever and Virtuoso.

Is there a specification of exactly what the label service does?

Mar 1 2026, 10:06 PM · Wikidata Platform Team (Sprint 04 (2026/04/08)), OKR-Work

Feb 27 2026

Pfps created T418598: update https://www.mediawiki.org/wiki/Wikidata_Query_Service.
Feb 27 2026, 2:40 PM · Wikidata, Documentation, Wikidata-Query-Service

Feb 25 2026

Pfps added a comment to T409781: request for access to Wikidata Query logs.

One more thing that would be useful, if possible, is whether the query was syntactically legal according to Blazegraph. I could get this information by running the query through Blazegraph, but if this information is in the log I could use that instead.

Feb 25 2026, 3:49 PM · Wikidata Platform Team (Sprint 03 (2026/03/03)), Essential-Work, Wikidata

Feb 24 2026

Pfps added a comment to T409781: request for access to Wikidata Query logs.

I want to mock up a server to investigate its load, so all I need for that is the anonymized query and a relative (or absolute) timestamp. User agent categories could be useful to better estimate future loads. Anonymizing string literals will be a problem for me, but I understand if this has to be done.

Feb 24 2026, 6:02 PM · Wikidata Platform Team (Sprint 03 (2026/03/03)), Essential-Work, Wikidata
Pfps added a comment to T418167: [Epic] [CV] TBD Quality Constraint Project.

Looking through the code there appears to be use of SPARQL (for at least distinct values). Is this the main place where Wikidata itself depends on the WDQS? How does the split of the WDQS factor into the use of SPARQL? Does this mean that the distinct values constraint can't work on scholarly articles (or, actually, at all if it misses values from the scholarly graph)?

Feb 24 2026, 5:10 PM · Wikidata, Wikidata-Omega (Radar/Epics/Stalled), Epic
Pfps added a comment to T416579: [SPIKE] Making constraint violations queryable on the Query Service.

Is it some better way to process detected constraint violations so that users can better see them in the WDQS?

This one (assuming I understand your phrasing correctly). That’s what we already had in the past, and people found it useful at the time.

Feb 24 2026, 3:18 PM · Wikidata-Omega (The Board), Wikibase-Quality-Constraints, Wikidata, Wikidata-Query-Service
Pfps added a comment to T418167: [Epic] [CV] TBD Quality Constraint Project.

So the feature includes complex constraints?

Feb 24 2026, 2:04 PM · Wikidata, Wikidata-Omega (Radar/Epics/Stalled), Epic
Pfps added a comment to T201150: Regularly run constraint checks for all items.

Is there a good description of how the constraint system is implemented, preferably including the role of third-party tools?

Feb 24 2026, 2:00 PM · User-Addshore, [DEPRECATED] wdwb-tech, Wikidata-Query-Service, Wikibase-Quality, Wikidata, Wikibase-Quality-Constraints
Pfps added a comment to T201150: Regularly run constraint checks for all items.

Given that KrBot is a third-party closed-source (?) tool, I would be happy having it replaced.

Feb 24 2026, 2:00 PM · User-Addshore, [DEPRECATED] wdwb-tech, Wikidata-Query-Service, Wikibase-Quality, Wikidata, Wikibase-Quality-Constraints

Feb 23 2026

Pfps added a comment to T201150: Regularly run constraint checks for all items.

Here's the goal: a SPARQL query should return all violations of a certain kind, with a possible data lag of a few hours.
So you need:

  • a baseline of having processed all items (TODO)
  • processing of changed items (DONE)
  • periodic processing of every item because constraint definitions or implementations can change globally (TODO?)
Feb 23 2026, 7:58 PM · User-Addshore, [DEPRECATED] wdwb-tech, Wikidata-Query-Service, Wikibase-Quality, Wikidata, Wikibase-Quality-Constraints
Pfps added a comment to T418167: [Epic] [CV] TBD Quality Constraint Project.

What is the difference, if any, between the "quality constraints feature" and Wikidata property constraints? Having some notion of what is being investigated would be useful if the community is to provide useful input into the process.

Feb 23 2026, 7:47 PM · Wikidata, Wikidata-Omega (Radar/Epics/Stalled), Epic
Pfps added a comment to T416579: [SPIKE] Making constraint violations queryable on the Query Service.

I'm confused as to just what this ticket is about.

Feb 23 2026, 7:25 PM · Wikidata-Omega (The Board), Wikibase-Quality-Constraints, Wikidata, Wikidata-Query-Service
Pfps added a comment to T409781: request for access to Wikidata Query logs.

I just noticed the paper https://arxiv.org/abs/2602.14594

Feb 23 2026, 5:12 PM · Wikidata Platform Team (Sprint 03 (2026/03/03)), Essential-Work, Wikidata

Feb 18 2026

Pfps added a comment to T414453: [SPIKE] How to handle porting of label and mwapi services to the new backend.

@Pfps I am not entirely sure what I am looking at here: https://github.com/ad-freiburg/scholia/blob/qlever/scholia/app/templates/sparql-helpers.sparql. These macros are not used to rewrite existing label service queries, are they?

Feb 18 2026, 5:59 PM · Wikidata Platform Team (Sprint 03 (2026/03/03)), OKR-Work, Wikidata
Pfps added a comment to T414453: [SPIKE] How to handle porting of label and mwapi services to the new backend.

It turns out that fallback functionality has become more important with the introduction of the mul "language". If I want English labels I need to have mul as a fallback or I may miss many labels.

Feb 18 2026, 11:46 AM · Wikidata Platform Team (Sprint 03 (2026/03/03)), OKR-Work, Wikidata

Feb 17 2026

Pfps added a comment to T414453: [SPIKE] How to handle porting of label and mwapi services to the new backend.

Thanks.

Feb 17 2026, 7:48 PM · Wikidata Platform Team (Sprint 03 (2026/03/03)), OKR-Work, Wikidata

Feb 15 2026

Pfps added a comment to T417504: better information on recent benchmarking + some comments.

https://upload.wikimedia.org/wikipedia/commons/4/47/WDQS_Triple_Store_Evaluation_-_Benchmark_Results_Report.pdf

Feb 15 2026, 8:05 PM · Essential-Work, Wikidata Platform Team (Sprint 03 (2026/03/03)), Wikidata, Wikidata-Query-Service
Pfps created T417504: better information on recent benchmarking + some comments.
Feb 15 2026, 5:21 PM · Essential-Work, Wikidata Platform Team (Sprint 03 (2026/03/03)), Wikidata, Wikidata-Query-Service
Pfps added a comment to T409781: request for access to Wikidata Query logs.

It is very frustrating to have this task languish without any way of contacting the team that appears to be blocking progress.

Feb 15 2026, 4:45 PM · Wikidata Platform Team (Sprint 03 (2026/03/03)), Essential-Work, Wikidata
Pfps added a comment to T414453: [SPIKE] How to handle porting of label and mwapi services to the new backend.

The label service is indeed pervasive. How much are the mwapi services used?

Feb 15 2026, 4:43 PM · Wikidata Platform Team (Sprint 03 (2026/03/03)), OKR-Work, Wikidata

Jan 22 2026

Pfps added a comment to T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.

Hi, I run the "New Wikiquote article and category matches", "New Wikipedia article and category matches", and "Commons category matches" Wikidata games (connected to Pi bot activities). It's great to see this project taking place, the games definitely need some love and expansion!

I just got a pull request from @Harikrshnaa (I think?) at https://github.com/mpeel/wikicode/pull/17 to remove the brackets from the Wikiquote game, should I accept this, or leave this example open since it's listed in the project description?

If there's anything I can do to help with this project, do let me know. I'd love to see the three games I run expanded to cover more languages, for example - or a 'Commons category matches' version that uses extra checks (precise name matching, LLM pre-evaluation, etc.) to increase the likelihood of the suggested matches being good.

Jan 22 2026, 2:07 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025

Jan 21 2026

Pfps updated the task description for T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.
Jan 21 2026, 4:03 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps updated the task description for T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.
Jan 21 2026, 4:01 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps added a comment to T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.

Hi @Pfps thank you for the proposal. A couple questions:

  • Could you briefly explain in layman terms 1.) what is a constraint violation and 2.) why is it important to fix?
  • I'm not sure that microtask #2 is actually a small enough task, could you consider breaking this down a bit? I'd also encourage you to create the microtasks as subtasks of this one, so that candidates can contribute directly to them.
Jan 21 2026, 3:57 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps updated the task description for T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.
Jan 21 2026, 3:54 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps updated the task description for T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.
Jan 21 2026, 3:01 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025

Jan 20 2026

Pfps added a comment to T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.

@Hridyesh_Gupta Thank you for your interest. It might be a bit early to start on the microtasks, as the period where potential contributors interact with mentors isn't for a while. As well, the topic will be getting some updates over the next little while.

Jan 20 2026, 4:12 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025

Jan 14 2026

Pfps added a comment to T409781: request for access to Wikidata Query logs.

How can this be escalated?

Jan 14 2026, 2:45 PM · Wikidata Platform Team (Sprint 03 (2026/03/03)), Essential-Work, Wikidata

Jan 12 2026

Pfps added a comment to T206560: [Epic] Evaluate alternatives to Blazegraph.

OK, there is a newer newsletter. But that's not a newer version of the information in the November newsletter, as far as I can tell. The wording in the inactive banner contains: "Either the page is no longer relevant or consensus on its purpose has become unclear." I don't think that either of these is the case, and those who see the wording are likely to be misled.

Jan 12 2026, 5:39 PM · Wikidata, Epic, Wikidata-Query-Service
Pfps updated the task description for T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.
Jan 12 2026, 2:50 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps renamed T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata from GSoC 2025: Gamifying constraint violation fixes on Wikidata to GSoC 2026: Gamifying constraint violation fixes on Wikidata.
Jan 12 2026, 2:46 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps added a comment to T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.

This project is being revived for the 2026 GSoC.

Jan 12 2026, 2:37 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps added a project to T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata: Google-Summer-of-Code (2026).
Jan 12 2026, 2:34 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps added a comment to T206560: [Epic] Evaluate alternatives to Blazegraph.

https://www.wikidata.org/wiki/Wikidata:Wikidata_Platform_team/Newsletter_November_2025 is marked as inactive, with rather strong warnings about not being relevant. Is this really the case?

Jan 12 2026, 1:10 PM · Wikidata, Epic, Wikidata-Query-Service
Pfps added a comment to T206560: [Epic] Evaluate alternatives to Blazegraph.

https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_backend_update mentions this task so maybe posting this request here will be effective.

Jan 12 2026, 1:08 PM · Wikidata, Epic, Wikidata-Query-Service

Jan 10 2026

Pfps added a comment to T414266: Feature request: Export Wikidata JSON as JSONL.

From https://www.wikidata.org/wiki/Wikidata:Database_download

Jan 10 2026, 8:36 PM · Wikidata, Wikidata data dumps, Data-Engineering, Dumps-Generation

Dec 19 2025

Pfps added a comment to T413097: Raise quota on wikiqlever so that an instance with 256 GB RAM and 3 x 4 TB SSD can be launched.

Having a non-split query service available to interested users is going to be useful during the period from the end of the legacy service to the time that the new WDQS is available. This alternative service probably doesn't need the same uptime characteristics as even the WDQS.

Dec 19 2025, 4:12 PM · cloud-services-team, Cloud-VPS (Quota-requests), WikiCite, Wikidata, Wikidata-Query-Service

Dec 5 2025

Pfps added a comment to T409781: request for access to Wikidata Query logs.

It's well past 11/21. Has there been any progress?

Dec 5 2025, 5:37 PM · Wikidata Platform Team (Sprint 03 (2026/03/03)), Essential-Work, Wikidata

Nov 10 2025

Pfps renamed T409781: request for access to Wikidata Query logs from reqeust for access to Wikidata Query logs to request for access to Wikidata Query logs.
Nov 10 2025, 8:39 PM · Wikidata Platform Team (Sprint 03 (2026/03/03)), Essential-Work, Wikidata
Pfps created T409781: request for access to Wikidata Query logs.
Nov 10 2025, 8:39 PM · Wikidata Platform Team (Sprint 03 (2026/03/03)), Essential-Work, Wikidata

Oct 7 2025

Pfps added a comment to T405395: DPE SRE work to enable testing of Blazegraph alternatives.

We are looking into AWS as a possible way to bootstrap experiments while we wait for on prem hardware. Given that we are memory bound, we are considering these two high-memory instances as a baseline:

  • r8i.4xlarge: 16 vCPUs, 128 GiB RAM.
  • r8i.12xlarge: 48 vCPUs, 384 GiB RAM.
Oct 7 2025, 2:17 PM · Essential-Work, Data-Platform-SRE (2025.09.26 - 2025.10.17), Wikidata-Query-Service, Wikidata
Pfps added a comment to T405395: DPE SRE work to enable testing of Blazegraph alternatives.

The evaluation of QLever on doubled Wikidata has some decent data to report on a preliminary basis. See https://www.wikidata.org/wiki/Wikidata:Scaling_Wikidata/Benchmarking/Phase_2_Preliminary_Report for the report.

Oct 7 2025, 2:12 PM · Essential-Work, Data-Platform-SRE (2025.09.26 - 2025.10.17), Wikidata-Query-Service, Wikidata

Oct 2 2025

Pfps added a comment to T405395: DPE SRE work to enable testing of Blazegraph alternatives.

My benchmarking used a machine with 16 cores (Ryzen 9950X) and 192GB of main memory, but I only ran one query at a time. Having lots of main memory is useful for measuring throughput with multiple queries running in parallel.

Oct 2 2025, 10:38 PM · Essential-Work, Data-Platform-SRE (2025.09.26 - 2025.10.17), Wikidata-Query-Service, Wikidata
Pfps added a comment to T405395: DPE SRE work to enable testing of Blazegraph alternatives.

If you want to play around with loading Wikidata into QLever a 16-core machine is very useful as it can considerably cut down loading time.

Oct 2 2025, 3:37 PM · Essential-Work, Data-Platform-SRE (2025.09.26 - 2025.10.17), Wikidata-Query-Service, Wikidata
Pfps added a comment to T405395: DPE SRE work to enable testing of Blazegraph alternatives.

The main constraint is that qlever is designed for Ubuntu, not Debian, which presents some challenges.

I think that this is probably not as much of a barrier as you might think.

It's true that the Dockerfiles that are distributed with qlever are built using Ubuntu, but I'm fairly confident that we could make our own qlever image fairly easily, based on Debian.

We could set up a blubber/kokkuri pipeline that replicates the actions of the Dockerfiles, but with a Debian base image.

Oct 2 2025, 3:31 PM · Essential-Work, Data-Platform-SRE (2025.09.26 - 2025.10.17), Wikidata-Query-Service, Wikidata

Oct 1 2025

Pfps added a comment to T406140: problems with https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format.

FWIW, in T384344: Wikibase/Wikidata and WDQS disagree about statement, reference and value namespace prefixes we held that changing the prefixes in the TTL dump files (without changing the resulting URIs) was a significant change, not a breaking change.

Oct 1 2025, 3:55 PM · Wikidata data dumps, Documentation, Wikidata, Wikidata-Query-Service
Pfps added a comment to T406140: problems with https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format.

I'm looking at the dumps from 20241028 and thereabouts (because those are the ones that I have benchmark data about and I'm doing some more benchmarking). Maybe some of the prefixes have changed since then and only data: is problematic.

Oct 1 2025, 3:52 PM · Wikidata data dumps, Documentation, Wikidata, Wikidata-Query-Service
Pfps created T406140: problems with https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format.
Oct 1 2025, 3:28 PM · Wikidata data dumps, Documentation, Wikidata, Wikidata-Query-Service

Aug 29 2025

Pfps created T403249: Example queries on both WDQS endpoints have problems due to split.
Aug 29 2025, 10:29 AM · Wikidata Omega Product, Wikidata Query UI, Wikidata

Feb 27 2025

Pfps added a comment to T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.

I added links to the Phabricator pages of the mentors.
I added pointers to several Phabricator tasks related to the Distributed Game. These links can be used to find games implemented in the Distributed Game and other information that would be useful in the microtasks and throughout the project.

Feb 27 2025, 3:46 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps updated the task description for T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.
Feb 27 2025, 3:43 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps updated the task description for T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.
Feb 27 2025, 3:43 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps updated the task description for T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.
Feb 27 2025, 3:24 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025
Pfps updated the task description for T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.
Feb 27 2025, 3:23 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025

Feb 25 2025

Pfps created T387248: GSoC 2026: Gamifying constraint violation fixes on Wikidata.
Feb 25 2025, 9:51 PM · Wikidata Omega Product, Wikibase-Quality-Constraints, Google-Summer-of-Code (2026), Outreach-Programs-Projects, Wikidata, Google-Summer-of-Code-2025

Feb 21 2025

Pfps added a comment to T330525: Migrate Wikidata off of Blazegraph.

@Hanna_Bast Thanks for the detailed comments. I have updated the benchmarking code, which does output TSV files that are later analyzed to produce statistics. Many of the benchmarks are run in three variants - as-is, with only counts returned, and with DISTINCT added. The benchmarking code also records a bit of information about the output - counts for multiple results and a single value for single results. The latter provided the first indication that different engines produce different results for numeric and GeoSPARQL values.

Feb 21 2025, 4:45 PM · Wikidata, Wikidata-Query-Service