
[RFC] Workaround for checking the format constraint
Closed, Resolved · Public


The Format constraint had to be removed (see T101467), but it should be checked somehow nevertheless.

Event Timeline

Jonaskeutel raised the priority of this task to High.
Jonaskeutel updated the task description.
Jonaskeutel subscribed.
Restricted Application added a subscriber: Aklapper.
Lydia_Pintscher renamed this task from "Find a workaround for checking the format constraint" to "[RFC] Workaround for checking the format constraint". (Aug 17 2015, 4:17 PM)

We could check this constraint on the Wikidata Query Service. @Smalyshev WDYT?

@LucasWerkmeister interesting question. There are two things we need to check, I guess:

  1. Write the SPARQL query and see if it performs properly.
  2. See if that doesn't open us to the same issues as before.

In general, Blazegraph has timeouts and memory limits on queries, and it does not use the PCRE engine (it uses java.util.regex, AFAIK). In theory there could still be some problem there, but since this is just a generic query, that problem would be present regardless of constraints, so it should not be a concern.

So, I'd write the queries and test if they perform well, and if so, I think it's ok to add query constraints for this one.
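To make the idea concrete, here is a minimal sketch of how such a query could be built. It only assembles the query text and sends nothing to the query service. The function names and the choice to anchor the pattern with `^(?:...)$` are assumptions for illustration, not necessarily what the extension does.

```python
def sparql_string_literal(text: str) -> str:
    """Quote text as a SPARQL double-quoted string literal."""
    escapes = {"\\": "\\\\", '"': '\\"', "\n": "\\n", "\r": "\\r", "\t": "\\t"}
    return '"' + "".join(escapes.get(ch, ch) for ch in text) + '"'


def build_format_check_query(value: str, pattern: str) -> str:
    """Build a query asking the endpoint whether value matches pattern.

    Assumption: the pattern is wrapped in ^(?:...)$ so that the whole
    value has to match, not just a substring of it.
    """
    anchored = "^(?:" + pattern + ")$"
    return (
        "SELECT (REGEX("
        + sparql_string_literal(value)
        + ", "
        + sparql_string_literal(anchored)
        + ") AS ?matches) {}"
    )


if __name__ == "__main__":
    # Hypothetical value and pattern, only to show the generated text.
    print(build_format_check_query("12-34", "[0-9 -]+"))
```

Escaping both the value and the pattern matters here, because both come from user-editable data and end up inside string literals in the query.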

The template already includes a link to WDQS. It works for most constraints.

I think I had to add some Lua to make it work.

Change 363605 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/WikibaseQualityConstraints@master] Implement Format constraint with SPARQL

Change 363605 merged by jenkins-bot:
[mediawiki/extensions/WikibaseQualityConstraints@master] Implement Format constraint with SPARQL

Lucas_Werkmeister_WMDE claimed this task.
Lucas_Werkmeister_WMDE moved this task from Review to Done on the Wikidata-Former-Sprint-Board board.


Admin’s note: if this turns out to cause problems, it can be disabled with

$wgWBQualityConstraintsCheckFormatConstraint = false;

Can you insert a screenshot of the gadget in the issue on top?

Note: In the current version, if the check is not satisfied, the user gets shown a regex.

Basic Problems:

  • We can't expect users to know regex (of our 5 example users/personas, only 1 or 2 know what it is).
  • Even if you know regex, regexes are hard to read, even for experienced people.

So, usability heuristics to apply here:

  • "Match between system and the real world" (we should use concepts familiar to the user)
  • "Consistency and standards" – our other constraint infos are pretty easy to understand, this one is not
  • "Help users recognize, diagnose, and recover from errors: Error messages should be expressed in plain language (no codes), precisely indicate the problem, and constructively suggest a solution."

For the latter, we don't satisfy any of the user needs. We should:

  • Say that it did not match [whateveritchecksfor], e.g. "The check for a URL failed."
  • Say what the problem is: "It seems that your URL does not have an "https://" at the beginning."
  • Suggest a fix, like "Try adding http:// or https:// at the beginning, if the URL is otherwise correct."

So the error message would be: "Your input was checked and was not recognized as a URL."
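A rough sketch of what the three parts (what failed, what the problem is, how to fix it) could look like for the URL case. The function name, the exact wording, and the simple "starts with a scheme" test are all assumptions made for illustration. A real check would still be driven by the property's format constraint.

```python
# Hypothetical list of accepted prefixes for this sketch.
ALLOWED_SCHEMES = ("http://", "https://")


def explain_url_failure(value: str):
    """Return None if value starts like a URL, else a plain-language message.

    The message names the failed check, states the problem, and suggests
    a fix, without showing a regex to the user.
    """
    text = value.strip()
    if text.lower().startswith(ALLOWED_SCHEMES):
        return None
    return (
        "Your input was checked and was not recognized as a URL. "
        'It does not start with "http://" or "https://". '
        "Try adding one of them at the beginning if the URL is otherwise correct."
    )


if __name__ == "__main__":
    print(explain_url_failure("example.org/page"))
```

This only works when the kind of value is known in advance, which is the hard part: a regex alone does not say that it is meant to describe a URL.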

Maybe the qualifier "syntax clarification" could be displayed.

Sample from local dialing code (P473):

"string combining digits, spaces, - (All else excluded, such as: ,/;()+ )"

Alternatively, "string doesn't match expected form, see property constraint."
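The two suggestions can be combined into one fallback rule, sketched below: show the "syntax clarification" text when the property has one, otherwise show the generic sentence. The function name and the pattern used for the local dialing code are hypothetical, and Python's `re` stands in for the regex engine, which may behave differently from java.util.regex in some cases.

```python
import re

GENERIC_MESSAGE = "String doesn't match expected form, see property constraint."


def format_violation_message(value, pattern, syntax_clarification=None):
    """Return None if value fully matches pattern, else a message for the user.

    The human-readable clarification is preferred over the generic
    sentence, and the regex itself is never shown.
    """
    if re.fullmatch(pattern, value):
        return None
    if syntax_clarification:
        return "Expected format: " + syntax_clarification
    return GENERIC_MESSAGE


if __name__ == "__main__":
    # Hypothetical pattern for a local dialing code: digits, spaces, hyphens.
    pattern = r"[\d\- ]+"
    clarification = "string combining digits, spaces, - (All else excluded, such as: ,/;()+ )"
    print(format_violation_message("(030)", pattern, clarification))
    print(format_violation_message("(030)", pattern))
```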

Let’s move the discussion over to T170374: Format constraint UX, because this task has already been repurposed enough as it is.