It would be nice to have an idea of the percentage of queries that use the isBlank function. It might also be interesting to identify the tools using this function, so that we could contact their maintainers if we were to introduce a new function to replace isBlank.
Description
Status | Assigned | Task
---|---|---
Resolved | Gehel | T244590 [Epic] Rework the WDQS updater as an event driven application
Open | None | T244341 Stop using blank nodes for encoding SomeValue and OWL constraints in WDQS
Resolved | JAllemandou | T246237 Extract some statistics on the use of the isBlank() function in wdqs query logs
Event Timeline
As I was working on getting a better idea of the queries, I got some results relatively easily:
Since the beginning of the year:
- Internal cluster: No request using isBlank(), 481202298 requests total
- External cluster: 54669 requests using isBlank(), 202695416 requests total (0.03%)
I can provide more details as needed :)
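The shares above can be double-checked with a few lines of Python. This is only a minimal sketch: the request counts are copied from the figures above, and the helper name `share_percent` is made up for illustration.

```python
def share_percent(matching: int, total: int) -> float:
    """Return `matching` as a percentage of `total` (0.0 when total is 0)."""
    if total == 0:
        return 0.0
    return 100.0 * matching / total


# Counts taken from the figures reported above.
external = share_percent(54669, 202695416)
internal = share_percent(0, 481202298)

print(f"external cluster: {external:.3f}%")
print(f"internal cluster: {internal:.3f}%")
```

Rounded to two decimals, the external share comes out at the 0.03% quoted above.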
@Lea_Lacroix_WMDE the use of isBlank seems pretty low, do you think we should still try to identify bots by grouping by user-agent and see if something is identifiable?
Yes please :) It's a low percentage but it's still far from zero. Can we also look at the example queries?
Events using isBlank since the beginning of the year are now stored here: /user/joal/wdqs_queries/2020_use_isBlank/wdqs_use_is_blank_202002.json.
There are ~56k events, stored in JSON format in a single file to facilitate analysis.
As for example queries, there are a few (easily found with this search: https://www.wikidata.org/w/index.php?search=Wikidata%3Asubpageof%3ASPARQL_query_service%2Fqueries+insource%3Aisblank+inlanguage%3Aen&title=Special:Search&profile=default&fulltext=1):
- https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples#Humans_whose_gender_we_know_we_don't_know
- https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples/human#Humans_whose_gender_we_know_we_don't_know
- https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples/maintenance#Fathers_with_non-existent_or_unusual_gender_statements
@JAllemandou thanks for the dataset!
As for bot identification, I'll do some basic aggregation on user agent (UA) to see if something comes out.
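A basic user-agent aggregation over the stored events could look roughly like the sketch below. It rests on assumptions not confirmed in this task: that the file holds one JSON object per line, and that the user agent lives in a field called `user_agent` (a hypothetical name; adjust it to the real schema). The sample events are made up purely to show the shape of the output.

```python
import json
from collections import Counter
from typing import Iterable, Iterator, List, Tuple

# Hypothetical field name; the real event schema may differ.
UA_FIELD = "user_agent"


def read_events(lines: Iterable[str]) -> Iterator[dict]:
    """Parse one JSON object per non-empty line (assumed file layout)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def top_user_agents(events: Iterable[dict], n: int = 10) -> List[Tuple[str, int]]:
    """Count events per user agent and return the n most frequent."""
    counts = Counter(event.get(UA_FIELD) or "<missing>" for event in events)
    return counts.most_common(n)


# Made-up sample events, for illustration only.
sample_lines = [
    '{"user_agent": "ExampleBot/1.0", "query": "SELECT ?x WHERE { FILTER(isBlank(?x)) }"}',
    '{"user_agent": "ExampleBot/1.0", "query": "SELECT ?y WHERE { FILTER(isBlank(?y)) }"}',
    '{"user_agent": "SomeBrowser/5.0", "query": "SELECT ?z WHERE { FILTER(isBlank(?z)) }"}',
    '{"query": "SELECT ?w WHERE { FILTER(isBlank(?w)) }"}',
]

for agent, count in top_user_agents(read_events(sample_lines)):
    print(f"{count:6d}  {agent}")
```

On the real data, `read_events` would be fed the lines of the exported file instead of `sample_lines`. If a handful of agents account for most of the events, those are the tools whose maintainers are worth contacting.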