
dcausse (David Causse)
User

User Details

User Since
Jun 9 2015, 9:03 AM (259 w, 5 d)
Availability
Available
IRC Nick
dcausse
LDAP User
DCausse
MediaWiki User
DCausse (WMF) [ Global Accounts ]

Recent Activity

Thu, May 28

dcausse added a comment to T252091: RFC: Site-wide edit rate limiting with PoolCounter.

Load is average queue size, if you take the currently running batch as being part of the queue. WDQS currently does not monitor the queue size. I gather (after an hour or so of research, I'm new to all this) that with some effort, KafkaPoller could obtain an estimate of the queue size by subtracting the current partition offsets from KafkaConsumer.endOffsets().
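The estimate described in this comment can be sketched as follows. This is only an illustration, assuming the end offsets and the consumer's current positions have already been fetched (for example from KafkaConsumer.endOffsets() and the consumer's position per partition). The function name, the (topic, partition) key shape, and the sample numbers are hypothetical.

```python
from typing import Dict, Tuple

# A partition is identified here by (topic, partition number); this key shape is an assumption.
Partition = Tuple[str, int]


def estimate_queue_size(end_offsets: Dict[Partition, int],
                        positions: Dict[Partition, int]) -> int:
    """Sum, over all partitions, of (end offset - current position)."""
    total = 0
    for partition, end in end_offsets.items():
        # Simplifying assumption: a partition with no known position counts as caught up.
        current = positions.get(partition, end)
        # Clamp at zero in case the position was read after the end offset.
        total += max(end - current, 0)
    return total


# Made-up sample values, for illustration only.
end_offsets = {("events", 0): 120, ("events", 1): 80}
positions = {("events", 0): 100, ("events", 1): 80}
print(estimate_queue_size(end_offsets, positions))
```

The result is only an estimate: the end offsets and the positions are read at slightly different times, so the two snapshots are never perfectly consistent.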

Thu, May 28, 4:42 PM · Sustainability (Incident Prevention), User-Addshore, Wikidata-Campsite, Wikidata, TechCom-RFC
dcausse added a comment to T251497: Adapt munging process for SDoC.

I think that schema:ImageObject should be kept since we may have AudioObject and VideoObject

Thu, May 28, 8:58 AM · Wikidata, Wikidata-Query-Service
dcausse added a comment to T253753: Increase retention for mediawiki.revision-create on the kafka jumbo cluster.

@JAllemandou I think that is an option as well; the thing is that it is transitional, to help bootstrap a test of the full pipeline. In the end we won't be using jumbo and thus won't be able to rely on a 30-day retention on main, so hopefully we'll be able to reset the retention back to 7 days once we're done with the test.
To circumvent this particular problem (time to make the dumps available > retention) we could either:

  • send the events that matter back to kafka and have higher retention like you suggest
  • create a dedicated job running on the analytics network to read the events stored in HDFS and figure out a way to make the resulting data available in kafka main
Thu, May 28, 8:27 AM · Analytics-Kanban, Analytics, Wikidata-Query-Service, Wikidata

Wed, May 27

dcausse closed T253798: Commons RDF dump should use specific prefixes not the ones used by wikidata as Declined.

Wikibase now has a way to override the default namespaces; for commons it should happen thanks to https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/569260 .

Wed, May 27, 7:20 PM · Wikidata, WikibaseMediaInfo
dcausse added a comment to T221917: Create RDF dump of structured data on Commons.

@WMDE-leszek oops, sorry, I replied before reading your comment and was reading an old code base... if this is just a config change it can hopefully be merged soon. Thanks!

Wed, May 27, 7:17 PM · Patch-For-Review, Dumps-Generation, MW-1.34-notes (1.34.0-wmf.10; 2019-06-18), Wikidata-Query-Service, Commons, Wikidata
dcausse updated subscribers of T221917: Create RDF dump of structured data on Commons.

Looks like it was decided not to use wikidata-specific prefixes for MediaInfo exports but to use a more specific sdc prefix for these (see: T222995).
The code still looks to be hardcoded with wikidata-specific prefixes.
It does not look to me like we could make this happen quickly.
I created T253798 to track this work. Since it seems that some refactoring will have to happen (initially we thought it might just be a config change) I wonder if making the dumps available should be blocked by T253798 or go ahead and make them available with a short notice explaining that prefixes might change in the future with a link to that same ticket.
@CBogen I'll leave that decision to you.

Wed, May 27, 7:11 PM · Patch-For-Review, Dumps-Generation, MW-1.34-notes (1.34.0-wmf.10; 2019-06-18), Wikidata-Query-Service, Commons, Wikidata
dcausse created T253798: Commons RDF dump should use specific prefixes not the ones used by wikidata.
Wed, May 27, 7:07 PM · Wikidata, WikibaseMediaInfo
dcausse added a comment to T221917: Create RDF dump of structured data on Commons.

Just a note on the current problem:
the prefixes defined in ttl dumps are identical to the ones used by wikidata e.g.:

Wed, May 27, 5:40 PM · Patch-For-Review, Dumps-Generation, MW-1.34-notes (1.34.0-wmf.10; 2019-06-18), Wikidata-Query-Service, Commons, Wikidata
dcausse added a comment to T221917: Create RDF dump of structured data on Commons.

@ArielGlenn we plan to make a subtle change to the dump (prefixes), this won't be technically a breaking change but could cause some confusion if users start to assume the presence of some prefixes. Would it be possible to pause the publication of the dumps while we change this? Sorry for the late notice.

Wed, May 27, 3:29 PM · Patch-For-Review, Dumps-Generation, MW-1.34-notes (1.34.0-wmf.10; 2019-06-18), Wikidata-Query-Service, Commons, Wikidata
dcausse added a subtask for T244590: EPIC: Rework the WDQS updater as an event driven application: T253753: Increase retention for mediawiki.revision-create on the kafka jumbo cluster.
Wed, May 27, 1:37 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service, Epic
dcausse added a parent task for T253753: Increase retention for mediawiki.revision-create on the kafka jumbo cluster: T244590: EPIC: Rework the WDQS updater as an event driven application.
Wed, May 27, 1:37 PM · Analytics-Kanban, Analytics, Wikidata-Query-Service, Wikidata
dcausse created T253753: Increase retention for mediawiki.revision-create on the kafka jumbo cluster.
Wed, May 27, 1:37 PM · Analytics-Kanban, Analytics, Wikidata-Query-Service, Wikidata

Tue, May 26

dcausse created P11301 SEfC federated query.
Tue, May 26, 7:37 AM

Mon, May 25

dcausse added a comment to T251497: Adapt munging process for SDoC.

The munger should exclude rdf:type statements by default:

SELECT ?o {
  wd:M19705716 a ?o .
}

returns:

schema:ImageObject
schema:MediaObject
wikibase:Mediainfo
Mon, May 25, 4:15 PM · Wikidata, Wikidata-Query-Service

Tue, May 19

dcausse closed T230754: WDQS labs role role::wdqs::labs fails when not finding /srv/wdqs as Resolved.
Tue, May 19, 7:18 PM · Wikidata, Wikidata-Query-Service

Fri, May 15

Mahir256 awarded T243292: Fix the munger to support commons RDF dump a Party Time token.
Fri, May 15, 7:17 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service

Thu, May 14

dcausse moved T251275: [WDQS Streaming Updater] Update blazegraph based on the content present in the streaming updater output kafka stream from In Progress to Needs review on the Discovery-Search (Current work) board.
Thu, May 14, 6:46 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
dcausse moved T243292: Fix the munger to support commons RDF dump from In Progress to To Be Deployed on the Discovery-Search (Current work) board.
Thu, May 14, 4:40 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
dcausse assigned T243292: Fix the munger to support commons RDF dump to Zbyszko.
Thu, May 14, 4:40 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service

Tue, May 12

dcausse added a comment to T244341: Stop using blank nodes for encoding SomeValue and OWL constraints in WDQS.

I thus view it misleading to state in this Phabricator ticket that "performance issues [of the WDQS] cause edits on wikidata to be throttled", which gives the impression that the WDQS forms a part of the Wikidata editing process or some other essential part of Wikidata itself.

Tue, May 12, 2:29 PM · Community-consensus-needed, Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
dcausse added a comment to T244341: Stop using blank nodes for encoding SomeValue and OWL constraints in WDQS.

I was completely unaware that WDQS is so integrated into the inner workings of Wikidata. Where is this described? Was this mentioned in the announcement of the proposed change?

Tue, May 12, 8:13 AM · Community-consensus-needed, Discovery-Search (Current work), Wikidata-Query-Service, Wikidata

Mon, May 11

dcausse added a comment to T244341: Stop using blank nodes for encoding SomeValue and OWL constraints in WDQS.

If 'unskolemizing' is a trivial step then that should be implemented by WDQS, instead of pushing it to every consumer (including indirect consumers) of Wikidata information, if this change is simply a change to make WDQS work faster.

Mon, May 11, 2:02 PM · Community-consensus-needed, Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
dcausse added a comment to T244341: Stop using blank nodes for encoding SomeValue and OWL constraints in WDQS.

Is anyone proposing a change to Wikibase (or Wikidata)?

Yes – the goal is that the RDF in the query service, the RDF dumps, and the output of Special:EntityData all change.

Mon, May 11, 12:17 PM · Community-consensus-needed, Discovery-Search (Current work), Wikidata-Query-Service, Wikidata

Fri, May 8

Harej awarded T244590: EPIC: Rework the WDQS updater as an event driven application a Like token.
Fri, May 8, 4:35 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service, Epic

Wed, May 6

dcausse closed T169798: Create UDFs for analyzing SPARQL queries, a subtask of T143819: Data request for logs from SparQL interface at query.wikidata.org, as Declined.
Wed, May 6, 12:59 PM · Analytics, Discovery, Wikidata-Query-Service, Wikidata
dcausse closed T169798: Create UDFs for analyzing SPARQL queries as Declined.

Closing this, @JAllemandou has done plenty of work on this already.

Wed, May 6, 12:58 PM · User-Smalyshev, Discovery, Wikidata, Wikidata-Query-Service
dcausse added a comment to T249260: SUPPORT: wikibase update from 1.33 to 1.34 error message elastic search.

Is there a way to reindex?

If the index already exists perhaps forcing a reindex might help.
For this you need to run:
php updateSearchIndexConfig.php --reindexAndRemoveOk --indexIdentifier now

Wed, May 6, 12:01 PM · Wikidata-Campsite (Wikidata-Campsite-Iteration-∞), User-Addshore, Wikibase-Containers, Wikidata

Tue, May 5

dcausse moved T245541: Add a new munge option to do blank node skolemization from Needs review to To Be Deployed on the Discovery-Search (Current work) board.
Tue, May 5, 12:54 PM · Wikidata-Campsite (Wikidata-Campsite-Iteration-∞), Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
dcausse claimed T251275: [WDQS Streaming Updater] Update blazegraph based on the content present in the streaming updater output kafka stream.
Tue, May 5, 12:49 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
dcausse closed T249097: [WDQS Streaming Updater] Fix pipeline checkpointing, a subtask of T244590: EPIC: Rework the WDQS updater as an event driven application, as Declined.
Tue, May 5, 12:47 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service, Epic
dcausse closed T249097: [WDQS Streaming Updater] Fix pipeline checkpointing as Declined.

checkpointing works as expected now

Tue, May 5, 12:47 PM · Wikidata, Wikidata-Query-Service
dcausse added a comment to T249099: [WDQS Streaming Updater] Error during munging process.

happened a couple of times on a test run:

FailedOp(FullImport(Q93246620,2020-05-04T12:57:49Z,1173447691),org.wikidata.query.rdf.tool.exception.ContainedException: Didn't get a revision id for [])
FailedOp(FullImport(Q12439094,2020-05-04T13:23:11Z,1173459859),org.wikidata.query.rdf.tool.exception.ContainedException: Didn't get a revision id for [])
FailedOp(Diff(Q93114550,2020-05-04T14:37:37Z,1173500364,1173485266),org.wikidata.query.rdf.tool.exception.ContainedException: Didn't get a revision id for [])
FailedOp(Diff(Q93265456,2020-05-04T15:13:36Z,1173518285,1173505327),org.wikidata.query.rdf.tool.exception.ContainedException: Didn't get a revision id for [])
FailedOp(Diff(Q93265456,2020-05-04T15:16:04Z,1173520772,1173518285),org.wikidata.query.rdf.tool.exception.ContainedException: Didn't get a revision id for [])
FailedOp(Diff(Q93265456,2020-05-04T15:36:20Z,1173531303,1173520772),org.wikidata.query.rdf.tool.exception.ContainedException: Didn't get a revision id for [])
FailedOp(Diff(Q93335412,2020-05-05T09:49:16Z,1174201293,1174194685),org.wikidata.query.rdf.tool.exception.ContainedException: Didn't get a revision id for [])
FailedOp(Diff(Q93335412,2020-05-05T09:49:16Z,1174201299,1174201293),org.wikidata.query.rdf.tool.exception.ContainedException: Didn't get a revision id for [])
FailedOp(Diff(Q93248652,2020-05-05T09:48:09Z,1174200244,1173531244),org.wikidata.query.rdf.tool.exception.ContainedException: Didn't get a revision id for [])
Tue, May 5, 12:42 PM · Wikidata, Wikidata-Query-Service
dcausse moved T251270: The streaming updater should produce its events to kafka from In Progress to Needs review on the Discovery-Search (Current work) board.
Tue, May 5, 12:36 PM · Patch-For-Review, Discovery-Search (Current work), Wikidata, Wikidata-Query-Service

Apr 30 2020

dcausse updated the task description for T245541: Add a new munge option to do blank node skolemization.
Apr 30 2020, 5:19 PM · Wikidata-Campsite (Wikidata-Campsite-Iteration-∞), Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
dcausse updated the task description for T245541: Add a new munge option to do blank node skolemization.
Apr 30 2020, 5:18 PM · Wikidata-Campsite (Wikidata-Campsite-Iteration-∞), Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
dcausse added a comment to T244341: Stop using blank nodes for encoding SomeValue and OWL constraints in WDQS.

I don't understand why it was considered necessary to make a breaking change to the RDF dump to improve WDQS performance when there is a solution that does not make a breaking change to the dump.

Apr 30 2020, 5:13 PM · Community-consensus-needed, Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
dcausse renamed T244341: Stop using blank nodes for encoding SomeValue and OWL constraints in WDQS from Wikibase RDF dump: stop using blank nodes for encoding SomeValue and OWL constraints to Stop using blank nodes for encoding SomeValue and OWL constraints in WDQS.
Apr 30 2020, 5:12 PM · Community-consensus-needed, Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
dcausse added a comment to T244341: Stop using blank nodes for encoding SomeValue and OWL constraints in WDQS.

@Multichill the discussion seems to have stalled. Thanks to Peter the pros and cons have been well summarized now. I also understand that part of the misunderstanding of this change was the lack of clarity on the motivations as to why we require a breaking change like that. I hope it has been addressed in the linked discussion.
Do you have additional comments to make here? Thanks!

Apr 30 2020, 12:39 PM · Community-consensus-needed, Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
dcausse triaged T251387: Missing sitelinks for some wikibase items as High priority.
Apr 30 2020, 9:23 AM · Wikidata, Wikidata-Query-Service
dcausse added a comment to T251387: Missing sitelinks for some wikibase items.

I think the best approach here is to wait for the cleanup in T249613 and its report, then make sure that true duplicates are removed, and then schedule a new full reload of all the servers.
In the meantime, items can be fixed manually by doing a null edit; this is far from ideal but I don't think we have a better option at the moment.

Apr 30 2020, 9:14 AM · Wikidata, Wikidata-Query-Service

Apr 29 2020

dcausse added a comment to T251387: Missing sitelinks for some wikibase items.

@Peter_James thanks! The current update strategy assumes that entity <> sitelink pairs are unique and thus when a sitelink is removed it blindly assumes that it's not used elsewhere. Not doing so would require a much more costly update process that would have to verify if it's being used by other entities.
T44325 (perhaps exacerbated by T249565) is probably the root cause.
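The failure mode described in this comment can be illustrated with a toy model. This is a sketch only, not the actual updater code: the triple layout, the function names, and the sample data are all assumptions made for illustration.

```python
# Toy triple store: a set of (subject, predicate, object) tuples.
# The layout below is a hypothetical simplification of sitelink data.

def blind_remove(triples, entity, sitelink):
    """Drop every triple about the sitelink, assuming no other entity uses it."""
    return {t for t in triples if t[0] != sitelink}


def checked_remove(triples, entity, sitelink):
    """Drop only this entity's link; keep shared data if another entity still uses it."""
    remaining = triples - {(sitelink, "schema:about", entity)}
    still_used = any(s == sitelink and p == "schema:about" for s, p, _ in remaining)
    if still_used:
        return remaining
    return {t for t in remaining if t[0] != sitelink}


link = "https://en.wikipedia.org/wiki/Example"
triples = {
    (link, "schema:about", "Q1"),
    (link, "schema:about", "Q2"),  # duplicate: the same sitelink on a second item
    (link, "schema:name", "Example"),
}

after_blind = blind_remove(triples, "Q1", link)
after_checked = checked_remove(triples, "Q1", link)
print(len(after_blind), len(after_checked))
```

In this toy model the blind variant also deletes the sitelink data still needed by the second item, while the checked variant needs an extra lookup on every removal, which is the additional cost mentioned in the comment.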

Apr 29 2020, 12:29 PM · Wikidata, Wikidata-Query-Service
dcausse updated the task description for T251387: Missing sitelinks for some wikibase items.
Apr 29 2020, 10:26 AM · Wikidata, Wikidata-Query-Service
dcausse updated the task description for T251387: Missing sitelinks for some wikibase items.
Apr 29 2020, 10:25 AM · Wikidata, Wikidata-Query-Service
dcausse created T251387: Missing sitelinks for some wikibase items.
Apr 29 2020, 10:23 AM · Wikidata, Wikidata-Query-Service

Apr 28 2020

dcausse added a comment to P11067 Unit entity vs propety confusion (P199 vs Q199) on reloaded test server.

running:

curl -d 'query=SELECT * WHERE { wd:Q163320 p:P1106 / psv:P1106 ?a .  ?a wikibase:quantityUnit ?unit .  }&format=json' http://localhost/bigdata/namespace/wdq/sparql

on wdqs1010.eqiad.wmnet.
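For reference, the same request could be prepared from Python roughly as below. This is a sketch using only the standard library; the endpoint and query are copied from the curl command above, and the helper name build_request is hypothetical. Unlike curl -d, urlencode percent-encodes the form values.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint taken from the curl command above; only reachable on the host itself.
ENDPOINT = "http://localhost/bigdata/namespace/wdq/sparql"

QUERY = (
    "SELECT * WHERE { wd:Q163320 p:P1106 / psv:P1106 ?a . "
    "?a wikibase:quantityUnit ?unit . }"
)


def build_request(query: str, endpoint: str = ENDPOINT) -> Request:
    """Build (but do not send) a form-encoded POST carrying the SPARQL query."""
    body = urlencode({"query": query, "format": "json"}).encode("utf-8")
    return Request(endpoint, data=body, method="POST")


req = build_request(QUERY)
print(req.get_method(), req.full_url)
```

Sending it with urllib.request.urlopen(req) would only work where that local endpoint is actually reachable, so the example stops at building the request.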

Apr 28 2020, 7:22 PM
dcausse created P11067 Unit entity vs propety confusion (P199 vs Q199) on reloaded test server.
Apr 28 2020, 7:09 PM
dcausse renamed T251275: [WDQS Streaming Updater] Update blazegraph based on the content present in the streaming updater output kafka stream from Add a new updater component to update blazegraph based on the content present in the streaming updater output kafka stream to Update blazegraph based on the content present in the streaming updater output kafka stream.
Apr 28 2020, 1:56 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
dcausse added a subtask for T244590: EPIC: Rework the WDQS updater as an event driven application: T251275: [WDQS Streaming Updater] Update blazegraph based on the content present in the streaming updater output kafka stream.
Apr 28 2020, 1:54 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service, Epic
dcausse added a parent task for T251275: [WDQS Streaming Updater] Update blazegraph based on the content present in the streaming updater output kafka stream: T244590: EPIC: Rework the WDQS updater as an event driven application.
Apr 28 2020, 1:54 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
dcausse created T251275: [WDQS Streaming Updater] Update blazegraph based on the content present in the streaming updater output kafka stream.
Apr 28 2020, 1:54 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
dcausse moved T248464: [WDQS Streaming Updater] Implement ouput format in Streaming Updater from In Progress to Needs review on the Discovery-Search (Current work) board.
Apr 28 2020, 1:43 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
dcausse triaged T251270: The streaming updater should produce its events to kafka as Medium priority.
Apr 28 2020, 1:43 PM · Patch-For-Review, Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
dcausse added a subtask for T244590: EPIC: Rework the WDQS updater as an event driven application: T251270: The streaming updater should produce its events to kafka.
Apr 28 2020, 1:42 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service, Epic
dcausse added a parent task for T251270: The streaming updater should produce its events to kafka: T244590: EPIC: Rework the WDQS updater as an event driven application.
Apr 28 2020, 1:42 PM · Patch-For-Review, Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
dcausse created T251270: The streaming updater should produce its events to kafka.
Apr 28 2020, 1:42 PM · Patch-For-Review, Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
dcausse added a project to T251257: Create a Kerberos identity for zpapierski: Analytics.
Apr 28 2020, 12:24 PM · Analytics
dcausse triaged T242453: Deadlock in blazegraph blocking all queries and updates as High priority.

Raising to high, this issue might be hard to solve as it sounds related to the blazegraph design flaw of running with unbounded thread pools.
We might perhaps at least try to add some debugging code to isolate the request that's causing the deadlock and see if we find a pattern.

Apr 28 2020, 9:44 AM · Wikidata, Wikidata-Query-Service
dcausse added a subtask for T251149: [epic] Ryan's onboarding to the Search Platform team: T250140: icinga: WDQS high update lag should alert when the service times out.
Apr 28 2020, 9:36 AM · Discovery-Search (Current work), Epic
dcausse added a parent task for T250140: icinga: WDQS high update lag should alert when the service times out: T251149: [epic] Ryan's onboarding to the Search Platform team.
Apr 28 2020, 9:36 AM · observability, Wikidata-Query-Service, Wikidata
dcausse triaged T250140: icinga: WDQS high update lag should alert when the service times out as High priority.
Apr 28 2020, 9:36 AM · observability, Wikidata-Query-Service, Wikidata

Apr 27 2020

dcausse moved T245728: Add a component to generate a diff between two entity revisions from In Progress to Needs review on the Discovery-Search (Current work) board.
Apr 27 2020, 5:42 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service

Apr 24 2020

Addshore awarded T242453: Deadlock in blazegraph blocking all queries and updates a The World Burns token.
Apr 24 2020, 3:13 PM · Wikidata, Wikidata-Query-Service

Apr 21 2020

dcausse moved T245541: Add a new munge option to do blank node skolemization from In Progress to Needs review on the Discovery-Search (Current work) board.
Apr 21 2020, 4:38 PM · Wikidata-Campsite (Wikidata-Campsite-Iteration-∞), Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
dcausse triaged T248464: [WDQS Streaming Updater] Implement ouput format in Streaming Updater as Medium priority.
Apr 21 2020, 4:38 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
dcausse triaged T245728: Add a component to generate a diff between two entity revisions as Medium priority.
Apr 21 2020, 4:35 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
dcausse claimed T245728: Add a component to generate a diff between two entity revisions.
Apr 21 2020, 4:35 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
dcausse created T250806: Fix CirrusSearch maint scripts call sites to use file names compliant with autoloader.
Apr 21 2020, 2:05 PM · MW-1.35-notes (1.35.0-wmf.34; 2020-05-26), Discovery-Search (Current work), CirrusSearch
dcausse updated the title for P11033 cindy and maint script namespaces from untitled to cindy and maint script namespaces.
Apr 21 2020, 1:41 PM
dcausse created P11033 cindy and maint script namespaces.
Apr 21 2020, 1:40 PM
dcausse added a comment to T221917: Create RDF dump of structured data on Commons.

@ArielGlenn no, not yet; this is still blocked on T243292, which requires some investigation to determine which component (dump or the wdqs transformation process) is wrong.

Apr 21 2020, 9:00 AM · Patch-For-Review, Dumps-Generation, MW-1.34-notes (1.34.0-wmf.10; 2019-06-18), Wikidata-Query-Service, Commons, Wikidata

Apr 20 2020

dcausse created P11026 import wikidata ttl dumps to hdfs.
Apr 20 2020, 5:57 PM
dcausse added a comment to T228348: Category graph includes deleted categories.

merged in T246568 which is where we'll announce that the full reload has been done.

Apr 20 2020, 5:39 PM · Discovery-Search (Current work), MW-1.34-notes (1.34.0-wmf.21; 2019-09-03), User-Smalyshev, Wikidata-Query-Service, Wikidata
dcausse merged T228348: Category graph includes deleted categories into T246568: Deepcategory returns only very few results.
Apr 20 2020, 5:37 PM · Discovery-Search (Current work), CirrusSearch, Commons
dcausse merged task T228348: Category graph includes deleted categories into T246568: Deepcategory returns only very few results.
Apr 20 2020, 5:37 PM · Discovery-Search (Current work), MW-1.34-notes (1.34.0-wmf.21; 2019-09-03), User-Smalyshev, Wikidata-Query-Service, Wikidata
dcausse moved T246882: commonswiki shard size grew more than 50G in eqiad and codfw from To Be Deployed to Done on the Discovery-Search (Current work) board.
Apr 20 2020, 5:30 PM · Discovery-Search (Current work), Elasticsearch, Discovery
dcausse moved T240550: Add mapping for ORES topic field in ElasticSearch from To Be Deployed to Done on the Discovery-Search (Current work) board.
Apr 20 2020, 5:30 PM · MW-1.35-notes (1.35.0-wmf.15; 2020-01-14), Discovery-Search (Current work)
dcausse moved T249196: Test the impact of the wdqs updater performance by disabling values cleanup from To Be Deployed to Done on the Discovery-Search (Current work) board.
Apr 20 2020, 5:30 PM · Wikidata, Wikidata-Query-Service, Discovery-Search (Current work)
dcausse moved T248101: WDQS query logs lack http.client_ip from To Be Deployed to Done on the Discovery-Search (Current work) board.
Apr 20 2020, 5:30 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
dcausse removed a subtask for T244590: EPIC: Rework the WDQS updater as an event driven application: T243603: Create a way to deploy WDQS artifacts to Archiva with Jenkins.
Apr 20 2020, 4:19 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service, Epic
dcausse removed a parent task for T243603: Create a way to deploy WDQS artifacts to Archiva with Jenkins: T244590: EPIC: Rework the WDQS updater as an event driven application.
Apr 20 2020, 4:19 PM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
dcausse added a comment to T242453: Deadlock in blazegraph blocking all queries and updates.

This just happened again, so depooled and restarted 1006, and switched traffic over to codfw.
Seems to always be 1006?

Apr 20 2020, 3:29 PM · Wikidata, Wikidata-Query-Service
dcausse created P11024 cindy failure.
Apr 20 2020, 2:11 PM

Apr 18 2020

William_Avery awarded T250140: icinga: WDQS high update lag should alert when the service times out a Orange Medal token.
Apr 18 2020, 3:16 PM · observability, Wikidata-Query-Service, Wikidata
eranroz awarded T242453: Deadlock in blazegraph blocking all queries and updates a Burninate token.
Apr 18 2020, 2:37 PM · Wikidata, Wikidata-Query-Service

Apr 17 2020

dcausse added a comment to T244341: Stop using blank nodes for encoding SomeValue and OWL constraints in WDQS.

Many queries use the optimizer hint hint:Prior hint:rangeSafe true. when e.g. comparing date or number values with constants in a filter, as suggested at https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/query_optimization#Fixed_values_and_ranges. Is there a risk that such queries will fail or give wrong results when somevalue values become IRIs, and thus the values will be of different types?

Apr 17 2020, 2:26 PM · Community-consensus-needed, Discovery-Search (Current work), Wikidata-Query-Service, Wikidata

Apr 16 2020

dcausse added a comment to T244341: Stop using blank nodes for encoding SomeValue and OWL constraints in WDQS.

Yes, isLiteral should still work for properties where the real values are literals. Without knowing the internal workings of Blazegraph I would guess that it is more efficient than STRSTARTS( STR(?o), 'http://www.wikidata.org/prop/somevalue/' ) . Maybe that could be used in some way?

Apr 16 2020, 3:04 PM · Community-consensus-needed, Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
dcausse added projects to T192297: Deepcategory results in empty results if trying to exclude a category: Discovery-Search, CirrusSearch.
Apr 16 2020, 12:52 PM · CirrusSearch, Discovery-Search, Discovery

Apr 15 2020

dcausse closed T239397: Wikibase RDF output does not link to the same blank node truthy statement values and the value from the reified statement as Declined.
Apr 15 2020, 12:32 PM · MediaWiki-extensions-WikibaseRepository, Wikidata-Query-Service, Wikidata

Apr 14 2020

dcausse added a comment to T246568: Deepcategory returns only very few results.

We deployed a fix to blazegraph so the wikidata/query/deploy patch should no longer be needed. The last failure occurred around 2020-03-21 and the fix was deployed last week on 2020-04-09, so it's probably too early to claim victory.
Next step is to do a full reload of the category graph and monitor carefully that the daily dumps are properly applied.

Apr 14 2020, 10:18 AM · Discovery-Search (Current work), CirrusSearch, Commons
Addshore awarded T250140: icinga: WDQS high update lag should alert when the service times out a Like token.
Apr 14 2020, 9:56 AM · observability, Wikidata-Query-Service, Wikidata
dcausse created T250140: icinga: WDQS high update lag should alert when the service times out.
Apr 14 2020, 9:48 AM · observability, Wikidata-Query-Service, Wikidata
dcausse awarded T250137: Requesting access to wdqs-admins for Addshore a Like token.
Apr 14 2020, 8:47 AM · User-Addshore, Operations, SRE-Access-Requests

Apr 10 2020

dcausse created P10951 Wikibase/munge issues.
Apr 10 2020, 9:11 AM

Apr 8 2020

dcausse moved T249435: Search index for page 5 days out of date from In Progress to Needs review on the Discovery-Search (Current work) board.
Apr 8 2020, 7:29 PM · Discovery-Search (Current work)
dcausse triaged T248363: Haslabel treats aliases as labels as Medium priority.

Aliases were put in the labels field for performance reasons; we need to investigate whether it's feasible or not to add an alias field (note that due to the multilingual nature of wikidata this is something we have to ponder carefully because the current strategy is to add one field per language).

Apr 8 2020, 6:40 PM · Wikidata, CirrusSearch, Discovery-Search
dcausse triaged T248365: New keyword hasalias and inalias as Medium priority.
Apr 8 2020, 6:37 PM · CirrusSearch, Discovery-Search, Wikidata
dcausse moved T248740: No instant search suggestion shown for a specific user after entering search terms on www.wikipedia.org from needs triage to UI tickets on the Discovery-Search board.
Apr 8 2020, 6:36 PM · Patch-For-Review, Discovery-Search, User-DannyS712, Wikimedia-Portals
dcausse moved T248618: logspam: ReindexTask.php causing a bunch of Undefined index notices from elastic / cirrus to Current work on the Discovery-Search board.
Apr 8 2020, 6:33 PM · MW-1.35-notes (1.35.0-wmf.28; 2020-04-14), Discovery-Search (Current work), CirrusSearch, Wikimedia-production-error
dcausse moved T248618: logspam: ReindexTask.php causing a bunch of Undefined index notices from needs triage to elastic / cirrus on the Discovery-Search board.
Apr 8 2020, 6:33 PM · MW-1.35-notes (1.35.0-wmf.28; 2020-04-14), Discovery-Search (Current work), CirrusSearch, Wikimedia-production-error
dcausse moved T249435: Search index for page 5 days out of date from elastic / cirrus to Current work on the Discovery-Search board.
Apr 8 2020, 6:31 PM · Discovery-Search (Current work)