Smalyshev (Stas Malyshev)
Engineer in Search Platform team

Projects (7)

User Details

User Since
Nov 28 2014, 7:04 AM (250 w, 4 d)
Availability
Available
IRC Nick
Smalyshev
LDAP User
Smalyshev
MediaWiki User
Laboramus

Recent Activity

Mon, Sep 9

Smalyshev added a comment to T232212: QuantityValue quantityUnit contains both Q and P value in Wikidata Query Service - P value is wrong.

I don't think it's worth bothering with depooling. Unless the number of affected items is very large, it should be quick enough that nobody would really notice.

Mon, Sep 9, 4:48 PM · Wikidata-Query-Service, Wikidata

Sat, Sep 7

Smalyshev updated subscribers of T232212: QuantityValue quantityUnit contains both Q and P value in Wikidata Query Service - P value is wrong.
Sat, Sep 7, 5:01 AM · Wikidata-Query-Service, Wikidata
Smalyshev updated subscribers of T232212: QuantityValue quantityUnit contains both Q and P value in Wikidata Query Service - P value is wrong.
Sat, Sep 7, 5:00 AM · Wikidata-Query-Service, Wikidata
Smalyshev added a comment to T232212: QuantityValue quantityUnit contains both Q and P value in Wikidata Query Service - P value is wrong.

This may happen because value nodes are not updated when data is updated (since they are supposed to be immutable). So if some bad data sneaked in while the problem was there, the bad value nodes (and possibly reference nodes, since they behave the same way) are still there. The best way to do it would be:

Sat, Sep 7, 5:00 AM · Wikidata-Query-Service, Wikidata

Wed, Sep 4

Smalyshev placed T221917: Create RDF dump of structured data on Commons up for grabs.
Wed, Sep 4, 5:52 AM · Dumps-Generation, MW-1.34-notes (1.34.0-wmf.10; 2019-06-18), Patch-For-Review, WikibaseMediaInfo, Wikidata-Query-Service, SDC General, Commons, Wikidata
Smalyshev added a comment to T221631: Dedicated servers on WMCS to test WDQS scalability strategy.

Both evaluating Virtuoso and other solutions (like JanusGraph) would require that. @Gehel should know the details.

Wed, Sep 4, 12:55 AM · cloud-services-team (Kanban), Wikidata, Wikidata-Query-Service, Discovery-Search

Thu, Aug 29

Smalyshev moved T159723: NotMaterializedException when one branch of UNION binds ?variable and other branch binds ?variableLabel and label service is used from Needs review to Done on the Discovery-Wikidata-Query-Service-Sprint board.
Thu, Aug 29, 9:44 PM · Discovery-Wikidata-Query-Service-Sprint, Upstream, Discovery, Wikidata-Query-Service, Wikidata
Smalyshev moved T170704: NME when using label service and rdfs:label predicate with the same variable from Needs review to Done on the Discovery-Wikidata-Query-Service-Sprint board.
Thu, Aug 29, 9:44 PM · Discovery-Wikidata-Query-Service-Sprint, Upstream, Wikidata-Query-Service, Wikidata, Discovery
Smalyshev removed a project from T168876: MWAPI service throws “could not find binding for parameter” if optimizer is not disabled: Patch-For-Review.
Thu, Aug 29, 9:42 PM · Discovery-Wikidata-Query-Service-Sprint, WDQS-Optimizer, Upstream, Wikidata-Query-Service, Discovery, Wikidata
Smalyshev moved T168876: MWAPI service throws “could not find binding for parameter” if optimizer is not disabled from Backlog to Done on the Discovery-Wikidata-Query-Service-Sprint board.
Thu, Aug 29, 9:42 PM · Discovery-Wikidata-Query-Service-Sprint, WDQS-Optimizer, Upstream, Wikidata-Query-Service, Discovery, Wikidata
Smalyshev added a project to T168876: MWAPI service throws “could not find binding for parameter” if optimizer is not disabled: Discovery-Wikidata-Query-Service-Sprint.
Thu, Aug 29, 9:41 PM · Discovery-Wikidata-Query-Service-Sprint, WDQS-Optimizer, Upstream, Wikidata-Query-Service, Discovery, Wikidata
Smalyshev moved T170704: NME when using label service and rdfs:label predicate with the same variable from Backlog to Needs review on the Discovery-Wikidata-Query-Service-Sprint board.
Thu, Aug 29, 9:41 PM · Discovery-Wikidata-Query-Service-Sprint, Upstream, Wikidata-Query-Service, Wikidata, Discovery
Smalyshev added a project to T170704: NME when using label service and rdfs:label predicate with the same variable: Discovery-Wikidata-Query-Service-Sprint.
Thu, Aug 29, 9:40 PM · Discovery-Wikidata-Query-Service-Sprint, Upstream, Wikidata-Query-Service, Wikidata, Discovery
Smalyshev moved T165559: HAVING in named subquery results in “non-aggregate variable in select expression” error from Needs review to Done on the Discovery-Wikidata-Query-Service-Sprint board.
Thu, Aug 29, 9:39 PM · Discovery-Wikidata-Query-Service-Sprint, Upstream, Discovery, Wikidata-Query-Service, Wikidata
Smalyshev moved T168741: SELECT * on query with no variables and property path results in NotMaterializedException from Needs review to Done on the Discovery-Wikidata-Query-Service-Sprint board.
Thu, Aug 29, 9:39 PM · Discovery-Wikidata-Query-Service-Sprint, Upstream, Wikidata-Query-Service, Discovery, Wikidata
Smalyshev moved T173243: UnsupportedOperationException on property path in EXISTS from Needs review to Done on the Discovery-Wikidata-Query-Service-Sprint board.
Thu, Aug 29, 9:39 PM · Discovery-Wikidata-Query-Service-Sprint, Upstream, Wikidata, Discovery, Wikidata-Query-Service
Smalyshev moved T172113: ConcurrentModificationException on non-grouping query with aggregates in SELECT from Needs review to Done on the Discovery-Wikidata-Query-Service-Sprint board.
Thu, Aug 29, 9:39 PM · Discovery-Wikidata-Query-Service-Sprint, Upstream, Wikidata-Query-Service, Discovery, Wikidata
Smalyshev moved T173243: UnsupportedOperationException on property path in EXISTS from Backlog to Needs review on the Discovery-Wikidata-Query-Service-Sprint board.
Thu, Aug 29, 7:44 AM · Discovery-Wikidata-Query-Service-Sprint, Upstream, Wikidata, Discovery, Wikidata-Query-Service
Smalyshev moved T231411: Test new Updater service from Backlog to In progress on the Discovery-Wikidata-Query-Service-Sprint board.
Thu, Aug 29, 7:44 AM · Discovery-Wikidata-Query-Service-Sprint, Performance, Wikidata-Query-Service, Wikidata
Smalyshev moved T228348: Category graph includes deleted categories from Needs review to Backlog on the Discovery-Wikidata-Query-Service-Sprint board.
Thu, Aug 29, 7:44 AM · MW-1.34-notes (1.34.0-wmf.21; 2019-09-03), User-Smalyshev, Discovery-Wikidata-Query-Service-Sprint, Wikidata, Wikidata-Query-Service
Smalyshev placed T228348: Category graph includes deleted categories up for grabs.
Thu, Aug 29, 7:44 AM · MW-1.34-notes (1.34.0-wmf.21; 2019-09-03), User-Smalyshev, Discovery-Wikidata-Query-Service-Sprint, Wikidata, Wikidata-Query-Service
Smalyshev added a comment to T231515: Duplicate wdno: clauses on edited properties.

Looks like new updater actually handles it better, but we need to verify that.

Thu, Aug 29, 7:04 AM · Wikidata, Wikidata-Query-Service
Smalyshev created T231515: Duplicate wdno: clauses on edited properties.
Thu, Aug 29, 7:04 AM · Wikidata, Wikidata-Query-Service
Smalyshev triaged T231411: Test new Updater service as Normal priority.
Thu, Aug 29, 6:12 AM · Discovery-Wikidata-Query-Service-Sprint, Performance, Wikidata-Query-Service, Wikidata
Smalyshev added a comment to T231411: Test new Updater service.

Loading 1 hour 25 mins of updates from 201908010000 under both updaters shows no differences except ones that can be attributed to edits (since we always load the latest version on old changes). So this first test seems to be a success.

Thu, Aug 29, 6:10 AM · Discovery-Wikidata-Query-Service-Sprint, Performance, Wikidata-Query-Service, Wikidata
Smalyshev updated the task description for T231411: Test new Updater service.
Thu, Aug 29, 6:06 AM · Discovery-Wikidata-Query-Service-Sprint, Performance, Wikidata-Query-Service, Wikidata
Smalyshev added a comment to T231411: Test new Updater service.

Procedure for comparing journals:

Thu, Aug 29, 5:45 AM · Discovery-Wikidata-Query-Service-Sprint, Performance, Wikidata-Query-Service, Wikidata
Smalyshev updated the task description for T168876: MWAPI service throws “could not find binding for parameter” if optimizer is not disabled.
Thu, Aug 29, 5:22 AM · Discovery-Wikidata-Query-Service-Sprint, WDQS-Optimizer, Upstream, Wikidata-Query-Service, Discovery, Wikidata

Wed, Aug 28

Smalyshev moved T168741: SELECT * on query with no variables and property path results in NotMaterializedException from Backlog to Needs review on the Discovery-Wikidata-Query-Service-Sprint board.
Wed, Aug 28, 11:02 PM · Discovery-Wikidata-Query-Service-Sprint, Upstream, Wikidata-Query-Service, Discovery, Wikidata
Smalyshev moved T172113: ConcurrentModificationException on non-grouping query with aggregates in SELECT from Backlog to Needs review on the Discovery-Wikidata-Query-Service-Sprint board.
Wed, Aug 28, 11:02 PM · Discovery-Wikidata-Query-Service-Sprint, Upstream, Wikidata-Query-Service, Discovery, Wikidata
Smalyshev moved T165559: HAVING in named subquery results in “non-aggregate variable in select expression” error from Backlog to Needs review on the Discovery-Wikidata-Query-Service-Sprint board.
Wed, Aug 28, 11:02 PM · Discovery-Wikidata-Query-Service-Sprint, Upstream, Discovery, Wikidata-Query-Service, Wikidata
Smalyshev moved T159723: NotMaterializedException when one branch of UNION binds ?variable and other branch binds ?variableLabel and label service is used from Backlog to Needs review on the Discovery-Wikidata-Query-Service-Sprint board.
Wed, Aug 28, 11:02 PM · Discovery-Wikidata-Query-Service-Sprint, Upstream, Discovery, Wikidata-Query-Service, Wikidata
Smalyshev added a project to T159723: NotMaterializedException when one branch of UNION binds ?variable and other branch binds ?variableLabel and label service is used: Discovery-Wikidata-Query-Service-Sprint.
Wed, Aug 28, 10:52 PM · Discovery-Wikidata-Query-Service-Sprint, Upstream, Discovery, Wikidata-Query-Service, Wikidata
Smalyshev added a project to T172113: ConcurrentModificationException on non-grouping query with aggregates in SELECT: Discovery-Wikidata-Query-Service-Sprint.
Wed, Aug 28, 10:38 PM · Discovery-Wikidata-Query-Service-Sprint, Upstream, Wikidata-Query-Service, Discovery, Wikidata
Smalyshev added a project to T168741: SELECT * on query with no variables and property path results in NotMaterializedException: Discovery-Wikidata-Query-Service-Sprint.
Wed, Aug 28, 10:37 PM · Discovery-Wikidata-Query-Service-Sprint, Upstream, Wikidata-Query-Service, Discovery, Wikidata
Smalyshev added a project to T173243: UnsupportedOperationException on property path in EXISTS: Discovery-Wikidata-Query-Service-Sprint.
Wed, Aug 28, 10:35 PM · Discovery-Wikidata-Query-Service-Sprint, Upstream, Wikidata, Discovery, Wikidata-Query-Service
Smalyshev added a project to T165559: HAVING in named subquery results in “non-aggregate variable in select expression” error: Discovery-Wikidata-Query-Service-Sprint.
Wed, Aug 28, 10:34 PM · Discovery-Wikidata-Query-Service-Sprint, Upstream, Discovery, Wikidata-Query-Service, Wikidata
Smalyshev updated subscribers of T212826: Create dedicated Updater service in Blazegraph.
Wed, Aug 28, 10:32 PM · Patch-For-Review, Discovery-Wikidata-Query-Service-Sprint, Epic, Performance, Wikidata-Query-Service, Wikidata
Smalyshev added a comment to T212826: Create dedicated Updater service in Blazegraph.

Testing on wdqs-test shows new Updater is 2x faster than old one. Didn't verify validity yet but speed looks good :)

Wed, Aug 28, 10:32 PM · Patch-For-Review, Discovery-Wikidata-Query-Service-Sprint, Epic, Performance, Wikidata-Query-Service, Wikidata
Smalyshev added a comment to P8995 Khmer samples.

Mac OS 10.13.6 (High Sierra), Firefox 68.0.2

Wed, Aug 28, 4:02 PM · Discovery-Search
Smalyshev committed rECIRe4fe4f1609a3: Use makeTitleSafe to normalize deepcat inputs (authored by EBernhardson).
Use makeTitleSafe to normalize deepcat inputs
Wed, Aug 28, 8:07 AM
Nicolas_Raoul awarded T141602: [Story] Provide a SPARQL query service for structured data on Commons a Love token.
Wed, Aug 28, 8:01 AM · Epic, Wikidata-Query-Service, SDC General, Commons, Wikidata
Smalyshev created T231411: Test new Updater service.
Wed, Aug 28, 7:05 AM · Discovery-Wikidata-Query-Service-Sprint, Performance, Wikidata-Query-Service, Wikidata
Smalyshev triaged T230175: Provide search functionality to find all files that have at least 1 structured data statement as Normal priority.
Wed, Aug 28, 7:01 AM · MW-1.34-notes (1.34.0-wmf.21; 2019-09-03), Discovery-Search (Current work), Structured-Data-Backlog, SDC General, Wikidata
Smalyshev moved T230175: Provide search functionality to find all files that have at least 1 structured data statement from Needs review to Done on the Discovery-Search (Current work) board.
Wed, Aug 28, 7:01 AM · MW-1.34-notes (1.34.0-wmf.21; 2019-09-03), Discovery-Search (Current work), Structured-Data-Backlog, SDC General, Wikidata
Smalyshev closed T222306: RDF export generates wrong IDs for federated entities as Resolved.
Wed, Aug 28, 6:37 AM · MW-1.34-notes (1.34.0-wmf.19; 2019-08-20), User-Smalyshev, WikibaseMediaInfo, Wikidata-Query-Service, SDC General, Commons, Wikidata
Smalyshev closed T222306: RDF export generates wrong IDs for federated entities, a subtask of T221916: Create RDF export for structured data stored for files, as Resolved.
Wed, Aug 28, 6:37 AM · MW-1.34-notes (1.34.0-wmf.21; 2019-09-03), Discovery-Wikidata-Query-Service-Sprint, User-Smalyshev, WikibaseMediaInfo, Wikidata-Query-Service, SDC General, Commons, Wikidata

Tue, Aug 27

Smalyshev moved T228348: Category graph includes deleted categories from In progress to Needs review on the Discovery-Wikidata-Query-Service-Sprint board.
Tue, Aug 27, 11:51 PM · MW-1.34-notes (1.34.0-wmf.21; 2019-09-03), User-Smalyshev, Discovery-Wikidata-Query-Service-Sprint, Wikidata, Wikidata-Query-Service
Smalyshev moved T228348: Category graph includes deleted categories from Next to In review on the User-Smalyshev board.
Tue, Aug 27, 11:24 PM · MW-1.34-notes (1.34.0-wmf.21; 2019-09-03), User-Smalyshev, Discovery-Wikidata-Query-Service-Sprint, Wikidata, Wikidata-Query-Service
Smalyshev updated subscribers of T228348: Category graph includes deleted categories.

After the patch is merged and deployed, categories DB needs to be re-loaded according to procedure here: https://wikitech.wikimedia.org/wiki/Wikidata_query_service#Categories_reload_procedure

Tue, Aug 27, 11:24 PM · MW-1.34-notes (1.34.0-wmf.21; 2019-09-03), User-Smalyshev, Discovery-Wikidata-Query-Service-Sprint, Wikidata, Wikidata-Query-Service
Smalyshev added a comment to T228348: Category graph includes deleted categories.

Looks like DELETE SPARQL clauses that the daily dump is generating are wrong... Weird I haven't noticed it.

Tue, Aug 27, 11:16 PM · MW-1.34-notes (1.34.0-wmf.21; 2019-09-03), User-Smalyshev, Discovery-Wikidata-Query-Service-Sprint, Wikidata, Wikidata-Query-Service
Smalyshev moved T228348: Category graph includes deleted categories from Backlog to In progress on the Discovery-Wikidata-Query-Service-Sprint board.
Tue, Aug 27, 10:49 PM · MW-1.34-notes (1.34.0-wmf.21; 2019-09-03), User-Smalyshev, Discovery-Wikidata-Query-Service-Sprint, Wikidata, Wikidata-Query-Service
Smalyshev added a comment to T228348: Category graph includes deleted categories.

Looks like there's some problem with deletion handling. E.g. https://en.wikipedia.org/wiki/Category:Delaware_elections,_2006 has been deleted and is listed in the enwiki-20190826-daily.sparql.gz dump as deleted, but is still present in the database. Strangely enough, the log shows the file was successfully processed, but somehow the results are not there. Will investigate further.

Tue, Aug 27, 10:49 PM · MW-1.34-notes (1.34.0-wmf.21; 2019-09-03), User-Smalyshev, Discovery-Wikidata-Query-Service-Sprint, Wikidata, Wikidata-Query-Service
Smalyshev merged task T223773: MWAPI requests for external links return fewer than expected into T231390: MWAPI can only match one result per page.
Tue, Aug 27, 9:46 PM · Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service, Wikidata
Smalyshev merged T223773: MWAPI requests for external links return fewer than expected into T231390: MWAPI can only match one result per page.
Tue, Aug 27, 9:46 PM · Wikidata, Wikidata-Query-Service
Smalyshev added a comment to T223773: MWAPI requests for external links return fewer than expected.

I've created T231390: MWAPI can only match one result per page for handling the multiple values in one result issue.

Tue, Aug 27, 9:45 PM · Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service, Wikidata
Smalyshev created T231390: MWAPI can only match one result per page.
Tue, Aug 27, 9:44 PM · Wikidata, Wikidata-Query-Service
Smalyshev moved T230750: dpkg error when using role::wdqs::labs role from All WDQS-related tasks to Operations on the Wikidata-Query-Service board.
Tue, Aug 27, 9:40 PM · Wikidata-Query-Service, Wikidata
Smalyshev moved T230754: WDQS labs role role::wdqs::labs fails when not finding /srv/wdqs from All WDQS-related tasks to Operations on the Wikidata-Query-Service board.
Tue, Aug 27, 9:40 PM · Wikidata, Wikidata-Query-Service
Smalyshev moved T230755: WDQS labs role role::wdqs::labs creates /srv/wdqs/blazegraph with wrong permissions from All WDQS-related tasks to Operations on the Wikidata-Query-Service board.
Tue, Aug 27, 9:39 PM · Wikidata, Wikidata-Query-Service
Smalyshev moved T230840: Set up proper prefix configuration for RDF export on Commons from All WDQS-related tasks to Wikidata & SDC on the Wikidata-Query-Service board.
Tue, Aug 27, 9:39 PM · Patch-For-Review, WikibaseMediaInfo, Wikidata-Query-Service, SDC General, Commons, Wikidata
Smalyshev moved T230856: RDF dump performance for SDC from All WDQS-related tasks to Wikidata & SDC on the Wikidata-Query-Service board.
Tue, Aug 27, 9:39 PM · Dumps-Generation, WikibaseMediaInfo, Wikidata-Query-Service, SDC General, Commons, Wikidata
Smalyshev triaged T230862: Create a way to filter only WB-related changes from Commons recentchanges as High priority.
Tue, Aug 27, 9:39 PM · Structured Data Engineering, Structured-Data-Backlog, MediaWiki-API, Wikidata-Query-Service, SDC General, Commons, Wikidata
Smalyshev moved T230862: Create a way to filter only WB-related changes from Commons recentchanges from All WDQS-related tasks to Wikidata & SDC on the Wikidata-Query-Service board.
Tue, Aug 27, 9:39 PM · Structured Data Engineering, Structured-Data-Backlog, MediaWiki-API, Wikidata-Query-Service, SDC General, Commons, Wikidata
Smalyshev updated the task description for T222321: Make /entity/ alias work for Commons.
Tue, Aug 27, 7:49 PM · Discovery-Wikidata-Query-Service-Sprint, Patch-For-Review, Wikimedia-Apache-configuration, User-Smalyshev, WikibaseMediaInfo, Wikidata-Query-Service, SDC General, Commons, Wikidata
Smalyshev reassigned T222321: Make /entity/ alias work for Commons from Smalyshev to Gehel.
Tue, Aug 27, 7:47 PM · Discovery-Wikidata-Query-Service-Sprint, Patch-For-Review, Wikimedia-Apache-configuration, User-Smalyshev, WikibaseMediaInfo, Wikidata-Query-Service, SDC General, Commons, Wikidata
Smalyshev added a comment to T230862: Create a way to filter only WB-related changes from Commons recentchanges.

RecentChanges has many flaws (for example, it is not a reliable stream as timestamps are not sequential and it can't be queried by RC ID - see https://gerrit.wikimedia.org/r/c/mediawiki/core/+/302368) but, as I understand it, it is the only way to get a change stream for a wiki without setting up Kafka, etc. So I imagine that until we get containers with all that stuff working, we're stuck with RC as the only publicly available way to get changes.

Tue, Aug 27, 7:40 PM · Structured Data Engineering, Structured-Data-Backlog, MediaWiki-API, Wikidata-Query-Service, SDC General, Commons, Wikidata
Smalyshev added a comment to T230288: Allow CHUNK value to be passed in as an option for munge.sh.

@Addshore 0.3.2 should be up already.

Tue, Aug 27, 7:00 PM · Discovery-Wikidata-Query-Service-Sprint, User-Addshore, Wikidata-Query-Service, Wikidata
Smalyshev claimed T228348: Category graph includes deleted categories.
Tue, Aug 27, 5:33 PM · MW-1.34-notes (1.34.0-wmf.21; 2019-09-03), User-Smalyshev, Discovery-Wikidata-Query-Service-Sprint, Wikidata, Wikidata-Query-Service
Smalyshev triaged T231264: Lexeme tests produce errors on merge as High priority.
Tue, Aug 27, 5:17 AM · Wikidata-Campsite, Wikimedia-production-error (Shared Build Failure), Lexicographical data, Wikidata
Smalyshev created T231264: Lexeme tests produce errors on merge.
Tue, Aug 27, 5:17 AM · Wikidata-Campsite, Wikimedia-production-error (Shared Build Failure), Lexicographical data, Wikidata

Mon, Aug 26

Smalyshev closed T229377: Make WDQS deploy not require chrome tests as Resolved.
Mon, Aug 26, 6:36 AM · Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service, Wikidata
Smalyshev closed T230288: Allow CHUNK value to be passed in as an option for munge.sh as Resolved.
Mon, Aug 26, 6:35 AM · Discovery-Wikidata-Query-Service-Sprint, User-Addshore, Wikidata-Query-Service, Wikidata
Smalyshev added a comment to T221917: Create RDF dump of structured data on Commons.

I tried to manually dump the mediainfo entries over the weekend. It took 376 minutes for 4 shards (a lot, but less than I expected) and produced 1724656 items. It does not seem to produce significant load on the DB so far, but it gives about 20 items/second per shard, which seems to be too slow. If we ever get all files having items, that'd take 4 days to process over 8 shards, probably more since DB access will get slower. Right now the DB queries are not too slow because only 2% of files have items, so there are not too many of them.
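
The arithmetic behind these figures can be checked with a short sketch. It uses only the numbers quoted in the comment above (376 minutes, 4 shards, 1724656 items, 20 items/second, 4 days, 8 shards); the function names are hypothetical and chosen for illustration. The second function works backwards from the 4-day estimate to the number of files that estimate implies, which is an inference and not a figure stated in the comment.

```python
def items_per_second_per_shard(items: int, minutes: float, shards: int) -> float:
    """Average throughput of one shard, assuming all shards ran in parallel
    for the whole duration and shared the work evenly."""
    return items / (minutes * 60) / shards


def items_processed(days: float, rate_per_shard: float, shards: int) -> float:
    """Number of items that can be dumped in `days` at a fixed per-shard rate."""
    return rate_per_shard * shards * days * 86_400  # 86,400 seconds per day


# Figures quoted in the comment: 376 minutes, 4 shards, 1,724,656 items.
measured_rate = items_per_second_per_shard(1_724_656, 376, 4)
print(f"measured: {measured_rate:.1f} items/s per shard")

# The "4 days over 8 shards" estimate at the rounded 20 items/s per shard
# corresponds to this many files (derived here, not stated in the comment).
implied_files = items_processed(days=4, rate_per_shard=20, shards=8)
print(f"implied total: {implied_files:,.0f} files")
```

The measured rate comes out slightly below the rounded 20 items/second, and the estimate assumes the rate stays constant, which the comment itself expects not to hold as DB access gets slower.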

Mon, Aug 26, 6:35 AM · Dumps-Generation, MW-1.34-notes (1.34.0-wmf.10; 2019-06-18), Patch-For-Review, WikibaseMediaInfo, Wikidata-Query-Service, SDC General, Commons, Wikidata

Sun, Aug 25

Smalyshev added a comment to T229608: Support SDC URIs in WDQS URI schemes.

@Multichill eventually yes, but since they are not being used anywhere yet it's too early to document them. Once RDF export is properly set up to use these prefixes then we can document them officially.

Sun, Aug 25, 8:17 PM · Discovery-Wikidata-Query-Service-Sprint, Patch-For-Review, User-Smalyshev, Wikidata-Query-Service, SDC General, Wikidata

Fri, Aug 23

Smalyshev closed T230244: Restore wdqs1009 to its role as auto-deploy testing as Resolved.
Fri, Aug 23, 11:40 PM · Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service, Wikidata
Smalyshev triaged T230244: Restore wdqs1009 to its role as auto-deploy testing as Normal priority.
Fri, Aug 23, 11:39 PM · Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service, Wikidata
Smalyshev closed T229608: Support SDC URIs in WDQS URI schemes, a subtask of T141602: [Story] Provide a SPARQL query service for structured data on Commons, as Resolved.
Fri, Aug 23, 12:07 AM · Epic, Wikidata-Query-Service, SDC General, Commons, Wikidata
Smalyshev closed T229608: Support SDC URIs in WDQS URI schemes as Resolved.
Fri, Aug 23, 12:07 AM · Discovery-Wikidata-Query-Service-Sprint, Patch-For-Review, User-Smalyshev, Wikidata-Query-Service, SDC General, Wikidata
Smalyshev closed T230974: New lexemes missing in Wikidata Query Service as Resolved.

All should be updated now.

Fri, Aug 23, 12:06 AM · MW-1.34-notes (1.34.0-wmf.20; 2019-08-27), Wikidata, Lexicographical data, Wikidata-Query-Service

Thu, Aug 22

Smalyshev lowered the priority of T230974: New lexemes missing in Wikidata Query Service from Unbreak Now! to Normal.

Immediate RDF breakage fixed, now I'll have to update lexemes that were affected.

Thu, Aug 22, 11:44 PM · MW-1.34-notes (1.34.0-wmf.20; 2019-08-27), Wikidata, Lexicographical data, Wikidata-Query-Service
Smalyshev added a comment to T222757: quibble-vendor-mysql-hhvm-docker for WikibaseCirrusSearch takes over 40 minutes.

Ok I am getting multiple builds taking 50+ minutes again for Wikibase, e.g.:

Thu, Aug 22, 11:19 PM · Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO, Continuous-Integration-Infrastructure, Discovery-Search
Smalyshev moved T229608: Support SDC URIs in WDQS URI schemes from Needs review to Done on the Discovery-Wikidata-Query-Service-Sprint board.
Thu, Aug 22, 10:59 PM · Discovery-Wikidata-Query-Service-Sprint, Patch-For-Review, User-Smalyshev, Wikidata-Query-Service, SDC General, Wikidata
Smalyshev moved T230244: Restore wdqs1009 to its role as auto-deploy testing from Backlog to In progress on the Discovery-Wikidata-Query-Service-Sprint board.
Thu, Aug 22, 10:58 PM · Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service, Wikidata
Smalyshev added a project to T230244: Restore wdqs1009 to its role as auto-deploy testing: Discovery-Wikidata-Query-Service-Sprint.
Thu, Aug 22, 10:55 PM · Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service, Wikidata
Smalyshev updated the task description for T230244: Restore wdqs1009 to its role as auto-deploy testing.
Thu, Aug 22, 10:54 PM · Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service, Wikidata
Smalyshev updated subscribers of T230974: New lexemes missing in Wikidata Query Service.
Thu, Aug 22, 6:07 PM · MW-1.34-notes (1.34.0-wmf.20; 2019-08-27), Wikidata, Lexicographical data, Wikidata-Query-Service
Smalyshev updated subscribers of T230974: New lexemes missing in Wikidata Query Service.
Thu, Aug 22, 5:20 PM · MW-1.34-notes (1.34.0-wmf.20; 2019-08-27), Wikidata, Lexicographical data, Wikidata-Query-Service
Smalyshev triaged T230974: New lexemes missing in Wikidata Query Service as Unbreak Now! priority.
Thu, Aug 22, 5:20 PM · MW-1.34-notes (1.34.0-wmf.20; 2019-08-27), Wikidata, Lexicographical data, Wikidata-Query-Service
Smalyshev added a comment to T230974: New lexemes missing in Wikidata Query Service.

RDF generated is:

Thu, Aug 22, 5:19 PM · MW-1.34-notes (1.34.0-wmf.20; 2019-08-27), Wikidata, Lexicographical data, Wikidata-Query-Service
Smalyshev added a comment to T230974: New lexemes missing in Wikidata Query Service.

We're getting:

wdq6: 17:16:23.890 [update 0] WARN  org.wikidata.query.rdf.tool.Updater - Contained error syncing.  Giving up on L60296
org.wikidata.query.rdf.tool.exception.ContainedException: RDF parsing error for https://www.wikidata.org/wiki/Special:EntityData/L60296.ttl?flavor=dump&nocache=1566494183408
	at org.wikidata.query.rdf.tool.wikibase.WikibaseRepository.collectStatementsFromUrl(WikibaseRepository.java:401)
	at org.wikidata.query.rdf.tool.wikibase.WikibaseRepository.fetchRdfForEntity(WikibaseRepository.java:457)
	at org.wikidata.query.rdf.tool.wikibase.WikibaseRepository.fetchRdfForEntity(WikibaseRepository.java:433)
	at org.wikidata.query.rdf.tool.Updater.handleChange(Updater.java:362)
	at org.wikidata.query.rdf.tool.Updater.lambda$handleChanges$0(Updater.java:236)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.openrdf.rio.RDFParseException: Default namespace used but not defined [line 42]
	at org.openrdf.rio.helpers.RDFParserHelper.reportFatalError(RDFParserHelper.java:440)
	at org.openrdf.rio.helpers.RDFParserBase.reportFatalError(RDFParserBase.java:685)
	at org.openrdf.rio.turtle.TurtleParser.reportFatalError(TurtleParser.java:1405)
	at org.openrdf.rio.helpers.RDFParserBase.getNamespace(RDFParserBase.java:342)
	at org.openrdf.rio.turtle.TurtleParser.parseQNameOrBoolean(TurtleParser.java:1032)
	at org.openrdf.rio.turtle.TurtleParser.parseValue(TurtleParser.java:643)
	at org.openrdf.rio.turtle.TurtleParser.parseSubject(TurtleParser.java:474)
	at org.openrdf.rio.turtle.TurtleParser.parseTriples(TurtleParser.java:407)
	at org.openrdf.rio.turtle.TurtleParser.parseStatement(TurtleParser.java:259)
	at org.openrdf.rio.turtle.TurtleParser.parse(TurtleParser.java:214)
	at org.wikidata.query.rdf.tool.wikibase.WikibaseRepository.collectStatementsFromUrl(WikibaseRepository.java:392)
	... 8 common frames omitted

So that's why it's not updated. I'll check why this happens.
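
For context, the parser message "Default namespace used but not defined" indicates that the Turtle document uses a prefixed name with an empty prefix (such as `:Something`) without a matching `@prefix : <...> .` declaration. The following is a minimal sketch of a check for that condition. It is a rough regex-based approximation and not a Turtle parser: it ignores SPARQL-style `PREFIX` directives, multi-line string literals and comments. The function name and the sample documents are hypothetical and are not the actual RDF generated for L60296.

```python
import re

# Matches "@prefix wd: <" as well as the default-prefix form "@prefix : <".
PREFIX_DECL = re.compile(r'@prefix\s+([A-Za-z][\w-]*)?:\s*<')
# IRIs and single-line string literals, removed before looking for prefixed names
# so that colons inside them are not mistaken for prefix separators.
IRI_OR_STRING = re.compile(r'<[^>]*>|"(?:[^"\\]|\\.)*"')
# A prefixed name such as "wd:L1" or ":Thing"; blank nodes ("_:b0") do not match.
PREFIXED_NAME = re.compile(r'(?<![\w:])([A-Za-z][\w-]*)?:[A-Za-z_]')


def undefined_prefixes(turtle: str) -> set:
    """Return the prefixes that are used but never declared.

    The empty string stands for the default namespace.
    """
    declared = {m.group(1) or "" for m in PREFIX_DECL.finditer(turtle)}
    stripped = IRI_OR_STRING.sub(" ", turtle)
    used = {m.group(1) or "" for m in PREFIXED_NAME.finditer(stripped)}
    return used - declared


# Hypothetical sample: ":Lexeme" uses the default namespace, which is not declared.
broken = """@prefix wd: <http://www.wikidata.org/entity/> .
wd:L60296 a :Lexeme .
"""
print(undefined_prefixes(broken))
```

A document that declares every prefix it uses, including the default one, yields an empty set.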

Thu, Aug 22, 5:17 PM · MW-1.34-notes (1.34.0-wmf.20; 2019-08-27), Wikidata, Lexicographical data, Wikidata-Query-Service

Wed, Aug 21

Smalyshev added a comment to T230862: Create a way to filter only WB-related changes from Commons recentchanges.

Tags should work, at least for now, I think, if I can filter by tag efficiently. There's not a lot of data edits so far, compared to overall Commons edit volume.
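
As an illustration of the tag-based approach, here is a minimal sketch that builds a MediaWiki Action API request for `list=recentchanges` filtered with the `rctag` parameter. The tag name `example-tag` is a placeholder, since the comment does not say which tag would mark structured-data edits, and the helper name is hypothetical. The sketch only constructs the URL; it makes no request and says nothing about how efficient the filter is on the server side.

```python
from urllib.parse import urlencode


def recentchanges_url(api_base: str, tag: str, limit: int = 50) -> str:
    """Build a recentchanges query URL restricted to changes carrying `tag`."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rctag": tag,  # only changes that carry this change tag
        "rcprop": "title|ids|timestamp|tags",
        "rclimit": limit,
        "format": "json",
    }
    return f"{api_base}?{urlencode(params)}"


# "example-tag" is a placeholder, not a real tag name from the task.
url = recentchanges_url("https://commons.wikimedia.org/w/api.php", "example-tag")
print(url)
```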

Wed, Aug 21, 3:56 PM · Structured Data Engineering, Structured-Data-Backlog, MediaWiki-API, Wikidata-Query-Service, SDC General, Commons, Wikidata
Smalyshev moved T228348: Category graph includes deleted categories from Backlog to Next on the User-Smalyshev board.
Wed, Aug 21, 7:29 AM · MW-1.34-notes (1.34.0-wmf.21; 2019-09-03), User-Smalyshev, Discovery-Wikidata-Query-Service-Sprint, Wikidata, Wikidata-Query-Service
Smalyshev added a project to T228348: Category graph includes deleted categories: User-Smalyshev.
Wed, Aug 21, 7:29 AM · MW-1.34-notes (1.34.0-wmf.21; 2019-09-03), User-Smalyshev, Discovery-Wikidata-Query-Service-Sprint, Wikidata, Wikidata-Query-Service
Smalyshev moved T222306: RDF export generates wrong IDs for federated entities from Waiting/Blocked to In review on the User-Smalyshev board.
Wed, Aug 21, 7:29 AM · MW-1.34-notes (1.34.0-wmf.19; 2019-08-20), User-Smalyshev, WikibaseMediaInfo, Wikidata-Query-Service, SDC General, Commons, Wikidata
Smalyshev moved T229608: Support SDC URIs in WDQS URI schemes from Done to In review on the User-Smalyshev board.
Wed, Aug 21, 7:29 AM · Discovery-Wikidata-Query-Service-Sprint, Patch-For-Review, User-Smalyshev, Wikidata-Query-Service, SDC General, Wikidata
Smalyshev renamed T229608: Support SDC URIs in WDQS URI schemes from Allow WDQS default prefixes to work with federated data to Support SDC URIs in WDQS URI schemes.
Wed, Aug 21, 7:28 AM · Discovery-Wikidata-Query-Service-Sprint, Patch-For-Review, User-Smalyshev, Wikidata-Query-Service, SDC General, Wikidata
Smalyshev closed T230410: wdqs updater processing events but not finding anything useful as Resolved.
Wed, Aug 21, 7:27 AM · Discovery-Wikidata-Query-Service-Sprint, User-Smalyshev, Wikidata, Wikidata-Query-Service
Smalyshev created T230862: Create a way to filter only WB-related changes from Commons recentchanges.
Wed, Aug 21, 6:38 AM · Structured Data Engineering, Structured-Data-Backlog, MediaWiki-API, Wikidata-Query-Service, SDC General, Commons, Wikidata
Smalyshev added a comment to T230856: RDF dump performance for SDC.

Probably not a lot. Search for English labels returns 188 results. Unfortunately, search for statements and for every label doesn't seem to work (probably needs a reindex?), so I don't know how many there are, but probably also not a lot. I'll check tomorrow if I can get more specific figures.

Wed, Aug 21, 5:57 AM · Dumps-Generation, WikibaseMediaInfo, Wikidata-Query-Service, SDC General, Commons, Wikidata