
Igorkim78 (Igor Kim)
User

Projects

User does not belong to any projects.

User Details

User Since
Apr 2 2019, 6:24 PM (33 w, 1 d)
Availability
Available
LDAP User
Unknown
MediaWiki User
Igorkim78 [ Global Accounts ]

Recent Activity

Tue, Nov 19

Igorkim78 added a comment to T231411: Test new Updater service.

Could you share the output of "iostat -x 1" and "sudo iotop"?

Tue, Nov 19, 11:35 AM · Patch-For-Review, Discovery-Search (Current work), Performance Issue, Wikidata-Query-Service, Wikidata
Igorkim78 added a comment to T231411: Test new Updater service.

Are there thread dumps from Blazegraph available?
What about the new logger UPDATED_ENTITY_IDS: does it track updated entity IDs? How many per minute/hour?

Tue, Nov 19, 11:32 AM · Patch-For-Review, Discovery-Search (Current work), Performance Issue, Wikidata-Query-Service, Wikidata
Ghuron awarded T212826: Create dedicated Updater service in Blazegraph a Like token.
Tue, Nov 19, 3:58 AM · Discovery-Search (Current work), Epic, Performance Issue, Wikidata-Query-Service, Wikidata

Mon, Nov 18

Igorkim78 added a subtask for T231411: Test new Updater service: T238555: Create endpoint to extract low level data for a list of entity IDs..
Mon, Nov 18, 3:55 PM · Patch-For-Review, Discovery-Search (Current work), Performance Issue, Wikidata-Query-Service, Wikidata
Igorkim78 added a parent task for T238555: Create endpoint to extract low level data for a list of entity IDs.: T231411: Test new Updater service.
Mon, Nov 18, 3:55 PM · Wikidata, Wikidata-Query-Service
Igorkim78 added a subtask for T231411: Test new Updater service: T238557: Allow for logging recently updated entities.
Mon, Nov 18, 3:54 PM · Patch-For-Review, Discovery-Search (Current work), Performance Issue, Wikidata-Query-Service, Wikidata
Igorkim78 added a parent task for T238557: Allow for logging recently updated entities: T231411: Test new Updater service.
Mon, Nov 18, 3:54 PM · Wikidata-Query-Service, Wikidata
Igorkim78 added a project to T238557: Allow for logging recently updated entities: Wikidata-Query-Service.

Thanks! Yes it is Wikidata-Query-Service

Mon, Nov 18, 3:53 PM · Wikidata-Query-Service, Wikidata
Igorkim78 added a project to T238555: Create endpoint to extract low level data for a list of entity IDs.: Wikidata-Query-Service.

Thanks, yes it is Wikidata-Query-Service

Mon, Nov 18, 3:53 PM · Wikidata, Wikidata-Query-Service
Igorkim78 created T238557: Allow for logging recently updated entities.
Mon, Nov 18, 2:36 PM · Wikidata-Query-Service, Wikidata
Igorkim78 created T238555: Create endpoint to extract low level data for a list of entity IDs..
Mon, Nov 18, 2:27 PM · Wikidata, Wikidata-Query-Service

Wed, Nov 13

Igorkim78 added a comment to T238232: blazegraph journal on wdqs1005 is oversized.

wdqs1006 reports that 574.6 GiB are reserved for the journal and 544.3 GiB are actually used (~5% of the space unused).
wdqs1005, by contrast, reports that 1037.7 GiB are reserved and only 543.5 GiB are actually used (~47% of the space unused).
Most of the %FileWaste is reserved for 8K allocators, but %SlotWaste is also higher than usual for the 4K (10 times higher than usual), 2K, 64 (3 times), 320 and 768 (2 times) allocators.
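
The unused-space percentages above follow from the reserved and used figures. A minimal Python sketch of that arithmetic (the function name is illustrative and not part of any Blazegraph tooling):

```python
def unused_pct(reserved_gib: float, used_gib: float) -> float:
    """Share of the reserved journal space that is not in use, in percent."""
    return (reserved_gib - used_gib) / reserved_gib * 100.0


# Figures quoted in the comment above (GiB reserved, GiB used).
for host, reserved, used in [("wdqs1006", 574.6, 544.3),
                             ("wdqs1005", 1037.7, 543.5)]:
    print(f"{host}: {unused_pct(reserved, used):.1f}% unused")
```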

Wed, Nov 13, 6:43 PM · Wikidata-Query-Service, Wikidata, Discovery-Search (Current work)

Wed, Oct 23

Igorkim78 updated the task description for T234968: Measure performance impact of code optimization and/or blazegraph settings on real traffic data.
Wed, Oct 23, 4:34 PM · Discovery-Search (Current work), Wikidata-Query-Service, Wikidata
Igorkim78 added a comment to T101013: Log Wikidata Query Service queries to the event gate infrastructure.

Added link to the task T236251: Add header returning time millis to first solution similar to TTFB measured in Blazegraph.
The corresponding header X-FIRST-SOLUTION-MILLIS might be very useful when analyzing long-running queries and when comparing query performance. If the time reported by Blazegraph is significantly less than the total query execution time, the difference might be caused by:

  1. The total result is very large, and much of the time was spent on serialization/deserialization (this is basically fine if the number of results is large).
  2. Connectivity issues, over the network and/or between processes. In this case X-FIRST-SOLUTION-MILLIS will be the same for subsequent calls, but the total query time will vary.
  3. The query might be very unselective while additional constraints filter out many potential solutions, so the first solution is computed quickly but collecting all requested results takes much longer. Such queries should be analyzed and might need fixing in the Blazegraph code or data layout.
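
As a rough illustration of the comparison described above, the sketch below flags queries whose first solution arrived long before the query finished. It assumes the two timings are already available in milliseconds (for example from the proposed X-FIRST-SOLUTION-MILLIS header and a client-side timer); the function names and the 10% threshold are arbitrary assumptions, not part of WDQS or Blazegraph.

```python
def first_solution_share(first_solution_ms: float, total_ms: float) -> float:
    """Fraction of the total query time spent before the first solution."""
    if total_ms <= 0:
        raise ValueError("total_ms must be positive")
    return first_solution_ms / total_ms


def worth_a_closer_look(first_solution_ms: float, total_ms: float,
                        max_share: float = 0.1) -> bool:
    """True when the first solution came early relative to the total time.

    max_share is an arbitrary cut-off chosen for this sketch.
    """
    return first_solution_share(first_solution_ms, total_ms) < max_share


# Hypothetical timings: first solution after 50 ms, query done after 30 s.
print(worth_a_closer_look(50, 30_000))
```

A flagged query still has to be checked against the three causes listed above; the ratio alone does not tell them apart.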
Wed, Oct 23, 12:49 PM · Discovery-Search (Current work), Patch-For-Review, Event-Platform, Analytics, Wikidata, Wikidata-Query-Service, Discovery
Igorkim78 created T236251: Add header returning time millis to first solution similar to TTFB measured in Blazegraph.
Wed, Oct 23, 12:31 PM · WDQS-Optimizer

Oct 16 2019

Igorkim78 added a comment to T235540: SPARQL query causes StackOverflowError and fails to execute.

The LabelService optimizer was fixed this August (so that it no longer throws NPEs) by reusing the Blazegraph core utility com.bigdata.rdf.sparql.ast.StaticAnalysis.getVarsFromArguments(BOp) to introspect the variables used in filters and other clauses, so that the placement of the LabelService call can be adjusted properly. This introspection seems to enter an infinite loop over the AST. Reusing a variable for label aggregation after the original variable is common practice, so yes, it should be fixed. Looking into a workaround that extracts the referenced variables without getting caught in the infinite loop.

Oct 16 2019, 12:34 PM · Wikidata, Wikidata-Query-Service

Oct 9 2019

Igorkim78 claimed T231411: Test new Updater service.
Oct 9 2019, 8:22 AM · Patch-For-Review, Discovery-Search (Current work), Performance Issue, Wikidata-Query-Service, Wikidata

Oct 7 2019

Igorkim78 added a comment to T227365: WDQS/Blazegraph data loading has timeout.

There is a context param queryTimeout, set to 10 minutes in web.xml, which applies to all Blazegraph servlets. Stas prepared a patch that extends it tenfold: https://gerrit.wikimedia.org/r/#/c/wikidata/query/rdf/+/520948/. You might apply it locally (or just edit the web.xml file) to resolve your issue. The change has not been applied to the WDQS master because this timeout is system-wide, and extending it might result in unexpected resource consumption: the timeout also applies to queries, including very heavy ones, which would then be allowed to run much longer before timing out.

Oct 7 2019, 10:44 AM · Upstream, Wikidata, Wikidata-Query-Service

Sep 30 2019

Igorkim78 added a comment to T233204: Mixup of unicode characters in Query Service.

These characters are indeed mapped to the same term in the DB.

Sep 30 2019, 4:13 PM · Wikidata, Wikidata-Query-Service

Sep 12 2019

Igorkim78 created T232768: Branching factors configuration for Blazegraph instances.
Sep 12 2019, 6:24 PM · WDQS-Optimizer
Igorkim78 created T232739: Requesting access to wmcs beta cluster for igorkim78.
Sep 12 2019, 1:00 PM · Beta-Cluster-Infrastructure, Release-Engineering-Team

Aug 29 2019

Igorkim78 added a comment to T231411: Test new Updater service.

Differences in bnodes might be tolerated with an additional replacement. The cleanup stage could be merged with the initial sed+sort.
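
A minimal sketch of the replacement idea, written in Python rather than sed+sort: every blank node label is replaced with one fixed placeholder and the lines are sorted, so two dumps that differ only in bnode labels compare equal. The regular expression is a simplification (it would also rewrite "_:" sequences inside literals), and the function name is illustrative, not taken from the actual test scripts. Collapsing all bnodes to a single placeholder also discards which bnode is which, so it is only a coarse comparison.

```python
import re

# Simplified pattern for blank node labels such as _:b1 or _:genid-42.
BNODE = re.compile(r"_:[A-Za-z0-9_-]+")


def normalize(lines):
    """Replace blank node labels with a fixed placeholder, then sort."""
    cleaned = (BNODE.sub("_:b", line.strip()) for line in lines)
    return sorted(line for line in cleaned if line)


# Two hypothetical dumps with the same triples but different bnode labels.
dump_a = ['_:t1 <http://example.org/p> "x" .',
          '<http://example.org/s> <http://example.org/q> _:t1 .']
dump_b = ['<http://example.org/s> <http://example.org/q> _:g7 .',
          '_:g7 <http://example.org/p> "x" .']
print(normalize(dump_a) == normalize(dump_b))
```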

Aug 29 2019, 6:46 AM · Patch-For-Review, Discovery-Search (Current work), Performance Issue, Wikidata-Query-Service, Wikidata

Aug 2 2019

Igorkim78 added a comment to T229655: bad interaction of lang() with wikibase:label.

Looking at the query execution plans, the ProjectionOp for the query with lang() on coDescription is arranged before the materialization of coDescription, so coDescription (along with its lang) never reaches the projection. The reason for this behavior needs more research. Will update on that.

Aug 2 2019, 7:20 PM · Wikidata, Wikidata-Query-Service

Jul 1 2019

Igorkim78 added a comment to T175840: Using label service twice in one query results in obscure error message.

Fixed optional support and added a test case for that code path.
Service projectedVars actually include both inbound and outbound variables (those which are params for the service and those which are produced by the labels lookup). But to check whether the service node can be reordered before any clauses placed at the bottom of the query, we need to consider only the inbound variables, so that they are available for the service call and all outbound vars are available for the later filters and other clauses.
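
The check described above can be sketched as follows. This is an illustrative Python model, not the Blazegraph optimizer code: Clause, binds, uses and can_place_service_at are hypothetical names, and clauses are reduced to the sets of variables they bind and use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clause:
    """Hypothetical stand-in for a query clause."""
    name: str
    binds: frozenset = frozenset()  # variables this clause produces
    uses: frozenset = frozenset()   # variables this clause reads


def can_place_service_at(index, clauses, inbound, outbound):
    """True if the service call may sit just before clauses[index].

    Only inbound variables must already be bound at that point; outbound
    variables must not be read by any clause that runs earlier.
    """
    before = clauses[:index]
    bound_before = set().union(*(c.binds for c in before))
    inbound_ready = set(inbound) <= bound_before
    outbound_untouched = all(not (c.uses & set(outbound)) for c in before)
    return inbound_ready and outbound_untouched


clauses = [
    Clause("pattern", binds=frozenset({"item"})),
    Clause("filter", uses=frozenset({"itemLabel"})),
]
# Valid only between the pattern (binds ?item) and the filter (reads ?itemLabel).
print([can_place_service_at(i, clauses, {"item"}, {"itemLabel"}) for i in range(3)])
```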

Jul 1 2019, 3:46 PM · Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service, Wikidata, Discovery

Jun 25 2019

Igorkim78 added a comment to T175840: Using label service twice in one query results in obscure error message.

The idea of the change is to replace the runLast hint with more elaborate logic, in 3 steps:

  • first, a 'most probably optimal' placement, so that EmptyLabelServiceOptimizer can see the variables to process.
  • then EmptyLabelServiceOptimizer adds statement patterns for the resolutions.
  • finally, an additional optimizer step moves LabelService to the latest possible position before any clauses that might use the variables bound by LabelService.
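
The last step above can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not the actual optimizer: each clause is reduced to the set of variables it uses, and latest_service_index is a hypothetical helper name.

```python
def latest_service_index(clause_vars, service_outbound):
    """Index at which to insert the service call.

    clause_vars: list of sets, the variables used by each clause in order.
    service_outbound: variables bound by the service (e.g. label variables).
    Returns the index of the first clause that uses a service-bound
    variable, or len(clause_vars) if no clause uses one.
    """
    for i, used in enumerate(clause_vars):
        if used & service_outbound:
            return i
    return len(clause_vars)


# Hypothetical query: the second clause filters on ?itemLabel.
clause_vars = [{"item"}, {"item", "itemLabel"}, {"x"}]
print(latest_service_index(clause_vars, {"itemLabel"}))
```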
Jun 25 2019, 9:05 PM · Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service, Wikidata, Discovery

May 7 2019

Igorkim78 added a comment to T153353: Blazegraph not properly using labels from sub-queries for filtering (omitting rows), unless they're selected.

The EmptyLabelServiceOptimizer, when running optimizeJoinGroup(AST2BOpContext, StaticAnalysis, IBindingSet[], JoinGroupNode), currently takes the projection from StaticAnalysis.getQueryRoot(), because the parent of the JoinGroupNode that wraps the statement pattern of the LabelService clause is unavailable.

May 7 2019, 9:35 PM · Discovery-Wikidata-Query-Service-Sprint, User-Smalyshev, Upstream, Discovery, Wikidata, Wikidata-Query-Service

May 6 2019

Igorkim78 added a comment to T213375: Inline value and reference URIs.

Additionally tested the configuration with only raw records disabled, compared to the original baseline:

May 6 2019, 4:49 PM · Patch-For-Review, Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service, Wikidata
Igorkim78 added a comment to T213375: Inline value and reference URIs.

Configuration options are assigned in RWStore.properties. Particular options are:

May 6 2019, 4:43 PM · Patch-For-Review, Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service, Wikidata
Igorkim78 added a comment to T153353: Blazegraph not properly using labels from sub-queries for filtering (omitting rows), unless they're selected.

This seems to be an optimizer ordering problem.
CompareBOp is executed several times to check whether "Ada"@en equals ?langLabel, but ?langLabel is not bound on all occasions:
while running ASTDeferredIVResolution,
while running com.bigdata.rdf.sparql.ast.optimizers.ASTSetValueExpressionsOptimizer,
then while running ConditionalRoutingOp for ChunkedRunningQuery.

May 6 2019, 4:36 PM · Discovery-Wikidata-Query-Service-Sprint, User-Smalyshev, Upstream, Discovery, Wikidata, Wikidata-Query-Service

Apr 29 2019

Igorkim78 added a comment to T213375: Inline value and reference URIs.

Complete test logs attached

Apr 29 2019, 5:00 PM · Patch-For-Review, Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service, Wikidata
Igorkim78 added a comment to T213375: Inline value and reference URIs.

Load performance for the tested configurations in an isolated environment (i7-7700HQ, 8 cores 2.8GHz, 32GB RAM, SSD Samsung 960 PRO)

Apr 29 2019, 4:50 PM · Patch-For-Review, Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service, Wikidata
Igorkim78 added a comment to T213375: Inline value and reference URIs.

Attached results of loading 100 ttl.gz files with different configurations

Apr 29 2019, 4:41 PM · Patch-For-Review, Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service, Wikidata

Apr 22 2019

Igorkim78 claimed T213375: Inline value and reference URIs.

Changeset created to support reference URIs inlining:
https://gerrit.wikimedia.org/r/#/c/wikidata/query/blazegraph/+/505642

Apr 22 2019, 4:55 PM · Patch-For-Review, Discovery-Wikidata-Query-Service-Sprint, Wikidata-Query-Service, Wikidata