Elasticsearch errors about BulkShardRequest
Open, Low, Public

Description

elastic2014 seems to have a lot of errors related to BulkShardRequest (see example below).

[2017-06-06T03:08:04,546][DEBUG][org.elasticsearch.action.bulk.TransportShardBulkAction] [jawiki_content_1487427148][4] failed to execute bulk item (update) BulkShardRequest [[jawiki_content_1487427148][4]] containing [134] requests
org.elasticsearch.index.engine.DocumentMissingException: [page][1615532]: document missing
        at org.elasticsearch.action.update.UpdateHelper.prepare(UpdateHelper.java:92) ~[elasticsearch-5.3.2.jar:5.3.2]
        at org.elasticsearch.action.update.UpdateHelper.prepare(UpdateHelper.java:81) ~[elasticsearch-5.3.2.jar:5.3.2]
        at org.elasticsearch.action.bulk.TransportShardBulkAction.executeUpdateRequest(TransportShardBulkAction.java:269) ~[elasticsearch-5.3.2.jar:5.3.2]
        at org.elasticsearch.action.bulk.TransportShardBulkAction.executeBulkItemRequest(TransportShardBulkAction.java:159) [elasticsearch-5.3.2.jar:5.3.2]
        at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:113) [elasticsearch-5.3.2.jar:5.3.2]
        at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:69) [elasticsearch-5.3.2.jar:5.3.2]
        at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.perform(TransportReplicationAction.java:939) [elasticsearch-5.3.2.jar:5.3.2]
        at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.perform(TransportReplicationAction.java:908) [elasticsearch-5.3.2.jar:5.3.2]
        at org.elasticsearch.action.support.replication.ReplicationOperation.execute(ReplicationOperation.java:113) [elasticsearch-5.3.2.jar:5.3.2]
        at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onResponse(TransportReplicationAction.java:322) [elasticsearch-5.3.2.jar:5.3.2]
        at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.onResponse(TransportReplicationAction.java:264) [elasticsearch-5.3.2.jar:5.3.2]
        at org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:888) [elasticsearch-5.3.2.jar:5.3.2]
        at org.elasticsearch.action.support.replication.TransportReplicationAction$1.onResponse(TransportReplicationAction.java:885) [elasticsearch-5.3.2.jar:5.3.2]
        at org.elasticsearch.index.shard.IndexShardOperationsLock.acquire(IndexShardOperationsLock.java:147) [elasticsearch-5.3.2.jar:5.3.2]
        at org.elasticsearch.index.shard.IndexShard.acquirePrimaryOperationLock(IndexShard.java:1654) [elasticsearch-5.3.2.jar:5.3.2]
        at org.elasticsearch.action.support.replication.TransportReplicationAction.acquirePrimaryShardReference(TransportReplicationAction.java:897) [elasticsearch-5.3.2.jar:5.3.2]
        at org.elasticsearch.action.support.replication.TransportReplicationAction.access$400(TransportReplicationAction.java:93) [elasticsearch-5.3.2.jar:5.3.2]
        at org.elasticsearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.doRun(TransportReplicationAction.java:281) [elasticsearch-5.3.2.jar:5.3.2]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-5.3.2.jar:5.3.2]
        at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:260) [elasticsearch-5.3.2.jar:5.3.2]
        at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:252) [elasticsearch-5.3.2.jar:5.3.2]
        at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) [elasticsearch-5.3.2.jar:5.3.2]
        at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:618) [elasticsearch-5.3.2.jar:5.3.2]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:638) [elasticsearch-5.3.2.jar:5.3.2]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-5.3.2.jar:5.3.2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_131]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_131]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
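
To get a rough idea of how many of these errors there are and which indices they hit, the log can be tallied with a small script. This is only a sketch: the two regular expressions are derived from the single example above, `tally_missing_docs` is a hypothetical helper name, and other log layouts would need different patterns.

```python
import re
from collections import Counter

# Header of the DEBUG message, e.g.
# "... [jawiki_content_1487427148][4] failed to execute bulk item (update) ..."
HEADER_RE = re.compile(
    r"\[(?P<index>[^\]\[]+)\]\[(?P<shard>\d+)\] failed to execute bulk item \((?P<op>\w+)\)"
)
# Exception line, e.g.
# "...DocumentMissingException: [page][1615532]: document missing"
MISSING_RE = re.compile(
    r"DocumentMissingException: \[(?P<type>[^\]]+)\]\[(?P<id>[^\]]+)\]: document missing"
)


def tally_missing_docs(lines):
    """Return (Counter mapping index -> error count, list of (index, doc_id))."""
    per_index = Counter()
    missing = []
    current_index = None  # index named by the most recent header line
    for line in lines:
        header = HEADER_RE.search(line)
        if header:
            current_index = header.group("index")
            continue
        hit = MISSING_RE.search(line)
        if hit and current_index is not None:
            per_index[current_index] += 1
            missing.append((current_index, hit.group("id")))
            current_index = None  # wait for the next header
    return per_index, missing


# Usage with the two relevant lines from the example above.
sample = [
    "[2017-06-06T03:08:04,546][DEBUG][org.elasticsearch.action.bulk.TransportShardBulkAction] "
    "[jawiki_content_1487427148][4] failed to execute bulk item (update) BulkShardRequest",
    "org.elasticsearch.index.engine.DocumentMissingException: [page][1615532]: document missing",
]
per_index, missing = tally_missing_docs(sample)
print(per_index, missing)
```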

Event Timeline

Gehel created this task. Jun 6 2017, 8:07 AM
Restricted Application added a subscriber: Aklapper.

Doc 1615532 is in the general index as ノート:速水太郎, but a page with the same title exists at 速水太郎.
Another example is page_id 12 from commons; this time I don't see a similar page in the general index.
Looking at the Oozie job that populates pageviews, I don't see anything related to filtering pages in the content namespace. I suspect the error is not new and that we are simply logging these DEBUG messages now.

Mentioned in SAL (#wikimedia-operations) [2017-06-06T08:39:22Z] <gehel> raise log level to WARN for TransportShardBulkAction on elasticsearch cirrus - T167091
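
For reference, a logger level can also be changed at runtime through the cluster settings API, using a `logger.`-prefixed setting. The sketch below only builds the request body for `PUT /_cluster/settings`. `logger_settings_body` is a hypothetical helper, and whether the SAL entry above was applied this way or only through the Puppet change is not stated in this task.

```python
import json

ALLOWED_LEVELS = {"TRACE", "DEBUG", "INFO", "WARN", "ERROR"}


def logger_settings_body(logger_name, level, persistent=False):
    """Build the JSON body for PUT /_cluster/settings that sets one logger's level."""
    level = level.upper()
    if level not in ALLOWED_LEVELS:
        raise ValueError("unsupported log level: %s" % level)
    # "transient" settings are lost on a full cluster restart; "persistent" ones survive it.
    scope = "persistent" if persistent else "transient"
    return {scope: {"logger." + logger_name: level}}


body = logger_settings_body(
    "org.elasticsearch.action.bulk.TransportShardBulkAction", "WARN"
)
print(json.dumps(body, indent=2))
```

A transient setting suits a quick experiment; a change meant to last belongs in the managed logging configuration.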

Change 357371 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/puppet@production] elasticsearch - raise logging of TransportShardBulkAction to WARN

https://gerrit.wikimedia.org/r/357371

Change 357371 merged by Gehel:
[operations/puppet@production] elasticsearch - raise logging of actions to INFO

https://gerrit.wikimedia.org/r/357371

debt triaged this task as Medium priority. Jun 8 2017, 5:07 PM
debt edited projects, added Discovery-Search (Current work); removed Discovery-Search.
debt moved this task from In Progress to Needs review on the Discovery-Search (Current work) board.
Gehel added a comment. Jun 19 2017, 8:27 AM

@dcausse, @EBernhardson: this error is now filtered in the logs. Do we want to address the root cause? Or is this just a side issue that is safe to ignore?

debt assigned this task to EBernhardson. Jun 20 2017, 5:12 PM
debt reassigned this task from EBernhardson to dcausse.
dcausse lowered the priority of this task from Medium to Low. Jun 21 2017, 2:36 PM
dcausse moved this task from Needs review to In Progress on the Discovery-Search (Current work) board.

It would be interesting to know why we get these errors, but I don't think it's very urgent. I'm pretty sure that these errors are not new.
There is definitely something in our indexing pipeline that is sending invalid docs.
Lowering the priority and moving to the backlog; please change it if you think this is important to address.
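
One way to find out which part of the pipeline sends these updates would be to inspect bulk responses on the client side and log the missing-document failures separately from other errors. The sketch below assumes the usual bulk response layout (an `items` list of single-key objects, with an `error` object carrying a `type` field). `split_bulk_failures` is a hypothetical helper, not something taken from our indexing code.

```python
def split_bulk_failures(bulk_response):
    """Split failed bulk items into (missing-document failures, other failures).

    Each returned record is a tuple (operation, index, doc_id).
    """
    missing, other = [], []
    for item in bulk_response.get("items", []):
        # Each item is a single-key dict keyed by the operation name
        # ("index", "update", "delete", ...).
        for op, result in item.items():
            error = result.get("error")
            if not error:
                continue  # the item succeeded
            record = (op, result.get("_index"), result.get("_id"))
            if isinstance(error, dict) and error.get("type") == "document_missing_exception":
                missing.append(record)
            else:
                other.append(record)
    return missing, other


# Usage with a hand-written response (assumed shape, not captured from the cluster).
response = {
    "errors": True,
    "items": [
        {"update": {"_index": "jawiki_content", "_id": "1615532", "status": 404,
                    "error": {"type": "document_missing_exception",
                              "reason": "[page][1615532]: document missing"}}},
        {"update": {"_index": "jawiki_content", "_id": "42", "status": 200}},
    ],
}
missing, other = split_bulk_failures(response)
print(missing, other)
```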

debt added a subscriber: debt.

Gotcha, thanks for taking a look, @dcausse. I'll move it to the backlog board for up-next work (I'm trying to keep the backlog column on the sprint board for stuff we need to tackle first).

debt moved this task from needs triage to Up Next on the Discovery-Search board. Jun 21 2017, 2:49 PM
debt moved this task from Up Next to This Quarter on the Discovery-Search board. Aug 3 2017, 9:42 PM
Dinacel added a subscriber: Dinacel. May 9 2018, 5:20 PM