
Investigate what is creating Redis transactions and whether it can be fixed
Closed, Resolved (Public)

Description

In T122676#3411664, we see that HA redundancy using twemproxy won't support the MULTI command. Find out why we're using transactions, then fix it or otherwise work around this incompatibility. If this turns out to be related to Celery task management, I can't think of any reason that we wouldn't want at-least-once task consumption.

Event Timeline

awight triaged this task as Medium priority. Jun 11 2018, 11:26 AM
awight created this task.

This recent discussion makes it look like Celery is responsible for the transaction, and that it's a side-effect of using pipelines: https://github.com/celery/celery/issues/3500

Vvjjkkii renamed this task from "Investigate what is creating Redis transactions and whether it can be fixed" to "69aaaaaaaa". Jul 1 2018, 1:04 AM
Vvjjkkii raised the priority of this task from Medium to High.
Vvjjkkii updated the task description.
Vvjjkkii removed subscribers: Aklapper, gerritbot.
CommunityTechBot renamed this task from "69aaaaaaaa" to "Investigate what is creating Redis transactions and whether it can be fixed". Jul 2 2018, 9:44 AM
CommunityTechBot lowered the priority of this task from High to Medium.
CommunityTechBot updated the task description.
CommunityTechBot added subscribers: Aklapper, gerritbot.

Running redis-cli monitor on deployment-ores01 shows transactions like these:

1541697362.325121 [0 10.68.16.235:35308] "BRPOP" "celery" "celery\x06\x163" "celery\x06\x166" "celery\x06\x169" "1"
1541697362.326760 [0 10.68.16.235:35310] "MULTI"
1541697362.326771 [0 10.68.16.235:35310] "ZREM" "unacked_index" "cba164cf-1a37-4f1e-8d5d-2019788166be"
1541697362.326782 [0 10.68.16.235:35310] "HDEL" "unacked" "cba164cf-1a37-4f1e-8d5d-2019788166be"
1541697362.326791 [0 10.68.16.235:35310] "EXEC"


1541697362.327471 [0 10.68.16.235:35670] "GET" "celery-task-meta-20299d80-5b2b-4d49-a38f-42b2620fef1e"
1541697362.341398 [0 10.68.16.235:35332] "MULTI"
1541697362.341424 [0 10.68.16.235:35332] "SETEX" "celery-task-meta-20299d80-5b2b-4d49-a38f-42b2620fef1e" "86400" "\x80\x02}q\x00(X\x06\x00\x00\x00resultq\x01}q\x02X\t\x00\x00\x00goodfaithq\x03}q\x04X\x05\x00\x00\x00scoreq\x05}q\x06(X\n\x00\x00\x00predictionq\a\x88X\x0b\x00\x00\x00probabilityq\b}q\t(\x89G=\xe2\xde\xac\x00\x00\x00\x00\x88G?\xef\xff\xff\xff\xed!TuussX\t\x00\x00\x00tracebackq\nNX\b\x00\x00\x00childrenq\x0b]q\x0cX\x06\x00\x00\x00statusq\rX\a\x00\x00\x00SUCCESSq\x0eX\a\x00\x00\x00task_idq\x0fX$\x00\x00\x0020299d80-5b2b-4d49-a38f-42b2620fef1eq\x10u."
1541697362.341471 [0 10.68.16.235:35332] "PUBLISH" "celery-task-meta-20299d80-5b2b-4d49-a38f-42b2620fef1e" "\x80\x02}q\x00(X\x06\x00\x00\x00resultq\x01}q\x02X\t\x00\x00\x00goodfaithq\x03}q\x04X\x05\x00\x00\x00scoreq\x05}q\x06(X\n\x00\x00\x00predictionq\a\x88X\x0b\x00\x00\x00probabilityq\b}q\t(\x89G=\xe2\xde\xac\x00\x00\x00\x00\x88G?\xef\xff\xff\xff\xed!TuussX\t\x00\x00\x00tracebackq\nNX\b\x00\x00\x00childrenq\x0b]q\x0cX\x06\x00\x00\x00statusq\rX\a\x00\x00\x00SUCCESSq\x0eX\a\x00\x00\x00task_idq\x0fX$\x00\x00\x0020299d80-5b2b-4d49-a38f-42b2620fef1eq\x10u."
1541697362.341509 [0 10.68.16.235:35332] "EXEC"

These are vital parts of Celery (acknowledging consumed tasks and storing task results), and I doubt they would be easily fixable. The issue on GitHub basically implies the same thing.
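For anyone who wants to repeat this check on a longer capture, here is a minimal sketch in Python that groups the commands issued between MULTI and EXEC by client connection. The function name `find_transactions` and the `SAMPLE` list are hypothetical; the sample lines are abbreviated from the trace above, with the pickled result payloads replaced by a `<payload>` placeholder. It assumes the `redis-cli monitor` line layout shown in this task and has not been run against production output.

```python
import re

# Matches one `redis-cli monitor` line: timestamp, [db client], then quoted args.
LINE_RE = re.compile(r'^(?P<ts>\d+\.\d+) \[(?P<db>\d+) (?P<client>[^\]]+)\] (?P<rest>.*)$')
# Matches each double-quoted argument, allowing backslash escapes inside it.
ARG_RE = re.compile(r'"((?:[^"\\]|\\.)*)"')


def find_transactions(lines):
    """Return a list of (client, [commands]) for each MULTI ... EXEC block."""
    open_tx = {}   # client address -> commands queued since its MULTI
    finished = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip blank or unrecognized lines
        args = ARG_RE.findall(match.group("rest"))
        if not args:
            continue
        client = match.group("client")
        command = args[0].upper()
        if command == "MULTI":
            open_tx[client] = []
        elif command == "EXEC":
            if client in open_tx:
                finished.append((client, open_tx.pop(client)))
        elif client in open_tx:
            open_tx[client].append(command)
    return finished


# Abbreviated sample taken from the trace above (payloads elided as <payload>).
SAMPLE = [
    r'1541697362.325121 [0 10.68.16.235:35308] "BRPOP" "celery" "celery\x06\x163" "1"',
    r'1541697362.326760 [0 10.68.16.235:35310] "MULTI"',
    r'1541697362.326771 [0 10.68.16.235:35310] "ZREM" "unacked_index" "cba164cf-1a37-4f1e-8d5d-2019788166be"',
    r'1541697362.326782 [0 10.68.16.235:35310] "HDEL" "unacked" "cba164cf-1a37-4f1e-8d5d-2019788166be"',
    r'1541697362.326791 [0 10.68.16.235:35310] "EXEC"',
    r'1541697362.327471 [0 10.68.16.235:35670] "GET" "celery-task-meta-20299d80-5b2b-4d49-a38f-42b2620fef1e"',
    r'1541697362.341398 [0 10.68.16.235:35332] "MULTI"',
    r'1541697362.341424 [0 10.68.16.235:35332] "SETEX" "celery-task-meta-20299d80-5b2b-4d49-a38f-42b2620fef1e" "86400" "<payload>"',
    r'1541697362.341471 [0 10.68.16.235:35332] "PUBLISH" "celery-task-meta-20299d80-5b2b-4d49-a38f-42b2620fef1e" "<payload>"',
    r'1541697362.341509 [0 10.68.16.235:35332] "EXEC"',
]

if __name__ == "__main__":
    for client, commands in find_transactions(SAMPLE):
        print(client, commands)
```

On the sample, this reports two transactions: ZREM plus HDEL on the unacked bookkeeping keys, and SETEX plus PUBLISH on the task result key. Tracking state per client matters because `monitor` interleaves commands from all connections.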

This is done; we are not fixing transactions. We'll work on Sentinel instead.