
Increase kafka event retention to 31
Closed, Resolved | Public | 2 Estimated Story Points

Description

For the Kafka cluster to be usable as a reliable WDQS event source, events should be retained for more than 7 days. This is necessary because dumps are produced only weekly, and loading and catching up can take several days, so by the time all the data is loaded we may be close to or over 7 days behind.

14 days should be enough for most cases, though 21 days would be even better.

Event Timeline

Smalyshev triaged this task as Medium priority.Feb 14 2018, 8:51 AM
Smalyshev created this task.
Smalyshev raised the priority of this task from Medium to Needs Triage.Feb 14 2018, 8:51 AM
Smalyshev moved this task from Incoming to Watching / Waiting on the Wikidata-Query-Service board.

I think we can do this just for the mediawiki eventbus topics on the jumbo cluster.

Currently on kafka-main machines the disk utilization is really low, so I think we can easily do it without kafka-jumbo.

 dsh --group kafka_eqiad --remoteshell ssh -- 'df -h' | grep '/srv'

/dev/md1        7.2T  737G  6.1T  11% /srv
/dev/md1        7.2T  736G  6.1T  11% /srv
/dev/md1        7.2T  822G  6.0T  12% /srv

dsh --group kafka_codfw --remoteshell ssh -- 'df -h' | grep '/srv' 

/dev/md1        7.2T  361G  6.5T   6% /srv
/dev/md1        7.2T  362G  6.5T   6% /srv
/dev/md1        7.2T  362G  6.5T   6% /srv

Even increasing retention 3 times gets us to roughly 33-36% disk utilization at most, which should be OK.
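As a rough check of that estimate, here is a minimal sketch of the projection. It assumes disk usage scales linearly with retention time, which is only an approximation (topics that are not bound by time-based retention would not grow this way). The function name `projected_pct` is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: project disk utilization after multiplying retention.
# Assumes usage grows linearly with retention time (an approximation).
projected_pct() {
    local used_pct=$1   # current utilization in percent, as reported by df -h
    local factor=$2     # retention multiplier, e.g. 3 for 7 -> 21 days
    echo $(( used_pct * factor ))
}

# Busiest eqiad broker in the df output above is at 12%:
projected_pct 12 3   # prints 36
# The other two eqiad brokers are at 11%:
projected_pct 11 3   # prints 33
```

Under the same assumption the codfw brokers, at 6%, would end up at about 18%.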

Since we only need super-high retention for a subset of topics (it is not needed for jobs, retry topics, internal topics, etc.), ideally these would be topic-level configs, but we don't currently have proper support for specifying per-topic configuration (see T157092).

Is there a reason we want to do this on main instead of jumbo? Stas will be consuming from jumbo, since it has timestamp offset support.

I think I need it only on jumbo; that's where I'll be connecting.

We'll have this on our radar, until things are stable.

I'll make this 31 days just to bump it up to a month. We have plenty of space for this.
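Kafka's topic-level `retention.ms` setting is expressed in milliseconds, so 31 days has to be converted before it can be applied. A minimal sketch of that arithmetic, with a hypothetical helper name `days_to_ms`:

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a retention period in days to milliseconds,
# the unit used by Kafka's topic-level retention.ms setting.
days_to_ms() {
    local days=$1
    # days * hours * minutes * seconds * milliseconds
    echo $(( days * 24 * 60 * 60 * 1000 ))
}

days_to_ms 31   # prints 2678400000
```

The result matches the `retention.ms=2678400000` value used in the commands in this task.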

Doing the following for all mediawiki topics on main-eqiad and main-codfw:

for t in \
    mediawiki.page-create                    \
    mediawiki.page-delete                    \
    mediawiki.page-edit                      \
    mediawiki.page-move                      \
    mediawiki.page-properties-change         \
    mediawiki.page-restrictions-change       \
    mediawiki.page-undelete                  \
    mediawiki.recentchange                   \
    mediawiki.resource-change                \
    mediawiki.resource_change                \
    mediawiki.resourcechange                 \
    mediawiki.revision-create                \
    mediawiki.revision-score                 \
    mediawiki.revision-visibility-change     \
    mediawiki.user-blocks-change             ; do
    for dc in eqiad codfw; do
        topic="${dc}.${t}"
        echo kafka configs --alter --entity-type topics --entity-name $topic --add-config retention.ms=2678400000
    done
done

kafka configs --alter --entity-type topics --entity-name eqiad.mediawiki.page-create --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name codfw.mediawiki.page-create --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name eqiad.mediawiki.page-delete --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name codfw.mediawiki.page-delete --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name eqiad.mediawiki.page-edit --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name codfw.mediawiki.page-edit --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name eqiad.mediawiki.page-move --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name codfw.mediawiki.page-move --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name eqiad.mediawiki.page-properties-change --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name codfw.mediawiki.page-properties-change --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name eqiad.mediawiki.page-restrictions-change --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name codfw.mediawiki.page-restrictions-change --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name eqiad.mediawiki.page-undelete --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name codfw.mediawiki.page-undelete --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name eqiad.mediawiki.recentchange --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name codfw.mediawiki.recentchange --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name eqiad.mediawiki.resource-change --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name codfw.mediawiki.resource-change --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name eqiad.mediawiki.resource_change --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name codfw.mediawiki.resource_change --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name eqiad.mediawiki.resourcechange --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name codfw.mediawiki.resourcechange --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name eqiad.mediawiki.revision-create --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name codfw.mediawiki.revision-create --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name eqiad.mediawiki.revision-score --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name codfw.mediawiki.revision-score --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name eqiad.mediawiki.revision-visibility-change --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name codfw.mediawiki.revision-visibility-change --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name eqiad.mediawiki.user-blocks-change --add-config retention.ms=2678400000
kafka configs --alter --entity-type topics --entity-name codfw.mediawiki.user-blocks-change --add-config retention.ms=2678400000
[@kafka1001:/home/otto] $

mediawiki eventbus topics should now be retained for 31 days in main Kafka clusters. If we add a new mediawiki topic, we need to remember to run this command for it.
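To make that reminder easier to act on, here is a minimal sketch of a helper that prints the same commands for a single new topic in both datacenters. The function name `print_retention_cmds` and the topic `mediawiki.example-topic` are hypothetical. Like the loop above, it only echoes the commands and changes nothing on the cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the retention commands for one new topic.
# It only echoes the commands (dry run), mirroring the loop used in this task.
RETENTION_MS=2678400000   # 31 days in milliseconds

print_retention_cmds() {
    local t=$1   # topic name without the datacenter prefix
    local dc
    for dc in eqiad codfw; do
        echo "kafka configs --alter --entity-type topics --entity-name ${dc}.${t} --add-config retention.ms=${RETENTION_MS}"
    done
}

# Example with a made-up topic name:
print_retention_cmds mediawiki.example-topic
```

The printed lines can be reviewed first and then run by hand on the appropriate Kafka host.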

Ottomata renamed this task from Increase kafka event retention to 14 or 21 days to Increase kafka event retention to 31.Jun 12 2018, 7:22 PM
Ottomata claimed this task.
Ottomata set the point value for this task to 2.
Ottomata moved this task from Next Up to Done on the Analytics-Kanban board.

If we add a new mediawiki topic, we need to remember to run this command for it.

Or implement T157092 :)