
Druid Webrequest sampled 128 has missing data for 1 hour
Closed, Resolved, Public

Description

I noticed tonight that the webrequest_sampled_128 dataset in Druid is missing data for the hour between 2023-05-20 00:00:00 and 2023-05-20 01:00:00.

Screenshot 2023-05-20 at 10.43.59.png (1×2 px, 250 KB)

When grouping by webrequest_source, it looks like almost all of the text data is missing, while the upload data seems OK:

image.png (1×2 px, 307 KB)

As a comparison, the data in the live dataset populated by benthos looks totally fine.
Did something break in the pipeline for that hour?
I'm also curious to know if there is any monitoring in place to detect this kind of situation.
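
To make the monitoring question concrete, here is a minimal sketch of the kind of check that could flag an hour like this one. It is not the existing pipeline's monitoring, and it does not query Druid: it assumes the hourly row counts per webrequest_source have already been fetched into a dictionary. The function name `find_suspicious_hours`, the `min_ratio` threshold, and the sample numbers are all hypothetical. An hour that is entirely absent from the input would not be seen, so the caller would need to fill missing hours with a count of 0 first.

```python
from statistics import median


def find_suspicious_hours(hourly_counts, min_ratio=0.5):
    """Flag hours whose row count is far below the source's typical level.

    hourly_counts: {source: {hour_label: row_count}} (hypothetical input shape).
    Returns a list of (source, hour, count, baseline) tuples, where baseline is
    the median count of that source's other hours.
    """
    flagged = []
    for source, counts in hourly_counts.items():
        for hour, count in sorted(counts.items()):
            # Compare each hour against the other hours of the same source.
            others = [c for h, c in counts.items() if h != hour]
            if not others:
                continue  # nothing to compare against
            baseline = median(others)
            if count < min_ratio * baseline:
                flagged.append((source, hour, count, baseline))
    return flagged


# Made-up numbers, only to show the shape of the input and output.
example = {
    "text": {"2023-05-20T00": 10, "2023-05-20T01": 1000,
             "2023-05-20T02": 1100, "2023-05-20T03": 900},
    "upload": {"2023-05-20T00": 500, "2023-05-20T01": 520,
               "2023-05-20T02": 480, "2023-05-20T03": 510},
}
print(find_suspicious_hours(example))
# prints [('text', '2023-05-20T00', 10, 1000)]
```

Comparing each source against its own median, rather than against the overall total, is what would surface a case like this one, where text drops sharply while upload stays normal.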

Event Timeline

Mentioned in SAL (#wikimedia-analytics) [2023-05-25T08:37:43Z] <joal> rerun druid_load_webrequest_sampled_128_daily 2023-05-20 to reload missing hour (T337088)

I have tried rerunning the loading job but this has not solved the issue. More investigation is needed.

Thanks a lot for raising the issue; this is now solved (see parent ticket).

@JAllemandou thanks a lot for fixing it. I'm resolving this, as you have the other one open for potential follow-ups on the monitoring side.