banner and landing page data ingress (pgehres) is broken
Closed, Resolved, Public, 1 Estimated Story Points

Description

The timestamp format in log files changed, and is no longer compatible with the ingress script's regex.

Current regex snippet:

[[ https://github.com/wikimedia/wikimedia-fundraising-tools-DjangoBannerStats/blob/master/fundraiser/analytics/regex.py#L10 |(?P<timestamp>[0-9-]+T[0-9:.]+)]]

The new timestamp format includes a "Z" at the end, so we just need to adjust for that!
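A minimal sketch of the kind of adjustment described above, assuming the timestamp group is followed by whitespace in the full pattern. Only the `(?P<timestamp>...)` group comes from this task; the surrounding `^...\s+(?P<rest>.*)$` pattern and the sample log lines are hypothetical, and the merged patch may differ in detail.

```python
import re

# Timestamp group as quoted in this task. The anchors, the whitespace
# separator and the "rest" group are hypothetical stand-ins for the
# remainder of the real log-line pattern.
OLD_LINE = re.compile(r"^(?P<timestamp>[0-9-]+T[0-9:.]+)\s+(?P<rest>.*)$")

# Same pattern, but tolerating an optional trailing "Z" after the timestamp.
NEW_LINE = re.compile(r"^(?P<timestamp>[0-9-]+T[0-9:.]+)Z?\s+(?P<rest>.*)$")

# Made-up sample lines in the old and new timestamp formats.
old_format_line = "2019-07-01T13:44:59 example-payload"
new_format_line = "2019-07-01T13:45:01Z example-payload"

# The old pattern rejects the new format, because "Z" is not whitespace.
old_on_new = OLD_LINE.match(new_format_line)

# The adjusted pattern accepts both formats.
new_on_old = NEW_LINE.match(old_format_line)
new_on_new = NEW_LINE.match(new_format_line)
```

Because the `Z?` sits outside the named group, the captured `timestamp` value has the same shape for both formats, so downstream parsing of that group would not need to change under these assumptions.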

Thanks @Ejegg for finding this!

Event Timeline

AndyRussG renamed this task from pgehres data ingress broken to banner and landing page data ingress (pgehres) is broken. Jul 3 2019, 3:44 PM
AndyRussG updated the task description.

Change 520344 had a related patch set uploaded (by Ejegg; owner: Ejegg):
[wikimedia/fundraising/tools/DjangoBannerStats@master] Allow 'Z' at end of timestamp

https://gerrit.wikimedia.org/r/520344

Change 520476 had a related patch set uploaded (by AndyRussG; owner: AndyRussG):
[wikimedia/fundraising/tools/DjangoBannerStats@master] Update regex for new timestamp format in log lines

https://gerrit.wikimedia.org/r/520476

Change 520476 abandoned by AndyRussG:
[WIP] Update regex for new timestamp format in log lines

Reason:
Iffc9d2599a is better!

https://gerrit.wikimedia.org/r/520476

Change 520344 merged by jenkins-bot:
[wikimedia/fundraising/tools/DjangoBannerStats@master] Allow 'Z' at end of timestamp

https://gerrit.wikimedia.org/r/520344

The job should be processing correctly now. We also backfilled the files that weren't processed properly due to this error. Most of the data since the outage, and going forward, should be OK now.

However, data for one short period is still not OK: 2019-07-01, between 13:45:01 and 14:30:00.

This is due to the transition period, during which files included some lines with the old timestamp format and others with the new, incompatible one.
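For anyone wanting to confirm which files fall into that transition period, here is a rough sketch that counts old-format and new-format lines in a file's contents. It assumes the timestamp sits at the start of each line and is followed by whitespace; that layout, the function names and the sample lines are hypothetical, and only the timestamp shape comes from this task.

```python
import re
from collections import Counter

# Hypothetical line layout: timestamp at the start of the line, optional
# trailing "Z" (new format), then whitespace.
TIMESTAMP = re.compile(r"^(?P<timestamp>[0-9-]+T[0-9:.]+)(?P<zulu>Z?)\s")


def count_timestamp_formats(lines):
    """Count lines using the old format, the new format, or neither."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP.match(line)
        if match is None:
            counts["unparsed"] += 1
        elif match.group("zulu"):
            counts["new"] += 1
        else:
            counts["old"] += 1
    return counts


def is_mixed(lines):
    """Return True if the lines contain both timestamp formats."""
    counts = count_timestamp_formats(lines)
    return counts["old"] > 0 and counts["new"] > 0


# Made-up sample resembling a file written during the transition.
sample = [
    "2019-07-01T13:44:59 example-payload",
    "2019-07-01T13:45:01Z example-payload",
]
sample_counts = count_timestamp_formats(sample)
sample_is_mixed = is_mixed(sample)
```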

Again, data before and after that should be fine, though.

Thanks all, and thanks @Ejegg for help with deployment and backfill!