
Hashtags tool has not been collecting data since September 30th
Closed, Resolved (Public)

Description

Since around last summer, the hashtag tool has not been displaying results for recent time periods.
This seems to be getting worse.
For example, I am currently running a #shesaid campaign on Wikiquote. I got results until the end of September, but nothing after the 30th, and it is now the end of November.
Last summer, we ran the #WPWP campaign, which closed on Aug 30, and most of the final results only showed up in the tool in mid to late September.

Event Timeline

Samwalton9 renamed this task from Collection script seemingly stops working (Nov 2021) to Hashtags tool has not been collecting data since September 30th. (Dec 13 2021, 11:49 AM)
Samwalton9 triaged this task as High priority.
Samwalton9 added a subscriber: KPX8.

Sorry for the delay in looking into this - the tool isn't officially maintained and I missed this.

It looks like the last data we have is September 30th.

I'm trying to fix this today; the problem appears to be an SSL error in the collection script container - it can't connect to the eventstream. I suspect this is because we're using an old VM, so I'm going to aim to kill two birds with one stone and upgrade production to Debian 11.
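For anyone wanting to reproduce this kind of failure, a TLS handshake check along the following lines might help. This is only a rough sketch, not the tool's actual code: the hostname `stream.wikimedia.org` and the helper names `make_context` and `tls_handshake_info` are assumptions made for illustration.

```python
import socket
import ssl

# Assumed hostname of the public event stream; not stated in this task.
STREAM_HOST = "stream.wikimedia.org"


def make_context():
    """Build a client context using the system's default CA store and TLS settings."""
    return ssl.create_default_context()


def tls_handshake_info(host=STREAM_HOST, port=443, timeout=10.0):
    """Attempt a TLS handshake and return (protocol_version, cipher_name).

    Raises ssl.SSLError (or another OSError) if the handshake fails, which
    is the kind of error an outdated OpenSSL or CA bundle can produce.
    """
    context = make_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version(), tls.cipher()[0]
```

Running `tls_handshake_info()` inside the container and again on a current host would show whether the failure is specific to the old environment.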

Once this gets up and running again we'll be able to collect historical data from up to 30 days ago, but unfortunately will have lost anything between then and September 30th.
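As a rough illustration of how a replay of historical events can be requested, the sketch below builds a stream URL with a `since` timestamp, which Wikimedia EventStreams accepts for historical consumption. Whether the collection script works this way is an assumption; the `recentchange` stream URL and the helper name `replay_url` are chosen for illustration only.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Assumed endpoint; this task does not say which stream the tool reads.
STREAM_URL = "https://stream.wikimedia.org/v2/stream/recentchange"


def replay_url(since):
    """Return a stream URL that asks the server to start from `since`.

    `since` must be a timezone-aware datetime; it is converted to UTC and
    formatted as an ISO 8601 timestamp.
    """
    stamp = since.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{STREAM_URL}?{urlencode({'since': stamp})}"


# Example: request events starting from 13 November 2021, 00:00 UTC.
url = replay_url(datetime(2021, 11, 13, tzinfo=timezone.utc))
```

A timestamp older than the stream's retention window cannot be served, which is why events from before that window are lost.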

Samwalton9 claimed this task.

OK I think this is all up and running again.

I updated Python to 3.9 and Django to 3.2, and set everything up in a new Debian 11 instance. This seems to have solved the SSL Error, and we now have data coming in.

Unfortunately we are unable to retrieve data from September 30th to some time on November 13th because the EventStream only goes back 30 days. Data collection is currently catching up to the present day, and is part way through 14th November now.
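The size of the lost window follows from simple date arithmetic. The sketch below assumes collection restarted on 13 December 2021 (the day the fix was started; the exact restart time is not stated here) and uses the 30-day figure quoted above. The function name `unrecoverable_gap` is hypothetical.

```python
from datetime import date, timedelta

RETENTION = timedelta(days=30)  # retention figure quoted in the comment above


def unrecoverable_gap(last_collected, restarted_on, retention=RETENTION):
    """Return (gap_start, gap_end) for data that can no longer be replayed.

    Returns None when the retention window still reaches back to the last
    collected day, meaning nothing is lost.
    """
    earliest_replayable = restarted_on - retention
    if earliest_replayable <= last_collected:
        return None
    return last_collected, earliest_replayable


# Last data on 30 September 2021; restart assumed on 13 December 2021.
gap = unrecoverable_gap(date(2021, 9, 30), date(2021, 12, 13))
```

Under these assumptions the gap runs from 30 September to 13 November 2021, which matches the dates given above.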

The official URL of the tool is now https://hashtags.wmcloud.org/, but links to .wmflabs.org will redirect you to the correct place.

Hi, it's not working again. It was working for a few days, but not now. See here:

https://hashtags.wmcloud.org/?query=soutezceskoslovensko2021&project=cs.wikipedia.org&startdate=2021-11-17&enddate=2022-01-10&search_type=or&user=

and here:

https://hashtags.wmcloud.org/graph/?query=soutezceskoslovensko2021&project=cs.wikipedia.org&startdate=2021-11-17&enddate=2022-01-10&search_type=or&user=

It was collecting the hashtagged data until December 7th, and nothing after that. It's frustrating.

Best regards,

KPX8

---------- Original e-mail ----------

From: Samwalton9 <no-reply@phabricator.wikimedia.org>
To: Phabricator <no-reply@phabricator.wikimedia.org>
Date: 14 Dec 2021 15:12:44
Subject: [Maniphest] [Closed] T296410: Hashtags tool has not been collecting data since September 30th
"View Task (https://phabricator.wikimedia.org/T296410)

Samwalton9 closed this task as "Resolved".
Samwalton9 claimed this task.
Samwalton9 added a comment.

TASK DETAIL
https://phabricator.wikimedia.org/T296410

"

Hi @KPX8, it looks to me like the data collection has been functioning as normal. The two links you shared show data throughout the period, with the latest on Dec 31.

Do you have examples of any edits you're expecting to see that aren't showing up?