Hashtags not reflecting new changes
Closed, Resolved · Public

Description

The tool has been showing 'latest edits may not currently be reflected in the tool' and is lagging behind by almost 2 days.

Event Timeline

Sorry for not responding earlier, but this looks to be resolved now.

Hi @Samwalton9. Yes, it worked. But it's stuck again since August 4 (the day you made this comment).

Samwalton9-WMF added a subscriber: Surlycyborg.

I am at Wikimania this week but will investigate next week. @Surlycyborg may have time to look into this in the meantime.

A simple query like https://hashtags.wmcloud.org/?query=proveit can reproduce this. Right now I don't see any edits post August 4th.

Running locally, I am able to fetch edits from later than that, which leads me to suspect something is wrong in production.

I also don't see errors in the developer console for the URL above, so I think it's not very likely to be a client-side issue.

Sam, maybe you can check the status of the scripts container in production please?

Thanks for confirming. I found time to take a look and the issue is pretty clear-cut:

mysql.connector.errors.DatabaseError: 1114 (HY000): The table 'hashtags_hashtag' is full

I don't have time to back up and recreate a new instance with more disk space until next week. In the meantime I deleted an old backup, freeing up a small amount of space, and restarted the scripts container, so things should start catching up a little.
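For reference, the quickest way to see which tables are actually eating the disk is to ask information_schema. A minimal sketch using mysql.connector (which the tool already uses, per the error above); the connection parameters are placeholders:

import mysql.connector

# Placeholders: point these at the production hashtags database.
conn = mysql.connector.connect(host="localhost", user="hashtags",
                               password="...", database="hashtags")
cur = conn.cursor()
# List tables in the current schema by on-disk size (data + indexes).
cur.execute("""
    SELECT table_name,
           ROUND((data_length + index_length) / 1024 / 1024) AS size_mb
    FROM information_schema.tables
    WHERE table_schema = DATABASE()
    ORDER BY size_mb DESC
""")
for table_name, size_mb in cur.fetchall():
    print(f"{table_name}: {size_mb} MB")
conn.close()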

Hi, thanks for identifying the cause and the temporary fix! Just to keep you updated, now it seems the tool has stopped recording changes since August 18, cheers!

The hashtag tool seems not to be updating. The last update was on the 20th of August. Please look into this too. Thank you.

So the best solution might be to attach a new volume to the Cloud instance and store the database there (if such a thing is possible), but that goes beyond my current abilities, and I'd rather not have to drag our team's engineers into working on this. An alternative option would be to trim the database a little. The current top hashtag is flickr2commons (2.7 million entries; 38% of database rows). The tool already times out if you try to load a search for flickr2commons, and flickr2commons edits can easily be tracked via tags since they only happen on Wikimedia Commons.

Given this, I think I'm going to delete all current flickr2commons database rows and exclude flickr2commons from tracking. This frees up nearly 40% of database capacity, giving us a lot more breathing room for a better solution in the future.
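For the record, the 2.7 million / 38% figures come down to a per-hashtag count, and the cleanup is essentially a single DELETE. A sketch of both; it assumes the hashtag text is stored in a column named hashtag on hashtags_hashtag (the column name is an assumption — only the table name and its timestamp column are confirmed above):

import mysql.connector

conn = mysql.connector.connect(host="localhost", user="hashtags",
                               password="...", database="hashtags")
cur = conn.cursor()

# Which hashtags dominate the table? (the `hashtag` column name is assumed)
cur.execute("""
    SELECT hashtag, COUNT(*) AS cnt
    FROM hashtags_hashtag
    GROUP BY hashtag
    ORDER BY cnt DESC
    LIMIT 10
""")
for hashtag, cnt in cur.fetchall():
    print(hashtag, cnt)

# Drop the dominant hashtag's rows to reclaim space.
# In practice this would probably be batched rather than run as one big DELETE.
cur.execute("DELETE FROM hashtags_hashtag WHERE hashtag = 'flickr2commons'")
conn.commit()
conn.close()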

Dropped all flickr2commons data, so the script is already starting to play catch up.

Added flickr2commons to the hashtags exclusion list. We'll give this a week or so and see if we've caught up.
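The exclusion itself presumably amounts to filtering matches against a blocklist somewhere in the collection script. A hypothetical sketch only; EXCLUDED_HASHTAGS and where it hooks in are assumptions for illustration, not the tool's actual code:

# Hypothetical exclusion check for collect_hashtags.py.
EXCLUDED_HASHTAGS = {"flickr2commons"}

def filter_excluded(hashtag_matches):
    """Drop matched hashtags that we have decided not to track."""
    return [h for h in hashtag_matches
            if h.lstrip("#").lower() not in EXCLUDED_HASHTAGS]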

Because I was having some issues with disk space and backups, I also added a new volume for backups, located at /backups.

Looks like this is resolved, though the flickr2commons exclusion didn't go through for some reason. I'll look into that soon.

Hi @Samwalton9-WMF, we've used the Hashtag tool for one of WMCZ's competitions and we are just trying to pick winners. We found out that some edits were not detected by the tool until now. Please see the query and list of some manually picked examples of not detected edits below. Is there anything we can do about it?

https://hashtags.wmcloud.org/?query=WPWP&project=cs.wikipedia.org&startdate=2023-05-01&enddate=2023-12-10&search_type=or&user=&page=2

Thank you Sam for the fix.

This issue actually affected everyone, and for an even longer period; the data seems to be missing for almost four days. I have filed T346206 to track that separately, since this task effectively resolved the main issue of the continuous update failing.

Hi! The hashtags tool seems to have stopped recording changes again, since December 11. See for example https://hashtags.wmcloud.org/?query=proveit

Apologies for the delay on this, I've been busy and this tool isn't one I'm particularly well suited to maintain at the moment. I'm seeing lots of this in the scripts container log:

Traceback (most recent call last):
  File "/scripts/scripts/collect_hashtags.py", line 144, in <module>
    hashtag_matches = hashtag_match(change['comment'])
KeyError: 'comment'

Not immediately obvious to me why this would be. Maybe 'comment' data isn't sent anymore for stream events with no edit summary. Looks like we have data up to 11th December, then a little on the 15th and 20th, perhaps small windows where all the edits happened to have edit summaries. Trying a hotfix.

It doesn't help that I can't rebuild the app because of insufficient disk space. That was going to come back to bite us one way or another so this probably warrants a better fix.

The disk space issue (T355176) is resolved.

This is also resolved, courtesy of a quick hotfix to check for 'comment' in the response data.
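The fix is essentially a guard around the line shown in the traceback above; roughly this shape (a sketch, not the exact committed change):

# Around line 144 of collect_hashtags.py: some recentchange events carry
# no edit summary, so 'comment' can be absent from the event payload and
# has to be checked before matching.
for change in changes:                # 'changes' = the incoming stream events
    comment = change.get('comment')
    if comment is None:
        continue                      # no edit summary, nothing to match
    hashtag_matches = hashtag_match(comment)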

SELECT COUNT(*) as cnt, DATE(timestamp) FROM hashtags_hashtag WHERE timestamp > '2023-12-01' GROUP BY DATE(timestamp) ORDER BY DATE(timestamp) ASC;
+------+-----------------+
| cnt  | DATE(timestamp) |
+------+-----------------+
| 2413 | 2023-12-01      |
| 1882 | 2023-12-02      |
| 3010 | 2023-12-03      |
| 3591 | 2023-12-04      |
| 2918 | 2023-12-05      |
| 3005 | 2023-12-06      |
| 2792 | 2023-12-07      |
| 2489 | 2023-12-08      |
| 2900 | 2023-12-09      |
| 1830 | 2023-12-10      |
| 1832 | 2023-12-11      |
|    5 | 2023-12-15      |
|   97 | 2023-12-20      |
|   37 | 2024-01-10      |
+------+-----------------+

Data is now collecting from the 10th January and will gradually catch up to recording live data again.

Thanks for looking into this! However, I'm currently getting a 502 Bad Gateway error.

Hm, yep, seeing that.

No errors in the app container, but the nginx container is showing connection refused. We'll look into it.

This seems to still be on a happy journey back to being up to date:

+------+-----------------+
| cnt  | DATE(timestamp) |
+------+-----------------+
| 2413 | 2023-12-01      |
| 1882 | 2023-12-02      |
| 3010 | 2023-12-03      |
| 3591 | 2023-12-04      |
| 2918 | 2023-12-05      |
| 3005 | 2023-12-06      |
| 2792 | 2023-12-07      |
| 2489 | 2023-12-08      |
| 2900 | 2023-12-09      |
| 1830 | 2023-12-10      |
| 1832 | 2023-12-11      |
|    5 | 2023-12-15      |
|   97 | 2023-12-20      |
|  253 | 2024-01-10      |
|  526 | 2024-01-11      |
| 1142 | 2024-01-12      |
| 1813 | 2024-01-13      |
|  324 | 2024-01-14      |
+------+-----------------+

It looks like data from the 10th and 11th may be incomplete - the event streams only officially have 7 days of historical data so this would make sense. Data should be complete from the 12th onwards.
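For context, the collection script consumes the Wikimedia EventStreams recentchange feed, which (as far as I know) can be resumed from a past timestamp via a since parameter, but only within that roughly 7-day retention window. A minimal sketch of the pattern, not the tool's exact code:

import json
from sseclient import SSEClient as EventSource

# Resume the recentchange stream from a historical timestamp; events older
# than the retention window (roughly 7 days) are simply gone.
URL = ('https://stream.wikimedia.org/v2/stream/recentchange'
       '?since=2024-01-10T00:00:00Z')

for event in EventSource(URL):
    if event.event != 'message' or not event.data:
        continue
    change = json.loads(event.data)
    comment = change.get('comment', '')
    # ... hashtag matching would happen here ...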

Oh, and the 502 issue is now resolved after a quick restart of the nginx container.

I added the #1Lib1Ref hashtag in multiple edits of mine for a competition that's currently taking place on the Czech Wikipedia, but my edits (with the first edit made on Jan 18) aren't showing up in the list. Apparently, I am not the only user whose edits aren't showing up as they should (see here). I was advised by Janbery to let you know.

The data collection process is still catching up to live data - we're currently collecting data for the 18th January:

| 1813 | 2024-01-13      |
| 1138 | 2024-01-14      |
| 2673 | 2024-01-15      |
| 3081 | 2024-01-16      |
| 2105 | 2024-01-17      |
|   83 | 2024-01-18      |
+------+-----------------+

Comparing to the above, we've moved up 4 days in a bit less than 3, so by my estimate we should catch up to live data around the start of February if this pace keeps up. I don't remember it being quite this slow in the past, but it is what it is. Either way, all the data from January 18 should be in the tool by the end of today.

Hi! Thanks for fixing it, but it seems like it's broken again, since January 28. Cheers!

@Samwalton9-WMF Ah, I see you've unassigned yourself from this task. Does that mean we should find someone else to fix this?

@Sophivorus I checked the server and it is still processing events; I believe it simply has not finished catching up. It is almost done with 2024-01-28. Take any catch-up time estimate with a grain of salt: the volume of events that need to be processed can vary greatly over time.

@jsn.sherman In ballpark figures, it recovered a day in a week, during what I suspect is peak usage. Maybe it makes sense to give the process a few more resources, at least while recovering from an outage?

| 5144 | 2024-01-28      |
| 4169 | 2024-01-29      |
| 4429 | 2024-01-30      |
| 1442 | 2024-01-31      |
+------+-----------------+

We're at yesterday's data now so I don't think it's worth our while to put effort into increasing resources here right now :)

jsn.sherman claimed this task.

As of Mon Feb 5 18:08:21 UTC 2024, the system was at Mon Feb 5 16:10:31 UTC 2024. With less than 2 hours of lag, I think we can consider this resolved.

Sophivorus reopened this task as Open. (Edited Feb 28 2024, 3:09 PM)

I'm afraid the hashtag tool seems to not be updating again (since Feb 20). :-(

Per T358547 this is already fixed and data is catching up again.

Any estimate for when the data will be up to date?

It took about two weeks for data to be synchronous with the current date last time.

I'm sure the tool used to be faster than this in the past, so I'm not sure what the bottleneck is.

I noticed, both during the first stoppage and this one, that system load is pretty low on the server:

image.png (screenshot: low system load)

I didn't know if that was expected, since this system uses a different method of deploying containers than our other servers.

@Samwalton9 I just checked in on the database, and we're not catching up yet. It looks like we're gaining 10 seconds about every 10 minutes.

system time                     latest timestamp
Wed Feb 28 17:36:23 UTC 2024    2024-02-20 23:25:41.000000
Wed Feb 28 17:38:33 UTC 2024    2024-02-20 23:25:45.000000
Wed Feb 28 17:38:47 UTC 2024    2024-02-20 23:25:47.000000
Wed Feb 28 17:42:56 UTC 2024    2024-02-20 23:25:47.000000
Wed Feb 28 17:45:49 UTC 2024    2024-02-20 23:25:51.000000
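That lag check boils down to comparing the newest stored timestamp with the clock. A sketch (connection parameters are placeholders, and it assumes the stored timestamps are UTC):

from datetime import datetime, timezone
import mysql.connector

conn = mysql.connector.connect(host="localhost", user="hashtags",
                               password="...", database="hashtags")
cur = conn.cursor()
cur.execute("SELECT MAX(timestamp) FROM hashtags_hashtag")
(latest,) = cur.fetchone()
conn.close()

# Stored timestamps are assumed to be UTC, so compare against naive UTC now.
now = datetime.now(timezone.utc).replace(tzinfo=None)
print("system time:      ", now)
print("latest timestamp: ", latest)
print("lag:              ", now - latest)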

Hm, not ideal. We've only introduced one trivial additional check, so I don't know why that would have slowed things down so much.

I just checked in on it, and it looks like progress was made overnight:

system time                     latest timestamp
Thu Feb 29 12:36:13 UTC 2024    2024-02-23 04:38:05.000000

Progress update: it looks like we're running at about 1.89x real time, i.e. we're catching up.

system time                     latest timestamp
Fri Mar 1 17:26:37 UTC 2024     2024-02-24 17:56:35.000000

That makes the current gap:
5 days 23 hours 30 minutes
(or ~143 hours)

Once we were back, we were here:

system time                     latest timestamp
Wed Feb 28 17:36:23 UTC 2024    2024-02-20 23:25:41.000000

which was a gap of:
7 days 18 hours 11 minutes
(or ~186 hours)

That means we've reduced the gap by 43 hours between the first and last check.

The time between the checks was 1 day 23 hours 50 minutes
(or ~48 hours)

That puts our overall average at about 1.89x real time since the first check mentioned here.
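Spelled out as a quick calculation with the (rounded) numbers above; the projection at the end simply assumes the pace holds:

# Catch-up arithmetic from the two checks above.
gap_start_h = 186.2   # gap at the first check, in hours
gap_now_h = 143.5     # gap at the latest check, in hours
elapsed_h = 47.8      # wall-clock time between the checks, in hours

# In elapsed_h of real time, the processed timestamp advanced by elapsed_h
# plus the amount of gap we clawed back.
rate = (elapsed_h + (gap_start_h - gap_now_h)) / elapsed_h
print(f"processing rate: {rate:.2f}x real time")        # ~1.89x

# If that rate held, closing the remaining gap would take roughly:
print(f"~{gap_now_h / (rate - 1) / 24:.1f} days to catch up")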

I just checked in on it, and it looks like the gap widened again:

system time                     latest timestamp
Mon Mar 4 14:16:05 UTC 2024     2024-02-29 19:09:49.000000

It's back up to 5 days 19 hours 7 minutes

The script container has been running and logging, so things are working on a system level. System load is really low. Uptime shows 46 days, 22:45. I'm restarting the server as a basic troubleshooting step that does not require understanding the problem.

It's back up and I am seeing higher system utilization now:

image.png (screenshot: higher system utilization)

I just checked in on it, and it looks like the gap narrowed again after the restart:

system time                     latest timestamp
Tue Mar 5 17:52:24 UTC 2024     2024-03-02 23:58:08.000000

It's currently 2 days 17 hours 54 minutes

I'll continue monitoring it.

The tool is no longer displaying the data notice, and data is within 1 minute of the live time.

Seems broken again, since March 28. :-(

Hmm. The scripts container is still happy. We seem to have an inordinately large number of edits tracked on 28th March.

Screenshot 2024-04-02 at 10.12.51.png (screenshot: edits tracked per day, with a spike on 28 March)

Seems to mostly be Quickstatements

Screenshot 2024-04-02 at 10.15.01.png (screenshot: mostly Quickstatements edits)
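For the record, the per-hashtag breakdown for that day can be pulled with a query along these lines (again assuming a hashtag column on hashtags_hashtag; connection parameters are placeholders):

import mysql.connector

conn = mysql.connector.connect(host="localhost", user="hashtags",
                               password="...", database="hashtags")
cur = conn.cursor()
# Which hashtags dominate the spike day? (the `hashtag` column name is assumed)
cur.execute("""
    SELECT hashtag, COUNT(*) AS cnt
    FROM hashtags_hashtag
    WHERE DATE(timestamp) = '2024-03-28'
    GROUP BY hashtag
    ORDER BY cnt DESC
    LIMIT 10
""")
for hashtag, cnt in cur.fetchall():
    print(hashtag, cnt)
conn.close()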

Data capture is continuing, it's just very slow. We may need to prioritise some work to understand what the bottleneck is.

Data has caught up again.