Stop serving slowparse logs from dumps distribution servers
Closed, Resolved, Public

Description

Was: Replace cron that syncs archived slow-parse logs to dataset host with server side fetch job

The rsync_slow_parse cron job in role::logging::mediawiki::udp2log currently rsyncs data to dataset1001, where it is served at https://dumps.wikimedia.org/other/slow-parse/.

dataset1001 is being replaced by labstore1006|7, and as part of the migration we're converting all of the datasets to be fetched directly on the labstore end.

I'd like to figure out what directory, server and frequency to read the slow-parse logs from so we can set up a straight rsync to pick up this data.

Update: The performance team is not using these logs anymore, so I'm updating the task description.

Event Timeline

madhuvishy created this task.

@Ottomata Hey! Do you know anything about these logs? :) I'd like to make it so that we can fetch from the mwlog server when we move to the new dumps setup.

Hi. As far as I'm concerned we do not need to keep the public slow-parse dumps any longer.

These logs are a public and sanitized copy of the "slow-parse" warning messages logged by MediaWiki PHP to Logstash (via mwlog1001).

In 2015, as part of T98563, Performance Team set up a public dump of them for use by the Performance Inspector (T117411, mw docs). However, we never ended up using them.

We have another future project (T102899) that might need these at some point, but it's unlikely that we'll use the dumps system for that, so there really isn't any use for these now.

@Legoktm I saw that you created a proof-of-concept consumer at https://tools.wmflabs.org/slow-parse/. Based on T98563#2869436, it sounds like you wouldn't mind this being removed, but can you confirm?

Change 420408 had a related patch set uploaded (by Madhuvishy; owner: Madhuvishy):
[operations/puppet@production] slow-parse: Turn off rsync from mwlog1001 to dumps servers

https://gerrit.wikimedia.org/r/420408

Change 420410 had a related patch set uploaded (by Madhuvishy; owner: Madhuvishy):
[operations/puppet@production] slowparse: Remove code for rsync to dumps servers

https://gerrit.wikimedia.org/r/420410

Change 420411 had a related patch set uploaded (by Madhuvishy; owner: Madhuvishy):
[operations/puppet@production] dumps: Absent slowparse logs rsync config

https://gerrit.wikimedia.org/r/420411

Change 420415 had a related patch set uploaded (by Madhuvishy; owner: Madhuvishy):
[operations/puppet@production] dumps: Remove slowparse rsync related code

https://gerrit.wikimedia.org/r/420415

madhuvishy renamed this task from "Replace cron that syncs archived slow-parse logs to dataset host with server side fetch job" to "Stop serving slowparse logs from dumps distribution servers". Mar 19 2018, 7:52 PM
madhuvishy updated the task description.

Change 420408 merged by Madhuvishy:
[operations/puppet@production] slow-parse: Turn off rsync from mwlog1001 to dumps servers

https://gerrit.wikimedia.org/r/420408

Change 420410 merged by Madhuvishy:
[operations/puppet@production] slow-parse: Remove code for rsync to dumps servers

https://gerrit.wikimedia.org/r/420410

Change 420411 merged by Madhuvishy:
[operations/puppet@production] dumps: Absent slowparse logs rsync config

https://gerrit.wikimedia.org/r/420411

Change 420415 merged by Madhuvishy:
[operations/puppet@production] dumps: Remove slowparse rsync related code

https://gerrit.wikimedia.org/r/420415

Update: I've removed all rsync-related jobs and code from puppet on both the dumps servers and the mwlog servers. To do: stop serving at https://dumps.wikimedia.org/other/slow-parse/, and clean up existing data from other/ on the dumps servers.

Did we get signoff from @Legoktm? I'm guessing yes but I didn't see it here or on the changesets.

IIRC the slow-parse logs were made public because there was some issue with pages being slow. Maybe @MZMcBride remembers the context when they filed the task. PerformanceInspector was an idea that came out of making the logs useful.

I guess no one was using my tool since it broke/stopped updating in mid-2017. At first glance I didn't find the data useful, but I've never done template performance/optimization work so I wasn't even the right audience to begin with. I don't remember what my motivation in creating the tool was, maybe something we had discussed as part of parsing team.

@Legoktm cool! Thanks for weighing in. Looks like we're good to continue deprecating serving these from the servers then.

I'll defer to @Krinkle's comment in T189284#4057666.

The broad idea here was that we have (or perhaps had) pages that were slow to parse, we were logging this information, but the log was not easily accessible and I believe previously the log mixed private and public Wikimedia wikis. Some of these issues have been fixed.

I guess no one was using my tool since it broke/stopped updating in mid-2017. At first glance I didn't find the data useful, but I've never done template performance/optimization work so I wasn't even the right audience to begin with. I don't remember what my motivation in creating the tool was, maybe something we had discussed as part of parsing team.

I don't remember discussing this ... but even if we did, clearly this is not something that registered in my mind, and I don't see any immediate use for it.

I'll make sure to clean up the old logs before the migration. One less thing to worry about!

Change 421489 had a related patch set uploaded (by ArielGlenn; owner: ArielGlenn):
[operations/puppet@production] Remove the slow-parse logs dataset cleanup job

https://gerrit.wikimedia.org/r/421489

Change 421489 merged by ArielGlenn:
[operations/puppet@production] Remove the slow-parse logs dataset cleanup job

https://gerrit.wikimedia.org/r/421489

ArielGlenn claimed this task.

These logs are now gone. Closing.