archiverbot: add "archive top threads only" mode/flag
Closed, Resolved | Public | Feature

Description

Feature summary:

When humans archive threads by hand, they often cut and paste a single large block of the oldest threads. This tends to keep the archived threads in the same order as on the original page. I would like archivebot to emulate that behavior.

Conceptually, the new mode would work like this: the bot first selects the candidate threads to archive as usual, then divides them into contiguous groups. Only the group that includes the top thread remains a candidate. (This is just for explanation; there may be a simpler algorithm that achieves the same result.)
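To make the idea concrete, here is a minimal sketch of that selection step. It is not pywikibot's code; the function names and the `is_candidate` callback are hypothetical. The first function follows the explanation literally (contiguous groups), and the second is one possible simpler equivalent.

```python
from itertools import groupby, takewhile


def top_group_by_grouping(threads, is_candidate):
    """Literal version: split the page into contiguous runs of
    candidate / non-candidate threads and keep only the run that
    contains the top thread (if that run consists of candidates)."""
    runs = [(flag, list(run))
            for flag, run in groupby(threads, key=lambda t: bool(is_candidate(t)))]
    if runs and runs[0][0]:
        return runs[0][1]
    return []


def top_group(threads, is_candidate):
    """Simpler equivalent: walk down from the top of the page and
    stop at the first thread that is not an archiving candidate."""
    return list(takewhile(is_candidate, threads))
```

Both functions return the same list for any input; the second simply avoids building the groups that will be discarded anyway.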

I think it makes sense for this option to be configurable via the marker template, page by page, because the effect is merely delaying the archiving of some threads. It may also be good enough to have it as a command line option, though.

Use case(s):

Let's say it's 2021-02-01, the algorithm is old(365d), and we have 4 threads like this:

== Thread A (updated in 2021-01-01) ==
....
== Thread B (updated in 2020-01-01) ==
....
== Thread C (updated in 2021-01-01) ==
....
== Thread D (updated in 2021-02-01) ==
....

Normally, the bot will archive Thread B immediately and leave the other 3, and wait until 2022-01-01 to archive A and C. The archive page will have B-A-C in that order.

In the "archive top threads only" mode, I want it to leave the top 3 threads unchanged on 2021-02-01, because removing B while leaving A and C would make the archive unsorted. It will wait until 2022-01-01 and then archive the top 3 threads A-C together, assuming none of them has been updated in the intervening 11 months. The archive page will have A-B-C in that order.
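The scenario above can be replayed with a small simulation. This is only an illustration under stated assumptions: the helper names are made up, a thread counts as old once its last update is at least 365 days in the past, and nothing here reflects how pywikibot's archivebot is actually implemented.

```python
from datetime import date, timedelta
from itertools import takewhile

MAX_AGE = timedelta(days=365)  # stands in for old(365d); ">=" is an assumption

# (thread name, date of last update), in page order
THREADS = [
    ("A", date(2021, 1, 1)),
    ("B", date(2020, 1, 1)),
    ("C", date(2021, 1, 1)),
    ("D", date(2021, 2, 1)),
]


def is_old(updated, today):
    return today - updated >= MAX_AGE


def archive_normally(threads, today):
    """Default behavior: every old thread is archived, wherever it sits."""
    return [name for name, updated in threads if is_old(updated, today)]


def archive_top_only(threads, today):
    """Proposed mode: only the unbroken run of old threads at the top."""
    return [name for name, updated in
            takewhile(lambda t: is_old(t[1], today), threads)]


def simulate(select, threads, run_dates):
    """Run the bot on each date and return the resulting archive order."""
    remaining, archive = list(threads), []
    for today in run_dates:
        picked = select(remaining, today)
        archive.extend(picked)
        remaining = [t for t in remaining if t[0] not in picked]
    return archive
```

Running both modes on the two dates from the example:

```python
runs = [date(2021, 2, 1), date(2022, 1, 1)]
simulate(archive_normally, THREADS, runs)   # ['B', 'A', 'C']
simulate(archive_top_only, THREADS, runs)   # ['A', 'B', 'C']
```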

Benefits:

This will help the bot preserve the order of threads more strictly, which some people will find makes the archive cleaner (at the expense of slower archiving).

Event Timeline

Xqt triaged this task as Low priority. (Aug 4 2022, 1:56 PM)
Xqt changed the task status from Open to In Progress. (Aug 4 2022, 2:40 PM)
Xqt claimed this task.

Change 820469 had a related patch set uploaded (by Xqt; author: Xqt):

[pywikibot/core@master] [IMPR] Preserve thread order in archive even if threads are archived later

https://gerrit.wikimedia.org/r/820469

Is this the same as T312773?

No, although the two may achieve similar results. The method described here preserves whatever the original order was, while in the other task's case the bot would rearrange threads by the first post in each thread. (Thank you for working on it, by the way!)

This patch keeps the original order, and in most cases that order corresponds to the first timestamp of each thread. The patch therefore also solves the other task in most cases, except when threads have been added at the top or in the middle of a talk page.

Change 820469 merged by Xqt:

[pywikibot/core@master] [IMPR] Preserve thread order in archive even if threads are archived later

https://gerrit.wikimedia.org/r/820469

@Xqt You are right, the patch works as described in this task's description. I only skimmed the code and misunderstood how it works. (I should have tested it at least before commenting, sorry.)