[Migrated] Add a "skip if page </> x bytes" option, where x is user-defined
Open, Medium, Public

Description

The ability to skip pages larger than x bytes or smaller than x bytes. @Tom.Reding 15:50, 12 February 2015 (UTC)

Event Timeline

Reguyla raised the priority of this task from to Needs Triage.
Reguyla updated the task description.
Reguyla added a project: AutoWikiBrowser.
Reguyla moved this task to Feature request (unsorted) on the AutoWikiBrowser board.
Reguyla added subscribers: Reguyla, Aklapper.

@Magioladitis 12:42, 13 February 2015 (UTC) wrote:
@Tom.Reding try: Skip if contains .{1,x} with Regex on.
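A minimal sketch of how a size threshold can be expressed as a regex, written in Python rather than the .NET engine AWB uses. The `should_skip` helper and the sample pages are hypothetical stand-ins for AWB's "Skip if contains" check, not its actual implementation. Two caveats: a regex quantifier counts characters, not bytes, and an unanchored bounded form such as `.{1,x}` finds a match in any non-empty text, so the "smaller than x" case needs anchors.

```python
import re

def should_skip(text: str, pattern: str) -> bool:
    """Hypothetical 'skip if contains' check: True if the pattern matches anywhere."""
    # DOTALL lets "." match newlines, so the count spans the whole page.
    return re.search(pattern, text, flags=re.DOTALL) is not None

x = 10                      # user-defined threshold, in characters
short_page = "abc"          # 3 characters
long_page = "a" * 25        # 25 characters

# "Larger than or equal to x": an open-ended quantifier.
at_least_x = ".{" + str(x) + ",}"
# "Smaller than x": the whole page, start to end, must fit in x - 1 characters.
fewer_than_x = r"\A.{0," + str(x - 1) + r"}\Z"
# Unanchored bounded form: matches a short prefix of any non-empty page.
unanchored = ".{1," + str(x) + "}"

results = {
    "long, at_least_x": should_skip(long_page, at_least_x),
    "short, at_least_x": should_skip(short_page, at_least_x),
    "short, fewer_than_x": should_skip(short_page, fewer_than_x),
    "long, fewer_than_x": should_skip(long_page, fewer_than_x),
    "long, unanchored": should_skip(long_page, unanchored),
}
```

Under these assumptions, only the anchored pattern distinguishes short pages from long ones; the unanchored bounded pattern matches both.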

@Tom.Reding 15:32, 13 February 2015 (UTC) wrote:
Nice! I like it.

@Tom.Reding 20:21, 17 March 2015 (UTC) wrote:
There seems to be a limit at "Skip if Contains: .{935,}". This 17 kB page is skipped only if ".{935,}" and lower, but not ".{936,}" and higher. ".{936,100000}" doesn't skip either. Is this an AWB problem or a regex engine one?

I haven't looked through bug reports yet, but will post {{done}} there if I can't find one.

@John_of_Reading 20:32, 17 March 2015 (UTC) wrote:
I think the "skip" regex is being run against each paragraph separately, not the whole article. The longest paragraphs in that article have 935-ish characters.

@Tom.Reding 20:42, 17 March 2015 (UTC) wrote:
Nope, [.\s]{936,} doesn't skip either.
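On the regex-engine side of the question, two standard behaviours are worth keeping in mind: by default `.` does not match a newline (in .NET this changes with `RegexOptions.Singleline`, in Python with `re.DOTALL`), and inside a character class `.` is a literal dot, so `[.\s]` means "a dot or whitespace". The sketch below uses Python's `re` module and a made-up page whose longest line is 935 characters; it only illustrates these two rules and is not a reproduction of what AWB did on the page under discussion.

```python
import re

# Hypothetical page: 20 paragraphs of 935 characters, separated by blank lines.
paragraph = "a" * 935
page = "\n\n".join([paragraph] * 20)

def found(pattern: str, flags: int = 0) -> bool:
    """True if the pattern matches anywhere in the sample page."""
    return re.search(pattern, page, flags) is not None

# "." stops at newlines, so the longest possible run is one line (935 chars).
fits_in_line = found(r".{935,}")
exceeds_line = found(r".{936,}")

# Inside [...], "." is a literal dot: this only matches runs of dots/whitespace.
dot_in_class = found(r"[.\s]{936,}")

# Two ways to count across line breaks.
any_char_class = found(r"[\s\S]{936,}")
dotall_flag = found(r".{936,}", re.DOTALL)
```

With this sample, the default-mode patterns behave as if they were applied per paragraph, while `[\s\S]{936,}` or the single-line option counts across the whole page.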

John_of_Reading added a comment. Edited May 30 2015, 8:51 PM

Further discussion took place on my talk page; it's now at https://en.wikipedia.org/wiki/User_talk:John_of_Reading/Archive_18#Relevent_regex_question

Something strange was indeed happening on Tom.Reding's machine, but it cleared after a reboot.

Josve05a triaged this task as Medium priority. Aug 22 2015, 11:59 PM
Josve05a added a subscriber: Josve05a.