
Strange URL pattern after search https://en.wikipedia.org/w/index.php?sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance ...
Closed, Resolved · Public

Description

This is technically not a bug, but a request for research in case there is an underlying bug.

As early as 2019-11-23T05:22:22, but in higher volume since 2019-12-16T12:52:34, varnish receives the following HTTP GET:

https://en.wikipedia.org/w/index.php?sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevanc
e&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=rele

(The log shows the URL truncated, but I don't know whether it was cut by logging, by Varnish, or was already truncated at the source.)

https://logstash.wikimedia.org/goto/41c5065999052896acdd6835273a0ca6

The requests seem to arrive in batches of ~500 over 5 minutes per IP, from different IPs. They look like regular interactive browser sessions (such as Safari or Edge) with regular Search actions as referrers. They don't cause any issues because they are simply redirected to the Main Page with the parameters ignored. There have been around 10,000 such requests in the last 45 days, which is not a large number.

My question is whether some UI, JavaScript, gadget, etc. on our side could be causing accidental requests for those URLs, perhaps through a UI bug, or whether there is something else we could do about it. This is not high priority, as it is not causing production issues, but it does cause log spam, which would be nice to avoid so that it doesn't mask other, more important issues.
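For anyone triaging similar log entries, a minimal sketch of how such requests could be flagged is to count how often each query parameter name occurs in a logged URL. The function name `repeated_params` and the threshold are assumptions made for illustration; this is not an existing Wikimedia tool, and the sample URL is a shortened stand-in for the one above.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def repeated_params(url: str, threshold: int = 2) -> dict[str, int]:
    """Return query parameter names that occur at least `threshold` times."""
    query = urlsplit(url).query
    # keep_blank_values=True so that parameters without a value are still counted.
    counts = Counter(name for name, _ in parse_qsl(query, keep_blank_values=True))
    return {name: n for name, n in counts.items() if n >= threshold}


# Shortened stand-in for the logged URL, including a truncated tail.
sample = (
    "https://en.wikipedia.org/w/index.php?"
    + "&".join(["sort=relevance"] * 5)
    + "&sort=rele"
)
print(repeated_params(sample))
```

A truncated tail such as `sort=rele` still counts as an occurrence of `sort`, so truncation by logging or Varnish would not hide the pattern.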

Event Timeline

jcrespo created this task. Jan 29 2020, 10:25 AM
Restricted Application added a project: Discovery-Search. Jan 29 2020, 10:25 AM
Restricted Application added a subscriber: Aklapper.
jijiki added a subscriber: jijiki. Jan 29 2020, 10:37 AM
Jdlrobson added a subscriber: Jdlrobson.

Seems to be set in the WikibaseCirrusSearch extension which we (reading web) know little about.

Thanks, @Jdlrobson. I hoped that either Search or Readers would know the source, but it wasn't clear to me when filing.

Special:EntitiesWithoutLabel and Special:EntitiesWithoutDescription redirect to Special:Search with the parameter.

But no recursion is apparent there.

Neither special page is active on en.wikipedia.

CBogen triaged this task as Low priority. Aug 27 2020, 9:15 PM

As of 2020-09-08, this is the top URL (excluding bot, Wikidata, and analytics queries) causing 50X errors: https://logstash.wikimedia.org/goto/71fca45fae2764f87b0cb7ebabc5dad2

650 requests in the last 4 hours with:

GET https://en.wikipedia.org/w/index.php?sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=rele
vance&sort=relevance&sort=relevance&sort=relevance&sort=relevance&sort=rele

I am CCing traffic in case they have some insight and for awareness, although I would guess this is more of an app-level small bug, if it is a bug on our side.

Restricted Application added a project: Operations. Sep 8 2020, 3:29 PM

It only impacts English Wikipedia so this tells me it's a gadget.

Could it be explained by one of these gadgets distributed by @PrimeHunter / @Amorymeltzer?

From what I can see, they add a "more" menu with links, and the search URL grows with every page view. That would explain the growing URL from the same IP if a user is trying lots of different searches in succession.
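The suspected mechanism can be sketched as follows, assuming the URL is extended by blindly appending a parameter on each page view. This is not the gadget's actual code, which is not quoted in this task; `append_param` and `set_param` are hypothetical helpers written only to contrast the accumulating pattern with one that replaces an existing value.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def append_param(url: str, name: str, value: str) -> str:
    """Accumulating pattern: append without checking for an existing value."""
    sep = "&" if urlsplit(url).query else "?"
    return f"{url}{sep}{name}={value}"


def set_param(url: str, name: str, value: str) -> str:
    """Safer pattern: drop existing occurrences of `name`, then add one."""
    parts = urlsplit(url)
    pairs = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k != name
    ]
    pairs.append((name, value))
    return urlunsplit(parts._replace(query=urlencode(pairs)))


url = "https://en.wikipedia.org/w/index.php?search=foo"

# Simulate three page views where the current URL is reused as the base.
grown = url
for _ in range(3):
    grown = append_param(grown, "sort", "relevance")

print(grown)                                  # sort=relevance appears three times
print(set_param(grown, "sort", "relevance"))  # collapsed to a single occurrence
```

With the appending pattern, the URL length grows linearly with the number of page views, which would be consistent with the long, eventually truncated URLs seen in the logs.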

Amorymeltzer added a comment. Edited Sep 8 2020, 5:53 PM

That's a good bet, AFAICT, but hundreds of times?! It's clicking the button that adds the term and reloads, so getting to hundreds or even dozens seems unlikely from manual use. There are currently only seven users importing it, so if that is the cause it should be easy to nail down, especially given the tight timeframe today.

Change 626179 had a related patch set uploaded (by Thiemo Kreuz (WMDE); owner: Thiemo Kreuz (WMDE)):
[mediawiki/extensions/AdvancedSearch@master] Don't put default …&sort=relevance in the URL

https://gerrit.wikimedia.org/r/626179

Change 626179 merged by jenkins-bot:
[mediawiki/extensions/AdvancedSearch@master] Don't put default …&sort=relevance in the URL

https://gerrit.wikimedia.org/r/626179
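The idea named in the patch title, leaving default values out of the URL, can be illustrated with a small sketch. This is a Python illustration of the general technique only and does not reproduce the AdvancedSearch change itself; the `DEFAULTS` table, the `build_query` helper, and the non-default sort value in the usage lines are assumptions.

```python
from urllib.parse import urlencode

# Assumed defaults: a parameter equal to its default carries no information,
# so it can be left out of the generated URL entirely.
DEFAULTS = {"sort": "relevance"}


def build_query(params: dict[str, str]) -> str:
    """Encode `params` as a query string, skipping values equal to their default."""
    kept = {k: v for k, v in params.items() if DEFAULTS.get(k) != v}
    return urlencode(kept)


print(build_query({"search": "foo", "sort": "relevance"}))       # search=foo
print(build_query({"search": "foo", "sort": "last_edit_desc"}))  # sort is kept
```

If the default never enters the URL, code that later appends to that URL has nothing to duplicate for users who never change the sort order.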

I think this is resolved. The Logstash query returns nothing for the past seven days: https://logstash.wikimedia.org/goto/6dd5e7cdd1eec5714171e3125d13fe2c

jcrespo closed this task as Resolved. Oct 20 2020, 4:17 PM
jcrespo assigned this task to thiemowmde.

I'm not sure who owns this and can declare it resolved, but as the original reporter, I think it is: per the link above, there have been no occurrences in the last 4 weeks. Thanks to everyone who helped here.