
400 - Bad Request on any Global Search
Closed, Resolved, Public

Description

URL: https://global-search.toolforge.org/?q=.*&regex=1&namespaces=8&title=Disclaimerpage

Getting a "400: Bad Request" error on any search that I try.

Might be a server issue, but I was not getting it for a window of time yesterday (after T358061 was resolved). I have been getting it for several hours now, so I figured I would report it, just in case.

Thanks!

Event Timeline

I've tried a number of variations and simpler queries, but it seems nothing is getting through, e.g. https://global-search.toolforge.org/?q=%22git.wikimedia.org%22

The full error:

{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"unknown host [cloudelastic1001.wikimedia.org]"}],"type":"illegal_argument_exception","reason":"unknown host [cloudelastic1001.wikimedia.org]","caused_by":{"type":"unknown_host_exception","reason":"cloudelastic1001.wikimedia.org"}},"status":400}

I've searched the code, and there's no reference to cloudelastic1001.wikimedia.org.
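For readers triaging similar responses, the error body above can be reduced to its status and first root cause. The following is a minimal sketch, not part of the Global Search code base: the function name `summarize_es_error` is hypothetical, and the sample string is the payload quoted above, terminated with a closing brace so that it parses as JSON.

```python
import json


def summarize_es_error(body: str) -> dict:
    """Return the HTTP status plus the first root cause's type and reason."""
    payload = json.loads(body)
    error = payload.get("error", {})
    # "root_cause" is a list; fall back to the top-level error fields if empty.
    root_causes = error.get("root_cause") or [{}]
    first = root_causes[0]
    return {
        "status": payload.get("status"),
        "type": first.get("type", error.get("type")),
        "reason": first.get("reason", error.get("reason")),
    }


sample = (
    '{"error":{"root_cause":[{"type":"illegal_argument_exception",'
    '"reason":"unknown host [cloudelastic1001.wikimedia.org]"}],'
    '"type":"illegal_argument_exception",'
    '"reason":"unknown host [cloudelastic1001.wikimedia.org]",'
    '"caused_by":{"type":"unknown_host_exception",'
    '"reason":"cloudelastic1001.wikimedia.org"}},"status":400}'
)
print(summarize_es_error(sample))
```

The summary makes it easier to see that the 400 comes from the search cluster failing to resolve a host name, rather than from the query itself.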

Might be a server issue - but was not getting it for a window of time yesterday (after T358061 was resolved)

That is my suspicion as well. Time to poke @EBernhardson again! I'm guessing something in CloudElastic is still referencing the decommissioned servers. I get the same error when running a one-off query like:

curl -XGET 'https://cloudelastic.wikimedia.org:8243/*,*:*/_search?q=example'

Also poking @VRiley-WMF in case she knows what the problem might be.
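The index expression `*,*:*` in the query above combines local indices with cross-cluster ones: in Elasticsearch cross-cluster search syntax, a part written as `cluster:index` targets a remote cluster, so `*:*` asks for every index on every configured remote. That is why a stale remote-cluster entry can break an otherwise simple search. A simplified sketch of how such an expression decomposes follows; the helper `split_index_pattern` is hypothetical and ignores edge cases such as date-math index names.

```python
def split_index_pattern(pattern: str) -> dict:
    """Split a comma-separated index expression into local and remote parts."""
    local, remote = [], []
    for part in pattern.split(","):
        if ":" in part:
            # "cluster_alias:index_pattern" targets a remote cluster.
            cluster, index = part.split(":", 1)
            remote.append((cluster, index))
        else:
            local.append(part)
    return {"local": local, "remote": remote}


print(split_index_pattern("*,*:*"))
# {'local': ['*'], 'remote': [('*', '*')]}
```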

@bking, is this likely related to the transition of cloudelastic to private IPs? I'll take a look later if you don't have ideas.

This is an important tool for fighting cross-wiki vandalism and spam. I hope it will be fixed soon.

MusikAnimal triaged this task as Unbreak Now! priority. Mar 4 2024, 9:47 PM

I'll be bold and elevate this to UBN. I'm continually getting complaints from communities and staff alike who say it is important to their daily work. At the least, the UBN signifies the level of severity for the Global Search tool itself. I can't speak for Data Platform SRE and whoever else may be involved in actually fixing the underlying problem. For the curious, it looks like T358802 is tracking the work needed to fix Global Search.

Hello everyone,

I apologize for the delayed response. We changed the cross cluster settings last week, which should have fixed the issue. I'll take another look now.

The query in the task description now throws a "500: Internal Server Error" for me.

I'm actually seeing it time out now, so it is not the same issue as what was reported here, but it still breaks Global Search.

tools.global-search@tools-sgebastion-10:~$ curl -XGET https://cloudelastic.wikimedia.org:8243/*,*:*/_search?q=example
{"error":{"root_cause":[{"type":"connect_transport_exception","reason":"[][10.64.48.24:9600] connect_timeout[30s]"}],"type":"connect_transport_exception","reason":"[][10.64.48.24:9600] connect_timeout[30s]"},"status":500}
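The `reason` string in this response names the address the cluster tried to reach and the timeout it hit, which is the detail worth extracting when comparing against the intended configuration. A minimal sketch follows, assuming the reason always has the `[ip:port] connect_timeout[duration]` shape seen here; the helper `parse_connect_timeout` is hypothetical.

```python
import re

# Matches e.g. "[][10.64.48.24:9600] connect_timeout[30s]".
_CONNECT_TIMEOUT = re.compile(
    r"\[(?P<host>\d{1,3}(?:\.\d{1,3}){3}):(?P<port>\d+)\]"
    r"\s*connect_timeout\[(?P<timeout>[^\]]+)\]"
)


def parse_connect_timeout(reason: str):
    """Return (host, port, timeout) from a connect-timeout reason, or None."""
    match = _CONNECT_TIMEOUT.search(reason)
    if match is None:
        return None
    return match["host"], int(match["port"]), match["timeout"]


print(parse_connect_timeout("[][10.64.48.24:9600] connect_timeout[30s]"))
# ('10.64.48.24', 9600, '30s')
```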

The query in the task description now throws a "500: Internal Server Error" for me.

On the other hand, the link shared by Krinkle actually loads reasonably quickly with no errors.

Per IRC conversation with @MusikAnimal, we believe this to be fixed now. Please respond here if this is not the case.

Apologies; this was a missed step during the migration. We believed it was fixed on Feb 29, but I used the wrong ports in my cluster update command.
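The thread does not show the actual update command. As a rough illustration only, and assuming the remotes are configured through the Elasticsearch cluster settings API (`cluster.remote.<alias>.seeds`, where each seed is a `host:port` pair using the remote node's transport port rather than its HTTP port), the request body could be built and sanity-checked as below. The alias, host names, and port 9300 are placeholders, not the real cloudelastic values, and both helper names are hypothetical.

```python
import json


def remote_seed_settings(alias: str, hosts: list, transport_port: int) -> dict:
    """Build a cluster-settings body that sets the seeds for one remote alias."""
    seeds = [f"{host}:{transport_port}" for host in hosts]
    return {"persistent": {"cluster": {"remote": {alias: {"seeds": seeds}}}}}


def seeds_with_unexpected_port(settings: dict, expected_port: int) -> list:
    """List (alias, seed) pairs whose port differs from the expected one."""
    bad = []
    remotes = settings["persistent"]["cluster"]["remote"]
    for alias, conf in remotes.items():
        for seed in conf["seeds"]:
            port = int(seed.rsplit(":", 1)[1])
            if port != expected_port:
                bad.append((alias, seed))
    return bad


# Placeholder values for illustration only.
body = remote_seed_settings(
    "example-remote", ["es-node1.example.org", "es-node2.example.org"], 9300
)
print(json.dumps(body, indent=2))
print(seeds_with_unexpected_port(body, 9300))
```

Checking the generated seeds against the expected port before sending the update is one cheap way to catch the kind of mistake described above.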

Thanks to the community for raising this issue. We should have improved monitoring for this issue available soon; see T358802 for more details.
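The monitoring itself is tracked elsewhere and is not shown in this thread. As a sketch of what such a check might look at, Elasticsearch exposes `GET /_remote/info`, which reports a `connected` flag for each configured remote cluster. The function below takes an already-parsed response and lists the aliases that are not connected; the function name and the sample aliases are hypothetical.

```python
def disconnected_remotes(remote_info: dict) -> list:
    """Return the sorted aliases of remote clusters reported as not connected."""
    return sorted(
        alias
        for alias, info in remote_info.items()
        if not info.get("connected", False)
    )


# Hypothetical response shape, trimmed to the fields used here.
sample_info = {
    "alpha": {"seeds": ["10.0.0.1:9300"], "connected": True, "num_nodes_connected": 3},
    "omega": {"seeds": ["10.0.0.2:9300"], "connected": False, "num_nodes_connected": 0},
}
print(disconnected_remotes(sample_info))
# ['omega']
```

An alert could fire whenever this list is non-empty, which would have flagged both the unknown-host and the connect-timeout states reported here.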

MusikAnimal assigned this task to bking.

Thanks @bking! Resolving.

Confirming this is working for me - thank you all! :)

Thanks to everyone for the fix. I use it every day, and I realized how important it is when it was broken.

I don't know if it's related but there are many duplicate items in the results now.

In T358541#9598745, @T wrote:

I don't know if it's related but there are many duplicate items in the results now.

Experiencing that as well, and it looks like others are too: T359136

Change 1012703 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] cloudelastic: check/alert on cluster inconsistencies

https://gerrit.wikimedia.org/r/1012703

Change 1012703 merged by Bking:

[operations/puppet@production] cloudelastic: check/alert on cluster inconsistencies

https://gerrit.wikimedia.org/r/1012703