helppanelquestionposter API module uses HTML as error codes
Closed, Resolved · Public · 3 Estimated Story Points · BUG REPORT

Description

Steps to replicate the issue (include links if applicable):

What happens?:

  • Observe the fact that besides typical error codes like notloggedin or abusefilter_disallowed_abusefilter_warning, the results contain "codes" like p_br_b_La_possibilit_di_effettuare_modifiche_all_enciclopedia_con_l_utenza_Augusto_La_Spina_b_nome_utente_a_href_wiki_Indirizzo_IP_title_Indirizzo_IP_indirizzo_IP_a_o_range_di_indirizzi_IP_b_sospesa_b_p_p_Il_tuo_indirizzo_IP_attuale_130_25_184_189_p_p_La_sospensione_i_blocco_i_stata_attivata_dall_amministratore_a_href_wiki_Utente_Filtro_anti_abusi_title_Utente_Filtro_anti_abusi_Filtro_anti_abusi_a_per_... (truncated, the actual value is much longer)

What should have happened instead?:

  • Error codes should be elements of a fixed set; they should not be internationalized, and they should not contain HTML. Random long codes drive up the cardinality of metrics in Thanos, which may degrade query performance. They are also extremely awkward in the Grafana UI.
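
One way to keep metric labels bounded is to map any code outside a known set to a single fallback label before emitting the stat. The following is a minimal Python sketch of that idea, not the MediaWiki implementation; the allowlist, the `other` fallback label, and the function name are all assumptions made for illustration.

```python
# Hypothetical allowlist. Both codes are quoted in this report;
# a real list would be longer.
KNOWN_ERROR_CODES = frozenset({
    "notloggedin",
    "abusefilter_disallowed_abusefilter_warning",
})

FALLBACK_CODE = "other"  # assumed catch-all label, not a MediaWiki name


def metric_label(error_code: str) -> str:
    """Return a label drawn from a fixed set, whatever the input is."""
    return error_code if error_code in KNOWN_ERROR_CODES else FALLBACK_CODE


if __name__ == "__main__":
    codes = [
        "notloggedin",
        "p_br_b_La_possibilit_di_effettuare_modifiche",
        "some_other_generated_code",
    ]
    # However many distinct inputs arrive, the label set stays bounded.
    print(sorted({metric_label(code) for code in codes}))
```

With this shape, the number of distinct label values can never exceed the size of the allowlist plus one, regardless of what text ends up in the error code.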

Other information (browser name/version, screenshots, etc.):

  • The error codes to emit the stats are returned by ApiMain::substituteResultWithError.
  • The bad codes look like HTML sanitized to conform to error-code syntax. It's not clear where that happens, or why.
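
To illustrate how rendered HTML could end up looking like the code quoted above, here is a small Python sketch. The sanitization rule (collapse every run of characters outside ASCII letters and digits into one underscore) is a guess that happens to reproduce the observed prefix; it is not taken from MediaWiki, and the HTML input is a reconstructed fragment, not the real block message.

```python
import re


def sanitize_to_code(text: str) -> str:
    """Collapse each run of non-ASCII-alphanumeric characters into '_'."""
    # Assumed rule for illustration only; the real sanitizer is unknown.
    return re.sub(r"[^A-Za-z0-9]+", "_", text).strip("_")


if __name__ == "__main__":
    # Hypothetical fragment of a rendered block message.
    html = "<p><br /><b>La possibilità di effettuare modifiche</b></p>"
    print(sanitize_to_code(html))
```

Under this rule, tag names survive as bare words (`p`, `br`, `b`) and accented letters disappear, which matches `possibilit_di` in the reported value.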

Event Timeline

Michael triaged this task as High priority.Aug 4 2025, 9:52 AM
Michael moved this task from Inbox to Up Next (estimated tasks) on the Growth-Team board.
Michael subscribed.

We should probably fix this quickly. At a quick look, this may come from AbuseFilter. It will take some digging to find where in that module Status instances are created and with what values, likely by hooks.

This is a suspicious candidate:

QuestionPoster::runEditFilterMergedContentHook
$hookRunner->onEditFilterMergedContent(
	$derivativeContext,
	$content,
	$status,
	$summary,
	$derivativeContext->getUser(),
	false
)

But I have not yet traced all the calls from that API module, so there might be others.

KStoller-WMF set the point value for this task to 3.Aug 25 2025, 4:17 PM

According to AI, the English translation of the HTML "error code" is:

The possibility to make edits to the encyclopedia with the account Augusto La Spina (username), IP address or range of IP addresses is suspended.
Your current IP address is (REDACTED).
The suspension (block) was activated by the administrator Abuse filter.

This seems like an AF-caused block, but I'm not sure if this is MediaWiki Core's response to the block, or AF behaviour when creating the block.

Considering this matches https://it.wikipedia.org/wiki/MediaWiki:Blockedtext pretty much exactly, I would assume this comes from Core. The AI-transformed Italian version for reference:

La possibilità di effettuare modifiche all'enciclopedia con l'utenza Augusto La Spina (nome utente), indirizzo IP o range di indirizzi IP è sospesa.

Il tuo indirizzo IP attuale è (REDACTED).

La sospensione (blocco) è stata attivata dall’amministratore Filtro anti abusi.

I started looking into this, so I might just as well finish this.

Note my fix changes the user experience at Special:Homepage when you try to ask a question while blocked.

Current behaviour

image.png (836×1 px, 182 KB)

New behaviour

image.png (838×984 px, 210 KB)

This is because we were accidentally (?) passing HTML as the message key, which does not make sense. This is visible from the erroneous < displayed at the start of the message in the first screenshot (there is also a > at the end), which is what happens when MediaWiki uses an i18n message it does not define (the user then sees <nonexistent-message>). You can see a similar experience at http://en.wikipedia.org/wiki/Main_Page?uselang=qqx.
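
The effect described here can be modelled with a toy message lookup. This Python sketch is only an analogy for the behaviour described in this comment, not MediaWiki's message code; the catalogue, its key, and the function name are invented.

```python
# Toy message catalogue; the key and text are invented for illustration.
MESSAGES = {"example-blocked": "You are currently blocked from editing."}


def render_message(key: str) -> str:
    """Return the message text, or the key in angle brackets if undefined."""
    return MESSAGES.get(key, f"<{key}>")


if __name__ == "__main__":
    # A proper key resolves to its text.
    print(render_message("example-blocked"))
    # HTML passed as the "key" is not defined, so it comes back wrapped
    # in an extra < and >, which a browser shows as stray characters.
    print(render_message("<p>La possibilità di effettuare modifiche</p>"))
```

When the wrapped string is displayed as HTML, the inner markup renders normally and only the outer brackets remain visible, which would explain the stray characters in the first screenshot.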

I suggest we accept the degraded message (at least for now), we can improve it in a separate task if we deem that appropriate.

Change #1182809 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[mediawiki/extensions/GrowthExperiments@master] fix(QuestionPoster): Use getPermissionStatus

https://gerrit.wikimedia.org/r/1182809

That tradeoff makes sense to me. Let's stop this bad behavior now and create a new task for making the experience for blocked users a bit nicer again. We can tackle that when we work again on that module for other reasons.

Would this also happen when trying to ask a question to a mentor with a blocked talk page? (That should become a non-issue with T244258, but for now it might still exist?)

At the very least, not because of what I did. My change only touches permission-related code, and it strictly operates with PermissionStatus. It might be something to QA and validate if needed.

Change #1182840 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[mediawiki/extensions/GrowthExperiments@master] tests: Do not pass null value to checkPermissions

https://gerrit.wikimedia.org/r/1182840

Change #1182809 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] fix(QuestionPoster): Use getPermissionStatus

https://gerrit.wikimedia.org/r/1182809

Change #1182840 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] tests: Do not pass null value to checkPermissions

https://gerrit.wikimedia.org/r/1182840

@daniel Is there anything we should do to drop the buggy metrics? Or will that happen automagically in the future?

It will clean itself up after a couple of months. And you'll only encounter them in Grafana if you "zoom out" (or "scroll back") in time.

I suppose it could be cleaned up manually somehow, but unless it's a major issue to someone, we can probably just wait for it to heal.

Thanks for fixing this!

QA Note: Martin's comment above (T401096#11127964) describes the new expected behavior when trying to ask a question to a mentor via the homepage while blocked. Visually, that's a slight degradation, but that feels OK given how little time we would like to spend here.

The actual fix from this ticket, that there are no longer strange labels in Grafana, should be verifiable with the links in the description.

@Urbanecm I am still seeing a lot of bad codes coming in: https://grafana.wikimedia.org/goto/yHue99wNR?orgId=1. It looks like there is a variety of different error messages. Is it possible that the same issue exists in multiple places?

I'm not actually seeing any hits for any of the codes? They all show as zeros on my end. Am I doing something wrong?

Zooming a bit out, the last problematic code I see is for September 3rd: https://grafana.wikimedia.org/goto/K2uCR6rNg?orgId=1

image.png (423×1 px, 62 KB)

Which is during the wmf.17 train, cf. T396378. So, it would make sense the error was not fully fixed back then.

Ah, sorry - they are still listed in the legend of the diagram, but the count is 0. I would have expected them to be removed from the legend as well... oh well, it will clean itself up eventually.

Etonkovidova subscribed.

Thank you, @Michael! I checked and saw the same message as in https://phabricator.wikimedia.org/T401096#11128099. Also, I added checking the Homepage error message for a blocked user as a test case to T244258.

Yes, Grafana looks good.

sum by(error_code) (rate(mediawiki_api_errors{exception_cause="client_error", module="helppanelquestionposter"}[$__rate_interval]))
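
For readers less familiar with PromQL, the query above can be approximated in plain Python. This is a deliberately simplified model: it takes the per-second increase between the first and last sample of each series and adds the results per `error_code`, ignoring the counter-reset handling and extrapolation that the real `rate()` performs. The sample data and the `wiki` label are invented.

```python
from collections import defaultdict


def simple_rate(samples):
    """Per-second increase between the first and last sample in a window."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0) if t1 > t0 else 0.0


def sum_rate_by(series, label):
    """Add up the simple rate of every series sharing the same label value."""
    totals = defaultdict(float)
    for labels, samples in series:
        totals[labels[label]] += simple_rate(samples)
    return dict(totals)


if __name__ == "__main__":
    # Each entry: (labels, [(timestamp_seconds, counter_value), ...]).
    series = [
        ({"error_code": "notloggedin", "wiki": "a"}, [(0, 0), (60, 60)]),
        ({"error_code": "notloggedin", "wiki": "b"}, [(0, 10), (60, 40)]),
        # A counter that stopped increasing still exists as a series.
        ({"error_code": "p_br_b_La_possibilit"}, [(0, 5), (60, 5)]),
    ]
    print(sum_rate_by(series, "error_code"))
```

In this model, a series whose counter has stopped growing still appears in the result with a value of 0, which is consistent with old codes lingering in the legend with a zero count.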

before: Screenshot 2025-09-19 at 4.10.14 PM.png (656×2 px, 251 KB)
after: Screenshot 2025-09-19 at 4.10.47 PM.png (566×2 px, 109 KB)