Page MenuHomePhabricator

Odd characters appearing before qualifier values
Closed, ResolvedPublicBUG REPORT


User story: N/A

We have this:
There are some strange characters appearing before qualifier values on file pages. Sample file:,_LVM,_B%C3%BCrogeb%C3%A4ude_--_2013_--_5149-51.jpg

We want this:
No character weirdness, please 😄

Screenshots (if possible):

image.png (692×919 px, 42 KB)

Acceptance Criteria:

  • Qualifier values have no strange characters appearing in their display

During development, please test the following:

  • Test this feature while logged in AND logged out
  • Test this feature on at least one mobile browser
  • Test that this feature works on the file page AND the Add Data step on UploadWizard (if applicable, some features only exist on one or the other)

COVID-19 Deployment Criteria

  • Can you roll back this change without lasting impact?
    1. A recovery plan is required as this will help identify our capacity for recovering from the failure
    2. THIS IS A KEY QUESTION, if you can’t answer it, you shouldn’t deploy
  • Is specialized knowledge required to support this change in production? If so, are there multiple people with this knowledge?
  • Is there a way to increase confidence about the correctness of this change?
    1. Reviews (Design, Code, etc)
    2. Testing coverage (unit tests, integration tests)
    3. Manual testing (e.g. Beta, vagrant, docker)

Event Timeline

Ramsey nerd-sniped me with this, so I dug briefly and found a Windows-specific bug which seemed on point. Initially when talking with him I thought it might be a random space character trailing a line in our output HTML, but I then found that at src/View/MediaInfoEntityStatementsView.php:371, we explicitly add the line separator character, which seems out of place here. From the Unicode docs:

A line separator indicates where a line break alone should occur, typically within a paragraph.

Given there is no line break rendered here, at least not on my screen, perhaps it would be more appropriate to use   (or similar) than 
 here, which would also greatly increase readability (i.e. you wouldn't need to look up the HTML decimal code the next time you wanted to figure out what this code was doing. Chrome on Windows should be able to handle a non-break space just fine, as most browsers have since the dawn of time.

@matthiasmullie seems to have made both changes (fd85f8b0 and 47dbc1be) to WBMI to add this character in, so perhaps he can shed some light as to why, and maybe dissuade us from a regression.

Change 578962 had a related patch set uploaded (by Matthias Mullie; owner: Matthias Mullie):
[mediawiki/extensions/WikibaseMediaInfo@master] Get rid of 

Change 578962 merged by jenkins-bot:
[mediawiki/extensions/WikibaseMediaInfo@master] Get rid of 

Change 581578 had a related patch set uploaded (by Matthias Mullie; owner: Matthias Mullie):
[mediawiki/extensions/WikibaseMediaInfo@master] Fix qualifier separator whitespace handling

Is T247711 a dupe or related?

Not a dupe, but yes: related.
This one has already been fixed, but in order to fix T247711, this one also had to change :)
See more elaborate comment in other ticket.

Change 581578 merged by jenkins-bot:
[mediawiki/extensions/WikibaseMediaInfo@master] Fix qualifier separator whitespace handling