Check generated html for duplicate id= attributes.
Closed, Resolved (Public)

Description

Currently, it is perfectly possible to use HTML id= attributes inside
a wiki page and give them arbitrary values. Using a value that is
also used by the skin, or repeating an id= value within the page text,
generates invalid HTML, since id= values have to be unique within
an HTML document.

While browsers tolerate this class of error, so that it does not
usually cause problems for users, the intended uses of id= values
are affected: styling via CSS, and fragment identifiers that
browsers are expected to scroll to.

At the moment, editors are not notified of the possibility of
problems, and duplicated id= values are not diagnosed.

Since the parser could obviously collect id= values in order
to find duplicates, this should be considered for the new
parser. Checking the id= values is one of the tasks that requires
a full page parse even during section editing; otherwise,
sufficient data needs to be kept with each page to know the existing
id= values of other sections without parsing them.
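
A minimal sketch of the kind of check meant here, written in Python with the standard library's html.parser rather than against the MediaWiki parser itself. The names collect_ids, find_duplicate_ids and ids_clashing_with_other_sections are hypothetical and only illustrate the idea; the second function assumes a set of id= values for the other sections has been stored with the page.

```python
from collections import Counter
from html.parser import HTMLParser


class _IdCollector(HTMLParser):
    """Records the value of every non-empty id= attribute in start tags."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased.
        for name, value in attrs:
            if name == "id" and value:
                self.ids.append(value)


def collect_ids(html):
    """Return all id= values found in the given HTML, in document order."""
    collector = _IdCollector()
    collector.feed(html)
    collector.close()
    return collector.ids


def find_duplicate_ids(html):
    """Return the sorted id= values that occur more than once (full-page check)."""
    counts = Counter(collect_ids(html))
    return sorted(value for value, n in counts.items() if n > 1)


def ids_clashing_with_other_sections(section_html, ids_elsewhere):
    """Section-edit check: ids in the edited section that other sections already use.

    ids_elsewhere is assumed to be a stored set of id= values for the
    rest of the page, so those sections need not be parsed again.
    """
    return sorted(set(collect_ids(section_html)) & set(ids_elsewhere))
```

The first function corresponds to the full page parse; the second corresponds to keeping enough data with each page to check a single section on its own.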

I would also suggest reserving a series of id= values for
use by MediaWiki and documenting it, so as to reduce the
likelihood of conflicts. Since using id= attributes is a
subject likely outside the scope of casual, inexperienced,
and less educated editors, I also suggest simply accepting duplicates
on a second save, as is done for empty "Summary" fields. Maybe even
a toggle in Special:Preferences, similar to the one for the
handling of empty "Summary" fields, could be considered for the
id= value checking.

This bug is related to bug 7356.


Version: unspecified
Severity: enhancement

Details

Reference
bz29049

Event Timeline

bzimport raised the priority of this task from to Medium. Nov 21 2014, 11:36 PM
bzimport set Reference to bz29049.
bzimport added a subscriber: Unknown Object (MLST).

(In reply to comment #0)

> I would also suggest to reserve a series of id= values for
> the use by MediaWiki and document that so as to reduce the
> likelihood of conflicts.

Generally speaking, classes and IDs added nowadays to the interface are prefixed by "mw-" to help avoid collisions with user input. In fact, this is documented at [0]. If the rest of your proposal were implemented (issuing warnings for duplicate IDs), we could throw a general warning for any ID beginning with "mw-".

[0] http://www.mediawiki.org/wiki/Coding_convention#Messages - beginning with 'HTML class and ID names should be prefixed with "mw-"...'
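
For illustration only, the general warning for IDs beginning with "mw-" could look roughly like the Python sketch below. The function name reserved_prefix_ids is hypothetical, and treating "mw-" as the only reserved prefix is an assumption taken from the convention quoted above; nothing here reflects existing MediaWiki code.

```python
RESERVED_PREFIX = "mw-"  # assumption: the only reserved prefix, per the convention above


def reserved_prefix_ids(ids):
    """Return the user-supplied id= values that start with the reserved prefix."""
    return [value for value in ids if value.startswith(RESERVED_PREFIX)]


# Example: only the first value would trigger the warning.
for value in reserved_prefix_ids(["mw-head", "intro", "my-mw-box"]):
    print(f'Warning: id "{value}" begins with the reserved prefix "{RESERVED_PREFIX}"')
```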

This seems to be the same issue as bug 7356; marking as a dupe. Is there any additional information that's not already listed there?

*** This bug has been marked as a duplicate of bug 7356 ***