Tue, Feb 23
Might be worth adding a "serialize-then-parse and check you get the same tree" assertion at some point to try to catch these cases more aggressively. @ssastry mentioned that we sort of do this already with our RT testing framework. In theory we could do this in parser tests as well, but I was a little disappointed to see that we didn't already have a parser test case for [[Foo|<div></div>]] so adding that assertion to the parser test framework might not buy us any extra coverage or protection.
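The proposed assertion is essentially a one-liner; a minimal sketch in Python, using `json.dumps`/`json.loads` as a stand-in pair (the real check would plug in Parsoid's serializer and the HTML5 parser, which aren't named here):

```python
import json

def assert_roundtrips(serialize, parse, tree):
    """Serialize a tree, re-parse it, and check we get the same tree back."""
    reparsed = parse(serialize(tree))
    assert reparsed == tree, f"serialize-then-parse changed the tree: {reparsed!r}"

# Stand-in serializer/parser pair for illustration only.
assert_roundtrips(json.dumps, json.loads, {"a": [1, 2], "b": {"c": None}})
```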
Mon, Feb 22
https://phabricator.wikimedia.org/T214241#6849806 contains some earlier discussion, framed at the time as an issue of collapsing wrapper elements.
In T275082: Develop a spec for representing a DOM range in serialized Parsoid output the proposal is to have a way to represent a DOM *range*, both internally to Parsoid and externally as a serialization form. This is related to "issue #1" from the comment above, where we wanted a uniform way to represent "collapsed wrappers". The main difference is that T275082 ranges are *not* guaranteed to correspond to complete DOM subtrees (though maybe they should be?), but they should still correspond to a contiguous set of nodes in an in-order traversal. So they can be represented as an about ID applied to a number of nodes, but adding a temporary <parsoid-wrapper> element to the internal DOM tree is probably not (quite) sufficient.
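The "about ID applied to a number of nodes" representation can be sketched as stamping a shared attribute onto each top-level node of the contiguous run; the dict-based node shape below is made up for illustration and is not Parsoid's internal DOM:

```python
def mark_range(siblings, start, end, about_id):
    """Tag a contiguous run of top-level sibling nodes with a shared about ID,
    instead of wrapping them in a synthetic element. The range need not be a
    complete subtree -- only contiguous in an in-order traversal."""
    for node in siblings[start:end + 1]:
        node.setdefault("attrs", {})["about"] = about_id
    return siblings
```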
Mon, Feb 8
I think the proper resolution of this task was outlined by @Nikerabbit:
Wed, Feb 3
Ok, the pair of patches (https://gerrit.wikimedia.org/r/c/mediawiki/core/+/622663 and https://gerrit.wikimedia.org/r/c/mediawiki/extensions/VisualEditor/+/661213) actually fix the problem, according to @DLynch's testing. Sorry it took a while to get that right.
Mon, Feb 1
Basically there's one "builder" class in RemexHtml; Dodo could implement that to quickly get a tree builder. Alternatively, we could remove some explicit type casts (as I did for Zest) to get a "universal" Remex that would work with any DOM library, as long as it was given an appropriate duck-typed "DOMImplementation" class.
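The duck-typed "DOMImplementation" idea is the standard pattern of parameterizing a generic builder over a node factory; a sketch (method names here are illustrative, not RemexHtml's actual interface):

```python
class DictDomImplementation:
    """A minimal duck-typed 'DOMImplementation': any object providing these
    two methods could be handed to the generic builder below."""
    def create_element(self, tag):
        return {"tag": tag, "children": []}

    def append_child(self, parent, child):
        parent["children"].append(child)

def build_tree(impl, spec):
    """Generic tree builder: works with any implementation honoring the duck type."""
    node = impl.create_element(spec["tag"])
    for child_spec in spec.get("children", []):
        impl.append_child(node, build_tree(impl, child_spec))
    return node
```

Swapping in a different implementation (say, one backed by a real DOM library) requires no change to `build_tree`, which is the point of removing the explicit type casts.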
Jan 28 2021
In T69486#6738901, @Ltrlg wrote:
In T69486#6731901, @cscott wrote:
Self-links are (I expect) pretty rare, so the fact that they could split a possible fragment cache based on 'current document title' isn't much of a concern right now.
I think most inclusions of navboxes contain a self-link. Not sure this is enough to be a concern, but I wouldn't classify it as "pretty rare".
Jan 26 2021
Given the intent to implement TOC in Parsoid (T270199), probably the correct thing to do is add a 'toc' flag to the !!options to opt-in to the toc post-processing step (when that gets implemented).
Jan 25 2021
Another option is to break out the Sanitizer as a library; if it was a parallel PHP/JS library (like wikipeg) then we could perhaps solve some of the Sanitizer-multiplication issues described in T248211: One Sanitizer to Rule Them All and reuse this library in more places.
Jan 20 2021
@Krinkle I'm not sure why you think the solution is to prefix IDs in *user* content, rather than to reserve a prefix for *system* content. After all, (a) links to IDs in user content are already all over the web and in archives, it's somewhat rude to change them even if we provide compatibility anchors for a short time (and if we keep them forever, we haven't actually solved the problem), and (b) URLs to wiki content are intended to be "human readable", which suggests that "machine" content like "mw-" should be omitted if possible -- adding a prefix to user IDs also doesn't play nicely with our i18n (consider RTL anchors).
I'll look at the code, the idea seems reasonable. We could also file a bug upstream to see if PHP will add a bignum or string interface. Could also check libicu to see what types they are supporting (maybe the fault is just in the PHP wrapper).
Jan 19 2021
What's the wikitext source? For [[Chapter 30]] Parsoid should definitely be outputting <a rel="mw:WikiLink" href="./Book_of_Jasher" title="Book of Jasher">Book of Jasher</a> and be totally compatible with the core parser output:
$ echo "[[Chapter 30]]" | php bin/parse.php --normalize
WRT captions appearing in the error case:
The [MediaWiki DOM Spec](https://www.mediawiki.org/wiki/Specs/HTML/2.2.0#Images) says:
<figcaption (absent when inline)>...</figcaption>
and
The outer <figure> element needs to become a <span> element when the figure is rendered inline, since otherwise the HTML5 parser will interrupt a surrounding block context. The inner <figcaption> element is rendered as a data-mw attribute in this case (since block content in an invisible caption would otherwise break parsing).
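The transformation the spec describes can be sketched as follows; the dict-based node shape and helper name are made up for illustration (element and attribute names follow the spec text above):

```python
import json

def inline_figure(figure):
    """Rewrite a block <figure> node as an inline <span>, moving the
    <figcaption> content into a data-mw attribute so block content in an
    invisible caption can't break the surrounding inline context."""
    figure["tag"] = "span"
    children = figure.get("children", [])
    captions = [c for c in children if c.get("tag") == "figcaption"]
    figure["children"] = [c for c in children if c.get("tag") != "figcaption"]
    if captions:
        figure.setdefault("attrs", {})["data-mw"] = json.dumps(
            {"caption": captions[0].get("text", "")})
    return figure
```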
Great minds think alike: T230653: Use a parser function to encapsulate signatures
T230653: Use a parser function to encapsulate signatures would mitigate this long-term.
Jan 14 2021
I think it's high time that the Parser/Linker maintain a list of interface-reserved prefixes (like n-, p-, and mw-), as well as a (short) list of legacy IDs (such as footer), that are automatically mapped to a different name to avoid clashes with interface styles.
For example, by prepending it with h- for heading, or something like that. For compatibility this would of course be limited only to where it is causing potential conflicts. Doing this for the other 99.9% of headings is out of scope for this task.
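The proposed mapping is small enough to sketch directly; the prefix list, legacy-ID list, and function name below are illustrative, taken from the comment above rather than from any actual implementation:

```python
RESERVED_PREFIXES = ("n-", "p-", "mw-")  # interface-reserved prefixes
LEGACY_IDS = {"footer"}                  # legacy interface IDs

def remap_user_id(user_id: str) -> str:
    """Prefix user-content IDs that would clash with interface-reserved names;
    leave the other 99.9% of IDs untouched."""
    if user_id in LEGACY_IDS or user_id.startswith(RESERVED_PREFIXES):
        return "h-" + user_id
    return user_id
```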
Jan 11 2021
\Wikimedia\DoDo maybe? Makes "DOm DOcument" clearer? OTOH, maybe reads as the repeated imperative "do do" instead of the bird.
Jan 10 2021
Can we open a new phab task for this? I apologize for not noticing/flagging this earlier. There are a number of tasks already in phab to deprecate and remove the old MediaWiki language codes (including sr-ec, sr-el, etc) and it would be a significant step backwards to have the old names written into article wikitext, which would require manually updating all that wikitext in the future.
Jan 8 2021
Not necessarily going to work on this immediately (I've got higher-priority parser test tasks) but since I added the GetLinkColors hook to core/Parsoid I'll provisionally claim this task.
@GWicke's idea about putting the "document identity" in the CSS is interesting, so that a link could be styled as a self-link (or not) depending on the CSS that is applied to it.
Jan 6 2021
I think addDBDataOnce is more fundamentally broken, and shouldn't be used.
Related Q: how can we make code CI run your test suite so that it doesn't just break Parsoid CI? Core CI *does* run some tests in a mode where Parsoid is installed -- can we add your tests to that group?
Yes, the Parser test runner setup creates its own interwiki table (using wgInterwikiCache) so that test results are not dependent on the host wiki configuration.
Dec 22 2020
The local/global/site interwiki tables are implemented via the CDB caching; that's not expected to change.
Some comments left on https://gerrit.wikimedia.org/r/c/mediawiki/core/+/617294 -- see if you can determine if the $deps array is correct or not.
Dec 21 2020
https://gerrit.wikimedia.org/r/c/mediawiki/extensions/VisualEditor/+/649755 is my recommended fix here. It's been waiting for review for a while.
Yeah, this is a bug in the lua code. I've attempted to contact the author: https://fr.wikipedia.org/w/index.php?title=Discussion_module%3ACoordinates&type=revision&diff=177884216&oldid=173976505
I strongly suspect that someone is converting "-71.3" degrees to "71.3 S" by chopping off the first *byte*, instead of the first *character*.
The unicode minus sign is from formatnum -- it shouldn't be getting chopped up into bad UTF-8, unless someone somewhere is doing a naive substr(1, ...) or something like that. I'll look.
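The byte-vs-character distinction is easy to demonstrate; a sketch in Python (in a Scribunto Lua module the equivalent mistake would be using the byte-based string.sub where the character-based mw.ustring.sub is needed):

```python
s = "\u221271.3"  # "−71.3" with U+2212 MINUS SIGN, a 3-byte sequence in UTF-8

# Wrong: chop off the first *byte* -- leaves the tail of a multi-byte
# sequence, producing exactly the invalid UTF-8 seen in this bug.
chopped_bytes = s.encode("utf-8")[1:]
try:
    chopped_bytes.decode("utf-8")
    broken = False
except UnicodeDecodeError:
    broken = True

# Right: chop off the first *character*.
chopped_chars = s[1:]  # "71.3"
```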