Page MenuHomePhabricator

Flow: Parsoid should strip inline JavaScript from user posts
Closed, DeclinedPublic

Description

At https://en.wikipedia.org/wiki/Wikipedia_talk:Flow/Developer_test_page
if I scroll down until the posts older than 2/9/2014 are loaded, then the same 10 topics start loading in a loop.

webconsole says:
ReferenceError: categoryTreeLoadChildren is not defined

Possibly a duplicate of bug 61097 (which also mentioned categorytree)


Version: unspecified
Severity: normal

Details

Reference
bz68932

Event Timeline

bzimport raised the priority of this task from to Needs Triage.Nov 22 2014, 3:40 AM
bzimport set Reference to bz68932.
bzimport added a subscriber: Unknown Object (MLST).

This no longer looks broken to me...?

I reproduced. After I reach February and soon after "Yikes, where's the HISTORY?" topic, the same set of 8 topics ending with the "category tree test" topic repeats, and as I scroll down that same set loads over and over. "category tree test" reports a JS error that used to halt loading before bug 61097 was fixed.

In Small view, the Wikipedia_talk:Flow/Developer_test_page looks something like

Example discussion title
...
many topics
...
Yikes, where's the HISTORY? (Topic:Rpejgfwuwrxfc44h)
Flow (Topic:Rpcbhk1f6mn3feux)
<Begins repeating 8 posts starting with the following topic:>
Size issues
Signatures and timestamps and other
test redlink / bluelink This topic was hidden by Fram
Testing thread closing
Watchlist
Top Posting
History?
category tree test (Topic:Rokd7o2d2be9c0pc) <the topic with JS error >

<and it repeats:>
Size issues
Signatures and timestamps and other
test redlink / bluelink This topic was hidden by Fram
Testing thread closing
Watchlist
Top Posting
History?
category tree test
<and again and again, until>
[Load More]

Switch to "Newest first" ordering, and try loading from here:
https://en.wikipedia.org/w/index.php?title=Wikipedia_talk:Flow/Developer_test_page&topiclist_offset-dir=fwd&topiclist_limit=10&topiclist_offset-id=ronz285xqcpwt1ms
curiously, it will loop from different spots. Eg, also try loading from here:
https://en.wikipedia.org/w/index.php?title=Wikipedia_talk:Flow/Developer_test_page&topiclist_offset-dir=fwd&topiclist_limit=10&topiclist_offset-id=rovvnaai47pp9u9w

Note: The topic "Another test page" (2/3/2014, 1:38:27 PM) should be the very last topic. (It was the first topic, on the newly created board). If you don't reach that, or if something loads after it, that's this/a bug.

I'm testing locally. enwiki is still running old code...

To save endless scrolling this URL starts "further down" on the Flow board.
https://en.wikipedia.org/wiki/Wikipedia_talk:Flow/Developer_test_page?topiclist_offset-dir=fwd&topiclist_limit=10&topiclist_offset-id=rq0wujqfuhwmya82

You can see in the batch of topics that endlessly loads that the textarea is pre-expanded because the JavaScript that collapses it doesn't run due to the JS error in a post. As a result the JavaScript that updates the [Load More] button's link to to paginate starting at a new offset doesn't run. But the JavaScript implementing infinite scroll continues to read the [Load More] href and turns it into an API request starting at the same offset.

To guard against this, maybe the enhancement JS should remember the previous [Load More] URL and if it's unchanged display a warningbox before the new block of topics "This Flow board appears to be requesting the same set of topics again". And I've reopened bug 61097.

I can confirm that just defining the categoryTreeLoadChildren allows everything to succeed, it seems we need to catch and ignore any errors caused by inserting user content. I took a look over the related code, but not sure how to accomplish this.

(In reply to Erik Bernhardson from comment #7)

I can confirm that just defining the categoryTreeLoadChildren allows
everything to succeed, it seems we need to catch and ignore any errors
caused by inserting user content. I took a look over the related code, but
not sure how to accomplish this.

We shouldn't be running this JS at all. User wikitext content should obviously not have JavaScript (and does not, unless we have a critical XSS), and neither should well-written extensions (they have JS, but not inline JS in rendered user content).

If we're not running the JavaScript, we thus don't need to catch exceptions.

For CategoryTree itself, it should not be an issue going forward. The call to categoryTreeLoadChildren was removed in daf3e2d9f1ae0fe0a085079d56ab535edcf27fae (Brian Wolff removed wgCategoryTreeDynamicTag, since it was broken).

I am also removing a little unreachable code that (if reachable) would have called another function (categoryTreeExpandNode): https://gerrit.wikimedia.org/r/#/c/170288/ (pending review)

categoryTreeLoadChildren now appears nowhere in the extension. However, existing posts would need to be re-converted from wikitext (however Parsoid normally does this, purging?) if they still have the old call in their rendering.

But since we may or may not want to do that purge (and it wouldn't fix future extensions that tried the same thing), I suggest we strip JavaScript as a general rule. This can be done in our Parsoid/Fixer stage.

Mattflaschen-WMF renamed this task from Flow: Topics repeating in infinite scroll - possibly categorytree error to Flow: Strip inline JavaScript from user posts.Dec 10 2014, 7:25 PM
Mattflaschen-WMF triaged this task as Medium priority.
Mattflaschen-WMF set Security to None.

The specific CategoryTree issue should be fixed for new renderings, but we should also remove JS as a general rule, so I've pivoted the bug to that.

I think the no javascript should come from parsoid itself, adding the project.

ssastry renamed this task from Flow: Strip inline JavaScript from user posts to Flow: Parsoid should strip inline JavaScript from user posts.May 21 2020, 3:05 PM
ssastry moved this task from Backlog to Needs Investigation on the Parsoid board.
Arlolra added a subscriber: Arlolra.

The page in question here is no longer using StructuredDiscussions and there doesn't seem to be any sample wikitext where Parsoid is failing to sanitize user defined javascript.