Get rid of <div class="pagetext"> which spans multiple sections, so that each section is balanced HTML
Closed, Resolved | Public | 8 Estimated Story Points

Description

Unbalanced sections will get automatically balanced by VE - whatever pagetext is being used for (styling?) could be done in other ways.

Event Timeline

Change 284721 had a related patch set uploaded (by Esanders):
Remove <div class="pagetext"> from wikitext output

https://gerrit.wikimedia.org/r/284721

Will I be able to substitute this with something that sets the page width in the Page: namespace of Wikisource? It is an important part of my proofreading setup. My CSS: https://en.wikisource.org/wiki/User:Ineuw/common.css

I hope that nobody reading this is so naïve as to expect each 'entry pane' to contain balanced HTML. Assemblages (such as Page:-as-a-whole; or results of <pages/> transclusion ) yes; but normal practice in the wikisources is that individual templates, modules or page sections frequently contain unbalanced HTML which is only resolved at a higher level structure. See for instance hanging indent inherit template usage on enWS alone (around 3600 uses).

And there are lots of similar templates, and other oddities, which is why I asked on T48580: Create a VisualEditor plugin to integrate with ProofreadPage for a test-bed wiki that we can test with before this is deployed.
And if unbalanced HTML is going to cause problems, I hope T54141: VisualEditor: Provide a mechanism to disable VisualEditor on a given page (to be used if it corrupts said page) will be reconsidered, or a better alternative provided.

If I have understood the proposed code changes correctly, <div class="pagetext"> is being removed only from the wikitext written to the database. The <div class="pagetext"> will still be generated on web pages (shown in view mode), and from the user's point of view nothing will change. All gadgets and CSS styles will work as before, without any changes.
Is that right?

Z.

So there will be two different wikitext outputs for pages created before/after this change when queried via the API? Or ...?

Before:

"<noinclude><pagequality level=\"4\" user=\"User\" /><div class=\"pagetext\">\n\n\n</noinclude>Sometext<noinclude></div></noinclude>"

After:

"<noinclude><pagequality level=\"4\" user=\"User\" /></noinclude>Sometext<noinclude></noinclude>"
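A client that has to cope with both serializations could treat the <div class="pagetext"> wrapper as optional when splitting a page into header, body and footer. The following is only a minimal sketch under that assumption: PAGE_RE and split_page are hypothetical names, not part of ProofreadPage or pywikibot, and the two sample strings are the unescaped forms of the examples above.

```python
import re

# Hypothetical pattern (an assumption, not the ProofreadPage or pywikibot
# implementation): the <div class="pagetext"> wrapper is optional, so both
# the old and the new serialization match.
PAGE_RE = re.compile(
    r'<noinclude>(?P<header>.*?)(?:<div class="pagetext">\s*)?</noinclude>'
    r'(?P<body>.*)'
    r'<noinclude>(?P<footer>.*?)(?:</div>)?</noinclude>\Z',
    re.DOTALL,
)


def split_page(wikitext):
    """Return (header, body, footer) for either serialization."""
    match = PAGE_RE.match(wikitext)
    if match is None:
        raise ValueError("not a recognised Page: serialization")
    return match.group("header"), match.group("body"), match.group("footer")


# Unescaped versions of the "Before" and "After" examples.
OLD = ('<noinclude><pagequality level="4" user="User" />'
       '<div class="pagetext">\n\n\n</noinclude>Sometext'
       '<noinclude></div></noinclude>')
NEW = ('<noinclude><pagequality level="4" user="User" /></noinclude>'
       'Sometext<noinclude></noinclude>')

# Both forms yield the same three parts.
assert split_page(OLD) == split_page(NEW)
```

The body group is greedy so that any <noinclude> blocks inside the page body stay in the body; only the last <noinclude> block is taken as the footer.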

Change 284721 merged by jenkins-bot:
Remove <div class="pagetext"> from wikitext output

https://gerrit.wikimedia.org/r/284721

Tpt claimed this task.

This change has just been merged.

So there will be two different wikitext outputs for pages created before/after this change when queried via the API? Or ...?

Yes. Exactly. If you want, you could also switch to rvcontentformat=application/json to get a nice JSON output
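For illustration, a request using rvcontentformat might be built as below. This is a sketch only: revision_query_url is a hypothetical helper, the page title is made up, and the parameters other than rvcontentformat are the usual ones for a prop=revisions query rather than anything specified in this task. It only constructs the URL and makes no network request.

```python
from urllib.parse import urlencode


def revision_query_url(api_url, title, content_format="application/json"):
    """Build a prop=revisions query URL asking for a given content format.

    Hypothetical helper for illustration; it only builds the URL string.
    """
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "rvcontentformat": content_format,  # parameter named in the comment above
        "titles": title,
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


# Assumed example values: the title is invented for this sketch.
url = revision_query_url(
    "https://en.wikisource.org/w/api.php",
    "Page:Example.djvu/1",
)
```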

OK, then I think the ProofreadPage module in pywikibot will have a problem.
Unfortunately, to my knowledge, pywikibot cannot handle any contentformat other than wikitext.

I've just submitted a change to pywikibot that should fix this problem: https://gerrit.wikimedia.org/r/#/c/295626/

Change 295626 had a related patch set uploaded (by Mpaa):
<div class="pagetext"> is now optional in ProofreadPage pages serialization

https://gerrit.wikimedia.org/r/295626

@Tpt, how can one tell whether the ProofreadPage version in use is in the old or the new mode?

What happens if one tries to save the old format

<noinclude><pagequality level=\"4\" user=\"User\" /><div class=\"pagetext\">\n\n\n</noinclude>Sometext<noinclude></div></noinclude>

after this change?

@Tpt, how can one tell whether the ProofreadPage version in use is in the old or the new mode?

Just ask for the Wikitext of a page and see if there is a <div class="pagetext"> in it.

What happens if one tries to save the old format after this change?

It will work just as before. The old format stays in the database for old revisions of pages, so it will probably remain supported as long as we do not do big database updates (unlikely to happen anytime soon).
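The check described above (look for <div class="pagetext"> in the page's wikitext) can be sketched in a few lines. The names uses_old_format and to_new_format are hypothetical, and the normalisation step is optional, since the old format is still accepted on save.

```python
import re

# Assumed shapes of the wrapper, taken from the examples in this task:
# the opening div sits at the end of the header <noinclude> block and the
# closing div at the end of the footer <noinclude> block.
PAGETEXT_OPEN = re.compile(r'<div class="pagetext">\s*(?=</noinclude>)')
PAGETEXT_CLOSE = re.compile(r'</div>(?=</noinclude>\Z)')


def uses_old_format(wikitext):
    """True if the wikitext still carries the <div class="pagetext"> wrapper."""
    return PAGETEXT_OPEN.search(wikitext) is not None


def to_new_format(wikitext):
    """Drop the wrapper from old-format wikitext; leave new-format text as is."""
    if not uses_old_format(wikitext):
        return wikitext
    text = PAGETEXT_OPEN.sub("", wikitext, count=1)
    return PAGETEXT_CLOSE.sub("", text, count=1)


old_text = ('<noinclude><pagequality level="4" user="User" />'
            '<div class="pagetext">\n\n\n</noinclude>Sometext'
            '<noinclude></div></noinclude>')
new_text = to_new_format(old_text)
```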

@Tpt, I cannot know in advance which pages exist on a Wikisource site.
And if I try to fetch a nonexistent page with the 'preload' option, I always get <div class="pagetext"> (which I suspect is because 'preload' returns the version prepared for reading?).

So I have no reliable way to know until an existing page is loaded, which is quite inconvenient, especially if I want to create a new page from scratch.

Is it reliable to say that this was introduced with MediaWikiVersion ('1.28.0-wmf.8') and to use that as the criterion?

And if I try to fetch a nonexistent page with the 'preload' option, I always get <div class="pagetext"> (which I suspect is because 'preload' returns the version prepared for reading?).

It was due to my mistake instead ...

Change 296961 had a related patch set uploaded (by Mpaa):
<div class="pagetext"> is now optional in ProofreadPage pages serialization

https://gerrit.wikimedia.org/r/296961

Change 296961 abandoned by Mpaa:
<div class="pagetext"> is now optional in ProofreadPage pages serialization

Reason:
Pushed by mistake.
Was supposed to be https://gerrit.wikimedia.org/r/#/c/295626/

https://gerrit.wikimedia.org/r/296961

Change 295626 merged by jenkins-bot:
<div class="pagetext"> is now optional in ProofreadPage pages serialization

https://gerrit.wikimedia.org/r/295626