Get rid of <div class="pagetext"> which spans multiple sections, so that each section is balanced HTML
Closed, Resolved · Public · 8 Story Points

Description

Unbalanced sections will get automatically balanced by VE - whatever pagetext is being used for (styling?) could be done in other ways.

Esanders created this task. Apr 21 2016, 2:22 PM
Restricted Application added a subscriber: Aklapper. Apr 21 2016, 2:22 PM

Change 284721 had a related patch set uploaded (by Esanders):
Remove <div class="pagetext"> from wikitext output

https://gerrit.wikimedia.org/r/284721

Ineuw added a subscriber: Ineuw. Apr 22 2016, 1:00 AM

Will I be able to replace this with something that sets the page width in the Page: namespace of Wikisource? It is an important part of my proofreading setup. My CSS: https://en.wikisource.org/wiki/User:Ineuw/common.css

AuFCL added a subscriber: AuFCL. Apr 22 2016, 1:56 AM

I hope that nobody reading this is so naïve as to expect each 'entry pane' to contain balanced HTML. Assemblages (such as Page:-as-a-whole; or results of <pages/> transclusion ) yes; but normal practice in the wikisources is that individual templates, modules or page sections frequently contain unbalanced HTML which is only resolved at a higher level structure. See for instance hanging indent inherit template usage on enWS alone (around 3600 uses).

jayvdb added a subscriber: jayvdb. Apr 22 2016, 2:09 AM

> I hope that nobody reading this is so naïve as to expect each 'entry pane' to contain balanced HTML. Assemblages (such as Page:-as-a-whole; or results of <pages/> transclusion ) yes; but normal practice in the wikisources is that individual templates, modules or page sections frequently contain unbalanced HTML which is only resolved at a higher level structure. See for instance hanging indent inherit template usage on enWS alone (around 3600 uses).

And there are lots of similar templates, and other oddities, which is why I asked on T48580: Create a VisualEditor plugin to integrate with ProofreadPage that we can have a test bed wiki to test with before this is deployed.
And if unbalanced HTML is going to cause problems, I hope T54141: VisualEditor: Provide a mechanism to disable VisualEditor on a given page (to be used if it corrupts said page) will be reconsidered, or a better alternative provided.

Zdzislaw added a subscriber: Zdzislaw. Edited Apr 22 2016, 3:00 PM

If I have understood the proposed code changes correctly, <div class="pagetext"> is being removed only from the output wikitext written to the database. The <div class="pagetext"> will still be generated on web pages (shown in view mode), so from the user's point of view nothing will change. All gadgets and CSS styles will work as before, without any changes.
Is that right?

Z.

Mpaa added a subscriber: Mpaa. Apr 22 2016, 4:47 PM
Mpaa added a comment. Edited Apr 22 2016, 5:38 PM

So there will be two different wikitext outputs for pages created before/after this change when queried via the API? Or ...?

Before:

"<noinclude><pagequality level=\"4\" user=\"User\" /><div class=\"pagetext\">\n\n\n</noinclude>Sometext<noinclude></div></noinclude>"

After:

"<noinclude><pagequality level=\"4\" user=\"User\" /></noinclude>Sometext<noinclude></noinclude>"
Jdforrester-WMF set the point value for this task to 8. May 10 2016, 1:27 AM

Change 284721 merged by jenkins-bot:
Remove <div class="pagetext"> from wikitext output

https://gerrit.wikimedia.org/r/284721

Tpt closed this task as Resolved. Jun 22 2016, 12:37 PM
Tpt claimed this task.

This change has just been merged.

> So there will be two different wikitext outputs for pages created before/after this change when queried via the API? Or ...?

Yes, exactly. If you want, you could also switch to rvcontentformat=application/json to get a nice JSON output.
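
For reference, a request along those lines could be built roughly as follows. This is only a sketch: the wiki URL and page title are placeholders, the helper name `build_revision_query` is made up for illustration, and the parameters are the usual `action=query` / `prop=revisions` ones plus the `rvcontentformat` value mentioned above. The shape of the JSON that comes back is not described in this thread, so the sketch stops at building the URL.

```python
from urllib.parse import urlencode


def build_revision_query(api_url, title, content_format="application/json"):
    """Return a MediaWiki API URL asking for the latest revision content.

    `content_format` is passed through as rvcontentformat; use "text/x-wiki"
    to get plain wikitext instead of JSON.
    """
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "rvcontentformat": content_format,
        "titles": title,
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


# Placeholder wiki and page title, for illustration only.
url = build_revision_query(
    "https://en.wikisource.org/w/api.php", "Page:Example.djvu/1"
)
print(url)
```

Fetching the URL is left out on purpose so the snippet runs without network access.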

Mpaa added a comment. Jun 22 2016, 8:24 PM

OK, then I think the Proofread module in pywikibot will have a problem.
Unfortunately, to my knowledge, pywikibot cannot handle any contentformat other than wikitext.

Tpt added a comment. Jun 23 2016, 7:52 AM

I've just submitted a change to pywikibot that should fix this problem https://gerrit.wikimedia.org/r/#/c/295626/

Change 295626 had a related patch set uploaded (by Mpaa):
<div class="pagetext"> is now optional in ProofreadPage pages serialization

https://gerrit.wikimedia.org/r/295626

Mpaa added a comment. Jun 25 2016, 6:44 PM

@Tpt, how can one tell whether the ProofreadPage version is using the old or the new mode?

What will happen if one tries to save the old format

<noinclude><pagequality level=\"4\" user=\"User\" /><div class=\"pagetext\">\n\n\n</noinclude>Sometext<noinclude></div></noinclude>

after this change?

Tpt added a comment. Jun 26 2016, 7:21 AM

> @Tpt, how can one tell whether the ProofreadPage version is using the old or the new mode?

Just ask for the wikitext of a page and see if there is a <div class="pagetext"> in it.

> What will happen if one tries to save the old format after this change?

It will work just as before. The old format remains in the database for old revisions of pages, so it will probably stay supported as long as we do not run a large database update (unlikely to happen anytime soon).
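
A client that has to cope with both serializations could treat the div as optional when parsing. The sketch below is one possible way to do that and is not the code from the pywikibot change linked above; the names `PAGE_RE` and `parse_page` are hypothetical, and the pattern only covers the simple before/after shapes quoted earlier in this task.

```python
import re

# <div class="pagetext"> after the pagequality tag and the closing </div>
# in the footer are both optional, so old and new pages match alike.
PAGE_RE = re.compile(
    r'^<noinclude>'
    r'(?P<quality><pagequality\s[^>]*/>)'
    r'(?P<div><div class="pagetext">)?'
    r'(?P<header>.*?)'
    r'</noinclude>'
    r'(?P<body>.*)'
    r'<noinclude>'
    r'(?P<footer>.*?)'
    r'(?:</div>)?'
    r'</noinclude>$',
    re.DOTALL,
)


def parse_page(wikitext):
    """Split a Page: serialization into header, body and footer."""
    m = PAGE_RE.match(wikitext)
    if m is None:
        raise ValueError("not a recognised Page: serialization")
    return {
        # True when the old <div class="pagetext"> wrapper is present.
        "old_format": m.group("div") is not None,
        "header": m.group("header"),
        "body": m.group("body"),
        "footer": m.group("footer"),
    }


OLD = ('<noinclude><pagequality level="4" user="User" />'
       '<div class="pagetext">\n\n\n</noinclude>Sometext'
       '<noinclude></div></noinclude>')
NEW = ('<noinclude><pagequality level="4" user="User" /></noinclude>'
       'Sometext<noinclude></noinclude>')

for sample in (OLD, NEW):
    parsed = parse_page(sample)
    print(parsed["old_format"], repr(parsed["body"]))
```

Checking `old_format` on any fetched page is the "see if there is a div in it" test described above; the body comes out the same either way.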

Mpaa added a comment. Jun 29 2016, 9:52 PM

> @Tpt, how can one understand if the ProofreadPage version is using the old or new mode?

> Just ask for the Wikitext of a page and see if there is a <div class="pagetext"> in it.

@Tpt, I cannot know in advance which pages exist on a Wikisource site.
And if I try to fetch a nonexistent page with the 'preload' option, I always get <div class="pagetext"> (which I suspect is because 'preload' prepares the version meant for reading?).

So I have no reliable way to know until an existing page is loaded, which is quite inconvenient, especially if I want to create a new page from scratch.

Is it reliable to say that this was introduced with MediaWikiVersion ('1.28.0-wmf.8') and to use that as the criterion?
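
To make the idea concrete, a version-based check might look like the sketch below. The cut-over value is only the guess from the question above and is not confirmed anywhere in this task; `parse_wmf_version` and `may_omit_pagetext_div` are hypothetical helpers written for illustration, not pywikibot functions.

```python
import re


def parse_wmf_version(version):
    """Turn '1.28.0-wmf.8' into a comparable tuple such as (1, 28, 0, 8)."""
    m = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)(?:-wmf\.(\d+))?", version)
    if m is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor, patch = (int(m.group(i)) for i in (1, 2, 3))
    # A plain release such as '1.28.0' sorts after every '1.28.0-wmf.N'.
    wmf = int(m.group(4)) if m.group(4) is not None else float("inf")
    return (major, minor, patch, wmf)


# Assumption taken from the question above, NOT confirmed in this thread.
ASSUMED_CUTOVER = "1.28.0-wmf.8"


def may_omit_pagetext_div(site_version):
    """Guess whether a site already serializes pages without the div."""
    return parse_wmf_version(site_version) >= parse_wmf_version(ASSUMED_CUTOVER)


print(may_omit_pagetext_div("1.28.0-wmf.10"))
print(may_omit_pagetext_div("1.28.0-wmf.7"))
```

Such a check would only be a heuristic; looking at the wikitext of an existing page, as suggested earlier, does not depend on knowing the exact version.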

Mpaa added a comment. Jun 30 2016, 5:19 AM

> And if I try to fetch a nonexistent page with the 'preload' option, I always get <div class="pagetext"> (which I suspect is because 'preload' prepares the version meant for reading?).

That was actually due to my own mistake ...

Change 296961 had a related patch set uploaded (by Mpaa):
<div class="pagetext"> is now optional in ProofreadPage pages serialization

https://gerrit.wikimedia.org/r/296961

Change 296961 abandoned by Mpaa:
<div class="pagetext"> is now optional in ProofreadPage pages serialization

Reason:
Pushed by mistake.
Was supposed to be https://gerrit.wikimedia.org/r/#/c/295626/

https://gerrit.wikimedia.org/r/296961

Change 295626 merged by jenkins-bot:
<div class="pagetext"> is now optional in ProofreadPage pages serialization

https://gerrit.wikimedia.org/r/295626

Ineuw removed a subscriber: Ineuw. Aug 7 2016, 11:40 PM
AuFCL removed a subscriber: AuFCL. Nov 21 2016, 7:40 PM
Billinghurst moved this task from Backlog to Done on the ProofreadPage board. Mar 5 2017, 2:08 AM