ProofreadPage: use OpenSeadragon for the Page NS image viewer
Closed, Resolved (Public)

Assigned To

Authored By

	Inductiveload
	Aug 4 2021, 4:58 PM

Description

This will allow a lot of nice tools like built-in rotation and so on, as well as being able to strip out lots of jQuery UI stuff and simplify the core code.

Will also allow:

Easy implementation of the region selector for T269818
Loading of much higher res images perhaps via a tiled image engine

This should re-use the packaging already done in T283917.

Also, there should be some "global" access to the OSD viewer object in JS so other gadgets can interact with it.

Details

	Subject	Repo	Branch	Lines +/-
	WIP: Use OpenSeadragon for PRP image zooming	mediawiki/extensions/ProofreadPage	master	+43 -93

Related Objects

Status	Subtype	Assigned	Task
Open		None	T276530 JumpToFile: tracking
Resolved		None	T276042 JumpToFile: make high-res loading optional and add offset
Stalled		None	T276052 Make page-carousel icon zones shareable, generic areas for multi-script use
Open	Feature	None	T294903 ProofreadPage: Implement region selection UI for the OCR tool directly in the PRP page editor
Resolved		Inductiveload	T288146 JumpToFile: Port to OpenSeadragon implementation
Resolved		Inductiveload	T288318 PageCarousel: Port to OpenSeadragon implementation
Resolved		Inductiveload	T288141 ProofreadPage: use OpenSeadragon for the Page NS image viewer
Resolved		Yash4357	T283917 Add zoom and pan to the Pagelist Widget
Resolved	BUG REPORT	Inductiveload	T295662 OpenSeaDragon (OSD) button tooltips are not translated

Event Timeline

Inductiveload created this task. Aug 4 2021, 4:58 PM

Restricted Application added a subscriber: Aklapper. Aug 4 2021, 4:58 PM

Inductiveload mentioned this in T269818: New OCR tool should be able to OCR part of a page. Aug 4 2021, 5:02 PM

Inductiveload added a subtask: T283917: Add zoom and pan to the Pagelist Widget. Aug 4 2021, 5:04 PM

Inductiveload added a parent task: T288146: JumpToFile: Port to OpenSeadragon implementation. Aug 4 2021, 5:08 PM

Inductiveload updated the task description.

@Inductiveload We are actually working on a patch for adding OSD to the Page namespace editor within this month as part of the GSoC project :)

@Soda Oh, great. BTW, I have a very preliminary patch for it at https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ProofreadPage/+/709879

So feel free to take whatever you need from there if it's at all helpful.

It seems pretty trivial once you can load OSD as a module. I did it because I was wondering if it would be easier to just do the OSD thing than faff about with jQuery for the OCR crop tool (answer: looks much easier and less hacky).

In T288141#7260664, @Inductiveload wrote:

@Soda Oh, great. BTW, I have a very preliminary patch for it at https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ProofreadPage/+/709879

So feel free to take whatever you need from there if it's at all helpful.

It seems pretty trivial once you can load OSD as a module. I did it because I was wondering if it would be easier to just do the OSD thing than faff about with jQuery for the OCR crop tool (answer: looks much easier and less hacky).

Yeah, definitely implementing it with OSD will be a lot easier than implementing it using jQuery.

Soda added a subscriber: Yash4357. Aug 4 2021, 6:09 PM

Inductiveload added a subtask: T288318: PageCarousel: Port to OpenSeadragon implementation. Aug 6 2021, 12:41 AM

Inductiveload removed a subtask: T288318: PageCarousel: Port to OpenSeadragon implementation.

Inductiveload added a parent task: T288318: PageCarousel: Port to OpenSeadragon implementation.

Change 709879 had a related patch set uploaded (by Inductiveload; author: Inductiveload):

[mediawiki/extensions/ProofreadPage@master] WIP: Use OpenSeadragon for PRP image zooming

https://gerrit.wikimedia.org/r/709879

gerritbot added a project: Patch-For-Review. Aug 6 2021, 5:52 AM

Change 709879 abandoned by Inductiveload:

[mediawiki/extensions/ProofreadPage@master] WIP: Use OpenSeadragon for PRP image zooming

Reason:

reimplemented properly by Yash

https://gerrit.wikimedia.org/r/709879

Maintenance_bot removed a project: Patch-For-Review. Aug 12 2021, 5:10 PM

Inductiveload moved this task from Backlog to Usability/UX/Batch actions on the ProofreadPage board. Sep 20 2021, 6:28 AM

Inductiveload added a parent task: T294903: ProofreadPage: Implement region selection UI for the OCR tool directly in the PRP page editor. Nov 3 2021, 10:36 AM

This has now been deployed to Wikisources

Ruthven subscribed. Nov 20 2021, 1:04 PM

As this new feature has been rolled back (due to a memory leak), why not delay the deployment until we are sure that there are no issues and that all the projects are notified of such a change? Clearly several local Gadgets will need to be corrected, and the communities need to be ready.
If there are memory issues (maybe OpenSeadragon is too greedy), Wikisources probably need a way to disable it locally. Have we thought about that?

To clarify, the memory leak is most likely not in this feature; it's in an unrelated change that was deployed at the same time (T296098).

In T288141#7518667, @Ruthven wrote:

If there are memory issues (maybe OpenSeadragon is too greedy), Wikisources probably need a way to disable it locally. Have we thought about that?

OpenSeadragon is heavier than the jQuery-based library that we had before. That said, it should not be causing any browser memory issues, since the difference in the amount of JS being loaded is very small (~50-60 KB) compared to the size of the images themselves (~100-200 KB, based on a few pages on en.wikisource.org). If there are widespread memory issues, implementing a toggle/preference to conditionally load the zooming and panning interface should be fairly easy.

As this new feature has been rolled back (due to a memory leak), why not delay the deployment until we are sure that there are no issues and that all the projects are notified of such a change? Clearly, several local Gadgets will need to be corrected, and the communities need to be ready.

With regard to broken gadgets, feel free to reach out/ping and we can help you fix the gadget. Broadly speaking, the following changes have been made:

Most of the HTML should remain the same, except for:

In the default layout, an .openseadragon-container div has been added inside the .prp-page-image div.
The .prp-page-image div has the .prp-page-image-openseadragon-vertical class.
The image inside the .prp-page-image div is hidden and has a srcset attribute.
If the user clicks on the horizontal (stacked) layout toggle, a new div with a .prp-page-image class is created that does not contain the image itself but contains only the .openseadragon-container div. This div also has the class .prp-page-image-openseadragon-horizontal.
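
For gadget authors, the markup changes listed above can be probed with a small helper. This is only a sketch: the name detectViewerLayout is hypothetical (it is not part of ProofreadPage), and it assumes the class names behave exactly as described in this comment. It checks the horizontal wrapper first, on the assumption that the original wrapper may still be in the DOM after the layout toggle.

```javascript
// Hypothetical helper, not part of ProofreadPage: reports which layout
// wrapper currently holds the OpenSeadragon container.
function detectViewerLayout( root ) {
	var prefix = '.prp-page-image-openseadragon-';
	var suffix = ' .openseadragon-container';
	// Assumption: the horizontal wrapper only exists after the layout toggle.
	if ( root.querySelector( prefix + 'horizontal' + suffix ) ) {
		return 'horizontal';
	}
	if ( root.querySelector( prefix + 'vertical' + suffix ) ) {
		return 'vertical';
	}
	// No OpenSeadragon container found (e.g. viewer not loaded yet).
	return null;
}
```

In a browser this would be called as detectViewerLayout( document ); a null result would suggest the viewer has not been set up yet.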

With regard to the JS side of things, a new API, mw.proofreadpage.Viewer, has been added that should allow scripts/gadgets to programmatically (via the OpenSeadragon API) zoom/pan/rotate/swap the image (or even draw shapes on it 😄). I personally am pretty excited to see what new scripts/gadgets can be created using this.
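
As a rough illustration of what a script could do once it holds the underlying OpenSeadragon viewer object, the sketch below rotates the page image by a relative angle using OpenSeadragon's viewport.getRotation() and viewport.setRotation(). The helper name rotateBy is hypothetical, and how a gadget obtains the viewer instance from mw.proofreadpage.Viewer is not shown; the function simply assumes it is handed one.

```javascript
// Hypothetical helper: rotate an OpenSeadragon viewer by a relative angle.
// Assumes `viewer` is an OpenSeadragon.Viewer (obtaining it is out of scope).
function rotateBy( viewer, deltaDegrees ) {
	var viewport = viewer.viewport;
	// Normalise into the range [0, 360) so repeated calls do not drift.
	var next = ( ( viewport.getRotation() + deltaDegrees ) % 360 + 360 ) % 360;
	viewport.setRotation( next );
	return next;
}
```

A gadget button could then call rotateBy( viewer, 90 ) to turn a sideways scan upright.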

I will write up how I have migrated some enWS scripts in response to OSD (T296145).

It's actually not that hard, but we're still missing an important facet for a robust API: the hook that signals the OSD viewer is ready has not been reviewed or merged yet. It's included in the work for T294903.

Edit: now split into a standalone commit so maybe it can be reviewed faster and then backported: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ProofreadPage/+/740324

Inductiveload moved this task from Usability/UX/Batch actions to Done: to deploy/check on the ProofreadPage board. Nov 21 2021, 11:57 AM

@Ruthven: Do you actively work on this, as you set the task status to "In Progress"?

@Aklapper nope, but the deployment seemed stuck. I reverted it to Resolved. Sorry for the inconvenience :)
@Soda Thanks a lot for your comment; actually the issue is with the image inside the .prp-page-image div, which is hidden when editing the page. I've fixed this special case with display: inline;, not knowing what this will affect in OpenSeadragon. But this is probably not the right place to discuss a specific gadget that was broken.

I'm fighting against OpenSeadragon, carefully deleting the new elements and rebuilding the previous environment. Please consider converting the whole thing into a central, optional gadget, so that users can, if they like, work in the previous, simpler environment. Thanks!

@Alex_brollo I don't think reversing a change that brings big improvements is the way to go (and this cannot function as a gadget), but your use case is interesting. Instead of trying to manually undo the changes I would suggest you try to find ways to use the new facilities to do what you're after. A lot of it I would expect to already be possible (OSD has an API exposed that can be used), and what's missing are probably good candidates for adding. If you explain your needs it might be possible to suggest alternate approaches for them.

@Alex_brollo please check the last line of @Soda 's message above, where there's a hint about some API interface.
Btw, in order to fix our JavaScript, I've used the following code to display a new image in the (OpenSeadragon) canvas:

image.onload = function () {
	// Draw the replacement image straight onto the canvas OpenSeadragon renders into.
	var canvas = document.querySelector( '.openseadragon-canvas canvas' );
	var ct = canvas.getContext( '2d' );
	ct.drawImage( image, 0, 0, canvas.width, canvas.height );
};

@Xover Can you suggest a place (can be phabricator as well) where users can interact with more expert ones to discuss such technical issues? I reckon that this is a specific case where such help is needed and requires a longer discussion.

@Alex_brollo / @Ruthven As promised in T296145, I have written up some basic documentation (which indeed should have been done when the feature was first written): https://www.mediawiki.org/wiki/Extension:Proofread_Page/Page_viewer

The "big" case of adding/replacing an image layer in the viewer is described there. The new image will then be a first-class member of the OSD viewer. The API also supports plugins, selection rectangles and other overlays.

There is now a handy mw.hook to signal the OSD viewer is ready, so that can help a lot.
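
A minimal sketch of how a gadget might combine the ready hook with an image swap, under stated assumptions: the hook name is passed in as a parameter rather than guessed, and the handler is assumed to receive the OpenSeadragon viewer as its argument (check the Page viewer documentation linked above for the real name and payload). mw.hook( name ).add( fn ) is the standard MediaWiki hook API, and viewer.open() with a tile source of type 'image' is OpenSeadragon's way of loading a plain image. The function name replaceImageWhenReady is hypothetical.

```javascript
// Sketch only: hookName and the handler argument are assumptions; consult
// the Page viewer documentation for the real hook name and payload.
function replaceImageWhenReady( mw, hookName, imageUrl ) {
	mw.hook( hookName ).add( function ( viewer ) {
		// OpenSeadragon accepts a plain image as a tile source of type 'image'.
		viewer.open( { type: 'image', url: imageUrl } );
	} );
}
```

Compared with drawing onto the canvas element directly, an image loaded through the viewer is managed by OpenSeadragon itself, so it should keep working with zooming, panning and redraws.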

In T288141#7525761, @Ruthven wrote:

@Xover Can you suggest a place (can be phabricator as well) where users can interact with more expert ones to discuss such technical issues? I reckon that this is a specific case where such help is needed and requires a longer discussion.

For specific issues (things that can be narrowed down to a concrete "Need access to X") a Phabricator task tagged with ProofreadPage is probably good. I imagine @Inductiveload would not be averse to getting questions on their talk page on enWS, that can eventually turn into a Phabricator task, and if worse comes to worse you're welcome to use my user talk page on enWS for such things. I won't be able to help much directly (at least in the near term), but I'm happy to host the discussions and I have few compunctions about pinging people I think may be able to help. :-)

In T288141#7525875, @Xover wrote:

In T288141#7525761, @Ruthven wrote:

@Xover Can you suggest a place (can be phabricator as well) where users can interact with more expert ones to discuss such technical issues? I reckon that this is a specific case where such help is needed and requires a longer discussion.

For specific issues (things that can be narrowed down to a concrete "Need access to X") a Phabricator task tagged with ProofreadPage is probably good. I imagine @Inductiveload would not be averse to getting questions on their talk page on enWS, that can eventually turn into a Phabricator task, and if worse comes to worse you're welcome to use my user talk page on enWS for such things. I won't be able to help much directly (at least in the near term), but I'm happy to host the discussions and I have few compunctions about pinging people I think may be able to help. :-)

Same for me, I'm open to responding to pings regarding OpenSeadragon anywhere on wiki (as Sohom_data), in Phabricator tasks (@Soda), or on IRC/Matrix (Sohom Datta).

Ruthven awarded a token. Nov 25 2021, 6:25 AM

Ltrlg subscribed. Nov 27 2021, 9:43 AM

Inductiveload closed subtask T295662: OpenSeaDragon (OSD) button tooltips are not translated as Resolved. Jan 4 2022, 9:42 AM

Thanks, the doc really solved our it.wikisource-specific issue. Now we can replace the URL of the page image and load the new image into OpenSeadragon. Nevertheless, there is another issue with the new OCR widget: it does not pick up the new URL, and it returns the OCR of the previously loaded image (while the old OCR buttons work normally). I'd like to take a look at the new OCR widget. Where can I find it?

Alex brollo

@Alex_brollo good to hear, glad it was helpful.

I don't think any formal documentation about how it plugs together at the code level was ever written. The OCR widget itself lives in the Wikisource extension:

/Wikisource/modules/ext.wikisource.OCR/index.js: the main entry point
/Wikisource/modules/ext.wikisource.OCR/OcrTool.js: the "controller" for the OCR process; this is where the request to the OCR backend is made
/Wikisource/modules/ext.wikisource.OCR/ExtractTextWidget.js: the widget in the edit toolbar (which takes the OcrTool as an injected dependency in the constructor)

Notably for this issue, the image URL is set with OcrTool.prototype.setImage, which is called like this in ExtractTextWidget.js (using the original <img> element, which is not necessarily related to what the image viewer is loaded with at the time the OCR runs):

	this.prpImage = $prpImage.find( 'img' )[ 0 ];
	this.ocrTool = ocrTool;
	this.ocrTool.setImage( this.prpImage.src );
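
One possible workaround, sketched under assumptions: if a script holds a reference to the OcrTool instance (how to obtain one is not covered here), it could call setImage again with the URL currently shown in the viewer before OCR is run. The names syncOcrImage and getCurrentUrl are hypothetical; only setImage comes from the code above.

```javascript
// Hypothetical workaround: re-point the OCR tool at the image currently
// displayed, falling back to the original <img> src if none is reported.
function syncOcrImage( ocrTool, getCurrentUrl, fallbackUrl ) {
	// getCurrentUrl is a caller-supplied function (assumption) that returns
	// the URL of the image the viewer is showing, or a falsy value.
	var url = getCurrentUrl() || fallbackUrl;
	ocrTool.setImage( url );
	return url;
}
```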

Can I hope that this issue will be solved? We can use https://ocr.wmcloud.org/ to get the OCR of any page, but it is tricky, and Wikisource needs speed.

alex

@Alex_brollo I have spun it out as T298663. I have no idea if or when it might be actioned, though.

Thanks!

Ruthven mentioned this in T232918: Syntax highlighting shifts the page image below the text on Wikisource. Feb 1 2022, 3:02 PM

Soda closed subtask T283917: Add zoom and pan to the Pagelist Widget as Resolved. Apr 10 2023, 2:14 PM