OCR preferences are saved too often
Open, Needs Triage · Public · BUG REPORT

Description

After selecting an OCR engine and a couple of languages, when you navigate to a new page to proofread (and don't change any OCR settings), multiple API requests are sent to save the OCR preferences. None need to be sent unless something is changed.

When opening an editing page, it looks like an API request is sent (to action=options&change=wikisource-ocr…) once at the start after it's read the preference, and then again each time it adds a language to the list, e.g.:

wikisource-ocr={"engine":"google","langs":[],"showOnboarding":false}
wikisource-ocr={"engine":"google","langs":["az-cyrl"],"showOnboarding":false}
wikisource-ocr={"engine":"google","langs":["az-cyrl","ar"],"showOnboarding":false}

It should:

  • not try to save if nothing's changed; and
  • only save once with all data when it does.

The same applies for anon users where the data is saved to localStorage.
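The two requirements above could be sketched roughly as follows. This is only an illustration, not the code in the Wikisource extension: `OcrPrefs`, `serializePrefs`, `PrefsStore`, and the `persist` callback are hypothetical names. The callback stands in for whichever backend applies (the options API for logged-in users, localStorage for anon users).

```typescript
// Hypothetical shape, mirroring the JSON shown in the report.
interface OcrPrefs {
  engine: string;
  langs: string[];
  showOnboarding: boolean;
}

// Serialize with a fixed key order so that equal preferences
// always produce the same string.
function serializePrefs(p: OcrPrefs): string {
  return JSON.stringify({
    engine: p.engine,
    langs: p.langs,
    showOnboarding: p.showOnboarding,
  });
}

// Remembers what was last persisted and skips saves that would not change it.
class PrefsStore {
  private lastSaved: string;
  private readonly persist: (json: string) => void;

  constructor(initial: OcrPrefs, persist: (json: string) => void) {
    this.lastSaved = serializePrefs(initial);
    this.persist = persist;
  }

  // Returns true if a save was issued, false if nothing had changed.
  saveIfChanged(current: OcrPrefs): boolean {
    const next = serializePrefs(current);
    if (next === this.lastSaved) {
      return false;
    }
    this.persist(next);
    this.lastSaved = next;
    return true;
  }
}
```

With something like this, re-applying the stored engine and languages on page load would not trigger a request, because the serialized value matches what was already saved.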

Event Timeline

Restricted Application added a subscriber: Aklapper.

This could be done by only saving the preferences when the config popup is closed.
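A minimal sketch of that idea, assuming the popup keeps a working copy of the preferences and only hands it to a save callback on close. `OcrConfigPopup`, `Prefs`, `setEngine`, `addLang`, and `onSave` are hypothetical names for illustration, not the extension's actual widget API, and "tesseract" is only a sample engine value.

```typescript
type Prefs = { engine: string; langs: string[]; showOnboarding: boolean };

// Copy the preferences so later edits to one copy do not leak into another.
function clonePrefs(p: Prefs): Prefs {
  return { engine: p.engine, langs: p.langs.slice(), showOnboarding: p.showOnboarding };
}

class OcrConfigPopup {
  private draft: Prefs;  // edited while the popup is open
  private saved: Prefs;  // what was last persisted
  private readonly onSave: (prefs: Prefs) => void;

  constructor(loaded: Prefs, onSave: (prefs: Prefs) => void) {
    this.saved = clonePrefs(loaded);
    this.draft = clonePrefs(loaded);
    this.onSave = onSave;
  }

  // Edits only touch the draft; nothing is persisted here.
  setEngine(engine: string): void {
    this.draft.engine = engine;
  }

  addLang(lang: string): void {
    if (this.draft.langs.indexOf(lang) === -1) {
      this.draft.langs.push(lang);
    }
  }

  // Persist once, with all data, and only if the draft differs.
  close(): void {
    if (JSON.stringify(this.draft) === JSON.stringify(this.saved)) {
      return;
    }
    this.saved = clonePrefs(this.draft);
    this.onSave(clonePrefs(this.saved));
  }
}
```

Under this design, populating the language list one entry at a time on page load never produces a save, and a user who changes both the engine and the languages causes a single request when the popup closes.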

Change #1019436 had a related patch set uploaded (by Kolakachi; author: Kolakachi):

[mediawiki/extensions/Wikisource@master] Fixed OCR preferences are saved too often

https://gerrit.wikimedia.org/r/1019436

@theprotonade @SGill

I'm not sure if this is the right place to ask questions; please let me know if I should move this elsewhere.

I am participating in Outreachy and I want to recreate this issue locally. My understanding is that I need to run MediaWiki locally and then enable the Wikisource and ProofreadPage extensions. Below is what I have tried.

OPERATING SYSTEM - macOS

  • Installed MAMP as directed here.
  • Ran composer update at the root of the mediawiki folder, installed and set up MediaWiki using MAMP, configured the DB, and downloaded LocalSettings.php.
  • Inside LocalSettings.php, changed $wgDBserver from $wgDBserver = "localhost"; to $wgDBserver = "localhost:/Applications/MAMP/tmp/mysql/mysql.sock"; as stated here again.
  • Downloaded the Wikisource and ProofreadPage extensions, in addition to all the extensions listed as requirements under ProofreadPage. All extensions were downloaded from Gerrit (latest versions).
  • Added all extensions to the extensions folder and updated LocalSettings.php accordingly.

Screenshot 2025-03-19 at 01.08.24.png (2×2 px, 762 KB)

BLOCKER

I have not been able to actually use the Wikimedia OCR in my local setup. I have tried with no success to create an Index namespace and link it to Page namespace in order for ProofreadPage to kick in.

  • All Extensions appear on Special:Version and MediaWiki starts up

Screenshot 2025-03-19 at 00.34.33.png (1×2 px, 478 KB)

Screenshot 2025-03-19 at 00.35.03.png (1×2 px, 325 KB)

  • Below is the form I see when I attempt to create an Index namespace

Screenshot 2025-03-19 at 00.37.27.png (1×2 px, 252 KB)

Screenshot 2025-03-19 at 00.39.32.png (483×1 px, 112 KB)

  • Picture of Page Namespace

Screenshot 2025-03-19 at 00.40.30.png (1×2 px, 575 KB)

I have not been able to find clear instructions on what to do next. Any help will be greatly appreciated.

These screenshots look like you are on the right path. Did you try clicking on the Edit button on the Page namespace and see if you are getting the OCR widget on the editing interface? You might also need the WikiEditor extension for the toolbar to take effect.

Change #1128524 had a related patch set uploaded (by Aklapper; author: Osuji pius):

[mediawiki/extensions/Wikisource@master] Fix OCR preferences being saved too often

https://gerrit.wikimedia.org/r/1128524

@theprotonade
I have installed WikiEditor.
Following the instructions here (Wikisource Wikimedia OCR), I also installed Wikimedia OCR. Locally I now have:
Wikimedia OCR at http://ocr.local:8888
MediaWiki at http://mediawiki.local:8888

Screenshot 2025-03-20 at 11.58.19.png (2×3 px, 296 KB)

Screenshot 2025-03-20 at 12.00.30.png (1×3 px, 511 KB)

I created
Index:Carroll_-_Alice%27s_Adventures_in_Wonderland.djvu
File:Carroll_-_Alice%27s_Adventures_in_Wonderland.djvu
Page:Carroll_-_Alice%27s_Adventures_in_Wonderland.djvu/1

But when I open Page:Carroll, I don't see the Transcribe Text widget as shown here (Wikisource Page showing Transcribe).

Screenshot 2025-03-20 at 11.59.17.png (2×3 px, 2 MB)

Screenshot 2025-03-20 at 11.58.55.png (2×3 px, 2 MB)

Have you got $wgWikisourceEnableOcr and $wgWikisourceOcrUrl set in your LocalSettings.php? https://www.mediawiki.org/wiki/Extension:Wikisource#Configuration_parameters

Change #1130260 had a related patch set uploaded (by Osuji pius; author: Osuji pius):

[mediawiki/extensions/Wikisource@master] Fix OCR preferences being saved too often

https://gerrit.wikimedia.org/r/1130260

Change #1128524 abandoned by Osuji pius:

[mediawiki/extensions/Wikisource@master] Fix OCR preferences being saved too often

Reason:

Duplicate Patch exists here - https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikisource/+/1130260

https://gerrit.wikimedia.org/r/1128524

Thank you! That worked; I can now see the Transcribe Text widget.
I made another push for this issue: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikisource/+/1130260

Change #1131670 had a related patch set uploaded (by Aklapper; author: Nidhicodes):

[mediawiki/extensions/Wikisource@master] Fix: Prevent unnecessary OCR preference saves in Wikimedia OCR

https://gerrit.wikimedia.org/r/1131670

Change #1136358 had a related patch set uploaded (by Vicolas11; author: Vicolas11):

[mediawiki/extensions/Wikisource@master] Fix OCR preferences are saved too often

https://gerrit.wikimedia.org/r/1136358

Change #1130260 merged by jenkins-bot:

[mediawiki/extensions/Wikisource@master] Fix OCR preferences being saved too often

https://gerrit.wikimedia.org/r/1130260