Improve web interface for http://tools.wmflabs.org/ws-google-ocr/
Closed, Resolved · Public · 2 Estimated Story Points

Assigned To

	Samwilson

Authored By

	kaldari
	Sep 8 2016, 6:35 PM

Description

Some suggestions for improving the web interface at http://tools.wmflabs.org/ws-google-ocr/...

Add an option to the web interface that changes the output to human-readable text rather than JSON. This will allow users to copy and paste OCRed text from the web interface. (Line breaks will need to be converted to <br> tags.) This should be the default option from the web interface.
If the text mode is chosen and the API returns an error, just return the raw error text (maybe in red), rather than JSON.
If the text mode is chosen, just put the OCRed text under the form instead of loading a clean page.
Change "Language code (optional):" to "2-letter language code (optional)" to clarify which language code is expected.
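
The text-mode output described in the suggestions above could be sketched roughly as follows. This is a minimal Python illustration, not the tool's actual code: the function name `render_ocr_result` and its parameters are hypothetical, and the inline red style is just one way to mark an error.

```python
from html import escape


def render_ocr_result(text=None, error=None):
    """Return an HTML fragment for text mode (hypothetical helper)."""
    if error is not None:
        # Show the raw error text in red instead of a JSON payload.
        return '<p style="color: red">' + escape(error) + "</p>"
    # Escape first so OCRed characters such as "<" are not read as markup,
    # then convert line breaks to <br> tags.
    lines = escape(text or "").splitlines()
    return "<br>\n".join(lines)
```

Escaping before inserting the `<br>` tags matters: doing it the other way round would escape the tags themselves.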

Revisions and Commits

rWSOCR Wikisource OCR
	rWSOCR357887240c1b Side-by-side proofreading
	rWSOCRda0bfbdb0610 Localisation and a better front-end

Related Objects

		Status	Subtype	Assigned	Task
		Resolved		Tshrinivasan	T120788 Tool to use Google OCRs in Indic language Wikisource
		Resolved		Samwilson	T145104 Improve web interface for http://tools.wmflabs.org/ws-google-ocr/

Event Timeline

kaldari created this task. Sep 8 2016, 6:35 PM

Restricted Application added a subscriber: Aklapper. Sep 8 2016, 6:35 PM

kaldari renamed this task from Improve http://tools.wmflabs.org/ws-google-ocr/ to Improve web interface for http://tools.wmflabs.org/ws-google-ocr/. Sep 8 2016, 6:35 PM

kaldari moved this task from New & TBD Tickets to Needs Discussion on the Community-Tech board.

kaldari updated the task description. Sep 9 2016, 12:23 AM

kaldari triaged this task as Low priority. Sep 9 2016, 12:34 AM

kaldari raised the priority of this task from Low to Medium.

kaldari set the point value for this task to 2.

kaldari edited projects, added Community-Tech-Sprint; removed Community-Tech.

Samwilson claimed this task. Sep 9 2016, 1:14 AM

Samwilson added a commit: rWSOCRda0bfbdb0610: Localisation and a better front-end. Sep 9 2016, 5:09 AM

I've introduced front-end interface localisation, using Krinkle/intuition (thanks to @Niharika). There are only English messages so far (me being a monoglot and all), but others can easily be added. Is it worth adding this to translatewiki? Is that a hard thing to do?

The API output is unchanged.

The web output is now a side-by-side image and OCR text, with the latter in a text box for easier editing. For example:
http://tools.wmflabs.org/ws-google-ocr/index.php?image=https%3A%2F%2Fupload.wikimedia.org%2Fwikipedia%2Fcommons%2Fthumb%2Ff%2Ff5%2FLudendorffbr%25C3%25BCcke_Gedenktafel_3.jpg%2F781px-Ludendorffbr%25C3%25BCcke_Gedenktafel_3.jpg

I experimented with a few layouts, and this seemed to be the most useful. It might be a bit counter-intuitive that there's a textarea that doesn't actually do anything; what do you think?

I've linked the relevant commits above.

Samwilson moved this task from Ready to In Development on the Community-Tech-Sprint board. Sep 9 2016, 6:14 AM

Samwilson added a commit: rWSOCR357887240c1b: Side-by-side proofreading. Sep 9 2016, 6:17 AM

@Samwilson: Looks pretty good.

The error handling is a bit wonky. Sometimes it outputs "The original image" and an error message:
http://tools.wmflabs.org/ws-google-ocr/index.php?image=https%3A%2F%2Fupload.wikimedia.org%2Fwikipedia%2Fcommons%2F9%2F93%2FBabajiPrarthana.png&lang=hi

... and sometimes it gives me no output at all:
http://tools.wmflabs.org/ws-google-ocr/index.php?image=https%3A%2F%2Fcommons.wikimedia.org%2Fwiki%2FFile%3ACsv_file.pdf&lang=hi

In the first case, we should probably get rid of "The original image" and just show the error message. And of course the second case should also show an error message.

Regarding the textarea, I think that's fine. I don't think it needs an explicit explanation, personally. Maybe you could put a "Copy text to clipboard" button under it instead.

Also, if someone enters a Commons page URL (rather than an upload URL), it should give them a specific error message explaining that they have to use the actual image URL rather than the page URL. Right now, it outputs nothing (or a generic error on the API side):
http://tools.wmflabs.org/ws-google-ocr/index.php?image=https%3A%2F%2Fcommons.wikimedia.org%2Fwiki%2FFile%3ABabajiPrarthana.jpg&lang=hi
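
One way to tell a Commons page URL apart from an actual image URL is to look at the host and path before calling the OCR API. The following is only a sketch in Python: the function name `check_image_url` and the wording of the messages are assumptions, and the real tool may perform this check differently.

```python
from urllib.parse import urlparse, unquote


def check_image_url(url):
    """Return an error message for an unusable URL, or None if it looks fine.

    Hypothetical helper; the message text is illustrative only.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return "Please enter a full image URL."
    path = unquote(parsed.path)
    # A file description page looks like
    # https://commons.wikimedia.org/wiki/File:Name.jpg
    if parsed.netloc == "commons.wikimedia.org" and path.startswith("/wiki/File:"):
        return ("This is a Commons page URL. Please use the actual image URL "
                "(one starting with https://upload.wikimedia.org/) instead.")
    return None
```

Returning a message rather than raising keeps it easy to show the same text in both the web form and the API response.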

Thanks @kaldari. I've fixed up the error handling, and added some more help text to the front-end. The language code input is restricted to two characters now, and the transcription box has a copy-the-text button under it (instead of the help text that was there).

Let me know what you think.
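
The two-character restriction on the language code mentioned in the comment above applies to the form input; the same rule could also be enforced on the server side. A minimal Python sketch follows, assuming the code is optional and consists of two ASCII letters. The name `normalise_lang` and the error text are hypothetical.

```python
import re

# Assumption: a valid code is exactly two ASCII letters, e.g. "hi" or "en".
LANG_CODE = re.compile(r"[a-z]{2}")


def normalise_lang(value):
    """Return a lowercase 2-letter code, or None if no code was given.

    Raises ValueError for anything else (hypothetical helper).
    """
    value = (value or "").strip().lower()
    if not value:
        return None  # the language code is optional
    if not LANG_CODE.fullmatch(value):
        raise ValueError("Language code must be exactly two letters, e.g. 'hi'.")
    return value
```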

kaldari closed this task as Resolved. Sep 12 2016, 2:45 AM

kaldari moved this task from In Development to Q1 2018-19 on the Community-Tech-Sprint board.

DannyH edited projects, added Community-Tech; removed Community-Tech-Sprint. Sep 13 2016, 11:16 PM

DannyH moved this task from Needs Discussion to Archive on the Community-Tech board. Sep 13 2016, 11:41 PM

Bodhisattwa added a parent task: T120788: Tool to use Google OCRs in Indic language Wikisource. Sep 16 2016, 4:01 PM
