Advanced Tools User Experience Improvements
Open, Needs Triage | Public | 3 Estimated Story Points

Description

As an advanced tools OCR user I can:

  • Understand what the page can be used for
  • Understand a URL is necessary to upload an image
  • Understand which options work best for my need
  • Have all the options for Tesseract OCR
  • Choose the document's language for OCR

Visual References:

Empty State

Empty.jpg (675×1 px, 104 KB)

Google OCR

Google.jpg (675×1 px, 103 KB)

Tesseract OCR

Tesseract.jpg (904×1 px, 146 KB)

Figma Link

Event Timeline

Hey @nayoub, am I understanding correctly from this Slack thread that the designs can be added to this ticket, and that the ticket can be unassigned so that an engineer can pick it up? Thanks!

nayoub added a subscriber: nayoub.
nayoub added a subscriber: MusikAnimal.

Thanks @ldelench_wmf! The ticket description has been updated with the latest designs, and I've unassigned myself from it too :)
Please let me know if there's any issue, thanks! cc @NRodriguez @MusikAnimal

Thanks @nayoub, looking great.

We had discussed removing the metadata from the radio buttons. I see it removed in the third screengrab but not in the second; could we remove it there too, to avoid any confusion?

image.png (262×524 px, 57 KB)

Apologies for the confusion, updated :)

@nayoub A couple of questions:

  • Could we get the SVG for the logo? T282960: Create logo for Wikimedia OCR is the main task for this.
  • I hope it isn't important to use the OOUI styles (like the "Transcribe" button), and Bootstrap is okay? We can use OOUI but it's a big library for what seems like only minor styling changes.
  • It's unclear how the language selector is supposed to work. I might be looking in the wrong place but I don't see any designs in the Figma page other than the default "Auto" state. It looks like a normal <select> input which allows only one selection. We will need to allow for any number of languages to be added. Is the current implementation with Select2 satisfactory for now?
  • I know this sounds odd, but currently there has to be an engine selected in order for the view to render. How important is it to have the "empty" state where no engine is selected yet? If we're okay with a default engine (currently set to Google but should probably be changed to Tesseract), that will shave off a few points from this task.

Hi @MusikAnimal:

  • Just added the SVG on the ticket T282960 (I already sent it to Sam earlier).
  • It's totally fine to use Bootstrap components instead of OOUI in this case, as we did for the WS Export page.
  • We can keep the current language selector, it'd be nice if we have the possibility to have a default state (in the case of the mock "Auto") when there's no user input at first. Do you think it could be possible?
  • We can select Tesseract if you prefer, although Google might make more sense as options would appear when Tesseract is selected (and it might be odd for the Tesseract options to disappear after selecting Google?). But again, if Tesseract makes more sense, it shouldn't be a major UX issue.

Thanks!

> We can keep the current language selector, it'd be nice if we have the possibility to have a default state (in the case of the mock "Auto") when there's no user input at first. Do you think it could be possible?

We could add placeholder text. "Auto" by itself might look a little weird, but we could have a sentence "Leave blank for automatic language detection". How does that sound? Also we can add any number of little clickable tooltip icons (?) next to the labels that provide more information, as desired.
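
To illustrate the "leave blank for automatic detection" behaviour proposed above, here is a minimal sketch in TypeScript. All names (`LANGUAGE_PLACEHOLDER`, `LanguageChoice`, `resolveLanguages`) are hypothetical and not taken from the tool's source; the sketch only assumes that the selector hands back a list of language codes, possibly empty.

```typescript
// Hypothetical names throughout; this is not the tool's actual code.

// Placeholder wording proposed in this thread, shown while nothing is selected.
const LANGUAGE_PLACEHOLDER = "Leave blank for automatic language detection";

interface LanguageChoice {
  auto: boolean;       // true when the user selected no language at all
  languages: string[]; // de-duplicated language codes, in selection order
}

// Turns the raw values of a multi-select into an explicit choice:
// an empty selection means "detect the language automatically".
function resolveLanguages(selected: readonly string[]): LanguageChoice {
  const cleaned = selected
    .map((code) => code.trim())
    .filter((code) => code.length > 0);
  // A Set keeps insertion order, so the user's ordering is preserved.
  const languages = Array.from(new Set(cleaned));
  return { auto: languages.length === 0, languages };
}
```

Keeping "auto" as the absence of a selection, rather than as a selectable "Auto" entry, avoids having to define what "Auto" plus an explicit language would mean.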

> We can select Tesseract if you prefer, although Google might make more sense as options would appear when Tesseract is selected (and it might be odd for the Tesseract options to disappear after selecting Google?). But again, if Tesseract makes more sense, it shouldn't be a major UX issue.

I had the same thinking, that the Google form is simpler hence the better default. I'm going to try to do an empty state like you have in the mocks, and just time box it. If it's too much work we can create a new task for that and triage accordingly.
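
The two options discussed here (an empty state versus a default engine) differ only in what happens when no engine has been chosen yet. A minimal sketch in TypeScript, assuming the engine arrives as an optional string such as a query parameter; the identifiers `"google"` and `"tesseract"` and the function names are assumptions for illustration, not the tool's actual code.

```typescript
// Hypothetical names; engine identifiers are assumed from the two engines named in this task.
type Engine = "google" | "tesseract";

const ENGINES: readonly Engine[] = ["google", "tesseract"];

// Type guard: true only for one of the known engine identifiers.
function isEngine(value: string): value is Engine {
  return (ENGINES as readonly string[]).indexOf(value) !== -1;
}

// Returns the requested engine when it is valid, otherwise the fallback.
// Passing `null` as the fallback models the "empty" state from the mocks;
// passing an engine models the "default engine" alternative.
function resolveEngine(requested: string | null, fallback: Engine | null): Engine | null {
  if (requested !== null && isEngine(requested)) {
    return requested;
  }
  return fallback;
}
```

With this shape, switching between the empty state and a default engine is a one-argument change, which fits the plan to time box the empty state and fall back if it proves too costly.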

> We could add placeholder text. "Auto" by itself might look a little weird, but we could have a sentence "Leave blank for automatic language detection". How does that sound? Also we can add any number of little clickable tooltip icons (?) next to the labels that provide more information, as desired.

Sounds perfect! Tooltips would be very nice and probably look better but I worry that users might not click/hover on the "i" and thus miss helpful information. What do you think?

> I had the same thinking, that the Google form is simpler hence the better default. I'm going to try to do an empty state like you have in the mocks, and just time box it. If it's too much work we can create a new task for that and triage accordingly.

Awesome! :) Worst case, if it's too much work, we can default to Tesseract also.

@nayoub Another heads up: I was going to make the content area smaller than the Bootstrap default, just like we did for WS Export, but then it occurred to me that the result page (when we show the image and transcription side-by-side) might benefit from a larger content area. Say, if the user wanted to spot check the OCR's transcription, they'll want a larger sized image to view. Do you think it's okay to stick with the Bootstrap default, which is 1170px for large screens? I suppose it's also possible to have the unsubmitted form downsized as you have it in the mocks, then when displaying results we use the Bootstrap width (or even full-width). Let me know if you've any opinion! I will definitely share a screenshot of what I have for your approval before putting it up for code review.

@MusikAnimal Thanks for letting me know! Do you think it would be possible to keep the larger content area while reducing the size of the input fields (similar to the mocks)? That way we can offer a large side-by-side transcription/proofread experience without having very wide fields in the forms above. What do you think of this "hybrid" solution?

MusikAnimal set the point value for this task to 3. (Jun 24 2021, 10:41 PM)
Samwilson added a subscriber: Samwilson.

Merged, deployed to test, and ready for QA.

Not sure why, but the auto-deploy failed on the build step. I ran it manually and it was fine. The log from the failed auto-deploy is below.

0 info it worked if it ends with ok
1 verbose cli [ '/usr/bin/node', '/usr/bin/npm', 'run', 'build' ]
2 info using npm@6.14.12
3 info using node@v12.22.1
4 verbose run-script [ 'prebuild', 'build', 'postbuild' ]
5 info lifecycle @~prebuild: @
6 info lifecycle @~build: @
7 verbose lifecycle @~build: unsafe-perm in lifecycle true
8 verbose lifecycle @~build: PATH: /usr/lib/node_modules/npm/node_modules/npm-lifecycle/node-gyp-bin:/var/www/tool/node_modules/.bin:/usr/bin:/bin
9 verbose lifecycle @~build: CWD: /var/www/tool
10 silly lifecycle @~build: Args: [ '-c', 'encore production --progress' ]
11 silly lifecycle @~build: Returned: code: 1  signal: null
12 info lifecycle @~build: Failed to exec build script
13 verbose stack Error: @ build: `encore production --progress`
13 verbose stack Exit status 1
13 verbose stack     at EventEmitter.<anonymous> (/usr/lib/node_modules/npm/node_modules/npm-lifecycle/index.js:332:16)
13 verbose stack     at EventEmitter.emit (events.js:314:20)
13 verbose stack     at ChildProcess.<anonymous> (/usr/lib/node_modules/npm/node_modules/npm-lifecycle/lib/spawn.js:55:14)
13 verbose stack     at ChildProcess.emit (events.js:314:20)
13 verbose stack     at maybeClose (internal/child_process.js:1022:16)
13 verbose stack     at Process.ChildProcess._handle.onexit (internal/child_process.js:287:5)
14 verbose pkgid @
15 verbose cwd /var/www/tool
16 verbose Linux 4.19.0-16-cloud-amd64
17 verbose argv "/usr/bin/node" "/usr/bin/npm" "run" "build"
18 verbose node v12.22.1
19 verbose npm  v6.14.12
20 error code ELIFECYCLE
21 error errno 1
22 error @ build: `encore production --progress`
22 error Exit status 1
23 error Failed at the @ build script.
23 error This is probably not a problem with npm. There is likely additional logging output above.
24 verbose exit [ 1, true ]
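
The log above records only that the `build` script exited with status 1; the reason `encore` failed would be in the build's own output, which is not included here. For triaging similar failures, a minimal sketch in TypeScript of pulling the relevant fields out of such a log follows. The helper name `summarizeNpmFailure` is hypothetical, and the parsing assumes the `<n> error ...` line layout seen in this particular npm 6 log.

```typescript
// Hypothetical helper: summarises the failure recorded in an npm debug log.
// It assumes the "<n> error ..." line layout seen in the log above.
interface NpmFailure {
  code: string | null;       // error code, e.g. "ELIFECYCLE"
  script: string | null;     // lifecycle script that failed, e.g. "build"
  command: string | null;    // shell command behind that script
  exitStatus: number | null; // exit status reported for the command
}

function summarizeNpmFailure(log: string): NpmFailure {
  const result: NpmFailure = { code: null, script: null, command: null, exitStatus: null };
  for (const rawLine of log.split("\n")) {
    const line = rawLine.trim();

    // e.g. "20 error code ELIFECYCLE"
    const code = /^\d+ error code (\S+)$/.exec(line);
    if (code) {
      result.code = code[1] ?? null;
      continue;
    }

    // e.g. "22 error @ build: `encore production --progress`"
    const script = /^\d+ error \S* (\S+): `(.*)`$/.exec(line);
    if (script) {
      result.script = script[1] ?? null;
      result.command = script[2] ?? null;
      continue;
    }

    // e.g. "22 error Exit status 1"
    const status = /^\d+ error Exit status (\d+)$/.exec(line);
    if (status) {
      result.exitStatus = Number(status[1]);
    }
  }
  return result;
}
```

Fields that cannot be found stay `null`, so a log in a different layout yields an empty summary rather than a wrong one.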

@NRodriguez @nayoub I notice the "Copy to clipboard" button is now in a different place. Is this deliberate?

Old UI (as it appears on https://ocr.wmcloud.org):

before.png (893×1 px, 417 KB)

New UI (as it appears on https://ocr-test.wmcloud.org):

after.png (967×1 px, 426 KB)

Hi @dom_walden, yes, it is deliberate: it keeps the button consistent with the OCR button and increases visibility for the CTA.

Deliberate indeed. To check my assumption: there's no way that the text can ever clash with the button? I was worried about the text and the button covering each other, but now that I think about it they are separate elements on the page, not superimposed, so it should be fine.

@NRodriguez the button shouldn't clash with the text since as you said they're different elements. There might be edge cases though where the text appears underneath the button, which then could be a user pain point.

@NRodriguez and @nayoub I have done some functional and browser compatibility testing and it looks fine. In the screenshots below I have highlighted some differences between the beta page and the visual reference. Could you confirm whether this looks okay, and please update this ticket's description with the latest visual reference? Thank you.

Screen Shot 2021-07-15 at 10.29.00 AM.png (1×2 px, 686 KB)

Screen Shot 2021-07-15 at 11.30.32 AM.png (1×2 px, 1 MB)

Screen Shot 2021-07-15 at 11.28.52 AM.png (1×2 px, 1 MB)

Hi @imaigwilo, thanks for sending all of those over, and apologies for not updating the ticket with the latest visual references. @MusikAnimal and I made some changes as he was developing the front-end, so there have been some updates that aren't reflected in the visual reference. Everything looks good; the only exception is the beta environment, which should not be reflected in the production release.