Add toolbar button for OCR cleanup
Open, Needs Triage, Public, Feature

Description

Feature summary (what you would like to be able to do and where):

  • A new toolbar button to run common text replacements/changes on OCR text to improve punctuation, line-wrapping, etc.
  • The specific cleanups done need to be wiki-specific, so they can meet the style guides and make use of templates.
  • An icon needs to be selected or created. Perhaps checkAll (check all.png, 20×20 px) would suffice for now.

Use case(s) (list the steps that you performed to discover that problem, and describe the actual underlying problem which you want to solve. Do not describe only a solution):

  • After loading text from a text layer of a file, or running OCR on an image, the resulting text often has errors that can be fixed by regular expressions or other programmatic changes.
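
As a rough illustration of the kind of cleanup meant here, the following Python sketch applies a list of regular-expression rules to OCR text, with an optional per-wiki rule set layered on top. Everything in it is an assumption made for illustration: the names `COMMON_RULES`, `WIKI_RULES`, and `clean_ocr`, the wiki key `en-example`, and the specific rules are hypothetical and are not taken from any existing extension or gadget.

```python
import re

# Hypothetical shared rules, applied in order: (compiled pattern, replacement).
COMMON_RULES = [
    # Strip spaces and tabs at the end of each line.
    (re.compile(r"[ \t]+$", re.MULTILINE), ""),
    # Rejoin words hyphenated across a line break. This also joins genuine
    # compounds ("well-\nknown" becomes "wellknown"), so a real rule set
    # would need to be more careful.
    (re.compile(r"(\w)-\n(\w)"), r"\1\2"),
    # Unwrap single line breaks but keep blank-line paragraph breaks.
    (re.compile(r"(?<!\n)\n(?!\n)"), " "),
    # Collapse runs of spaces.
    (re.compile(r" {2,}"), " "),
]

# Hypothetical per-wiki rules, so each wiki can follow its own style guide.
WIKI_RULES = {
    "en-example": [
        # Remove stray spaces before punctuation.
        (re.compile(r" +([,;:.!?])"), r"\1"),
    ],
}


def clean_ocr(text, wiki=None):
    """Apply the shared rules, then any rules registered for `wiki`."""
    rules = COMMON_RULES + WIKI_RULES.get(wiki, [])
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


sample = "The quick  brown fox \njumps over the la-\nzy dog .\n\nNew para-\ngraph here ."
print(clean_ocr(sample, wiki="en-example"))
```

Keeping the rules as data rather than code is one way to satisfy the wiki-specific requirement: a wiki could supply its own table (including replacements that insert local templates) without changing the button's logic.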

Benefits (why should this be implemented?):

  • Save time for proofreaders.
  • Fix errors, such as whitespace at the end of lines, that might otherwise be missed by visual inspection.

Other information: