Add Arabic model to kraken OCR
Open, Needs Triage · Public · Feature

Description

Wikimedia's Kraken OCR currently does not have any Arabic models. Please add these Arabic models:

Event Timeline

Restricted Application added subscribers: Gerges, alaa, Aklapper.

Hi @hubaishan

I would like to take this task, and I have a few questions about it. I have downloaded the repo from https://github.com/wikimedia/wikimedia-ocr.

  1. Kraken isn't integrated yet. Should implementing the full Kraken engine (not just the Arabic models) be in scope here?
  2. Is Kraken already installed on the Toolforge server?
  3. Should both Zenodo models (all_arabic_scripts.mlmodel + arabic_best.mlmodel) be added, or just the Arabic-specific one?
  4. Should RTL mode be enabled by default for Arabic?

Thanks!

Hi @Samwilson

Could you clarify the questions above regarding this task?