Backport newer tesseract-ocr-* packages from sid
Closed, Invalid · Public

Description

If this can be done without too much manual effort, this would be helpful for all tools which use tesseract. The packages are incredibly simple:

valhallasw@tools-bastion-01:/usr/share/tesseract-ocr/tessdata$ dpkg -L tesseract-ocr-srp
/.
/usr
/usr/share
/usr/share/doc
/usr/share/doc/tesseract-ocr-srp
/usr/share/doc/tesseract-ocr-srp/copyright
/usr/share/doc/tesseract-ocr-srp/changelog.Debian.gz
/usr/share/tesseract-ocr
/usr/share/tesseract-ocr/tessdata
/usr/share/tesseract-ocr/tessdata/srp.traineddata

so it should be possible to just install the newer packages.

Event Timeline

valhallasw raised the priority of this task from to Needs Triage.
valhallasw updated the task description.
valhallasw added a project: Toolforge.
valhallasw added subscribers: valhallasw, Billinghurst, Phe and 4 others.
Restricted Application added a project: Cloud-Services. Nov 14 2015, 4:45 PM

@valhallasw I think that my inner stupid has a (significant) knowledge gap on doing this on toollabs, and would need at least some guidance/handholding. Are you saying that anyone with suitable access can do this? [I suppose I have a hesitation about f'ing up something important.] So some clarification in that regard would be helpful. If this is something that is more widely open, then perhaps we could look to a page at wikitech that covers this sort of upgrade. Thanks.

chasemp triaged this task as Low priority. Nov 30 2015, 4:45 PM
chasemp added a subscriber: chasemp.

@yuvipanda had an even better idea: we don't need to actually backport them, we can just install the newer packages. I'm going to take a look at doing this.

Meh. Apparently there have also been changes to the OCR engine, as the data files explicitly set

Breaks: tesseract-ocr (<< 3.04.00-1)

and we only have 3.03.02-3 installed...

hmm, so we have to backport that I guess...
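
The conflict described above can be checked mechanically. The following is a minimal sketch, not a record of what was actually run: the helper name `version_lt` and the two variables are made up for illustration, and the `sort -V` fallback is only an approximation of Debian's version ordering for hosts without dpkg.

```shell
#!/bin/sh
# Succeeds (exit 0) when version $1 is strictly lower than version $2,
# which is the meaning of the "<<" operator in a Debian Breaks: field.
# version_lt is a hypothetical helper name, not part of any package.
version_lt() {
    if command -v dpkg >/dev/null 2>&1; then
        # dpkg implements the real Debian version ordering.
        dpkg --compare-versions "$1" lt "$2"
    else
        # Approximation only: GNU version sort, for hosts without dpkg.
        [ "$1" != "$2" ] &&
            [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
    fi
}

installed="3.03.02-3"       # tesseract-ocr version on trusty, per the comment above
breaks_below="3.04.00-1"    # from "Breaks: tesseract-ocr (<< 3.04.00-1)"

if version_lt "$installed" "$breaks_below"; then
    echo "tesseract-ocr $installed is broken by the newer data packages"
else
    echo "tesseract-ocr $installed satisfies the constraint"
fi
```

With the versions quoted in this task, the first branch is taken, which is why the engine itself has to be backported along with the data files.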

Thanks to package_builder and a bit more fiddling, I now have a .deb for tesseract-ocr 3.04.00-5.

@Billinghurst:

  • do you foresee any problems with upgrading tesseract-ocr from 3.03.02-3 (on trusty) to 3.04.00-5?
  • is it OK to only upgrade on trusty and not on precise? I'd rather build fewer packages than more.

I am no expert on these matters, though I would think that generally the community would be looking for the most modern, stable, and widely effective version of the software. So for bnWS it sounds like the update is it; for the remainder of the communities, I would expect that, as it is a July 2015 release, it is stable. So I say go for it, as long as we can maintain a rollback option in case it fubars. I also think that @yuvipanda only wants us on trusty, so that too sounds okay. That said, I have no idea how @Phe has it configured to run. Let me try and find you on IRC, as I think that I can show you.

valhallasw moved this task from Triage to Backlog on the Toolforge board. May 27 2016, 12:21 PM
Tpt closed this task as Invalid. Mar 12 2019, 8:25 PM
Tpt added a subscriber: Tpt.

We have installed the latest packages from stretch-backports. I believe the goal of this task is now achieved.