
Store PDF extracted text in a structured table instead of img_metadata
Open, Low, Public

Description

Same as T32906 (which covers DjVu), but for PDF.

Event Timeline

aaron created this task. May 15 2015, 5:42 PM
aaron raised the priority of this task from to Needs Triage.
aaron updated the task description.
aaron added a subscriber: aaron.
Restricted Application added a subscriber: Aklapper. May 15 2015, 5:42 PM
aaron set Security to None.
Restricted Application added a project: Multimedia. May 15 2015, 5:43 PM
Nemo_bis added a subscriber: Nemo_bis.

A very useful current feature is that CirrusSearch indexes the text content of files; this may need some adaptation if the text is moved elsewhere.

Restricted Application added a project: Discovery. Aug 22 2015, 6:24 PM
Restricted Application added subscribers: Steinsplitter, Matanya.

> A very useful current feature is that CirrusSearch indexes the text content of files; this may need some adaptation if the text is moved elsewhere.

CirrusSearch just calls methods in the MediaHandler class. It should be possible to change this transparently, without affecting Cirrus or anything else that depends on where the text is stored.
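To illustrate the point about the handler hiding the storage location, here is a minimal sketch in Python rather than MediaWiki's PHP. Every name in it (`TextStore`, `MetadataBlobStore`, `PageTableStore`, `PdfHandler`, `get_entire_text`, `build_search_document`) is hypothetical and chosen only for this example; it is not the actual MediaHandler or CirrusSearch code, and the table layout is an assumption.

```python
from __future__ import annotations

from typing import Optional, Protocol


class TextStore(Protocol):
    """Hypothetical storage backend for extracted file text."""

    def load_text(self, file_name: str) -> Optional[str]: ...


class MetadataBlobStore:
    """Text kept inside a per-file metadata blob (img_metadata-style layout)."""

    def __init__(self, blobs: dict[str, dict]) -> None:
        self._blobs = blobs

    def load_text(self, file_name: str) -> Optional[str]:
        blob = self._blobs.get(file_name)
        if blob is None:
            return None
        pages = blob.get("text")
        return "\n".join(pages) if pages else None


class PageTableStore:
    """Text kept as one row per (file, page, text) in a separate table (assumed layout)."""

    def __init__(self, rows: list[tuple[str, int, str]]) -> None:
        self._rows = rows

    def load_text(self, file_name: str) -> Optional[str]:
        # Collect this file's rows and order them by page number.
        pages = sorted(
            (page, text) for name, page, text in self._rows if name == file_name
        )
        return "\n".join(text for _, text in pages) if pages else None


class PdfHandler:
    """Stand-in for a media handler: callers only ever see get_entire_text()."""

    def __init__(self, store: TextStore) -> None:
        self._store = store

    def get_entire_text(self, file_name: str) -> Optional[str]:
        return self._store.load_text(file_name)


def build_search_document(handler: PdfHandler, file_name: str) -> dict:
    """Stand-in for the search indexer: it never touches the storage layer."""
    return {"title": file_name, "file_text": handler.get_entire_text(file_name)}
```

Under these assumptions, swapping `MetadataBlobStore` for `PageTableStore` changes nothing in `build_search_document`, which is the kind of transparency described above.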

Jdforrester-WMF triaged this task as Low priority. Sep 4 2015, 6:55 PM
Jdforrester-WMF moved this task from Untriaged to Backlog on the Multimedia board. Sep 4 2015, 7:01 PM
Deskana moved this task from Needs triage to Search on the Discovery board. Dec 31 2015, 3:54 AM
Restricted Application added a project: Commons. Dec 31 2015, 3:54 AM
zhuyifei1999 moved this task from Incoming to Backlog on the Commons board. Jan 2 2016, 6:40 AM