
Test and analyze Kuromoji & Sudachi Japanese language analyzers
Closed, ResolvedPublic13 Estimated Story Points

Description

User Story: As a user of a Japanese-language wiki, I'd like better language processing than overlapping bigrams. The Kuromoji or Sudachi analyzers might well be up to the task.

Japanese is a major language (13th most speakers) with a large Wikipedia (also 13th by article count), a robust on-wiki community (5th by active users), and high search volume (6th by unique queries). The language and its writing system are complex, and word segmentation is particularly challenging, but overall it is well-supported by modern NLP libraries, including ones available for Lucene (and thus Elasticsearch / OpenSearch), such as Kuromoji.

Nonetheless, we currently use a very simplistic approach to parsing Japanese, namely overlapping bigrams. In English, this would be not quite as bad as parsing statesman into st, ta, at, te, es, sm, ma, and an, searching on those bigrams, and trying not to be surprised when "International Politics and the Establishment of Presbyterianism in the Channel Islands" is returned as the top result.
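The statesman analogy above can be sketched in a few lines (illustrative Python, not the actual CJK bigram tokenizer code):

```python
def overlapping_bigrams(text):
    """Split text into overlapping character bigrams, roughly what the
    CJK bigram approach does to Japanese text."""
    return [text[i:i + 2] for i in range(len(text) - 1)]

print(overlapping_bigrams("statesman"))
# ['st', 'ta', 'at', 'te', 'es', 'sm', 'ma', 'an']
```

Any document containing all of those bigrams matches, whether or not it contains the word statesman.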

The current scattershot bigram approach is much better than nothing (i.e., requiring exact string matches), but it is not very precise—which is why we have previously moved away from it for Chinese and Korean.

It's been a bit more than five years since we last looked at Kuromoji (T166731). In that time, it has probably gotten better, and I expect my ability to deal with shortcomings in analyzers has also gotten better.*

────────
   * Experience is something you don't get until right after you need it.
 

Acceptance Criteria:

  • A write up of findings on the Kuromoji + Sudachi analyzers
  • Either...
    • ...include reasons why Kuromoji / Sudachi are unacceptable in the write up, or
    • ...a patch implementing the Kuromoji analyzer, the Sudachi analyzer, or both

Note: Updated the task description to include Sudachi and to end with the analysis changes, dropping the measurement criteria because we are going to delay deployment during the OpenSearch migration.

Event Timeline

TJones set the point value for this task to 13.Sep 26 2022, 4:01 PM

I'm on the fence between 8 & 13 story points (can I say 10?), so I'm going with the bigger number until we talk about it at a later meeting.

Moving this back to the backlog to focus on more straightforward unpacking. CJK analyzer unpacking for Japanese (T326822) is still underway.

TJones triaged this task as High priority.Jan 13 2023, 9:28 PM
TJones lowered the priority of this task from High to Medium.Mar 6 2023, 6:26 PM
TJones raised the priority of this task from Medium to High.Sep 19 2024, 2:47 PM
Gehel updated the task description. (Show Details)

Have there been evaluations for other tokenizers? Would love to see an evaluation of sudachi if there’s bandwidth

> Have there been evaluations for other tokenizers? Would love to see an evaluation of sudachi if there’s bandwidth

@tchin, be the bandwidth you want to see in the world!

The samples I gave you to look at are for Kuromoji and the ICU tokenizer—which uses a different (newer? bigger?) dictionary. I added the ICU tokenizer to the mix because we've had good results with it elsewhere, and if we use it, we get ok to good tokenization in several other spaceless Asian languages at once, which is nice. So that's one other tokenizer. Sudachi seems like it would be another good one to look at.
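For anyone who wants to compare tokenizer output directly, both tokenizers can be exercised against a cluster with the _analyze API (this assumes the analysis-icu and analysis-kuromoji plugins are installed; the sample text is illustrative):

```json
POST /_analyze
{
  "tokenizer": "icu_tokenizer",
  "text": "日本語の文章"
}
```

Swapping in "kuromoji_tokenizer" for "icu_tokenizer" in the same request shows how the two dictionaries segment the same sentence differently.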

It looks like Sudachi has a plugin for our current version of Elasticsearch, and it can at least be built for OpenSearch 2.6–2.17. I'm not sure what our initial target version of OpenSearch 2 is, but if Sudachi is awesome it might be worth it to aim for 2.6+. (Alternatively, it might be easy enough to compile Sudachi for a slightly lower version of OpenSearch if that's necessary for some reason.)

So! I will try to install and investigate the Sudachi plugin and look for the issues I usually look for. (It's surprising how many analyzers will just eat foreign scripts, or go nuts on a particular rare punctuation character, etc.) If I can get it running next week, I'll re-tokenize the same sample sentences and put them where you can review them, if you are up for it.

My notes on Kuromoji are now on MediaWiki. I still need to add my Sudachi notes and the results of the speaker review of the tokenization, but I'm trying to share reasonably coherent chunks as I finish writing them up.

My notes on Sudachi have been added to my ongoing documentation, along with some basic info on load speed.

Sudachi has a lot of quirks, and it is slow.... but maybe it's worth it? (Foreshadowing!)

If you have any questions, I could try to ask them on the Sudachi Slack?

I've added notes on the speaker review—thanks @tchin & @jeena!! It includes lots of numbers and a lovely graph showing that Sudachi really is better at processing Japanese text, despite its quirks.

Up next I want to commit my custom config for Kuromoji and Sudachi, then I will close this ticket and open a new one for reviewing dictionaries, and try to figure out how to use a custom dictionary for either Kuromoji or Sudachi, to see if we can either upgrade Kuromoji's parsing or tame some of Sudachi's quirks for our use cases. We might fall back to Kuromoji if the Sudachi dictionary can be made to work well with it, since Kuromoji appears to be better supported for OpenSearch 1.x. More details in the notes.

> If you have any questions, I could try to ask them on the Sudachi Slack?

I don't think I have any questions of the "how do I do this?" sort. I have a few "why does it work like this?" questions, but they don't need to be answered. I have some suggestions for improvements, which could go through Slack, but I was planning to just open a ticket on GitHub.

I opened a MegaTicket™ on the Sudachi GitHub repo describing a lot of the issues I found, in case they want to address any of them. I don't expect them to move fast enough to solve our problems before we want to deploy stuff—and they may not agree that all my issues need fixing on their end—but any incremental progress is still good progress.

TJones tried to set Final Story Points to "21? A million?" but it didn't work. Phab has no sense of humor.

New plan: rather than introduce the complexity of a custom dictionary, I found decent ways to hack the analysis config for Sudachi to get good output with the default dictionary. word_delimiter_graph is your friend. See more on MediaWiki.
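As a rough sketch of the kind of config hack meant here—the filter name and option values are illustrative, not the actual merged settings—word_delimiter_graph can be layered on top of the Sudachi tokenizer to re-split and clean up tokens:

```json
{
  "settings": {
    "analysis": {
      "filter": {
        "ja_word_delim": {
          "type": "word_delimiter_graph",
          "preserve_original": true
        }
      },
      "analyzer": {
        "text": {
          "type": "custom",
          "tokenizer": "sudachi_tokenizer",
          "filter": ["ja_word_delim", "lowercase"]
        }
      }
    }
  }
}
```

The point of the hack is that problematic Sudachi tokens get post-processed by a standard, well-understood filter instead of requiring a custom dictionary build.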

In addition to being easier to implement and maintain, having a good config for both Kuromoji and Sudachi (with "better" Sudachi overriding "good enough" Kuromoji when both are available) will allow us to fall back gracefully to Kuromoji if porting Sudachi to OpenSearch 1.x turns out not to be possible or worth the effort, and to upgrade back to Sudachi when we reach OpenSearch 2.x.

We could stick with Kuromoji until we reach OpenSearch 2.x, but I suggest we go with Sudachi now and try to port Sudachi to OpenSearch 1.x.

A patch with wiki-optimized configs for Sudachi and Kuromoji is forthcoming.

Change #1120964 had a related patch set uploaded (by Tjones; author: Tjones):

[mediawiki/extensions/CirrusSearch@master] Custom Japanese Config with Kuromoji and Sudachi

https://gerrit.wikimedia.org/r/1120964

TJones renamed this task from Test and analyze Kuromoji Japanese language analyzer to Test and analyze Kuromoji & Sudachi Japanese language analyzers.Feb 19 2025, 7:46 PM
TJones updated the task description. (Show Details)

After discussion in today's Wednesday Meeting, I've changed the scope of this ticket to end with merging the updated config above.

We're going to hold off on enabling Sudachi until after the OpenSearch 1.x migration. We can then look into backporting the Sudachi plugin to OpenSearch 1.x (and if that doesn't work out, perhaps enabling Kuromoji in the meantime).

Work on an MLR model for Japanese will be postponed until after the OpenSearch migration.

I'll open a couple more tickets related to Sudachi to cover the remaining work.

Change #1124183 had a related patch set uploaded (by Tjones; author: Tjones):

[mediawiki/extensions/CirrusSearch@master] Explicitly Declare icu_normalizer Char Filter

https://gerrit.wikimedia.org/r/1124183

Change #1120964 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@master] Custom Japanese Config with Kuromoji and Sudachi

https://gerrit.wikimedia.org/r/1120964

Change #1124183 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@master] Explicitly Declare icu_normalizer Char Filter

https://gerrit.wikimedia.org/r/1124183