Searched For: textcat
We want to spread the usefulness of Language Identification (via TextCat) to non-Wikipedia wikis.
Rather … give us insight into whether the TextCat configs can be straightforwardly … of the long tail of wiki projects.
ase new version
[ ] Update MediaWiki / etc. to use
[ ] wikimedia/textcat
[ ] Update to modern standards … [x] wikimedia/zest-css (@cscott)
t, regex, etc). A dimension on query length might also be useful.
… platform dashboards are updated to reflect this new tagging proposal
guage is //prioritized//, other languages should be //available//.)
fer by the case of the first letter).
Language identification via TextCat is currently case-sensitive beca … ge ID results, using the current TextCat params, to get a sense of the sc … accuracy changes for case-folded TextCat models, using the currently opti … w sub-ticket) to re-optimize the TextCat params for the case-folded and s … at least not in a case like this.
… training on query data or other data) in "the lab" or in production.
A/B test to be able to decide whether Method 1 is worth deploying.
_Wrong_Keyboard%E2%80%94Russian_and_English | here ]].
We can use TextCat language detection to detect the … 6019}
- Similar task: {T155104}
db
* data-values
* php-session-serializer
* utfnormal
* wikimedia/textcat
I suggest `mediawiki/libs`, by … make the old locations read-only.
moved.
Once the model is updated, it should be tested against the TextCat regression test sets to make sur … ure everything works as expected.
and note which is which in the docs; more complex solution: allow TextCat to take more complex specificati … estigate later?) [highly desired]