As part of the task https://phabricator.wikimedia.org/T404183, which the LPL team is working on in Q3 and Q4 of FY 2025-26, we want an embedding model available in LiftWing.
Details are as follows:
- Model: https://huggingface.co/sentence-transformers/LaBSE
- Architecture: Sentence Transformer
- Capability: Maps text in 109 languages to a shared vector space.
- License: Apache 2.0
- https://embed.toolforge.org/ hosts the LaBSE model with an OpenVINO backend, but that model is quantized.
- An internal KServe API is enough, as we can connect to it from the cxserver production instance. For development purposes we can continue to use embed.toolforge.org, or the model can be run in a local dev environment.
- Expected API endpoint: embeddings returned by the :predict method.
- Expected load: about 5 requests/second; latency: < 300 ms.
- Each request will be a list of template parameter names (short strings), with fewer than 50 items per list.
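
To make the expected request shape concrete, here is a minimal Python sketch of how a client might call the :predict method. It is an illustration under assumptions, not the agreed interface: the `{"instances": [...]}` request body and `{"predictions": [...]}` response body follow the KServe v1 inference protocol, while the base URL, the model name `labse`, and all function names are hypothetical placeholders.

```python
import json
import math
import urllib.request

MAX_NAMES = 50  # the task asks for list sizes under 50


def build_predict_request(names, base_url="http://localhost:8080", model_name="labse"):
    """Build (but do not send) a KServe-v1-style :predict request.

    base_url and model_name are hypothetical; the real values depend on
    how the model is eventually deployed.
    """
    if not names or len(names) >= MAX_NAMES:
        raise ValueError(f"expected between 1 and {MAX_NAMES - 1} parameter names")
    body = json.dumps({"instances": list(names)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/models/{model_name}:predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embeddings(response_body):
    """Extract one embedding per input string from an assumed v1 response."""
    return json.loads(response_body)["predictions"]


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

A caller would pass the request to `urllib.request.urlopen(request, timeout=0.3)` to mirror the 300 ms latency target, feed the response body to `parse_embeddings`, and compare template parameter names across languages with `cosine_similarity`.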