
Host sentence-transformers/LaBSE model in LiftWing
Open, Needs Triage · Public · Feature

Description

As part of https://phabricator.wikimedia.org/T404183, a task the LPL team is working on in Q3 and Q4 of FY2025-26, we want an embedding model available on LiftWing.

Details as follows:

  • Model: https://huggingface.co/sentence-transformers/LaBSE
  • Architecture: Sentence Transformer
  • Capability: maps text in 109 languages to a shared vector space.
  • License: Apache 2.0
  • https://embed.toolforge.org/ hosts the LaBSE model with an OpenVINO backend, but that copy is quantized.
  • An internal KServe API is enough, since we can reach it from the cxserver production instance. For development we can continue to use embed.toolforge.org, or run the model in a local dev environment.
  • Expected API endpoint: embeddings from the :predict method.
  • Expected load: about 5 requests/second, with latency under 300 ms.
  • Each request will be a list of template parameter names (small strings), with list size under 50.
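To make the request contract above concrete, here is a minimal sketch of building a :predict request body. The helper name and the batch check are illustrative, not part of the deployed service; the payload shape (`{"input": [...]}`) matches the client code later in this task.

```javascript
// Build a :predict request body from a list of template parameter names,
// enforcing the stated constraint of a list size under 50.
function buildPredictRequest(parameterNames) {
  if (parameterNames.length >= 50) {
    throw new Error("Batch too large: keep list size under 50");
  }
  return JSON.stringify({ input: parameterNames });
}

const body = buildPredictRequest(["date", "author", "title"]);
console.log(body); // {"input":["date","author","title"]}
```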

Event Timeline

Change #1237731 had a related patch set uploaded (by Santhosh; author: Santhosh):

[machinelearning/liftwing/inference-services@main] Add sentence-transformers/LaBSE model

https://gerrit.wikimedia.org/r/1237731

Here is a Node.js client that computes cosine similarity using the above service.
Pull the above patch, then:

docker compose build labse-embeddings
docker compose up labse-embeddings

Save this code as compare-strings.js:

#!/usr/bin/env node

const http = require("http");

/**
 * Get embedding vectors from the API
 * @param {string[]} texts - Texts to get embeddings for
 * @returns {Promise<Object[]>} - Embedding objects, one per input text
 */
async function getEmbedding(texts) {
  const postData = JSON.stringify({
    input: texts,
  });

  const options = {
    hostname: "localhost",
    port: 8080,
    path: "/v1/models/labse-embedding:predict",
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Content-Length": Buffer.byteLength(postData),
    },
  };

  return new Promise((resolve, reject) => {
    const req = http.request(options, (res) => {
      let data = "";

      res.on("data", (chunk) => {
        data += chunk;
      });

      res.on("end", () => {
        try {
          const response = JSON.parse(data);
          // Adjust this based on the actual API response structure
          const embeddings = response.data;
          resolve(embeddings);
        } catch (error) {
          reject(new Error(`Failed to parse response: ${error.message}`));
        }
      });
    });

    req.on("error", (error) => {
      reject(new Error(`API request failed: ${error.message}`));
    });

    req.write(postData);
    req.end();
  });
}

/**
 * Calculate cosine similarity between two normalized vectors
 * @param {number[]} vec1 - First normalized vector
 * @param {number[]} vec2 - Second normalized vector
 * @returns {number} - Similarity score between -1 and 1
 */
function cosineSimilarity(vec1, vec2) {
  if (vec1.length !== vec2.length) {
    throw new Error("Vectors must have the same length");
  }

  let dotProduct = 0;
  for (let i = 0; i < vec1.length; i++) {
    dotProduct += vec1[i] * vec2[i];
  }

  return dotProduct;
}

/**
 * Compare two strings for similarity using embeddings
 * @param {string} str1 - First string to compare
 * @param {string} str2 - Second string to compare
 * @returns {Promise<number>} - Similarity score between -1 and 1
 */
async function compareStrings(str1, str2) {
  console.log("Getting embeddings ...");
  const embeddings = await getEmbedding([str1, str2]);
  console.log("Calculating similarity...");
  const similarity = cosineSimilarity(
    embeddings[0].embedding,
    embeddings[1].embedding,
  );

  return similarity;
}

// Main execution
async function main() {
  const args = process.argv.slice(2);

  if (args.length < 2) {
    console.error("Usage: node compare-strings.js <string1> <string2>");
    console.error('Example: node compare-strings.js "Hello world" "Hi there"');
    process.exit(1);
  }

  const string1 = args[0];
  const string2 = args[1];

  console.log(`\nComparing strings:`);
  console.log(`String 1: "${string1}"`);
  console.log(`String 2: "${string2}"\n`);

  try {
    const similarity = await compareStrings(string1, string2);
    console.log(
      `\nSimilarity score: ${similarity.toFixed(4)} (${(similarity * 100).toFixed(2)}%)`,
    );
  } catch (error) {
    console.error(`Error: ${error.message}`);
    process.exit(1);
  }
}

main();

Then run:

node compare-strings.js "cat" "gato"


Comparing strings:
String 1: "cat"
String 2: "gato"

Getting embeddings ...
Calculating similarity...

Similarity score: 0.9522 (95.22%)

Notice that the cosine similarity is computed as a plain dot product, since the embeddings are L2-normalized in the LiftWing model server. For normalized vectors, squared Euclidean distance is a linear transform of cosine distance, so either metric gives the same ranking.
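A small self-contained sketch (no service needed) of why the plain dot product suffices for normalized vectors, and how Euclidean distance relates to cosine similarity in that case. The helper functions here are illustrative and not part of the patch.

```javascript
// For unit-length vectors, dot(a, b) IS the cosine similarity,
// and d^2 = 2 * (1 - cos), where d is the Euclidean distance.

function normalize(vec) {
  const norm = Math.sqrt(vec.reduce((sum, x) => sum + x * x, 0));
  return vec.map((x) => x / norm);
}

function dot(a, b) {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

function euclidean(a, b) {
  return Math.sqrt(a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0));
}

const a = normalize([3, 4, 0]);
const b = normalize([0, 4, 3]);

const cos = dot(a, b); // full cosine similarity, since |a| = |b| = 1
const d = euclidean(a, b);

console.log(cos.toFixed(4), (d * d).toFixed(4), (2 * (1 - cos)).toFixed(4));
// prints: 0.6400 0.7200 0.7200
```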

Change #1237731 merged by jenkins-bot:

[machinelearning/liftwing/inference-services@main] Add sentence-transformers/LaBSE model

https://gerrit.wikimedia.org/r/1237731