Support building and running of langid model-server via Makefile
Closed, Resolved | Public | 2 Estimated Story Points

Description

The LiftWing model-server repo has a Makefile that makes building and running of model-servers locally much easier.

In this task, we will update the existing Makefile to support building and running the langid model-server.

Event Timeline

kevinbazira changed the task status from Open to In Progress. Feb 13 2024, 9:02 AM
kevinbazira claimed this task.
kevinbazira triaged this task as Medium priority.
kevinbazira set the point value for this task to 2.

Building the langid model-server locally fails with the error below. It appears to occur while pip is building fasttext==0.9.2, whose setup script cannot find the pybind11 package.

Collecting fasttext==0.9.2 (from -r langid/././requirements.txt (line 2))
  Downloading fasttext-0.9.2.tar.gz (68 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 68.8/68.8 kB 3.3 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [25 lines of output]
      /home/langid_makefile/inference-services/my_venv/bin/python3: No module named pip
      Traceback (most recent call last):
        File "<string>", line 38, in __init__
      ModuleNotFoundError: No module named 'pybind11'
     
      During handling of the above exception, another exception occurred:
     
      Traceback (most recent call last):
        File "/home/langid_makefile/inference-services/my_venv/lib/python3.9/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/home/langid_makefile/inference-services/my_venv/lib/python3.9/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "/home/langid_makefile/inference-services/my_venv/lib/python3.9/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
          return hook(config_settings)
        File "/tmp/pip-build-env-tz0q5796/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 325, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=['wheel'])
        File "/tmp/pip-build-env-tz0q5796/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 295, in _get_build_requires
          self.run_setup()
        File "/tmp/pip-build-env-tz0q5796/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 480, in run_setup
          super().run_setup(setup_script=setup_script)
        File "/tmp/pip-build-env-tz0q5796/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 311, in run_setup
          exec(code, locals())
        File "<string>", line 72, in <module>
        File "<string>", line 41, in __init__
      RuntimeError: pybind11 install failed.
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

The error above was fixed by installing the wheel package before fasttext. The langid requirements.txt that I used contains:

kserve==0.11.2
wheel==0.42.0
fasttext==0.9.2
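
A minimal sketch for checking that a requirements file keeps this ordering, assuming the fix depends on wheel being listed before fasttext as described above. The helper names (requirement_names, listed_before) are hypothetical and not part of the inference-services repo.

```python
import re

# Contents of the langid requirements.txt quoted above.
LANGID_REQUIREMENTS = """\
kserve==0.11.2
wheel==0.42.0
fasttext==0.9.2
"""


def requirement_names(text):
    """Return package names in the order they appear in requirements text."""
    names = []
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        # Skip blank lines and option lines such as "-r other.txt".
        if not line or line.startswith("-"):
            continue
        # The name is the leading run of characters before any version specifier.
        match = re.match(r"[A-Za-z0-9._-]+", line)
        if match:
            names.append(match.group(0).lower())
    return names


def listed_before(text, first, second):
    """Return True when both packages are listed and `first` comes before `second`."""
    names = requirement_names(text)
    if first not in names or second not in names:
        return False
    return names.index(first) < names.index(second)


print(listed_before(LANGID_REQUIREMENTS, "wheel", "fasttext"))  # prints True
```

A check like this could run before the build step to catch an accidental reordering of the file.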

Change 1002424 had a related patch set uploaded (by Kevin Bazira; author: Kevin Bazira):

[machinelearning/liftwing/inference-services@main] langid: fix pybind11 missing issue

https://gerrit.wikimedia.org/r/1002424

Change 1002424 merged by jenkins-bot:

[machinelearning/liftwing/inference-services@main] langid: fix pybind11 missing issue

https://gerrit.wikimedia.org/r/1002424

calbon raised the priority of this task from Medium to Needs Triage. Feb 13 2024, 3:13 PM

Change 1003032 had a related patch set uploaded (by Kevin Bazira; author: Kevin Bazira):

[machinelearning/liftwing/inference-services@main] Makefile: add support for langid

https://gerrit.wikimedia.org/r/1003032

Change 1003032 merged by jenkins-bot:

[machinelearning/liftwing/inference-services@main] Makefile: add support for langid

https://gerrit.wikimedia.org/r/1003032

While testing the locally built langid model-server, I queried the inference service (isvc) and got some interesting results. I tested three languages (English, French, and Swahili) and found that the isvc misidentified English when the input was a short sentence of about four words. Here are the results of my tests:

❌ expected English but the isvc returned Afrikaans

curl localhost:8080/v1/models/langid:predict -i -X POST -d '{"text": "My name is Kevin"}'
{"language":"afr_Latn","wikicode":"af","languagename":"Afrikaans","score":0.6419122815132141}

✔️ expected French and the isvc returned French

curl localhost:8080/v1/models/langid:predict -i -X POST -d '{"text": "Je mappelle Kevin"}'
{"language":"fra_Latn","wikicode":"fr","languagename":"French","score":0.9690109491348267}

✔️ expected Swahili and the isvc returned Swahili

curl localhost:8080/v1/models/langid:predict -i -X POST -d '{"text": "Jina langu ni Kevin"}'
{"language":"swh_Latn","wikicode":"sw","languagename":"Swahili","score":0.9879443645477295}

❌ expected English but the isvc returned Venetian

curl localhost:8080/v1/models/langid:predict -i -X POST -d '{"text": "I live in Dublin."}'
{"language":"vec_Latn","wikicode":"vec","languagename":"Venetian","score":0.7141795754432678}

✔️ expected English and the isvc returned English

curl localhost:8080/v1/models/langid:predict -i -X POST -d '{"text": "My mum and I live in Dublin."}'
{"language":"eng_Latn","wikicode":"en","languagename":"English","score":0.9916096925735474}
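
The manual checks above could be repeated with a small script. This is only a sketch, assuming the model-server is running locally on port 8080 as in the curl commands and that it also accepts a JSON Content-Type header (the curl calls above do not set one). The function names (predict, run_cases, report) are hypothetical.

```python
import json
import urllib.request

# Endpoint used in the curl commands above.
URL = "http://localhost:8080/v1/models/langid:predict"

# (input text, expected wikicode) pairs taken from the tests above.
CASES = [
    ("My name is Kevin", "en"),
    ("Je mappelle Kevin", "fr"),
    ("Jina langu ni Kevin", "sw"),
    ("I live in Dublin.", "en"),
    ("My mum and I live in Dublin.", "en"),
]


def predict(text, url=URL):
    """POST one text to the model-server and return the decoded JSON response."""
    body = json.dumps({"text": text}).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},  # assumption, see lead-in
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def run_cases(cases, predict_fn=predict):
    """Return (text, expected, returned wikicode, score, matched) for each case."""
    rows = []
    for text, expected in cases:
        result = predict_fn(text)
        got = result["wikicode"]
        rows.append((text, expected, got, result["score"], got == expected))
    return rows


def report(rows):
    """Print one line per case, marking mismatches."""
    for text, expected, got, score, matched in rows:
        status = "ok" if matched else "MISMATCH"
        print(f"{status}: {text!r} expected={expected} got={got} score={score:.3f}")
```

With the server from the first terminal running, calling report(run_cases(CASES)) prints one line per sentence. Passing a different predict_fn lets the comparison logic be exercised without a running server.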

Good catch!
Perhaps we could add a small note to the model card saying that this model should be used with full sentences longer than X words, either in the "Don't use this model for" section or in the "Ethical considerations, caveats, and recommendations" section.
@santhosh Do you think that would be ok or do you have any other suggestions?

kevinbazira closed this task as Resolved. Feb 16 2024, 5:07 PM

+1 on adding a note to the model card. Support for building the langid model-server using the Makefile has been added, and it can be tested with:

# first terminal
$ make language-identification
# second terminal
$ curl localhost:8080/v1/models/langid:predict -i -X POST -d '{"text": "Some random text in any language"}'
$ MODEL_TYPE=langid make clean