Paste P76839

simple ovms+kserve isvc serving OpenVINO/Phi-4-mini-instruct-int8-ov LLM on ml-lab1002

Authored by kevinbazira on Jun 2 2025, 2:48 PM.
$ python3 simple_ovms_model.py
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
/ovms/lib/python/openvino/runtime/__init__.py:10: DeprecationWarning: The `openvino.runtime` module is deprecated and will be removed in the 2026.0 release. Please replace `openvino.runtime` with `openvino`.
  warnings.warn(
2025-06-02 13:39:57.609 3256 kserve INFO [model_server.py:register_model():398] Registering model: simple-ovms
2025-06-02 13:39:57.610 3256 kserve INFO [model_server.py:setup_event_loop():278] Setting max asyncio worker threads as 32
2025-06-02 13:39:57.627 3256 kserve INFO [server.py:_register_endpoints():110] OpenAI endpoints not registered
2025-06-02 13:39:57.627 3256 kserve INFO [server.py:start():161] Starting uvicorn with 1 workers
2025-06-02 13:39:57.639 3256 uvicorn.error INFO: Started server process [3256]
2025-06-02 13:39:57.639 3256 uvicorn.error INFO: Waiting for application startup.
2025-06-02 13:39:57.642 3256 kserve INFO [server.py:start():70] Starting gRPC server with 4 workers
2025-06-02 13:39:57.642 3256 kserve INFO [server.py:start():71] Starting gRPC server on [::]:8081
2025-06-02 13:39:57.642 3256 uvicorn.error INFO: Application startup complete.
2025-06-02 13:39:57.642 3256 uvicorn.error INFO: Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)