Steps to replicate the issue (include links if applicable):
This request gives an error:
curl https://api.wikimedia.org/service/lw/inference/v1/models/langid:predict -X POST -d '{"text": "Some sample text in any language that we want to identify\n\n\n"}'
https://github.com/facebookresearch/fastText/issues/1079
More logs with similar requests are available on Logstash.
What happens?:
The request returns a 500 response.
What should have happened instead?:
We should strip any special characters from the text before prediction.
Separate lines should be concatenated.
I also recommend truncating the string and keeping only the first 50-100 characters, which would be sufficient for language identification.
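
A minimal sketch of the preprocessing proposed above, not the actual service code. The function name `preprocess_text` and the constant `MAX_CHARS` are hypothetical, and "special characters" is interpreted here as whitespace control characters (newlines, tabs) plus any other non-printable characters; that interpretation is an assumption.

```python
MAX_CHARS = 100  # assumption: upper end of the 50-100 range suggested above


def preprocess_text(text: str, max_chars: int = MAX_CHARS) -> str:
    """Hypothetical helper: clean input before passing it to the model."""
    # Concatenate separate lines: split on any whitespace (including \n, \r, \t)
    # and rejoin the pieces with single spaces.
    joined = " ".join(text.split())
    # Drop remaining non-printable characters (e.g. NUL and other control codes).
    cleaned = "".join(ch for ch in joined if ch.isprintable())
    # Keep only the first max_chars characters; strip a trailing space left by the cut.
    return cleaned[:max_chars].strip()


if __name__ == "__main__":
    sample = "Some sample text in any language that we want to identify\n\n\n"
    print(repr(preprocess_text(sample)))
```

With this in place, the payload from the curl request above would reach the model as a single line without trailing newlines.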