TEI error for jinaai/jina-embeddings-v3 missing field model_type at line 51 column 1 #571

Open
2 of 4 tasks
dromeuf opened this issue Apr 8, 2025 · 3 comments
@dromeuf

dromeuf commented Apr 8, 2025

System Info

I would like to use the Jina AI embedding and reranker models with TEI, but startup fails with the error "missing field model_type at line 51 column 1".

Kind regards, David.

(HF_TEI) :~$ sudo docker run --gpus all -p 33434:80 ghcr.io/huggingface/text-embeddings-inference:1.7 --model-id jinaai/jina-embeddings-v3

Unable to find image 'ghcr.io/huggingface/text-embeddings-inference:1.7' locally
1.7: Pulling from huggingface/text-embeddings-inference
aece8493d397: Already exists
9fe5ccccae45: Already exists
8054e9d6e8d6: Already exists
bdddd5cb92f6: Already exists
5324914b4472: Already exists
bcdd2fd1a29f: Already exists
7e0a2d9e8540: Pull complete
Digest: sha256:dfb1f681721bad43fd0af2576c44c76531fcf6d44839edc853ce19102112205b
Status: Downloaded newer image for ghcr.io/huggingface/text-embeddings-inference:1.7
2025-04-08T13:43:14.251857Z  INFO text_embeddings_router: router/src/main.rs:185: Args { model_id: "jin***/****-*********s-v3", revision: None, tokenization_workers: None, dtype: None, pooling: None, max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: None, max_client_batch_size: 32, auto_truncate: false, default_prompt_name: None, default_prompt: None, hf_api_token: None, hf_token: None, hostname: "69c429813e5a", port: 80, uds_path: "/tmp/text-embeddings-inference-server", huggingface_hub_cache: Some("/data"), payload_limit: 2000000, api_key: None, json_output: false, disable_spans: false, otlp_endpoint: None, otlp_service_name: "text-embeddings-inference.server", cors_allow_origin: None }
2025-04-08T13:43:14.327188Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:20: Starting download
2025-04-08T13:43:14.327215Z  INFO download_artifacts:download_pool_config: text_embeddings_core::download: core/src/download.rs:53: Downloading `1_Pooling/config.json`
2025-04-08T13:43:15.298830Z  INFO download_artifacts:download_new_st_config: text_embeddings_core::download: core/src/download.rs:77: Downloading `config_sentence_transformers.json`
2025-04-08T13:43:15.500807Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:40: Downloading `config.json`
2025-04-08T13:43:15.703191Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:43: Downloading `tokenizer.json`
2025-04-08T13:43:18.916464Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:47: Model artifacts downloaded in 4.589276421s
Error: Failed to parse `config.json`

Caused by:
    missing field `model_type` at line 51 column 1

(HF_TEI) :~$ sudo docker run --gpus all -p 33434:80 ghcr.io/huggingface/text-embeddings-inference:1.7 --version
text-embeddings-router 1.7.0

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

sudo docker run --gpus all -p 33434:80 ghcr.io/huggingface/text-embeddings-inference:1.7 --model-id jinaai/jina-embeddings-v3

Expected behavior

TEI should work with jinaai/jina-embeddings-v3 and jinaai/jina-reranker-v2-base-multilingual, please.

@Johnno1011

I was able to resolve this by adding "model_type": "XLMRobertaModel" to the config.json of the downloaded model. Not a long-term solution, but also not caused by TEI: the model itself is just missing this detail :)

@alvarobartt
Member

Hey @Johnno1011, thanks for reporting! AFAIK this model is not natively supported in TEI, as it relies on the custom implementation at jinaai/xlm-roberta-flash-implementation. I'll look into it and come back to you. As for the models on the Hub, we have little to no control over the artifacts that authors upload, so breakage like this is expected when the artifacts are not compliant. In any case, we'll try to solve it and onboard this model, as it has been requested recently too. Thanks again 🤗
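
For anyone who wants to check a downloaded config.json before starting TEI, here is a minimal sketch. It assumes the Transformers convention that a model shipping custom code declares it under an `auto_map` key; whether this particular model's config contains that key is an assumption, and the `describe_config` helper and the example contents are hypothetical.

```python
import json


def describe_config(config):
    """Summarize the two fields relevant to this issue from a parsed config.json."""
    return {
        # TEI 1.7 fails to parse the config when this field is absent.
        "has_model_type": "model_type" in config,
        # A non-empty `auto_map` hints that the model relies on custom code.
        "uses_custom_code": bool(config.get("auto_map")),
    }


# Illustrative contents only; not the real config.json of any model.
example = json.loads(
    '{"auto_map": {"AutoModel": "some-org/some-repo--modeling.SomeModel"},'
    ' "hidden_size": 1024}'
)
print(describe_config(example))
```

If `has_model_type` is false, the parse error from the original report is the likely outcome; if `uses_custom_code` is true, adding the field alone may not give the intended model behavior.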

@alvarobartt alvarobartt self-assigned this Apr 9, 2025
@alvarobartt
Member

Also, FYI, the model type should be set to "model_type": "xlm-roberta" instead, but that won't use the JinaAI implementation, so you may get unexpected results.
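
As a stopgap, the config.json edit discussed in this thread could be scripted roughly as follows. This is only a sketch: the `ensure_model_type` helper is hypothetical, it assumes the model files were already downloaded to a local directory, and the default value `xlm-roberta` is simply the one suggested above. It does not make TEI use the JinaAI implementation, so results may still be unexpected.

```python
import json
from pathlib import Path


def ensure_model_type(config_path, model_type="xlm-roberta"):
    """Add `model_type` to a config.json if the field is missing.

    Returns True if the file was modified, False if the field was already set.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))

    # Leave the file untouched when the author already provides the field.
    if "model_type" in config:
        return False

    config["model_type"] = model_type
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return True
```

Calling `ensure_model_type("path/to/local/model/config.json")` (placeholder path) patches the file in place; the patched local directory would then need to be made available to the container and passed as the model instead of the Hub ID.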
