import numpy as np
import requests
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Qwen/Qwen3-Embedding-0.6B")
txt = "this is a test"

# Embedding computed locally via Sentence-Transformers
st_embedding = model.encode([txt])

# Embedding computed by the TEI server
tei_embedding = requests.post(
    "http://localhost:8000/embed",  # TEI server
    json={"inputs": [txt]},
)

np.dot(tei_embedding.json()[0], st_embedding[0])
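To rule out a pure scale/normalization difference (TEI's /embed endpoint normalizes by default, while SentenceTransformer.encode typically does not unless normalize_embeddings=True), the comparison can also be done on explicitly L2-normalized vectors. This is a minimal sketch reusing the tei_embedding and st_embedding objects from above:

# Cosine similarity on explicitly normalized vectors, so a scale
# difference cannot be mistaken for a direction mismatch.
tei_vec = np.asarray(tei_embedding.json()[0])
st_vec = np.asarray(st_embedding[0])
cos_sim = float(np.dot(tei_vec, st_vec) / (np.linalg.norm(tei_vec) * np.linalg.norm(st_vec)))
print(cos_sim)  # per the report below, this stays well under 1.0 for Qwen3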
Expected behavior
For the same model and input, TEI should return essentially the same vector as Sentence-Transformers (cosine similarity ≈ 1.0).
Instead, for Qwen3 models, the vectors returned by TEI have very low cosine similarity to vectors produced with the same model loaded via Sentence-Transformers: with identical text, pooling mode, and normalization settings, the similarity between TEI and ST is often below 0.2.
Running the same test with another model, e.g. BAAI/bge-base-en-v1.5, gives a similarity of 1.0, as expected.
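A third, independent reference vector can be produced with plain transformers, assuming Qwen3-Embedding uses last-token pooling as documented on its model card; the pooling helper below is a hedged sketch, not the exact code either backend runs, but it may help pin down which side diverges:

import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

name = "Qwen/Qwen3-Embedding-0.6B"
tok = AutoTokenizer.from_pretrained(name, padding_side="left")
ref_model = AutoModel.from_pretrained(name)

batch = tok(["this is a test"], padding=True, return_tensors="pt")
with torch.no_grad():
    out = ref_model(**batch)

# With left padding, the final position always holds the last real token,
# so last-token pooling reduces to taking the last hidden-state position.
ref = F.normalize(out.last_hidden_state[:, -1], p=2, dim=1)
# Compare ref[0] against both the TEI vector and the Sentence-Transformers
# vector to see which backend it agrees with.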