GTE

GTE (General Text Embeddings) is Alibaba's family of open-source embedding models, released under the MIT license and ranking among the top open-source options on the MTEB benchmark. The family includes GTE-large, GTE-base, and GTE-small (general-purpose), plus specialized variants such as GTE-Qwen2-7B-instruct (a larger LLM-based embedder) and GTE-multilingual-base. The large variant produces 1024-dimensional vectors and performs strongly across a wide range of retrieval tasks. The GTE-Qwen2-7B-instruct variant uses a 7B-parameter Qwen2 backbone as the embedder, achieving state-of-the-art performance among open-source embedding models at the cost of a larger model footprint. All models are available on Hugging Face. Real-world deployments span enterprise search, RAG systems, document classification, and any application requiring strong open-source embeddings under permissive licensing. The GTE family is particularly popular in Asia and among organizations favoring Chinese-developed open-source AI alongside the DeepSeek and Qwen lineages. AI governance, AI compliance, and AI risk management programs evaluate provider origin and licensing, supporting responsible AI through diversified embedding deployment in enterprise AI environments worldwide.
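As a sketch of how a retrieval application might use GTE, the snippet below ranks documents by cosine similarity of their embeddings. It assumes the sentence-transformers package is installed and that the model is fetched from Hugging Face under the published ID "thenlper/gte-large"; the ranking helper itself is plain NumPy.

```python
# Sketch: ranking documents by cosine similarity of GTE embeddings.
# Assumes sentence-transformers is installed and the Hugging Face model
# ID "thenlper/gte-large" is reachable (network download on first use).
import numpy as np


def rank_by_similarity(query_vec, doc_vecs):
    """Return document indices sorted by descending cosine similarity."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sims = d @ q  # cosine similarity of each document to the query
    return list(np.argsort(-sims))


if __name__ == "__main__":
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("thenlper/gte-large")  # 1024-dim vectors
    docs = [
        "GTE is an open-source embedding family from Alibaba.",
        "Paris is the capital of France.",
    ]
    doc_vecs = np.asarray(model.encode(docs))
    query_vec = np.asarray(model.encode("What license does GTE use?"))
    order = rank_by_similarity(query_vec, doc_vecs)
    print([docs[i] for i in order])  # most relevant document first
```

The same ranking helper works with any embedding provider mentioned above, since it only depends on the vectors, not on how they were produced.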

Centralpoint Routes to GTE for Open-Source Retrieval: Oxcyon's Centralpoint AI Governance Platform powers retrieval with GTE alongside OpenAI, Cohere, Voyage, BGE, and other embedding models. Centralpoint meters every call, keeps prompts and skills on-prem, and embeds retrieval chatbots into your portals with a single line of JavaScript.


Related Keywords:
GTE