ONNX Runtime

ONNX Runtime is a high-performance, cross-platform inference engine for models in the Open Neural Network Exchange (ONNX) format — a vendor-neutral standard for representing trained neural networks. Models from PyTorch, TensorFlow, scikit-learn, and other frameworks can be exported to ONNX and then run efficiently on Windows, Linux, macOS, iOS, Android, browsers, and edge devices. ONNX Runtime supports CPUs, GPUs (CUDA, DirectML, ROCm), and specialized accelerators (NVIDIA TensorRT, Intel OpenVINO, Apple Core ML). Microsoft maintains it as an open-source project, and the ONNX standard itself is governed under the Linux Foundation. Real-world deployments include Office 365 AI features, Bing Search relevance ranking, Azure AI services, and countless on-device applications. ONNX Runtime is particularly popular for production scenarios that require deploying a single trained model across multiple platforms. AI governance, compliance, and risk-management programs typically record the runtime version and target platforms as deployment evidence, supporting model portability across enterprise AI environments.

Centralpoint Supports ONNX Workloads Out of the Box: Oxcyon's Centralpoint AI Governance Platform connects to locally served ONNX models alongside cloud APIs (OpenAI, Gemini) and other backends. Centralpoint meters every LLM call, keeps prompts and skills on-premises, and embeds chatbots into your portals via a single line of JavaScript.


Related Keywords:
ONNX Runtime