v1.66.0-stable - Realtime API Cost Tracking
Deploy this version

- Docker

```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.66.0-stable
```

- Pip

```shell
pip install litellm==1.66.0.post1
```
v1.66.0-stable is live now. Here are the key highlights of this release.
Key Highlights

- Realtime API Cost Tracking: Track cost of realtime API calls
- Microsoft SSO Auto-sync: Auto-sync groups and group members from Azure Entra ID to LiteLLM
- xAI grok-3: Added support for `xai/grok-3` models
- Security Fixes: Fixed CVE-2025-0330 and CVE-2024-6825 vulnerabilities
Let's dive in.
Realtime API Cost Tracking​
This release adds Realtime API logging + cost tracking.
- Logging: LiteLLM now logs the complete response from realtime calls to all logging integrations (DB, S3, Langfuse, etc.)
- Cost Tracking: You can now set `base_model` and custom pricing for realtime models. See Custom Pricing
- Budgets: Your key/user/team budgets now work for realtime models as well.
Start here
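As a sketch, the `base_model` and custom-pricing setup described above can be expressed in the proxy config via `model_info` (the model alias, deployment name, and prices below are hypothetical; check the Custom Pricing docs for your exact values):

```yaml
model_list:
  - model_name: my-realtime-model                  # hypothetical alias
    litellm_params:
      model: azure/gpt-4o-realtime-preview         # hypothetical deployment
      api_key: os.environ/AZURE_API_KEY
    model_info:
      base_model: azure/gpt-4o-realtime-preview    # map to a known model for cost tracking
      input_cost_per_token: 0.000005               # hypothetical custom pricing
      output_cost_per_token: 0.00002
```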
Microsoft SSO Auto-sync​
Auto-sync groups and members from Azure Entra ID to LiteLLM
This release adds support for auto-syncing groups and members on Microsoft Entra ID with LiteLLM. This means LiteLLM proxy administrators can spend less time managing teams and members, since LiteLLM handles the following:
- Auto-create teams that exist on Microsoft Entra ID
- Sync team members on Microsoft Entra ID with LiteLLM teams
Get started with this here
New Models / Updated Models

- xAI
  - Added `reasoning_effort` support for `xai/grok-3-mini-beta` Get Started
  - Added cost tracking for `xai/grok-3` models PR
- Hugging Face
  - Added inference providers support Get Started
- Azure
  - Added `azure/gpt-4o-realtime-audio` cost tracking PR
- VertexAI
  - Added `enterpriseWebSearch` tool support Get Started
  - Moved to only passing keys accepted by the Vertex AI response schema PR
- Google AI Studio
- Azure
- Databricks
- General
  - Added `litellm.supports_reasoning()` util to track if an LLM supports reasoning Get Started
  - Function Calling - Handle pydantic base model in message tool calls, handle `tools = []`, and support fake streaming on tool calls for `meta.llama3-3-70b-instruct-v1:0` PR
  - LiteLLM Proxy - Allow passing `thinking` param to litellm proxy via client sdk PR
  - Fixed correctly translating `thinking` param for litellm PR
 
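Conceptually, the new `litellm.supports_reasoning()` util is a capability lookup against LiteLLM's model metadata. A stdlib-only sketch of the idea (the capability map below is hypothetical illustration, not LiteLLM's real model database):

```python
# Stdlib-only sketch of a supports_reasoning()-style capability lookup.
# The map below is hypothetical, not LiteLLM's actual model data.
MODEL_CAPABILITIES = {
    "xai/grok-3-mini-beta": {"supports_reasoning": True},
    "openai/gpt-3.5-turbo": {"supports_reasoning": False},
}

def supports_reasoning(model: str) -> bool:
    """Return True if the model is flagged as supporting reasoning."""
    return MODEL_CAPABILITIES.get(model, {}).get("supports_reasoning", False)

print(supports_reasoning("xai/grok-3-mini-beta"))  # True
print(supports_reasoning("openai/gpt-3.5-turbo"))  # False
```

Unknown models fall back to `False`, so the check is safe to call on any model string.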
Spend Tracking Improvements​
- OpenAI, Azure
  - Realtime API cost tracking with token usage metrics in spend logs Get Started
- Anthropic
- General
 
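Cost tracking from token usage metrics reduces to multiplying usage counts by per-token prices. A minimal sketch (the prices and usage-dict field names are hypothetical, not LiteLLM's spend-log schema):

```python
# Minimal sketch of computing spend from token usage metrics.
# Prices and the usage dict shape are hypothetical, for illustration only.
PRICING = {
    "input_cost_per_token": 5e-06,
    "output_cost_per_token": 2e-05,
}

def compute_cost(usage: dict, pricing: dict) -> float:
    """Spend = input tokens * input price + output tokens * output price."""
    return (
        usage.get("input_tokens", 0) * pricing["input_cost_per_token"]
        + usage.get("output_tokens", 0) * pricing["output_cost_per_token"]
    )

cost = compute_cost({"input_tokens": 1000, "output_tokens": 500}, PRICING)
print(f"{cost:.6f}")  # 0.015000
```

Setting `base_model` (see the Realtime section above) is what lets LiteLLM pick the right per-token prices for a custom deployment name.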
Management Endpoints / UI​
- Test Key Tab
  - Added rendering of reasoning content, ttft, usage metrics on test key page PR
    View input, output, reasoning tokens, ttft metrics.
- Tag / Policy Management
  - Added Tag/Policy Management. Create routing rules based on request metadata. This allows you to enforce that requests with `tags="private"` only go to specific models. Get Started
    Create and manage tags.
- Redesigned Login Screen
  - Polished login screen PR
- Microsoft SSO Auto-Sync
  - Added debug route to allow admins to debug SSO JWT fields PR
  - Added ability to use MSFT Graph API to assign users to teams PR
  - Connected litellm to Azure Entra ID Enterprise Application PR
  - Added ability for admins to set `default_team_params` for when litellm SSO creates default teams PR
  - Fixed MSFT SSO to use correct field for user email PR
  - Added UI support for setting Default Team setting when litellm SSO auto creates teams PR
- UI Bug Fixes
 
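The tag-based routing described above conceptually filters deployments down to those whose tags match the request. A stdlib-only sketch of the idea (this is not LiteLLM's implementation, and the deployment names are hypothetical):

```python
# Conceptual sketch of tag-based routing: only deployments whose tag list
# contains the request's tag are eligible. Not LiteLLM's implementation.
DEPLOYMENTS = [
    {"model": "private-gpt-4o", "tags": ["private"]},     # hypothetical
    {"model": "public-gpt-4o-mini", "tags": ["public"]},  # hypothetical
]

def route_by_tag(tag: str, deployments: list[dict]) -> list[str]:
    """Return the models eligible to serve a request carrying this tag."""
    return [d["model"] for d in deployments if tag in d["tags"]]

print(route_by_tag("private", DEPLOYMENTS))  # ['private-gpt-4o']
```

A request tagged `"private"` can thus never be routed to a deployment that isn't explicitly tagged for it.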
Logging / Guardrail Improvements​
- Prometheus
  - Emit Key and Team Budget metrics on a cron job schedule Get Started
 
Security Fixes​
- Fixed CVE-2025-0330 - Leakage of Langfuse API keys in team exception handling PR
- Fixed CVE-2024-6825 - Remote code execution in post call rules PR
 
Helm​
Demo​
Try this on the demo instance today
Complete Git Diff​
See the complete git diff since v1.65.4-stable, here
