v1.66.0-stable - Realtime API Cost Tracking
Deploy this version

- Docker

```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.66.0-stable
```

- Pip

```shell
pip install litellm==1.66.0.post1
```
v1.66.0-stable is live now. Here are the key highlights of this release.
Key Highlights

- Realtime API Cost Tracking: Track cost of realtime API calls
- Microsoft SSO Auto-sync: Auto-sync groups and group members from Azure Entra ID to LiteLLM
- xAI grok-3: Added support for `xai/grok-3` models
- Security Fixes: Fixed CVE-2025-0330 and CVE-2024-6825 vulnerabilities
Let's dive in.
Realtime API Cost Tracking​
This release adds Realtime API logging + cost tracking.
- Logging: LiteLLM now logs the complete response from realtime calls to all logging integrations (DB, S3, Langfuse, etc.)
- Cost Tracking: You can now set `base_model` and custom pricing for realtime models. See Custom Pricing
- Budgets: Your key/user/team budgets now work for realtime models as well.
Start here
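As a sketch, the `base_model` and custom-pricing setup described above can be expressed in the proxy config via `model_info` (the model alias, deployment name, and prices below are hypothetical; check the Custom Pricing docs for your exact values):

```yaml
model_list:
  - model_name: my-realtime-model                  # hypothetical alias
    litellm_params:
      model: azure/gpt-4o-realtime-preview         # hypothetical deployment
      api_key: os.environ/AZURE_API_KEY
    model_info:
      base_model: azure/gpt-4o-realtime-preview    # map to a known model for cost tracking
      input_cost_per_token: 0.000005               # hypothetical custom pricing
      output_cost_per_token: 0.00002
```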
Microsoft SSO Auto-sync​
Auto-sync groups and members from Azure Entra ID to LiteLLM
This release adds support for auto-syncing groups and members on Microsoft Entra ID with LiteLLM. This means LiteLLM proxy administrators can spend less time managing teams and members, since LiteLLM handles the following:
- Auto-create teams that exist on Microsoft Entra ID
- Sync team members on Microsoft Entra ID with LiteLLM teams
Get started with this here
New Models / Updated Models

- xAI
  - Added `reasoning_effort` support for `xai/grok-3-mini-beta` Get Started
  - Added cost tracking for `xai/grok-3` models PR
- Hugging Face
  - Added inference providers support Get Started
- Azure
  - Added `azure/gpt-4o-realtime-audio` cost tracking PR
- VertexAI
  - Added `enterpriseWebSearch` tool support Get Started
  - Moved to only passing keys accepted by the Vertex AI response schema PR
- Google AI Studio
- Azure
- Databricks
- General
  - Added `litellm.supports_reasoning()` util to track if an LLM supports reasoning Get Started
  - Function Calling - Handle pydantic base model in message tool calls, handle `tools = []`, and support fake streaming on tool calls for `meta.llama3-3-70b-instruct-v1:0` PR
  - LiteLLM Proxy - Allow passing `thinking` param to litellm proxy via client sdk PR
  - Fixed correctly translating `thinking` param for litellm PR
 
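Conceptually, the new `litellm.supports_reasoning()` util is a capability lookup against LiteLLM's model metadata. A stdlib-only sketch of the idea (the capability map below is hypothetical illustration, not LiteLLM's real model database):

```python
# Stdlib-only sketch of a supports_reasoning()-style capability lookup.
# The map below is hypothetical, not LiteLLM's actual model data.
MODEL_CAPABILITIES = {
    "xai/grok-3-mini-beta": {"supports_reasoning": True},
    "openai/gpt-3.5-turbo": {"supports_reasoning": False},
}

def supports_reasoning(model: str) -> bool:
    """Return True if the model is flagged as supporting reasoning."""
    return MODEL_CAPABILITIES.get(model, {}).get("supports_reasoning", False)

print(supports_reasoning("xai/grok-3-mini-beta"))  # True
print(supports_reasoning("openai/gpt-3.5-turbo"))  # False
```

Unknown models fall back to `False`, so the check is safe to call on any model string.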
Spend Tracking Improvements​
- OpenAI, Azure
  - Realtime API cost tracking with token usage metrics in spend logs Get Started
- Anthropic
- General
 
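Cost tracking from token usage metrics reduces to multiplying usage counts by per-token prices. A minimal sketch (the prices and usage-dict field names are hypothetical, not LiteLLM's spend-log schema):

```python
# Minimal sketch of computing spend from token usage metrics.
# Prices and the usage dict shape are hypothetical, for illustration only.
PRICING = {
    "input_cost_per_token": 5e-06,
    "output_cost_per_token": 2e-05,
}

def compute_cost(usage: dict, pricing: dict) -> float:
    """Spend = input tokens * input price + output tokens * output price."""
    return (
        usage.get("input_tokens", 0) * pricing["input_cost_per_token"]
        + usage.get("output_tokens", 0) * pricing["output_cost_per_token"]
    )

cost = compute_cost({"input_tokens": 1000, "output_tokens": 500}, PRICING)
print(f"{cost:.6f}")  # 0.015000
```

Setting `base_model` (see the Realtime section above) is what lets LiteLLM pick the right per-token prices for a custom deployment name.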
Management Endpoints / UI​
- Test Key Tab
  - Added rendering of reasoning content, ttft, usage metrics on test key page PR
    View input, output, reasoning tokens, ttft metrics.
- Tag / Policy Management
  - Added Tag/Policy Management. Create routing rules based on request metadata. This allows you to enforce that requests with `tags="private"` only go to specific models. Get Started
    Create and manage tags.
- Redesigned Login Screen
  - Polished login screen PR
- Microsoft SSO Auto-Sync
  - Added debug route to allow admins to debug SSO JWT fields PR
  - Added ability to use MSFT Graph API to assign users to teams PR
  - Connected litellm to Azure Entra ID Enterprise Application PR
  - Added ability for admins to set `default_team_params` for when litellm SSO creates default teams PR
  - Fixed MSFT SSO to use correct field for user email PR
  - Added UI support for setting Default Team setting when litellm SSO auto creates teams PR
- UI Bug Fixes
 
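The tag-based routing described above conceptually filters deployments down to those whose tags match the request. A stdlib-only sketch of the idea (this is not LiteLLM's implementation, and the deployment names are hypothetical):

```python
# Conceptual sketch of tag-based routing: only deployments whose tag list
# contains the request's tag are eligible. Not LiteLLM's implementation.
DEPLOYMENTS = [
    {"model": "private-gpt-4o", "tags": ["private"]},     # hypothetical
    {"model": "public-gpt-4o-mini", "tags": ["public"]},  # hypothetical
]

def route_by_tag(tag: str, deployments: list[dict]) -> list[str]:
    """Return the models eligible to serve a request carrying this tag."""
    return [d["model"] for d in deployments if tag in d["tags"]]

print(route_by_tag("private", DEPLOYMENTS))  # ['private-gpt-4o']
```

A request tagged `"private"` can thus never be routed to a deployment that isn't explicitly tagged for it.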
Logging / Guardrail Improvements​
- Prometheus
  - Emit Key and Team Budget metrics on a cron job schedule Get Started
 
Security Fixes​
- Fixed CVE-2025-0330 - Leakage of Langfuse API keys in team exception handling PR
- Fixed CVE-2024-6825 - Remote code execution in post call rules PR
 
Helm​
Demo​
Try this on the demo instance today
Complete Git Diff​
See the complete git diff since v1.65.4-stable, here
