Benchmarking the TrueFoundry LLM Gateway: it's blazing fast ⚡

November 12, 2024

Highlights

  • TrueFoundry LLM Gateway provides a unified OpenAI-compatible interface to various LLM providers such as Anthropic, OpenAI, Bedrock, Gemini, and many others
  • TrueFoundry LLM Gateway scales seamlessly to 350 RPS on a single replica with 1 vCPU while using 270 MB of memory. We compared it with another gateway product, LiteLLM, on a similar setup, and LiteLLM failed to scale beyond 50 RPS
  • TrueFoundry LLM Gateway adds only 3-5 ms of extra latency per request, while LiteLLM adds 15-30 ms

Why does your org need an LLM Gateway?

An LLM Gateway provides a unified interface to manage your organisation's LLM usage:

  • Unified API: Access multiple LLM providers through a single OpenAI-compatible interface with no code changes needed (see the client sketch after this list)
  • API Key Security: Secure, centralised credential management
  • Governance & Control: Set limits, access controls, and content filtering
  • Rate Limiting: Prevent abuse and ensure fair usage
  • Observability: Track usage, costs, latency and performance
  • Load Balancing: Route requests across providers automatically
  • Cost Management: Monitor spending and set budget alerts
  • Audit Trails: Log all LLM interactions for compliance
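
As an illustration of the unified API, the standard OpenAI SDK can simply be pointed at the gateway instead of the provider. This is a minimal sketch: the base URL, API key, and model identifier below are placeholders for your own deployment, not fixed values.

```python
from openai import OpenAI

# Point the standard OpenAI client at the gateway instead of api.openai.com.
# The base URL, key, and model name are placeholders for your own deployment.
client = OpenAI(
    base_url="https://your-gateway.example.com/api/llm",  # hypothetical gateway URL
    api_key="your-truefoundry-api-key",                   # issued by the gateway, not the provider
)

response = client.chat.completions.create(
    model="openai-main/gpt-4o",  # hypothetical provider/model identifier
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```

Switching providers is then just a matter of changing the model identifier; the client code stays the same.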

How fast is TrueFoundry LLM Gateway?

Load Test Setup

For our load-testing experiment, we deployed a fake OpenAI endpoint service using TrueFoundry. The service simulates the OpenAI request and response format without actually producing any tokens.
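
Such a fake endpoint boils down to returning a canned OpenAI-style payload. Here is a minimal sketch of the idea in FastAPI; this is an illustration, not TrueFoundry's actual service, with response fields mirroring the OpenAI chat completions schema:

```python
from fastapi import FastAPI

app = FastAPI()

# Stand-in for the real OpenAI API: accepts the same request shape and
# returns a canned chat-completion payload without generating any tokens.
@app.post("/v1/chat/completions")
async def fake_chat_completions(body: dict):
    return {
        "id": "chatcmpl-fake",
        "object": "chat.completion",
        "model": body.get("model", "gpt-3.5-turbo"),
        "choices": [{
            "index": 0,
            "message": {"role": "assistant", "content": "This is a fake response."},
            "finish_reason": "stop",
        }],
        "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0},
    }
```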

We also deployed the TrueFoundry LLM Gateway and the LiteLLM Proxy Server, both running on a single replica with 1 vCPU and 1 GB of memory.

We added our fake OpenAI provider to both the TrueFoundry and LiteLLM gateways. During load testing, we made requests to the fake OpenAI server in three different ways (a load-generator sketch follows the list):

  • Setup 1: Directly without using any proxy or gateway
  • Setup 2: Through the TrueFoundry LLM Gateway deployed on 1 vCPU and 1 GB memory
  • Setup 3: Through the LiteLLM Proxy Server deployed on 1 vCPU and 1 GB memory
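
A load test along these lines can be reproduced with Locust. The endpoint path, payload, and header below are illustrative assumptions, not our exact harness; point the host at the fake endpoint, the gateway, or LiteLLM to reproduce the three setups.

```python
from locust import HttpUser, task, constant_throughput

# Minimal Locust user firing OpenAI-style chat requests at a fixed rate.
# Run with e.g.: locust -f loadtest.py --host https://target.example.com -u 300
# so that 300 users approximate 300 RPS with constant_throughput(1).
class ChatUser(HttpUser):
    wait_time = constant_throughput(1)  # each user contributes ~1 request/second

    @task
    def chat_completion(self):
        self.client.post(
            "/v1/chat/completions",
            json={
                "model": "gpt-3.5-turbo",
                "messages": [{"role": "user", "content": "Hello!"}],
            },
            headers={"Authorization": "Bearer test-key"},  # placeholder key
        )
```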

Response times at each request rate:

Setup                               10 RPS          50 RPS          200 RPS                     300 RPS
OpenAI direct (Setup 1)             73 ms           73 ms           73 ms                       73 ms
TrueFoundry LLM Gateway (Setup 2)   76 ms (+3 ms)   76 ms (+3 ms)   76 ms (+3 ms)               77 ms (+4 ms)
LiteLLM Proxy (Setup 3)             88 ms (+15 ms)  99 ms (+26 ms)  Could not scale to 200 RPS  Could not scale to 300 RPS

Observations

  1. The TrueFoundry Gateway adds only an extra 3 ms of latency up to 250 RPS, and 4 ms at RPS > 300
  2. The TrueFoundry LLM Gateway scaled without any degradation in performance until about 350 RPS (1 vCPU, 1 GB machine), at which point CPU utilisation reached 100% and latencies started getting affected. With more CPU or more replicas, the LLM Gateway can scale to tens of thousands of requests per second.
  3. LiteLLM on the same machine was not able to scale beyond 40-50 RPS before reaching its CPU limit

More metrics

Setup 1: Direct OpenAI endpoint calling

[Chart: Stats @ 200 RPS]
[Chart: Stats @ 300 RPS]
[Chart: Response Time vs. RPS]

Setup 2: TrueFoundry LLM Gateway

[Chart: Stats @ 200 RPS]
[Chart: Stats @ 300 RPS]
[Chart: Response Time vs. RPS]

Setup 3: LiteLLM

[Chart: Stats @ ~58 RPS]
[Chart: Response Time vs. RPS]

Speed features of LLM Gateway

  • Near-Zero Overhead: Just 3-5 ms of added latency
  • Optimised Backend: Built on a performant Node.js framework
  • Config Caching: Config is stored in memory for quick lookup (see the sketch after this list)
  • Smart Routing: Minimal processing overhead
  • Edge Ready: Deploy close to your apps
  • High Capacity: A t2.2xlarge AWS instance ($43 per month on spot) can scale up to ~3000 RPS with no issues.
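
To illustrate the config-caching idea, here is a minimal in-memory TTL cache. This is a conceptual sketch in Python, not the gateway's actual implementation (which is built on Node.js):

```python
import time

# Minimal in-memory TTL cache illustrating the config-caching idea:
# reads become a dict lookup instead of a network or database round trip.
class ConfigCache:
    def __init__(self, ttl_seconds: float = 30.0):
        self._ttl = ttl_seconds
        self._store: dict[str, tuple[float, object]] = {}

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() > expires_at:
            del self._store[key]  # stale entry: evict and report a miss
            return None
        return value

    def set(self, key: str, value: object) -> None:
        self._store[key] = (time.monotonic() + self._ttl, value)
```

On a miss, the caller fetches the config from its source of record and re-populates the cache, so the hot path stays entirely in memory.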

[Diagram: Edge Deployment of TrueFoundry LLM Gateway]

Supported Providers

Below is a list of popular LLM providers that are supported by the TrueFoundry LLM Gateway (a streaming client sketch follows the list):

  • GCP
  • AWS
  • Azure OpenAI
  • Self Hosted Models on TrueFoundry
  • OpenAI
  • Cohere
  • AI21
  • Anthropic
  • Anyscale
  • Together AI
  • DeepInfra
  • Ollama
  • Palm
  • Perplexity AI
  • Mistral AI
  • Groq
  • Nomic
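
For providers with streaming support, streamed responses go through the same OpenAI-compatible interface. Here is a minimal sketch; the base URL, API key, and model identifier are placeholders for your own deployment:

```python
from openai import OpenAI

# Same client setup as before, pointed at the gateway (placeholder values).
client = OpenAI(
    base_url="https://your-gateway.example.com/api/llm",
    api_key="your-truefoundry-api-key",
)

# stream=True yields incremental chunks in the standard OpenAI format.
stream = client.chat.completions.create(
    model="anthropic-main/claude-3-5-sonnet",  # hypothetical provider/model identifier
    messages=[{"role": "user", "content": "Tell me a short joke."}],
    stream=True,
)
for chunk in stream:
    # Some chunks (e.g. the final one) may carry no content delta.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```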
