In this blog, we summarize the various open-source LLMs that we have benchmarked. We benchmarked these models from a latency, cost, and requests-per-second perspective, which should help you evaluate whether a given model is a good fit for your business requirements. Please note that we don't cover qualitative performance in this article; methods for comparing LLMs qualitatively can be found here.
Use Cases Benchmarked
The key use cases across which we benchmarked are:
- 1500 input tokens, 100 output tokens (similar to Retrieval-Augmented Generation use cases)
- 50 input tokens, 500 output tokens (generation-heavy use cases)
Benchmarking Setup
For benchmarking, we have used Locust, an open-source load-testing tool. Locust works by creating users/workers that send requests in parallel. At the beginning of each test, we can set the Number of Users and the Spawn Rate: the Number of Users is the maximum number of users that can be spawned and run concurrently, while the Spawn Rate is the number of users spawned per second.
In each benchmarking test for a deployment configuration, we started with 1 user and gradually increased the Number of Users for as long as we saw a steady increase in RPS. During the test, we also plotted the response times (in ms) and the total requests per second.
In each of the two deployment configurations, we used the Hugging Face text-generation-inference model server, version 0.9.4. The parameters passed to the text-generation-inference image vary with the model configuration.
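As a rough sketch of what such a launch looks like, the example below uses standard text-generation-launcher flags from TGI 0.9.x; the model id, shard count, and token limits are hypothetical placeholders, not the exact values from our runs.

```python
# Hedged illustration of launching text-generation-inference (v0.9.x).
# The model id and all limits below are placeholders, not our benchmark settings.
import subprocess

tgi_command = [
    "text-generation-launcher",
    "--model-id", "meta-llama/Llama-2-7b-chat-hf",  # placeholder model
    "--num-shard", "1",                             # number of GPU shards per replica
    "--max-input-length", "1600",                   # must accommodate the ~1500-token prompts
    "--max-total-tokens", "2100",                   # budget for input + generated tokens
    "--max-batch-prefill-tokens", "4096",           # cap on prefill tokens per batch
    "--port", "80",
]
subprocess.run(tgi_command, check=True)
```

When deploying the container image, the same flags are simply appended to the image's entrypoint, which is the text-generation-launcher command shown above.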
LLMs Benchmarked
The 5 open-source LLMs benchmarked are as follows:
The following table summarizes the benchmarking results for these LLMs:
Detailed LLM Benchmarking Blogs on Each LLM
For each of the models mentioned above, refer to the detailed LLM benchmarking blogs linked below: