The Gateway Model Metrics Query API provides a flexible way to query gateway model metrics on model usage, performance, cost, and user activity. You can retrieve either distribution (aggregated) or timeseries model metrics with powerful filtering and grouping capabilities.
This page covers `datasource: "modelMetrics"`. For querying MCP server / tool metrics, see API Access to MCP Metrics.
Access control
- Tenant admins: Can query metrics for the entire organization (tenant-wide).
- Users: Can query their own data and their teams’ data.
- Virtual accounts: Can query their own data and their teams’ data; with tenant-admin permissions, they can access tenant-wide data.
Contents
| Section | Description |
|---|---|
| Overview | Authentication, quick start, and API reference |
| Filtering | Filter operators, fields, and combinations |
| Distribution examples | Aggregated (distribution) query examples |
| Timeseries examples | Time-bucketed (timeseries) query examples |
| Response format | Response JSON structure |
Authentication
You need to authenticate with your TrueFoundry API key. You can use either a Personal Access Token (PAT) or a Virtual Account Token (VAT).
Get your API key
To generate an API key:
- Personal Access Token (PAT): Go to Access → Personal Access Tokens in your TrueFoundry dashboard
- Virtual Account Token (VAT): Go to Access → Virtual Account Tokens (requires admin permissions)
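As a minimal sketch of how a generated token might be attached to a request, the snippet below builds HTTP headers from a PAT or VAT. It assumes the API accepts the token as a Bearer token in the `Authorization` header and that the token is stored in an environment variable named `TFY_API_KEY`. Neither detail is stated on this page, so treat both as assumptions.

```python
import os


def build_auth_headers(api_key: str) -> dict[str, str]:
    """Return HTTP headers carrying a PAT or VAT.

    Assumption: the token is sent as a Bearer token; this page does not
    spell out the header format.
    """
    if not api_key or not api_key.strip():
        raise ValueError("API key must be a non-empty string")
    return {
        "Authorization": f"Bearer {api_key.strip()}",
        "Content-Type": "application/json",
    }


# TFY_API_KEY is a hypothetical variable name; a placeholder is used if unset.
headers = build_auth_headers(os.environ.get("TFY_API_KEY", "your-api-key"))
```

Reading the token from the environment keeps it out of source control; the same helper works for either token type.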
Quick Start
Distribution Query
Get aggregated model metrics distribution with multiple aggregations, including count, sum, and percentiles.
Timeseries Query
Get model metrics over time at hourly intervals, including latency percentiles.
API Reference
Endpoint
Request Parameters
ISO 8601 timestamp for the start of the data range (e.g., "2025-01-21T00:00:00.000Z").
ISO 8601 timestamp for the end of the data range (e.g., "2025-01-22T00:00:00.000Z").
The data source to query. Use "modelMetrics" for gateway model metrics.
The type of query to execute:
- "distribution" - Returns aggregated metrics
- "timeseries" - Returns metrics over time intervals
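The timestamp examples above use UTC with millisecond precision and a `Z` suffix. The following sketch shows one way to produce that shape from a Python `datetime`. The helper name `to_iso_millis` is illustrative, and whether the API also accepts other ISO 8601 variants is not stated here.

```python
from datetime import datetime, timezone


def to_iso_millis(dt: datetime) -> str:
    """Format a timezone-aware datetime as UTC ISO 8601 with milliseconds and 'Z'."""
    if dt.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    utc = dt.astimezone(timezone.utc)
    # strftime has no millisecond directive, so truncate microseconds by hand.
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


start = to_iso_millis(datetime(2025, 1, 21, tzinfo=timezone.utc))
end = to_iso_millis(datetime(2025, 1, 22, tzinfo=timezone.utc))
print(start, end)  # 2025-01-21T00:00:00.000Z 2025-01-22T00:00:00.000Z
```

Rejecting naive datetimes avoids silently interpreting local time as UTC.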
Array of aggregation objects. Each aggregation specifies:
- `type` - The aggregation type
- `column` - The column to aggregate on

| Type | Description |
|---|---|
| count | Count of records |
| sum | Sum of values |
| p50 | 50th percentile (median) |
| p75 | 75th percentile |
| p90 | 90th percentile |
| p99 | 99th percentile |

Supported columns for aggregation:
- `costInUSD` - Cost incurred in USD
- `inputTokens` - Number of input tokens
- `outputTokens` - Number of output tokens
- `latencyMs` - Total request latency (ms)
- `interTokenLatencyMs` - Latency between the generation of consecutive tokens (ms)
- `timeToFirstTokenMs` - Time to first token (ms)
- `timePerOutputTokenLatencyMs` - Latency per output token (ms)
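To make the shape of an aggregation object concrete, here is a small sketch that builds `{type, column}` pairs and checks them against the types and columns listed above. The helper name `aggregation` is illustrative. Whether every type is accepted with every column is not stated on this page, so the check only catches names that are not listed at all.

```python
AGGREGATION_TYPES = {"count", "sum", "p50", "p75", "p90", "p99"}
AGGREGATION_COLUMNS = {
    "costInUSD",
    "inputTokens",
    "outputTokens",
    "latencyMs",
    "interTokenLatencyMs",
    "timeToFirstTokenMs",
    "timePerOutputTokenLatencyMs",
}


def aggregation(agg_type: str, column: str) -> dict[str, str]:
    """Build one aggregation object after checking both names client-side."""
    if agg_type not in AGGREGATION_TYPES:
        raise ValueError(f"unsupported aggregation type: {agg_type!r}")
    if column not in AGGREGATION_COLUMNS:
        raise ValueError(f"unsupported aggregation column: {column!r}")
    return {"type": agg_type, "column": column}


# Total cost plus median and tail latency.
aggregations = [
    aggregation("sum", "costInUSD"),
    aggregation("p50", "latencyMs"),
    aggregation("p99", "latencyMs"),
]
```

Validating locally surfaces typos such as `latencyMS` before a request is sent.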
Array of fields to group the metrics by. Available options:
- `modelName` - Group by model name
- `userEmail` - Group by user email
- `virtualaccount` - Group by virtual account
- `team` - Group by team (unnests the Teams array)
- `virtualModel` - Group by virtual model
- `errorCode` - HTTP error code returned
- `requestType` - Type of model request (e.g., `ChatCompletion`, `Embedding`)
- `providerAccountType` - Account type of provider (e.g., `model`, `mcp-server`, `guardrail-config`)
- `providerModelName` - Underlying provider model name
- `createdBySubjectType` - Subject type (e.g., `user`, `virtualaccount`)
- `metadata.<key>` - Group by a custom metadata key (e.g., `metadata.environment`)
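The group-by options above mix fixed field names with the open-ended `metadata.<key>` form. A minimal sketch of a client-side check that handles both follows. The function name `validate_group_by` is illustrative, and any restrictions the API places on metadata key characters are unknown here.

```python
GROUP_BY_FIELDS = {
    "modelName",
    "userEmail",
    "virtualaccount",
    "team",
    "virtualModel",
    "errorCode",
    "requestType",
    "providerAccountType",
    "providerModelName",
    "createdBySubjectType",
}
METADATA_PREFIX = "metadata."


def validate_group_by(fields: list[str]) -> list[str]:
    """Return the fields unchanged if each is a listed name or metadata.<key>."""
    for field in fields:
        if field in GROUP_BY_FIELDS:
            continue
        # Accept any non-empty key after the "metadata." prefix.
        if field.startswith(METADATA_PREFIX) and len(field) > len(METADATA_PREFIX):
            continue
        raise ValueError(f"unsupported group-by field: {field!r}")
    return list(fields)


group_by = validate_group_by(["modelName", "team", "metadata.environment"])
```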
Required for timeseries queries. The time interval in seconds for grouping data points.
Common values:
- `60` - 1 minute intervals
- `300` - 5 minute intervals
- `1800` - 30 minute intervals
- `3600` - 1 hour intervals
- `86400` - 1 day intervals
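Putting the parameters together, the sketch below assembles a request body for either query type and estimates how many time buckets a timeseries query spans. Only `datasource`, the aggregation keys `type` and `column`, and the listed values come from this page. The body field names `type`, `startTime`, `endTime`, `aggregations`, `groupBy`, and `intervalInSeconds` are assumptions, so check them against the API reference before use. No request is sent, because the endpoint is not given here.

```python
import json
import math
from datetime import datetime
from typing import Optional

TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"  # matches "2025-01-21T00:00:00.000Z"


def build_query(
    query_type: str,
    start: str,
    end: str,
    aggregations: list[dict],
    group_by: Optional[list[str]] = None,
    interval_seconds: Optional[int] = None,
) -> dict:
    """Assemble a query body; field names other than 'datasource' are assumed."""
    if query_type not in ("distribution", "timeseries"):
        raise ValueError(f"unknown query type: {query_type!r}")
    if query_type == "timeseries" and (not interval_seconds or interval_seconds <= 0):
        raise ValueError("timeseries queries require a positive interval in seconds")
    body = {
        "datasource": "modelMetrics",
        "type": query_type,            # assumed field name
        "startTime": start,            # assumed field name
        "endTime": end,                # assumed field name
        "aggregations": aggregations,  # assumed field name
    }
    if group_by:
        body["groupBy"] = group_by     # assumed field name
    if query_type == "timeseries":
        body["intervalInSeconds"] = interval_seconds  # assumed field name
    return body


def bucket_count(start: str, end: str, interval_seconds: int) -> int:
    """Estimate the number of intervals needed to cover [start, end)."""
    span = datetime.strptime(end, TS_FORMAT) - datetime.strptime(start, TS_FORMAT)
    return math.ceil(span.total_seconds() / interval_seconds)


start, end = "2025-01-21T00:00:00.000Z", "2025-01-22T00:00:00.000Z"
hourly = build_query(
    "timeseries", start, end,
    aggregations=[{"type": "p99", "column": "latencyMs"}],
    group_by=["modelName"],
    interval_seconds=3600,
)
print(json.dumps(hourly, indent=2))
print(bucket_count(start, end, 3600))  # 24
```

Estimating the bucket count up front helps pick an interval: one day at 60-second intervals yields 1440 buckets per group, while 3600-second intervals yield 24.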