Request body
- Optional costCents?: number
- Optional idempotencyKey?: string
- Optional inputTokens?: number
- Optional latencyMs?: number
- Optional model?: string
- Optional outputTokens?: number
- Optional properties?: { [key: string]: unknown }
- Optional provider?: string
- Optional timestamp?: string (Format: date-time)
- Optional totalTokens?: number

Event tracked successfully - all limits within bounds
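As an illustration of the request body, the optional fields listed above can be written as a TypeScript type. This is only a sketch: the names `TrackUsageOptionalFields` and `withTimestamp` are hypothetical, the field values are placeholders, and any required fields are left out because this section does not show them.

```typescript
// Optional request-body fields as listed above. Required fields are not
// shown in this section, so they are left out of this sketch.
interface TrackUsageOptionalFields {
  costCents?: number;
  idempotencyKey?: string;
  inputTokens?: number;
  latencyMs?: number;
  model?: string;
  outputTokens?: number;
  properties?: { [key: string]: unknown };
  provider?: string;
  timestamp?: string; // date-time string, e.g. "2024-01-01T00:00:00.000Z"
  totalTokens?: number;
}

// Hypothetical helper: fills in an ISO 8601 date-time string when the
// caller did not supply a timestamp, and leaves an existing one untouched.
function withTimestamp(
  fields: TrackUsageOptionalFields,
  now: Date = new Date(),
): TrackUsageOptionalFields {
  if (fields.timestamp !== undefined) return fields;
  return { ...fields, timestamp: now.toISOString() };
}

const body = withTimestamp(
  {
    model: "example-model", // placeholder value
    provider: "example-provider", // placeholder value
    inputTokens: 120,
    outputTokens: 30,
    totalTokens: 150,
    latencyMs: 840,
    properties: { region: "eu", cached: false },
  },
  new Date(Date.UTC(2024, 0, 1)),
);

console.log(JSON.stringify(body, null, 2));
```

`Date.prototype.toISOString()` is used here because it produces a string in the date-time format that the `timestamp` field calls for.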
Track usage with enforcement
All-in-one endpoint for tracking usage with quota and rate limit enforcement.
This is the recommended endpoint for most use cases. It performs three operations in sequence:
Response Codes:

- 201 Created: Event tracked successfully, all limits within bounds
- 429 Too Many Requests: Rate limit or quota exceeded (check response body for details)

Rate Limit Headers: The response includes standard rate limit headers:

- X-RateLimit-Limit: Maximum requests allowed
- X-RateLimit-Remaining: Requests remaining in current window
- X-RateLimit-Reset: Unix timestamp when window resets
- Retry-After: Seconds to wait before retrying (on 429 responses)

Quota Headers (on quota exceeded): When a quota is exceeded, the response includes:

- Retry-After: Seconds to wait before retrying
- X-Quota-Reset: Unix timestamp when the quota period resets
- X-Quota-Period: Quota period (hour, day, week, month)
- X-Quota-Metric: Quota metric (total_tokens, total_events, total_cost_cents)

Dimension Matching: Quotas and rate limits are matched based on dimensions extracted from:

- customerId → customer_id
- eventType → event_type
- model → model
- provider → provider
- properties → All string values are included as dimensions
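
The dimension mapping and the response headers described on this page can be sketched on the client side as follows. This is an illustrative sketch, not the service's implementation: `extractDimensions`, `readLimitHeaders`, and the supporting types are hypothetical names. It assumes `customerId` and `eventType` are strings, that the Unix timestamps are in seconds, and that the presence of `X-Quota-Reset` signals an exceeded quota. The response body remains the documented source of details.

```typescript
// customerId and eventType are named in the dimension list but not declared
// in this section; typing them as optional strings is an assumption.
interface DimensionSource {
  customerId?: string;
  eventType?: string;
  model?: string;
  provider?: string;
  properties?: { [key: string]: unknown };
}

// Renames the fixed fields to their dimension keys and copies every
// string-valued property. Non-string property values are skipped.
function extractDimensions(src: DimensionSource): Record<string, string> {
  const dims: Record<string, string> = {};
  const props: { [key: string]: unknown } =
    src.properties === undefined ? {} : src.properties;
  for (const key of Object.keys(props)) {
    const value = props[key];
    if (typeof value === "string") dims[key] = value;
  }
  // Fixed fields are written last so they win on a key collision
  // (a choice made for this sketch, not documented behaviour).
  if (src.customerId !== undefined) dims["customer_id"] = src.customerId;
  if (src.eventType !== undefined) dims["event_type"] = src.eventType;
  if (src.model !== undefined) dims["model"] = src.model;
  if (src.provider !== undefined) dims["provider"] = src.provider;
  return dims;
}

type HeaderGetter = (name: string) => string | null;

interface LimitStatus {
  limit: number | undefined;
  remaining: number | undefined;
  resetAt: Date | undefined;
  retryAfterSeconds: number | undefined;
  quotaExceeded: boolean;
  quotaResetAt: Date | undefined;
  quotaPeriod: string | null;
  quotaMetric: string | null;
}

function toNumber(value: string | null): number | undefined {
  if (value === null || value.trim() === "") return undefined;
  const n = Number(value);
  return isFinite(n) ? n : undefined;
}

// Assumes the Unix timestamps are expressed in seconds.
function toDate(value: string | null): Date | undefined {
  const seconds = toNumber(value);
  return seconds === undefined ? undefined : new Date(seconds * 1000);
}

function readLimitHeaders(get: HeaderGetter): LimitStatus {
  return {
    limit: toNumber(get("X-RateLimit-Limit")),
    remaining: toNumber(get("X-RateLimit-Remaining")),
    resetAt: toDate(get("X-RateLimit-Reset")),
    retryAfterSeconds: toNumber(get("Retry-After")),
    // Inferred from header presence; an assumption of this sketch.
    quotaExceeded: get("X-Quota-Reset") !== null,
    quotaResetAt: toDate(get("X-Quota-Reset")),
    quotaPeriod: get("X-Quota-Period"),
    quotaMetric: get("X-Quota-Metric"),
  };
}

// Example values only; a real response supplies its own headers.
const demoHeaders: Record<string, string> = {
  "x-ratelimit-limit": "100",
  "x-ratelimit-remaining": "0",
  "x-ratelimit-reset": "1700000000",
  "retry-after": "30",
};
const demoStatus = readLimitHeaders((name) => {
  const value = demoHeaders[name.toLowerCase()];
  return value === undefined ? null : value;
});
const demoDims = extractDimensions({
  customerId: "cust_1",
  eventType: "completion",
  properties: { region: "eu", cached: false, retries: 2 },
});
console.log(demoStatus.retryAfterSeconds, demoDims);
```

With the Fetch API, the getter can be passed as `(name) => response.headers.get(name)`, which already treats header names case-insensitively.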