Toto

by Datadog · 2 versions, 5 parameter counts · Apache 2.0 · HuggingFace · Blog post
Tags: forecasting, multivariate, fine-tunable

Time-series foundation model family from Datadog. Trained on 2 trillion data points of real-world infrastructure, business, and IoT metrics. Toto 2.0 ships in five sizes from 4M to 2.5B parameters, so you can trade off forecast quality against memory and latency.

Quick start

With no flags, routeframe pull grabs the latest version and smallest parameter count:

$ routeframe pull toto

That pulls Toto 2.0 / 4M (~16 MB). To see everything available, run:

$ routeframe pull toto --list

Versions & sizes

Toto 2.0 (default)

Current generation. u-µP-scaled decoder with alternating time / variate attention and a quantile output head.

Params        Size (f32)  Pull command
4M (default)  16 MB       routeframe pull toto
22M           84 MB       routeframe pull toto --params 22m
313M          1.2 GB      routeframe pull toto --params 313m
1B            3.9 GB      routeframe pull toto --params 1b
2.5B          9.1 GB      routeframe pull toto --params 2.5b
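
The f32 sizes above roughly track four bytes per parameter. As a rule-of-thumb sketch (the helper name and the decimal-megabyte convention are assumptions for illustration, not part of routeframe), you can estimate a checkpoint's footprint before pulling it. Expect the listed sizes to differ somewhat from the estimate, since the nominal parameter counts are rounded.

```python
# Rule-of-thumb weight size: parameter count times bytes per parameter.
# Hypothetical helper for illustration; not part of routeframe.
BYTES_PER_PARAM = {"f32": 4, "f16": 2}

def estimate_size_mb(num_params: int, dtype: str = "f32") -> float:
    """Return the approximate weight size in decimal megabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1_000_000

# Nominal parameter counts taken from the table above.
for label, params in [("4M", 4_000_000), ("313M", 313_000_000)]:
    print(f"{label}: ~{estimate_size_mb(params):,.0f} MB at f32")
```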

Toto 1.0

Original Toto-Open-Base model. Mixture-of-Student-T output head. Supports multivariate inputs and exogenous covariates (via fine-tuning).

Params            Size                        Pull command
Open-Base (200M)  289 MB (f16), 577 MB (f32)  routeframe pull toto --version 1.0

Pre-v0.7.0 routeframe clients still default to this version, so existing installations are unaffected by the 2.0 release.

Architecture

Toto 2.0

Architecture: Decoder-only transformer with u-µP scaling
Attention: Alternating time / variate, partial RoPE + xPos on Q/K
Output head: Quantile knots at [0.1, 0.2, ..., 0.9]
Scaler: Patched causal std-mean + asinh
Default context: 512 timesteps
Patch size: 32
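
A quantile head returns one value per knot rather than a single point forecast. The sketch below assumes a hypothetical output layout (nine values per forecast step, ordered by knot) and shows how a median and an 80% interval could be read from it. The output format routeframe actually produces is not specified on this page, and the numbers are made up.

```python
# Quantile knots as listed for Toto 2.0: 0.1, 0.2, ..., 0.9.
KNOTS = [i / 10 for i in range(1, 10)]

def summarize_step(quantile_values):
    """Pick the median and the 10% to 90% band from one step's knot values.

    `quantile_values` is a hypothetical layout: nine numbers ordered by knot.
    """
    if len(quantile_values) != len(KNOTS):
        raise ValueError("expected one value per knot")
    by_knot = dict(zip(KNOTS, quantile_values))
    return {"median": by_knot[0.5], "lower": by_knot[0.1], "upper": by_knot[0.9]}

# Made-up values for a single forecast step.
step = [60.1, 61.4, 62.3, 63.0, 63.8, 64.5, 65.3, 66.4, 68.0]
print(summarize_step(step))
```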

Toto 1.0

Architecture: Decoder-only transformer (200M Open-Base)
Output head: Mixture of Student-T (24 components)
Scaler: Causal patch std-mean
Default context: 4096 timesteps
Patch size: 64
Multivariate: Yes (supports exogenous covariates via fine-tuning)
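
Both versions list a causal, patch-wise std-mean scaler, and Toto 2.0 adds an asinh step. The sketch below is one plausible reading of that description, not Datadog's implementation: each patch is standardized with the running mean and standard deviation of all points up to the end of that patch, so no future values leak in, and asinh optionally compresses large outliers. The function name, the `eps` term, and the exact choice of statistics are assumptions.

```python
import math

def causal_patch_scale(series, patch_size, apply_asinh=True, eps=1e-6):
    """Standardize each patch using only data seen up to the end of that patch."""
    if len(series) % patch_size != 0:
        raise ValueError("series length must be a multiple of patch_size")
    scaled = []
    count, total, total_sq = 0, 0.0, 0.0
    for start in range(0, len(series), patch_size):
        patch = series[start:start + patch_size]
        # Fold this patch into the running sums; later patches are never read.
        for x in patch:
            count += 1
            total += x
            total_sq += x * x
        mean = total / count
        variance = max(total_sq / count - mean * mean, 0.0)
        std = math.sqrt(variance) + eps  # eps guards against constant series
        for x in patch:
            z = (x - mean) / std
            scaled.append(math.asinh(z) if apply_asinh else z)
    return scaled

history = [45, 48, 52, 49, 55, 58, 62, 59]
print(causal_patch_scale(history, patch_size=4))
```

With the defaults listed above, a full context splits into 512 / 32 = 16 patches for Toto 2.0 and 4096 / 64 = 64 patches for Toto 1.0.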

Feature support

Feature                Toto 1.0  Toto 2.0
Zero-shot forecasting  Yes       Yes
Fine-tuning            Yes       No
Exogenous covariates*  Yes       No
Multivariate inputs    Yes       Yes

* If you need exogenous covariates, use Toto 1.0.

Run a forecast

$ routeframe run toto --input "45,48,52,49,55,58,62,59,64,67" --horizon 4
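
If your values live in a script rather than a shell, the `--input` string can be assembled programmatically. This sketch only builds the argument list for the command shown above, using the `--input` and `--horizon` flags from this page. The helper name is hypothetical, and nothing here assumes a particular output format.

```python
import shlex

def build_forecast_command(values, horizon, model="toto"):
    """Return the argv list for `routeframe run` with comma-separated input."""
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    joined = ",".join(str(v) for v in values)
    return ["routeframe", "run", model, "--input", joined, "--horizon", str(horizon)]

argv = build_forecast_command([45, 48, 52, 49, 55, 58, 62, 59, 64, 67], horizon=4)
print(shlex.join(argv))  # shell-safe rendering of the command
```

To execute it, pass `argv` to `subprocess.run(argv, check=True)`, assuming `routeframe` is on your PATH.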

See the CLI docs for monitoring, exogenous covariates, and fine-tuning.