API Reference

Start the REST server with `routeframe serve` (default: `localhost:11435`). All endpoints accept and return JSON.

POST /api/forecast

Run a time series forecast with a loaded model.

Request body

{
  "model": "toto",
  "input": [[20, 22, 19, 23, 21, 25, 24, 26, 27, 25]],
  "prediction_length": 4,
  "future_exogenous": [[0, 1, 1, 0]]
}

Field	Type	Required	Description
`model`	string	Yes	Model name: `toto`, `timesfm`, `chronos2`.
`input`	float[][]	Yes	Array of time series arrays. One inner array per variate.
`prediction_length`	int	No	Number of future steps to predict. Default: `1`.
`future_exogenous`	float[][]	No	Known future covariate values (one array per covariate, length = prediction_length).

Response

{
  "model": "toto",
  "mean": [[68.42, 71.05, 69.88, 72.31]],
  "inference_time_ms": 4
}

Field	Type	Description
`mean`	float[][]	Point predictions. One inner array per variate.
`inference_time_ms`	int	Wall-clock inference latency in milliseconds.
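The request rules above can be sketched as a small Python client using only the standard library. This is a minimal sketch, not an official client: the helper names `build_forecast_payload` and `forecast` are hypothetical, the `Content-Type` header is an assumption based on the endpoints accepting JSON, and the non-empty check on `input` is an extra client-side guard that this page does not require.

```python
import json
import urllib.request

BASE_URL = "http://localhost:11435"  # default address given at the top of this page


def build_forecast_payload(model, series, prediction_length=1, future_exogenous=None):
    """Assemble a /api/forecast body and check the documented shape rules."""
    # Assumption: an empty series is never useful, so reject it before sending.
    if not series or any(len(variate) == 0 for variate in series):
        raise ValueError("input needs at least one non-empty variate")
    payload = {"model": model, "input": series, "prediction_length": prediction_length}
    if future_exogenous is not None:
        # Documented rule: each covariate array has length == prediction_length.
        for covariate in future_exogenous:
            if len(covariate) != prediction_length:
                raise ValueError("each future_exogenous array needs prediction_length values")
        payload["future_exogenous"] = future_exogenous
    return payload


def forecast(payload, base_url=BASE_URL, timeout=30):
    """POST the payload to /api/forecast and return the decoded JSON response."""
    request = urllib.request.Request(
        base_url + "/api/forecast",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With a server running, `forecast(build_forecast_payload("toto", [[20, 22, 19, 23]], 4))["mean"]` would return one list of four predictions per variate.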

GET /api/tags

List all models that are downloaded and ready to use.

{
  "models": [
    {
      "name": "toto",
      "architecture": "toto2",
      "size_bytes": 16581280
    },
    {
      "name": "timesfm",
      "architecture": "timesfm",
      "size_bytes": 462817120
    }
  ]
}
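As a rough illustration of consuming this response, the sketch below fetches the list and converts `size_bytes` to MiB for display. The function names `list_models` and `summarize_models` are hypothetical, and the base URL is the default from the top of this page.

```python
import json
import urllib.request

BASE_URL = "http://localhost:11435"  # default address given at the top of this page


def list_models(base_url=BASE_URL, timeout=10):
    """GET /api/tags and return the decoded JSON body."""
    with urllib.request.urlopen(base_url + "/api/tags", timeout=timeout) as response:
        return json.load(response)


def summarize_models(tags_response):
    """Return (name, architecture, size in MiB) for each listed model."""
    return [
        (entry["name"], entry["architecture"], round(entry["size_bytes"] / (1024 * 1024), 1))
        for entry in tags_response["models"]
    ]
```

Calling `summarize_models(list_models())` against a running server yields one tuple per downloaded model.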

POST /api/pull

Download a model from the registry in the background. The model becomes available for inference once the download completes. When only `model` is given, the server pulls the latest version with the smallest parameter count.

// Request -- pull the default for this model
{ "model": "toto" }

// Request -- pin a specific version + parameter count + variant
{ "model": "toto", "version": "2.0", "params": "1b", "variant": "f32" }

// Response
{ "status": "started", "model": "toto" }

Field	Type	Required	Description
`model`	string	Yes	Model name: `toto`, `timesfm`, `chronos2`.
`version`	string	No	Model version (e.g. `1.0`, `2.0`). Defaults to the registry's `default_version`.
`params`	string	No	Parameter-count label (e.g. `4m`, `1b`, `open-base`). Defaults to the version's `default_params`.
`variant`	string	No	File variant: `f16` (half-precision) or `f32` (full precision). Defaults to `f16` when available, else `f32`.

Backward compatibility: the older two-field form (`{"model","variant"}`) continues to work and resolves to the model's legacy variants block. Toto 2.0 ships `f32` only; see the Toto model page for why.
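One way to build the pull request body is to send only the fields the caller sets, so the server applies its own defaults for the rest. The sketch below assumes that omitting a field is how a default is requested, which matches the request examples above. The names `build_pull_payload` and `pull` are hypothetical, and the variant check is a client-side convenience.

```python
import json
import urllib.request

BASE_URL = "http://localhost:11435"  # default address given at the top of this page
ALLOWED_VARIANTS = {"f16", "f32"}  # the two variants listed in the field table


def build_pull_payload(model, version=None, params=None, variant=None):
    """Return a /api/pull body that leaves unset fields to the server defaults."""
    if variant is not None and variant not in ALLOWED_VARIANTS:
        raise ValueError(f"variant must be one of {sorted(ALLOWED_VARIANTS)}")
    payload = {"model": model}
    for key, value in (("version", version), ("params", params), ("variant", variant)):
        if value is not None:
            payload[key] = value
    return payload


def pull(payload, base_url=BASE_URL, timeout=30):
    """POST the payload to /api/pull; the download continues in the background."""
    request = urllib.request.Request(
        base_url + "/api/pull",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Passing only `model` and `variant` produces the older two-field form described above.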

POST /api/finetune

Start a fine-tuning job in the background. The job runs asynchronously; poll `GET /api/finetune/status` to track progress.

// Request
{
  "output_model": "my-model",
  "data_path": "/path/to/data.csv",
  "targets": ["cpu_usage"],
  "exogenous": ["is_holiday"],
  "steps": 1400
}

// Response
{ "job_id": "ft-1", "status": "running" }

Field	Type	Description
`output_model`	string	Name for the saved fine-tuned model.
`data_path`	string	Absolute path to a CSV file with a timestamp column and target column(s).
`targets`	string[]	Column names to forecast.
`exogenous`	string[]	Optional known-future covariate column names.
`steps`	int	Training steps. Default: `1000`.

Check job status

GET /api/finetune/status?id=ft-1

{
  "job_id": "ft-1",
  "status": "completed",
  "output_model": "my-model",
  "error": null
}

Status values: running, completed, failed.
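The polling pattern can be sketched as a loop that stops on either terminal status. This is an illustration under assumptions: the 5-second interval and the poll limit are arbitrary choices rather than documented values, and `fetch_status` and `wait_for_job` are hypothetical names. The fetch and sleep functions are injectable so the loop can be exercised without a server.

```python
import json
import time
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:11435"  # default address given at the top of this page


def fetch_status(job_id, base_url=BASE_URL):
    """GET /api/finetune/status?id=<job_id> and return the decoded JSON body."""
    query = urllib.parse.urlencode({"id": job_id})
    url = f"{base_url}/api/finetune/status?{query}"
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def wait_for_job(job_id, fetch=fetch_status, interval=5.0, max_polls=720, sleep=time.sleep):
    """Poll until the job leaves 'running'; return the final status body."""
    for _ in range(max_polls):
        status = fetch(job_id)
        if status["status"] == "completed":
            return status
        if status["status"] == "failed":
            raise RuntimeError(f"fine-tune job {job_id} failed: {status.get('error')}")
        sleep(interval)  # still running: wait before asking again
    raise TimeoutError(f"job {job_id} still running after {max_polls} polls")
```

A caller would pass the `job_id` returned by `POST /api/finetune`, for example `wait_for_job("ft-1")`.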

POST /api/show

Return metadata for a specific model.

// Request
{ "model": "toto" }

// Response
{
  "name": "toto",
  "architecture": "toto2",
  "size_bytes": 16581280,
  "path": "/Users/you/.routeframe/models/toto.gguf"
}

DELETE /api/delete

Remove a downloaded model from disk.

// Request
{ "model": "toto" }

// Response
{ "status": "deleted", "model": "toto" }

GET /health

Health check. Returns 200 OK when the server is ready.

{ "status": "ok", "version": "0.7.3" }
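A readiness probe built on this endpoint might look like the sketch below. It treats any connection failure, non-200 response, or malformed body as "not ready"; that policy is an assumption rather than documented behavior, and `is_ready` is a hypothetical name. The `opener` parameter exists only so the check can be tested without a live server.

```python
import json
import urllib.request

BASE_URL = "http://localhost:11435"  # default address given at the top of this page


def is_ready(base_url=BASE_URL, opener=urllib.request.urlopen, timeout=2.0):
    """Return True when GET /health answers 200 with {"status": "ok"}."""
    try:
        with opener(base_url + "/health", timeout=timeout) as response:
            body = json.load(response)
            return response.status == 200 and body.get("status") == "ok"
    except (OSError, ValueError):
        # OSError covers refused connections and HTTP errors raised by urllib;
        # ValueError covers a body that is not valid JSON.
        return False
```

A startup script could call `is_ready()` in a loop before sending the first forecast request.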

Error responses

All endpoints return standard HTTP status codes. Error bodies follow this shape:

{
  "error": "model 'toto' not found — run: routeframe pull toto"
}

Status	Meaning
`400`	Bad request — missing or invalid fields in the request body.
`404`	Model or resource not found.
`500`	Inference or internal error.
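A client can surface these errors by reading the `error` field from the response body. The sketch below is one possible approach using the standard library: `ApiError`, `parse_error`, and `call` are hypothetical names, and the fallback to the raw body for non-JSON responses is an assumption about what a proxy or crash might return.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "http://localhost:11435"  # default address given at the top of this page


class ApiError(Exception):
    """Carries the HTTP status and the server's error message."""

    def __init__(self, status, message):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message


def parse_error(status, raw_body):
    """Build an ApiError from an error response, tolerating non-JSON bodies."""
    try:
        message = json.loads(raw_body)["error"]
    except (ValueError, KeyError, TypeError):
        message = raw_body.decode("utf-8", errors="replace")
    return ApiError(status, message)


def call(method, path, payload=None, base_url=BASE_URL, timeout=30):
    """Send a JSON request and return the decoded body, raising ApiError on 4xx/5xx."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        base_url + path,
        data=data,
        headers={"Content-Type": "application/json"},  # assumed header
        method=method,
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        raise parse_error(err.code, err.read()) from err
```

For example, `call("POST", "/api/show", {"model": "toto"})` would raise an `ApiError` with `status == 404` if the model has not been pulled.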