API Reference

Start the REST server with routeframe serve (default: localhost:11435). All endpoints accept and return JSON.

POST /api/forecast

Run a time series forecast with a loaded model.

Request body

{
  "model": "toto",
  "input": [[20, 22, 19, 23, 21, 25, 24, 26, 27, 25]],
  "prediction_length": 4,
  "future_exogenous": [[0, 1, 1, 0]]
}

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model name: toto, timesfm, chronos2. |
| input | float[][] | Yes | Array of time series arrays. One inner array per variate. |
| prediction_length | int | No | Number of future steps to predict. Default: 1. |
| future_exogenous | float[][] | No | Known future covariate values (one array per covariate, length = prediction_length). |

Response

{
  "model": "toto",
  "mean": [[68.42, 71.05, 69.88, 72.31]],
  "inference_time_ms": 4
}

| Field | Type | Description |
| --- | --- | --- |
| mean | float[][] | Point predictions. One inner array per variate. |
| inference_time_ms | int | Wall-clock inference latency in milliseconds. |
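
As a rough illustration of the request shape above, here is a minimal Python sketch that builds a POST /api/forecast call with the standard library. The function names, the `http://` scheme, and the `Content-Type` header are assumptions made for this example and are not taken from the routeframe documentation. The length check mirrors the rule that each future_exogenous array must have prediction_length entries.

```python
import json
import urllib.request

# Assumption: the default address from `routeframe serve`, reached over plain HTTP.
BASE_URL = "http://localhost:11435"


def build_forecast_request(model, series, prediction_length=1, future_exogenous=None):
    """Build (but do not send) a POST /api/forecast request. Hypothetical helper."""
    payload = {
        "model": model,
        "input": series,  # one inner array per variate
        "prediction_length": prediction_length,
    }
    if future_exogenous is not None:
        # Each covariate array must cover exactly the forecast horizon.
        for row in future_exogenous:
            if len(row) != prediction_length:
                raise ValueError(
                    "each future_exogenous array must have length prediction_length"
                )
        payload["future_exogenous"] = future_exogenous
    return urllib.request.Request(
        BASE_URL + "/api/forecast",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def forecast(model, series, prediction_length=1, future_exogenous=None, timeout=30):
    """Send the request and return the decoded JSON response body."""
    req = build_forecast_request(model, series, prediction_length, future_exogenous)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

With a server running, `forecast("toto", [[20, 22, 19, 23, 21, 25, 24, 26, 27, 25]], 4, [[0, 1, 1, 0]])["mean"]` would return the point predictions, one inner array per variate.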

GET /api/tags

List all models that are downloaded and ready to use.

{
  "models": [
    {
      "name": "toto",
      "architecture": "toto2",
      "size_bytes": 16581280
    },
    {
      "name": "timesfm",
      "architecture": "timesfm",
      "size_bytes": 462817120
    }
  ]
}
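
A short Python sketch of how a client might consume this response. `summarize_models` is a hypothetical helper, and the sample dictionary simply repeats the response shown above; nothing here is sent to a server.

```python
def summarize_models(tags_response):
    """Return (name, architecture, size in MiB) for each model in a /api/tags response."""
    rows = []
    for entry in tags_response.get("models", []):
        size_mib = round(entry["size_bytes"] / 2**20, 1)  # bytes -> mebibytes
        rows.append((entry["name"], entry["architecture"], size_mib))
    return rows


# Sample copied from the response above.
sample = {
    "models": [
        {"name": "toto", "architecture": "toto2", "size_bytes": 16581280},
        {"name": "timesfm", "architecture": "timesfm", "size_bytes": 462817120},
    ]
}

for name, architecture, size_mib in summarize_models(sample):
    print(f"{name:10s} {architecture:10s} {size_mib:8.1f} MiB")
```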

POST /api/pull

Download a model from the registry in the background. The model becomes available for inference once the download completes. When only model is given, the server pulls the latest version with the smallest parameter count.

// Request -- pull the default for this model
{ "model": "toto" }

// Request -- pin a specific version + parameter count + variant
{ "model": "toto", "version": "2.0", "params": "1b", "variant": "f32" }

// Response
{ "status": "started", "model": "toto" }

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model name: toto, timesfm, chronos2. |
| version | string | No | Model version (e.g. 1.0, 2.0). Defaults to the registry's default_version. |
| params | string | No | Parameter-count label (e.g. 4m, 1b, open-base). Defaults to the version's default_params. |
| variant | string | No | File variant: f16 (half precision) or f32 (full precision). Defaults to f16 when available, else f32. |

Backward compatibility: the older two-field form ({"model","variant"}) continues to work; it resolves to the model's legacy variants block. Toto 2.0 ships f32 only; see the Toto model page for why.
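
The optional fields can be assembled as in the following Python sketch. `build_pull_payload` is a hypothetical helper written for this example; it leaves out any field that is not set, so the server-side defaults described above apply, and it also produces the older two-field form when only model and variant are given.

```python
def build_pull_payload(model, version=None, params=None, variant=None):
    """Build the JSON body for POST /api/pull, omitting unset optional fields."""
    if variant is not None and variant not in ("f16", "f32"):
        raise ValueError("variant must be 'f16' or 'f32'")
    payload = {"model": model}
    for key, value in (("version", version), ("params", params), ("variant", variant)):
        if value is not None:
            payload[key] = value
    return payload


# Default pull: the server picks version, params, and variant.
print(build_pull_payload("toto"))
# Fully pinned pull, matching the second request example above.
print(build_pull_payload("toto", version="2.0", params="1b", variant="f32"))
```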

POST /api/finetune

Start a fine-tuning job in the background. The job runs asynchronously; poll /api/finetune/status with the returned job_id to track progress.

// Request
{
  "output_model": "my-model",
  "data_path": "/path/to/data.csv",
  "targets": ["cpu_usage"],
  "exogenous": ["is_holiday"],
  "steps": 1400
}

// Response
{ "job_id": "ft-1", "status": "running" }

| Field | Type | Description |
| --- | --- | --- |
| output_model | string | Name for the saved fine-tuned model. |
| data_path | string | Absolute path to a CSV file with a timestamp column and target column(s). |
| targets | string[] | Column names to forecast. |
| exogenous | string[] | Optional known-future covariate column names. |
| steps | int | Training steps. Default: 1000. |

Check job status

GET /api/finetune/status?id=ft-1

{
  "job_id": "ft-1",
  "status": "completed",
  "output_model": "my-model",
  "error": null
}

Status values: running, completed, failed.
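
Because the job is asynchronous, a client typically polls until the status leaves running. The Python sketch below shows one way to do that. The function names, the polling interval, the poll limit, and the `http://` base URL are all assumptions for this example; the `fetch` and `sleep` parameters are injectable so the loop can be exercised without a server.

```python
import json
import time
import urllib.parse
import urllib.request


def fetch_status(job_id, base_url="http://localhost:11435"):
    """GET /api/finetune/status?id=<job_id> and return the decoded JSON body."""
    query = urllib.parse.urlencode({"id": job_id})
    with urllib.request.urlopen(f"{base_url}/api/finetune/status?{query}", timeout=10) as resp:
        return json.load(resp)


def wait_for_job(job_id, fetch=fetch_status, interval_s=5.0, max_polls=720, sleep=time.sleep):
    """Poll until the job is completed or failed; give up after max_polls attempts."""
    for _ in range(max_polls):
        status = fetch(job_id)
        if status["status"] == "completed":
            return status
        if status["status"] == "failed":
            raise RuntimeError(f"job {job_id} failed: {status.get('error')}")
        sleep(interval_s)  # still running: wait before the next poll
    raise TimeoutError(f"job {job_id} still running after {max_polls} polls")
```

With a server running, `wait_for_job("ft-1")` would block until the job finishes and return the final status object, including output_model.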

POST /api/show

Return metadata for a specific model.

// Request
{ "model": "toto" }

// Response
{
  "name": "toto",
  "architecture": "toto2",
  "size_bytes": 16581280,
  "path": "/Users/you/.routeframe/models/toto.gguf"
}

DELETE /api/delete

Remove a downloaded model from disk.

// Request
{ "model": "toto" }

// Response
{ "status": "deleted", "model": "toto" }

GET /health

Health check. Returns 200 OK when the server is ready.

{ "status": "ok", "version": "0.7.3" }

Error responses

All endpoints return standard HTTP status codes. Error bodies follow this shape:

{
  "error": "model 'toto' not found — run: routeframe pull toto"
}

| Status | Meaning |
| --- | --- |
| 400 | Bad request: missing or invalid fields in the request body. |
| 404 | Model or resource not found. |
| 500 | Inference or internal error. |
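
One possible way to surface these errors in a client is sketched below in Python. `RouteframeError`, `parse_error`, and `call` are hypothetical names invented for this example; the sketch only assumes the error body shape shown above and falls back to the raw body text if the body is not JSON in that shape.

```python
import json
import urllib.error
import urllib.request


class RouteframeError(Exception):
    """Carries the HTTP status and the server's "error" message. Hypothetical class."""

    def __init__(self, status, message):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message


def parse_error(status, body):
    """Turn a status code and raw response body (bytes) into a RouteframeError."""
    try:
        message = json.loads(body).get("error", "")
    except (ValueError, AttributeError):
        # Not JSON, or JSON that is not an object: keep the raw text instead.
        message = body.decode("utf-8", "replace") if isinstance(body, bytes) else str(body)
    return RouteframeError(status, message)


def call(req, timeout=30):
    """Send a urllib request (or URL string); raise RouteframeError on 4xx/5xx."""
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as exc:
        raise parse_error(exc.code, exc.read()) from exc
```

A caller can then branch on `err.status`, for example treating 404 as a prompt to pull the model and 500 as a failure to report.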