
API Documentation

Complete API reference for the Distributed Compute Platform. Base URL: /api/v1

Authentication

Four authentication methods are used across different API areas:

Public API: X-API-Key: your-api-key
Internal API: X-Internal-Token: generated-token (persisted in data/internal_token)
Admin API: X-Admin-Key: admin-secret (from ADMIN_KEY env var)
SDK Downloads: X-SDK-Token: session-token (issued on device registration)
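The mapping from API area to header can be captured in a small helper. This is a minimal sketch: the header names come from the list above, while the function name `auth_headers` and the area labels (`public`, `internal`, `admin`, `sdk`) are illustrative and not part of the platform.

```python
# Header names are taken from the list above; the area labels are hypothetical.
AUTH_HEADER_BY_AREA = {
    "public": "X-API-Key",
    "internal": "X-Internal-Token",
    "admin": "X-Admin-Key",
    "sdk": "X-SDK-Token",
}


def auth_headers(area: str, secret: str) -> dict[str, str]:
    """Return the single authentication header expected by the given API area."""
    try:
        header = AUTH_HEADER_BY_AREA[area]
    except KeyError:
        raise ValueError(f"unknown API area: {area!r}") from None
    return {header: secret}
```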

Default API Keys:

Internal API

Internal endpoints for service-to-service communication. Authenticated via X-Internal-Token header.

GET /internal/uuid-usage Query per-UUID compute usage

Returns per-SDK-UUID compute usage stats (successful tasks only), bucketed by hour. Data covers a rolling 3-day window, and only hours with actual activity are included.

Query Parameters
Field Type Description
from required string Start hour in format YYYY-MM-DDTHH
to string End hour (defaults to current hour if omitted)
Response (200 OK)
{
  "from": "2026-03-15T21",
  "to": "2026-03-16T07",
  "data": [
    { "uuid": "sdk-python-c5d145", "hour": "2026-03-15T21", "cpu_ms": 0, "gpu_ms": 72917, "cpu_tasks": 0, "gpu_tasks": 1867 }
  ]
}
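The from and to values use an hour-granularity format, which is easy to get wrong by hand. The sketch below formats a datetime as YYYY-MM-DDTHH and builds the query string. It assumes the internal endpoints live under the same /api/v1 base URL and performs no time zone conversion, since the reference does not say which zone the hours are in; the function names are illustrative.

```python
from datetime import datetime
from typing import Optional
from urllib.parse import urlencode


def hour_bucket(moment: datetime) -> str:
    """Format a datetime as the YYYY-MM-DDTHH string used by from/to."""
    return moment.strftime("%Y-%m-%dT%H")


def uuid_usage_path(start: datetime, end: Optional[datetime] = None) -> str:
    """Build the request path; `to` is left out when no end is given,
    so the server falls back to the current hour."""
    params = {"from": hour_bucket(start)}
    if end is not None:
        params["to"] = hour_bucket(end)
    # Assumption: internal routes share the /api/v1 base URL.
    return "/api/v1/internal/uuid-usage?" + urlencode(params)
```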
GET /internal/uuid-usage/errors Query per-UUID error usage

Same format as above, but for failed tasks only. Tracked separately from successful task stats.

Health Endpoints

Health check endpoints do not require authentication.

GET /health Basic health check

Returns basic health status and version information.

Response (200 OK)
{
  "status": "healthy",
  "version": "1.1.0-abc1234",
  "timestamp": "2026-02-08T10:30:00.000Z"
}
GET /health/detailed Detailed health with service status

Returns detailed health information including all service statuses and resource counts.

Response (200 OK)
{
  "status": "healthy",
  "services": {
    "device_registry": { "status": "healthy", "devices": 5 },
    "resource_pool": { "status": "healthy", "available": 3 },
    "job_manager": { "status": "healthy", "active_jobs": 2 },
    "model_store": { "status": "healthy", "models": 4 },
    "websocket": { "status": "healthy", "connections": 5 }
  },
  "resources": { ... },
  "jobs": { ... }
}

Models

Upload, manage, and retrieve ML models. Models are automatically distributed to SDK devices when needed.

POST /models Upload a model

Upload an ONNX or TorchScript model file. Use multipart/form-data encoding. Specify built-in preprocessing and/or postprocessing adapters to enable server-side format conversion via AWS Lambda.

Request (multipart/form-data)
Field Type Description
model required file Model file (.onnx or .pt)
name string Model name (defaults to filename)
format string "onnx" or "torchscript"
input_shape json Input tensor shape, e.g., [1, 10]
output_shape json Output tensor shape, e.g., [1, 2]
input_schema json JSON Schema for input validation
output_schema json JSON Schema for output documentation
labels json Label names array, e.g., ["cat","dog"] — used by adapters
preprocessing string Built-in adapter: image_classification, tabular, text_tokens, or passthrough
postprocessing string Built-in adapter: top_k_labels, binary_classification, regression, or passthrough
Response (201 Created)
{
  "id": "model_abc12345",
  "name": "my-model",
  "format": "onnx",
  "size": 4096,
  "checksum": "sha256:...",
  "uploaded_at": "2026-02-08T10:30:00.000Z",
  "preprocessing": "image_classification",
  "postprocessing": "top_k_labels",
  "input_schema": { ... },
  "labels": ["cat", "dog"]
}
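When preparing the upload form, the json-typed fields need to be serialized and the adapter names checked against the built-in lists. The sketch below builds the text fields only (the model file itself is attached by whatever HTTP client is used). It assumes that json-typed fields are sent as JSON-encoded strings; the function name and its parameters are illustrative.

```python
import json
from typing import Optional, Sequence

# Adapter names as listed in the request table above.
PREPROCESSING_ADAPTERS = {"image_classification", "tabular", "text_tokens", "passthrough"}
POSTPROCESSING_ADAPTERS = {"top_k_labels", "binary_classification", "regression", "passthrough"}


def model_upload_fields(
    name: Optional[str] = None,
    *,
    input_shape: Optional[Sequence[int]] = None,
    labels: Optional[Sequence[str]] = None,
    preprocessing: Optional[str] = None,
    postprocessing: Optional[str] = None,
) -> dict[str, str]:
    """Return the non-file form fields for POST /models, skipping unset ones."""
    if preprocessing is not None and preprocessing not in PREPROCESSING_ADAPTERS:
        raise ValueError(f"unknown preprocessing adapter: {preprocessing!r}")
    if postprocessing is not None and postprocessing not in POSTPROCESSING_ADAPTERS:
        raise ValueError(f"unknown postprocessing adapter: {postprocessing!r}")

    fields: dict[str, str] = {}
    if name is not None:
        fields["name"] = name
    # Assumption: json-typed fields travel as JSON-encoded strings.
    if input_shape is not None:
        fields["input_shape"] = json.dumps(list(input_shape))
    if labels is not None:
        fields["labels"] = json.dumps(list(labels))
    if preprocessing is not None:
        fields["preprocessing"] = preprocessing
    if postprocessing is not None:
        fields["postprocessing"] = postprocessing
    return fields
```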
GET /models List all models

Returns all models owned by the authenticated user.

Response (200 OK)
{
  "models": [
    { "id": "model_abc", "name": "my-model", ... }
  ],
  "total": 1
}
GET /models/:id Get model details

Returns detailed information about a specific model.

DELETE /models/:id Delete a model

Deletes a model. Returns 204 No Content on success.

PATCH /models/:id/schema Update model schema and labels

Update input/output schema, labels, or adapter selections on an existing model without re-uploading. Only provided fields are modified.

Request Body
Field Type Description
input_schema object JSON Schema for input validation
output_schema object JSON Schema for output documentation
labels array Label names
preprocessing string Built-in preprocessing adapter name (or null)
postprocessing string Built-in postprocessing adapter name (or null)

Returns updated model info.

Inference

Submit inference requests in synchronous or asynchronous mode. Batch inference is also supported.

POST /inference Submit inference (sync or async)

Submit a single inference request. By default, the request runs synchronously and waits for the result. Use async mode to queue the job and return immediately.

If the model has an input_schema, the input is validated against it (returns 400 on mismatch). If the model has a postprocessing adapter (other than passthrough), the result is the business format produced by the dcmp-postprocess Lambda (e.g. { predictions: [...] }) instead of raw {data, shape}. When preprocessing is set, input is expected in the business format the adapter consumes.

Note: adapter conversion currently applies only to synchronous single inference. Asynchronous calls (async=true) and POST /inference/batch return raw {data, shape} regardless of adapter configuration.
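The rules above determine which result shape a client should expect. A minimal sketch of that decision, with a hypothetical function name and arguments, might look like this:

```python
from typing import Optional


def returns_business_format(
    postprocessing: Optional[str],
    *,
    is_async: bool = False,
    is_batch: bool = False,
) -> bool:
    """True when the result should be adapter output rather than raw {data, shape}.

    Mirrors the documented behaviour: adapters apply only to synchronous
    single inference, and "passthrough" leaves the raw tensor untouched.
    """
    if is_async or is_batch:
        return False
    return postprocessing is not None and postprocessing != "passthrough"
```

A client can use the result to pick a parser, for example reading `predictions` when it is true and `data`/`shape` otherwise.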

Request Body
Field Type Description
model_id required string ID of the model to use
input required object Input tensor: { "data": [...], "shape": [1, 10] }
async boolean Run asynchronously (default: false)
project_id string Associate with a project for tracking
options.timeout number Timeout in milliseconds
options.prefer_gpu boolean Prefer GPU devices

Async Mode: Set via query param ?async=true, header X-Async: true, or body field "async": true
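The three ways of requesting async mode can be compared side by side. The sketch below only assembles the request parts and does not send anything; the function name and the `via` argument are illustrative.

```python
def async_inference_parts(body: dict, via: str = "query") -> tuple[str, dict, dict]:
    """Return (path, headers, payload) for an async POST /inference call.

    `via` selects one of the three documented switches:
    "query" -> ?async=true, "header" -> X-Async: true, "body" -> "async": true.
    """
    path = "/api/v1/inference"
    headers = {"Content-Type": "application/json"}
    payload = dict(body)  # copy so the caller's dict is not modified

    if via == "query":
        path += "?async=true"
    elif via == "header":
        headers["X-Async"] = "true"
    elif via == "body":
        payload["async"] = True
    else:
        raise ValueError(f"via must be 'query', 'header', or 'body', got {via!r}")
    return path, headers, payload
```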

Sync Response (200 OK)
{
  "job_id": "infer_abc123",
  "status": "completed",
  "result": { "data": [0.8, 0.2], "shape": [1, 2] },
  "latency_ms": 45,
  "device_id": "dev_xyz789"
}
Async Response (202 Accepted)
{
  "job_id": "infer_abc123",
  "status": "queued",
  "message": "Inference job queued",
  "poll_url": "/api/v1/inference/infer_abc123"
}
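After a 202 response, the client typically polls poll_url until the job finishes. The sketch below shows one way to structure that loop with the HTTP call injected as a function, so it stays independent of any particular client. Only the "completed" status appears in this reference; the "failed" and "cancelled" names, the polling interval, and the function name are assumptions.

```python
import time
from typing import Callable

# "completed" is documented above; the other terminal names are assumptions.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(
    fetch_status: Callable[[], dict],
    *,
    interval_s: float = 1.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status() until the job reaches a terminal status.

    fetch_status should GET the poll_url and return the decoded JSON body.
    """
    for attempt in range(max_polls):
        job = fetch_status()
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if attempt < max_polls - 1:
            sleep(interval_s)
    raise TimeoutError(f"job not finished after {max_polls} polls")
```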
POST /inference/batch Submit batch inference

Submit multiple inference requests as a batch. Always runs asynchronously.

Adapter limitation: the batch route does not currently invoke preprocessing / postprocessing adapters. Inputs must be raw {data, shape} tensors and results are returned as raw tensors, even for models that declare adapter fields.

Request Body
Field Type Description
model_id required string ID of the model to use
inputs required array Array of input tensors
name string Batch job name
project_id string Project ID (auto-generated if not provided)
Response (202 Accepted)
{
  "job_id": "batch_abc123",
  "project_id": "batch_1707384600000",
  "status": "queued",
  "total_tasks": 100
}
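Because the batch route accepts only raw tensors, each input has to be wrapped as {data, shape} before submission. A minimal sketch, assuming every row is a flat list of numbers and the model expects shape [1, N] as in the examples in this reference (the function name is illustrative):

```python
from typing import Optional, Sequence


def batch_payload(
    model_id: str,
    rows: Sequence[Sequence[float]],
    *,
    name: Optional[str] = None,
    project_id: Optional[str] = None,
) -> dict:
    """Build the JSON body for POST /inference/batch from flat feature rows."""
    if not rows:
        raise ValueError("at least one input row is required")
    payload: dict = {
        "model_id": model_id,
        # Assumption: one sample per input, so the shape is [1, len(row)].
        "inputs": [{"data": list(row), "shape": [1, len(row)]} for row in rows],
    }
    if name is not None:
        payload["name"] = name
    if project_id is not None:
        payload["project_id"] = project_id
    return payload
```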
GET /inference/:jobId Get inference job status

Get the status and results of an inference job. For batch jobs, includes progress and partial results.

DELETE /inference/:jobId Cancel inference job

Cancel a running or queued inference job.

Training

Start and manage distributed training jobs across multiple devices.

POST /training Start training job

Start a distributed training job. The system automatically distributes batches across available devices and aggregates gradients.

Request Body
Field Type Description
model_id required string ID of the model to train
data_config required object Data configuration (see below)
training_config required object Training configuration (see below)
resource_config object Resource requirements
data_config
Field Type Description
type string "stream" (generated data) or "dataset"
total_samples number Total training samples
batch_size number Batch size per device
training_config
Field Type Description
epochs number Number of epochs
learning_rate number Learning rate (e.g., 0.001)
optimizer string "sgd" or "adam"
target_accuracy number Target accuracy for early stopping (e.g., 0.95)
sync_mode string FedAvg sync mode: "async" (default), "semi_sync", "full_sync"
local_steps number Batches before weight sync (default: 50)
sync_interval_ms number Weight pull interval in ms (default: 60000)
resource_config
Field Type Description
min_devices number Minimum devices required (default: 1)
max_devices number Maximum devices to use
prefer_gpu boolean Prefer GPU devices
sdk_types array Filter by SDK type, e.g. ["python"]
min_sdk_version string Minimum SDK version, e.g. "1.1.6"

Devices are allocated in proportion to job weight (total_samples * epochs). If no devices are free, the job stays queued until the scheduler rebalances and drains a device from an over-allocated job.
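To make the weighting concrete, the sketch below splits a pool of devices across jobs in proportion to total_samples * epochs. It is an illustration only: the scheduler's actual rounding and its handling of min_devices and max_devices are not described here, so the largest-remainder rounding and both function names are assumptions.

```python
def job_weight(total_samples: int, epochs: int) -> int:
    """Job weight as defined above: total_samples * epochs."""
    return total_samples * epochs


def allocate_devices(weights: dict[str, int], device_count: int) -> dict[str, int]:
    """Split device_count across jobs in proportion to their weights.

    Uses largest-remainder rounding (an assumption) so the shares are
    whole devices and always add up to device_count.
    """
    total = sum(weights.values())
    if total <= 0 or device_count <= 0:
        return {job: 0 for job in weights}

    # Integer arithmetic avoids floating-point rounding surprises.
    shares = {job: device_count * w // total for job, w in weights.items()}
    remainders = {job: device_count * w % total for job, w in weights.items()}

    leftover = device_count - sum(shares.values())
    by_remainder = sorted(weights, key=lambda job: remainders[job], reverse=True)
    for job in by_remainder[:leftover]:
        shares[job] += 1
    return shares
```

For instance, a job with 10000 samples and 10 epochs has weight 100000; next to a job of weight 20000, a pool of six devices would be split five to one under this sketch.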

Response (202 Accepted)
{
  "job_id": "train_abc123",
  "status": "queued",
  "estimated_devices": 3
}
GET /training/:jobId Get training status

Get detailed status of a training job including progress, loss, and assigned devices.

Response (200 OK)
{
  "job_id": "train_abc123",
  "status": "running",
  "model_id": "model_xyz",
  "config": { "epochs": 10, "learning_rate": 0.001 },
  "progress": {
    "current_epoch": 3,
    "total_epochs": 10,
    "batches_completed": 150,
    "total_batches": 500,
    "percent_complete": 30,
    "current_loss": 0.4523,
    "best_loss": 0.4102
  },
  "assigned_devices": ["dev_a", "dev_b", "dev_c"]
}
POST /training/:jobId/pause Pause training

Pause a running training job. Can be resumed later.

POST /training/:jobId/resume Resume training

Resume a paused training job.

DELETE /training/:jobId Cancel training

Cancel a training job. Stops all distributed training tasks.

Devices

View connected SDK devices and their status.

GET /devices List devices

Returns all connected devices with their status and capabilities.

Query Parameters
Field Type Description
status string Filter by status: "idle", "busy", "offline"
GET /devices/stats Get device statistics

Returns aggregated statistics about connected devices.

GET /devices/:id Get device details

Returns detailed information about a specific device including hardware, capabilities, and metrics.

GET /devices/:id/logs Pull logs from device

Requests logs from a connected SDK device via WebSocket. Returns the device's recent log entries.

Query Parameters
Field Type Description
lines number Number of log lines (default: 100)
level string Filter by level (default: "all")
timeout number Timeout in ms (default: 30000)

Projects

Group related inference or training tasks into projects for tracking and analytics.

POST /projects Create project

Create a new project for tracking tasks.

Request Body
Field Type Description
name string Project name
type string "realtime", "batch", "training", or "benchmark"
model_id string Associated model ID
total_tasks number Expected total tasks (for progress tracking)
GET /projects List projects

Returns all projects for the authenticated user.

GET /projects/:id Get project details

Returns full project details including all metrics.

GET /projects/:id/analytics Get project analytics

Returns computed analytics including latency percentiles, throughput data, and instance type comparisons.

POST /projects/:id/complete Mark project complete

Manually mark a realtime project as complete.

POST /projects/:id/cancel Cancel project

Cancel a project and all associated jobs.

Usage & Billing

Track compute usage for billing purposes, including CPU/GPU time, task counts, and data transfer.

GET /usage Customer usage summary

Returns total compute usage for the authenticated customer including GPU/CPU hours, task counts, and estimated cost.

GET /usage/projects/:id Project usage

Returns compute usage for a specific project.

GET /usage/hourly Hourly usage breakdown

Returns hourly usage data for time-series analysis.

Query Parameters
Field Type Description
hours number Hours to return (default: 24, max: 72)
GET /admin/usage All customer usage (admin)

Returns usage data for all customers. Requires X-Admin-Key header.

Efficiency Metrics

Training efficiency metrics: time-to-accuracy (TTA), scaling efficiency, and power consumption.

GET /efficiency Aggregate training stats

Returns aggregate training efficiency statistics.

Query Parameters
Field Type Description
instance_type string "aws", "user", or omit for both
GET /efficiency/comparison AWS vs user comparison

Returns side-by-side comparison of AWS instances vs user-contributed devices.

GET /efficiency/chart/:type Chart data

Returns chart-ready data. Supported values for :type are tta, scaling, power, and throughput.

GET /efficiency/active Active runs

Returns currently active training/inference runs with real-time metrics.

GET /efficiency/job/:jobId Per-job metrics

Detailed efficiency metrics for a specific job.

Dashboard Endpoints

Specialized endpoints for the web dashboard with aggregated data.

GET /dashboard/devices Device stats by platform

Returns device statistics grouped by platform (nodejs, python, windows).

GET /dashboard/jobs Recent jobs list

Returns jobs from the last 5 days, sorted by activity.

Query Parameters
Field Type Description
type string Filter by type: "inference" or "training"
GET /dashboard/projects Recent projects

Returns up to 50 most recent projects, active first.

GET /dashboard/projects/:id/stream SSE streaming for project

Server-Sent Events endpoint for real-time project updates. Pushes stats every 2 seconds by default; the interval parameter changes the rate.

Query Parameters
Field Type Description
format string "json" (default) or "text" (ANSI for CLI)
interval number Update interval in ms (default: 2000)
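A client consuming this stream needs to split it into events and decode each one. The sketch below follows the standard Server-Sent Events framing (data: lines, a blank line ending each event, lines starting with a colon treated as comments) and assumes format=json, so every event's data is a JSON document. The function name is illustrative, and the shape of the stats payload is not specified in this reference.

```python
import json
from typing import Iterable, Iterator


def iter_sse_json(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON object per SSE event from an iterable of text lines."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield json.loads("\n".join(buffer))
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment or keep-alive line
        field, _, value = line.partition(":")
        if field == "data":
            # A single leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
```

The lines can come from any HTTP client that exposes the response body line by line; an event still incomplete when the stream ends is dropped, as the SSE framing rules require.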
GET /dashboard/tasks Recent tasks

Returns individual task records across all projects.

Query Parameters
Field Type Description
limit number Max tasks to return (default: 100)
project_id string Filter by project
GET /dashboard/tasks/throughput Task throughput data

Returns time-bucketed throughput data for graphing.

Query Parameters
Field Type Description
project_id string Filter by project
bucket_size number Bucket size in seconds (default: 1)
GET /dashboard/efficiency Efficiency overview

Training efficiency overview with recent TTA, scaling, and power data.

GET /dashboard/comparison Global AWS vs user comparison

Training and inference comparison between AWS and user devices, formatted for dashboard charts.

GET /cluster/utilization Cluster utilization

Current cluster state: device allocation, CPU/GPU usage, and capacity metrics.

POST /dashboard/benchmark Run quick benchmark

Start a quick inference or training benchmark.

Request Body
Field Type Description
mode string "inference", "training", or "gpu-training"
epochs number Training epochs (training mode only)
batch_size number Batch size (training mode only)
POST /dashboard/benchmark/perf Run performance benchmark

Start a step-load performance benchmark with configurable RPS targets.

Request Body
Field Type Description
duration number Total duration in seconds (min: 60)
instance_type string "aws", "user", or "all"
rps_steps array RPS targets per step (default: [2,4,6,8,10,15,20,30,40,50])
max_concurrency number Max concurrent requests (default: 5)
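The constraints in this table can be checked on the client before the request is sent. A minimal sketch with a hypothetical function name; it sends the documented defaults explicitly and leaves instance_type out when it is not given, since no default is documented for it:

```python
from typing import Optional, Sequence

# Defaults as documented in the table above.
DEFAULT_RPS_STEPS = [2, 4, 6, 8, 10, 15, 20, 30, 40, 50]
INSTANCE_TYPES = {"aws", "user", "all"}


def perf_benchmark_payload(
    duration: int,
    instance_type: Optional[str] = None,
    rps_steps: Optional[Sequence[int]] = None,
    max_concurrency: int = 5,
) -> dict:
    """Build the JSON body for POST /dashboard/benchmark/perf."""
    if duration < 60:
        raise ValueError("duration must be at least 60 seconds")
    if instance_type is not None and instance_type not in INSTANCE_TYPES:
        raise ValueError(f"instance_type must be one of {sorted(INSTANCE_TYPES)}")

    payload: dict = {
        "duration": duration,
        "rps_steps": list(DEFAULT_RPS_STEPS if rps_steps is None else rps_steps),
        "max_concurrency": max_concurrency,
    }
    if instance_type is not None:
        payload["instance_type"] = instance_type
    return payload
```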
GET /dashboard/benchmark/perf/:id Get benchmark status

Get status and results of a performance benchmark.

Error Codes

HTTP Status Codes

200 Success
201 Created
202 Accepted (async operation started)
204 No Content (successful deletion)
400 Bad Request (validation error)
401 Unauthorized (missing/invalid API key)
403 Forbidden (not authorized for resource)
404 Not Found
500 Internal Server Error

Error Response Format

{
  "error": {
    "code": "ERROR_CODE",
    "message": "Human-readable error message"
  }
}
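Clients usually turn this envelope into an exception that carries the code and message. The sketch below shows one way to do so; the `ApiError` class, the `parse_error` function, and the "UNKNOWN" fallback code are illustrative and not part of the API.

```python
import json


class ApiError(Exception):
    """Carries the HTTP status plus the code and message from the error envelope."""

    def __init__(self, status: int, code: str, message: str) -> None:
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message


def parse_error(status: int, body: str) -> ApiError:
    """Decode an error response body into an ApiError.

    Falls back to a hypothetical "UNKNOWN" code when the body does not
    match the documented {"error": {"code", "message"}} format.
    """
    try:
        error = json.loads(body)["error"]
        return ApiError(status, error["code"], error["message"])
    except (ValueError, KeyError, TypeError):
        return ApiError(status, "UNKNOWN", body)
```

Callers can then branch on `err.code`, for example treating NO_DEVICES differently from VALIDATION_ERROR.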

Error Codes

AUTH_REQUIRED API key header is missing
AUTH_INVALID API key is invalid
NOT_FOUND Resource not found
FORBIDDEN Not authorized to access resource
VALIDATION_ERROR Request validation failed
NO_DEVICES No idle devices available
NO_TRAINING_DEVICES No devices with training capability
NO_MODEL Model not found or not loaded
CANNOT_CANCEL Job cannot be cancelled (already completed)
CANNOT_PAUSE Training cannot be paused
INTERNAL_ERROR Server error occurred

Complete Examples

Example 1: Simple Inference Flow

Upload a model and run inference

curl -X POST http://localhost:3000/api/v1/models \
  -H "X-API-Key: test-api-key" \
  -F "model=@my_model.onnx" \
  -F "name=my-classifier"

Response: {"id": "model_abc123", ...}

curl -X POST http://localhost:3000/api/v1/inference \
  -H "X-API-Key: test-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "model_abc123",
    "input": {
      "data": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
      "shape": [1, 10]
    }
  }'

Response: {"job_id": "infer_xyz", "result": {"data": [0.8, 0.2], "shape": [1, 2]}, "latency_ms": 45}

Example 2: Batch Inference with Project Tracking

Create a project and submit batch inference with progress tracking

# Create the project
curl -X POST http://localhost:3000/api/v1/projects \
  -H "X-API-Key: test-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Image Classification Batch",
    "type": "batch",
    "model_id": "model_abc123",
    "total_tasks": 100
  }'

# Submit the batch under that project (project ID shown here as proj_xyz)
curl -X POST http://localhost:3000/api/v1/inference/batch \
  -H "X-API-Key: test-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "model_abc123",
    "project_id": "proj_xyz",
    "inputs": [
      {"data": [...], "shape": [1, 10]},
      {"data": [...], "shape": [1, 10]}
    ]
  }'

# Check batch progress (job ID shown here as batch_123)
curl http://localhost:3000/api/v1/inference/batch_123 \
  -H "X-API-Key: test-api-key"

# Fetch project analytics
curl http://localhost:3000/api/v1/projects/proj_xyz/analytics \
  -H "X-API-Key: test-api-key"

Example 3: Distributed Training

Start a distributed training job and monitor progress

curl -X POST http://localhost:3000/api/v1/training \
  -H "X-API-Key: test-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "model_abc123",
    "data_config": {
      "type": "stream",
      "total_samples": 10000,
      "batch_size": 32
    },
    "training_config": {
      "epochs": 10,
      "learning_rate": 0.001,
      "optimizer": "adam"
    },
    "resource_config": {
      "min_devices": 2,
      "max_devices": 5,
      "prefer_gpu": true
    }
  }'

Response: {"job_id": "train_abc", "status": "queued", "estimated_devices": 3}

# Check training status
curl http://localhost:3000/api/v1/training/train_abc \
  -H "X-API-Key: test-api-key"

# Pause the job
curl -X POST http://localhost:3000/api/v1/training/train_abc/pause \
  -H "X-API-Key: test-api-key"

# Resume the job
curl -X POST http://localhost:3000/api/v1/training/train_abc/resume \
  -H "X-API-Key: test-api-key"

Example 4: Realtime Inference with Project

For continuous inference, associate requests with a project to track metrics, latency percentiles, and throughput

# Create a realtime project
curl -X POST http://localhost:3000/api/v1/projects \
  -H "X-API-Key: test-api-key" \
  -H "Content-Type: application/json" \
  -d '{"name": "Production API", "type": "realtime", "model_id": "model_abc"}'

# Send inference requests tagged with the project (project ID shown here as proj_production)
curl -X POST http://localhost:3000/api/v1/inference \
  -H "X-API-Key: test-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "model_abc",
    "project_id": "proj_production",
    "input": {"data": [...], "shape": [1, 10]}
  }'

# Fetch project analytics
curl http://localhost:3000/api/v1/projects/proj_production/analytics \
  -H "X-API-Key: test-api-key"

# Mark the project complete when finished
curl -X POST http://localhost:3000/api/v1/projects/proj_production/complete \
  -H "X-API-Key: test-api-key"

Distributed Compute Platform API v1.1.33