API Documentation

Complete API reference for the Distributed Compute Platform. Base URL: /api/v1

Authentication
Internal API
Health Endpoints
Models
Inference
Training
Devices
Projects
Usage & Billing
Efficiency Metrics
Dashboard
Error Codes
Examples

Authentication

Four authentication methods are used across different API areas:

Public API: X-API-Key: your-api-key
Internal API: X-Internal-Token: generated-token (persisted in data/internal_token)
Admin API: X-Admin-Key: admin-secret (from ADMIN_KEY env var)
SDK Downloads: X-SDK-Token: session-token (issued on device registration)
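
The mapping above can be captured in a small client-side helper. This is a minimal sketch: the header names come from the list, while the area labels ("public", "internal", "admin", "sdk") and the function name are hypothetical and chosen only for illustration.

```python
# Header names are taken from the list above. The area labels used as keys
# are hypothetical names for this sketch, not part of the API.
AUTH_HEADERS = {
    "public": "X-API-Key",
    "internal": "X-Internal-Token",
    "admin": "X-Admin-Key",
    "sdk": "X-SDK-Token",
}


def auth_header(area, credential):
    """Return the single auth header expected by the given API area."""
    if area not in AUTH_HEADERS:
        raise ValueError(f"unknown API area: {area!r}")
    return {AUTH_HEADERS[area]: credential}


print(auth_header("public", "test-api-key"))  # {'X-API-Key': 'test-api-key'}
```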

Default API Keys:

test-api-key - General testing
dashboard-api-key - Dashboard access
benchmark-api-key - Benchmark operations

Internal API

Internal endpoints for service-to-service communication. Authenticated via X-Internal-Token header.

GET /internal/uuid-usage Query per-UUID compute usage

Returns per-SDK-UUID compute usage stats for successful tasks only, bucketed by hour. Results cover a rolling 3-day window, and only hours with actual activity are included.

Query Parameters

Field	Type	Description
from required	string	Start hour in format `YYYY-MM-DDTHH`
to	string	End hour (defaults to current hour if omitted)

Response (200 OK)

{
  "from": "2026-03-15T21",
  "to": "2026-03-16T07",
  "data": [
    { "uuid": "sdk-python-c5d145", "hour": "2026-03-15T21", "cpu_ms": 0, "gpu_ms": 72917, "cpu_tasks": 0, "gpu_tasks": 1867 }
  ]
}
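
A short sketch of how a caller might build the hour strings and total the returned rows. The format string follows the documented `YYYY-MM-DDTHH` pattern; the helper names are hypothetical, and the sample is the response shown above. The documentation does not state which timezone the hours use, so the caller is assumed to pass a datetime already in the server's timezone.

```python
from collections import defaultdict
from datetime import datetime

HOUR_FORMAT = "%Y-%m-%dT%H"  # matches the documented YYYY-MM-DDTHH format


def hour_key(moment):
    """Format a datetime as the hour string used by `from` and `to`."""
    return moment.strftime(HOUR_FORMAT)


def gpu_ms_by_uuid(response):
    """Sum gpu_ms per SDK UUID across all hourly rows of a response body."""
    totals = defaultdict(int)
    for row in response["data"]:
        totals[row["uuid"]] += row["gpu_ms"]
    return dict(totals)


sample = {
    "from": "2026-03-15T21",
    "to": "2026-03-16T07",
    "data": [
        {"uuid": "sdk-python-c5d145", "hour": "2026-03-15T21",
         "cpu_ms": 0, "gpu_ms": 72917, "cpu_tasks": 0, "gpu_tasks": 1867},
    ],
}

print(hour_key(datetime(2026, 3, 15, 21, 30)))  # 2026-03-15T21
print(gpu_ms_by_uuid(sample))  # {'sdk-python-c5d145': 72917}
```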

GET /internal/uuid-usage/errors Query per-UUID error usage

Same format as above, but for failed tasks only. Tracked separately from successful task stats.

Health Endpoints

Health check endpoints do not require authentication.

GET /health Basic health check

Returns basic health status and version information.

Response (200 OK)

{
  "status": "healthy",
  "version": "1.1.0-abc1234",
  "timestamp": "2026-02-08T10:30:00.000Z"
}

GET /health/detailed Detailed health with service status

Returns detailed health information including all service statuses and resource counts.

Response (200 OK)

{
  "status": "healthy",
  "services": {
    "device_registry": { "status": "healthy", "devices": 5 },
    "resource_pool": { "status": "healthy", "available": 3 },
    "job_manager": { "status": "healthy", "active_jobs": 2 },
    "model_store": { "status": "healthy", "models": 4 },
    "websocket": { "status": "healthy", "connections": 5 }
  },
  "resources": { ... },
  "jobs": { ... }
}

Models

Upload, manage, and retrieve ML models. Models are automatically distributed to SDK devices when needed.

POST /models Upload a model

Upload an ONNX or TorchScript model file. Use multipart/form-data encoding. Specify built-in preprocessing and/or postprocessing adapters to enable server-side format conversion via AWS Lambda.

Request (multipart/form-data)

Field	Type	Description
model required	file	Model file (.onnx or .pt)
name	string	Model name (defaults to filename)
format	string	"onnx" or "torchscript"
input_shape	json	Input tensor shape, e.g., [1, 10]
output_shape	json	Output tensor shape, e.g., [1, 2]
input_schema	json	JSON Schema for input validation
output_schema	json	JSON Schema for output documentation
labels	json	Label names array, e.g., ["cat","dog"] — used by adapters
preprocessing	string	Built-in adapter: `image_classification`, `tabular`, `text_tokens`, or `passthrough`
postprocessing	string	Built-in adapter: `top_k_labels`, `binary_classification`, `regression`, or `passthrough`

Response (201 Created)

{
  "id": "model_abc12345",
  "name": "my-model",
  "format": "onnx",
  "size": 4096,
  "checksum": "sha256:...",
  "uploaded_at": "2026-02-08T10:30:00.000Z",
  "preprocessing": "image_classification",
  "postprocessing": "top_k_labels",
  "input_schema": { ... },
  "labels": ["cat", "dog"]
}
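
The following sketch shows one way to assemble the multipart body for an upload using only the Python standard library. It assumes that the `json`-typed fields are sent as JSON-encoded strings, which the field table suggests but does not state. The function names are hypothetical, and nothing is sent over the network here.

```python
import json
import uuid

JSON_FIELDS = {"input_shape", "output_shape", "input_schema", "output_schema", "labels"}


def model_form_fields(**fields):
    """Turn keyword arguments into form values, JSON-encoding the json-typed ones."""
    return {
        name: json.dumps(value) if name in JSON_FIELDS else str(value)
        for name, value in fields.items()
    }


def encode_multipart(fields, filename, file_bytes):
    """Return (body, content_type) for a multipart/form-data upload."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(f"--{boundary}\r\n".encode())
        parts.append(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        parts.append(f"{value}\r\n".encode())
    # The file goes in the required "model" field.
    parts.append(f"--{boundary}\r\n".encode())
    parts.append(
        f'Content-Disposition: form-data; name="model"; filename="{filename}"\r\n'.encode()
    )
    parts.append(b"Content-Type: application/octet-stream\r\n\r\n")
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


fields = model_form_fields(name="my-classifier", format="onnx",
                           input_shape=[1, 10], labels=["cat", "dog"])
body, content_type = encode_multipart(fields, "my_model.onnx", b"\x00fake-model-bytes")
print(fields["labels"])  # ["cat", "dog"]
```

The returned body and content type can be passed to any HTTP client as the request payload and the Content-Type header, together with the X-API-Key header.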

GET /models List all models

Returns all models owned by the authenticated user.

Response (200 OK)

{
  "models": [
    { "id": "model_abc", "name": "my-model", ... }
  ],
  "total": 1
}

GET /models/:id Get model details

Returns detailed information about a specific model.

DELETE /models/:id Delete a model

Deletes a model. Returns 204 No Content on success.

PATCH /models/:id/schema Update model schema and labels

Update input/output schema, labels, or adapter selections on an existing model without re-uploading. Only provided fields are modified.

Request Body

Field	Type	Description
input_schema	object	JSON Schema for input validation
output_schema	object	JSON Schema for output documentation
labels	array	Label names
preprocessing	string	Built-in preprocessing adapter name (or null)
postprocessing	string	Built-in postprocessing adapter name (or null)

Returns updated model info.

Inference

Submit inference requests in synchronous or asynchronous mode. Batch inference is also supported.

POST /inference Submit inference (sync or async)

Submit a single inference request. By default the call runs synchronously and waits for the result. Use async mode to queue the job and return immediately; the result is then retrieved by polling the returned poll_url.

If the model has an input_schema, the input is validated against it, and a mismatch returns 400. If the model has a postprocessing adapter other than passthrough, the result is the business format produced by the dcmp-postprocess Lambda (e.g. { predictions: [...] }) instead of the raw {data, shape} tensor. When preprocessing is set, the input must be in the business format that the adapter consumes.

Note: adapter conversion currently applies only to synchronous single inference. Asynchronous calls (async=true) and POST /inference/batch return raw {data, shape} regardless of adapter configuration.

Request Body

Field	Type	Description
model_id required	string	ID of the model to use
input required	object	Input tensor: `{ "data": [...], "shape": [1, 10] }`
async	boolean	Run asynchronously (default: false)
project_id	string	Associate with a project for tracking
options.timeout	number	Timeout in milliseconds
options.prefer_gpu	boolean	Prefer GPU devices

Async Mode: Set via query param ?async=true, header X-Async: true, or body field "async": true
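
The three ways of requesting async mode are interchangeable from the caller's side. The sketch below builds the request pieces for each variant without sending anything. The function name and the `async_via` parameter are hypothetical; the path, header, and field names are the ones documented above.

```python
import json
from urllib.parse import urlencode


def inference_request(model_id, data, shape, async_via=None):
    """Return (path, headers, body_json) for POST /inference.

    async_via selects how async mode is requested: "query", "header",
    "body", or None for the default synchronous call.
    """
    path = "/api/v1/inference"
    headers = {"Content-Type": "application/json"}
    body = {"model_id": model_id, "input": {"data": data, "shape": shape}}
    if async_via == "query":
        path += "?" + urlencode({"async": "true"})
    elif async_via == "header":
        headers["X-Async"] = "true"
    elif async_via == "body":
        body["async"] = True
    elif async_via is not None:
        raise ValueError(f"unknown async_via: {async_via!r}")
    return path, headers, json.dumps(body)


path, headers, body = inference_request("model_abc123", [0.8, 0.2], [1, 2], async_via="query")
print(path)  # /api/v1/inference?async=true
```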

Sync Response (200 OK)

{
  "job_id": "infer_abc123",
  "status": "completed",
  "result": { "data": [0.8, 0.2], "shape": [1, 2] },
  "latency_ms": 45,
  "device_id": "dev_xyz789"
}

Async Response (202 Accepted)

{
  "job_id": "infer_abc123",
  "status": "queued",
  "message": "Inference job queued",
  "poll_url": "/api/v1/inference/infer_abc123"
}
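
After a 202 response the caller typically polls GET /inference/:jobId until the job finishes. The loop below is a minimal sketch of that pattern. Only the "queued", "running", and "completed" statuses appear in this reference; "failed" and "cancelled" are assumed terminal states, and `fetch_status` is a hypothetical callable standing in for the HTTP call.

```python
import time

# Only "completed" is documented as a final status; "failed" and
# "cancelled" are assumptions made for this sketch.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def poll_job(fetch_status, job_id, interval_s=1.0, max_attempts=60, sleep=time.sleep):
    """Call fetch_status(job_id) until the job reaches a terminal status."""
    for _ in range(max_attempts):
        job = fetch_status(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        sleep(interval_s)
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} polls")


# Demo with canned responses instead of real HTTP calls.
canned = iter([
    {"job_id": "infer_abc123", "status": "queued"},
    {"job_id": "infer_abc123", "status": "running"},
    {"job_id": "infer_abc123", "status": "completed",
     "result": {"data": [0.8, 0.2], "shape": [1, 2]}},
])
final = poll_job(lambda job_id: next(canned), "infer_abc123", sleep=lambda s: None)
print(final["status"])  # completed
```

Passing `sleep` as a parameter keeps the loop testable without real delays.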

POST /inference/batch Submit batch inference

Submit multiple inference requests as a batch. Always runs asynchronously.

Adapter limitation: the batch route does not currently invoke preprocessing / postprocessing adapters. Inputs must be raw {data, shape} tensors and results are returned as raw tensors, even for models that declare adapter fields.

Request Body

Field	Type	Description
model_id required	string	ID of the model to use
inputs required	array	Array of input tensors
name	string	Batch job name
project_id	string	Project ID (auto-generated if not provided)

Response (202 Accepted)

{
  "job_id": "batch_abc123",
  "project_id": "batch_1707384600000",
  "status": "queued",
  "total_tasks": 100
}
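
Because the batch route accepts only raw tensors, it can help to check each input before submitting. This sketch assumes `data` is a flat list whose length equals the product of `shape`, as in the single-inference samples in this reference; the helper names are hypothetical.

```python
import math


def check_tensor(tensor):
    """Raise ValueError unless len(data) equals the product of shape."""
    expected = math.prod(tensor["shape"])
    actual = len(tensor["data"])
    if actual != expected:
        raise ValueError(f"shape {tensor['shape']} needs {expected} values, got {actual}")
    return tensor


def batch_body(model_id, inputs, name=None, project_id=None):
    """Build the JSON body for POST /inference/batch from raw tensors."""
    body = {"model_id": model_id, "inputs": [check_tensor(t) for t in inputs]}
    if name is not None:
        body["name"] = name
    if project_id is not None:
        body["project_id"] = project_id
    return body


inputs = [{"data": [0.1] * 10, "shape": [1, 10]} for _ in range(3)]
body = batch_body("model_abc123", inputs, name="demo-batch")
print(len(body["inputs"]))  # 3
```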

GET /inference/:jobId Get inference job status

Get the status and results of an inference job. For batch jobs, includes progress and partial results.

DELETE /inference/:jobId Cancel inference job

Cancel a running or queued inference job.

Training

Start and manage distributed training jobs across multiple devices.

POST /training Start training job

Start a distributed training job. The system automatically distributes batches across available devices and aggregates gradients.

Request Body

Field	Type	Description
model_id required	string	ID of the model to train
data_config required	object	Data configuration (see below)
training_config required	object	Training configuration (see below)
resource_config	object	Resource requirements

data_config

Field	Type	Description
type	string	"stream" (generated data) or "dataset"
total_samples	number	Total training samples
batch_size	number	Batch size per device

training_config

Field	Type	Description
epochs	number	Number of epochs
learning_rate	number	Learning rate (e.g., 0.001)
optimizer	string	"sgd" or "adam"
target_accuracy	number	Target accuracy for early stopping (e.g., 0.95)
sync_mode	string	FedAvg sync mode: "async" (default), "semi_sync", "full_sync"
local_steps	number	Batches before weight sync (default: 50)
sync_interval_ms	number	Weight pull interval in ms (default: 60000)

resource_config

Field	Type	Description
min_devices	number	Minimum devices required (default: 1)
max_devices	number	Maximum devices to use
prefer_gpu	boolean	Prefer GPU devices
sdk_types	array	Filter by SDK type, e.g. ["python"]
min_sdk_version	string	Minimum SDK version, e.g. "1.1.6"

Devices are allocated in proportion to job weight (total_samples * epochs). If no devices are free, the job stays queued until the scheduler rebalances and drains a device from an over-allocated job.
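
To make the weighting concrete, the sketch below splits a fixed pool of devices between jobs in proportion to total_samples * epochs. It is an illustration only: the rounding rule (largest remainder) is an assumption, and the real scheduler may also apply min_devices, max_devices, and other constraints not modeled here.

```python
def job_weight(total_samples, epochs):
    """Weight as described above: total_samples * epochs."""
    return total_samples * epochs


def proportional_shares(weights, total_devices):
    """Split total_devices across jobs in proportion to their weights.

    Uses largest-remainder rounding, which is an assumption of this sketch.
    """
    total = sum(weights.values())
    exact = {job: total_devices * w / total for job, w in weights.items()}
    shares = {job: int(x) for job, x in exact.items()}
    leftover = total_devices - sum(shares.values())
    by_remainder = sorted(exact, key=lambda job: exact[job] - shares[job], reverse=True)
    for job in by_remainder[:leftover]:
        shares[job] += 1
    return shares


weights = {
    "job_a": job_weight(10000, 10),  # 100000
    "job_b": job_weight(5000, 4),    # 20000
}
print(proportional_shares(weights, 6))  # {'job_a': 5, 'job_b': 1}
```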

Response (202 Accepted)

{
  "job_id": "train_abc123",
  "status": "queued",
  "estimated_devices": 3
}

GET /training/:jobId Get training status

Get detailed status of a training job including progress, loss, and assigned devices.

Response (200 OK)

{
  "job_id": "train_abc123",
  "status": "running",
  "model_id": "model_xyz",
  "config": { "epochs": 10, "learning_rate": 0.001 },
  "progress": {
    "current_epoch": 3,
    "total_epochs": 10,
    "batches_completed": 150,
    "total_batches": 500,
    "percent_complete": 30,
    "current_loss": 0.4523,
    "best_loss": 0.4102
  },
  "assigned_devices": ["dev_a", "dev_b", "dev_c"]
}
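
In the sample above, percent_complete matches batches_completed divided by total_batches (150 / 500 = 30%). The helper below reproduces that arithmetic on the client side. Whether the server derives the field this way is an inference from the sample, not something this reference states.

```python
def percent_from_batches(progress):
    """Recompute completion percentage from the batch counters."""
    return round(100 * progress["batches_completed"] / progress["total_batches"])


progress = {
    "current_epoch": 3,
    "total_epochs": 10,
    "batches_completed": 150,
    "total_batches": 500,
    "percent_complete": 30,
}
print(percent_from_batches(progress))  # 30
```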

POST /training/:jobId/pause Pause training

Pause a running training job. Can be resumed later.

POST /training/:jobId/resume Resume training

Resume a paused training job.

DELETE /training/:jobId Cancel training

Cancel a training job. Stops all distributed training tasks.

Devices

View connected SDK devices and their status.

GET /devices List devices

Returns all connected devices with their status and capabilities.

Query Parameters

Field	Type	Description
status	string	Filter by status: "idle", "busy", "offline"

GET /devices/stats Get device statistics

Returns aggregated statistics about connected devices.

GET /devices/:id Get device details

Returns detailed information about a specific device including hardware, capabilities, and metrics.

GET /devices/:id/logs Pull logs from device

Requests logs from a connected SDK device via WebSocket. Returns the device's recent log entries.

Query Parameters

Field	Type	Description
lines	number	Number of log lines (default: 100)
level	string	Filter by level (default: "all")
timeout	number	Timeout in ms (default: 30000)

Projects

Group related inference or training tasks into projects for tracking and analytics.

POST /projects Create project

Create a new project for tracking tasks.

Request Body

Field	Type	Description
name	string	Project name
type	string	"realtime", "batch", "training", or "benchmark"
model_id	string	Associated model ID
total_tasks	number	Expected total tasks (for progress tracking)

GET /projects List projects

Returns all projects for the authenticated user.

GET /projects/:id Get project details

Returns full project details including all metrics.

GET /projects/:id/analytics Get project analytics

Returns computed analytics including latency percentiles, throughput data, and instance type comparisons.

POST /projects/:id/complete Mark project complete

Manually mark a realtime project as complete.

POST /projects/:id/cancel Cancel project

Cancel a project and all associated jobs.

Usage & Billing

Track compute usage for billing purposes: CPU/GPU time, task counts, and data transfer.

GET /usage Customer usage summary

Returns total compute usage for the authenticated customer including GPU/CPU hours, task counts, and estimated cost.

GET /usage/projects/:id Project usage

Returns compute usage for a specific project.

GET /usage/hourly Hourly usage breakdown

Returns hourly usage data for time-series analysis.

Query Parameters

Field	Type	Description
hours	number	Hours to return (default: 24, max: 72)

GET /admin/usage All customer usage (admin)

Returns usage data for all customers. Requires X-Admin-Key header.

Efficiency Metrics

Training efficiency metrics: time-to-accuracy (TTA), scaling efficiency, and power consumption.

GET /efficiency Aggregate training stats

Returns aggregate training efficiency statistics.

Query Parameters

Field	Type	Description
instance_type	string	"aws", "user", or omit for both

GET /efficiency/comparison AWS vs user comparison

Returns side-by-side comparison of AWS instances vs user-contributed devices.

GET /efficiency/chart/:type Chart data

Returns chart-ready data. The :type path segment is one of tta, scaling, power, or throughput.

GET /efficiency/active Active runs

Returns currently active training/inference runs with real-time metrics.

GET /efficiency/job/:jobId Per-job metrics

Detailed efficiency metrics for a specific job.

Dashboard Endpoints

Specialized endpoints for the web dashboard with aggregated data.

GET /dashboard/devices Device stats by platform

Returns device statistics grouped by platform (nodejs, python, windows).

GET /dashboard/jobs Recent jobs list

Returns jobs from the last 5 days, sorted by activity.

Query Parameters

Field	Type	Description
type	string	Filter by type: "inference" or "training"

GET /dashboard/projects Recent projects

Returns up to 50 most recent projects, active first.

GET /dashboard/projects/:id/stream SSE streaming for project

Server-Sent Events endpoint for real-time project updates. Pushes stats every 2 seconds by default; use the interval parameter to change this.

Query Parameters

Field	Type	Description
format	string	"json" (default) or "text" (ANSI for CLI)
interval	number	Update interval in ms (default: 2000)
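
With format=json, a client can read the stream line by line and decode each event. The parser below is a minimal sketch of the standard Server-Sent Events framing (`data:` lines terminated by a blank line). The payload fields in the demo are hypothetical, since the event schema is not described in this reference.

```python
import json


def iter_sse_json(lines):
    """Yield the decoded JSON payload of each SSE event in an iterable of text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if buffer:
                yield json.loads("\n".join(buffer))
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            # The SSE format allows one optional space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)


# Demo with canned lines; "completed" is a hypothetical payload field.
stream = [
    'data: {"completed": 3}\n',
    "\n",
    ": keep-alive\n",
    'data: {"completed": 5}\n',
    "\n",
]
print(list(iter_sse_json(stream)))  # [{'completed': 3}, {'completed': 5}]
```

The same generator works on any iterable of decoded text lines, for example lines read from an open HTTP response.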

GET /dashboard/tasks Recent tasks

Returns individual task records across all projects.

Query Parameters

Field	Type	Description
limit	number	Max tasks to return (default: 100)
project_id	string	Filter by project

GET /dashboard/tasks/throughput Task throughput data

Returns time-bucketed throughput data for graphing.

Query Parameters

Field	Type	Description
project_id	string	Filter by project
bucket_size	number	Bucket size in seconds (default: 1)
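
As an illustration of what time bucketing means for bucket_size, the sketch below groups task completion times into fixed-width buckets and reports tasks per second for each. It is a client-side approximation under assumed inputs (timestamps in seconds); the server's actual response shape is not documented here.

```python
from collections import Counter


def bucket_throughput(timestamps, bucket_size=1):
    """Return [(bucket_start, tasks_per_second), ...] sorted by bucket start."""
    counts = Counter(int(t // bucket_size) * bucket_size for t in timestamps)
    return [(start, counts[start] / bucket_size) for start in sorted(counts)]


completions = [0.2, 0.7, 1.1, 3.9]  # hypothetical completion times in seconds
print(bucket_throughput(completions))                 # [(0, 2.0), (1, 1.0), (3, 1.0)]
print(bucket_throughput(completions, bucket_size=2))  # [(0, 1.5), (2, 0.5)]
```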

GET /dashboard/efficiency Efficiency overview

Training efficiency overview with recent TTA, scaling, and power data.

GET /dashboard/comparison Global AWS vs user comparison

Training and inference comparison between AWS and user devices, formatted for dashboard charts.

GET /cluster/utilization Cluster utilization

Current cluster state: device allocation, CPU/GPU usage, and capacity metrics.

POST /dashboard/benchmark Run quick benchmark

Start a quick inference or training benchmark.

Request Body

Field	Type	Description
mode	string	"inference", "training", or "gpu-training"
epochs	number	Training epochs (training mode only)
batch_size	number	Batch size (training mode only)

POST /dashboard/benchmark/perf Run performance benchmark

Start a step-load performance benchmark with configurable RPS targets.

Request Body

Field	Type	Description
duration	number	Total duration in seconds (min: 60)
instance_type	string	"aws", "user", or "all"
rps_steps	array	RPS targets per step (default: [2,4,6,8,10,15,20,30,40,50])
max_concurrency	number	Max concurrent requests (default: 5)
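
A small builder can apply the documented defaults and limits before the request is sent. The function name is hypothetical. The minimum duration, the allowed instance types, and the default rps_steps and max_concurrency values are taken from the table above.

```python
DEFAULT_RPS_STEPS = [2, 4, 6, 8, 10, 15, 20, 30, 40, 50]
INSTANCE_TYPES = {"aws", "user", "all"}


def perf_benchmark_body(duration, instance_type, rps_steps=None, max_concurrency=5):
    """Build the JSON body for POST /dashboard/benchmark/perf."""
    if duration < 60:
        raise ValueError("duration must be at least 60 seconds")
    if instance_type not in INSTANCE_TYPES:
        raise ValueError(f"instance_type must be one of {sorted(INSTANCE_TYPES)}")
    steps = list(DEFAULT_RPS_STEPS if rps_steps is None else rps_steps)
    return {
        "duration": duration,
        "instance_type": instance_type,
        "rps_steps": steps,
        "max_concurrency": max_concurrency,
    }


print(perf_benchmark_body(120, "aws", rps_steps=[5, 10]))
```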

GET /dashboard/benchmark/perf/:id Get benchmark status

Get status and results of a performance benchmark.

Error Codes

HTTP Status Codes

200 Success
201 Created
202 Accepted (async operation started)
204 No Content (successful deletion)
400 Bad Request (validation error)
401 Unauthorized (missing/invalid API key)
403 Forbidden (not authorized for resource)
404 Not Found
500 Internal Server Error

Error Response Format

{
  "error": {
    "code": "ERROR_CODE",
    "message": "Human-readable error message"
  }
}
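
Clients can turn this envelope into a typed exception so that callers branch on the code rather than on message text. The sketch below assumes error responses always use the format shown above and falls back to a placeholder code otherwise. `ApiError`, `raise_for_error`, and the "UNKNOWN" fallback are hypothetical names.

```python
import json


class ApiError(Exception):
    """Error raised for a non-2xx response carrying the documented envelope."""

    def __init__(self, status, code, message):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message


def raise_for_error(status, body_text):
    """Raise ApiError if status is 400 or above; otherwise do nothing."""
    if status < 400:
        return
    try:
        error = json.loads(body_text)["error"]
        code, message = error["code"], error["message"]
    except (ValueError, KeyError, TypeError):
        # Body was not the documented envelope; "UNKNOWN" is a placeholder.
        code, message = "UNKNOWN", body_text
    raise ApiError(status, code, message)


try:
    raise_for_error(401, '{"error": {"code": "AUTH_INVALID", "message": "API key is invalid"}}')
except ApiError as err:
    print(err.code)  # AUTH_INVALID
```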

Error Codes

AUTH_REQUIRED API key header is missing

AUTH_INVALID API key is invalid

NOT_FOUND Resource not found

FORBIDDEN Not authorized to access resource

VALIDATION_ERROR Request validation failed

NO_DEVICES No idle devices available

NO_TRAINING_DEVICES No devices with training capability

NO_MODEL Model not found or not loaded

CANNOT_CANCEL Job cannot be cancelled (already completed)

CANNOT_PAUSE Training cannot be paused

INTERNAL_ERROR Server error occurred

Complete Examples

Example 1: Simple Inference Flow

Upload a model and run inference

curl -X POST http://localhost:3000/api/v1/models \
  -H "X-API-Key: test-api-key" \
  -F "model=@my_model.onnx" \
  -F "name=my-classifier"

Response: {"id": "model_abc123", ...}

curl -X POST http://localhost:3000/api/v1/inference \
  -H "X-API-Key: test-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "model_abc123",
    "input": {
      "data": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
      "shape": [1, 10]
    }
  }'

Response: {"job_id": "infer_xyz", "result": {"data": [0.8, 0.2], "shape": [1, 2]}, "latency_ms": 45}

Example 2: Batch Inference with Project Tracking

Create a project and submit batch inference with progress tracking

curl -X POST http://localhost:3000/api/v1/projects \
  -H "X-API-Key: test-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Image Classification Batch",
    "type": "batch",
    "model_id": "model_abc123",
    "total_tasks": 100
  }'

curl -X POST http://localhost:3000/api/v1/inference/batch \
  -H "X-API-Key: test-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "model_abc123",
    "project_id": "proj_xyz",
    "inputs": [
      {"data": [...], "shape": [1, 10]},
      {"data": [...], "shape": [1, 10]}
    ]
  }'

curl http://localhost:3000/api/v1/inference/batch_123 \
  -H "X-API-Key: test-api-key"

curl http://localhost:3000/api/v1/projects/proj_xyz/analytics \
  -H "X-API-Key: test-api-key"

Example 3: Distributed Training

Start a distributed training job and monitor progress

curl -X POST http://localhost:3000/api/v1/training \
  -H "X-API-Key: test-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "model_abc123",
    "data_config": {
      "type": "stream",
      "total_samples": 10000,
      "batch_size": 32
    },
    "training_config": {
      "epochs": 10,
      "learning_rate": 0.001,
      "optimizer": "adam"
    },
    "resource_config": {
      "min_devices": 2,
      "max_devices": 5,
      "prefer_gpu": true
    }
  }'

Response: {"job_id": "train_abc", "status": "queued", "estimated_devices": 3}

curl http://localhost:3000/api/v1/training/train_abc \
  -H "X-API-Key: test-api-key"

curl -X POST http://localhost:3000/api/v1/training/train_abc/pause \
  -H "X-API-Key: test-api-key"

curl -X POST http://localhost:3000/api/v1/training/train_abc/resume \
  -H "X-API-Key: test-api-key"

Example 4: Realtime Inference with Project

For continuous inference, associate requests with a project to track metrics, latency percentiles, and throughput

curl -X POST http://localhost:3000/api/v1/projects \
  -H "X-API-Key: test-api-key" \
  -H "Content-Type: application/json" \
  -d '{"name": "Production API", "type": "realtime", "model_id": "model_abc"}'

curl -X POST http://localhost:3000/api/v1/inference \
  -H "X-API-Key: test-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "model_abc",
    "project_id": "proj_production",
    "input": {"data": [...], "shape": [1, 10]}
  }'

curl http://localhost:3000/api/v1/projects/proj_production/analytics \
  -H "X-API-Key: test-api-key"

curl -X POST http://localhost:3000/api/v1/projects/proj_production/complete \
  -H "X-API-Key: test-api-key"

Distributed Compute Platform API v1.1.33
