
ZiB Compute Layer

ZiB Compute extends the storage network with AI inference running directly on compute-capable nodes. All compute jobs operate on the encrypted file stored on the network — the node decrypts in memory, runs the model, encrypts the output, and stores the result back on the network. Neither the input file nor the AI output is ever stored in plaintext.

Two compute primitives are currently available: Transcription (speech-to-text) and Vision AI (scene analysis). Both can be requested standalone or combined with a video encoding job.

- Transcription: 3 tiers (fastest · recommended · accurate)
- Vision AI: 2 tiers (standard · hq)
- Output: encrypted, stored on the network
Compute nodes never see plaintext. The input file is decrypted in-memory only for the duration of inference. The resulting SRT/WebVTT or ZibSidecar JSON is encrypted before being stored on the network. Node operators cannot read file content or AI output.

How It Works

01 · Job Submission

You call POST /v1/api/compute/:objectId with a transcription or vision field in the body. The backend records the job and queues it for a compute-capable node.
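
Behind the SDK helpers, submission is a single HTTP call. The endpoint path below comes from this section; the request-body shape is an illustrative assumption rather than the official client internals.

```javascript
// Sketch of the raw compute request — endpoint path from the docs;
// the body shape (e.g. { transcription: 'recommended' }) is an assumption.
function buildComputeRequest(objectId, options) {
  return {
    url: `/v1/api/compute/${objectId}`,
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(options),
  };
}

const req = buildComputeRequest('my-object-id', { transcription: 'recommended' });
console.log(req.url); // → /v1/api/compute/my-object-id
```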

02 · Node Assignment

The scheduler assigns the job to an available compute node capable of running the requested tier. Nodes pull jobs from the queue over the ZiB mesh network.

03 · In-Memory Inference

The compute node fetches the encrypted file shards, reconstructs, decrypts in memory, runs the model, then immediately discards plaintext. The AES key is never written to disk on the node.

04 · Encrypted Output

The output (SRT subtitle file or ZibSidecar JSON) is encrypted with a fresh per-file key and stored redundantly on the storage network. The resulting file_id is recorded in the job record.

Available Tiers

Transcription

Option      | Speed              | Quality
fastest     | Highest throughput | Good — basic captions
recommended | Balanced (default) | Better — good for most content
accurate    | Slower             | Highest — long-form / professional

Vision AI

Option   | Speed   | Quality
standard | Default | Good — scene + tags + summary
hq       | Slower  | High — detailed analysis + nuance

Quick Start

Submit a transcription job for a file you have already uploaded to ZiB.

javascript
const storage = new ZiBStorage({
  accessKey: process.env.ZIB_ACCESS_KEY,
  secretKey: process.env.ZIB_SECRET_KEY,
});

// Submit transcription — fileId is the UUID from your upload response
const job = await storage.submitTranscription(fileId, 'recommended');
console.log('Job queued:', job.job_id);

// Poll until done (typically 10-60s depending on file length)
let status;
do {
  await new Promise(r => setTimeout(r, 3000));
  status = await storage.getComputeStatus(job.job_id);
} while (status.status !== 'complete' && status.status !== 'failed');

if (status.status === 'failed') {
  throw new Error(status.error_message);
}

// SRT file is stored encrypted on the ZiB network
const srtUrl = storage.getCdnUrl(status.srt_file_id);
console.log('Subtitle file:', srtUrl);

Tip: For video files that also need encoding, use startEncoding(fileId, { transcription: 'recommended' }) instead — transcription runs in parallel with encoding and subtitles are auto-embedded in the HLS stream.
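
The polling loop can be wrapped in a reusable helper so a stuck job doesn't poll forever. This is a sketch, not an SDK feature; the intervalMs and timeoutMs defaults are illustrative.

```javascript
// Generic poll helper (sketch) wrapping the do/while loop from the Quick Start.
// Throws if the job never reaches a terminal state within the timeout.
async function pollUntilDone(getStatus, { intervalMs = 3000, timeoutMs = 300000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const status = await getStatus();
    if (status.status === 'complete' || status.status === 'failed') return status;
    if (Date.now() >= deadline) throw new Error('compute job timed out');
    await new Promise(r => setTimeout(r, intervalMs));
  }
}

// Usage:
// const status = await pollUntilDone(() => storage.getComputeStatus(job.job_id));
```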

Combined with Encoding

Pass transcription and/or vision options to startEncoding() to run AI jobs in parallel with video transcoding. This is the most efficient path — a single API call, a single job to poll.

javascript
// Single call — encoding + transcription + vision in parallel
const { encoding_id } = await storage.startEncoding(fileId, {
  transcription: 'recommended',
  vision: 'standard',
});

// Poll the encoding job — it completes when ALL tasks are done
let status;
do {
  await new Promise(r => setTimeout(r, 3000));
  status = await storage.getEncodingStatus(encoding_id);
} while (status.status !== 'complete' && status.status !== 'failed');

// HLS manifest — subtitles already embedded as #EXT-X-MEDIA tracks
console.log('HLS:', status.hls_manifest_url);

// DASH manifest
console.log('DASH:', status.dash_manifest_url);

What happens in parallel:

- Video encoding: multi-quality HLS + DASH ladder (360p–2160p)
- Transcription: runs on a compute node, SRT stored encrypted
- Vision AI: runs on a compute node, ZibSidecar stored encrypted

Reading the Sidecar

After a Vision AI compute job completes, the sidecar_file_id in the job status response is the ZiB object ID of the encrypted ZibSidecar JSON. Fetch it the same way as any other ZiB file — the CDN decrypts on-the-fly and returns plain JSON.

javascript
// After compute job completes:
const status = await storage.getComputeStatus(jobId);

// Sidecar is a standard ZiB object — CDN decrypts on the fly
const sidecarUrl = storage.getCdnUrl(status.sidecar_file_id);
const sidecar = await fetch(sidecarUrl).then(r => r.json());

// Access any field directly
console.log(sidecar.scene_understanding.title_suggestion);
console.log(sidecar.content_classification.garm.brand_safety_score);
console.log(sidecar.chapters_youtube_format); // paste into YouTube description
console.log(sidecar.clip_suggestions[0].virality_score);

No special authentication is required to fetch the sidecar from the CDN. The file_id UUID acts as the access token, just like any other ZiB object. See the Vision AI docs for the complete ZibSidecar field reference and consumer examples.

Output Files

All compute outputs are encrypted and stored on the ZiB network, accessible via CDN URL using the file ID returned in the job status.

Transcription output (srt_file_id)

A standard SRT subtitle file. When transcription is requested alongside encoding, the subtitles are automatically embedded in the HLS manifest as an #EXT-X-MEDIA track. Any HLS player (hls.js, Video.js, AVPlayer) will pick them up automatically.

text
1
00:00:01,240 --> 00:00:04,800
Welcome to ZiB Network, the decentralised
encrypted storage platform.

2
00:00:05,100 --> 00:00:08,400
All files are encrypted client-side
before leaving the browser.
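
If you need the cues outside a player, an SRT file like the one above parses into the same { start, end, text } shape as the ZibSidecar subtitles array. A minimal sketch (ignores malformed blocks and styling tags):

```javascript
// Minimal SRT parser (sketch): converts SRT text into { start, end, text } cues.
function parseSrt(srt) {
  const toSeconds = (t) => {
    const [h, m, rest] = t.split(':');
    const [s, ms] = rest.split(',');
    return Number(h) * 3600 + Number(m) * 60 + Number(s) + Number(ms) / 1000;
  };
  return srt.trim().split(/\r?\n\s*\r?\n/).map((block) => {
    const lines = block.trim().split(/\r?\n/);
    const [start, end] = lines[1].split(' --> ').map(toSeconds);
    return { start, end, text: lines.slice(2).join('\n') };
  });
}
```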

Vision AI output (sidecar_file_id)

A ZibSidecar JSON file. See the Vision AI docs for the full schema reference.

json
{
  "transcript": "Full transcript text...",
  "subtitles": [{ "start": 1.24, "end": 4.8, "text": "..." }],
  "chapters": [{ "start": 0, "title": "Introduction" }],
  "scene_understanding": {
    "overall_description": "A screen recording...",
    "key_moments": [...]
  },
  "content_classification": {
    "tags": ["technology", "storage", "encryption"],
    "categories": ["software", "tutorial"]
  },
  "clip_suggestions": [{ "start": 12.5, "end": 27.0, "reason": "..." }],
  "thumbnail_candidates": [{ "timestamp": 8.3, "score": 0.92 }]
}
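
The chapters array above maps naturally onto YouTube's description timestamp format. The exact contents of chapters_youtube_format are not shown in these docs, so this converter is an illustrative assumption:

```javascript
// Sketch: format the sidecar's "chapters" array as YouTube-description
// timestamps (assumed to resemble chapters_youtube_format).
function chaptersToYouTube(chapters) {
  const fmt = (s) => {
    const h = Math.floor(s / 3600);
    const m = Math.floor((s % 3600) / 60);
    const sec = Math.floor(s % 60);
    const mm = h > 0 ? String(m).padStart(2, '0') : String(m);
    return (h > 0 ? `${h}:${mm}:` : `${mm}:`) + String(sec).padStart(2, '0');
  };
  return chapters.map((c) => `${fmt(c.start)} ${c.title}`).join('\n');
}

console.log(chaptersToYouTube([{ start: 0, title: 'Introduction' }])); // → 0:00 Introduction
```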