# Compute Node

The Compute Node executes AI models in secure sandboxed environments. It connects to Gateway Nodes, receives inference tasks, and returns signed results.

***

## Overview

```
┌─────────────────────────────────────────────────────────────────────────┐
│                            Compute Node                                  │
│                                                                         │
│   Gateway Connection                                                    │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │  Bidirectional gRPC Stream                                       │   │
│   │  ◀── Receive InferenceRequest                                    │   │
│   │  ──▶ Send NodeInfo (heartbeat)                                   │   │
│   │  ──▶ Send InferenceResponse                                      │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                           │                                             │
│                           ▼                                             │
│   Execution Manager                                                     │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐       │   │
│   │  │ Model Loader  │  │ Executor      │  │ Result Signer │       │   │
│   │  │ & Cache       │  │ Selector      │  │               │       │   │
│   │  └───────────────┘  └───────────────┘  └───────────────┘       │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                           │                                             │
│                           ▼                                             │
│   Sandbox Layer                                                         │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │  ┌─────────────────────────────────────────────────────────┐    │   │
│   │  │                    OS Sandbox                            │    │   │
│   │  │  • seccomp (syscall filtering)                          │    │   │
│   │  │  • Linux namespaces (PID, network, mount, IPC)          │    │   │
│   │  │  • Landlock (filesystem isolation)                       │    │   │
│   │  │  • Resource limits (CPU, memory, time)                   │    │   │
│   │  └─────────────────────────────────────────────────────────┘    │   │
│   │  ┌─────────────────────────────────────────────────────────┐    │   │
│   │  │                  Python Executor                         │    │   │
│   │  │  • Isolated Python process                              │    │   │
│   │  │  • File-based I/O                                       │    │   │
│   │  │  • Timeout enforcement                                   │    │   │
│   │  └─────────────────────────────────────────────────────────┘    │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                           │                                             │
│                           ▼                                             │
│   Model Storage                                                         │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │  ./models/                                                       │   │
│   │  ├── bark_semantic_model.pt                                      │   │
│   │  ├── bark_coarse_model.pt                                        │   │
│   │  └── bark_fine_model.pt                                          │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
```

***

## Features

* **Sandboxed Execution**: Multi-layer security isolation for model execution
* **Python Executor**: Runs PyTorch and Transformers models
* **Cryptographic Signing**: Signs all computation results
* **Model Caching**: Efficient model loading with TTL-based cache
* **Model Sync**: Auto-download models from subnets
* **Resource Limits**: CPU, memory, and time bounds

***

## Quick Start

### Build

```bash
cd ncn-network-v2-rs
cargo build --release -p compute_node
```

### Run

```bash
cargo run --release --bin compute_node -- \
  --gateway-addr http://127.0.0.1:50051 \
  --model-path ./models
```

**Expected Output**:

```
Starting Compute Node...
Loading wallet from environment...
✓ Wallet initialized: 0x742d35Cc...
Connecting to Gateway at http://127.0.0.1:50051...
✓ Connected to Gateway
✓ Registered with supported models: ["bark_semantic", "bark_coarse"]
✓ Compute Node ready for tasks
```

***

## Configuration

### Command Line Arguments

| Argument          | Required | Default    | Description                     |
| ----------------- | -------- | ---------- | ------------------------------- |
| `--gateway-addr`  | Yes      | -          | Gateway gRPC address            |
| `--model-path`    | No       | `./models` | Model files directory           |
| `--registry-addr` | No       | -          | P2P Registry address (for sync) |
| `--subnet-id`     | No       | `1`        | Subnet to join                  |
| `--sync-models`   | No       | `false`    | Enable model synchronization    |
| `--models-dir`    | No       | `./models` | Download directory              |

### Environment Variables

| Variable                    | Required | Default            | Description                               |
| --------------------------- | -------- | ------------------ | ----------------------------------------- |
| `COMPUTE_NODE_PRIVATE_KEY`  | Yes\*    | -                  | Node wallet private key                   |
| `NODE_WALLET_ADDRESS`       | No       | Derived            | Node wallet address                       |
| `SANDBOX_MODE`              | No       | `strict`           | Sandbox mode (strict/permissive/disabled) |
| `EXECUTION_TIMEOUT_SECS`    | No       | `300`              | Max execution time                        |
| `MAX_MEMORY_MB`             | No       | `4096`             | Max memory per execution                  |
| `ENABLE_EXECUTION_FALLBACK` | No       | `false`            | Enable fallback execution                 |
| `REQUIRE_PAYMENT`           | No       | `false`            | Require payment validation                |
| `PYTHON_EXECUTOR_PATH`      | No       | `python_executors` | Path to Python executors                  |

\*Auto-generated if not provided (development only)

See [Configuration Reference](https://github.com/NeuroChainAi/docs-guides/blob/main/components/compute-node/configuration.md) for complete details.

***

## Sandbox Security

The compute node uses multiple layers of security isolation:

### Security Layers

| Layer | Technology            | Protection                             |
| ----- | --------------------- | -------------------------------------- |
| 1     | **seccomp**           | Filters dangerous system calls         |
| 2     | **PID namespace**     | Isolates process visibility            |
| 3     | **Network namespace** | No external network access             |
| 4     | **Mount namespace**   | Isolated filesystem view               |
| 5     | **Landlock**          | Fine-grained filesystem access control |
| 6     | **Resource limits**   | Bounds CPU, memory, time               |

### Sandbox Modes

| Mode         | seccomp | Namespaces | Landlock | Limits | Use Case         |
| ------------ | ------- | ---------- | -------- | ------ | ---------------- |
| `strict`     | ✅       | ✅          | ✅        | ✅      | Production       |
| `permissive` | ✅       | ⚠️         | ⚠️       | ✅      | Testing          |
| `disabled`   | ❌       | ❌          | ❌        | ⚠️     | Development only |

### Blocked Operations

| Operation              | Protection                  |
| ---------------------- | --------------------------- |
| Network access         | Network namespace isolation |
| Read `/etc/passwd`     | Landlock filesystem rules   |
| Fork bomb              | `RLIMIT_NPROC` limit        |
| Memory exhaustion      | `RLIMIT_AS` limit           |
| CPU exhaustion         | `RLIMIT_CPU` limit          |
| Access other processes | PID namespace isolation     |

See [Sandbox Execution](https://github.com/NeuroChainAi/docs-guides/blob/main/components/compute-node/sandbox-execution.md) for a deep dive.

***

## Model Execution

### Supported Model Formats

| Format      | Extension      | Framework    | Auto-detect |
| ----------- | -------------- | ------------ | ----------- |
| TorchScript | `.pt`, `.pth`  | PyTorch      | ✅           |
| ONNX        | `.onnx`        | ONNX Runtime | ✅           |
| Safetensors | `.safetensors` | Multiple     | ✅           |

### Executor Scripts

Located in `compute_node/python_executors/`:

| Script                       | Purpose                  |
| ---------------------------- | ------------------------ |
| `generic_model_executor.py`  | General-purpose executor |
| `semantic_model_executor.py` | Bark semantic stage      |
| `coarse_model_executor.py`   | Bark coarse stage        |

### Execution Flow

```
1. Receive InferenceRequest
        │
        ▼
2. Determine model type
        │
        ▼
3. Select appropriate executor script
        │
        ▼
4. Prepare sandbox environment
   ├── Create temp directory
   ├── Copy executor script
   └── Apply security restrictions
        │
        ▼
5. Execute in sandbox
   ├── Write input to file
   ├── Run Python process
   ├── Enforce timeout
   └── Read output from file
        │
        ▼
6. Sign result
   ├── Hash output (SHA256)
   └── Sign with node wallet (secp256k1)
        │
        ▼
7. Return InferenceResponse
   ├── output_data
   ├── response_hash
   ├── response_signature
   └── completion_validation
```

See [Model Management](https://github.com/NeuroChainAi/docs-guides/blob/main/components/compute-node/model-management.md) and [Python Executors](https://github.com/NeuroChainAi/docs-guides/blob/main/components/compute-node/python-executors.md) for details.

***

## Signing

All computation results are cryptographically signed:

### Signature Components

| Field                   | Description                 |
| ----------------------- | --------------------------- |
| `response_hash`         | SHA256 hash of output\_data |
| `response_signature`    | secp256k1 signature         |
| `completion_validation` | Complete attestation        |

### Signed Message Format

```
SHA256(request_id || result_data || timestamp)
```

### Verification

```rust
// On gateway or validator
let message = format!("{}{}{}", request_id, result_data, timestamp);
let message_hash = sha256(&message);
let recovered_address = signature.recover(message_hash)?;
assert_eq!(recovered_address, compute_node_address);
```

***

## Model Synchronization

Compute nodes can auto-download models from subnets:

### Enable Sync

```bash
cargo run --bin compute_node -- \
  --gateway-addr http://127.0.0.1:50051 \
  --registry-addr http://127.0.0.1:50050 \
  --subnet-id 1 \
  --sync-models \
  --models-dir ./models
```

### Sync Process

1. Query P2P Registry for subnet metadata
2. Compare local models with subnet requirements
3. Download missing models
4. Verify model hashes
5. Extract executor scripts

***

## Metrics & Monitoring

### Execution Statistics

The compute node tracks:

* Total executions
* Successful / failed count
* Average execution time
* Model-specific statistics

### Logging

```bash
# Enable debug logging
RUST_LOG=debug cargo run --bin compute_node -- ...

# Filter to specific modules
RUST_LOG=compute_node::sandbox=trace cargo run --bin compute_node -- ...
```

***

## Troubleshooting

### Common Issues

**"Failed to connect to Gateway"**

```
Error: Failed to connect to Gateway at http://127.0.0.1:50051
```

* Verify Gateway is running
* Check `--gateway-addr` is correct

**"Model not found"**

```
Error: Model bark_semantic_model.pt not found at ./models/bark_semantic_model.pt
```

* Ensure model file exists in `--model-path`
* Enable `--sync-models` for auto-download

**"Sandbox execution failed"**

```
Error: Sandbox execution failed: Timeout after 300s
```

* Increase `EXECUTION_TIMEOUT_SECS`
* Check model complexity
* Use `SANDBOX_MODE=permissive` for debugging

**"Permission denied in sandbox"**

```
Error: Landlock: Access denied to /some/path
```

* Ensure all required files are in allowed paths
* Check Landlock rules in sandbox config

See [Compute Troubleshooting](https://github.com/NeuroChainAi/docs-guides/blob/main/troubleshooting/compute-issues.md) for more help.

***

## Python Environment Setup

### Create Virtual Environment

```bash
cd compute_node/python_executors

# Create environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Verify
python verify_environment.py
```

### Requirements

```
torch>=2.1.0
torchaudio>=2.1.0
transformers>=4.35.0
numpy>=1.24.3
scipy>=1.11.3
soundfile>=0.12.1
librosa>=0.10.1
```

***

## Related Documentation

* [Architecture](https://github.com/NeuroChainAi/docs-guides/blob/main/components/compute-node/architecture.md) - Internal architecture details
* [Sandbox Execution](https://github.com/NeuroChainAi/docs-guides/blob/main/components/compute-node/sandbox-execution.md) - Security isolation deep dive
* [Model Management](https://github.com/NeuroChainAi/docs-guides/blob/main/components/compute-node/model-management.md) - Model loading and caching
* [Python Executors](https://github.com/NeuroChainAi/docs-guides/blob/main/components/compute-node/python-executors.md) - Executor scripts
* [Configuration](https://github.com/NeuroChainAi/docs-guides/blob/main/components/compute-node/configuration.md) - Configuration options
