# Compute Node

The Compute Node executes AI models in secure sandboxed environments. It connects to Gateway Nodes, receives inference tasks, and returns signed results.

***

## Overview

```
┌─────────────────────────────────────────────────────────────────────────┐
│                            Compute Node                                  │
│                                                                         │
│   Gateway Connection                                                    │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │  Bidirectional gRPC Stream                                       │   │
│   │  ◀── Receive InferenceRequest                                    │   │
│   │  ──▶ Send NodeInfo (heartbeat)                                   │   │
│   │  ──▶ Send InferenceResponse                                      │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                           │                                             │
│                           ▼                                             │
│   Execution Manager                                                     │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐       │   │
│   │  │ Model Loader  │  │ Executor      │  │ Result Signer │       │   │
│   │  │ & Cache       │  │ Selector      │  │               │       │   │
│   │  └───────────────┘  └───────────────┘  └───────────────┘       │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                           │                                             │
│                           ▼                                             │
│   Sandbox Layer                                                         │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │  ┌─────────────────────────────────────────────────────────┐    │   │
│   │  │                    OS Sandbox                            │    │   │
│   │  │  • seccomp (syscall filtering)                          │    │   │
│   │  │  • Linux namespaces (PID, network, mount, IPC)          │    │   │
│   │  │  • Landlock (filesystem isolation)                       │    │   │
│   │  │  • Resource limits (CPU, memory, time)                   │    │   │
│   │  └─────────────────────────────────────────────────────────┘    │   │
│   │  ┌─────────────────────────────────────────────────────────┐    │   │
│   │  │                  Python Executor                         │    │   │
│   │  │  • Isolated Python process                              │    │   │
│   │  │  • File-based I/O                                       │    │   │
│   │  │  • Timeout enforcement                                   │    │   │
│   │  └─────────────────────────────────────────────────────────┘    │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                           │                                             │
│                           ▼                                             │
│   Model Storage                                                         │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │  ./models/                                                       │   │
│   │  ├── bark_semantic_model.pt                                      │   │
│   │  ├── bark_coarse_model.pt                                        │   │
│   │  └── bark_fine_model.pt                                          │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
```

***

## Features

* **Sandboxed Execution**: Multi-layer security isolation for model execution
* **Python Executor**: Runs PyTorch and Transformers models
* **Cryptographic Signing**: Signs all computation results
* **Model Caching**: Efficient model loading with TTL-based cache
* **Model Sync**: Auto-download models from subnets
* **Resource Limits**: CPU, memory, and time bounds

***

## Quick Start

### Build

```bash
cd ncn-network-v2-rs
cargo build --release -p compute_node
```

### Run

```bash
cargo run --release --bin compute_node -- \
  --gateway-addr http://127.0.0.1:50051 \
  --model-path ./models
```

**Expected Output**:

```
Starting Compute Node...
Loading wallet from environment...
✓ Wallet initialized: 0x742d35Cc...
Connecting to Gateway at http://127.0.0.1:50051...
✓ Connected to Gateway
✓ Registered with supported models: ["bark_semantic", "bark_coarse"]
✓ Compute Node ready for tasks
```

***

## Configuration

### Command Line Arguments

| Argument          | Required | Default    | Description                     |
| ----------------- | -------- | ---------- | ------------------------------- |
| `--gateway-addr`  | Yes      | -          | Gateway gRPC address            |
| `--model-path`    | No       | `./models` | Model files directory           |
| `--registry-addr` | No       | -          | P2P Registry address (for sync) |
| `--subnet-id`     | No       | `1`        | Subnet to join                  |
| `--sync-models`   | No       | `false`    | Enable model synchronization    |
| `--models-dir`    | No       | `./models` | Download directory              |

### Environment Variables

| Variable                    | Required | Default            | Description                               |
| --------------------------- | -------- | ------------------ | ----------------------------------------- |
| `COMPUTE_NODE_PRIVATE_KEY`  | Yes\*    | -                  | Node wallet private key                   |
| `NODE_WALLET_ADDRESS`       | No       | Derived            | Node wallet address                       |
| `SANDBOX_MODE`              | No       | `strict`           | Sandbox mode (strict/permissive/disabled) |
| `EXECUTION_TIMEOUT_SECS`    | No       | `300`              | Max execution time                        |
| `MAX_MEMORY_MB`             | No       | `4096`             | Max memory per execution                  |
| `ENABLE_EXECUTION_FALLBACK` | No       | `false`            | Enable fallback execution                 |
| `REQUIRE_PAYMENT`           | No       | `false`            | Require payment validation                |
| `PYTHON_EXECUTOR_PATH`      | No       | `python_executors` | Path to Python executors                  |

\*If unset, a key is auto-generated. Use this for development only.

See [Configuration Reference](https://github.com/NeuroChainAi/docs-guides/blob/main/components/compute-node/configuration.md) for complete details.
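
As a rough illustration of how the defaults in the environment variable table could be applied, the Python sketch below reads the same variables and falls back to the documented defaults. `NodeSettings`, `load_settings`, and the accepted boolean spellings are assumptions made for this example, not the node's actual configuration code. The two wallet variables are left out.

```python
import os
from dataclasses import dataclass

VALID_SANDBOX_MODES = {"strict", "permissive", "disabled"}


def _parse_bool(value: str) -> bool:
    # Assumption: these spellings count as "true"; the real parser may differ.
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class NodeSettings:
    sandbox_mode: str
    execution_timeout_secs: int
    max_memory_mb: int
    enable_execution_fallback: bool
    require_payment: bool
    python_executor_path: str


def load_settings(env=None) -> NodeSettings:
    """Read the documented variables from `env`, applying the table's defaults."""
    env = os.environ if env is None else env
    mode = env.get("SANDBOX_MODE", "strict")
    if mode not in VALID_SANDBOX_MODES:
        raise ValueError(f"unknown SANDBOX_MODE: {mode!r}")
    return NodeSettings(
        sandbox_mode=mode,
        execution_timeout_secs=int(env.get("EXECUTION_TIMEOUT_SECS", "300")),
        max_memory_mb=int(env.get("MAX_MEMORY_MB", "4096")),
        enable_execution_fallback=_parse_bool(env.get("ENABLE_EXECUTION_FALLBACK", "false")),
        require_payment=_parse_bool(env.get("REQUIRE_PAYMENT", "false")),
        python_executor_path=env.get("PYTHON_EXECUTOR_PATH", "python_executors"),
    )
```

Calling `load_settings({})` yields the defaults from the table, which makes it easy to compare a deployment's environment against them.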

***

## Sandbox Security

The Compute Node isolates model execution behind six layers of protection:

### Security Layers

| Layer | Technology            | Protection                             |
| ----- | --------------------- | -------------------------------------- |
| 1     | **seccomp**           | Filters dangerous system calls         |
| 2     | **PID namespace**     | Isolates process visibility            |
| 3     | **Network namespace** | No external network access             |
| 4     | **Mount namespace**   | Isolated filesystem view               |
| 5     | **Landlock**          | Fine-grained filesystem access control |
| 6     | **Resource limits**   | Bounds CPU, memory, time               |

### Sandbox Modes

| Mode         | seccomp | Namespaces | Landlock | Limits | Use Case         |
| ------------ | ------- | ---------- | -------- | ------ | ---------------- |
| `strict`     | ✅       | ✅          | ✅        | ✅      | Production       |
| `permissive` | ✅       | ⚠️         | ⚠️       | ✅      | Testing          |
| `disabled`   | ❌       | ❌          | ❌        | ⚠️     | Development only |

### Blocked Operations

| Operation              | Protection                  |
| ---------------------- | --------------------------- |
| Network access         | Network namespace isolation |
| Read `/etc/passwd`     | Landlock filesystem rules   |
| Fork bomb              | `RLIMIT_NPROC` limit        |
| Memory exhaustion      | `RLIMIT_AS` limit           |
| CPU exhaustion         | `RLIMIT_CPU` limit          |
| Access other processes | PID namespace isolation     |
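
To make the three `RLIMIT_*` rows concrete, here is a minimal Python sketch using the standard `resource` module. It is not the node's sandbox code: `build_limits`, `apply_limits`, and the process cap of 64 are hypothetical, and the memory and CPU values simply reuse the `MAX_MEMORY_MB` and `EXECUTION_TIMEOUT_SECS` defaults.

```python
import resource


def build_limits(max_memory_mb: int, timeout_secs: int, max_processes: int = 64) -> dict:
    """Map each rlimit named in the table to the value to apply."""
    return {
        resource.RLIMIT_AS: max_memory_mb * 1024 * 1024,  # bytes of address space
        resource.RLIMIT_CPU: timeout_secs,                # seconds of CPU time
        resource.RLIMIT_NPROC: max_processes,             # processes for this user
    }


def apply_limits(limits: dict) -> None:
    """Set soft and hard limits; meant to run in the child before exec."""
    for which, value in limits.items():
        _, hard = resource.getrlimit(which)
        if hard != resource.RLIM_INFINITY:
            # An unprivileged process cannot raise its hard limit.
            value = min(value, hard)
        resource.setrlimit(which, (value, value))
```

A caller could pass `preexec_fn=lambda: apply_limits(build_limits(4096, 300))` to `subprocess.Popen` so that only the child is restricted. `RLIMIT_CPU` counts CPU time rather than wall-clock time, so a separate wall-clock timeout is still needed.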

See [Sandbox Execution](https://github.com/NeuroChainAi/docs-guides/blob/main/components/compute-node/sandbox-execution.md) for a deep dive.

***

## Model Execution

### Supported Model Formats

| Format      | Extension      | Framework    | Auto-detect |
| ----------- | -------------- | ------------ | ----------- |
| TorchScript | `.pt`, `.pth`  | PyTorch      | ✅           |
| ONNX        | `.onnx`        | ONNX Runtime | ✅           |
| Safetensors | `.safetensors` | Multiple     | ✅           |
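
One simple way to auto-detect the format is to look at the file extension. The sketch below does only that. Whether the node also inspects file contents is not stated here, so treat `detect_format` and its return values as hypothetical.

```python
from pathlib import Path
from typing import Optional

# Extension-to-format mapping taken from the table above.
FORMAT_BY_EXTENSION = {
    ".pt": "torchscript",
    ".pth": "torchscript",
    ".onnx": "onnx",
    ".safetensors": "safetensors",
}


def detect_format(model_file: str) -> Optional[str]:
    """Return the format name for a model file, or None if the extension is unknown."""
    return FORMAT_BY_EXTENSION.get(Path(model_file).suffix.lower())
```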

### Executor Scripts

Located in `compute_node/python_executors/`:

| Script                       | Purpose                  |
| ---------------------------- | ------------------------ |
| `generic_model_executor.py`  | General-purpose executor |
| `semantic_model_executor.py` | Bark semantic stage      |
| `coarse_model_executor.py`   | Bark coarse stage        |

### Execution Flow

```
1. Receive InferenceRequest
        │
        ▼
2. Determine model type
        │
        ▼
3. Select appropriate executor script
        │
        ▼
4. Prepare sandbox environment
   ├── Create temp directory
   ├── Copy executor script
   └── Apply security restrictions
        │
        ▼
5. Execute in sandbox
   ├── Write input to file
   ├── Run Python process
   ├── Enforce timeout
   └── Read output from file
        │
        ▼
6. Sign result
   ├── Hash output (SHA256)
   └── Sign with node wallet (secp256k1)
        │
        ▼
7. Return InferenceResponse
   ├── output_data
   ├── response_hash
   ├── response_signature
   └── completion_validation
```
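
Step 5 of the flow (file-based I/O with a timeout) can be sketched in Python as follows. The sketch leaves out the sandbox restrictions from step 4. The JSON file names and the `script INPUT OUTPUT` calling convention are assumptions for this example, not the executors' documented interface.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path


def run_executor(script: Path, payload: dict, timeout_secs: float) -> dict:
    """Run `script INPUT OUTPUT` in a fresh temp directory and return its JSON output."""
    script = script.resolve()  # the child runs with a different working directory
    with tempfile.TemporaryDirectory() as workdir:
        input_path = Path(workdir) / "input.json"
        output_path = Path(workdir) / "output.json"
        # Write input to file.
        input_path.write_text(json.dumps(payload))
        # Run the Python process. On timeout, subprocess.run kills the child
        # and raises subprocess.TimeoutExpired.
        subprocess.run(
            [sys.executable, str(script), str(input_path), str(output_path)],
            cwd=workdir,
            timeout=timeout_secs,
            check=True,
        )
        # Read output from file.
        return json.loads(output_path.read_text())
```

A non-zero exit raises `subprocess.CalledProcessError`, so the caller can tell a failed execution from a timed-out one.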

See [Model Management](https://github.com/NeuroChainAi/docs-guides/blob/main/components/compute-node/model-management.md) and [Python Executors](https://github.com/NeuroChainAi/docs-guides/blob/main/components/compute-node/python-executors.md) for details.

***

## Signing

Every computation result is hashed with SHA256 and signed with the node wallet's secp256k1 key:

### Signature Components

| Field                   | Description                  |
| ----------------------- | ---------------------------- |
| `response_hash`         | SHA256 hash of `output_data` |
| `response_signature`    | secp256k1 signature          |
| `completion_validation` | Complete attestation         |

### Signed Message Format

```
SHA256(request_id || result_data || timestamp)
```
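
As a minimal sketch, the hash can be computed with Python's `hashlib`. The encoding is an assumption: `request_id` as UTF-8, `result_data` as raw bytes, and `timestamp` as its decimal string, mirroring the string concatenation in the Verification snippet. Signing is omitted because secp256k1 is not in the Python standard library.

```python
import hashlib


def message_hash(request_id: str, result_data: bytes, timestamp: int) -> bytes:
    """SHA256 over request_id || result_data || timestamp (assumed encoding)."""
    hasher = hashlib.sha256()
    hasher.update(request_id.encode("utf-8"))
    hasher.update(result_data)
    hasher.update(str(timestamp).encode("ascii"))
    return hasher.digest()
```

Because the fields are concatenated without delimiters, different field boundaries can produce the same hash. For example, `("ab", b"c", 1)` and `("a", b"bc", 1)` give identical digests.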

### Verification

```rust
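// On gateway or validator.
// Illustrative fragment: `sha256` and `signature.recover` stand in for the
// hashing and secp256k1 recovery helpers available to the verifier.
let message = format!("{}{}{}", request_id, result_data, timestamp);
let message_hash = sha256(&message);
let recovered_address = signature.recover(message_hash)?;
// The result is valid only if the recovered signer is the expected node.
assert_eq!(recovered_address, compute_node_address);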
```

***

## Model Synchronization

With `--sync-models` enabled, a Compute Node automatically downloads the models its subnet requires:

### Enable Sync

```bash
cargo run --bin compute_node -- \
  --gateway-addr http://127.0.0.1:50051 \
  --registry-addr http://127.0.0.1:50050 \
  --subnet-id 1 \
  --sync-models \
  --models-dir ./models
```

### Sync Process

1. Query P2P Registry for subnet metadata
2. Compare local models with subnet requirements
3. Download missing models
4. Verify model hashes
5. Extract executor scripts
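
Steps 2 to 4 can be sketched as a comparison between the local directory and the subnet's requirements. The manifest shape (file name mapped to an expected SHA256 hex digest) and the helper names are assumptions, since this page does not specify the subnet metadata format or the hash algorithm.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large models are not loaded into memory."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def models_to_download(models_dir: Path, required: dict) -> list:
    """Return required file names that are missing locally or fail hash verification."""
    pending = []
    for name, expected_hash in sorted(required.items()):
        local = models_dir / name
        if not local.is_file() or file_sha256(local) != expected_hash:
            pending.append(name)
    return pending
```

Treating a hash mismatch the same as a missing file means a corrupted or partial download is fetched again on the next sync.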

***

## Metrics & Monitoring

### Execution Statistics

The compute node tracks:

* Total executions
* Successful / failed count
* Average execution time
* Model-specific statistics
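
A tracker for these counters could be as small as the following sketch. The class names and the choice to average over both successful and failed runs are assumptions, not the node's actual metrics code.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Counters:
    successes: int = 0
    failures: int = 0
    total_secs: float = 0.0

    @property
    def total(self) -> int:
        return self.successes + self.failures

    @property
    def average_secs(self) -> float:
        # Average over all executions, successful or not.
        return self.total_secs / self.total if self.total else 0.0


@dataclass
class ExecutionStats:
    # One Counters entry per model name, created on first use.
    per_model: dict = field(default_factory=lambda: defaultdict(Counters))

    def record(self, model: str, ok: bool, elapsed_secs: float) -> None:
        counters = self.per_model[model]
        if ok:
            counters.successes += 1
        else:
            counters.failures += 1
        counters.total_secs += elapsed_secs

    def overall(self) -> Counters:
        """Combine the per-model counters into node-wide totals."""
        combined = Counters()
        for counters in self.per_model.values():
            combined.successes += counters.successes
            combined.failures += counters.failures
            combined.total_secs += counters.total_secs
        return combined
```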

### Logging

```bash
# Enable debug logging
RUST_LOG=debug cargo run --bin compute_node -- ...

# Filter to specific modules
RUST_LOG=compute_node::sandbox=trace cargo run --bin compute_node -- ...
```

***

## Troubleshooting

### Common Issues

**"Failed to connect to Gateway"**

```
Error: Failed to connect to Gateway at http://127.0.0.1:50051
```

* Verify Gateway is running
* Check `--gateway-addr` is correct
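
A quick way to run the first check is to test whether anything accepts TCP connections at the configured address. The helper below is a hypothetical sketch. It confirms only that the port is open, not that a healthy gRPC Gateway is listening on it.

```python
import socket
from urllib.parse import urlsplit


def gateway_reachable(gateway_addr: str, timeout_secs: float = 3.0) -> bool:
    """Return True if a TCP connection to the address succeeds."""
    parts = urlsplit(gateway_addr)
    if parts.hostname is None or parts.port is None:
        raise ValueError(f"expected scheme://host:port, got {gateway_addr!r}")
    try:
        with socket.create_connection((parts.hostname, parts.port), timeout=timeout_secs):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False
```

For example, `gateway_reachable("http://127.0.0.1:50051")` returns `False` when no process is listening on that port.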

**"Model not found"**

```
Error: Model bark_semantic_model.pt not found at ./models/bark_semantic_model.pt
```

* Ensure model file exists in `--model-path`
* Enable `--sync-models` for auto-download

**"Sandbox execution failed"**

```
Error: Sandbox execution failed: Timeout after 300s
```

* Increase `EXECUTION_TIMEOUT_SECS`
* Check model complexity
* Use `SANDBOX_MODE=permissive` for debugging

**"Permission denied in sandbox"**

```
Error: Landlock: Access denied to /some/path
```

* Ensure all required files are in allowed paths
* Check Landlock rules in sandbox config

See [Compute Troubleshooting](https://github.com/NeuroChainAi/docs-guides/blob/main/troubleshooting/compute-issues.md) for more help.

***

## Python Environment Setup

### Create Virtual Environment

```bash
cd compute_node/python_executors

# Create environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Verify
python verify_environment.py
```

### Requirements

```
torch>=2.1.0
torchaudio>=2.1.0
transformers>=4.35.0
numpy>=1.24.3
scipy>=1.11.3
soundfile>=0.12.1
librosa>=0.10.1
```

***

## Related Documentation

* [Architecture](https://github.com/NeuroChainAi/docs-guides/blob/main/components/compute-node/architecture.md) - Internal architecture details
* [Sandbox Execution](https://github.com/NeuroChainAi/docs-guides/blob/main/components/compute-node/sandbox-execution.md) - Security isolation deep dive
* [Model Management](https://github.com/NeuroChainAi/docs-guides/blob/main/components/compute-node/model-management.md) - Model loading and caching
* [Python Executors](https://github.com/NeuroChainAi/docs-guides/blob/main/components/compute-node/python-executors.md) - Executor scripts
* [Configuration](https://github.com/NeuroChainAi/docs-guides/blob/main/components/compute-node/configuration.md) - Configuration options

