System Overview

This document provides a comprehensive overview of the NCN Network v2 architecture, including component responsibilities, interactions, and design decisions.


High-Level Architecture

NCN Network v2 is a decentralized AI inference network with four main component types:

┌──────────────────────────────────────────────────────────────────────────┐
│                              NCN Network v2                               │
│                                                                          │
│   ┌─────────────┐                                                        │
│   │   Clients   │  Submit inference requests, pay for computation        │
│   └──────┬──────┘                                                        │
│          │ HTTP/gRPC/WebSocket                                           │
│          ▼                                                               │
│   ┌─────────────────────────────────────────────────────────────────┐    │
│   │                        Gateway Layer                             │    │
│   │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐              │    │
│   │  │  Gateway 1  │  │  Gateway 2  │  │  Gateway N  │  ...         │    │
│   │  │  (Subnet 1) │  │  (Subnet 2) │  │  (Subnet N) │              │    │
│   │  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘              │    │
│   └─────────┼────────────────┼────────────────┼──────────────────────┘    │
│             │ gRPC           │                │                          │
│             ▼                ▼                ▼                          │
│   ┌─────────────────────────────────────────────────────────────────┐    │
│   │                        Compute Layer                             │    │
│   │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐              │    │
│   │  │  Compute 1  │  │  Compute 2  │  │  Compute M  │  ...         │    │
│   │  │  (GPU/CPU)  │  │  (GPU/CPU)  │  │  (GPU/CPU)  │              │    │
│   │  └─────────────┘  └─────────────┘  └─────────────┘              │    │
│   └──────────────────────────────────────────────────────────────────┘    │
│                                                                          │
│   ┌─────────────────────────────────────────────────────────────────┐    │
│   │                      Coordination Layer                          │    │
│   │  ┌───────────────────────────────────────────────────────────┐  │    │
│   │  │                    P2P Registry Network                    │  │    │
│   │  │  • Node Discovery (Kademlia DHT)                          │  │    │
│   │  │  • Validator Consensus                                     │  │    │
│   │  │  • Subnet Management                                       │  │    │
│   │  │  • Mempool for pending requests                            │  │    │
│   │  └───────────────────────────────────────────────────────────┘  │    │
│   └──────────────────────────────────────────────────────────────────┘    │
│                                                                          │
│   ┌─────────────────────────────────────────────────────────────────┐    │
│   │                       Blockchain Layer                           │    │
│   │  ┌─────────────────┐  ┌─────────────────┐                       │    │
│   │  │ InferencePayment│  │ValidatorRegistry│                       │    │
│   │  │    Contract     │  │    Contract     │                       │    │
│   │  └─────────────────┘  └─────────────────┘                       │    │
│   │                     Arbitrum / Ethereum                          │    │
│   └──────────────────────────────────────────────────────────────────┘    │
│                                                                          │
└──────────────────────────────────────────────────────────────────────────┘

Component Deep Dive

Gateway Node

Purpose: Routes inference requests between clients and compute nodes, and manages payments.

Key Responsibilities:

  1. Receive client requests (gRPC, HTTP, WebSocket)

  2. Reserve available compute nodes

  3. Request preprocessing validation from P2P Registry

  4. Verify client payments on blockchain

  5. Dispatch tasks to compute nodes

  6. Request completion validation

  7. Trigger payment distribution

  8. Return results to clients
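The eight responsibilities above read as a linear pipeline, so one way to picture them is as a small state machine that a request walks through. The following is a minimal sketch under that assumption; the `Stage` names and the `RequestLifecycle` class are hypothetical and are not taken from the NCN codebase.

```python
from enum import Enum, auto


class Stage(Enum):
    """Hypothetical stage names mirroring the eight gateway responsibilities."""
    RECEIVED = auto()
    COMPUTE_RESERVED = auto()
    PREPROCESSING_VALIDATED = auto()
    PAYMENT_VERIFIED = auto()
    DISPATCHED = auto()
    COMPLETION_VALIDATED = auto()
    PAYMENT_DISTRIBUTED = auto()
    RESULT_RETURNED = auto()
    FAILED = auto()


# Happy path in definition order; FAILED is reachable from any non-final stage.
PIPELINE = [s for s in Stage if s is not Stage.FAILED]


class RequestLifecycle:
    """Tracks which stage a single inference request has reached."""

    def __init__(self, request_id):
        self.request_id = request_id
        self.stage = Stage.RECEIVED
        self.history = [Stage.RECEIVED]

    def advance(self):
        """Move to the next stage of the happy path."""
        if self.stage in (Stage.FAILED, Stage.RESULT_RETURNED):
            raise ValueError(f"request {self.request_id} is already finished")
        self.stage = PIPELINE[PIPELINE.index(self.stage) + 1]
        self.history.append(self.stage)
        return self.stage

    def fail(self):
        """Abort the request; later steps are skipped."""
        if self.stage is Stage.RESULT_RETURNED:
            raise ValueError(f"request {self.request_id} already completed")
        self.stage = Stage.FAILED
        self.history.append(Stage.FAILED)
```

Keeping the full `history` rather than only the current stage makes it easier to see where a failed request stopped, which matters when deciding whether a reservation or payment needs to be unwound.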

Internal Architecture:

State Management:

  • Reservation Manager: Tracks compute node reservations

  • Payment Processor: Handles payment requests and verification

  • State Manager: Persists request lifecycle state

  • Finalization Handler: Manages completion and payment distribution
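To illustrate what tracking compute node reservations might involve, here is a minimal in-memory sketch. The time-to-live, the `ReservationManager` interface, and the choice to hand out the lowest free node ID are all assumptions made for the example; the real component may work quite differently.

```python
import time


class ReservationManager:
    """Hypothetical in-memory tracker of which compute nodes are reserved."""

    def __init__(self, node_ids, ttl_seconds=30.0, clock=time.monotonic):
        self._free = set(node_ids)
        self._reserved = {}  # node_id -> (request_id, expires_at)
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping

    def _expire_stale(self):
        """Return nodes whose reservation outlived the TTL to the free pool."""
        now = self._clock()
        for node_id, (_, expires_at) in list(self._reserved.items()):
            if expires_at <= now:
                self.release(node_id)

    def reserve(self, request_id):
        """Reserve a free node for a request, or return None if none is free."""
        self._expire_stale()
        if not self._free:
            return None
        node_id = min(self._free)  # deterministic choice, for illustration only
        self._free.remove(node_id)
        self._reserved[node_id] = (request_id, self._clock() + self._ttl)
        return node_id

    def release(self, node_id):
        """Free a node after completion or failure; unknown nodes are ignored."""
        if self._reserved.pop(node_id, None) is not None:
            self._free.add(node_id)
```

A TTL on each reservation means a node is not held forever if the gateway loses track of a request, at the cost of having to pick a timeout longer than the slowest legitimate inference.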


Compute Node

Purpose: Executes AI models in secure sandboxed environments.

Key Responsibilities:

  1. Register with Gateway

  2. Sync models from subnet

  3. Receive inference tasks

  4. Execute models in sandbox

  5. Sign computation results

  6. Return results to Gateway
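Step 5 implies that a result is bound to the task that produced it before it leaves the node. The sketch below shows the general shape of that idea using only the Python standard library. It uses HMAC-SHA256 with a shared key purely as a stand-in: the source does not say which signature scheme NCN uses, and a decentralized network would more plausibly use an asymmetric scheme. All function and field names here are hypothetical.

```python
import hashlib
import hmac
import json


def result_digest(task_id, model_id, output):
    """Hash a canonical JSON encoding so identical results hash identically."""
    payload = json.dumps(
        {"task_id": task_id, "model_id": model_id, "output": output},
        sort_keys=True,
        separators=(",", ":"),
    ).encode("utf-8")
    return hashlib.sha256(payload).digest()


def sign_result(node_key, task_id, model_id, output):
    """Stand-in signature: HMAC-SHA256 over the result digest (assumption)."""
    digest = result_digest(task_id, model_id, output)
    return hmac.new(node_key, digest, hashlib.sha256).hexdigest()


def verify_result(node_key, task_id, model_id, output, signature):
    """Recompute the tag and compare in constant time."""
    expected = sign_result(node_key, task_id, model_id, output)
    return hmac.compare_digest(expected, signature)
```

Including the task and model identifiers in the signed payload prevents a valid signature from being replayed against a different task.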

Internal Architecture:

Sandbox Security Layers:

| Layer                | Technology        | Protection                        |
|----------------------|-------------------|-----------------------------------|
| Syscall filtering    | seccomp           | Blocks dangerous syscalls         |
| Process isolation    | PID namespace     | Hides other processes             |
| Network isolation    | Network namespace | No external network               |
| Filesystem isolation | Landlock          | Read-only access to allowed paths |
| Resource limits      | rlimit            | CPU, memory, time bounds          |
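Of the layers above, resource limits are the easiest to demonstrate in isolation. The following Linux-only sketch applies CPU and address-space limits to a child process with Python's standard `resource` module and adds a wall-clock timeout. The helper names and default values are assumptions for illustration, and the other layers (seccomp, namespaces, Landlock) are not covered here.

```python
import resource
import subprocess
import sys


def _clamp(value, hard):
    """Never request more than the current hard limit allows."""
    return value if hard == resource.RLIM_INFINITY else min(value, hard)


def make_limiter(cpu_seconds, memory_bytes):
    """Build a function that applies rlimits inside the child process."""
    def apply_limits():
        for kind, value in ((resource.RLIMIT_CPU, cpu_seconds),
                            (resource.RLIMIT_AS, memory_bytes)):
            _, hard = resource.getrlimit(kind)
            limit = _clamp(value, hard)
            resource.setrlimit(kind, (limit, limit))
    return apply_limits


def run_limited(argv, cpu_seconds=5, memory_bytes=2 << 30, wall_seconds=10):
    """Run argv with CPU and memory rlimits plus a wall-clock timeout."""
    return subprocess.run(
        argv,
        preexec_fn=make_limiter(cpu_seconds, memory_bytes),  # runs after fork, before exec
        capture_output=True,
        text=True,
        timeout=wall_seconds,
    )
```

For example, `run_limited([sys.executable, "model_runner.py"])` would start a hypothetical runner script under these bounds. The limits are set in the child only, so the parent process keeps its own limits.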


P2P Registry

Purpose: Decentralized coordination service for node discovery, validation, and subnet management.

Key Responsibilities:

  1. Node registration and heartbeat tracking

  2. Model registry

  3. Subnet creation and management

  4. Validator pool management

  5. Preprocessing validation

  6. Completion validation

  7. P2P coordination with other registry nodes
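Node registration and heartbeat tracking (item 1) can be pictured as a table of last-seen timestamps with a staleness cutoff. This is a minimal sketch under that assumption; the timeout value and the `HeartbeatTracker` interface are hypothetical, and a real registry would also replicate this state across its peers.

```python
import time


class HeartbeatTracker:
    """Hypothetical registry-side record of when each node was last seen."""

    def __init__(self, timeout_seconds=15.0, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock  # injectable so staleness can be tested without sleeping
        self._last_seen = {}  # node_id -> timestamp of latest heartbeat

    def heartbeat(self, node_id):
        """Register a node on first contact, or refresh its timestamp."""
        self._last_seen[node_id] = self._clock()

    def live_nodes(self):
        """Return IDs of nodes heard from within the timeout window, sorted."""
        cutoff = self._clock() - self._timeout
        return sorted(n for n, seen in self._last_seen.items() if seen >= cutoff)

    def evict_stale(self):
        """Drop nodes that missed the window and return their IDs."""
        live = set(self.live_nodes())
        stale = sorted(n for n in self._last_seen if n not in live)
        for node_id in stale:
            del self._last_seen[node_id]
        return stale
```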

Internal Architecture:


Smart Contracts

Purpose: On-chain payment coordination and validator registry.

Contracts:

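| Contract          | Address (Forknet)                          | Purpose                       |
|-------------------|--------------------------------------------|-------------------------------|
| InferencePayment  | 0x4361115359E5C0a25c9b2f8Bb71184F010b768ea | Payment handling              |
| NCNToken          | 0x38E2565e8905BeAf83C34b266592465C22A2f108 | ERC20 payment token           |
| ValidatorRegistry | (TBD)                                      | On-chain validator management |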

InferencePayment Contract Flow:


Data Flow

Complete Request Lifecycle


Scalability Design

Horizontal Scaling

| Component | Scaling Strategy                        |
|-----------|-----------------------------------------|
| Gateway   | Multiple gateways, each manages subnets |
| Compute   | Add more compute nodes to subnet        |
| Registry  | P2P network, distributed state          |

Performance Targets

| Metric            | Target               |
|-------------------|----------------------|
| Inference latency | < 5 seconds          |
| Throughput        | 100+ requests/second |
| Uptime            | 99.9%                |
| Payment success   | 100%                 |
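Two of these targets translate into concrete capacity numbers with simple arithmetic. The sketch below applies Little's law (in-flight requests = arrival rate × time in system) to the throughput and latency targets, and converts the uptime target into allowed downtime per year. The function names are illustrative, and the figures are upper bounds derived from the targets, not measurements.

```python
def max_in_flight(throughput_rps, latency_s):
    """Little's law: average concurrent requests = arrival rate * time in system."""
    return throughput_rps * latency_s


def downtime_hours_per_year(uptime_fraction, hours_per_year=365 * 24):
    """Hours of downtime per year permitted by an uptime target."""
    return (1 - uptime_fraction) * hours_per_year


# At 100 requests/second and up to 5 seconds each, the network must be able
# to hold about 500 requests in flight at once.
concurrent = max_in_flight(100, 5)

# A 99.9% uptime target leaves roughly 8.76 hours of downtime per year.
budget = downtime_hours_per_year(0.999)
```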

Bottleneck Mitigation

| Potential Bottleneck    | Mitigation                    |
|-------------------------|-------------------------------|
| Single gateway          | Multiple gateways per subnet  |
| Validator consensus     | Parallel signature collection |
| Blockchain confirmation | Optimistic execution          |
| Model loading           | Model caching with TTL        |
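As an illustration of the last mitigation, model caching with a TTL can be as simple as a dictionary of loaded models and expiry times. This is a minimal sketch; the `ModelCache` class, the `loader` callback, and the default TTL are assumptions for the example rather than details of the compute node.

```python
import time


class ModelCache:
    """Hypothetical cache that reloads a model once its entry is older than the TTL."""

    def __init__(self, loader, ttl_seconds=600.0, clock=time.monotonic):
        self._loader = loader  # callable: model_id -> loaded model
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}  # model_id -> (model, expires_at)

    def get(self, model_id):
        """Return the cached model, loading it on a miss or after expiry."""
        now = self._clock()
        entry = self._entries.get(model_id)
        if entry is not None and entry[1] > now:
            return entry[0]
        model = self._loader(model_id)
        self._entries[model_id] = (model, now + self._ttl)
        return model
```

The TTL bounds how long a node can serve a stale model after the subnet publishes a new version, while still avoiding a reload on every request.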


Fault Tolerance

Node Failures

| Failure        | Recovery                              |
|----------------|---------------------------------------|
| Gateway crash  | Client reconnects to another gateway  |
| Compute crash  | Gateway releases reservation, retries |
| Registry crash | Other registry nodes continue         |

Payment Safety

| Scenario              | Protection             |
|-----------------------|------------------------|
| Compute failure       | Expiry-based refund    |
| Gateway failure       | On-chain expiry refund |
| Validator unavailable | M-of-N allows failures |
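The two protections in this table reduce to simple predicates. The sketch below shows one plausible reading: an M-of-N check that counts distinct approvals from known validators, and an expiry rule under which an unsettled payment becomes refundable once its deadline passes. Both functions and their parameters are hypothetical; the actual contract logic is not described in this document.

```python
def has_quorum(approvals, validator_set, threshold):
    """True when at least `threshold` distinct known validators approved."""
    validators = set(validator_set)
    if not 1 <= threshold <= len(validators):
        raise ValueError("threshold must be between 1 and the validator count")
    # Duplicates and unknown signers are ignored by the set intersection.
    return len(set(approvals) & validators) >= threshold


def is_refundable(expires_at, now, settled):
    """Assumed rule: an unsettled payment can be refunded once it has expired."""
    return (not settled) and now >= expires_at
```

With five validators and a threshold of three, for instance, validation still succeeds when two validators are offline, which is the sense in which M-of-N tolerates failures.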

