# Production Deployment

Best practices and considerations for deploying NCN Network in production.

***

## Production Checklist

### Security

* [ ] TLS certificates configured for all endpoints
* [ ] Private keys stored in secure vault (HSM recommended)
* [ ] Firewall rules configured
* [ ] Rate limiting enabled
* [ ] Security audit completed

### Infrastructure

* [ ] High availability setup (multiple replicas)
* [ ] Load balancer configured
* [ ] Auto-scaling enabled
* [ ] Persistent storage for data
* [ ] Backup strategy implemented

### Monitoring

* [ ] Prometheus metrics enabled
* [ ] Grafana dashboards configured
* [ ] Alerting rules set up
* [ ] Log aggregation configured
* [ ] Health checks implemented

### Operations

* [ ] Runbooks documented
* [ ] Incident response plan defined
* [ ] Update/rollback procedures documented
* [ ] On-call rotation established

***

## TLS Configuration

### Generate Certificates

```bash
# Using Let's Encrypt with certbot
# --standalone starts a temporary web server, so port 80 must be free and reachable
sudo apt install certbot
sudo certbot certonly --standalone -d api.ncn-network.io

# Certificates will be at:
# /etc/letsencrypt/live/api.ncn-network.io/fullchain.pem
# /etc/letsencrypt/live/api.ncn-network.io/privkey.pem
```

### Configure Gateway TLS

```bash
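# gateway.env
# These paths differ from certbot's output directory: copy or symlink the
# files from /etc/letsencrypt/live/api.ncn-network.io/ into /etc/ncn/certs/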
TLS_ENABLED=true
TLS_CERT_PATH=/etc/ncn/certs/fullchain.pem
TLS_KEY_PATH=/etc/ncn/certs/privkey.pem
```

### Certificate Renewal

```bash
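# Add to root's crontab. certbot only renews certificates that are close to
# expiry, so a daily run is safe; the deploy hook reloads the gateway only
# after a successful renewal.
0 3 * * * certbot renew --quiet --deploy-hook "systemctl reload ncn-gateway"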
```
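certbot writes renewed certificates under `/etc/letsencrypt/live/`, while the gateway configuration above reads from `/etc/ncn/certs/`. One way to bridge the two is a deploy hook that copies the files after each renewal. The sketch below is an assumption, not part of NCN: the function name `install_renewed_cert` is hypothetical, and it relies on the `RENEWED_LINEAGE` variable that certbot exports to deploy hooks.

```bash
# Hypothetical deploy hook: copy a renewed certificate into the gateway's
# certificate directory with restrictive permissions on the private key.
install_renewed_cert() {
  local src_dir="$1"   # directory holding fullchain.pem and privkey.pem
  local dest_dir="$2"  # directory the gateway reads, e.g. /etc/ncn/certs
  local f

  for f in fullchain.pem privkey.pem; do
    [ -r "$src_dir/$f" ] || { echo "missing $src_dir/$f" >&2; return 1; }
  done

  mkdir -p "$dest_dir" || return 1
  # The certificate chain may be world-readable; the private key must not be.
  install -m 0644 "$src_dir/fullchain.pem" "$dest_dir/fullchain.pem" || return 1
  install -m 0600 "$src_dir/privkey.pem" "$dest_dir/privkey.pem" || return 1
}

# certbot sets RENEWED_LINEAGE when it runs a deploy hook; do nothing otherwise.
if [ -n "${RENEWED_LINEAGE:-}" ]; then
  install_renewed_cert "$RENEWED_LINEAGE" /etc/ncn/certs \
    && systemctl reload ncn-gateway
fi
```

Saved as an executable script, this could be passed to `certbot renew --deploy-hook`. If the gateway runs as a non-root user, the key's ownership would also need adjusting.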

***

## High Availability

### Gateway HA

```
                    Load Balancer
                    ┌───────────────────┐
                    │   HAProxy/NGINX   │
                    │   (health checks) │
                    └─────────┬─────────┘
                              │
              ┌───────────────┼───────────────┐
              │               │               │
              ▼               ▼               ▼
        ┌──────────┐   ┌──────────┐   ┌──────────┐
        │Gateway 1 │   │Gateway 2 │   │Gateway 3 │
        │ Server A │   │ Server B │   │ Server C │
        └──────────┘   └──────────┘   └──────────┘
```

### HAProxy Configuration

```haproxy
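# /etc/haproxy/haproxy.cfg
# Assumes a defaults section that sets client/server/connect timeouts
frontend ncn-http
    mode http
    bind *:80
    bind *:443 ssl crt /etc/haproxy/certs/ncn.pem
    # Redirect plain HTTP to HTTPS
    http-request redirect scheme https unless { ssl_fc }
    default_backend ncn-gateway

backend ncn-gateway
    mode http
    balance roundrobin
    option httpchk GET /health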
    server gateway1 10.0.1.1:8080 check
    server gateway2 10.0.1.2:8080 check
    server gateway3 10.0.1.3:8080 check

frontend ncn-grpc
    bind *:50051
    mode tcp
    default_backend ncn-gateway-grpc

backend ncn-gateway-grpc
    mode tcp
    balance roundrobin
    server gateway1 10.0.1.1:50051 check
    server gateway2 10.0.1.2:50051 check
    server gateway3 10.0.1.3:50051 check
```

***

## Security Hardening

### Firewall Rules

```bash
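# Default policy: block inbound, allow outbound
sudo ufw default deny incoming
sudo ufw default allow outgoing

# Keep SSH reachable before enabling the firewall, or you may lock yourself out
sudo ufw allow 22/tcp       # SSH (restrict to an admin network where possible)

# Public-facing (Gateway only)
sudo ufw allow 443/tcp      # HTTPS
sudo ufw allow 50051/tcp    # gRPC (if public)

# Internal only (explicit, although the default policy already denies these)
sudo ufw deny 50050/tcp     # Registry gRPC (internal)
sudo ufw deny 8828/tcp      # Registry P2P (internal)

sudo ufw enable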
```

### Rate Limiting (NGINX)

```nginx
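# /etc/nginx/conf.d/rate-limit.conf
# 100 requests per minute per client IP, tracked in a 10 MB shared zone
limit_req_zone $binary_remote_addr zone=api:10m rate=100r/m;

# proxy_pass needs "ncn-gateway" to resolve: a DNS name or an upstream like this
upstream ncn-gateway {
    server 10.0.1.1:8080;
    server 10.0.1.2:8080;
    server 10.0.1.3:8080;
}

server {
    location /api/ {
        limit_req zone=api burst=20 nodelay;
        limit_req_status 429;
        proxy_pass http://ncn-gateway;
    }
}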
```

### Key Management

**Development:**

* Environment variables (acceptable)

**Staging:**

* Secrets manager (AWS Secrets Manager, HashiCorp Vault)

**Production:**

* Hardware Security Module (HSM)
* Key Management Service (AWS KMS, GCP KMS)

```bash
# Using AWS Secrets Manager
GATEWAY_WALLET_PRIVATE_KEY=$(aws secretsmanager get-secret-value \
  --secret-id ncn/gateway-key \
  --query SecretString \
  --output text)
```
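A startup script can refuse to continue when the key was not loaded, for example because the secret lookup failed and left the variable empty. The following is a minimal sketch; `require_secret` is a hypothetical helper, not an NCN command, and it deliberately never prints the secret's value.

```bash
# Fail fast when a required secret is missing, without echoing its value.
require_secret() {
  local name="$1" value
  # Accept only valid shell variable names so the eval below is safe.
  case "$name" in
    ''|[0-9]*|*[!A-Za-z0-9_]*)
      echo "invalid variable name" >&2
      return 2
      ;;
  esac
  eval "value=\${$name:-}"
  if [ -z "$value" ]; then
    echo "required secret $name is not set" >&2
    return 1
  fi
}
```

A typical call would be `require_secret GATEWAY_WALLET_PRIVATE_KEY || exit 1` placed right after the lookup and before the gateway starts.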

***

## Monitoring Setup

### Prometheus Configuration

```yaml
# prometheus.yml
scrape_configs:
  - job_name: 'ncn-gateway'
    static_configs:
      - targets: ['gateway1:8080', 'gateway2:8080', 'gateway3:8080']
    metrics_path: /metrics

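  # Assumes the registry serves HTTP metrics on this port; elsewhere in this
  # guide 50050 is the registry's gRPC port, so verify before deploying
  - job_name: 'ncn-registry'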
    static_configs:
      - targets: ['registry1:50050', 'registry2:50050', 'registry3:50050']
```

### Key Metrics to Monitor

| Metric              | Alert Threshold |
| ------------------- | --------------- |
| Request latency p99 | > 5s            |
| Error rate          | > 5%            |
| CPU usage           | > 80%           |
| Memory usage        | > 85%           |
| Disk usage          | > 90%           |
| Active connections  | > 1000          |

### Alerting Rules

```yaml
# alerts.yml
groups:
- name: ncn-alerts
  rules:
  - alert: HighErrorRate
    expr: sum(rate(ncn_requests_failed_total[5m])) / sum(rate(ncn_requests_total[5m])) > 0.05
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "High error rate on NCN Gateway"
      
  - alert: HighLatency
    expr: histogram_quantile(0.99, sum by (le) (rate(ncn_request_duration_seconds_bucket[5m]))) > 5
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "High latency on NCN Gateway"
```
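The `HighErrorRate` condition compares the ratio of failed to total requests against 0.05. Over a fixed window, that is the same as comparing the two counter increases directly. The sketch below works through that arithmetic with integers; the function name `error_rate_exceeds` is illustrative only and is not part of Prometheus or NCN.

```bash
# Succeed (exit 0) when failed/total is strictly greater than threshold_pct percent.
error_rate_exceeds() {
  local failed="$1" total="$2" threshold_pct="$3"
  # No traffic means there is no measurable error rate.
  [ "$total" -gt 0 ] || return 1
  # Cross-multiply to stay in integer arithmetic: failed/total > pct/100
  [ $(( failed * 100 )) -gt $(( total * threshold_pct )) ]
}

error_rate_exceeds 6 100 5 && echo "6 of 100 failed: condition is true"
error_rate_exceeds 5 100 5 || echo "5 of 100 failed: condition is false (threshold is strict)"
```

Because the comparison is strict, a rate of exactly 5% does not trigger the alert. The rule's `for: 5m` clause additionally requires the condition to hold continuously for five minutes.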

***

## Backup Strategy

### What to Back Up

| Data                | Frequency             | Retention |
| ------------------- | --------------------- | --------- |
| Configuration files | Daily                 | 30 days   |
| Registry DHT data   | Every 6 hours         | 7 days    |
| Logs                | Daily                 | 90 days   |
| Wallet keys         | Once (secure offsite) | Forever   |

### Backup Script

```bash
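#!/bin/bash
# /opt/ncn/scripts/backup.sh
# Stop on the first error, on unset variables, and on failed pipelines
set -euo pipefail

DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_ROOT=/backup/ncn
BACKUP_DIR="$BACKUP_ROOT/$DATE"

mkdir -p "$BACKUP_DIR"

# Backup configuration
cp -r /opt/ncn/config "$BACKUP_DIR/"

# Backup data (paths inside the archive are relative to /opt/ncn)
tar -czf "$BACKUP_DIR/data.tar.gz" -C /opt/ncn data

# Upload to S3
aws s3 sync "$BACKUP_DIR" "s3://ncn-backups/$DATE/"

# Cleanup local backups older than 30 days (top-level backup directories only)
find "$BACKUP_ROOT" -mindepth 1 -maxdepth 1 -type d -mtime +30 -exec rm -rf {} +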
```
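The script above applies a single 30-day cleanup, whereas the retention table lists a different period for each kind of data. A rough sketch of per-class retention follows. It assumes a hypothetical layout of `<root>/<class>/<timestamp>/` that the script above does not create, and it only lists expired directories rather than deleting them. Wallet keys are left out on purpose, since the table keeps them forever.

```bash
# Retention in days for each data class, taken from the retention table.
retention_days() {
  case "$1" in
    config)   echo 30 ;;
    registry) echo 7 ;;
    logs)     echo 90 ;;
    *) echo "unknown data class: $1" >&2; return 1 ;;
  esac
}

# List (do not delete) backup directories older than the class's retention.
# Assumes the hypothetical layout <root>/<class>/<timestamp>/.
list_expired_backups() {
  local root="$1" class="$2" days
  days=$(retention_days "$class") || return 1
  [ -d "$root/$class" ] || return 0
  find "$root/$class" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" -print
}
```

Reviewing the output of `list_expired_backups /backup/ncn logs` before wiring it to a delete step is a cheap safeguard against pruning the wrong directory.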

***

## Disaster Recovery

### Recovery Time Objective (RTO)

| Component | RTO        |
| --------- | ---------- |
| Gateway   | 5 minutes  |
| Registry  | 15 minutes |
| Compute   | 30 minutes |

### Recovery Procedures

**Gateway Failure:**

1. Health check detects the failure
2. Load balancer removes the gateway from the pool
3. Alert is sent to the ops team
4. Gateway restarts automatically, or an operator intervenes
5. Verify health, then add the gateway back to the pool

**Registry Failure:**

1. P2P network continues with remaining nodes
2. Failed node restarts automatically
3. DHT data syncs from peers
4. Verify consensus capability

**Full Cluster Recovery:**

```bash
# 1. Start infrastructure
systemctl start ncn-registry
sleep 30

# 2. Start gateways
systemctl start ncn-gateway
sleep 10

# 3. Start compute nodes
systemctl start ncn-compute

# 4. Verify health (-f makes curl exit non-zero on an HTTP error status)
curl -fsS http://localhost:8080/health
```

***

## Performance Tuning

### System Limits

```bash
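# /etc/security/limits.conf
# Applies to PAM login sessions. Services started by systemd ignore this file;
# set LimitNOFILE= and LimitNPROC= in the unit file for those.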
ncn soft nofile 65535
ncn hard nofile 65535
ncn soft nproc 32768
ncn hard nproc 32768
```

### Kernel Parameters

```bash
# /etc/sysctl.conf (apply with: sudo sysctl -p)
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_tw_reuse = 1
```

### Application Tuning

```bash
# Gateway tuning
TOKIO_WORKER_THREADS=4
HTTP_KEEP_ALIVE_TIMEOUT=60
GRPC_MAX_CONNECTIONS=1000
```

***

## Update Procedures

### Rolling Update

```bash
# Update one gateway at a time; stop at the first failure.
# Assumes the new binary is already at /tmp/gateway_node on each host.
for host in gateway1 gateway2 gateway3; do
  ssh "$host" "systemctl stop ncn-gateway" || exit 1
  ssh "$host" "cp /tmp/gateway_node /opt/ncn/bin/" || exit 1
  ssh "$host" "systemctl start ncn-gateway" || exit 1
  sleep 30
  # Verify health before continuing
  curl -fsS "http://$host:8080/health" || exit 1
done
```
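The fixed `sleep 30` in the loop above either waits longer than necessary or not long enough. Polling the health endpoint is one alternative. The sketch below assumes the `/health` endpoint used throughout this page; `probe_health` and `wait_for_health` are hypothetical helper names.

```bash
# Probe a health endpoint once. With -f, curl fails on connection errors
# and on HTTP status codes of 400 or above.
probe_health() {
  curl -fsS --max-time 5 -o /dev/null "$1"
}

# Poll until the endpoint is healthy or the attempts run out.
# Usage: wait_for_health URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_health() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    if probe_health "$url" 2>/dev/null; then
      return 0
    fi
    i=$(( i + 1 ))
    # Sleep only when another attempt will follow.
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

In the rolling update, `wait_for_health "http://$host:8080/health" 30 2 || exit 1` could stand in for the `sleep` and `curl` pair, giving each gateway up to about a minute to recover.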

### Rollback Procedure

```bash
# Before updating, keep the previous version
cp /opt/ncn/bin/gateway_node /opt/ncn/bin/gateway_node.backup

# Rollback if needed
systemctl stop ncn-gateway
cp /opt/ncn/bin/gateway_node.backup /opt/ncn/bin/gateway_node
systemctl start ncn-gateway
```
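A rollback only works if the backup copy was taken before the new binary was installed. The two steps can be tied together so that the backup is never skipped. This is a sketch under the path conventions used above; `install_with_backup` and `rollback_binary` are hypothetical helpers, and the service is assumed to be stopped while they run.

```bash
# Replace a binary while keeping the previous version next to it.
install_with_backup() {
  local new_bin="$1" target="$2"
  [ -f "$new_bin" ] || { echo "new binary not found: $new_bin" >&2; return 1; }
  # Preserve the currently installed version, if any, for rollback.
  if [ -f "$target" ]; then
    cp -p "$target" "$target.backup" || return 1
  fi
  # Copy to a temporary name first, then rename, so the target path
  # never holds a partially written file.
  cp "$new_bin" "$target.new" || return 1
  chmod 0755 "$target.new" || return 1
  mv -f "$target.new" "$target"
}

# Restore the version saved by install_with_backup.
rollback_binary() {
  local target="$1"
  [ -f "$target.backup" ] || { echo "no backup for $target" >&2; return 1; }
  cp -p "$target.backup" "$target"
}
```

For example, `install_with_backup /tmp/gateway_node /opt/ncn/bin/gateway_node` during an update, and `rollback_binary /opt/ncn/bin/gateway_node` if the health check fails afterwards.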

***

## Compliance Considerations

### Data Handling

* Minimize data retention
* Encrypt data at rest
* Log access to sensitive data
* GDPR compliance for EU users

### Audit Logging

```bash
# Enable audit logging
AUDIT_LOG_ENABLED=true
AUDIT_LOG_PATH=/var/log/ncn/audit.log
```

***

## Next Steps

* [Monitoring](/nc/neurochainai-guides/operators/monitoring.md) - Detailed monitoring setup
* [Troubleshooting](/nc/neurochainai-guides/troubleshooting/troubleshooting.md) - Common issues
* [Security](/nc/neurochainai-guides/security/security.md) - Security documentation

