poc-k8s-cluster

A bare-metal Kubernetes cluster where folks on our AI floor can safely host their own OpenClaw instances — without sharing API keys, without exposing services to the public internet, and without needing to know how Kubernetes works.

What we're building

We're turning a shelf of Dell OptiPlex 3080 Micro machines into a shared compute platform. Each person gets their own isolated OpenClaw environment deployed through NemoClaw, with secrets encrypted via SOPS so API keys never appear in plaintext.

The cluster runs on Talos Linux — an immutable, hardened OS built specifically for Kubernetes — with Cilium handling networking and tenant isolation.
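As a rough illustration of what Cilium-based tenant isolation could look like, the sketch below builds a CiliumNetworkPolicy manifest that only admits traffic from pods in the same namespace. It assumes one namespace per tenant; the namespace name `tenant-alice` and the policy name `same-namespace-only` are hypothetical and not taken from this repo. The cluster's actual policies may differ, and would also need a rule admitting traffic from whatever component terminates the tunnel.

```python
import json


def tenant_isolation_policy(namespace: str) -> dict:
    """Build a CiliumNetworkPolicy allowing ingress only from the same namespace.

    Assumption: each tenant lives in its own namespace.
    """
    return {
        "apiVersion": "cilium.io/v2",
        "kind": "CiliumNetworkPolicy",
        "metadata": {
            "name": "same-namespace-only",  # hypothetical policy name
            "namespace": namespace,
        },
        "spec": {
            # An empty selector matches every pod in the policy's namespace.
            "endpointSelector": {},
            "ingress": [
                {
                    "fromEndpoints": [
                        {
                            "matchLabels": {
                                "k8s:io.kubernetes.pod.namespace": namespace
                            }
                        }
                    ]
                }
            ],
        },
    }


if __name__ == "__main__":
    # "tenant-alice" is a made-up namespace used only for illustration.
    print(json.dumps(tenant_isolation_policy("tenant-alice"), indent=2))
```

The JSON output can be saved to a file and applied with `kubectl apply -f`, since Kubernetes accepts JSON manifests as well as YAML.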

Access is through Cloudflare Tunnels, so users can reach their OpenClaw instances from anywhere without us opening ports or exposing the cluster directly to the internet. The tunnel connects outbound from inside our network to Cloudflare's edge, and Cloudflare handles authentication and routing back to the right tenant.
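The per-tenant routing described above can be sketched as a list of cloudflared ingress rules, one hostname per tenant, ending with the catch-all rule that cloudflared requires. This is only an illustration: the domain `example.com`, the in-cluster service name `openclaw`, the `tenant-<name>` namespace pattern, and port 8080 are all assumptions, not values from this project.

```python
import json


def tunnel_ingress(tenants: list[str], domain: str, port: int) -> list[dict]:
    """Build cloudflared-style ingress rules mapping one hostname to each tenant.

    Assumptions: a Service named "openclaw" exists in a namespace called
    "tenant-<name>" for every tenant, listening on `port`.
    """
    rules = [
        {
            "hostname": f"{tenant}.{domain}",
            "service": f"http://openclaw.tenant-{tenant}.svc.cluster.local:{port}",
        }
        for tenant in tenants
    ]
    # The last rule must match everything; unmatched requests get a 404.
    rules.append({"service": "http_status:404"})
    return rules


if __name__ == "__main__":
    # Tenant names, domain, and port are placeholders for illustration.
    config = {"ingress": tunnel_ingress(["alice", "bob"], "example.com", 8080)}
    print(json.dumps(config, indent=2))
```

Authentication is not shown here; in the setup described above it is handled at Cloudflare's edge rather than in the tunnel's ingress rules.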

Current status

Version: alpha-0.0.2 — cluster is bootstrapped with 3 nodes (1 control-plane + 2 workers). Not yet serving tenants.

Next milestone: v0.5.0 — flash remaining machines, crimp cables, set up Cloudflare Tunnel, and deploy the first OpenClaw instances.

See Next Steps for the full roadmap.

Documentation

| Document | What it covers |
| --- | --- |
| Cluster Plan | Architecture, network topology, hardware, software stack |
| Runbook | Step-by-step guide to bootstrapping and operating the cluster |
| Compute Capacity | CPU, RAM, storage specs and projections |
| Inference Capacity | LLM inference speed across hardware tiers |
| GPU Inference | Phase 2: adding GPU nodes |
| OS Install | Talos raw-image flashing strategy |
| SOPS + OpenClaw | Multi-tenant OpenClaw deployment with ArgoCD and SOPS (WIP) |
| Meetings | Agendas and notes |
| Next Steps | Roadmap for upcoming versions |
| Changelog | Version history |