poc-k8s-cluster
A bare-metal Kubernetes cluster where folks on our AI floor can safely host their own OpenClaw instances — without sharing API keys, without exposing services to the public internet, and without needing to know how Kubernetes works.
What we're building
We're turning a shelf of Dell OptiPlex 3080 Micro machines into a shared compute platform. Each person gets their own isolated OpenClaw environment deployed through NemoClaw, with secrets encrypted via SOPS so API keys never appear in plaintext.
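As a rough illustration of what "never appear in plaintext" could mean in practice, the sketch below checks a parsed Secret manifest for values that SOPS has not encrypted. It relies only on two things SOPS is known to do: encrypted values are written as `ENC[AES256_GCM,data:...]` strings, and the file gains a top-level `sops` metadata block. The function name `plaintext_leaks`, the field list, and the sample key `API_KEY` are hypothetical, not part of this repo.

```python
import re

# SOPS writes each encrypted value as ENC[AES256_GCM,data:...,iv:...,tag:...,type:str].
ENC_VALUE = re.compile(r"^ENC\[AES256_GCM,data:.+\]$")


def plaintext_leaks(doc, secret_fields=("data", "stringData")):
    """Return the paths of secret values that do not look SOPS-encrypted.

    `doc` is a Secret manifest already parsed into a dict (hypothetical input).
    """
    leaks = []
    for field in secret_fields:
        for key, value in (doc.get(field) or {}).items():
            if not (isinstance(value, str) and ENC_VALUE.match(value)):
                leaks.append(f"{field}.{key}")
    # A file managed by SOPS also carries a top-level "sops" metadata block.
    if "sops" not in doc:
        leaks.append("sops (metadata block missing)")
    return leaks


# Hypothetical manifests, for illustration only.
encrypted = {
    "kind": "Secret",
    "stringData": {"API_KEY": "ENC[AES256_GCM,data:Zm9v,iv:YmFy,tag:YmF6,type:str]"},
    "sops": {"version": "3"},
}
plaintext = {"kind": "Secret", "stringData": {"API_KEY": "sk-not-encrypted"}}

print(plaintext_leaks(encrypted))
print(plaintext_leaks(plaintext))
```

A check like this could run in CI before a manifest is committed, so an unencrypted key is caught before it reaches the repo.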
The cluster runs on Talos Linux — an immutable, hardened OS built specifically for Kubernetes — with Cilium handling networking and tenant isolation.
Access is through Cloudflare Tunnels, so users can reach their OpenClaw instances from anywhere without us opening ports or exposing the cluster directly to the internet. The tunnel connects outbound from inside our network to Cloudflare's edge, and Cloudflare handles authentication and routing back to the right tenant.
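The routing step can be pictured as a list of ingress rules, one hostname per tenant, each pointing at that tenant's in-cluster service. The sketch below builds such a list in the shape `cloudflared` uses for its `ingress` section (`hostname` and `service` keys, ending in a catch-all rule). The domain, the `tenant-` namespace prefix, the `openclaw` service name, and port 8080 are all assumptions made for illustration; the real values live in the Cluster Plan and Runbook.

```python
import json


def tunnel_ingress(tenants, domain, namespace_prefix="tenant-", port=8080):
    """Build cloudflared-style ingress rules, one per tenant.

    The service name "openclaw", the namespace prefix, and the port are
    hypothetical defaults, not values taken from this cluster.
    """
    rules = []
    for name in sorted(tenants):
        rules.append({
            "hostname": f"{name}.{domain}",
            # In-cluster DNS name of the tenant's service (assumed layout).
            "service": f"http://openclaw.{namespace_prefix}{name}.svc.cluster.local:{port}",
        })
    # cloudflared requires the last rule to match all remaining traffic.
    rules.append({"service": "http_status:404"})
    return rules


rules = tunnel_ingress(["bob", "alice"], "example.com")
print(json.dumps(rules, indent=2))
```

Because each hostname maps to exactly one namespace, a request for one tenant's hostname is never forwarded to another tenant's service, and anything that matches no tenant gets a 404.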
Current status
Version: alpha-0.0.2. The cluster is bootstrapped with 3 nodes (1 control-plane + 2 workers) but is not yet serving tenants.
Next milestone: v0.5.0. Flash the remaining machines, crimp cables, set up the Cloudflare Tunnel, and deploy the first OpenClaw instances.
See Next Steps for the full roadmap.
Documentation
| Document | What it covers |
|---|---|
| Cluster Plan | Architecture, network topology, hardware, software stack |
| Runbook | Step-by-step guide to bootstrapping and operating the cluster |
| Compute Capacity | CPU, RAM, storage specs and projections |
| Inference Capacity | LLM inference speed across hardware tiers |
| GPU Inference | Phase 2: adding GPU nodes |
| OS Install | Talos raw-image flashing strategy |
| SOPS + OpenClaw | Multi-tenant OpenClaw deployment with ArgoCD and SOPS (WIP) |
| Meetings | Agendas and notes |
| Next Steps | Roadmap for upcoming versions |
| Changelog | Version history |
Quick Links
- GitHub repo — source code and raw docs
- Browse the docs site