Posts • Loki's Wager

7 Jul 2026
Building a GPU SaaS Platform - Runtime Ops Boundaries
Part 22: harden the runtime with Secret/TLS config, split RBAC, probes, metrics, and deployment boundaries while keeping audit in the control plane.
6 Jul 2026
Building a GPU SaaS Platform - Reliable Serverless Execution
Part 21: add queue retry policy, dead-letter handling, execution state classification, and runtime behavior changes for serverless GPU execution.
2 Jul 2026
Building a GPU SaaS Platform - GPU Virtualization with HAMi
Part 20: add HAMi virtual GPU packages on top of the DRA allocation contract.
29 Jun 2026
Building a GPU SaaS Platform - DRA Package Allocation
Part 19: replace runtime-side GPU counting with DRA-backed package allocation, ResourceClaim status, and controlled runtime packages.
15 Jun 2026
Building a GPU SaaS Platform - Runtime Control Plane Split
Part 18: split the runtime into a reconciler-only controller manager and a separate runtime API server.
14 Jun 2026
Building a GPU SaaS Platform - Invocation Result Store
Part 17: add a ScyllaDB-backed result-store consumer for durable serverless invocation metadata and control-plane result lookup.
8 Jun 2026
Building a GPU SaaS Platform - Worker Lifecycle
Part 16: add activator-owned worker lifecycle management, prewarm pools, idle scale-down, and metrics-driven worker state.
31 May 2026
Building a GPU SaaS Platform - Worker Sidecar
Part 15: add a serverless worker sidecar, define a UDS framework contract, and return results and metrics through NATS.
17 May 2026
Building a GPU SaaS Platform - Activator Dispatch
Part 14: add a dedicated activator that consumes ingress invocations, selects or creates GPUUnit workers, and publishes worker-targeted dispatch messages.
10 May 2026
Building a GPU SaaS Platform - Queue-First Ingress
Part 13: record the runtime-side serverless contract on GPU units and enqueue invocations durably through NATS JetStream before any worker executes them.