
Rust in Cloud Native: Why Platform Teams Are Rewriting Critical Infrastructure

Luca Berton 2 min read
#rust #cloud-native #systems-programming #platform-engineering #performance

The Rust Takeover Nobody Talks About

Look at what’s already written in Rust in the cloud-native ecosystem:

  • Bottlerocket: AWS’s container-optimized OS
  • Firecracker: micro-VM hypervisor powering Lambda and Fargate
  • Linkerd2-proxy: the data plane proxy doing mTLS at scale
  • Vector: observability data pipeline (replacing Fluentd/Logstash)
  • TiKV: distributed key-value store (CNCF graduated)
  • SpinKube: WebAssembly runtime for Kubernetes

This isn’t coincidental. These are all performance-critical, security-sensitive components where memory safety and zero-cost abstractions matter most.

Why Platform Teams Care

If you’re building an internal developer platform, you’re responsible for the foundational components that hundreds of developers depend on. Those components need to be:

  1. Memory-safe — no segfaults, no buffer overflows, no use-after-free
  2. Fast — microsecond-level latency in the data path
  3. Concurrent — handle thousands of connections without thread explosion
  4. Predictable — no GC pauses, consistent tail latency

Go covers most of this, which is why Kubernetes and most CNCF projects use it. But Rust covers all of it, including the fourth point, because it has no garbage collector. That is why the most performance-critical components are migrating.
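To make the "predictable" point concrete, here is a minimal sketch of how Rust releases resources at a known point in the code rather than at a later collection cycle. The `Buffer` type and `handle_request` function are hypothetical names used only for illustration, and the example uses only the standard library.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for a per-request buffer; it records when it is released.
struct Buffer {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Buffer {
    fn drop(&mut self) {
        // Runs exactly when the value goes out of scope.
        self.log.borrow_mut().push(format!("released {}", self.name));
    }
}

/// Simulates handling one request and returns the order of events.
fn handle_request() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _header = Buffer { name: "header", log: Rc::clone(&log) };
        let _body = Buffer { name: "body", log: Rc::clone(&log) };
        log.borrow_mut().push("response sent".to_string());
        // `_body` and then `_header` are dropped here, in reverse declaration order.
    }
    log.borrow_mut().push("request finished".to_string());
    let events = log.borrow().clone();
    events
}

fn main() {
    for event in handle_request() {
        println!("{event}");
    }
}
```

The release order is fixed by scope, so cleanup cost is paid at a known place in the request path instead of whenever a collector decides to run.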

Real-World Example: A Custom Kubernetes Admission Controller

Here’s a simple admission webhook in Rust that enforces resource limits:

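// Requires the `admission` feature of kube and the `rustls` feature of actix-web
use actix_web::{web, App, HttpResponse, HttpServer};
use k8s_openapi::api::core::v1::Pod;
use kube::core::admission::{AdmissionRequest, AdmissionResponse, AdmissionReview};

async fn validate(body: web::Json<AdmissionReview<Pod>>) -> HttpResponse {
    // Answer a malformed review with an "invalid" response instead of panicking
    let req: AdmissionRequest<Pod> = match body.into_inner().try_into() {
        Ok(req) => req,
        Err(err) => {
            let review = AdmissionResponse::invalid(err.to_string()).into_review();
            return HttpResponse::Ok().json(review);
        }
    };

    // Allow by default; switch to deny on the first violation
    let mut response = AdmissionResponse::from(&req);

    // `object` is absent on DELETE requests, so there is nothing to check then
    if let Some(spec) = req.object.as_ref().and_then(|pod| pod.spec.as_ref()) {
        // Check all containers have resource limits
        for container in &spec.containers {
            let has_limits = container
                .resources
                .as_ref()
                .and_then(|r| r.limits.as_ref())
                .is_some();
            if !has_limits {
                response = response.deny(format!(
                    "Container '{}' must specify resource limits",
                    container.name
                ));
                break;
            }
        }
    }

    HttpResponse::Ok().json(response.into_review())
}

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    // `tls_config()` is not shown: it must build the rustls `ServerConfig`
    // from the webhook's certificate and private key.
    HttpServer::new(|| App::new().route("/validate", web::post().to(validate)))
        .bind_rustls("0.0.0.0:8443", tls_config())?
        .run()
        .await
}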

Compared to the Go equivalent, this webhook:

  • Uses 50% less memory at rest
  • Handles 30% more requests/second under load
  • Has zero GC pauses (important for admission latency SLAs)
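The rule this webhook enforces can also be separated from the HTTP and Kubernetes plumbing and unit-tested on its own. The sketch below is an illustration under stated assumptions: `Container` and `Resources` are simplified stand-ins, not the real `k8s_openapi` structs, and `check_limits` and `container` are hypothetical helper names.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a container's resource requirements.
struct Resources {
    limits: Option<BTreeMap<String, String>>,
}

/// Simplified stand-in for a container in a Pod spec.
struct Container {
    name: String,
    resources: Option<Resources>,
}

/// Returns Err with the denial message for the first container without limits.
fn check_limits(containers: &[Container]) -> Result<(), String> {
    for c in containers {
        let has_limits = c
            .resources
            .as_ref()
            .and_then(|r| r.limits.as_ref())
            .is_some();
        if !has_limits {
            return Err(format!(
                "Container '{}' must specify resource limits",
                c.name
            ));
        }
    }
    Ok(())
}

/// Test helper: a container with a `resources` block and an optional CPU limit.
fn container(name: &str, cpu_limit: Option<&str>) -> Container {
    Container {
        name: name.to_string(),
        resources: Some(Resources {
            limits: cpu_limit.map(|v| BTreeMap::from([("cpu".to_string(), v.to_string())])),
        }),
    }
}

fn main() {
    let pod = [container("app", Some("500m")), container("sidecar", None)];
    println!("{:?}", check_limits(&pod));
}
```

Keeping the decision in a pure function means the policy can be tested without a cluster, and the handler only translates its result into an allow or deny response.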

The Rust Cloud-Native Toolkit

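Crate           Purpose
kube            Kubernetes client and controller runtime
actix-web       HTTP server for webhooks and APIs
tokio           Async runtime
tonic           gRPC framework
opentelemetry   Observability instrumentation
clap            CLI argument parsing

# Cargo.toml for a Kubernetes controller
[dependencies]
# "admission" enables kube::core::admission, used by the webhook above
kube = { version = "0.92", features = ["runtime", "derive", "admission"] }
k8s-openapi = { version = "0.22", features = ["latest"] }
tokio = { version = "1", features = ["full"] }
# "rustls" enables HttpServer::bind_rustls
actix-web = { version = "4", features = ["rustls"] }
serde = { version = "1", features = ["derive"] }
# serde_json and schemars are needed by the CustomResource derive
serde_json = "1"
schemars = "0.8"
tracing = "0.1"
opentelemetry = "0.23"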

Building a Kubernetes Operator in Rust

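use kube::{runtime::controller::Action, Client, CustomResource, ResourceExt};
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};
use std::{fmt, sync::Arc};
use tokio::time::Duration;

// Custom Resource Definition
#[derive(CustomResource, Deserialize, Serialize, Clone, Debug, JsonSchema)]
#[kube(group = "platform.acme.com", version = "v1", kind = "DatabaseClaim")]
pub struct DatabaseClaimSpec {
    pub engine: String,      // postgres, mysql
    pub size: String,        // small, medium, large
    pub team: String,
}

// Shared state passed to every reconcile call (fields are illustrative)
pub struct Context {
    pub client: Client,
}

// Errors the reconciler can return
#[derive(Debug)]
pub enum Error {
    UnsupportedEngine(String),
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::UnsupportedEngine(engine) => write!(f, "unsupported engine: {engine}"),
        }
    }
}

impl std::error::Error for Error {}

// `provision_postgres` and `provision_mysql` are application-specific async
// helpers (not shown) that return Result<(), Error>.
async fn reconcile(
    claim: Arc<DatabaseClaim>,
    _ctx: Arc<Context>,
) -> Result<Action, Error> {
    let name = claim.name_any();
    let spec = &claim.spec;

    // Provision database based on claim
    match spec.engine.as_str() {
        "postgres" => provision_postgres(&name, &spec.size).await?,
        "mysql" => provision_mysql(&name, &spec.size).await?,
        _ => return Err(Error::UnsupportedEngine(spec.engine.clone())),
    }

    // Requeue after 5 minutes for health check
    Ok(Action::requeue(Duration::from_secs(300)))
}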
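One possible refinement of the string match in `reconcile` is to parse the engine into an enum once, so the rest of the code works with a closed set of values and the compiler checks that every engine is handled. This is a sketch under assumptions, not part of the operator above: `Engine`, `UnsupportedEngine`, and `plan` are hypothetical names, and `plan` only returns a description instead of provisioning anything.

```rust
use std::fmt;
use std::str::FromStr;

/// The engines the hypothetical operator knows how to provision.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Engine {
    Postgres,
    Mysql,
}

/// Error carrying the engine name that was rejected.
#[derive(Debug, PartialEq, Eq)]
struct UnsupportedEngine(String);

impl fmt::Display for UnsupportedEngine {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "unsupported engine: {}", self.0)
    }
}

impl std::error::Error for UnsupportedEngine {}

impl FromStr for Engine {
    type Err = UnsupportedEngine;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "postgres" => Ok(Engine::Postgres),
            "mysql" => Ok(Engine::Mysql),
            other => Err(UnsupportedEngine(other.to_string())),
        }
    }
}

/// Describes the provisioning step for a claim without performing it.
fn plan(engine: &str, size: &str) -> Result<String, UnsupportedEngine> {
    let engine: Engine = engine.parse()?;
    // No wildcard arm: adding a new Engine variant forces this match to be updated.
    Ok(match engine {
        Engine::Postgres => format!("provision postgres ({size})"),
        Engine::Mysql => format!("provision mysql ({size})"),
    })
}

fn main() {
    println!("{:?}", plan("postgres", "small"));
    println!("{:?}", plan("oracle", "large"));
}
```

With this shape, an unsupported engine is rejected in one place, and the exhaustive match replaces the catch-all `_` arm.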

When to Use Rust vs Go

Use Rust for:

  • Data plane components (proxies, sidecars, filters)
  • Security-critical admission controllers
  • High-throughput observability pipelines
  • Custom CNI plugins
  • Anything where tail latency matters

Use Go for:

  • Standard Kubernetes controllers and operators
  • CLI tools
  • API servers
  • Anything where ecosystem maturity matters more than raw performance

Use Python/Ansible for:

  • Automation and orchestration (obviously — see Ansible Pilot)
  • Scripting and glue code
  • ML/AI workloads

Deploying Rust Services on Kubernetes

# Multi-stage build for minimal image
FROM rust:1.78 AS builder
WORKDIR /app
COPY . .
RUN cargo build --release

FROM gcr.io/distroless/cc-debian12
COPY --from=builder /app/target/release/admission-webhook /
EXPOSE 8443
ENTRYPOINT ["/admission-webhook"]

Final image size: ~15MB. Compare that to a typical Go service at 30-50MB or a Java service at 200MB+.

For deploying these minimal container images on Kubernetes, Kubernetes Recipes has detailed patterns for distroless containers and security scanning integration.

The Learning Curve Reality

Rust’s learning curve is real. The borrow checker will fight you for weeks. But once it clicks:

  • If it compiles, it (usually) works correctly
  • Refactoring is fearless: the compiler flags the type and ownership errors a change introduces
  • Data races that would haunt you in Go/C++ are caught at compile time
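As a small illustration of the last point, the sketch below shares a counter across threads using only the standard library. The function name `count_admissions` is hypothetical. The relevant detail is what the compiler refuses: handing the same plain `&mut u64` to several spawned threads does not compile, so shared mutable state has to go through a synchronized type such as `Arc<Mutex<_>>`.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `workers` threads that each increment a shared counter `per_worker` times.
fn count_admissions(workers: usize, per_worker: u64) -> u64 {
    // Arc gives shared ownership across threads; Mutex guards the mutation.
    let total = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let total = Arc::clone(&total);
            thread::spawn(move || {
                for _ in 0..per_worker {
                    *total.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    // Wait for every worker before reading the final value.
    for handle in handles {
        handle.join().unwrap();
    }

    let result = *total.lock().unwrap();
    result
}

fn main() {
    // prints 4000
    println!("{}", count_admissions(4, 1000));
}
```

Logic errors such as deadlocks are still possible, but unsynchronized access to shared data is rejected before the program runs.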

For platform teams, my recommendation: start with a non-critical admission webhook, get comfortable, then tackle more complex components. Don’t rewrite your entire platform in Rust — that’s the kind of mistake I’ve seen at Open Empower consulting engagements. Rewrite the right things in Rust.


Luca Berton

AI & Cloud Advisor with 18+ years experience. Author of 8 technical books, creator of Ansible Pilot, and instructor at CopyPasteLearn Academy. Speaker at KubeCon EU & Red Hat Summit 2026.
