Solomon Hykes (Docker co-founder) said in 2019: “If WASM+WASI existed in 2008, we wouldn’t have needed to create Docker.” In 2026, that prediction is materializing. Rust is the primary language targeting WebAssembly for server-side workloads, and the ecosystem has reached production readiness.
Why Wasm for Cloud Native?
| Property | Containers | Wasm |
|---|---|---|
| Cold start | 100ms-5s | 1-10ms |
| Image size | 50MB-1GB | 1-10MB |
| Sandbox isolation | Linux namespaces | Capability-based |
| Cross-platform | OCI + kernel | Universal bytecode |
| Language support | Any | Rust, Go, C, JS, Python |
| Startup memory | 50-500MB | 1-20MB |
WASI: The Server-Side Interface
WASI (WebAssembly System Interface) provides standardized system access:
// A WASI HTTP handler in Rust (simplified sketch).
// `collect_metrics`, `read_body`, `run_inference`, and `set_response` are
// application helpers assumed to be defined elsewhere in the crate.
use wasi::http::types::{IncomingRequest, Method, ResponseOutparam};
#[export_name = "wasi:http/incoming-handler#handle"]
pub fn handle(request: IncomingRequest, response_out: ResponseOutparam) {
    let path = request.path_with_query().unwrap_or_default();
    let method = request.method();
    // Route on the (method, path) pair; unknown routes get a 404.
    let body = match (method, path.as_str()) {
        (Method::Get, "/health") => "OK".to_string(),
        (Method::Get, "/metrics") => collect_metrics(),
        (Method::Post, "/inference") => {
            let input = read_body(&request);
            run_inference(&input)
        }
        _ => {
            set_response(response_out, 404, "Not Found");
            return;
        }
    };
    set_response(response_out, 200, &body);
}
Wasm Components: Composable Modules
The Component Model (standardized in 2025) enables composing Wasm modules like Unix pipes:
// WIT (Wasm Interface Type) definition
package ai:inference;
interface model {
    record input {
        tokens: list<u32>,
        max-length: u32,
        temperature: f32,
    }
    record output {
        tokens: list<u32>,
        latency-ms: u64,
    }
    infer: func(input: input) -> result<output, string>;
}
world inference-service {
    import wasi:logging/logging;
    import wasi:config/runtime;
    export model;
}
// Implement the component in Rust
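wit_bindgen::generate!({
    world: "inference-service",
});
// Bindings for the exported `model` interface are generated under `exports`.
use exports::ai::inference::model::{Guest, Input, Output};
struct InferenceComponent;
impl Guest for InferenceComponent {
    fn infer(input: Input) -> Result<Output, String> {
        // Look up the model location from runtime configuration.
        let _model_path = wasi::config::runtime::get("model-path")
            .map_err(|e| format!("config error: {e:?}"))?;
        // Run inference. `model::forward` is an application-specific function,
        // assumed here to return Result<Vec<u32>, String>.
        let start = std::time::Instant::now();
        let result = model::forward(&input.tokens, input.max_length)?;
        Ok(Output {
            tokens: result,
            latency_ms: start.elapsed().as_millis() as u64,
        })
    }
}
// Register the implementation as the component's export.
export!(InferenceComponent);
Spin (Fermyon): The Serverless Framework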
use spin_sdk::http::{IntoResponse, Method, Request, Response};
use spin_sdk::http_component;
use spin_sdk::key_value::Store;
#[http_component]
fn handle_request(req: Request) -> anyhow::Result<impl IntoResponse> {
    // Open the application's default key-value store.
    let store = Store::open_default()?;
    match req.method() {
        Method::Get => {
            // The URL path (minus the leading slash) is the key.
            let key = req.path().trim_start_matches('/');
            let value = store.get(key)?;
            Ok(Response::builder()
                .status(200)
                .body(value.unwrap_or_default())
                .build())
        }
        Method::Put => {
            let key = req.path().trim_start_matches('/');
            let body = req.body().to_vec();
            store.set(key, &body)?;
            Ok(Response::builder()
                .status(201)
                .body("Created")
                .build())
        }
        // Any other method is rejected.
        _ => Ok(Response::builder().status(405).build()),
    }
}
Deploy:
spin build
spin deploy # deploys to Fermyon Cloud in seconds
wasmCloud: Distributed Wasm
wasmCloud runs Wasm components across a distributed mesh:
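// Note: this uses the actor SDK from wasmCloud releases before 1.0.
use wasmbus_rpc::actor::prelude::*;
use wasmcloud_interface_httpserver::*;
use wasmcloud_interface_keyvalue::*;
#[derive(Debug, Default, Actor, HealthResponder)]
#[services(Actor, HttpServer)]
struct MyActor {}
#[async_trait]
impl HttpServer for MyActor {
    async fn handle_request(&self, ctx: &Context, req: &HttpRequest) -> RpcResult<HttpResponse> {
        // This component can run on any node in the lattice
        // wasmCloud handles capability negotiation
        let kv = KeyValueSender::new();
        // Use the request path as the key; a missing key maps to a 404.
        match kv.get(ctx, &req.path).await {
            Ok(GetResponse { value, exists: true }) => Ok(HttpResponse::ok(value)),
            _ => Ok(HttpResponse::not_found()),
        }
    }
}
Kubernetes + Wasm: SpinKube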
Run Wasm workloads alongside containers in Kubernetes:
apiVersion: core.spinkube.dev/v1alpha1
kind: SpinApp
metadata:
  name: inference-edge
spec:
  image: "ghcr.io/my-org/inference:v1.0"
  replicas: 3
  executor: containerd-shim-spin
  runtimeConfig:
    keyValueStores:
      - name: default
        type: redis
        options:
          - name: url
            value: redis://redis:6379
Benefits over traditional pods:
- Cold starts of around 1 ms, versus 2-5 s for containers
- Around 2 MB of memory per instance, versus 50-200 MB
- Sandboxed by default: no filesystem access unless explicitly granted
- Scale to zero becomes practical, because a stopped instance restarts almost instantly
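The memory figures above translate directly into instance density. As a rough back-of-the-envelope sketch (the function name and the 8 GiB node size are assumptions for illustration, and runtime and OS overhead are ignored):

```rust
/// Number of instances that fit in a node's memory budget.
/// Integer division drops any partial instance; overhead is ignored.
fn instances_per_node(node_memory_mb: u64, per_instance_mb: u64) -> u64 {
    if per_instance_mb == 0 {
        return 0; // guard against division by zero on nonsensical input
    }
    node_memory_mb / per_instance_mb
}
```

On a hypothetical 8 GiB (8192 MB) node, `instances_per_node(8192, 2)` gives 4096 Wasm instances, against 163 containers at 50 MB each or 40 at 200 MB each.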
When Wasm Beats Containers
Choose Wasm when:
- Edge deployments with constrained resources
- Serverless functions needing sub-10ms cold starts
- Plugin systems (extend applications safely)
- Multi-tenant workloads (stronger isolation per tenant)
- Polyglot composition (mix Rust, Go, JS components)
Stick with containers when:
- Running existing applications unchanged
- GPU workloads (no GPU access in Wasm yet)
- Complex system dependencies (databases, runtimes)
- Long-running stateful services
- Full Linux syscall surface needed
The Rust Advantage for Wasm
Rust produces some of the smallest Wasm binaries, close to C and well below languages that need to ship a runtime:
| Language | Hello World Wasm Size | HTTP Handler Size |
|---|---|---|
| Rust | 1.8 KB | ~2 MB |
| Go (TinyGo) | 250 KB | ~8 MB |
| C | 1.5 KB | ~1.5 MB |
| JavaScript | N/A (needs runtime) | ~15 MB |
| Python | N/A (needs runtime) | ~30 MB |
Rust’s no-runtime, no-GC design maps perfectly to Wasm’s linear memory model.
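To make the linear memory point concrete, a Wasm module's memory can be pictured as one flat byte array in which data is addressed by offset and length. The following is a toy native-Rust model of that idea, not runtime code; `LinearMemory` and its methods are hypothetical names chosen for illustration.

```rust
/// Toy stand-in for a Wasm linear memory: one contiguous byte array.
struct LinearMemory {
    bytes: Vec<u8>,
    next_free: usize, // bump-allocator cursor: everything below is in use
}

impl LinearMemory {
    fn new(capacity: usize) -> Self {
        Self { bytes: vec![0; capacity], next_free: 0 }
    }

    /// Copy `data` into memory and return its (offset, length) pair,
    /// or None if it does not fit.
    fn write(&mut self, data: &[u8]) -> Option<(usize, usize)> {
        let end = self.next_free.checked_add(data.len())?;
        if end > self.bytes.len() {
            return None; // a real module would try to grow its memory here
        }
        self.bytes[self.next_free..end].copy_from_slice(data);
        let offset = self.next_free;
        self.next_free = end;
        Some((offset, data.len()))
    }

    /// Bounds-checked read of a region identified by offset and length.
    fn read(&self, offset: usize, len: usize) -> Option<&[u8]> {
        let end = offset.checked_add(len)?;
        self.bytes.get(offset..end)
    }
}
```

Because there is no garbage collector moving objects around, an (offset, length) pair stays valid for as long as the owner keeps the region alive, which is the kind of guarantee Rust's ownership rules already track at compile time.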
Building Wasm Components in Rust
# Install tools
rustup target add wasm32-wasip2
cargo install cargo-component
# Create a new component project
cargo component new my-service
cd my-service
# Build
cargo component build --release
# Run locally (the component must export wasi:http/incoming-handler)
wasmtime serve target/wasm32-wasip2/release/my_service.wasm
# Publish to an OCI registry (wkg is part of the wasm-pkg-tools project)
wkg oci push ghcr.io/my-org/my-service:v1.0 \
    target/wasm32-wasip2/release/my_service.wasm
Production Considerations
Security
Wasm’s capability-based security model means workloads get zero access by default:
- No filesystem access unless granted
- No network access unless granted
- No environment variables unless granted
- Memory fully isolated between modules
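The deny-by-default behavior in this list can be sketched as a set of explicitly granted capabilities that every access is checked against. This is a conceptual model in plain Rust, not how any particular runtime implements it; `Capability`, `Sandbox`, `grant`, and `check` are hypothetical names.

```rust
use std::collections::HashSet;

/// The kinds of access a host may hand to a module (illustrative only).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Capability {
    ReadDir(String),
    Connect(String),
    EnvVar(String),
}

/// A sandbox starts with an empty grant set, so everything is denied.
#[derive(Default)]
struct Sandbox {
    granted: HashSet<Capability>,
}

impl Sandbox {
    /// Builder-style grant of a single capability.
    fn grant(mut self, cap: Capability) -> Self {
        self.granted.insert(cap);
        self
    }

    /// Allow the access only if that exact capability was granted.
    fn check(&self, cap: &Capability) -> Result<(), String> {
        if self.granted.contains(cap) {
            Ok(())
        } else {
            Err(format!("denied: {cap:?}"))
        }
    }
}
```

A sandbox built with `Sandbox::default().grant(Capability::ReadDir("/data".into()))` passes a check for reading `/data` and rejects everything else, including network connections and environment variables that were never granted.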
Observability
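// OpenTelemetry works in Wasm
use opentelemetry::global;
use opentelemetry::trace::Tracer;
fn handle_request(req: Request) -> Response {
    // Obtain a tracer from the globally configured provider.
    let tracer = global::tracer("my-service");
    // The span ends when `_span` is dropped at the end of the function.
    let _span = tracer.start("handle_request");
    // Traces, metrics, and logs work the same as in native code
    // (`counter!` comes from the `metrics` crate).
    metrics::counter!("requests_total").increment(1);
    // ...
}
Limitations in 2026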
- No threads (shared-nothing model only)
- No GPU/accelerator access
- Limited socket support (HTTP only in most runtimes)
- Component Model tooling still maturing
- Debugging experience inferior to native
Wasm won’t replace containers — but it will handle the workloads where containers are overkill. The 1ms cold start changes what’s architecturally possible.