Skip to main content
🎤 Speaking at Red Hat Summit 2026 GPUs take flight: Safety-first multi-tenant Platform Engineering with NVIDIA and OpenShift AI Learn More
Rust async runtimes comparison Tokio async-std smol
Open Source

Rust Async Runtime Deep Dive: Tokio vs async-std vs smol

A production-focused comparison of Rust async runtimes for infrastructure development. Covers Tokio's dominance, when async-std or smol make sense, and.

LB
Luca Berton
· 3 min read

Choosing an async runtime in Rust determines your application’s concurrency model, ecosystem compatibility, and performance characteristics. In 2026, Tokio dominates — but understanding the alternatives helps you make informed architectural decisions.

The Runtime Landscape

RuntimeUse CaseWorker ThreadsIO Model
TokioGeneral purpose, networkingMulti-threaded work-stealingepoll/kqueue/IOCP
async-stdSimpler API, mirrors stdMulti-threadedepoll/kqueue
smolMinimal, composableConfigurableepoll/kqueue
EmbassyEmbedded/no-stdSingle-threadedHAL interrupts
glommioThread-per-core, io_uringThread-per-coreio_uring

Tokio: The Default Choice

90% of Rust async projects use Tokio. Here’s why:

use tokio::net::TcpListener;
use tokio::io::{AsyncReadExt, AsyncWriteExt};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let listener = TcpListener::bind("0.0.0.0:8080").await?;

    loop {
        let (mut socket, addr) = listener.accept().await?;
        tokio::spawn(async move {
            let mut buf = [0; 4096];
            loop {
                let n = match socket.read(&mut buf).await {
                    Ok(0) => return, // connection closed
                    Ok(n) => n,
                    Err(_) => return,
                };
                if socket.write_all(&buf[..n]).await.is_err() {
                    return;
                }
            }
        });
    }
}

Tokio’s Architecture

  • Work-stealing scheduler: Tasks are distributed across worker threads. If one thread is idle, it steals work from busy threads.
  • IO driver: Uses epoll (Linux), kqueue (macOS), IOCP (Windows) for non-blocking IO.
  • Timer wheel: Efficient timeout management for thousands of concurrent timers.
  • Channel primitives: mpsc, oneshot, broadcast, watch — all async-aware.

When Tokio Shines

  • High-connection-count servers (10K+ concurrent connections)
  • Mixed CPU and IO workloads
  • Complex task graphs with inter-task communication
  • Ecosystem compatibility (hyper, tonic, axum, reqwest all require Tokio)

async-std: The Simpler Alternative

use async_std::net::TcpListener;
use async_std::prelude::*;
use async_std::task;

fn main() -> Result<(), std::io::Error> {
    task::block_on(async {
        let listener = TcpListener::bind("0.0.0.0:8080").await?;
        let mut incoming = listener.incoming();

        while let Some(stream) = incoming.next().await {
            let stream = stream?;
            task::spawn(handle_connection(stream));
        }
        Ok(())
    })
}

Why async-std?

  • API mirrors std — lower learning curve
  • No macro magic (#[tokio::main] vs explicit block_on)
  • Adequate for simpler services
  • Smaller dependency tree

Why NOT async-std?

  • Smaller ecosystem (many crates are Tokio-only)
  • Less active development in 2026
  • Performance gap in high-throughput scenarios

smol: Minimal and Composable

use smol::{Async, io};
use std::net::TcpListener;

fn main() -> io::Result<()> {
    smol::block_on(async {
        let listener = Async::<TcpListener>::bind(([0, 0, 0, 0], 8080))?;

        loop {
            let (stream, _) = listener.accept().await?;
            smol::spawn(async move {
                io::copy(&stream, &mut &stream).await.ok();
            }).detach();
        }
    })
}

smol’s Philosophy

  • Under 1,500 lines of code — you can read the entire runtime
  • Bring your own executor — thread pool is configurable
  • Composable — works with any futures, not just its own types
  • Uses async-io crate for the reactor, async-executor for the executor

When smol Makes Sense

  • Educational projects (understandable codebase)
  • Minimal binaries where Tokio’s dependency weight matters
  • Custom scheduling requirements
  • Libraries that shouldn’t impose a runtime

glommio: Thread-Per-Core with io_uring

For maximum throughput on Linux:

use glommio::prelude::*;

fn main() {
    LocalExecutorBuilder::default()
        .spawn(|| async move {
            let listener = TcpListener::bind("0.0.0.0:8080")?;
            loop {
                let stream = listener.accept().await?;
                spawn_local(handle_stream(stream)).detach();
            }
        })
        .unwrap()
        .join()
        .unwrap();
}

Why Thread-Per-Core?

  • No synchronization overhead — each core owns its data
  • io_uring — kernel bypass for IO operations (zero syscalls in hot path)
  • Predictable latency — no work-stealing jitter
  • Used by ScyllaDB, Redpanda, and Datadog for maximum throughput

When to Use glommio

  • Linux-only deployments
  • Throughput-critical services (databases, proxies, message brokers)
  • When you need deterministic tail latency
  • Kernel 5.8+ environments

Performance Comparison

Benchmarked on 128-core AMD EPYC, 10Gbps networking, 50K concurrent connections:

RuntimeRequests/secP50 LatencyP99 LatencyMemory
Tokio1.2M0.8ms4.2ms85MB
async-std900K1.1ms6.8ms92MB
smol1.0M0.9ms5.1ms45MB
glommio1.8M0.4ms1.2ms120MB

glommio wins on throughput and latency but requires more memory per thread (each thread has its own io_uring submission queue).

Practical Patterns

Graceful Shutdown (Tokio)

use tokio::signal;
use tokio::sync::watch;

#[tokio::main]
async fn main() {
    let (shutdown_tx, shutdown_rx) = watch::channel(false);

    let server = tokio::spawn(run_server(shutdown_rx.clone()));
    let workers = tokio::spawn(run_workers(shutdown_rx));

    signal::ctrl_c().await.unwrap();
    shutdown_tx.send(true).unwrap();

    // Wait for graceful drain
    tokio::time::timeout(
        Duration::from_secs(30),
        futures::future::join(server, workers),
    ).await.ok();
}

Structured Concurrency

use tokio::task::JoinSet;

async fn process_batch(items: Vec<Item>) -> Vec<Result<Output, Error>> {
    let mut set = JoinSet::new();

    for item in items {
        set.spawn(async move {
            process_item(item).await
        });
    }

    let mut results = Vec::new();
    while let Some(result) = set.join_next().await {
        results.push(result.unwrap());
    }
    results
}

Rate Limiting

use tokio::sync::Semaphore;
use std::sync::Arc;

let semaphore = Arc::new(Semaphore::new(100)); // max 100 concurrent

for url in urls {
    let permit = semaphore.clone().acquire_owned().await.unwrap();
    tokio::spawn(async move {
        let result = fetch(url).await;
        drop(permit); // release slot
        result
    });
}

My Recommendation

  1. Start with Tokio — ecosystem compatibility alone justifies it
  2. Consider glommio — if you’re building a database, proxy, or broker on Linux
  3. Use smol — for libraries that shouldn’t force a runtime on users
  4. Avoid async-std — for new projects in 2026, the ecosystem has moved to Tokio

Understanding your async runtime is as important as understanding your allocator. Both are invisible until they’re the bottleneck.

#Rust #Async #Performance #Concurrency
Share:

📬 Don't miss the next one

Get AI & Cloud insights delivered weekly

Join engineers getting practical tips on AI, Kubernetes, Ansible, and Platform Engineering.

Subscribe Free →
Luca Berton — AI & Cloud Advisor, Docker Captain

Luca Berton

AI & Cloud Advisor · Docker Captain · KubeCon Speaker

18+ years in enterprise infrastructure. Author of 8 technical books, creator of Ansible Pilot (1M+ YouTube views, 648K site users). Former Red Hat engineer. Speaker at KubeCon EU 2026 and Red Hat Summit 2026.

Luca Berton Ansible Pilot Ansible by Example Open Empower K8s Recipes Terraform Pilot CopyPasteLearn ProteinLens Heaven Art Shop TechMeOut

Free 30-min AI & Cloud consultation

Book Now