Automation

Edge AI with Ansible: Automating Model Deployment Across Hundreds of Devices

Luca Berton 1 min read
#edge-ai#ansible#automation#fleet-management#devops#deployment

The Fleet Management Problem

You have 200 Jetson Orin devices running quality inspection across 15 factories. A new model version is ready. How do you deploy it?

SSH into 200 devices? No. Kubernetes? Maybe, but many edge environments don’t have it. The answer for most edge fleets: Ansible.

Inventory: Organizing Your Edge Fleet

# inventory/edge_devices.ini

[factory_amsterdam]
edge-ams-01 ansible_host=10.1.1.10 gpu_type=orin_nano
edge-ams-02 ansible_host=10.1.1.11 gpu_type=orin_nano
edge-ams-03 ansible_host=10.1.1.12 gpu_type=orin_nano

[factory_berlin]
edge-ber-01 ansible_host=10.2.1.10 gpu_type=orin_nano
edge-ber-02 ansible_host=10.2.1.11 gpu_type=orin_nano

[factory_paris]
edge-par-01 ansible_host=10.3.1.10 gpu_type=orin_nx
edge-par-02 ansible_host=10.3.1.11 gpu_type=orin_nx

[all:vars]
ansible_user=edge-admin
ansible_ssh_private_key_file=~/.ssh/edge_fleet_key
model_registry=registry.internal:5000
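
Maintaining that INI file by hand gets tedious as devices come and go. One option is to render it from a flat device list exported from whatever asset database you already have. A minimal sketch (hypothetical helper; group names and vars mirror the inventory above):

```python
# generate_inventory.py - hypothetical helper: render the INI inventory
# from a flat device list (e.g. exported from a CMDB or asset database).
# Group names, hosts, and vars mirror the example above.

DEVICES = [
    ("factory_amsterdam", "edge-ams-01", "10.1.1.10", "orin_nano"),
    ("factory_amsterdam", "edge-ams-02", "10.1.1.11", "orin_nano"),
    ("factory_berlin",    "edge-ber-01", "10.2.1.10", "orin_nano"),
]

def render_inventory(devices):
    """Group devices by factory and emit Ansible INI inventory text."""
    groups = {}
    for group, name, ip, gpu in devices:
        groups.setdefault(group, []).append(
            f"{name} ansible_host={ip} gpu_type={gpu}"
        )
    lines = []
    for group in sorted(groups):
        lines.append(f"[{group}]")
        lines.extend(groups[group])
        lines.append("")  # blank line between groups
    lines += [
        "[all:vars]",
        "ansible_user=edge-admin",
        "ansible_ssh_private_key_file=~/.ssh/edge_fleet_key",
        "model_registry=registry.internal:5000",
    ]
    return "\n".join(lines)

if __name__ == "__main__":
    print(render_inventory(DEVICES))
```

Regenerate the file whenever the device list changes and the inventory stays in sync with reality.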

Playbook: Model Deployment with Canary

---
# deploy_model.yml - Rolling model update with canary
- name: Deploy AI model to edge fleet
  hosts: all
  serial: "10%"  # Canary: 10% of devices at a time
  max_fail_percentage: 5
  vars:
    model_name: defect-detection
    model_version: "3.2"
    model_file: "{{ model_name }}-v{{ model_version }}-int8.onnx"
    rollback_version: "3.1"

  pre_tasks:
    - name: Check device health before update
      uri:
        url: "http://localhost:8080/health"
        return_content: yes
      register: health_check
      failed_when: health_check.json.status != 'healthy'

    - name: Record current model version for rollback
      command: cat /opt/models/current_version
      register: current_version
      changed_when: false

  tasks:
    - name: Deploy and validate the new model, rolling back on failure
      block:
        - name: Download new model from registry
          get_url:
            url: "http://{{ model_registry }}/models/{{ model_file }}"
            dest: "/opt/models/{{ model_file }}"
            # Expects a model_checksums map (version -> sha256) in group_vars
            checksum: "sha256:{{ model_checksums[model_version] }}"

        - name: Stop inference service
          systemd:
            name: inference-engine
            state: stopped

        - name: Update model symlink
          file:
            src: "/opt/models/{{ model_file }}"
            dest: /opt/models/active_model.onnx
            state: link

        - name: Update version tracker
          copy:
            content: "{{ model_version }}"
            dest: /opt/models/current_version

        - name: Start inference service
          systemd:
            name: inference-engine
            state: started

        - name: Wait for model to load
          uri:
            url: "http://localhost:8080/health"
            return_content: yes
          register: post_health
          retries: 12
          delay: 5
          until: post_health.json.model_loaded | default(false)

        - name: Run validation inference
          uri:
            url: "http://localhost:8080/validate"
            method: POST
            body_format: json
            body:
              test_image: "/opt/test-data/reference.jpg"
              expected_class: "no_defect"
              min_confidence: 0.85
          register: validation
          failed_when: not validation.json.passed

      rescue:
        - name: Roll back symlink to previous model
          file:
            src: "/opt/models/{{ model_name }}-v{{ rollback_version }}-int8.onnx"
            dest: /opt/models/active_model.onnx
            state: link

        - name: Restart inference service on the rolled-back model
          systemd:
            name: inference-engine
            state: restarted

        - name: Mark the host as failed so the canary math counts it
          fail:
            msg: "v{{ model_version }} failed validation; rolled back to v{{ rollback_version }}"

The serial: "10%" setting is what makes this a canary: Ansible runs the whole play on 10% of the fleet, validates, then moves to the next batch. Combined with max_fail_percentage: 5, a bad model aborts the rollout after the first batch, and the remaining 90% of devices keep running the old version.
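
The checksum lookup in the download task assumes a model_checksums map is defined somewhere in scope. It isn't shown in the playbook, so here is a hypothetical group_vars sketch (the digests are placeholders, not real values):

```yaml
# group_vars/all.yml (hypothetical) - sha256 digests per model version,
# published alongside the model artifacts in the registry
model_checksums:
  "3.1": "<sha256 of defect-detection-v3.1-int8.onnx>"  # placeholder
  "3.2": "<sha256 of defect-detection-v3.2-int8.onnx>"  # placeholder
```

Keeping the digests in group_vars means a corrupted or truncated download fails loudly at get_url time instead of silently loading a broken model.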

Role: Edge Device Setup

# roles/edge-ai-node/tasks/main.yml
---
- name: Install NVIDIA JetPack components
  apt:
    name:
      - nvidia-jetpack
      - nvidia-tensorrt
      - nvidia-cuda-toolkit
    state: present
  when: gpu_type is match("orin.*")

- name: Create model directory
  file:
    path: /opt/models
    state: directory
    owner: inference
    group: inference
    mode: '0755'

- name: Deploy inference engine service
  template:
    src: inference-engine.service.j2
    dest: /etc/systemd/system/inference-engine.service
  notify: reload systemd

- name: Configure log rotation
  template:
    src: inference-logrotate.j2
    dest: /etc/logrotate.d/inference-engine

- name: Set up health monitoring
  template:
    src: node-exporter-textfile.sh.j2
    dest: /opt/monitoring/collect-metrics.sh
    mode: '0755'

- name: Schedule metrics collection
  cron:
    name: "collect inference metrics"
    minute: "*/1"
    job: "/opt/monitoring/collect-metrics.sh"
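
The notify: reload systemd on the service task assumes a matching handler exists in the role. A minimal sketch:

```yaml
# roles/edge-ai-node/handlers/main.yml
---
- name: reload systemd
  systemd:
    daemon_reload: yes
```

Without it, unit-file changes are written to disk but systemd never picks them up until the next reboot.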

Monitoring Playbook

---
# check_fleet.yml - Quick fleet health check
- name: Check edge AI fleet health
  hosts: all
  gather_facts: no

  tasks:
    - name: Get device status
      uri:
        url: "http://localhost:8080/status"
        return_content: yes
      register: status
      ignore_errors: yes

    - name: Report unhealthy devices
      debug:
        msg: |
          ALERT: {{ inventory_hostname }}
          Status: {{ status.json.status | default('UNREACHABLE') }}
          Model: {{ status.json.model_version | default('unknown') }}
          GPU Temp: {{ status.json.gpu_temp | default('N/A') }}°C
          Uptime: {{ status.json.uptime_hours | default('N/A') }}h
      when: status.failed or (status.json.status | default('unknown')) != 'healthy'

Run it every 15 minutes from a cron job:

*/15 * * * * ansible-playbook -i inventory/edge_devices.ini check_fleet.yml 2>&1 | grep ALERT | mail -E -s "Edge AI Fleet Alert" [email protected]

Why Ansible Beats Custom Solutions

I’ve seen teams build custom fleet management tools in Python. They always underestimate:

  • SSH key management
  • Parallel execution with rate limiting
  • Idempotent operations (what if it fails halfway?)
  • Inventory management as devices come and go
  • Rollback logic

Ansible handles all of this out of the box. It’s not the sexiest tool, but for edge fleet management, it’s the most reliable.
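
If you do keep one piece of custom code, make it tiny and publish-side, such as computing the sha256 digests that the get_url task verifies on every device. A minimal sketch (hypothetical helper, stdlib only):

```python
# publish_checksum.py - hypothetical publish-side helper: compute the
# sha256 digest that the playbook's checksum verification expects,
# streaming so large ONNX files never load fully into memory.
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Read the file in 1 MiB chunks and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Run it once when publishing a model version and paste the digest into group_vars alongside the version number.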


Luca Berton

AI & Cloud Advisor with 18+ years experience. Author of 8 technical books, creator of Ansible Pilot, and instructor at CopyPasteLearn Academy. Speaker at KubeCon EU & Red Hat Summit 2026.
