Ansible is agentless and uses YAML. Chef is agent-based and uses Ruby. This fundamental difference shapes everything: learning curve, architecture, debugging, and team adoption.
Architecture
| Aspect | Ansible | Chef |
|---|---|---|
| Agent | Agentless (SSH/WinRM) | Agent on every node (chef-client) |
| Language | YAML (playbooks) | Ruby DSL (recipes/cookbooks) |
| Execution | Push (controller → nodes) | Pull (agent polls server) |
| Server | None required (or AWX/AAP) | Chef Infra Server (standard setup; chef-solo and local mode run without one) |
| State | Stateless (each run is independent) | Server stores node state (run lists, attributes) |
| Transport | SSH (Linux), WinRM (Windows) | HTTPS (agent → server) |
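The push and pull rows in the table can be illustrated with a toy model in plain Ruby. This is only a sketch, not Ansible or Chef code: `Node`, `PolicyServer`, `push_run`, and `pull_run` are hypothetical names, and "applying" configuration is reduced to assigning a hash.

```ruby
# Toy model of push vs. pull execution. Not real Ansible or Chef code:
# every class and method name here is hypothetical.

Node = Struct.new(:name, :state)

# Push: a controller walks its inventory and applies the desired state
# to each node. Nothing happens unless the controller starts a run.
def push_run(inventory, desired)
  inventory.each { |node| node.state = desired.dup }
  inventory.map(&:name)
end

# Pull: the server only stores policy. Each node's agent fetches it and
# applies it locally, on the node's own schedule.
class PolicyServer
  def initialize(policy)
    @policy = policy
  end

  def fetch_policy
    @policy.dup
  end
end

def pull_run(node, server)
  node.state = server.fetch_policy
  node.name
end

desired = { "nginx" => "installed" }

# Push: one call from the controller configures every node.
push_nodes = [Node.new("web1", {}), Node.new("web2", {})]
pushed = push_run(push_nodes, desired)

# Pull: the node initiates the run against the server.
server = PolicyServer.new(desired)
pull_node = Node.new("web3", {})
pulled = pull_run(pull_node, server)
```

The difference that matters is who initiates the run: in the push case the caller holds the inventory, while in the pull case the server never needs to know how to reach a node.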
Language comparison
Ansible (YAML)
---
- name: Configure web server
  hosts: webservers
  become: true
  tasks:
    - name: Install nginx
      ansible.builtin.package:
        name: nginx
        state: present
    - name: Deploy configuration
      ansible.builtin.template:
        src: nginx.conf.j2
        dest: /etc/nginx/nginx.conf
      notify: Restart nginx
    - name: Ensure nginx is running
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true
  handlers:
    - name: Restart nginx
      ansible.builtin.service:
        name: nginx
        state: restarted
Chef (Ruby DSL)
# recipes/default.rb
package 'nginx' do
  action :install
end
template '/etc/nginx/nginx.conf' do
  source 'nginx.conf.erb'
  notifies :restart, 'service[nginx]'
end
service 'nginx' do
  action [:enable, :start]
end
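Both achieve the same result: nginx is installed, configured, enabled, and running, and it restarts only when the configuration file changes. The Ansible playbook can be read without programming experience. The Chef recipe assumes familiarity with Ruby syntax.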
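Both examples above rely on the same idea: a resource queues a restart only if it actually changed something, and the queued action runs once at the end of the run. The plain-Ruby sketch below models that behavior under simplifying assumptions. `MiniRun`, `ensure_value`, and `flush_handlers` are hypothetical names and are not part of Ansible or Chef.

```ruby
# Minimal sketch of "notify on change" semantics, similar in spirit to
# Ansible handlers and Chef's default delayed notifications.
# MiniRun and its methods are hypothetical, not part of either tool.

class MiniRun
  attr_reader :log

  def initialize
    @log = []
    @pending = [] # handlers queued by resources that changed
  end

  # Set files[path] to content only if it differs (idempotent).
  # The handler is queued only when a change was actually made.
  def ensure_value(files, path, content, notify: nil)
    if files[path] == content
      @log << "ok: #{path}"
    else
      files[path] = content
      @log << "changed: #{path}"
      @pending << notify if notify
    end
  end

  # Run each queued handler once, after all resources were processed.
  def flush_handlers
    @pending.uniq.each { |handler| @log << "handler: #{handler}" }
    @pending.clear
  end
end

files = { "/etc/nginx/nginx.conf" => "old" }

# First run: the file differs, so it is rewritten and the handler fires.
first = MiniRun.new
first.ensure_value(files, "/etc/nginx/nginx.conf", "new", notify: "restart nginx")
first.flush_handlers

# Second run: nothing changed, so no handler is queued.
second = MiniRun.new
second.ensure_value(files, "/etc/nginx/nginx.conf", "new", notify: "restart nginx")
second.flush_handlers
```

After the first run the log records a change followed by the handler. The second run records only an `ok` entry, which is why repeated runs of either tool do not restart nginx needlessly.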
Learning curve
| Aspect | Ansible | Chef |
|---|---|---|
| Time to first playbook | Hours | Days |
| Time to proficiency | 1-2 weeks | 1-3 months |
| Language background needed | None (YAML) | Ruby |
| Concept complexity | Low (tasks run in order) | High (convergence model, run lists, roles, environments) |
| Debugging | -vvv flag, readable output | Stack traces, Ruby debugging |
| Documentation | Excellent | Good but complex |
Scalability
| Metric | Ansible | Chef |
|---|---|---|
| 10 nodes | Direct SSH, seconds | Overkill |
| 100 nodes | Forks (parallel SSH) | Agent pull, natural |
| 1,000 nodes | AWX/AAP with execution environments | Chef Server + load balancing |
| 10,000+ nodes | AAP mesh topology | Chef Server cluster |
| Convergence time | On-demand (push) | 30-min intervals (configurable) |
| Drift detection | Only during runs | Continuous (agent reports) |
Chef's agent-based model scales more naturally for continuous compliance: each agent converges and reports state every 30 minutes by default, without a central controller starting the run. Ansible requires explicit runs or scheduled jobs.
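A rough calculation helps put the node counts in the table into perspective. The sketch below assumes agent check-ins are spread evenly across the 30-minute interval (real deployments approximate this with the client's `splay` setting) and that the push side uses Ansible's default of 5 forks. The helper names are hypothetical, and the numbers are back-of-the-envelope estimates, not benchmarks.

```ruby
# Back-of-the-envelope scaling arithmetic. Helper names are hypothetical.

# Pull model: if check-ins are spread evenly over the interval, the
# server sees roughly nodes / interval check-ins per second.
def checkins_per_second(nodes, interval_seconds)
  nodes.to_f / interval_seconds
end

# Push model: with a fixed number of parallel forks, each task is worked
# through the inventory in batches of at most `forks` hosts.
def push_batches(nodes, forks)
  (nodes.to_f / forks).ceil
end

interval = 30 * 60 # 30 minutes, in seconds
forks = 5          # Ansible's default forks value

[100, 1_000, 10_000].each do |n|
  rate = checkins_per_second(n, interval).round(2)
  puts "#{n} nodes: ~#{rate} check-ins/s, #{push_batches(n, forks)} batches at #{forks} forks"
end
```

At 10,000 nodes the pull side averages under six check-ins per second, while a push run at default settings needs 2,000 sequential batches per task. In practice, large Ansible deployments raise the fork count or distribute execution instead of running with defaults.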
Ecosystem
| Feature | Ansible | Chef |
|---|---|---|
| Content hub | Ansible Galaxy (collections) | Chef Supermarket (cookbooks) |
| Testing | Molecule, ansible-lint | Test Kitchen, ChefSpec, InSpec |
| Cloud modules | 100+ cloud collections | Cloud cookbooks |
| Network automation | Strong (Cisco, Arista, Juniper) | Limited |
| Enterprise platform | Red Hat AAP | Chef Automate (Progress) |
| Compliance | Ansible + SCAP | InSpec (built-in) |
| AI assistant | Red Hat Ansible Lightspeed | None |
Decision guide
Choose Ansible when:
- Agentless is required (security teams often reject agents)
- Your team does not know Ruby (YAML is more accessible)
- You need network automation (routers, switches, firewalls)
- Ad-hoc tasks matter: you run one-off commands across the fleet
- You want faster time to value (hours not months)
- You use the Red Hat ecosystem (RHEL, AAP, Satellite)
Choose Chef when:
- You need continuous convergence: agents enforce state every 30 minutes by default
- Your team knows Ruby and prefers code over YAML
- Compliance as code with InSpec is a priority
- You manage 10,000+ nodes, where the agent-based pull model scales better
- You have existing Chef infrastructure and the migration cost outweighs the benefits