Huawei software-defined factory production domain IT/OT converged infrastructure architecture diagram

Huawei Software-Defined Factory: IT/OT

Inside Huawei’s software-defined factory architecture: converged IT and OT infrastructure with Wi-Fi 7, AI-driven operations, and industrial ring networks.

Luca Berton

Modern manufacturing demands more than isolated production lines and disconnected systems. Huawei’s software-defined factory architecture demonstrates what happens when you converge IT and OT infrastructure into a single, AI-driven platform that manages over 10,000 devices across factory floors.

I had the opportunity to see this architecture up close, and the scale of integration is remarkable.

The Core Architecture: IT/OT Converged Production Network

Huawei’s factory architecture spans three layers (Factory, Workshop, and Shop Floor) connected through a converged production network that bridges the traditional gap between IT systems and OT control networks.

Factory Layer

At the top, factory applications run across five domains:

  • MES (Manufacturing Execution System): production scheduling and tracking
  • LMS (Laboratory Management System): quality testing workflows
  • QMS (Quality Management System): defect tracking and compliance
  • Digital Operations: real-time dashboards and analytics
  • Equipment Management: asset lifecycle and maintenance

These connect through production core switches (S12700E-S) and firewalls (USG6655E) to iMaster NCE, Huawei’s unified operations management platform, which provides a single pane of glass across the entire factory network.

Workshop Layer

The workshop level uses AirEngine 8776 and AirEngine 8775 access points with S5730-H switches, creating a mesh that covers:

  • Production IT systems
  • Wireless networks for mobile devices
  • Dedicated video networks for quality inspection cameras
  • IoT sensor networks

Shop Floor (OT Domain)

The bottom layer is where physical production happens. An industrial ring network built on S5735I-H switches connects:

  • Energy meters and environmental sensors
  • Phones and tablets for operators
  • AGVs (Automated Guided Vehicles) for material transport
  • Cameras for visual quality inspection
  • Robots and PLCs (Programmable Logic Controllers)
  • Industrial computers running real-time control loops

The ring topology provides redundancy: if one link fails, traffic reroutes in milliseconds, which is critical for zero-downtime manufacturing.

Four Pillars of the Architecture

1. Full Wireless Connections with Advanced Frequency Technology

  • Wi-Fi 7 coverage across the entire production floor
  • Flexible production lines with faster adjustments: no rewiring when you reconfigure a line
  • Zero roaming for AGVs, which is the key innovation

Wi-Fi 7 Zero Roaming: How It Works

Traditional factory Wi-Fi has a fundamental problem: when an AGV moves between access points, it must roam, which means disconnecting from one AP, scanning for another, re-authenticating, and often switching channels. In a dense warehouse with steel racks blocking signals, this causes AGV slowdowns or complete breakdowns. Conventional dual-AP solutions require costly CPE modifications on every vehicle.

Huawei Wi-Fi 7 Zero Roaming for AGVs

Huawei’s approach eliminates roaming entirely with same-frequency AP synchronization:

  • Up to 128 APs synchronized as a single AP, all broadcasting on the same channel with the same BSSID
  • 40,000 m² coverage appears as one access point to every device
  • AGVs see the entire network as a single AP: no roaming, no channel switching, no IP changes
  • No AGV CPE modifications needed: existing vehicles work without hardware changes, lowering costs and deployment time

This is fundamentally different from standard Wi-Fi roaming (even fast roaming protocols like 802.11r). The AGV never roams because from its perspective there is only one AP. The network handles the complexity of coordinating 128 physical access points behind the scenes.

The result: uninterrupted AGV operations even in high-bay warehouses where dense steel racks create dead zones that would break conventional Wi-Fi coverage.

2. AI-Driven Operations

  • End-to-end traffic tracing and analysis for production sites and data centers
  • Fast fault location with doubled troubleshooting efficiency
  • iMaster NCE uses AI models trained on network telemetry to predict failures before they cause downtime

When you are running 10,000 devices, manual network management is impossible. AI-driven operations shift from reactive troubleshooting to predictive maintenance.

3. Ultra-High Reliability

  • IT/OT converged, high-reliability network: one infrastructure, not two parallel ones
  • Zero MES instruction loss: every manufacturing instruction reaches the target device
  • Zero PLC command interruptions: control loops never break
  • 10 Gbps bandwidth supports PPB-grade (parts per billion) quality control

The reliability requirement is non-negotiable. A lost PLC command can mean a defective batch, a safety incident, or a production line shutdown. The network is designed for zero packet loss in the OT domain.

4. Security Enhanced

  • Three lines of defense across IT and OT domains
  • Network security collaboration between IT firewalls and OT zone segmentation
  • Zero data loss guarantee

The security model recognizes that converging IT and OT creates new attack surfaces. Traditional IT security (firewalls, IDS) combines with OT-specific protections (zone segmentation, protocol whitelisting, industrial DMZs).

The IT/OT Convergence Reality

The traditional factory has two separate networks: IT for business systems (ERP, MES, email) and OT for production control (PLCs, SCADA, robots). They are managed by different teams, use different protocols, and have different reliability requirements.

Huawei’s approach converges them into a single managed infrastructure. The benefits:

  • Single management plane: iMaster NCE manages both IT switches and OT industrial switches
  • Unified visibility: one dashboard shows the health of business applications and production control
  • Reduced hardware: one network infrastructure instead of two
  • Faster deployment: new production lines connect to the existing converged network

The tradeoff is complexity. IT/OT convergence requires careful network segmentation, strict QoS policies, and security zones that prevent IT traffic from ever interfering with real-time OT control.
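
The segmentation idea can be sketched as a default-deny policy between zones. This is only an illustration in plain Python: the zone names, protocols, and the `ALLOWED_FLOWS` table are assumptions made up for the example, not Huawei configuration or an iMaster NCE API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    src_zone: str
    dst_zone: str
    protocol: str


# Hypothetical allowlist: only these (source zone, destination zone, protocol)
# combinations may cross a zone boundary. Everything else is dropped.
ALLOWED_FLOWS = {
    ("it", "dmz", "https"),
    ("dmz", "ot", "opcua"),
    ("ot", "dmz", "mqtt"),
}


def is_allowed(flow: Flow) -> bool:
    """Default-deny: intra-zone traffic passes, cross-zone traffic must be allowlisted."""
    if flow.src_zone == flow.dst_zone:
        return True
    return (flow.src_zone, flow.dst_zone, flow.protocol) in ALLOWED_FLOWS


print(is_allowed(Flow("it", "ot", "https")))   # IT may not talk to OT directly
print(is_allowed(Flow("dmz", "ot", "opcua")))  # the industrial DMZ may
```

The point of the design is that IT traffic never reaches the OT zone directly. It has to pass through the DMZ on an explicitly listed protocol.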

OT Industrial Hardware: Switches and Gateways

The OT domain runs on purpose-built industrial hardware, not repurposed enterprise IT gear.

Huawei OT industrial switches and gateways

Industrial Gateway: AR502H-V2 Series

The AR502H-NRGL-V2 is a computing-network-security hyper-converged gateway:

  • 4×GE electrical + 2×10GE optical ports
  • 5G/LTE connectivity for wireless backhaul
  • SD-WAN for multi-site factory orchestration
  • IPS/AV/URL security built into the gateway: no separate firewall needed at the OT edge

This is the edge device that bridges factory floor OT traffic to the wider network, combining routing, security, and compute in a single ruggedized unit.

Industrial Switches: Ring Topology with ERPS

The industrial switches deliver three critical capabilities:

  • High reliability: Ethernet Ring Protection Switching (ERPS) provides fast failover within 20 ms upon network-level faults. This ring topology ensures that a single link failure never takes down production
  • High bandwidth: 10G backhaul on uplink/ring ports, supporting machine vision video acquisition for AI-powered quality inspection
  • IT/OT convergence: unified O&M management platform for the production IT network, simplifying operations

These switches support PROFINET, the industrial Ethernet standard used by Siemens PLCs and managed through Siemens TIA Portal (Totally Integrated Automation). This means Huawei’s network hardware integrates directly into existing Siemens automation environments without requiring a forklift upgrade of the control layer.

The ERPS ring topology is critical: in a traditional star topology, losing the central switch means losing the entire network segment. With ring protection, traffic reroutes in under 20 ms, fast enough that PLCs never lose a control cycle.
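
Why a ring survives one link failure but not two can be checked with a small connectivity sketch. It models only the topology, not the ERPS protocol itself (which keeps one ring link blocked in normal operation to prevent loops and unblocks it on failure) and not the failover timing. The ring size and the failed links are arbitrary choices for the example.

```python
from collections import deque


def ring_links(n):
    """Return the links of a ring of n switches as a set of frozensets."""
    return {frozenset((i, (i + 1) % n)) for i in range(n)}


def reachable(n, links, start=0):
    """Breadth-first search: which switches can `start` still reach?"""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for other in range(n):
            if other not in seen and frozenset((node, other)) in links:
                seen.add(other)
                queue.append(other)
    return seen


n = 6
links = ring_links(n)
one_cut = links - {frozenset((2, 3))}
two_cuts = one_cut - {frozenset((4, 5))}

print(len(reachable(n, one_cut)) == n)   # single failure: every switch still reachable
print(len(reachable(n, two_cuts)) == n)  # double failure: the ring is partitioned
```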

Key Hardware in the Stack

Device               Role                                Layer
-------------------  ----------------------------------  -----------------
USG6655E             Next-gen firewall                   Factory perimeter
S12700E-S            Production core switch              Factory backbone
AirEngine 9700       Wi-Fi 7 controller                  Factory
S5730-H              Aggregation switch                  Workshop
AirEngine 8776/8775  Wi-Fi 7 APs                         Workshop floor
AR502H-V2            Industrial gateway (5G/SD-WAN/IPS)  OT edge
S5735I-H             Industrial ring switch (ERPS)       OT domain
iMaster NCE          AI network management               All layers

IIoT Data Platform: Huawei + EMQ Partnership

The data layer is powered by a joint Huawei and EMQ solution built on MQTT, the most widely adopted messaging protocol in industrial IoT.

Huawei and EMQ IIoT platform architecture

The platform runs on NeuronEX, EMQ’s industrial edge data processing engine, with three core pillars:

Data Collection

NeuronEX collects from every type of industrial source:

  • PLCs from ABB, Mitsubishi, Schneider, and Siemens via native drivers
  • OPC UA: the standard protocol for industrial interoperability
  • MQTT: the lightweight pub/sub protocol that has become the de facto standard for IIoT
  • REST/API for IT system integration
  • Database and File/Image sources for batch and vision data
  • Smart devices, CNC machines, and SCADA systems

MQTT is the backbone here. It is the most adopted IIoT protocol because of its lightweight footprint (runs on constrained devices), reliable QoS levels (exactly-once delivery for critical commands), and native pub/sub model that decouples producers from consumers.
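
The decoupling comes from topic filters: a consumer subscribes to a pattern and never needs to know which device publishes. The sketch below implements the standard MQTT wildcard rules (`+` for one level, `#` for the remaining levels) in plain Python. The topic names are hypothetical, and a real broker such as EMQX does this matching for you.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic filter matches a concrete topic name.

    '+' matches exactly one level; '#' matches all remaining levels
    (including none) and is only valid as the last level of the filter.
    """
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    # A wildcard in the first level does not match topics that start with '$'.
    if topic.startswith("$") and filter_levels[0] in ("+", "#"):
        return False
    for i, level in enumerate(filter_levels):
        if level == "#":
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


# Hypothetical topic layout: factory/<line>/<device>/<metric>
print(topic_matches("factory/+/plc7/temperature", "factory/line3/plc7/temperature"))
print(topic_matches("factory/line3/#", "factory/line3/agv2/battery"))
```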

Data Analysis

Before data leaves the edge, NeuronEX processes it locally:

  • Data normalization and transformation: converting vendor-specific formats to unified schemas
  • Streaming processing with time windows: real-time aggregation without cloud roundtrips
  • User-defined functions for custom business logic
  • AI/ML algorithms running at the edge for predictive maintenance and anomaly detection
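
To make "streaming processing with time windows" concrete, here is a minimal tumbling-window average in plain Python. It is not NeuronEX syntax or its API, just a sketch of the technique, and the sample readings are invented.

```python
from collections import defaultdict
from statistics import mean


def tumbling_window_mean(readings, window_seconds):
    """Group (timestamp_seconds, value) pairs into fixed, non-overlapping
    windows and return {window_start: mean_value}."""
    buckets = defaultdict(list)
    for timestamp, value in readings:
        window_start = int(timestamp // window_seconds) * window_seconds
        buckets[window_start].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


# Hypothetical spindle temperature samples: (seconds since start, degrees C)
samples = [(0, 70.0), (4, 72.0), (9, 74.0), (10, 80.0), (15, 82.0), (21, 90.0)]
print(tumbling_window_mean(samples, 10))  # {0: 72.0, 10: 81.0, 20: 90.0}
```

Six raw samples leave the edge as three aggregates, which is the bandwidth and latency argument for processing before data leaves the factory.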

Data Delivery

Processed data routes to enterprise systems via multiple protocols:

  • MQTT and Sparkplug B (the MQTT specification for industrial data with standardized topic namespaces and payloads)
  • HTTP/HTTPS and WebSocket for web applications
  • Database connectors for time-series and relational stores
  • Kafka for high-throughput event streaming to data lakes

On the application side, EMQX (EMQ’s enterprise MQTT broker) feeds into the IIoT Platform, Analytics, Visualization, and custom Applications.
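
The standardized topic namespace that Sparkplug B adds on top of MQTT has the shape `spBv1.0/<group_id>/<message_type>/<edge_node_id>/[<device_id>]`. A small builder illustrates it. The group, node, and device names are hypothetical, and the sketch covers only topics, not the Sparkplug payload encoding or the separate STATE topic used by host applications.

```python
NODE_MESSAGES = {"NBIRTH", "NDEATH", "NDATA", "NCMD"}
DEVICE_MESSAGES = {"DBIRTH", "DDEATH", "DDATA", "DCMD"}


def sparkplug_topic(group_id, message_type, edge_node_id, device_id=None):
    """Build a Sparkplug B topic for a node-level or device-level message."""
    if message_type in NODE_MESSAGES:
        if device_id is not None:
            raise ValueError(f"{message_type} is node-level and takes no device_id")
        parts = ["spBv1.0", group_id, message_type, edge_node_id]
    elif message_type in DEVICE_MESSAGES:
        if device_id is None:
            raise ValueError(f"{message_type} is device-level and needs a device_id")
        parts = ["spBv1.0", group_id, message_type, edge_node_id, device_id]
    else:
        raise ValueError(f"unknown message type: {message_type}")
    return "/".join(parts)


print(sparkplug_topic("plant1", "DDATA", "line3-gateway", "plc7"))
# spBv1.0/plant1/DDATA/line3-gateway/plc7
```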

Why MQTT Won

MQTT has become the standardized protocol for industrial IoT because:

  • Lightweight: minimal packet overhead, runs on microcontrollers with 256 KB RAM
  • Reliable: QoS 0 (at most once), QoS 1 (at least once), QoS 2 (exactly once)
  • Scalable: EMQ reports that EMQX handles 100+ million concurrent connections
  • Standardized: OASIS standard, vendor-neutral, and supported by major cloud IoT services such as AWS IoT Core and Azure IoT Hub
  • Bidirectional: unlike HTTP polling, MQTT pushes data in both directions

AI Quality Inspection

One of the highest-value applications running on this converged infrastructure is AI-powered quality inspection. The 10 Gbps industrial ring network exists largely to support this use case.

Traditional quality control relies on human inspectors or basic optical sensors with fixed thresholds. AI quality inspection replaces this with computer vision models that:

  • Detect defects at PPB (parts per billion) precision, far beyond what human inspectors can achieve
  • Process machine vision video streams in real-time: the 10G backhaul on industrial switches was designed specifically for this bandwidth requirement
  • Learn and adapt: models retrain on new defect types without reprogramming inspection rules
  • Run at the edge: NeuronEX executes AI/ML algorithms locally, avoiding the latency of cloud roundtrips that would slow production lines

The pipeline works end-to-end: high-resolution cameras on the production line capture images at line speed, the industrial ring network delivers video to edge compute nodes with under 20 ms latency, AI models classify defects in real-time, and results feed back into the MES for automated reject/rework decisions.
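
A rough back-of-the-envelope calculation shows why machine vision drives the bandwidth requirement. The camera resolution, frame rate, and the 70% headroom figure below are assumptions chosen for illustration, not numbers from Huawei.

```python
def stream_mbps(width, height, bits_per_pixel, fps, compression_ratio=1.0):
    """Bit rate of one camera stream in megabits per second."""
    return width * height * bits_per_pixel * fps / compression_ratio / 1e6


def cameras_per_link(link_gbps, per_camera_mbps, headroom=0.7):
    """How many streams fit if only `headroom` of the link carries video."""
    return int(link_gbps * 1000 * headroom // per_camera_mbps)


# Hypothetical 5-megapixel monochrome camera, 8 bits per pixel, 30 fps, uncompressed
raw = stream_mbps(2448, 2048, 8, 30)
print(round(raw, 2))              # about 1203 Mbps for a single camera
print(cameras_per_link(1, raw))   # a 1 Gbps link cannot carry even one stream
print(cameras_per_link(10, raw))  # a 10 Gbps link carries a handful
```

Under these assumptions a single uncompressed camera already exceeds a gigabit link, which is consistent with the article's point that 10G backhaul exists largely for inspection video.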

This is why the entire infrastructure stack (Wi-Fi 7 zero-roaming, ERPS ring topology, 10G backhaul, NeuronEX edge processing) comes together. Each component enables the next: without reliable high-bandwidth networking, AI quality inspection at scale is impossible.

Real-World Use Case: Automotive Truck Manufacturing

This is not theoretical. Huawei demonstrated live use cases from a truck manufacturing customer in the automotive industry:

AI quality inspection use cases from automotive truck manufacturing

The AI vision system handles multiple inspection types across the production line:

  • Assembly verification: cameras with HoloSens AI verify that truck cab components are correctly positioned during assembly, catching misalignment before the vehicle moves to the next station
  • Gear and bearing inspection: computer vision detects surface defects on transmission gears and bearings at microscopic precision, with bounding boxes highlighting anomalies that human inspectors would miss
  • Robotic welding quality: AI monitors robotic welding arms in real-time, verifying weld quality and detecting deviations from the programmed path
  • Underbody inspection: automated cameras inspect the undercarriage of assembled vehicles, detecting missing bolts, misrouted cables, or assembly defects
  • VIN and label verification: OCR reads vehicle identification numbers and specification labels, cross-referencing against the MES to ensure the right configuration was built

Each of these inspection points generates high-resolution video streams that flow through the 10G industrial ring network to edge AI nodes running inference models. Defects trigger immediate alerts in the MES: stopping the line if critical, flagging for rework if minor.
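
The stop-or-rework decision can be sketched as a small rule on top of the model output. The defect labels, the confidence threshold, and the split between critical and minor defects are all hypothetical. A real deployment would take them from the MES and the quality plan.

```python
from enum import Enum


class Action(Enum):
    PASS = "pass"
    REWORK = "rework"
    STOP_LINE = "stop_line"


# Hypothetical labels that are treated as critical in this sketch
CRITICAL_DEFECTS = {"missing_bolt", "weld_deviation"}


def decide(detections, min_confidence=0.8):
    """Map model detections, given as (label, confidence) pairs, to an action."""
    confirmed = [label for label, conf in detections if conf >= min_confidence]
    if any(label in CRITICAL_DEFECTS for label in confirmed):
        return Action.STOP_LINE
    if confirmed:
        return Action.REWORK
    return Action.PASS


print(decide([("scratch", 0.91)]))                          # Action.REWORK
print(decide([("scratch", 0.91), ("missing_bolt", 0.97)]))  # Action.STOP_LINE
```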

Storage: OceanStor as the Foundation of Production

Manufacturing generates massive volumes of data that must be stored reliably for years. In industries like battery manufacturing and automotive, regulatory requirements mandate 10+ years of B2B data retention: every quality inspection image, every sensor reading, every production parameter must be traceable.

Huawei OceanStor production storage architecture

Huawei’s OceanStor Dorado all-flash storage platform underpins the entire factory data infrastructure, serving three core production systems:

  • MES: plan management, production management, warehouse management, quality management
  • ERP System: order management, financial planning, procurement, HR
  • PLM/PDM: design drawings, quality control files, product documentation, supply information

All-Flash Acceleration

  • 1 million tpmC per database with fine-grained virtualization management
  • 40 million IOPS at 0.5 ms latency, critical for real-time MES queries during production
  • 6x bandwidth improvement for faster file access across design, quality, and production data

Resilience and Reliability

  • 99.99999% reliability (seven nines): gateway-free active-active SAN and NAS integration with smooth upgrade to multi-datacenter active-active configuration
  • Six-layer ransomware protection with network-storage collaboration achieving 99.9% ransomware detection rate
  • Three-tier architecture: Production DC (HyperMetro) → Intra-city Backup DC (Replication) → Remote DR DC
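
To put "seven nines" in perspective, the figure can be converted into a yearly downtime budget. This assumes the percentage is read as availability over a 365-day year, which is an interpretation for the sake of the arithmetic rather than Huawei's stated definition.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def downtime_seconds_per_year(availability_percent):
    """Maximum downtime per year implied by an availability percentage."""
    return (1 - availability_percent / 100) * SECONDS_PER_YEAR


for nines in (99.99, 99.999, 99.99999):
    print(nines, round(downtime_seconds_per_year(nines), 2))
```

Under that reading, four nines allows roughly 53 minutes of downtime per year, five nines about 5 minutes, and seven nines about 3 seconds.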

Efficient Management

  • One system for multiple workloads: Huawei positions it as the only solution in the industry with native support for SAN, NAS, and object protocols simultaneously
  • DME (Data Management Engine): unified lifecycle management across all storage tiers
  • Supports all major platforms: VMware vSphere, Hyper-V, Red Hat Virtualization, Kubernetes, Oracle, SAP HANA, SQL Server

Why 10-Year Retention Matters

In battery manufacturing, automotive, and other regulated B2B industries, traceability is non-negotiable. If a battery cell fails in the field five years after production, the manufacturer must trace back to the exact production batch, the quality inspection results, the raw materials used, and the machine parameters at the time of manufacturing. OceanStor’s tiered architecture (all-flash for active data, NAS for warm data, object storage for long-term archive) makes 10+ year retention economically viable without sacrificing access speed for recent production data.

AI-Quality Inspection Storage: Distributed Multi-Protocol Architecture

The AI quality inspection pipeline generates enormous volumes of image and video data from thousands of industrial computers. Huawei’s dedicated AI-Quality Inspection Storage Solution addresses this with a distributed storage system supporting S3, iSCSI, and NFS natively.

Huawei AI-Quality Inspection Storage Solution

The Data Flow

At the factory level, multiple workshops run specialized production lines (tire manufacturing, door manufacturing, power supply manufacturing), each with industrial computers performing five types of AI inspection:

  • Operation standard checks: verifying workers follow correct procedures
  • Surface defect checks: detecting scratches, dents, discoloration
  • Wrong/Missing/Reverse detection: components in wrong position or missing entirely
  • Industrial OCR/Code reading: reading serial numbers, barcodes, QR codes
  • Safe production monitoring: ensuring safety compliance on the factory floor

Industrial computers on the production lines connect to inference servers that run the AI models, while a REST API feeds results back to the MES for real-time production decisions.

Data Center Storage Layer

In the data center, two OceanStor Pacific systems handle different workloads:

  • OceanStor Pacific 9920: serves the AI training cluster and big data yield analytics via NFS/POSIX and HDFS/S3
  • OceanStor Pacific 9548: serves the DME (Data Management Engine) via NFS
  • Data tiering between the two systems optimizes cost vs. performance
  • Model synchronization flows from the data center back to factory inference servers via SMB/FTP/NFS/S3

Hour-Level Synchronization of Thousands of Industrial Computers

  • Efficient multimodal file write with I/O size adaptation: inspection images, video clips, and sensor data all have different I/O profiles
  • One storage system for the entire process: multi-protocol interworking (S3 + NFS + iSCSI) eliminates data copying between systems
  • SmartQoS ensures core service priority: production data never gets starved by analytics workloads

50% Lower Data Storage Costs

  • 2:1 inspection image compression without quality loss
  • Automatic tiering of hot and cold data: recent inspections on fast storage, historical data migrates automatically
  • 91.6% capacity utilization with 22+2 high-ratio erasure coding
  • High-density hardware saves energy and rack space
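
The capacity utilization figure follows directly from the erasure coding layout: with 22 data fragments and 2 parity fragments, 22 of every 24 fragments hold user data. The comparison with three-way replication is added here for context and is not from the source.

```python
def usable_fraction(data_fragments, parity_fragments):
    """Share of raw capacity that holds user data under erasure coding."""
    return data_fragments / (data_fragments + parity_fragments)


def replication_fraction(copies):
    """Share of raw capacity that holds user data under n-way replication."""
    return 1 / copies


print(f"22+2 erasure coding: {usable_fraction(22, 2):.2%}")
print(f"3-way replication:   {replication_fraction(3):.2%}")
```

22/24 works out to about 91.67%, in line with the 91.6% quoted above, while three copies leave only about a third of raw capacity usable.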

30% Higher Management Efficiency

  • File retrieval from 50 billion records within seconds: periodic synchronization of storage metadata enables GUI-based batch retrieval on DME
  • Multi-dimensional retrieval across 18 criteria: name, type, size, time, tag, and more
  • Secure data isolation between multiple tenants: different production lines and customers share infrastructure safely
  • Global data visibility across multiple regions via global file system

End-to-End Computer Vision: Huawei + K2Tech Partnership

Beyond the infrastructure, Huawei partners with K2Tech for a complete end-to-end computer vision (E2E CV) solution that covers the full inspection pipeline from loading to packaging.

Huawei and K2Tech E2E computer vision solution

Industry Verticals

The E2E CV solution targets four verticals:

Auto-Parts: inspecting discs, gears, valve seat rings, gear rings, oil bearings, piston pins, sprockets, helical gears, stators, rotors, clutch pressure plates, planetary gears, valve seats, and various powder metallurgy workpieces. Every component that goes into a vehicle drivetrain gets AI-inspected.

Bearings: covering the full range, including bearing rollers, tapered rollers, cylindrical rollers, self-aligning rollers, needle rollers, ferrules, dust covers, and steel balls. Each type requires different inspection models trained on specific defect patterns.

New Energy (EV Batteries): detecting appearance defects such as indentation, notch, foreign matter, and pollution on cylindrical and square battery cells, as well as positive and negative electrodes. With EV production scaling globally, battery quality inspection is one of the highest-stakes use cases, since a defective cell can cause thermal runaway.

Consumer Products: concentricity inspection for tablets, cigarettes, and other consumer goods.

The Inspection Production Line

The physical pipeline follows five stages:

  1. Loading: automated feeding of parts into the inspection line
  2. Wash and Dry: cleaning parts to remove manufacturing residue that could trigger false defect detections
  3. AOI Inspection: Automated Optical Inspection using AI computer vision models
  4. Oiling: applying protective coatings to passed parts
  5. Packaging: automated packing with multiple packing configurations

Business Impact

The K2Tech partnership delivers four measurable outcomes:

  • Multiple packing ways: flexible output configurations
  • Lower production cost: automated inspection replaces manual quality teams
  • Improved efficiency: line-speed inspection without bottlenecks
  • Improved accuracy: AI catches defects that human inspectors miss, especially on high-speed lines

Open Discussion: The Real Challenges

The session ended with open discussion topics that reveal where the industry is today and what problems remain unsolved.

Open discussion topics β€” Retail and Manufacturing challenges

Retail Challenges

  1. Building stores and multi-branch networks during chain store expansion: how do you deploy consistent IT/OT infrastructure across hundreds of locations without an army of on-site engineers?
  2. Planning and using electronic shelf labels (ESLs), RFID, and energy-saving technologies: the convergence of digital pricing, inventory tracking, and sustainability in physical retail

Manufacturing Challenges

  1. Problems faced by new or old factories during construction or reconstruction to meet intelligent requirements: this is the brownfield vs. greenfield question. New factories can design for IT/OT convergence from day one. Existing factories must retrofit decades-old infrastructure without stopping production.
  2. Challenges in improving product quality inspection efficiency: exactly what the K2Tech E2E CV solution and AI quality inspection pipeline address. The question is not whether AI inspection works, but how to scale it across thousands of production lines with different products, defect types, and line speeds.

These are the questions that matter most to the manufacturing and retail industries adopting smart infrastructure in 2026.

Why This Matters for Platform Engineers

If you work in cloud infrastructure, this architecture might seem far from Kubernetes clusters. But the patterns are identical:

  • Declarative management: iMaster NCE is essentially GitOps for factory networks
  • Observability: AI-driven traffic analysis is the factory equivalent of Prometheus + Grafana
  • Zero-trust segmentation: IT/OT zones mirror Kubernetes NetworkPolicies
  • Self-healing: ring network failover is the physical equivalent of pod rescheduling
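
The declarative pattern behind the first bullet is the same reconciliation loop platform engineers know from Kubernetes: compare desired state with actual state and compute the changes. The sketch below applies it to per-port VLAN assignments. The port names and the data model are hypothetical and do not represent the iMaster NCE API.

```python
def plan_changes(desired, actual):
    """Compare desired and actual port-to-VLAN maps and return the actions
    needed to converge, as (action, port, vlan) tuples."""
    actions = []
    for port, vlan in sorted(desired.items()):
        if port not in actual:
            actions.append(("add", port, vlan))
        elif actual[port] != vlan:
            actions.append(("update", port, vlan))
    for port in sorted(actual.keys() - desired.keys()):
        actions.append(("remove", port, actual[port]))
    return actions


desired = {"ge0/1": 10, "ge0/2": 20, "ge0/3": 30}
actual = {"ge0/1": 10, "ge0/2": 99, "ge0/4": 40}
print(plan_changes(desired, actual))
# [('update', 'ge0/2', 20), ('add', 'ge0/3', 30), ('remove', 'ge0/4', 40)]
```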

As AI workloads increasingly connect to physical systems (robotics, quality inspection, predictive maintenance), the boundary between cloud infrastructure and factory infrastructure is dissolving.

