Designing a Rule Engine for Industrial Automation

Master industrial rule engines: from simple alerts to complex multi-device workflows. Learn how Proxus combines visual rules with C# scripting to handle 10,000+ rules per second with zero allocation overhead.

Industrial automation projects typically start with a simple requirement:

"Send an alarm if this tag exceeds that value."

But real plants quickly outgrow simple threshold checks. Within weeks, customers ask for:

Multi‑step workflows (IF X THEN Y, THEN Z)
Time-based dependencies (IF X for >5 minutes THEN...)
Dependencies across multiple lines or sites (IF Line1.OEE < 60% AND Line2.Running THEN...)
Integration with ERP, CMMS, and ticketing systems (THEN create SAP maintenance order)
Feedback loops and state machines (Rule A triggers Rule B, which triggers Rule A again?)

Most off-the-shelf rule engines falter here. They either offer simplicity (and shatter under complex logic) or complexity (and require a PhD to understand). Proxus takes a different approach: combining visual rules for the 80% case with code for the 20% case, all on a single, unified execution runtime that scales to 10,000+ rules per second.

The Problem with Traditional Rule Engines

Before diving into Proxus's solution, let's understand why rule engines are so hard:

Challenge #1: Scope Creep

A simple threshold rule grows into a state machine. What started as "alert when temp > 80°C" becomes "alert when temp > 80°C for >2 minutes, but not if we're in cooldown mode, but do alert if temp > 85°C immediately."
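
To make the scope creep concrete, here is a minimal C# sketch of that grown-up rule. The class name, constructor parameters, and the decision that the 85°C override still fires during cooldown mode are illustrative assumptions, not part of the Proxus SDK.

```csharp
using System;

// Hypothetical helper (not a Proxus API): tracks how long the temperature
// has stayed above the warning threshold.
public sealed class SustainedTemperatureRule
{
    private readonly double _warnC;
    private readonly double _criticalC;
    private readonly TimeSpan _hold;
    private DateTime? _aboveSince; // null while the temperature is at or below _warnC

    public SustainedTemperatureRule(double warnC, double criticalC, TimeSpan hold)
    {
        _warnC = warnC;
        _criticalC = criticalC;
        _hold = hold;
    }

    // Returns true when an alert should be raised for this sample.
    public bool Evaluate(double tempC, bool cooldownMode, DateTime now)
    {
        // Assumption: the critical threshold alerts immediately, even in cooldown mode.
        if (tempC > _criticalC) return true;

        if (tempC <= _warnC)
        {
            _aboveSince = null; // excursion ended, reset the timer
            return false;
        }

        _aboveSince ??= now; // start timing on the first sample above the threshold

        // Cooldown mode suppresses only the sustained-temperature alert.
        if (cooldownMode) return false;

        return now - _aboveSince.Value > _hold;
    }
}
```

A rule built as `new SustainedTemperatureRule(80, 85, TimeSpan.FromMinutes(2))` stays quiet for the first two minutes above 80°C, alerts after that, and alerts at once above 85°C. Even this small example already carries state between evaluations, which is why plain threshold checks stop being enough.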

Challenge #2: Multi-Device Correlation

"Trigger alarm when Robot A is running AND Pressure Sensor B shows declining pressure AND not in maintenance window." This requires:

Reading multiple devices
Correlating their states
Checking context (time windows, override flags)
Triggering actions conditionally
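
The four steps above can be sketched as a single predicate. This is an illustrative example only: the class and method names are hypothetical, and "declining pressure" is assumed to mean that every sample is lower than the one before it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not a Proxus API) combining device state, a trend, and context.
public static class CorrelationRule
{
    // Assumption: "declining" means strictly decreasing over the supplied samples.
    public static bool IsDeclining(IReadOnlyList<double> samples)
    {
        if (samples.Count < 2) return false;
        for (int i = 1; i < samples.Count; i++)
        {
            if (samples[i] >= samples[i - 1]) return false;
        }
        return true;
    }

    // True when timeOfDay falls inside [start, end); handles windows that
    // wrap past midnight, such as 22:00-02:00.
    public static bool InWindow(TimeSpan timeOfDay, TimeSpan start, TimeSpan end)
    {
        return start <= end
            ? timeOfDay >= start && timeOfDay < end
            : timeOfDay >= start || timeOfDay < end;
    }

    public static bool ShouldAlarm(
        bool robotRunning,
        IReadOnlyList<double> pressureSamples,
        DateTime now,
        TimeSpan maintenanceStart,
        TimeSpan maintenanceEnd)
    {
        return robotRunning
            && IsDeclining(pressureSamples)
            && !InWindow(now.TimeOfDay, maintenanceStart, maintenanceEnd);
    }
}
```

In a real deployment the running flag and the pressure samples would come from two different devices, so the engine also has to decide how fresh each reading must be before the correlation is trusted.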

Challenge #3: Performance at Scale

A plant with 5,000 devices and 2,000 rules needs to evaluate millions of conditions per second. A naive approach (query database for each rule, parse JSON) hits performance limits fast.

Challenge #4: Versioning & Deployment

When you update a rule, how do you:

Test it without affecting production?
Roll back if it causes issues?
Track which version is running where?

Most rule engines offer poor answers.

Proxus Rule Engine: Visual + Code Unified

Proxus solves these by offering two complementary approaches on a single runtime:

1. Visual Rule Builder (No-Code)

For engineers without a programming background, the visual rule builder provides an intuitive drag-and-drop interface. Think of it as "Excel for automation."

Building Blocks

Triggers (When to evaluate)

Tag change: Rule triggers when a specific tag changes value
Schedule: Rule triggers on time-based patterns (daily 8 AM, every hour, etc.)
Event from external system: Rule triggers when MES publishes a message or webhook fires
Manual trigger: Operators can trigger rules via UI for testing

Conditions (What to check)

Simple comparisons: Temperature > 80
Range checks: Humidity between 40% and 60%
Boolean logic: ((A > 10) AND (B < 5)) OR (C == "Fault")
Time windows: During 06:00-18:00 or Not during maintenance_window
State checks: If device status == "Running"
Thresholds with hysteresis: Prevents flickering (trigger at >80, clear at <75)
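
As an illustration of the hysteresis condition, here is a minimal sketch using the 80/75 values from the list above. The class is hypothetical and only shows the technique; it is not how the visual builder is implemented.

```csharp
using System;

// Hypothetical helper: a threshold that triggers above one value and
// clears below a lower one, so a noisy signal does not flicker.
public sealed class HysteresisThreshold
{
    private readonly double _triggerAbove;
    private readonly double _clearBelow;

    public bool IsActive { get; private set; }

    public HysteresisThreshold(double triggerAbove, double clearBelow)
    {
        if (clearBelow > triggerAbove)
            throw new ArgumentException("clearBelow must not exceed triggerAbove");
        _triggerAbove = triggerAbove;
        _clearBelow = clearBelow;
    }

    // Returns true only on the sample where the alarm newly triggers.
    public bool Update(double value)
    {
        if (!IsActive && value > _triggerAbove)
        {
            IsActive = true;
            return true;
        }
        if (IsActive && value < _clearBelow)
        {
            IsActive = false; // cleared; the next excursion can trigger again
        }
        return false;
    }
}
```

With `new HysteresisThreshold(80, 75)`, a reading that oscillates between 79 and 81 triggers once and then stays active until the value drops below 75.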

Actions (What to do)

Notifications: Email, SMS, Slack, Teams
PLC writes: Write values back to devices (turn on pump, close valve)
MQTT publish: Send messages to the UNS for other systems to consume
HTTP calls: Webhook integration (send a request to an external API)
Create tickets: Automatic CMMS/SAP integration (create maintenance order, change request)
Log event: Record to local database or ClickHouse
Escalation: If not acknowledged in X minutes, escalate to higher tier

Example Visual Rule

TRIGGER: Tag "Line1.Robot.Temperature" changes

CONDITION:
  IF Temperature > 85°C
    AND Robot.Running == true
    AND NOT (MaintenanceMode == true)
    AND (now() - Alert_Temperature_Cooldown > 5 minutes)

ACTION:
  - Stop Robot (write 0 to "Line1.Robot.MotorCommand")
  - Publish to MQTT: "ProxusMfg/Istanbul/Line1/alerts/TemperatureEmergency"
  - Create SAP notification: "Emergency stop triggered - high temperature"
  - Set flag: Alert_Temperature_Cooldown = now()
  - Send Slack: "@production_team Robot stopped due to temperature"

2. C# Scripting for Power Users

When visual rules cannot express the logic, drop down to C# code. Proxus provides a well-defined SDK for this.

Why C#?

Familiar: C# is widely used in industry
Safe: Compiled, typed (no runtime interpretation errors)
Fast: JIT-compiled to machine code
Batteries included: LINQ, async/await, memory pools

Example: Multi-Device Correlation with ML

public class PredictiveMaintenanceRule : RuleScript
{
  public async Task EvaluateAsync(RuleContext context)
  {
    // Get the latest values for multiple devices
    var vibration = context.GetTag("Production/Motor/Vibration");
    var temperature = context.GetTag("Production/Motor/Temperature");
    var ageHours = context.GetTag("Production/Motor/AgeHours");

    // Complex logic: apply the ML model to the combined readings
    var anomalyScore = await context.AI.ScoreAsync(new[]
    {
      vibration, temperature, ageHours
    });

    if (anomalyScore > 0.85)
    {
      // Multi-step action: publish an alert first
      await context.Publish("alert/motor_failure_risk", new
      {
        score = anomalyScore,
        timestamp = DateTime.UtcNow,
        recommended_action = "Schedule maintenance within 48 hours"
      });

      // Then create a CMMS ticket
      await context.External.CreateTicket(new TicketRequest
      {
        System = "SAP",
        Type = "Maintenance",
        Priority = "High",
        Description = $"Motor failure predicted (ML score: {anomalyScore:P})"
      });
    }
  }
}

Auto-Injected Safety

Proxus automatically wraps your code with:

// Your code is wrapped like this:
try
{
  // Your code executes here
  // With automatic disposal of resources
}
catch (Exception ex)
{
  // Pass the exception itself so the stack trace is preserved in the log
  logger.LogError(ex, "Rule failed: {Message}", ex.Message);
  // Automatic alerting
}

This means an unhandled exception in a rule is logged instead of crashing the platform, and resources acquired by the rule are released.

Zero-Allocation Patterns

For rules that execute 1,000+ times per second, Proxus recommends pooling:

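// Requires: using System.Buffers; using System.Threading.Tasks;

// The shared pool is created once and reused across evaluations
private static readonly ArrayPool<byte> _pool = ArrayPool<byte>.Shared;

public Task EvaluateAsync(RuleContext context)
{
  // Rent from the pool instead of allocating.
  // Note: the returned array may be larger than the 1024 bytes requested.
  byte[] buffer = _pool.Rent(1024);
  try
  {
    // Use the buffer (only the first 1024 bytes are guaranteed)
    ProcessData(buffer);
  }
  finally
  {
    // Always return the buffer, even if ProcessData throws
    _pool.Return(buffer);
  }

  // Nothing is awaited, so return a completed task instead of marking the method async
  return Task.CompletedTask;
}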

This minimizes garbage collection pressure (critical for sub-millisecond latency rules).

3. A Single Execution Model

Here is the magic: both visual rules and C# scripts are compiled and then executed by the same actor-based runtime. This means:

Unified Performance

Visual rule:    Compiled to bytecode → Actor execution engine
C# script:      Compiled to IL → Actor execution engine

Result: Same performance profile, same execution semantics

Horizontal Scalability

Single gateway: 10,000 rules/second ✓
2 gateways:     20,000 rules/second ✓
10 gateways:    100,000 rules/second ✓

Rules that depend only on gateway-local data execute in parallel across gateways with no coordination needed.

Clear Evolution Path

Start: Simple visual rule
Grow:  Add conditions and time windows (still visual)
Scale: Extract complex logic to C# (reuse visual rule as scaffold)
Evolve: Full C# engine with caching and optimization

No rewriting. No rip-and-replace. Just smooth evolution.

Advanced Use Cases

Use Case 1: OEE Calculation

Trigger: Every minute (on schedule)

Action: Compute OEE for each production line
  Availability = (Planned Time - Downtime) / Planned Time
  Performance = (Ideal Cycle Time × Pieces Produced) / Run Time
  Quality = Good Pieces / Total Pieces
  OEE = Availability × Performance × Quality
  
Result: Publish OEE to UNS for dashboards
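
The formulas above translate directly into code. The sketch below is illustrative only: the type and method names are hypothetical, and returning zero for undefined ratios (no run time, no pieces) is an assumption rather than documented Proxus behavior.

```csharp
using System;

// Hypothetical result type holding the three OEE factors.
public sealed class OeeResult
{
    public double Availability { get; }
    public double Performance { get; }
    public double Quality { get; }
    public double Oee => Availability * Performance * Quality;

    public OeeResult(double availability, double performance, double quality)
    {
        Availability = availability;
        Performance = performance;
        Quality = quality;
    }
}

public static class OeeCalculator
{
    public static OeeResult Compute(
        TimeSpan plannedTime,
        TimeSpan downtime,
        TimeSpan idealCycleTime,
        int totalPieces,
        int goodPieces)
    {
        if (plannedTime <= TimeSpan.Zero)
            throw new ArgumentOutOfRangeException(nameof(plannedTime));

        // Run Time = Planned Time - Downtime, clamped at zero
        TimeSpan runTime = plannedTime - downtime;
        if (runTime < TimeSpan.Zero) runTime = TimeSpan.Zero;

        double availability = runTime.TotalSeconds / plannedTime.TotalSeconds;

        // Assumption: ratios that would divide by zero are reported as 0
        double performance = runTime > TimeSpan.Zero
            ? idealCycleTime.TotalSeconds * totalPieces / runTime.TotalSeconds
            : 0.0;
        double quality = totalPieces > 0 ? (double)goodPieces / totalPieces : 0.0;

        return new OeeResult(availability, performance, quality);
    }
}
```

For example, with 480 minutes planned, 48 minutes of downtime, a 1-minute ideal cycle time, and 324 pieces of which 243 are good, the factors are 0.9, 0.75, and 0.75, giving an OEE of about 0.506 (50.6%).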

Use Case 2: Dynamic Threshold Based on Context

Trigger: Tag "Production.Temperature" changes

Condition (C# logic):
  - If night shift: Temperature threshold = 75°C
  - If day shift: Temperature threshold = 78°C
  - If production line warming up: Threshold = 85°C (first 10 min)
  - If maintenance mode: No alert (bypass rule)
  
Action: Alert only if exceeds context-aware threshold
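
A minimal sketch of that context-aware threshold follows. The names are hypothetical, the day shift is assumed to run 06:00-18:00, and the precedence (maintenance, then warm-up, then shift) is an assumption, since the outline above does not say which condition wins when several apply.

```csharp
using System;

// Hypothetical helper (not a Proxus API) that picks a threshold from context.
public static class ContextualThreshold
{
    // Returns the threshold in °C, or null when alerting is bypassed.
    public static double? ThresholdFor(
        TimeSpan timeOfDay, TimeSpan sinceLineStart, bool maintenanceMode)
    {
        if (maintenanceMode) return null; // bypass the rule entirely

        // First 10 minutes after line start: relaxed warm-up threshold
        if (sinceLineStart < TimeSpan.FromMinutes(10)) return 85.0;

        // Assumption: day shift is 06:00-18:00, everything else is night shift
        bool dayShift = timeOfDay >= TimeSpan.FromHours(6)
                     && timeOfDay < TimeSpan.FromHours(18);
        return dayShift ? 78.0 : 75.0;
    }

    public static bool ShouldAlert(
        double tempC, TimeSpan timeOfDay, TimeSpan sinceLineStart, bool maintenanceMode)
    {
        double? threshold = ThresholdFor(timeOfDay, sinceLineStart, maintenanceMode);
        return threshold.HasValue && tempC > threshold.Value;
    }
}
```

Keeping the threshold selection separate from the comparison makes each context rule easy to test on its own.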

Use Case 3: Multi-Site Correlation

Trigger: Any line reports downtime

Action (C#):
  1. Query all lines: Are 2+ lines down simultaneously?
  2. If yes: Check if it's a utility failure (power, compressed air, water)
  3. If utility failure confirmed: Escalate to site director
  4. If specific to one line: Alert line supervisor
  5. Log correlation for analytics
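
The routing decision in steps 1-4 can be sketched as follows. The enum and class are hypothetical, and the handling of "several lines down but no confirmed utility failure" (routed to line supervisors here) is an assumption, because the outline above does not cover that case.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical routing outcome for a downtime event.
public enum DowntimeRoute { None, LineSupervisor, SiteDirector }

public static class DowntimeCorrelation
{
    // lineIsDown maps a line name to whether it currently reports downtime.
    public static DowntimeRoute Classify(
        IReadOnlyDictionary<string, bool> lineIsDown, bool utilityFailureConfirmed)
    {
        int downCount = lineIsDown.Values.Count(down => down);

        if (downCount == 0) return DowntimeRoute.None;

        // 2+ lines down at once and a confirmed utility failure: escalate
        if (downCount >= 2 && utilityFailureConfirmed) return DowntimeRoute.SiteDirector;

        // Assumption: anything else is treated as line-specific
        return DowntimeRoute.LineSupervisor;
    }
}
```

The utility check itself (power, compressed air, water) would be a separate lookup; it is passed in as a flag here to keep the sketch self-contained.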

Performance: 10,000+ Rules Per Second

How does Proxus achieve this?

1. Compiled Execution

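Rules are compiled, not interpreted: C# scripts compile to IL, which the .NET JIT turns into native machine code at run time, and visual rules are compiled to bytecode for the same engine.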

2. Actor-Based Concurrency

Each rule runs as an isolated actor with its own state, so evaluations need no locks and do not contend with each other.

3. Smart Caching

First evaluation: Read tags from device (slow)
Cached result:   Read from in-memory cache (fast)
Invalidation:    Cache cleared only when tag actually changes
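
A change-invalidated cache of this kind can be sketched in a few lines. The class below is hypothetical and deliberately single-threaded (which fits a rule running as one actor); it is not the Proxus implementation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical read-through cache: values are read from the device once
// and served from memory until a change notification invalidates them.
public sealed class TagCache
{
    private readonly Dictionary<string, double> _values = new Dictionary<string, double>();
    private readonly Func<string, double> _readFromDevice;

    public TagCache(Func<string, double> readFromDevice)
    {
        _readFromDevice = readFromDevice;
    }

    public double Get(string tag)
    {
        // Fast path: serve from memory
        if (_values.TryGetValue(tag, out double cached)) return cached;

        // Slow path: read from the device and remember the result
        double fresh = _readFromDevice(tag);
        _values[tag] = fresh;
        return fresh;
    }

    // Call this from the tag-change notification handler.
    public void OnTagChanged(string tag) => _values.Remove(tag);
}
```

Because entries are dropped only on a change notification, a tag that never changes is read from the device exactly once.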

4. Lazy Evaluation

Conditions joined by AND are short-circuited: if Condition A is false, Conditions B and C are never evaluated.
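
Short-circuiting over a list of conditions can be sketched like this. The helper is hypothetical; each condition is wrapped in a delegate so that it runs only if all earlier ones were true.

```csharp
using System;
using System.Collections.Generic;

public static class LazyConditions
{
    // Evaluates the conditions in order and stops at the first false one.
    public static bool All(IEnumerable<Func<bool>> conditions)
    {
        foreach (Func<bool> condition in conditions)
        {
            if (!condition()) return false; // later conditions are never invoked
        }
        return true;
    }
}
```

Ordering matters with this approach: placing cheap, frequently false conditions first means expensive ones (for example, a remote lookup) are rarely reached.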

5. Batch Processing

Multiple rules that depend on the same device are evaluated together, with a single device read.
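
One way to get a single read per device is to group rules by the device they depend on before evaluating them. The types below are hypothetical and assume each rule depends on exactly one numeric value per device.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rule shape: one device, one condition on its value.
public sealed class DeviceRule
{
    public string DeviceId { get; }
    public Func<double, bool> Condition { get; }

    public DeviceRule(string deviceId, Func<double, bool> condition)
    {
        DeviceId = deviceId;
        Condition = condition;
    }
}

public static class BatchEvaluator
{
    // Returns the rules whose condition held, reading each device only once.
    public static List<DeviceRule> Evaluate(
        IEnumerable<DeviceRule> rules, Func<string, double> readDevice)
    {
        var fired = new List<DeviceRule>();
        foreach (var group in rules.GroupBy(r => r.DeviceId))
        {
            double value = readDevice(group.Key); // single read shared by the group
            fired.AddRange(group.Where(r => r.Condition(value)));
        }
        return fired;
    }
}
```

With 2,000 rules spread over a few hundred devices, this turns thousands of potential reads per cycle into one read per device.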

Result: A modern laptop can evaluate 10,000 rules/second with <1 ms latency.

Security & Code Injection Prevention

Allowing users to upload C# code is risky. Proxus mitigates this via:

1. Namespace Blacklist

Dangerous namespaces are blocked, and a small set is explicitly allowed:

❌ System.Reflection (reflection and runtime code generation)
❌ System.Net (external network access)
❌ System.IO (file system access)
✅ Whitelisted namespaces: System.Linq, System.Collections, System.Text
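
As a rough illustration of a namespace check, the sketch below scans `using` directives with a regular expression and flags the blocked namespaces listed above. This is an assumption about how such a check could look, not the Proxus implementation; a text scan does not catch fully qualified names in code, so a production check would need compiler-level analysis of the script.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical pre-compilation check based on the blocked list in this article.
public static class NamespacePolicy
{
    private static readonly string[] Blocked =
    {
        "System.Reflection", "System.Net", "System.IO"
    };

    // Matches "using X.Y;", "using static X.Y;" and "using Alias = X.Y;"
    private static readonly Regex UsingDirective = new Regex(
        @"^\s*using\s+(?:static\s+)?(?:\w+\s*=\s*)?([A-Za-z_][\w.]*)\s*;",
        RegexOptions.Multiline);

    // A namespace is blocked if it equals a blocked entry or is nested under one.
    public static bool IsBlocked(string ns) =>
        Blocked.Any(b => ns == b || ns.StartsWith(b + ".", StringComparison.Ordinal));

    public static List<string> FindBlockedUsings(string source) =>
        UsingDirective.Matches(source)
            .Cast<Match>()
            .Select(m => m.Groups[1].Value)
            .Where(IsBlocked)
            .ToList();
}
```

The prefix test appends a dot before comparing, so `System.Net.Http` is rejected while an unrelated namespace such as `System.Numerics` is not.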

2. Code Analysis

Before compilation, the C# code is analyzed for suspicious patterns:

Obvious infinite loops (for example, while (true) with no exit)
Allocations exceeding limits
Calls to blacklisted APIs

3. Sandboxing

Each C# rule runs in its own isolated sandbox with resource limits:

Memory: 512 MB max
Execution time: 5 seconds max (timeout)
CPU affinity: Constrained to specific cores
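
The execution-time limit can be illustrated with standard .NET tasks. This is a sketch under assumptions, not the Proxus sandbox: the helper name is hypothetical, and cancellation in .NET is cooperative, so a rule that ignores the token keeps running in the background even after the caller has been told it timed out.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class RuleTimeout
{
    // Returns true if the rule finished in time, false if it timed out.
    public static async Task<bool> RunWithTimeoutAsync(
        Func<CancellationToken, Task> rule, TimeSpan timeout)
    {
        // Not disposed here because the rule may still observe the token
        // after a timeout has been reported.
        var cts = new CancellationTokenSource();

        // Task.Run keeps a synchronous, CPU-bound rule from blocking the caller.
        Task work = Task.Run(() => rule(cts.Token));
        Task finished = await Task.WhenAny(work, Task.Delay(timeout));

        if (finished != work)
        {
            cts.Cancel(); // ask the rule to stop; it must check the token to comply
            return false;
        }

        await work; // rethrows any exception thrown by the rule
        return true;
    }
}
```

A call such as `await RuleTimeout.RunWithTimeoutAsync(ct => script(ct), TimeSpan.FromSeconds(5))` (where `script` is whatever delegate wraps the rule) mirrors the 5-second limit above. Hard limits on memory or CPU need process- or OS-level isolation, which this sketch does not attempt.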

4. Audit Trail

All rule uploads, modifications, and executions are logged with user attribution.

Versioning & Deployment Strategy

Rules evolve. The question is: how do you manage versions safely?

Proxus Versioning Model

Rule: "Emergency Stop - Temperature"

Version 1 (2025-01-10):
  Threshold = 85°C
  Status: ACTIVE (all gateways)

Version 2 (2025-01-15):
  Threshold = 82°C (more conservative)
  Status: STAGING (test gateway only, planned 7-day test)
  2025-01-20: ROLLED_BACK (bug found, reverted to v1)

Version 3 (2025-01-22):
  Threshold = 80°C (even more conservative)
  Status: APPROVED (but not deployed yet)
  Ready to deploy on user approval

Safe Deployment

Write rule on central server
Test on designated gateway
Approve after validation
Deploy to production gateways (can schedule for off-hours)
Monitor execution for anomalies
Roll back instantly if issues are detected

Integration with External Systems

Rules do not live in isolation. They trigger actions in:

Webhooks

Action: HTTP POST to external API

Example: When production exceeds target
  POST https://api.supplier.com/orders/create
  Body: { "product_id": "X", "quantity": 100, "due_date": "2025-01-20" }

Message Brokers

Action: Publish to Kafka topic

Example: Real-time anomaly detection
  Topic: "manufacturing/anomalies"
  Message: { "device_id": "motor_1", "anomaly_type": "vibration", "score": 0.92 }
  Consumers: ML models, dashboards, archival systems

ERP Integration

Action: Create SAP maintenance order

Proxus automatically:
  - Authenticates to SAP (OAuth/SAML)
  - Creates notification
  - Links to equipment master data
  - Assigns to maintenance team

FAQ: Rule Engine Myths

Q: Can I use visual rules for everything?

A: ~80% of use cases. The remaining 20% (complex correlations, ML) need C#. Start visual; evolve to code only where needed.

Q: What if my rule has a bug?

A: Rules execute in sandboxes with timeouts. A runaway rule cannot crash the platform. Plus, you can roll back instantly to a previous version.

Q: Can I run thousands of rules?

A: Yes. Proxus can handle 10,000+ rules/second per gateway. Distribute across gateways for higher throughput.

Q: How do I test rules?

A: Deploy to a test gateway, execute with replay data, validate results. Rules are versioned, so you can test in parallel with production.

Q: Can rules call external APIs?

A: Yes, via webhook actions. But be cautious: external APIs can fail or be slow. Use timeouts and circuit breakers.
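
For readers unfamiliar with circuit breakers, here is a minimal sketch of the pattern. The class is hypothetical, not a Proxus API, and is not thread-safe; it assumes a single caller, such as one rule actor, records the outcome of each external call.

```csharp
using System;

// Hypothetical circuit breaker: after N consecutive failures it rejects
// calls for a fixed period, then lets a trial call through.
public sealed class CircuitBreaker
{
    private readonly int _failureThreshold;
    private readonly TimeSpan _openDuration;
    private int _consecutiveFailures;
    private DateTime _openedAt;
    private bool _open;

    public CircuitBreaker(int failureThreshold, TimeSpan openDuration)
    {
        _failureThreshold = failureThreshold;
        _openDuration = openDuration;
    }

    // True if a call to the external API may be attempted at time "now".
    public bool AllowRequest(DateTime now)
    {
        if (!_open) return true;
        // Once the open period has elapsed, allow a trial call (half-open state).
        return now - _openedAt >= _openDuration;
    }

    public void RecordSuccess()
    {
        _consecutiveFailures = 0;
        _open = false;
    }

    public void RecordFailure(DateTime now)
    {
        _consecutiveFailures++;
        if (_consecutiveFailures >= _failureThreshold)
        {
            _open = true;      // open, or reopen after a failed trial call
            _openedAt = now;
        }
    }
}
```

A rule would check `AllowRequest` before each webhook call and skip or queue the action while the breaker is open, so a slow or failing external system does not stall rule evaluation.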

Migration from Legacy Rule Engines

If you are migrating from Wonderware, FactoryTalk, or other platforms:

Phase 1: Audit Existing Rules (1 week)

Export all rules from legacy system
Document logic and actions
Identify patterns

Phase 2: Re-implement in Proxus (2-4 weeks)

Start with most-used rules (80/20 principle)
Implement in visual builder first
Migrate complex rules to C# as needed

Phase 3: Validation (1-2 weeks)

Run both systems in parallel
Compare outputs
Validate edge cases

Phase 4: Cutover (1 day)

Switch to Proxus
Rollback procedure ready
Monitor closely

Best Practices

Start simple: Use visual rules for initial implementations
Version everything: Never modify a deployed rule in place; create a new version instead
Test rigorously: Use test gateways before production rollout
Monitor executions: Track rule latency, errors, and trigger frequency
Document logic: Add comments explaining why, not just what
Use timeouts: All C# rules have built-in timeouts; respect them
Iterate based on data: Tune thresholds based on production telemetry
Use sandboxing: Contain complex rules to prevent cascading failures

Getting Started

Simple Rule (5 minutes)

Open Proxus UI
Click "New Rule"
Select trigger: Tag change
Select condition: Temperature > 80
Select action: Send email
Deploy to edge gateway

Advanced Rule (30 minutes)

Create C# rule script
Implement multi-device correlation
Test locally
Deploy to test gateway
Monitor for 24 hours
Promote to production

Conclusion: Rules as First-Class Citizens

In Proxus, rules are not an afterthought bolted onto the platform. They are first-class citizens: versioned, tested, monitored, and optimized. Whether you need simple alerts or complex industrial workflows, the unified visual + code approach scales from hobby projects to mission-critical automation running thousands of rules per second.

Ready to automate? Deploy your first rule or explore complex rule engine patterns in our guide.

For advanced consultation on rule design and optimization, contact our automation team.