Errors Handled, Gracefully: Building Resilient Flows

Andy Esser
Apr 13, 2026

Part 8 of our Workflows Unleashed series. This week: why your automation should recover, not just report.

Here's how error handling works in most automation tools:

  1. An action fails
  2. The flow stops
  3. An email notification is sent
  4. Someone reads it tomorrow
  5. They re-run the flow manually
  6. It fails again because the underlying issue hasn't been fixed

This isn't error handling. It's error reporting. And reporting is not enough.

Errors Are Just Edges

In Flomation, an error doesn't stop the flow. It routes execution to a different path — just like a conditional branch routes execution based on a value. The difference is that a conditional evaluates a planned expression, while an error path activates when something unplanned happens.

The On Error node is the entry point. Place it in your flow and connect it to the actions you want to protect. When any upstream action throws an error — a database query fails, an API returns a 500, a timeout expires, a connection is refused — execution routes to the On Error path instead of halting.

From the On Error node, you have full access to:

  • The error message and code
  • The node that failed
  • The inputs that were passed to the failing action
  • Everything upstream that succeeded

This means your error path isn't just a notification — it's a recovery strategy with complete context.
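
To make the "errors are edges" idea concrete, here is a minimal Python sketch of a runner that routes to an error handler instead of halting. It is not Flomation's implementation: `run_flow`, `ErrorContext`, and every field name are hypothetical, chosen only to mirror the four pieces of context listed above.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class ErrorContext:
    """What the error path receives. Field names are illustrative assumptions."""
    message: str
    code: Optional[str]
    failed_node: str
    inputs: dict[str, Any]
    upstream: dict[str, Any] = field(default_factory=dict)


def run_flow(steps, on_error: Callable[[ErrorContext], Any], inputs: dict[str, Any]):
    """Run (name, action) steps in order; on failure, route to on_error."""
    results: dict[str, Any] = {}
    for name, action in steps:
        try:
            results[name] = action(inputs, results)
        except Exception as exc:
            # The exception class name stands in for an error code here.
            ctx = ErrorContext(str(exc), type(exc).__name__, name, inputs, dict(results))
            return on_error(ctx)
    return results


def fetch(inputs, results):
    return {"rows": 3}


def push(inputs, results):
    raise ConnectionError("connection refused")


def recover(ctx: ErrorContext):
    # The handler sees which node failed and everything that succeeded before it.
    return {"recovered_from": ctx.failed_node, "kept": ctx.upstream}


outcome = run_flow([("fetch", fetch), ("push", push)], recover, {"target": "crm"})
```

In this sketch the failing `push` step never stops the run. The handler gets the failed node's name, the original inputs, and the successful `fetch` result, which is enough to decide what to do next.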

Recovery Patterns

Log and Notify

The simplest pattern. Write the error details to a database (for later analysis), post to Slack (for immediate visibility), and let the flow terminate. This is the minimum viable error handling — better than nothing, but still reactive.
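
As a rough illustration of the pattern outside the canvas, the sketch below writes the error to a SQLite table and hands a message to a notifier. The table name `flow_errors`, the function `log_and_notify`, and the `notify` callable (a stand-in for a Slack post) are all assumptions made for the example.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Callable


def log_and_notify(conn: sqlite3.Connection, notify: Callable[[str], None],
                   node: str, message: str) -> None:
    """Persist the error for later analysis, then send an immediate notice."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS flow_errors "
        "(occurred_at TEXT, node TEXT, message TEXT)"
    )
    conn.execute(
        "INSERT INTO flow_errors VALUES (?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), node, message),
    )
    conn.commit()
    # In a real flow this would post to a chat channel; here it is injected.
    notify(f"Flow error in {node}: {message}")


conn = sqlite3.connect(":memory:")
sent: list[str] = []
log_and_notify(conn, sent.append, "push", "timeout")
```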

Retry with Backoff

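Wire the On Error path to a Wait node (Sleep action), then loop back to the original action. Add a counter via Set Variable to limit retries, and lengthen the wait on each pass so repeated attempts back off instead of hammering a struggling service. On the third failure, fall through to the notification path.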

This pattern handles transient errors — network blips, rate limits, temporary service outages — without human intervention.
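
The same loop can be sketched in plain Python. This is an illustration of the technique, not how the Wait or Set Variable nodes work internally: `retry_with_backoff`, the doubling delay, and the one-second base are assumptions. The injected `sleep` makes the sketch testable without real waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(action: Callable[[], T],
                       notify: Callable[[Exception, int], None],
                       max_attempts: int = 3,
                       base_delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Retry action, doubling the wait each time; notify on the final failure."""
    failures = 0  # plays the role of the retry counter variable
    while True:
        try:
            return action()
        except Exception as exc:
            failures += 1
            if failures >= max_attempts:
                # Third failure by default: fall through to the notification path.
                notify(exc, failures)
                raise
            # Waits are base_delay, then 2x, then 4x, and so on.
            sleep(base_delay * 2 ** (failures - 1))
```

With the defaults, an action that fails twice and then succeeds waits one second, then two seconds, and never triggers the notifier.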

Fallback to Secondary

When a primary API fails, route to a secondary. When the main database is unreachable, try the read replica. When the webhook endpoint returns a 503, queue the payload in Redis and retry later.

The flow structure itself defines the fallback — no runtime configuration, no feature flags, no "check the status page and switch manually."
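
A minimal sketch of that routing, assuming two interchangeable senders and an in-memory `deque` standing in for a Redis queue. The name `send_with_fallback` and the returned route labels are hypothetical.

```python
from collections import deque
from typing import Any, Callable


def send_with_fallback(primary: Callable[[Any], Any],
                       secondary: Callable[[Any], Any],
                       payload: Any,
                       queue: deque) -> tuple[str, Any]:
    """Try primary, then secondary; if both fail, queue the payload for later."""
    for route, send in (("primary", primary), ("secondary", secondary)):
        try:
            return route, send(payload)
        except Exception:
            continue  # this route failed, move to the next one
    queue.append(payload)  # stand-in for pushing onto a Redis list
    return "queued", None


def unavailable(payload):
    raise RuntimeError("503 Service Unavailable")


def healthy(payload):
    return f"accepted {payload['id']}"


pending: deque = deque()
first = send_with_fallback(unavailable, healthy, {"id": 7}, pending)
second = send_with_fallback(unavailable, unavailable, {"id": 8}, pending)
```

The order of the routes is fixed in the code, the same way the order of fallbacks is fixed by the edges in the flow.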

Conditional Escalation

Use an If node on the error path to evaluate severity. A 429 (rate limit) gets a retry. A 401 (authentication failure) gets an immediate alert to the security team. A 500 gets logged and retried once. A connection timeout gets escalated to infrastructure.

Different errors deserve different responses. The flow graph makes those decisions explicit and visible.
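
Written as code, the decision table above might look like the following sketch. The route names and the `choose_route` function are illustrative. One assumption goes beyond the text: once a 500 has used its single retry, the sketch sends it to log-and-notify.

```python
from typing import Optional


def choose_route(status_code: Optional[int],
                 timed_out: bool = False,
                 retries_so_far: int = 0) -> str:
    """Map an error to a response path, mirroring the severity rules above."""
    if timed_out:
        return "escalate_to_infrastructure"
    if status_code == 429:   # rate limit: retry
        return "retry"
    if status_code == 401:   # authentication failure: alert immediately
        return "alert_security"
    if status_code == 500:   # server error: log, retry once only
        return "log_and_retry" if retries_so_far == 0 else "log_and_notify"
    # Anything unrecognised goes to the simplest safe path (an assumption).
    return "log_and_notify"
```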

Beyond On Error: Conditional Logic

Error handling is one form of resilience. Conditional branching is the other.

If / Else

The If node evaluates an expression — ${node.status_code} == 200, ${var.retry_count} < 3, ${node.rows} != null — and routes execution accordingly. True path for the happy case, false path for the alternative.

Switch

The Switch node handles multi-way branching with unlimited output handles. Each case has an operator (equals, contains, starts_with, ends_with, regex) and a value. The first matching case executes; unmatched inputs fall through to the default handle.
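
First-match semantics are easy to get wrong when cases overlap, so here is a small sketch of the behaviour described above. The operator names come from the list in this section; the `switch` function, the tuple layout, and the choice of `re.search` for the regex operator are assumptions.

```python
import re
from typing import Callable

OPERATORS: dict[str, Callable[[str, str], bool]] = {
    "equals": lambda value, operand: value == operand,
    "contains": lambda value, operand: operand in value,
    "starts_with": lambda value, operand: value.startswith(operand),
    "ends_with": lambda value, operand: value.endswith(operand),
    # Assumed to match anywhere in the value unless the pattern is anchored.
    "regex": lambda value, operand: re.search(operand, value) is not None,
}


def switch(value: str, cases: list[tuple[str, str, str]], default: str = "default") -> str:
    """Return the handle of the first matching (operator, operand, handle) case."""
    for operator, operand, handle in cases:
        if OPERATORS[operator](value, operand):
            return handle
    return default


cases = [
    ("starts_with", "4", "client_error"),
    ("equals", "429", "rate_limited"),
    ("regex", r"^5\d\d$", "server_error"),
]
```

With these cases, `switch("429", cases)` returns `client_error` and not `rate_limited`, because the broader case is listed first. Put the most specific cases at the top.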

Loops

The For node iterates a fixed number of times — useful for processing arrays, retrying a known number of times, or generating a sequence. The While node iterates until a condition becomes false — useful for polling until ready, retrying until success, or processing until a queue is empty.

Both loop types expose the current iteration index, support nested loops, and have full access to the flow's variable context. Each iteration sees the results of the previous one, enabling accumulator patterns and progressive refinement.
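
The accumulator idea can be sketched with two small helpers in which each iteration receives the index and the state left by the previous one. `for_loop` and `while_loop` are hypothetical names, and the `max_iterations` guard is an addition for the sketch, not a documented feature.

```python
from typing import Callable, TypeVar

S = TypeVar("S")


def for_loop(count: int, body: Callable[[int, S], S], state: S) -> S:
    """Run body a fixed number of times, threading state through each pass."""
    for index in range(count):
        state = body(index, state)
    return state


def while_loop(condition: Callable[[S], bool], body: Callable[[int, S], S],
               state: S, max_iterations: int = 1000) -> S:
    """Run body until condition is false (or the safety cap is reached)."""
    index = 0
    while condition(state) and index < max_iterations:
        state = body(index, state)
        index += 1
    return state


# Accumulator: sum the iteration indexes 0..3.
total = for_loop(4, lambda index, running: running + index, 0)

# Drain a queue: keep going while items remain.
remaining = while_loop(lambda queue: len(queue) > 0,
                       lambda index, queue: queue[1:],
                       ["a", "b", "c"])
```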

The Result

Flows that don't just run — they adapt. A network error at 3 AM doesn't become a Slack message someone reads at 9 AM. It becomes a retry that succeeds at 3:01 AM, and nobody needs to know.

That's the difference between error reporting and error handling. One tells you something went wrong. The other fixes it.

Next Week

Version control for workflows — immutable revisions, environment-scoped deployments, and why "which version was running when it broke?" should never be an unanswerable question.

www.flomation.co — free to start, no credit card.