Architecture
Most integration platforms are workflow engines: a sequence of steps that wakes up when a trigger fires. Orchesty is a distributed data pipeline: live infrastructure that is always on, always ready to ingest data, and built to process one message or a million with the same shape.
Understanding that shift is the prerequisite for everything else in these docs. The rest of this page builds on it: what you see in the editor, what runs underneath, how the platform isolates topologies from each other, and where the boundary between your code and the platform falls.
What you see #
Two objects do most of the work in your day-to-day mental model:
- Topology — a runnable graph you compose in the visual editor. Once published, the topology is not a drawing of intent; it is the running pipeline. Nodes are real services standing by, edges are real message queues holding data in flight.
- Process — every entry into the topology (a webhook, a Cron tick, an API call, a manual Start) creates a process. A process is a tracked execution of one initial message as it travels the graph. The platform records every node visit and every outcome against the process so you can replay history end to end.
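To make "tracked execution" concrete, here is a sketch of the information a process record carries. The type names and fields below are illustrative, not the platform's real schema; the actual records live in the orchestration layer and are viewed through the Admin UI:

```typescript
// Illustrative shapes only; not the platform's real schema.
type NodeVisit = {
  node: string;
  outcome: "ok" | "error" | "trashed";
};

type ProcessRecord = {
  id: string;
  startedBy: "webhook" | "cron" | "api" | "manual";
  visits: NodeVisit[]; // one entry per node visit, in travel order
};

// One webhook entry that traveled through two nodes:
const record: ProcessRecord = {
  id: "proc-001",
  startedBy: "webhook",
  visits: [
    { node: "fetch-orders", outcome: "ok" },
    { node: "transform-orders", outcome: "ok" },
  ],
};
console.log(record.visits.map((v) => v.node).join(" -> "));
// fetch-orders -> transform-orders
```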
What is behind the edges #
The connections you draw between nodes are not visual aids. Each edge is a real message queue owned by the platform (RabbitMQ underneath). This has direct, observable consequences:
- Persistence at every step. A message sitting between two nodes is durably stored. If the consuming worker crashes, the message stays in the queue and is redelivered.
- Asynchronous by default. A node finishes its work, hands the result off to the queue, and is immediately free to take the next message. Nothing waits for the next stage to be ready.
- Backpressure for free. A slow consumer's queue grows; depth is visible in the dashboards. No node ever blocks on a downstream node.
- Horizontal scale per node. Multiple worker replicas can consume the same queue in parallel, subject to the consumer node's prefetch setting.
The two layers #
With those primitives in place, the architecture splits cleanly in two:
- Integration layer — your code. Workers (microservices with the Orchesty SDK installed), the components they expose (Applications, Connectors, Custom Nodes, Batch nodes), and the third-party systems those components talk to.
- Orchestration layer — the platform. Queues, persistence, routing, the topology editor, the worker registry, dashboards, the Trash inbox, and the audit log.
The boundary between them is the manifest and process protocol that the SDK speaks. Everything above that line is your code running in your stack; everything below it is the platform.
| What it owns | Integration layer (you) | Orchestration layer (Orchesty) |
|---|---|---|
| Business logic, transformations, API calls | Yes | No |
| Authentication and credentials for third-party APIs | Yes (via Application installs) | Stores, never inspects |
| Choosing which queue a message goes to next | Optional (routing rules) | Yes (default fan-out) |
| Holding messages between steps | No | Yes (queues, persistence) |
| Retries, throttling, rate limiting, error capture | Configures policy | Enforces policy |
| Observability — process records, dashboards, logs | Emits structured logs | Aggregates and presents |
| Worker placement, scaling, replication | You operate | Routes to whichever replica is up |
You write code only against the integration layer. The orchestration layer is configured through the Admin UI and the platform REST API.
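As a sketch of which side of the boundary your code occupies, here is a hypothetical connector. The class shape, method name, and message shape are assumptions for illustration, not the real SDK surface; the API client is injected so the sketch stays self-contained:

```typescript
// A hypothetical connector, to show where the boundary falls.
type ProcessDto = { data: unknown };

class GetOrdersConnector {
  constructor(private fetchOrders: (input: unknown) => Promise<unknown[]>) {}

  // Integration layer: your business logic. Receive a message,
  // call the third-party API, return the result.
  async processAction(dto: ProcessDto): Promise<{ data: unknown[] }> {
    const orders = await this.fetchOrders(dto.data);
    return { data: orders };
  }
  // Which queue the result lands on, persistence, retries, and
  // process tracking all belong to the orchestration layer;
  // the connector never touches them.
}
```

In production the injected call would be a real HTTP client, with credentials supplied through the relevant Application install; here it is stubbed so the shape of the division stays visible.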
Per-topology isolation: the Bridge #
When you publish a topology, the platform spins up a dedicated control microservice for it called the Bridge. Each Bridge is a small Go service running in its own container, and it drives only its own topology — the routing decisions, the queue bindings, the process tracking for that one graph.
This is the unit of operational isolation: an overload, a misbehaving node, or a bursty event source on one topology cannot starve another, because topologies do not share a control process. The same mechanism is how the platform scales: every published topology adds an independent unit of execution.
Atomization and parallelism #
Because edges are queues and every node has its own queue, parallelism is a property of the architecture rather than a feature you opt into.
- Atomization. A Batch node fed 10 000 records emits 10 000 individual processes. Each one is independent and can be retried, traced, or trashed in isolation.
- Per-node scaling. If one transformation becomes the bottleneck, you scale that node's worker replicas. The rest of the topology is unaffected.
- Fan-out is the default. When a node finishes, the Bridge sends the result down every outgoing edge unless the node returns a routing rule that picks a subset (see Topologies → Routing).
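The default fan-out and the routing-rule override can be sketched in a few lines. The `followers` field is an illustrative stand-in for whatever shape the SDK's routing rules actually take (see Topologies → Routing for the real mechanism):

```typescript
// How the Bridge decides where a result goes, in plain code.
type NodeResult = { data: unknown; followers?: string[] };

function route(result: NodeResult, outgoingEdges: string[]): string[] {
  // Default: fan out to every outgoing edge.
  if (!result.followers) return outgoingEdges;
  // Routing rule present: only the named subset receives the message.
  return outgoingEdges.filter((edge) => result.followers!.includes(edge));
}

const edges = ["enrich", "archive", "notify"];
console.log(route({ data: 1 }, edges));                        // all three edges
console.log(route({ data: 1, followers: ["notify"] }, edges)); // ["notify"]
```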
What the platform does for you #
The split exists so that you write business logic and the platform handles everything around it:
- Persistence. Every message is durably stored at every step.
- Retries. Connectors declare a retry policy in code; the platform enforces it without you writing any retry loop. See Building nodes — Error handling and retries.
- Throttling and rate limiting. Configured per node; the platform paces dispatch to respect downstream limits.
- Observability. Structured logs, process records, dashboards, and the Trash inbox all come from the orchestration layer.
- Resilience. A worker crash does not lose data; the queue holds the message until a replica picks it up, and unrecoverable failures land in Trash for manual recovery.
- Scale. You scale workers for throughput and the platform scales by adding Bridges as you publish more topologies.
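The retry split ("you configure policy, the platform enforces it") can be sketched as a toy enforcement loop. `RetryPolicy`, `enforce`, and `maxHops` are illustrative names, not SDK APIs:

```typescript
// The node declares a policy; the platform enforces it.
type RetryPolicy = { maxHops: number };

type Outcome<T> = { ok: true; value: T } | { ok: false; trashed: true };

async function enforce<T>(
  policy: RetryPolicy,
  attempt: () => Promise<T>,
): Promise<Outcome<T>> {
  for (let hop = 1; hop <= policy.maxHops; hop++) {
    try {
      return { ok: true, value: await attempt() };
    } catch {
      // The platform would redeliver the message after the configured
      // delay; your connector code contains no retry loop at all.
    }
  }
  // Retries exhausted: the message lands in Trash for manual recovery.
  return { ok: false, trashed: true };
}

// A call that fails twice, then succeeds on the third hop:
let calls = 0;
const flaky = async () => {
  calls += 1;
  if (calls < 3) throw new Error("transient");
  return "done";
};
enforce({ maxHops: 3 }, flaky).then((r) => console.log(r));
// { ok: true, value: "done" }
```

The point of the sketch is what is absent from the node's side: no loop, no delay handling, no dead-lettering. The node declares `maxHops` and writes `attempt` once.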
Reference architecture #
In a typical production deployment, workers either sit on the same host as the orchestration layer (full-stack setup, on-prem) or far away (slim worker setup pointing at Orchesty Cloud). The choice is operational, not architectural: the layers and protocols are the same in both cases. See Deployment Models for the supported variants and what changes between them.
See also #
- Learn — Architecture & Core Principles: the narrative version, with the why up front.
- Topologies — the runnable graph in detail.
- Workers and SDK — what lives in the integration layer.
- Processes and Messages — how a single execution is tracked end to end.
- Deployment Models — where the layers actually run.