Architecture
Most integration platforms are workflow engines: a sequence of steps that wakes up when a trigger fires. Orchesty is a distributed data pipeline: live infrastructure that is always on, always ready to ingest data, and built to process one message or a million with the same shape.
Understanding that shift is the prerequisite for everything else in these docs. The rest of this page builds on it: what you see in the editor, what runs underneath, how the platform isolates topologies from each other, and where the boundary between your code and the platform falls.
What you see #
Two objects do most of the work in your day-to-day mental model:
- Topology — a runnable graph you compose in the visual editor. Once published, the topology is not a drawing of intent; it is the running pipeline. Nodes are real services standing by, edges are real message queues holding data in flight.
- Process — every entry into the topology (a webhook, a Cron tick, an API call, a manual Start) creates a process. A process is a tracked execution of one initial message as it travels the graph. The platform records every node visit and every outcome against the process so you can replay history end to end.
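To make "tracked execution" concrete, here is a sketch of the information a process record carries. The type names and fields below are illustrative, not the platform's real schema; the actual records live in the orchestration layer and are viewed through the Admin UI:

```typescript
// Illustrative shapes only; not the platform's real schema.
type NodeVisit = {
  node: string;
  outcome: "ok" | "error" | "trashed";
};

type ProcessRecord = {
  id: string;
  startedBy: "webhook" | "cron" | "api" | "manual";
  visits: NodeVisit[]; // one entry per node visit, in travel order
};

// One webhook entry that traveled through two nodes:
const record: ProcessRecord = {
  id: "proc-001",
  startedBy: "webhook",
  visits: [
    { node: "fetch-orders", outcome: "ok" },
    { node: "transform-orders", outcome: "ok" },
  ],
};
console.log(record.visits.map((v) => v.node).join(" -> "));
// fetch-orders -> transform-orders
```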
What is behind the edges #
The connections you draw between nodes are not visual aids. Each edge is a real message queue owned by the platform (RabbitMQ underneath). This has direct, observable consequences:
- Persistence at every step. A message sitting between two nodes is durably stored. If the consuming worker crashes, the message stays in the queue and is redelivered.
- Asynchronous by default. A node finishes its work, hands the result off to the queue, and is immediately free to take the next message. Nothing waits for the next stage to be ready.
- Backpressure for free. A slow consumer's queue grows; depth is visible in the dashboards. No node ever blocks on a downstream node.
- Horizontal scale per node. Multiple worker replicas can consume the same queue in parallel, subject to the consumer node's prefetch setting.
The two layers #
With those primitives in place, the architecture splits cleanly in two:
- Integration layer — your code. Workers (microservices with the Orchesty SDK installed), the components they expose (Applications, Connectors, Custom Nodes, Batch nodes), and the third-party systems those components talk to.
- Orchestration layer — the platform. Queues, persistence, routing, the topology editor, the worker registry, dashboards, the Trash inbox, and the audit log.
The boundary between them is the manifest and process protocol that the SDK speaks. Everything above that line is your code running in your stack; everything below it is the platform.
| What it owns | Integration layer (you) | Orchestration layer (Orchesty) |
|---|---|---|
| Business logic, transformations, API calls | Yes | No |
| Authentication and credentials for third-party APIs | Yes (via Application installs) | Stores, never inspects |
| Choosing which queue a message goes to next | Optional (routing rules) | Yes (default fan-out) |
| Holding messages between steps | No | Yes (queues, persistence) |
| Retries, throttling, rate limiting, error capture | Configures policy | Enforces policy |
| Observability — process records, dashboards, logs | Emits structured logs | Aggregates and presents |
| Worker placement, scaling, replication | You operate | Routes to whichever replica is up |
You write code only against the integration layer. The orchestration layer is configured through the Admin UI and the platform REST API.
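As a sketch of which side of the boundary your code occupies, here is a hypothetical connector. The class shape, method name, and message shape are assumptions for illustration, not the real SDK surface; the API client is injected so the sketch stays self-contained:

```typescript
// A hypothetical connector, to show where the boundary falls.
type ProcessDto = { data: unknown };

class GetOrdersConnector {
  constructor(private fetchOrders: (input: unknown) => Promise<unknown[]>) {}

  // Integration layer: your business logic. Receive a message,
  // call the third-party API, return the result.
  async processAction(dto: ProcessDto): Promise<{ data: unknown[] }> {
    const orders = await this.fetchOrders(dto.data);
    return { data: orders };
  }
  // Which queue the result lands on, persistence, retries, and
  // process tracking all belong to the orchestration layer;
  // the connector never touches them.
}
```

In production the injected call would be a real HTTP client, with credentials supplied through the relevant Application install; here it is stubbed so the shape of the division stays visible.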
Per-topology isolation: the Bridge #
When you publish a topology, the platform spins up a dedicated control microservice for it called the Bridge. Each Bridge is a small Go service running in its own container, and it drives only its own topology — the routing decisions, the queue bindings, the process tracking for that one graph.
This is the unit of operational isolation: an overload, a misbehaving node, or a bursty event source on one topology cannot starve another, because topologies do not share a control process. The same mechanism is how the platform scales: every published topology adds an independent unit of execution.
Atomization and parallelism #
Because edges are queues and every node has its own queue, parallelism is a property of the architecture rather than a feature you opt into.
- Atomization. A Batch node fed 10 000 records emits 10 000 individual processes. Each one is independent and can be retried, traced, or trashed in isolation.
- Per-node scaling. If one transformation becomes the bottleneck, you scale that node's worker replicas. The rest of the topology is unaffected.
- Fan-out is the default. When a node finishes, the Bridge sends the result down every outgoing edge unless the node returns a routing rule that picks a subset (see Topologies → Routing).
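The default fan-out and the routing-rule override can be sketched in a few lines. The `followers` field is an illustrative stand-in for whatever shape the SDK's routing rules actually take (see Topologies → Routing for the real mechanism):

```typescript
// How the Bridge decides where a result goes, in plain code.
type NodeResult = { data: unknown; followers?: string[] };

function route(result: NodeResult, outgoingEdges: string[]): string[] {
  // Default: fan out to every outgoing edge.
  if (!result.followers) return outgoingEdges;
  // Routing rule present: only the named subset receives the message.
  return outgoingEdges.filter((edge) => result.followers!.includes(edge));
}

const edges = ["enrich", "archive", "notify"];
console.log(route({ data: 1 }, edges));                        // all three edges
console.log(route({ data: 1, followers: ["notify"] }, edges)); // ["notify"]
```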
What the platform does for you #
The split exists so that you write business logic and the platform handles everything around it:
- Persistence. Every message is durably stored at every step.
- Retries. Connectors declare a retry policy in code; the platform enforces it without you writing any retry loop. See Building nodes — Error handling and retries.
- Throttling and rate limiting. Configured per node; the platform paces dispatch to respect downstream limits.
- Observability. Structured logs, process records, dashboards, and the Trash inbox all come from the orchestration layer.
- Resilience. A worker crash does not lose data; the queue holds the message until a replica picks it up, and unrecoverable failures land in Trash for manual recovery.
- Scale. You scale workers for throughput and the platform scales by adding Bridges as you publish more topologies.
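The retry split ("you configure policy, the platform enforces it") can be sketched as a toy enforcement loop. `RetryPolicy`, `enforce`, and `maxHops` are illustrative names, not SDK APIs:

```typescript
// The node declares a policy; the platform enforces it.
type RetryPolicy = { maxHops: number };

type Outcome<T> = { ok: true; value: T } | { ok: false; trashed: true };

async function enforce<T>(
  policy: RetryPolicy,
  attempt: () => Promise<T>,
): Promise<Outcome<T>> {
  for (let hop = 1; hop <= policy.maxHops; hop++) {
    try {
      return { ok: true, value: await attempt() };
    } catch {
      // The platform would redeliver the message after the configured
      // delay; your connector code contains no retry loop at all.
    }
  }
  // Retries exhausted: the message lands in Trash for manual recovery.
  return { ok: false, trashed: true };
}

// A call that fails twice, then succeeds on the third hop:
let calls = 0;
const flaky = async () => {
  calls += 1;
  if (calls < 3) throw new Error("transient");
  return "done";
};
enforce({ maxHops: 3 }, flaky).then((r) => console.log(r));
// { ok: true, value: "done" }
```

The point of the sketch is what is absent from the node's side: no loop, no delay handling, no dead-lettering. The node declares `maxHops` and writes `attempt` once.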
Reference architecture #
In a typical production deployment, workers either sit on the same host as the orchestration layer (full-stack setup, on-prem) or far away (slim worker setup pointing at Orchesty Cloud). The choice is operational, not architectural: the layers and protocols are the same in both cases. See Deployment Models for the supported variants and what changes between them.
See also #
- Learn — Architecture & Core Principles: the narrative version, with the why up front.
- Topologies — the runnable graph in detail.
- Workers and SDK — what lives in the integration layer.
- Processes and Messages — how a single execution is tracked end to end.
- Deployment Models — where the layers actually run.