Rate limiting
External APIs almost always cap how many requests you can make against them in a given window. The cap is shared by every topology, every worker, and every replica that calls the same API for the same user (Application install). Coordinating that cap by hand is brittle, so Orchesty does it for you with the Limiter.
What the Limiter does #
The Limiter is a keyed throttle that lives between your worker and the rest of the platform. When you tag a connector with a Limiter key, every request through that connector counts against the same bucket regardless of which topology or worker replica issued it.
Concretely:
- One key per upstream API per user. For example, `hubspot:user-a` and `hubspot:user-b` are independent buckets.
- Configurable rate. "100 requests per minute" is a typical setting, expressed as "N tokens / window".
- Backpressure, not failure. When the bucket is empty, the next request is held until a token is available. From the topology's point of view it just runs slightly slower.
- No retries needed for 429s. If the upstream returns 429, treat it as a sign the Limiter setting is too high and lower it; don't try to absorb 429s downstream.
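The semantics above can be sketched as a keyed token bucket. This is an illustrative model only, not Orchesty's actual implementation; all names are made up:

```python
import time

class TokenBucket:
    """Minimal keyed-throttle sketch: `rate` tokens refill per `window`
    seconds, up to `capacity` tokens. acquire() blocks until a token is
    available (backpressure) instead of failing the request."""

    def __init__(self, rate: float, window: float, capacity=None):
        self.fill_rate = rate / window               # tokens per second
        self.capacity = capacity if capacity is not None else rate
        self.tokens = self.capacity                  # start full: bursts allowed
        self.updated = time.monotonic()

    def _refill(self) -> None:
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.fill_rate)
        self.updated = now

    def acquire(self) -> None:
        while True:
            self._refill()
            if self.tokens >= 1:
                self.tokens -= 1
                return
            # Hold the caller until the next token accrues.
            time.sleep((1 - self.tokens) / self.fill_rate)

# One bucket per key: hubspot:user-a and hubspot:user-b stay independent.
buckets = {
    "hubspot:user-a": TokenBucket(rate=100, window=60),
    "hubspot:user-b": TokenBucket(rate=100, window=60),
}
```

Calling `buckets[key].acquire()` before each request captures the whole contract: the call returns immediately while tokens remain and sleeps otherwise, which is the "slightly slower, never failing" behaviour described above.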
When to use it #
Use the Limiter for any connector calling an external API, especially:
- APIs with strict per-minute or per-second quotas.
- APIs that bill per request and need flow control to keep cost predictable.
- APIs that respond well at low rates and start failing at high rates.
You don't need the Limiter for purely internal calls (DB, cache, internal microservice) where you control the receiving side and can scale it.
How to configure it #
Limiter settings live on the connector node in the Admin UI:
*Screenshot pending: Limiter settings on a connector node, showing the key, rate, window, and the live counter for the current bucket.*
Choose:
- Key. A stable string. Recommended shape: `<service>:<user>` so each user gets their own bucket. For services with per-endpoint quotas, include the endpoint: `hubspot-search:<user>`.
- Rate. Tokens per window. Set it just below the upstream's published quota; leave headroom for non-Orchesty traffic against the same API.
- Window. Seconds, minutes, or hours depending on how the upstream defines its quota.
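As a worked example of those three choices (the quota, headroom factor, and field names here are illustrative; the real values are entered on the connector node in the Admin UI):

```python
UPSTREAM_QUOTA = 120   # e.g. a published quota of 120 requests per minute
HEADROOM = 0.8         # keep 20% free for non-Orchesty traffic on the same API

limiter = {
    "key": "hubspot:user-a",                 # <service>:<user> -> one bucket per user
    "rate": int(UPSTREAM_QUOTA * HEADROOM),  # 96 tokens per window
    "window": 60,                            # seconds, matching the per-minute quota
}

assert limiter["rate"] < UPSTREAM_QUOTA      # always stay below the published cap
```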
Combining with batch nodes #
A batch node that pages through 100,000 items will hammer the upstream as fast as your prefetch lets it. With the Limiter in place, the batch still pages quickly, but the per-item connector downstream is paced. The Limiter is what makes "load 100k records" safe to run during business hours.
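The pacing cost is easy to estimate before launching such a run. A back-of-envelope check using the "100 requests per minute" rate from earlier:

```python
items = 100_000          # batch size: per-item calls to be paced
rate, window = 100, 60   # 100 tokens per 60 s

drain_seconds = items / (rate / window)
print(f"full drain in {drain_seconds / 3600:.1f} h")  # about 16.7 h at this rate
```

If that is too slow, raise the rate (within the upstream quota) or split the key; the point is that the duration is predictable rather than a burst.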
For the batch shape itself see Patterns: Pagination and batch.
Operational notes #
- Observability. The Admin UI shows the current bucket level and the rate of token consumption. If the bucket is constantly empty, raise the rate or split the key further (e.g. per endpoint).
- Bursts. Some APIs allow short bursts above the steady rate. The Limiter's token bucket honours that: tokens accumulate up to the bucket size, then steady-state pacing kicks in.
- Per-IP vs per-API key. If the upstream throttles by IP and you have multiple worker IPs, the Limiter still works, but you may be over-restricting yourself. In that case either set the rate per-IP, or use the Limiter to throttle to the combined allowed rate across all IPs.
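For the per-IP case, the combined allowed rate is simply the per-IP quota multiplied by the number of worker IPs (numbers are illustrative):

```python
per_ip_quota = 50    # upstream allows 50 requests/min per source IP
worker_ips = 3       # replicas egress from 3 distinct IPs

combined_rate = per_ip_quota * worker_ips   # 150 requests/min in total
# A single shared key at the combined rate never exceeds the upstream total,
# but requests clustering on one worker can still trip the per-IP cap --
# if that happens, switch to one key (and one rate) per IP instead.
```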
Per-application time-to-drain & terminate #
The Limiter view in the Admin UI is the operational counterpart to the configuration above. For each rate-limited application it shows:
- The number of messages currently queued waiting for a token.
- The currently configured rate.
- A computed estimated time to drain the backlog at that rate (e.g. "hubspot:user-a — 18,400 messages queued, ETA 2.5h").
- A breakdown of which topologies are contributing to the queue.
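The ETA column is nothing more than the backlog divided by the configured rate. Reproducing the example figures (the rate here is hypothetical, chosen to match the quoted ETA):

```python
queued = 18_400           # messages waiting for a token
rate_per_minute = 123     # hypothetical configured rate for hubspot:user-a

eta_hours = queued / rate_per_minute / 60
print(f"ETA {eta_hours:.1f} h")   # prints "ETA 2.5 h"
```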
When the backlog is the symptom of an upstream incident or a runaway producer, an operator can terminate specific processes, or every process of an offending topology, directly from this view. The Limiter then drains, and unrelated traffic queued behind it gets its turn. This is the view to open first when the symptom is "everything against vendor X is slow" rather than "a topology is failing".
For the operational workflow that uses this view end-to-end (notification → Limiter → terminate → verify drain), see Observability in practice.