Notifications
Notifications are Orchesty's "tap on the shoulder" channel. They exist for the moments where a human needs to take a look right now: a failed process that landed in Trash, or an instance that is about to hit a resource limit.
What we notify on
Orchesty 3.0 supports two notification categories:
1. Failed messages
When a process fails after all configured retries and lands in the Trash inbox, the platform can send a notification with:
- The topology name and the user the process ran for.
- The node where the failure happened.
- A short error excerpt and a deep link to the process detail in the Admin UI.
This is the notification you almost always want on for production topologies. A failed message means a real process did not complete, and it should not be discovered hours later by a customer or by a stale dashboard.
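To make the notification contents above concrete, here is a minimal sketch of what such a payload might look like as a data structure. The class and field names are assumptions for illustration; Orchesty defines the actual payload shape, so inspect a real delivery before relying on specific keys.

```python
from dataclasses import dataclass

@dataclass
class FailedMessageNotification:
    # Hypothetical field names; the real payload shape comes from Orchesty.
    topology: str       # topology name
    user: str           # the user the process ran for
    node: str           # node where the failure happened
    error_excerpt: str  # short error excerpt
    detail_url: str     # deep link to the process detail in the Admin UI

    def summary(self) -> str:
        """One-line summary suitable for a chat or pager message."""
        return (f"[FAILED] {self.topology} (user: {self.user}) "
                f"at node '{self.node}': {self.error_excerpt} -> {self.detail_url}")
```

A one-line summary like this is usually enough for an on-call glance; the deep link carries the rest of the context.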
2. Instance resource limits
When the instance is approaching a limit that could lead to data loss or process crashes (low disk on the message broker, exhausted in-flight quota, broker memory pressure), the platform sends a heads-up notification while there is still time to act.
This category is intentionally narrow. It is for "fix this or things will break", not for routine performance reports. Other operational metrics live in the dashboards (see Integration monitoring).
What we do not notify on
To keep notifications meaningful we deliberately do not send:
- Routine "process completed" pings. Use logs or the Admin UI for that.
- Per-message warnings. They go to the Trash inbox if they failed, or to logs if they didn't.
- Performance reports. Look at the dashboards.
Channels
Notifications can be delivered to:
- Email. One or more recipients.
- Webhook. A POST to your URL of choice; useful for routing into Slack, PagerDuty, Opsgenie, or your own incident tool.
Channels are configured in the Admin UI.
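A webhook endpoint on your side just needs to accept the POST and decide where to forward it. The sketch below is a hypothetical receiver, assuming a JSON body with a `category` key; the actual payload keys are defined by your Orchesty instance, so confirm them against a real delivery.

```python
import json

def handle_webhook(body: bytes) -> dict:
    """Parse an incoming notification POST body and pick a forwarding target.

    The payload keys ("category", etc.) are assumptions for illustration;
    inspect a real webhook delivery to confirm the exact shape.
    """
    event = json.loads(body)
    category = event.get("category", "unknown")
    # Resource-limit warnings are the "fix this or things will break" class,
    # so route them to the paging tool; failed messages go to the team channel.
    target = "pagerduty" if category == "resource_limit" else "slack"
    return {"target": target, "event": event}
```

In practice this function would sit behind whatever HTTP framework you already run; the routing decision is the only Orchesty-specific part.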
[Screenshot pending: Notification settings in the Admin UI, showing the recipient list, channel type, and per-category enable toggles. Target size 900 x 520.]
Recommended baseline
For a new production instance:
- Failed messages -> a shared inbox that on-call rotations watch. Email plus a webhook into your incident tool is a good combo.
- Instance resource limits -> the same channel, marked higher severity. If you only have one channel, this is fine; just make sure the on-call sees it.
- Snoozes for known maintenance. When you know an upstream is down for an hour, you can mute Failed-message notifications for a topology to avoid pager fatigue, then unmute when work resumes.
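The snooze behavior described above can be sketched as a small mute list. This is an illustration of the idea, not Orchesty's implementation; the function names and the in-memory store are assumptions.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical in-memory snooze list: topology name -> mute-until timestamp.
snoozes: dict[str, datetime] = {}

def snooze(topology: str, hours: float) -> None:
    """Mute failed-message notifications for a topology for a while."""
    snoozes[topology] = datetime.now(timezone.utc) + timedelta(hours=hours)

def should_notify(topology: str) -> bool:
    """True unless the topology is currently snoozed."""
    until = snoozes.get(topology)
    return until is None or datetime.now(timezone.utc) >= until
```

A time-boxed snooze like this is safer than a plain on/off toggle: it unmutes itself, so a forgotten mute cannot silently swallow real failures after the maintenance window ends.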