laceEmitRecovery¶
Recovery notification extension for Lace. Bundled with every executor as builtin:laceEmitRecovery.
When enabled, this extension emits a notification when a probe transitions from a failing state (failure or timeout) back to success. It complements laceNotifications: while laceNotifications handles the "went down" direction (assertion failure notifications with silentOnRepeat suppression), laceEmitRecovery handles the "came back up" direction.
Activation¶
# lace.config
[extensions.laceNotifications]
laceext = "builtin:laceNotifications"
[extensions.laceEmitRecovery]
laceext = "builtin:laceEmitRecovery"
Both extensions must be listed. laceEmitRecovery declares require = ["laceNotifications"], so startup fails if laceNotifications is absent.
Behavior¶
The extension fires a single rule on the script hook (after all calls complete and the result outcome is finalized):
- Skip if no previous result -- first runs have nothing to compare against.
- Check previous outcome -- only proceeds if prev.outcome was "failure" or "timeout".
- Check current outcome -- only proceeds if result.outcome is "success".
- Emit notification -- dispatches via laceNotifications.pushNotification().
Recovery detection rule¶
when not is_null(prev)
when not is_null(prev.outcome)
let $was_down = prev.outcome eq "failure" or prev.outcome eq "timeout"
when $was_down
when result.outcome eq "success"
let $notif = is_null(config.notification) ? text(config.recovery_message) : config.notification
laceNotifications.pushNotification({
  callIndex: -1,
  conditionIndex: -1,
  trigger: "recovered",
  scope: null,
  notification: $notif
})
When notifications fire¶
| Previous outcome | Current outcome | Notification? |
|---|---|---|
| (null -- first run) | success | No |
| (null -- first run) | failure | No |
| success | success | No |
| success | failure | No (handled by laceNotifications) |
| failure | failure | No |
| failure | success | Yes -- recovered |
| timeout | success | Yes -- recovered |
| timeout | failure | No |
| failure | timeout | No |
| timeout | timeout | No |
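The table can be condensed into a single predicate. The Python sketch below is illustrative only: the function name should_emit_recovery is hypothetical and not part of Lace. It mirrors the recovery detection rule, with None standing in for a missing previous result.

```python
from typing import Optional

# Outcomes the rule treats as "down" (taken from the recovery detection rule).
FAILING_OUTCOMES = {"failure", "timeout"}


def should_emit_recovery(prev_outcome: Optional[str], current_outcome: str) -> bool:
    """Return True only for a failing -> success transition."""
    if prev_outcome is None:
        # First run: nothing to compare against.
        return False
    return prev_outcome in FAILING_OUTCOMES and current_outcome == "success"


# Every row of the table: (previous outcome, current outcome, notification?).
TABLE_ROWS = [
    (None, "success", False),
    (None, "failure", False),
    ("success", "success", False),
    ("success", "failure", False),
    ("failure", "failure", False),
    ("failure", "success", True),
    ("timeout", "success", True),
    ("timeout", "failure", False),
    ("failure", "timeout", False),
    ("timeout", "timeout", False),
]

for prev, current, expected in TABLE_ROWS:
    assert should_emit_recovery(prev, current) == expected
```

Only the two failing-to-success rows evaluate to True; every other combination is either a first run or a case this extension deliberately ignores.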
Notification format¶
By default, recovery notifications wrap recovery_message in text(). They are emitted into result.actions.notifications with trigger: "recovered":
{
  "callIndex": -1,
  "conditionIndex": -1,
  "trigger": "recovered",
  "scope": null,
  "notification": {
    "tag": "text",
    "value": "Service recovered"
  }
}
Both callIndex and conditionIndex are always -1 because recovery is a script-level event, not tied to a specific call or condition. The trigger field is "recovered", which backends can key on for special handling (e.g., including downtime duration from their own state tracking).
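For readers who want to see how the payload is assembled, here is a minimal Python sketch of the fallback between a custom notification and the default message. The helper names text and build_recovery_notification are hypothetical stand-ins written for this example; they are not Lace APIs. The {"tag": "text", "value": ...} shape is copied from the JSON above.

```python
import json
from typing import Any, Dict, Optional


def text(value: str) -> Dict[str, str]:
    """Hypothetical stand-in for text(): wrap a string as a text notification value."""
    return {"tag": "text", "value": value}


def build_recovery_notification(
    recovery_message: str = "Service recovered",
    notification: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build the recovery entry, preferring a custom notification when one is set."""
    # Same fallback as the rule: use config.notification if present,
    # otherwise wrap the recovery message with text().
    notif = text(recovery_message) if notification is None else notification
    return {
        "callIndex": -1,        # script-level event, no specific call
        "conditionIndex": -1,   # no specific condition either
        "trigger": "recovered",
        "scope": None,
        "notification": notif,
    }


default_entry = build_recovery_notification()
custom_entry = build_recovery_notification(notification=text("Probe is healthy again"))
print(json.dumps(default_entry, indent=2))
```

With no arguments, the result serializes to the same structure as the JSON example; passing a notification value replaces only the notification field.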
Configuration¶
| Key | Type | Default | Description |
|---|---|---|---|
| recovery_message | string | "Service recovered" | Default text used when no custom notification is set |
| notification | notification_val | (unset) | Optional override -- set to template("name") or text("custom message") |
Override the default message in lace.config:
[extensions.laceEmitRecovery]
laceext = "builtin:laceEmitRecovery"
recovery_message = "Probe is healthy again"
Or use a named template:
[extensions.laceEmitRecovery]
laceext = "builtin:laceEmitRecovery"
notification = template("recovery-alert")
Backend responsibilities¶
The backend receives recovery notifications as regular entries in result.actions.notifications. It should:
- Detect the "recovered" trigger to distinguish recovery events from failure notifications.
- Correlate with the previous failure -- the backend has access to prev and can compute downtime duration from prev.startedAt to result.startedAt.
- Deliver via the configured transport, potentially with different formatting or routing than failure alerts (e.g., "all clear" messages to the same channel that received the initial alert).
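The first two responsibilities might look like the following on the backend side. This is a sketch under stated assumptions, not a reference implementation: it assumes result and prev arrive as plain dictionaries and that startedAt is an ISO 8601 string with an explicit UTC offset, neither of which this page specifies. The function names are hypothetical.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional


def recovery_notifications(result: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Pick out entries whose trigger is "recovered"."""
    notifications = result.get("actions", {}).get("notifications", [])
    return [n for n in notifications if n.get("trigger") == "recovered"]


def seconds_since_previous_run(
    prev: Optional[Dict[str, Any]], result: Dict[str, Any]
) -> Optional[float]:
    """Elapsed seconds from prev.startedAt to result.startedAt, or None if unknown."""
    if not prev or "startedAt" not in prev or "startedAt" not in result:
        return None
    # Assumption: timestamps are ISO 8601 strings that fromisoformat can parse.
    started_prev = datetime.fromisoformat(prev["startedAt"])
    started_now = datetime.fromisoformat(result["startedAt"])
    return (started_now - started_prev).total_seconds()


# Example data, made up for illustration.
prev = {"outcome": "failure", "startedAt": "2024-05-01T12:00:00+00:00"}
result = {
    "outcome": "success",
    "startedAt": "2024-05-01T12:05:00+00:00",
    "actions": {
        "notifications": [
            {"trigger": "recovered", "callIndex": -1, "conditionIndex": -1},
        ]
    },
}

recovered = recovery_notifications(result)
elapsed = seconds_since_previous_run(prev, result)
```

Note that prev is only the immediately preceding run, so this interval covers the last failing run up to the recovering run. A backend that wants the full outage length would need to remember when the first failure arrived.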
Interaction with laceNotifications¶
The two extensions work together to provide a complete notification lifecycle:
| Event | Extension | Trigger |
|---|---|---|
| First failure | laceNotifications | "expect", "check", "assert", or "timeout" |
| Repeated failure | laceNotifications | (suppressed by silentOnRepeat) |
| Recovery | laceEmitRecovery | "recovered" |
| Stable success | (neither) | (no notification) |
This mirrors the alerting model of monitoring systems: alert on the initial failure, suppress noise during persistent outages, and notify again when service is restored.
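To make the lifecycle concrete, the sketch below labels each run in a sequence of outcomes with the event from the table. It is a simplification written for illustration: it assumes outcomes are limited to "success", "failure", and "timeout", and that silentOnRepeat suppresses every consecutive failing run. The real suppression logic in laceNotifications may be finer-grained. The function name lifecycle_events is hypothetical.

```python
from typing import List, Optional

FAILING = {"failure", "timeout"}
KNOWN_OUTCOMES = FAILING | {"success"}


def lifecycle_events(outcomes: List[str]) -> List[str]:
    """Label each run with the lifecycle event it would correspond to."""
    events: List[str] = []
    prev: Optional[str] = None
    for outcome in outcomes:
        if outcome not in KNOWN_OUTCOMES:
            raise ValueError(f"unexpected outcome: {outcome!r}")
        was_down = prev in FAILING
        is_down = outcome in FAILING
        if is_down and not was_down:
            events.append("first failure")      # laceNotifications alerts
        elif is_down and was_down:
            events.append("repeated failure")   # assumed suppressed by silentOnRepeat
        elif was_down:
            events.append("recovery")           # laceEmitRecovery alerts
        else:
            events.append("stable success")     # neither extension notifies
        prev = outcome
    return events


history = ["success", "failure", "failure", "timeout", "success", "success"]
events = lifecycle_events(history)
```

For this history, only the second run (first failure) and the fifth run (recovery) would produce a notification under these assumptions.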