At 9:14 on a Monday morning, an engineer at a small SaaS company opened a ticket to restore a customer's database. Then she found out the nightly backup had been failing for nineteen days. Nobody had touched the server. Nothing had broken loudly. The job had simply stopped running — and in monitoring, silence looks exactly like success.
The job that fails without a sound
That story is unremarkable precisely because it is so common. Almost every team has a scheduled task quietly doing important work in the background: a database dump at 2am, a billing run on the first of the month, a queue worker chewing through jobs, an export that feeds a partner's system. They run unattended for months — until one day they don't.
When a background job stops, it rarely announces the fact. A container is rescheduled and never comes back. A deploy changes a path. A dependency throws an exception that gets swallowed by an empty `except`. The cron line is there, the server is up, the dashboard is green, and the work is simply not happening. You discover it at the worst possible moment: when you finally need the result.
Why ordinary monitoring can't see it
Most monitoring is built around things that answer when you knock. A website returns a status code, an API responds to a request, a port accepts a connection. You poll them on a schedule and act on what comes back.
A nightly backup answers to no one. It has no public URL, no port, nothing to poll. From the outside, a run that succeeded is indistinguishable from a run that never happened. That is the blind spot, and it is exactly the kind of work whose failure stays invisible until it is expensive.
How heartbeat monitoring flips the logic
Heartbeat monitoring turns the relationship inside out. Instead of a monitor reaching out to your job, your job reaches out to the monitor. At the end of a successful run it sends a quick HTTP request — a “check-in” or “ping” — to a unique URL. The service learns to expect that ping on a schedule.
If the check-in arrives on time, all is well and you hear nothing. If it is late or never comes, the service alerts you. Engineers call this a “dead man's switch”: the absence of a signal is the signal. You are no longer trusting that a job ran — you are being told the moment it doesn't.
Setting it up so it actually helps
A heartbeat is only as good as how you wire it in. Create one heartbeat per job rather than lumping several together, so an alert points straight at the thing that failed. Set the expected interval to match the schedule, and add a grace period so a run that is a few minutes slow doesn't page anyone at 3am.
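To make the interval and grace period concrete, here is a minimal sketch of the check a heartbeat service might run on its side. The function name `is_overdue` and the example times are assumptions made for illustration, not any particular product's implementation.

```python
from datetime import datetime, timedelta


def is_overdue(last_check_in: datetime, interval: timedelta,
               grace: timedelta, now: datetime) -> bool:
    """Return True once 'now' is past the expected check-in time plus the grace period."""
    deadline = last_check_in + interval + grace
    return now > deadline


# Illustrative values: a nightly job expected every 24 hours, with 30 minutes of slack.
last = datetime(2024, 1, 1, 2, 5)
interval = timedelta(hours=24)
grace = timedelta(minutes=30)

# 15 minutes slow: still inside the grace period, so nobody is paged.
print(is_overdue(last, interval, grace, datetime(2024, 1, 2, 2, 20)))  # False

# 55 minutes past the expected time: the grace period has run out.
print(is_overdue(last, interval, grace, datetime(2024, 1, 2, 3, 0)))   # True
```

A longer grace period means fewer false alarms but a later alert, so it is worth sizing it to how much the job's runtime normally varies.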
The detail that matters most: put the check-in at the very end of the job, after the work has actually succeeded — not at the start. A ping fired before the real work runs will happily report “healthy” on a job that crashes halfway through. Send the ping only on success, and route the alert to the team that owns the job, not to a channel no one reads.
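As a rough sketch of that ordering, a job written in Python could be wrapped like this. The URL is a hypothetical placeholder, and `send_check_in` and `run_with_heartbeat` are names invented for this example; the only library call used is the standard library's `urllib.request.urlopen`.

```python
import sys
import urllib.request

# Hypothetical placeholder: substitute the check-in URL of your own heartbeat monitor.
CHECK_IN_URL = "https://example.invalid/heartbeat/your-check-in-id"


def send_check_in(url: str, timeout: float = 10.0) -> None:
    """Send a plain HTTP GET to the check-in URL."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()


def run_with_heartbeat(job, ping=send_check_in, url=CHECK_IN_URL) -> bool:
    """Run the job, then check in only if it finished without raising."""
    try:
        job()
    except Exception as exc:
        # No ping on failure: the missing check-in is what triggers the alert.
        print(f"job failed, no check-in sent: {exc}", file=sys.stderr)
        return False
    try:
        ping(url)
    except OSError as exc:
        # The work itself succeeded; a failed ping will surface as a missed check-in.
        print(f"job succeeded but check-in failed: {exc}", file=sys.stderr)
    return True
```

Calling `run_with_heartbeat(nightly_backup)` as the last step of the script, where `nightly_backup` stands in for your own function, keeps the ping after the work rather than before it. Note that the failure is logged rather than silently discarded, so the job's own logs still explain why the check-in never arrived.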
Catching the silence with WatchControl
This is exactly the gap WatchControl is built to close. You create a heartbeat monitor, copy its check-in URL, and add a single curl line to the end of your script or cron entry — no agent to install, no inbound access to open. It works from anywhere your job can make an HTTP request, including servers behind a firewall.
Set the interval and grace period, and WatchControl watches for the silence on your behalf. The moment a check-in is missed, it alerts you by email, webhook or SMS — and because WatchControl is built in Denmark and hosted in the EU, that check-in data never leaves the EU. The backup that fails for nineteen days becomes the backup you hear about on day one.