I was wiring up the scheduler for an uptime monitor when the question landed: how does this thing know which monitors are due?
Each monitor checks on its own interval — one every 3 minutes, another every 10. The scheduler wakes up every 10 seconds and has to answer, for every monitor in the system: is it your turn yet?
The answer requires one piece of data per monitor — when it was last checked. That sounds trivial. Where you store it isn’t.
A common instinct in database design is: don’t store what you can derive. If the data already exists somewhere in your schema, compute it — don’t duplicate it. It’s good instinct. I ignored it, and I’d do it again.
The obvious answer
The data already exists. Every time a monitor is checked, a row lands in the results table with a timestamp. So the “pure” answer is to derive last_checked_at from there.
SELECT MAX(timestamp) FROM results WHERE monitor_id = $1
No duplication. Always consistent. The query is also fast — results has a composite primary key on (monitor_id, timestamp), exactly the index this query uses. Postgres jumps straight to the last entry for that monitor without scanning the table.
I almost went with this.
The problem with elegant
Then I thought about what this query looks like at scale.
The scheduler ticks every 10 seconds. Each tick, it needs last_checked_at for every monitor in the system. Picture modest growth: 1,000 users, 10 monitors each. That’s 10,000 MAX queries every 10 seconds — 1,000 per second, forever, just to figure out who’s due.
Each query is fast. But “fast” multiplied by “every monitor, every tick, forever” still climbs. And the question being asked is mundane — most ticks, nothing is due, and we’ve just burned a thousand queries to confirm it.
The index helps. It doesn’t change the shape of the problem.
The options
Three paths:
1. Derive from results. Elegant, no duplication. But query volume scales linearly with the number of monitors — every monitor, every tick.
2. In-memory map. Keep a map[monitorID]time.Time in the process. Zero DB queries. But it’s lost on restart — every monitor fires the instant the service comes back, regardless of when it was last checked. A thundering herd on every deploy.
3. Dedicated last_checked_at column on monitors. One indexed row read per monitor per tick, regardless of how many historical results exist. Scales flat. Survives restarts. Adds one write per check.
The choice — and what makes it acceptable
I added last_checked_at TIMESTAMPTZ to the monitors table.
The obvious objection: now there are two sources of truth. last_checked_at on monitors and the timestamps in results. They can drift. Someone can write a bug that updates one and not the other.
The answer is the transaction.
tx, err := m.Pool.Begin(ctx)
if err != nil {
return err
}
defer tx.Rollback(ctx)
// insert the result
_, err = tx.Exec(ctx, insertResultQuery, ...)
if err != nil {
return err
}
// update last_checked_at atomically
_, err = tx.Exec(ctx, `UPDATE monitors SET last_checked_at = NOW() WHERE public_id = $1`, monitorID)
if err != nil {
return err
}
return tx.Commit(ctx)internal/results/store.go
last_checked_at only moves when a result is inserted. Both writes happen in the same transaction — either both commit or neither does. There’s no code path that updates one without the other, because no such path exists.
That’s the distinction worth naming: trusted duplication vs enforced duplication. Trusted duplication asks every caller to remember to update both columns. Enforced duplication is a single wiring point the database guarantees.
The takeaway
The general rule:
Don’t store what you can derive — unless the calculation is computationally expensive and the data is read far more often than it’s updated.
Both halves apply here. last_checked_at gets read on every tick (every monitor, every 10s) and updated only when a check completes (every few minutes per monitor). Read-to-write is heavily skewed.
The “expensive” half needs a small reframe, though. Any single MAX() query is cheap. The workload isn’t — it runs on a hot path, forever, regardless of whether the answer is interesting. The cost lives in frequency × volume, not per-call.
The rule tells you when to denormalize. The transaction tells you how — without duplication becoming a bug factory.
Think there’s a cleaner way to solve this — one I talked myself out of, or didn’t consider at all? I’d genuinely like to hear it. Reach me on X.