Rate Limiting and Cooldowns: Protecting Sender Reputation
2026-04-08
A human sender waking up and deciding to email 5,000 people in an hour is noticeable. It takes effort. Someone has to compose the messages, review the list, click send. There are natural speed governors everywhere.
An AI agent has none of those governors. A loop that goes wrong, a batch operation someone triggers without thinking, a retry logic bug that compounds on itself - any of these can drive thousands of emails out the door in minutes. Mailbox providers notice. They notice fast. And once you have a reputation problem, you are looking at weeks of recovery time, not hours.
Rate limiting and cooldown enforcement are not optional guardrails you add later. They are the infrastructure that makes AI-powered email safe to operate at scale.
Why AI senders fail differently
The standard deliverability advice for human senders is built around gradual behavior: warm your domain slowly, don't change your sending pattern suddenly, keep volumes consistent. Inbox providers have learned to trust senders who look like humans.
AI agents break every one of those patterns. They do not warm up. They do not send at human speeds. They do not space out follow-ups across a natural workday. And when something goes wrong - a runaway retry loop, a misread condition in an if-statement, a test that hit production - the volume spike is instant and steep.
Gmail and Outlook track engagement per sender domain across rolling windows. A sudden spike in outbound volume with no corresponding engagement history reads as a spam blast, regardless of your intent. The result is soft filtering first (messages land in spam), then hard reputation damage (messages get rejected or silently dropped), then potential blocklisting.
The mechanics of this failure mode are documented in our email deliverability guide for AI agents. What this post covers is the specific enforcement layer that prevents it.
Three windows, three problems
Effective rate limiting for email requires enforcement across three time horizons, because the threat each window addresses is different.
Hourly limits: stopping runaway spikes
Hourly limits are your first defense against bugs and accidents. A loop that escapes, a retry function that doubles back on itself, a batch job that runs twice - these all show up as hourly spikes before they show up anywhere else.
The goal of an hourly limit is not to constrain normal operation. It is to create a ceiling that abnormal behavior cannot punch through. A well-calibrated hourly limit should never trigger during expected usage - it should only activate when something has gone wrong.
When an hourly limit is hit, the right behavior is to block sends and surface the reason through the decision trace, not to queue silently. Queuing silently means you lose the signal that something is wrong. A structured block reason gives the agent (and the operator) something to act on.
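As a rough illustration of that contract - block with a structured reason instead of queuing silently - here is a minimal fixed-window counter. This is a sketch with invented names, not Molted's implementation, which lives in the policy engine rather than application code:

```python
import time

class HourlyLimiter:
    """Illustrative fixed-window hourly limiter that blocks with a
    structured reason rather than silently queuing the send."""

    def __init__(self, limit, clock=time.time):
        self.limit = limit
        self.clock = clock          # injectable for testing
        self.window = None          # current hour bucket
        self.count = 0

    def check(self):
        hour = int(self.clock() // 3600)
        if self.window != hour:     # new hour: reset the counter
            self.window = hour
            self.count = 0
        if self.count >= self.limit:
            # Surface the block reason so the agent/operator can act on it.
            return {"allow": False, "reason": "hourly_limit_exceeded"}
        self.count += 1
        return {"allow": True, "reason": None}
```

The structured return value is the point: a boolean alone loses the signal that something abnormal happened.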
Daily limits: protecting the warmup curve
Daily limits protect your sender domain's engagement curve. Inbox providers track daily volume per domain against engagement signals (opens, replies, clicks, unsubscribes). A day where you send 10x your normal volume looks suspicious even if each individual message is legitimate.
Daily limits also give you a predictable cost ceiling. AI agents that trigger sends in response to events can generate surprising volume if the event stream is unexpectedly large. A daily cap converts an unpredictable variable cost into a bounded one.
Monthly limits: billing predictability and plan alignment
Monthly limits align your sending behavior with your plan capacity and give you a forward-looking usage signal. Running out of monthly quota on the 15th is a product problem - you need to either upgrade or throttle outbound activity for the rest of the period.
Molted's policy engine enforces all three windows in sequence. Every send request is checked against hourly, daily, and monthly usage before it is approved. Any window that is exhausted blocks the send with a specific reason code.
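That sequential evaluation can be sketched as follows. The dict shapes here are illustrative, not Molted's internal data model; the reason codes match the ones documented below:

```python
def check_windows(usage, limits):
    """Evaluate hourly, daily, and monthly windows in sequence.
    The first exhausted window blocks the send with its specific
    reason code; only if all three pass is the send approved."""
    for window, reason in [
        ("hourly", "hourly_limit_exceeded"),
        ("daily", "daily_limit_exceeded"),
        ("monthly", "monthly_limit_exceeded"),
    ]:
        if usage[window] >= limits[window]:
            return {"allow": False, "reason": reason}
    return {"allow": True, "reason": None}
```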
What a blocked send looks like
When a send is blocked by rate limiting, the Agent Runtime API returns a 200 with status: "blocked" in the body - not a 4xx. This is intentional. The request was processed correctly; the policy engine evaluated it and made a decision. Your agent needs to check the response body, not just the HTTP status.
```json
{
  "requestId": "req_abc123",
  "status": "blocked",
  "reason": "hourly_limit_exceeded",
  "policyTrace": {
    "decision": { "allow": false, "reason": "hourly_limit_exceeded" },
    "auditEvents": [...]
  }
}
```
The reason field tells you exactly which window was hit. This matters for retry logic. If the reason is hourly_limit_exceeded, the right response is to wait for the hourly window to reset. If it is daily_limit_exceeded, waiting until the top of the next hour accomplishes nothing.
Block reasons related to rate limits:
| Reason | Window |
|---|---|
| rate_limited | Hourly quota exhausted |
| hourly_limit_exceeded | Hourly quota exhausted |
| daily_limit_exceeded | Daily quota exhausted |
| budget_exceeded | Daily send quota exceeded |
| monthly_limit_exceeded | Monthly quota exhausted |
| monthly_budget_exceeded | Monthly send quota exceeded |
| overage_cap_exceeded | Hard cap for paid overages exceeded |
You can also check current usage before sending large batches:
```
GET /v1/me/usage
Authorization: Bearer YOUR_API_KEY
```

```json
{
  "monthly": { "used": 2341, "limit": 3000, "remaining": 659 },
  "daily": { "used": 187, "limit": 500, "remaining": 313 },
  "hourly": { "used": 23, "limit": 75, "remaining": 52 }
}
```
If you are building a batch send job, checking usage first and comparing remaining capacity to your planned volume is cleaner than sending until you get blocked.
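A sketch of that pre-flight check, assuming the usage response shape shown above (the helper name is invented). Capacity is bounded by the tightest of the three windows, so the batch should be split at that boundary:

```python
def plan_batch(usage, batch_size):
    """Compare planned batch volume against remaining capacity.
    `usage` mirrors the GET /v1/me/usage response body; the most
    constrained window determines how much can be sent now."""
    remaining = min(window["remaining"] for window in usage.values())
    send_now = min(batch_size, remaining)
    return {"send_now": send_now, "defer": batch_size - send_now}
```

With the sample response above, a 200-message batch would be capped by the hourly window (52 remaining), with the rest deferred to later windows.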
Cooldown windows: the per-recipient protection
Rate limits operate at the account level. Cooldowns operate at the recipient level. They solve a different problem.
The scenario: your agent has a follow-up template. It sends one to a contact, gets no response, and triggers another follow-up. Under some conditions - a bug, a misconfigured trigger, a retry loop - the agent might attempt to send the same template to the same contact multiple times within a short window.
Without a cooldown, the recipient gets hammered with duplicate messages. From their perspective, that is spam. From your reputation's perspective, a complaint is a complaint regardless of whether it was intentional.
Molted enforces a 10-minute cooldown per template per recipient. If the same template is sent to the same recipient within that window, the second attempt is blocked with reason: "cooldown". The idempotency key (dedupeKey) catches exact duplicates; the cooldown window catches near-duplicates where different keys are used.
This is distinct from reason: "duplicate", which fires when the exact same dedupeKey is reused - a signal that the agent is replaying an already-processed send rather than generating a new one.
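The relationship between the two checks can be sketched like this. The ordering, state shape, and 600-second constant are illustrative (the 10-minute window comes from the text above; the storage is not Molted's):

```python
COOLDOWN_SECONDS = 600  # 10-minute per-template-per-recipient window

def evaluate_send(template_id, recipient, dedupe_key, state, now):
    """Illustrative ordering of the two duplicate protections:
    an exact dedupeKey replay -> "duplicate" (idempotent replay);
    same template to same recipient inside the window -> "cooldown"
    (near-duplicate with a fresh key)."""
    if dedupe_key in state["seen_keys"]:
        return {"allow": False, "reason": "duplicate"}
    last = state["last_send"].get((template_id, recipient))
    if last is not None and now - last < COOLDOWN_SECONDS:
        return {"allow": False, "reason": "cooldown"}
    state["seen_keys"].add(dedupe_key)
    state["last_send"][(template_id, recipient)] = now
    return {"allow": True, "reason": None}
```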
Risk budget: reputation-aware sending
Beyond time-window limits, Molted tracks a risk budget per mailbox. The risk budget accumulates based on negative signals: hard bounces, soft bounces, and spam complaints. Each signal adds a weighted score to the daily risk total.
When the risk budget is exceeded, sends are blocked with risk_budget_exceeded. When the ratio of negative signals (bounces and complaints) crosses a threshold within a 24-hour window, sends are blocked with negative_signal_budget_exceeded.
Both of these are auto-pauses that exist to protect you from yourself. If your agent is generating an abnormal bounce rate - because it has a bad list, because a recipient domain started rejecting, because something in your contact import went wrong - you want sending to stop before the problem compounds into a reputation crisis.
The mailbox auto-pause (mailbox_paused) is the most aggressive form of this: when reputation signals breach the configured threshold, the mailbox stops sending entirely until manually reviewed and resumed. This is a last-resort protection for cases where the risk budget and negative signal checks have both been exhausted.
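The weighted accumulation described above can be sketched as follows. The specific weights and budget here are invented for illustration; Molted's actual scoring is not specified in this post:

```python
# Hypothetical weights: complaints hurt most, soft bounces least.
SIGNAL_WEIGHTS = {"hard_bounce": 5.0, "soft_bounce": 1.0, "spam_complaint": 10.0}

def risk_decision(signals, daily_budget):
    """Sum weighted negative signals for the day; once the total
    exceeds the budget, further sends are blocked."""
    score = sum(SIGNAL_WEIGHTS[s] for s in signals)
    if score > daily_budget:
        return {"allow": False, "reason": "risk_budget_exceeded", "score": score}
    return {"allow": True, "reason": None, "score": score}
```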
Per-domain throttling
In addition to account-level limits, Molted applies per-recipient-domain throttling. If your agent is sending to many recipients at the same company domain (say, multiple contacts at bigcorp.com), the hourly volume to that single receiving domain is capped.
This matters because some receiving domains - particularly large corporate mail servers - actively rate-limit inbound connections per sender. Exceeding their limits causes deferrals and soft bounces, which accumulate as negative signals even though the messages were legitimate.
The block reason for this is domain_throttled. If you are seeing it consistently for a specific recipient domain, you are likely hitting their inbound rate limits and need to spread sends across a longer time window.
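A minimal sketch of the per-domain check (the cap value and counter storage are illustrative assumptions, not Molted's configuration):

```python
from collections import Counter

def check_domain_throttle(recipient, hourly_by_domain, per_domain_cap):
    """Cap hourly sends to a single receiving domain. `hourly_by_domain`
    is a Counter of sends this hour, keyed by recipient domain."""
    domain = recipient.split("@", 1)[1].lower()
    if hourly_by_domain[domain] >= per_domain_cap:
        return {"allow": False, "reason": "domain_throttled"}
    hourly_by_domain[domain] += 1
    return {"allow": True, "reason": None}
```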
Simulate before sending at scale
Before running a large batch send, use the simulate endpoint to test how the policy engine would respond without actually sending anything:
```
POST /v1/agent/simulate-send

{
  "tenantId": "your-tenant-id",
  "recipientEmail": "alice@example.com",
  "templateId": "quarterly-checkin",
  "dedupeKey": "test-dry-run-alice",
  "mailboxId": "mbx_abc123"
}
```
The simulate endpoint evaluates every policy rule - including rate limit state, cooldown windows, suppression lists, and risk budget - and returns the same decision trace as a real send, without committing anything. Run it for a sample of your recipients before starting a large batch job.
One caveat: there is a small window between simulation and actual send where rate limit state can change. Simulate is a planning tool, not a guarantee. But for catching obvious problems before they become expensive, it is worth the extra round-trip.
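One way to structure that sampling step is sketched below. `simulate_fn` is a hypothetical wrapper around the simulate endpoint that returns the decision body; HTTP details are deliberately left out:

```python
import random

def simulate_sample(recipients, simulate_fn, sample_size=10, seed=None):
    """Dry-run a random sample of recipients through the simulate
    endpoint before a batch job, collecting any blocked decisions
    so obvious problems surface before the real sends start."""
    rng = random.Random(seed)
    sample = rng.sample(recipients, min(sample_size, len(recipients)))
    blocked = {}
    for recipient in sample:
        result = simulate_fn(recipient)
        if result.get("status") == "blocked":
            blocked[recipient] = result.get("reason")
    return blocked
```

If `blocked` comes back non-empty, the batch job can pause and surface the reasons before committing any real volume.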
Handling rate limit responses in agent code
Your agent needs to treat status: "blocked" as a meaningful signal, not an error to retry immediately. The pattern:
```python
response = send_email(payload)

if response.status == "blocked":
    reason = response.reason
    if reason in ["hourly_limit_exceeded", "rate_limited"]:
        # Wait for the hourly window to reset, then continue
        wait_until_next_hour()
    elif reason in ["daily_limit_exceeded", "budget_exceeded"]:
        # Done for today. Resume tomorrow.
        defer_until_tomorrow()
    elif reason in ["monthly_limit_exceeded", "monthly_budget_exceeded"]:
        # Needs operator attention - plan limit hit
        notify_operator("monthly limit reached")
    elif reason == "cooldown":
        # Already sent this template to this recipient recently
        skip_and_continue()
    elif reason == "duplicate":
        # This exact dedupeKey was already used - idempotent, treat as success
        treat_as_sent()
```
This pattern - branch on reason, take an appropriate action, do not retry blindly - is the difference between an agent that degrades gracefully and one that compounds a rate limit problem into a reputation problem.
Why policy enforcement belongs in infrastructure, not application code
You could implement rate limiting in your agent's code. Many teams start there: a counter in Redis, a check before each send, a backoff when the counter hits a threshold.
The problem is that application-layer rate limiting breaks under load, under bugs, and under the specific failure modes that agents introduce. A bug in your rate limiting code does not limit you - it just stops rate limiting. A new agent that someone deploys without wiring up to your shared counter has no limits at all. A retry logic bug that compounds does not get caught by a counter that was never incremented.
The policy engine evaluates every send as a transaction against the current state of all relevant windows, risk budgets, and cooldowns. It is the same separation of concerns you already rely on for database connection limits and authentication. It belongs at the infrastructure layer because that is where it cannot be bypassed.
Molted gives your agent a managed mailbox with rate limiting, cooldown enforcement, and risk budget tracking built in. Start your free trial or read the docs to see how the limits are configured for each plan.
Keep reading
- Email Deliverability for AI Agents: A Technical Guide - the full picture of how agent email behavior affects inbox placement
- Why AI Agents Need Email Guardrails - why policy enforcement belongs at the infrastructure layer
- The Policy Rules That Protect Your Sender Reputation - the full set of policy rules evaluated on every send
- What Is Agent-Native Email? - the category this infrastructure belongs to