Alert Rules & Monitoring
ServerPlane can monitor your servers and notify you by email when something goes wrong — high CPU, low disk space, memory pressure, or a server going offline entirely. You configure alert rules per server, and the platform evaluates them every minute.
How It Works
- You create one or more alert rules on a server's Alerts tab.
- Every minute, ServerPlane checks the server's recent stats against your rules.
- If a rule's condition is met for the configured duration, an email notification is sent to all team members who have notifications enabled.
- After triggering, the rule enters a cooldown period to prevent repeated emails for the same issue.
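The cycle above can be sketched in a few lines of Python. This is an illustration only, not ServerPlane's implementation: the names `RuleState` and `evaluate_tick` are hypothetical, and the sketch assumes "met for the configured duration" means the condition held on that many consecutive per-minute checks.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RuleState:
    breach_minutes: int = 0              # consecutive checks on which the condition held
    last_fired_at: Optional[int] = None  # minute of the most recent alert, if any


def evaluate_tick(state, condition_met, now_min, duration_min, cooldown_min):
    """Process one per-minute check; return True if an email should be sent."""
    if not condition_met:
        state.breach_minutes = 0  # any healthy check resets the streak
        return False
    state.breach_minutes += 1
    if state.breach_minutes < duration_min:
        return False  # condition has not persisted long enough yet
    in_cooldown = (
        state.last_fired_at is not None
        and now_min - state.last_fired_at < cooldown_min
    )
    if in_cooldown:
        return False  # still suppressing repeats for the same issue
    state.last_fired_at = now_min
    return True


# A condition that holds continuously for 70 minutes,
# with a 5-minute duration and a 60-minute cooldown.
state = RuleState()
fired = [m for m in range(1, 71) if evaluate_tick(state, True, m, 5, 60)]
print(fired)  # [5, 65]
```

Under these assumptions the first email goes out once the condition has held for five checks, and the next one only after the cooldown has elapsed.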
Creating an Alert Rule
Navigate to Servers > (your server) > Alerts and click Add Rule. Each rule has five fields:
| Field | Description |
|---|---|
| Metric | What to monitor: CPU Usage, RAM Usage, Disk Usage, or Heartbeat (offline detection). |
| Condition | The comparison operator: greater than, greater or equal, less than, or less or equal. |
| Threshold (%) | The percentage value to compare against. Not applicable for Heartbeat rules. |
| Duration (min) | How long the condition must persist before triggering. For Heartbeat rules, this is the number of minutes without a heartbeat before the server is considered offline. |
| Cooldown (min) | After an alert fires, how long to wait before it can fire again. Prevents notification spam during prolonged incidents. |
Click Create Rule to save. The rule takes effect on the next evaluation cycle, which runs within a minute.
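As a mental model, the five fields map onto a small record plus a comparison. The sketch below is hypothetical: the class `AlertRule`, the operator keys (`gt`, `gte`, `lt`, `lte`), and the metric names are assumptions made for illustration, not identifiers taken from ServerPlane.

```python
import operator
from dataclasses import dataclass
from typing import Optional

# Assumed short keys for the four comparison operators in the table.
OPERATORS = {
    "gt": operator.gt,    # greater than
    "gte": operator.ge,   # greater or equal
    "lt": operator.lt,    # less than
    "lte": operator.le,   # less or equal
}


@dataclass
class AlertRule:
    metric: str                 # "cpu", "ram", "disk", or "heartbeat"
    condition: Optional[str]    # key in OPERATORS; assumed unused for heartbeat
    threshold: Optional[float]  # percentage; not applicable for heartbeat
    duration_min: int
    cooldown_min: int

    def matches(self, value: float) -> bool:
        """Return True if a single metric reading satisfies the condition."""
        if self.metric == "heartbeat":
            raise ValueError("heartbeat rules have no threshold comparison")
        return OPERATORS[self.condition](value, self.threshold)


rule = AlertRule(metric="cpu", condition="gt", threshold=90.0,
                 duration_min=5, cooldown_min=60)
print(rule.matches(95.0))  # True
print(rule.matches(90.0))  # False, because "greater than" excludes the threshold itself
```

The second call shows why the choice between "greater than" and "greater or equal" matters when a metric sits exactly on the threshold.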
Available Metrics
CPU Usage
Monitors the average CPU utilization percentage over the duration window. Useful for detecting runaway processes, resource contention, or sustained load that might degrade application performance.
Recommended starting point: CPU > 90% for 5 minutes, 60-minute cooldown.
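Because the rule looks at the average over the duration window, a single short spike does not necessarily trigger it. A minimal sketch, assuming one CPU sample per minute and a hypothetical helper named `cpu_rule_breached`:

```python
from statistics import mean


def cpu_rule_breached(samples, threshold=90.0, duration_min=5):
    """Check whether the average of the last `duration_min` samples exceeds the threshold.

    `samples` is a list of CPU percentages, oldest first, assumed to be
    collected once per minute.
    """
    window = samples[-duration_min:]
    if len(window) < duration_min:
        return False  # not enough data to cover the whole window
    return mean(window) > threshold


print(cpu_rule_breached([20, 20, 20, 20, 100]))  # False: one spike, average is 36
print(cpu_rule_breached([92, 95, 91, 94, 93]))   # True: sustained load, average is 93
```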
RAM Usage
Monitors the percentage of RAM in use (used / total). High memory usage can cause the Linux OOM killer to terminate processes, including your applications or database.
Recommended starting point: RAM > 85% for 5 minutes, 60-minute cooldown.
Disk Usage
Monitors the percentage of disk space used on the root filesystem. Running out of disk space can cause databases to crash, logs to stop writing, and deployments to fail.
Recommended starting point: Disk > 90% for 5 minutes, 60-minute cooldown.
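To see roughly what this metric looks like on a given machine, the used percentage of the root filesystem can be computed with Python's standard library. This is a sketch for comparison only; the agent may calculate the figure differently (for example, tools such as `df` account for reserved blocks), so small differences are expected.

```python
import shutil


def disk_used_percent(path="/"):
    """Return the percentage of disk space used on the filesystem containing `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.used / usage.total * 100


print(f"Root filesystem: {disk_used_percent('/'):.1f}% used")
```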
Heartbeat (Offline Detection)
The agent on your server sends a heartbeat to ServerPlane every 30 seconds. If no heartbeat is received for the configured duration, the server is considered offline and you are notified. When the server comes back online, you receive a follow-up notification.
Recommended starting point: Heartbeat missing for 5 minutes, 60-minute cooldown.
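The offline check and the follow-up notification can be pictured as below. The function names are hypothetical, and whether the platform treats exactly five minutes of silence as offline (`>=` rather than `>`) is an assumption of this sketch.

```python
HEARTBEAT_INTERVAL_S = 30  # the agent sends a heartbeat every 30 seconds


def is_offline(last_heartbeat_ts, now_ts, duration_min=5):
    """True if no heartbeat has arrived for at least `duration_min` minutes."""
    return now_ts - last_heartbeat_ts >= duration_min * 60


def missed_heartbeats(duration_min=5):
    """Number of expected heartbeats that fit into the duration window."""
    return duration_min * 60 // HEARTBEAT_INTERVAL_S


def status_transition(was_offline, offline_now):
    """Return which notification, if any, a state change would produce."""
    if offline_now and not was_offline:
        return "offline"  # server went offline
    if was_offline and not offline_now:
        return "online"   # server came back online
    return None           # no change, nothing to send


print(missed_heartbeats(5))            # 10
print(is_offline(0, 299))              # False
print(is_offline(0, 300))              # True
print(status_transition(True, False))  # online
```

With the recommended 5-minute duration, roughly ten consecutive heartbeats have to go missing before the server is reported offline, which leaves room for brief network interruptions.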
Managing Rules
On the Alerts tab, each rule is displayed with:
- A colored badge indicating the metric type (CPU, RAM, Disk, Heartbeat).
- The rule condition and threshold.
- The cooldown period and when the rule was last triggered.
- A toggle switch to enable or disable the rule without deleting it.
- A delete button to permanently remove the rule.
Disabled rules are not evaluated and will not trigger notifications.
Notification Preferences
Each team member can individually control which alert types they receive emails for. Go to the user menu (top right) and select Notifications to toggle specific event types on or off.
Alert-related notification types:
- Server went offline — Triggered by heartbeat rules.
- Server came back online — Sent automatically when a previously offline server resumes heartbeats.
- High CPU usage — Triggered by CPU rules.
- High RAM usage — Triggered by RAM rules.
- Disk space warning — Triggered by disk rules.
See environment-variables.md for other notification types related to deployments, backups, and billing.
Example Setup
A typical monitoring setup for a production server:
- CPU > 90% for 5 min, cooldown 60 min.
- RAM > 85% for 5 min, cooldown 60 min.
- Disk > 90% for 5 min, cooldown 120 min.
- Heartbeat missing for 5 min, cooldown 60 min.
This gives you coverage for the most common failure modes while avoiding excessive notifications.
Troubleshooting
I'm not receiving alert emails
- Check that the alert rule is enabled (toggle is on).
- Check your notification preferences under Notifications in the user menu — ensure the relevant event types are enabled.
- Verify the queue worker is running (`php artisan queue:work`), as notification emails are sent via the queue.
- Check that your mail configuration is correct in the server's environment.
Alerts are firing too often
- Increase the cooldown period (e.g., from 60 to 120 minutes).
- Increase the duration so brief spikes don't trigger alerts (e.g., from 5 to 10 minutes).
- Raise the threshold if the current value is too sensitive for your workload.
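To estimate how much a longer cooldown helps, the rough arithmetic below counts the emails one rule would send during a single uninterrupted incident. It assumes the rule fires once the duration has elapsed and then again each time the cooldown expires while the condition still holds; `expected_alerts` is a hypothetical helper, not part of ServerPlane.

```python
def expected_alerts(incident_min, duration_min, cooldown_min):
    """Estimate emails sent by one rule during a continuous incident."""
    if incident_min < duration_min:
        return 0  # the condition never persisted long enough to trigger
    # One alert when the duration is reached, then one per full cooldown after that.
    return 1 + (incident_min - duration_min) // cooldown_min


# A six-hour incident with a 5-minute duration:
print(expected_alerts(360, 5, 60))   # 6
print(expected_alerts(360, 5, 120))  # 3
```

Under these assumptions, doubling the cooldown from 60 to 120 minutes halves the number of emails for a six-hour incident.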
Server shows offline but it's actually running
- The agent may have stopped. SSH into the server and check whether the agent process is running.
- Network issues between the server and ServerPlane can prevent heartbeats from arriving.
- Check the server's firewall rules to ensure outbound HTTPS traffic is allowed.