How to Monitor Server Uptime and Set Up Alerts

A practical, step-by-step guide to monitoring server uptime from the outside, watching health from the inside, and getting alerted within a couple of minutes when something breaks, using free, open tools on a Skyline Cloud VPS.

What "uptime monitoring" actually means

Uptime monitoring answers two different questions, and you need both to run a reliable service:

  • Is my server reachable from the outside? (external/black-box monitoring) — does a real client over the internet get a healthy response?
  • Is my server healthy on the inside? (internal/white-box monitoring) — CPU, memory, disk, and individual services like Nginx, MySQL, or your app process.

External checks catch outages your users see. Internal checks catch the causes (a disk filling up, a runaway process, swap thrashing), often before they turn into an outage. This guide sets up both on a Linux VPS, plus alerting that reaches you within a couple of minutes of a failure. Commands assume Ubuntu 22.04/24.04 or Debian on a Skyline Cloud server; adapt package names for AlmaLinux/RHEL.

Step 1 — External uptime checks

The simplest external check is an HTTP request from a machine other than your server. Run this from a second host (or your laptop) to confirm the site answers:

curl -sS -o /dev/null -w "HTTP %{http_code} in %{time_total}s\n" \
  https://example.com/

A healthy result looks like HTTP 200 in 0.184s. To monitor continuously without writing code, use a hosted checker such as UptimeRobot or a self-hosted one like Uptime Kuma. Self-hosting on a VPS located in the Kingdom (Saudi Arabia) keeps your monitoring data in-Kingdom, which is useful for PDPL and NCA alignment.

Run Uptime Kuma on a separate, small VPS (never the server it watches) using Docker:

docker run -d --restart=always \
  -p 3001:3001 \
  -v uptime-kuma:/app/data \
  --name uptime-kuma louislam/uptime-kuma:1

Open http://<monitor-ip>:3001, create your admin account, then Add New Monitor:

  • Monitor Type: HTTP(s)
  • URL: https://example.com/health (use a lightweight health endpoint, not the homepage)
  • Heartbeat Interval: 60 seconds
  • Retries: 2 (avoids alerting on a single blip)
  • Accepted Status Codes: 200-299

Always monitor a dedicated /health endpoint that confirms your app and its database are up, not just that the web server returns a page.
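
For a quick command-line equivalent of the monitor settings above, here is a minimal sketch, assuming curl is installed. The function names (is_healthy_code, probe_once, check_with_retries) and the RETRY_DELAY variable are illustrative and not part of Uptime Kuma; the sketch only mirrors the "accept 200-299, retry twice" behavior.

```shell
#!/usr/bin/env bash
# Hypothetical helpers that mimic the monitor settings: accept 200-299, retry before failing.

# Succeed only for a three-digit status code in the 200-299 range.
is_healthy_code() {
  case "$1" in
    2[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Probe a URL once and print the HTTP status code (curl prints 000 if the connection fails).
probe_once() {
  curl -sS -o /dev/null -m 10 -w '%{http_code}' "$1" 2>/dev/null || true
}

# check_with_retries URL [RETRIES]
# Makes one attempt plus up to RETRIES further attempts; fails only if all of them fail.
check_with_retries() {
  local url="$1" retries="${2:-2}" attempt=0 code
  while [ "$attempt" -le "$retries" ]; do
    code=$(probe_once "$url")
    if is_healthy_code "$code"; then
      return 0
    fi
    attempt=$((attempt + 1))
    if [ "$attempt" -le "$retries" ]; then
      sleep "${RETRY_DELAY:-5}"   # assumed pause between attempts, in seconds
    fi
  done
  return 1
}
```

Usage: check_with_retries https://example.com/health 2 || echo "health check failed"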

Step 2 — Internal health with node_exporter and a check script

For a single server, you do not need a full Prometheus stack. A short script run by cron or a systemd timer covers the essentials. Create /usr/local/bin/health-check.sh:

#!/usr/bin/env bash
set -euo pipefail

THRESH_DISK=90      # percent
THRESH_MEM=90       # percent
WEBHOOK="https://hooks.example.com/your-webhook"

alert() {
  curl -fsS -X POST -H 'Content-Type: application/json' \
    -d "{\"text\":\"[$(hostname)] $1\"}" "$WEBHOOK" || true
}

# Disk usage on / (percentage with the % sign stripped)
disk=$(df --output=pcent / | tail -1 | tr -dc '0-9')
if [ "$disk" -ge "$THRESH_DISK" ]; then
  alert "Disk at ${disk}% on /"
fi

# Memory usage: used / total from the Mem: row of free
mem=$(free | awk '/Mem:/ {printf "%d", $3/$2*100}')
if [ "$mem" -ge "$THRESH_MEM" ]; then
  alert "Memory at ${mem}%"
fi

# Critical services must be active
for svc in nginx mysql; do
  systemctl is-active --quiet "$svc" || alert "Service $svc is DOWN"
done

Make it executable and test it:

sudo chmod +x /usr/local/bin/health-check.sh
sudo /usr/local/bin/health-check.sh
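
The script above checks only the root filesystem. If your data lives on separate mounts, a sketch like the following could extend the disk check to every mounted filesystem. It assumes GNU df (for --output) and mount points without spaces; mounts_over_threshold is a hypothetical helper name, not part of the script above.

```shell
#!/usr/bin/env bash
# mounts_over_threshold LIMIT
# Reads `df --output=pcent,target` style lines on stdin (header line first)
# and prints "<mount> <percent>%" for every mount at or above LIMIT.
# Assumption: mount paths contain no spaces.
mounts_over_threshold() {
  awk -v limit="$1" '
    NR > 1 {
      gsub(/%/, "", $1)            # strip the percent sign from the first column
      if ($1 + 0 >= limit + 0) {
        print $2 " " $1 "%"
      }
    }'
}
```

Usage: df --output=pcent,target -x tmpfs -x devtmpfs | mounts_over_threshold 90
Each printed line can then be passed to the alert function.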

Schedule it with a systemd timer, which is more reliable than cron for logging and missed-run handling. Create /etc/systemd/system/health-check.service:

[Unit]
Description=Server health check

[Service]
Type=oneshot
ExecStart=/usr/local/bin/health-check.sh

And /etc/systemd/system/health-check.timer:

[Unit]
Description=Run health check every 2 minutes

[Timer]
OnCalendar=*:0/2
Persistent=true

[Install]
WantedBy=timers.target

Enable it:

sudo systemctl daemon-reload
sudo systemctl enable --now health-check.timer
systemctl list-timers health-check.timer

For richer metrics and history, install Prometheus node_exporter, which exposes CPU, memory, disk, and network as metrics on port 9100:

sudo apt update && sudo apt install -y prometheus-node-exporter
sudo systemctl enable --now prometheus-node-exporter
curl -s http://localhost:9100/metrics | head
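
The metrics endpoint returns plain text with one "name value" pair per line, plus comment lines starting with #. As an illustration of reading it without a Prometheus server, the sketch below extracts a single unlabelled metric and derives memory usage from node_memory_MemTotal_bytes and node_memory_MemAvailable_bytes. The helper names metric_value and mem_used_percent are hypothetical.

```shell
#!/usr/bin/env bash
# metric_value NAME
# Prints the value of the first unlabelled metric called NAME from
# Prometheus text-format input on stdin. Comment lines (# HELP, # TYPE) never match.
metric_value() {
  awk -v name="$1" '$1 == name && !seen { print $2; seen = 1 }'
}

# Reads the full metrics text on stdin and prints memory in use as a whole percentage,
# calculated as (MemTotal - MemAvailable) / MemTotal.
mem_used_percent() {
  local metrics total avail
  metrics=$(cat)
  total=$(printf '%s\n' "$metrics" | metric_value node_memory_MemTotal_bytes)
  avail=$(printf '%s\n' "$metrics" | metric_value node_memory_MemAvailable_bytes)
  if [ -z "$total" ] || [ -z "$avail" ]; then
    return 1   # one of the metrics is missing from the input
  fi
  awk -v t="$total" -v a="$avail" 'BEGIN { printf "%d\n", (t - a) / t * 100 }'
}
```

Usage: curl -s http://localhost:9100/metrics | mem_used_percent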

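Bind it to localhost or restrict port 9100 in your firewall so the metrics are not public. Assuming ufw is enabled with its default deny-incoming policy, the rule below lets only your monitoring host reach the port: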

sudo ufw allow from <monitor-ip> to any port 9100 proto tcp

Step 3 — Alerting that actually reaches you

An alert is only useful if it arrives fast and through more than one channel. Configure at least two so a single failing provider does not silence you.

Channel | Best for | Latency | Notes
Email | Audit trail, non-urgent | Seconds–minutes | Use a real SMTP service, not the server itself
Webhook (Slack/Teams) | Team visibility | Seconds | Easy to wire into the script above
SMS / push | True emergencies | Seconds | Reserve for "site is down" only

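Send alert email through a proper external SMTP relay. Never rely on a mail server running on the monitored machine, because if the machine is down it cannot warn you. Use business email hosting or any SMTP provider. A minimal msmtp-based email alert (this assumes a default account for your relay is already configured in /etc/msmtprc or ~/.msmtprc):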

sudo apt install -y msmtp msmtp-mta
printf 'Subject: ALERT %s\n\n%s\n' "$(hostname)" "Disk high" \
  | msmtp -a default you@example.com
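
To send the same alert through more than one channel, a small fan-out helper is one option. The sketch below is an assumption-laden illustration: notify_all, send_webhook, and send_email are hypothetical names, WEBHOOK and ALERT_TO are placeholder values, and the email path assumes msmtp is configured as above. The helper tries every channel and reports success if at least one delivery worked, so a single failing provider does not silence the alert.

```shell
#!/usr/bin/env bash
# Placeholder destinations; replace with your own values.
WEBHOOK="${WEBHOOK:-https://hooks.example.com/your-webhook}"
ALERT_TO="${ALERT_TO:-you@example.com}"

# Post the message to a chat webhook (same JSON shape as the health-check script).
send_webhook() {
  curl -fsS -m 10 -X POST -H 'Content-Type: application/json' \
    -d "{\"text\":\"$1\"}" "$WEBHOOK" >/dev/null
}

# Send the message by email through the configured msmtp default account.
send_email() {
  printf 'Subject: ALERT %s\n\n%s\n' "$(hostname)" "$1" \
    | msmtp -a default "$ALERT_TO"
}

# notify_all MESSAGE CHANNEL...
# Calls every channel function with MESSAGE, even if an earlier one failed.
# Returns 0 if at least one channel succeeded, 1 if all of them failed.
notify_all() {
  local msg="$1" delivered=1 channel
  shift
  for channel in "$@"; do
    if "$channel" "$msg"; then
      delivered=0
    fi
  done
  return "$delivered"
}
```

Usage: notify_all "Disk at 95% on /" send_webhook send_email || echo "no channel accepted the alert" >&2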

In Uptime Kuma, add notifications under Settings → Notifications (Email/SMTP, Slack, Telegram, or a generic webhook) and attach them to each monitor.

Step 4 — Tune thresholds and avoid alert fatigue

Bad alerting is worse than none — people learn to ignore it. Follow these rules:

  • Require 2+ failed checks before alerting (the Retries setting) to ignore transient blips.
  • Alert on symptoms users feel, like HTTP 5xx or high latency, not every internal metric.
  • Set sane thresholds: 90% disk, 90% memory sustained, latency above your normal p95.
  • Send a recovery notification so you know when the issue clears.
  • Review alerts monthly and delete or tune anything that cried wolf.
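
Two of these rules, requiring repeated failures and sending a recovery notice, can be sketched with a small state file per check. This is an illustration under stated assumptions, not a feature of the health-check script above: track_state, send_alert, STATE_DIR, and FAIL_LIMIT are hypothetical names, and send_alert simply prints the message where a real setup would call the webhook or email sender.

```shell
#!/usr/bin/env bash
STATE_DIR="${STATE_DIR:-/var/tmp/health-check-state}"   # assumed location for counters
FAIL_LIMIT="${FAIL_LIMIT:-2}"                           # consecutive failures before alerting

# Placeholder sender: prints the message instead of delivering it.
send_alert() { printf '%s\n' "$1"; }

# track_state NAME STATUS MESSAGE
# STATUS is "ok" or "fail". Sends one alert when the failure count reaches FAIL_LIMIT,
# stays quiet while the failure continues, and sends one recovery notice when it clears.
track_state() {
  local name="$1" status="$2" msg="$3" file count=0
  file="$STATE_DIR/$name"
  mkdir -p "$STATE_DIR"
  if [ -f "$file" ]; then
    count=$(cat "$file")
  fi
  if [ "$status" = "fail" ]; then
    count=$((count + 1))
    echo "$count" > "$file"
    if [ "$count" -eq "$FAIL_LIMIT" ]; then
      send_alert "ALERT: $msg"
    fi
  else
    if [ "$count" -ge "$FAIL_LIMIT" ]; then
      send_alert "RECOVERED: $name"
    fi
    echo 0 > "$file"
  fi
}
```

Usage: track_state disk fail "Disk at 95% on /" when a check fails, and track_state disk ok "" when it passes. A single failed run followed by a pass produces no messages at all.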

Verify the whole pipeline

Test the alert path end to end before you rely on it. Temporarily lower a threshold, or briefly stop a monitored service during a quiet period or on a test server:

sudo systemctl stop nginx        # triggers the service-down alert
# confirm the alert arrives, then:
sudo systemctl start nginx

If the alert lands in your inbox and chat within a couple of minutes, your monitoring is real. An untested alert pipeline is the same as no monitoring.

Wrapping up

You now have external uptime checks, internal health monitoring via a systemd timer and node_exporter, and multi-channel alerting that you have actually verified. Keep your monitoring host separate from the servers it watches, keep the data in-Kingdom for PDPL and NCA compliance, and revisit thresholds as traffic grows.

Need a reliable, in-Kingdom VPS to host your apps and your monitoring stack — with local Arabic support and transparent pricing? Create your Skyline Cloud account and deploy in minutes.

SKYLINE Engineering

@skyline

The engineering team at SKYLINE Industrial Solutions. We publish field-tested guides drawn from real KSA and GCC deployments.
