Bash Script for Monitoring Server Uptime
When managing multiple servers, you need reliable ways to verify which hosts are reachable. While orchestration tools like Kubernetes and Ansible handle monitoring at scale, a lightweight Bash script works well for quick spot checks, operational tasks, or smaller deployments.
Basic ICMP Reachability Check
The simplest approach uses ping to test ICMP reachability:
ping -W 1 -c 1 "$host" &>/dev/null && echo "up" || echo "down"
Breaking down the options:
- -W 1: wait up to 1 second for a reply (Linux iputils ping; on macOS the -W value is in milliseconds)
- -c 1: send only 1 packet
- &>/dev/null: suppress all output
- && echo "up" || echo "down": use the exit status to determine output
Relying on the exit status is more robust than parsing ping output, whose format varies across systems (Linux, macOS, BSD).
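Because the unit of -W depends on the ping implementation, a small wrapper can pick the flags per platform. The sketch below is one way to do that; the function names ping_args and ping_once are illustrative, and the flag table covers only Linux (iputils), macOS, and FreeBSD, so check man ping on anything else.

```shell
#!/bin/bash
# Print ping arguments for a single probe with a ~1 second reply timeout.
# $1 is the OS name as reported by `uname -s` (defaults to the current system).
ping_args() {
    local os="${1:-$(uname -s)}"
    case "$os" in
        Darwin|FreeBSD) echo "-c 1 -W 1000" ;;  # -W is in milliseconds here
        *)              echo "-c 1 -W 1" ;;     # assumed iputils: -W is in seconds
    esac
}

# Probe one host; returns 0 if a reply arrived within the timeout.
ping_once() {
    local host="$1"
    # Word splitting of the argument string is intentional.
    # shellcheck disable=SC2046
    ping $(ping_args) "$host" &>/dev/null
}
```

With these defined, ping_once "$host" && echo "up" || echo "down" behaves like the one-liner above on each of the listed platforms.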
Sequential Server Check Script
Here’s a complete script that checks a host list sequentially:
#!/bin/bash

# Require exactly one argument: the host list file
if [[ $# -ne 1 ]]; then
    echo "usage: $0 <server-list-file>" >&2
    echo "Each server hostname/IP on a separate line" >&2
    exit 1
fi

if [[ ! -f "$1" ]]; then
    echo "Error: file '$1' not found" >&2
    exit 1
fi

alive_count=0
dead_count=0

# The "|| [[ -n ... ]]" clause handles a final line without a trailing newline
while IFS= read -r host || [[ -n "$host" ]]; do
    # Skip empty lines and comments
    [[ -z "$host" || "$host" =~ ^# ]] && continue

    if ping -W 1 -c 1 "$host" &>/dev/null; then
        echo "✓ $host is alive"
        ((alive_count++))
    else
        echo "✗ $host is down"
        ((dead_count++))
    fi
done < "$1"

echo ""
echo "Summary: $alive_count alive, $dead_count down"
Save as check-alive-servers.sh and make it executable:
chmod +x check-alive-servers.sh
Create a server list file with one host per line:
192.168.1.10
web01.example.com
db-server
10.0.0.5
# commented-out host
legacy.internal
Run the script:
./check-alive-servers.sh server-list.txt
Output:
✓ 192.168.1.10 is alive
✓ web01.example.com is alive
✗ db-server is down
✓ 10.0.0.5 is alive
✓ legacy.internal is alive

Summary: 4 alive, 1 down
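The script matches each line of the list exactly as written, so a host with trailing spaces, an inline comment, or Windows line endings would be passed to ping verbatim and reported as down. A minimal sketch of a pre-processing step follows; clean_hosts is a hypothetical helper name, and treating everything after a # as a comment is an assumption about how the list is written.

```shell
#!/bin/bash
# Print one cleaned hostname per line from the file given as $1.
# Strips carriage returns, "# ..." comments, and surrounding whitespace.
clean_hosts() {
    local line
    while IFS= read -r line || [[ -n "$line" ]]; do
        line="${line%$'\r'}"                      # drop a trailing CR from CRLF files
        line="${line%%#*}"                        # drop comments, full-line or inline
        line="${line#"${line%%[![:space:]]*}"}"   # trim leading whitespace
        line="${line%"${line##*[![:space:]]}"}"   # trim trailing whitespace
        [[ -n "$line" ]] && printf '%s\n' "$line"
    done < "$1"
    return 0
}
```

Because the script tests its argument with -f, write the cleaned list to a regular file first, for example clean_hosts server-list.txt > clean-list.txt, and then run ./check-alive-servers.sh clean-list.txt.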
Parallel Checking for Large Inventories
Sequential pings become slow with large server lists. Use background jobs with controlled concurrency:
#!/bin/bash

HOST_FILE="$1"
MAX_JOBS=10

if [[ ! -f "$HOST_FILE" ]]; then
    echo "Error: file '$HOST_FILE' not found" >&2
    exit 1
fi

# Create the scratch directory only after validation, and always clean it up
TEMP_DIR=$(mktemp -d) || exit 1
trap 'rm -rf "$TEMP_DIR"' EXIT
RESULTS_FILE="$TEMP_DIR/results"
: > "$RESULTS_FILE"   # ensure the file exists even if the host list is empty

check_host() {
    local host="$1"
    if ping -W 1 -c 1 "$host" &>/dev/null; then
        echo "✓ $host" >> "$RESULTS_FILE"
    else
        echo "✗ $host" >> "$RESULTS_FILE"
    fi
}

while IFS= read -r host || [[ -n "$host" ]]; do
    [[ -z "$host" || "$host" =~ ^# ]] && continue

    # Wait if we hit the job limit
    while (( $(jobs -r -p | wc -l) >= MAX_JOBS )); do
        sleep 0.05
    done

    check_host "$host" &
done < "$HOST_FILE"

# Block until every background ping has finished, then print sorted results
wait
sort "$RESULTS_FILE"
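This version runs up to 10 pings concurrently, which shortens total runtime most when many hosts are down and would otherwise each wait out the full timeout. Every background job appends its result to a shared temporary file, which is sorted and printed once all jobs have finished.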
Adjust MAX_JOBS based on your system’s resources. Values between 10 and 50 are a reasonable starting point for most deployments; much higher concurrency can run into process or file-descriptor limits or overwhelm slower network links.
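As an alternative to managing background jobs by hand, xargs can enforce the concurrency limit. This is a different technique from the script above, sketched under a few assumptions: the -P option is an extension available in GNU and BSD xargs rather than part of POSIX, the name run_parallel is illustrative, and the check command is passed in as arguments so that the same function works with ping or any other probe.

```shell
#!/bin/bash
# Usage: run_parallel MAX_JOBS CHECK_CMD [ARGS...] < host-list
# Runs "CHECK_CMD ARGS... HOST" for each host on stdin, at most MAX_JOBS at a
# time, and prints "up HOST" or "down HOST" based on the exit status.
run_parallel() {
    local max_jobs="$1"
    shift
    # Drop blank lines and comment lines before handing hosts to xargs
    grep -Ev '^[[:space:]]*(#|$)' |
        xargs -P "$max_jobs" -I{} sh -c '
            host=$1
            shift
            if "$@" "$host" >/dev/null 2>&1; then
                echo "up $host"
            else
                echo "down $host"
            fi
        ' _ {} "$@" |
        sort
}
```

A call such as run_parallel 10 ping -W 1 -c 1 < server-list.txt would mirror the script above, with no temporary file because each worker writes one short line straight to the pipe.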
TCP Port Checks as an Alternative
Some networks block ICMP traffic, making hosts appear unreachable even when they’re running. Test TCP connectivity to a specific port instead:
check_tcp() {
    local host="$1"
    local port="${2:-22}"
    # Pass host and port as arguments so they are never parsed as shell code
    if timeout 1 bash -c 'echo > "/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
        echo "✓ $host:$port open"
    else
        echo "✗ $host:$port unreachable"
    fi
}
check_tcp web01.example.com 443
check_tcp db-server 5432
check_tcp app-server 8080
Common ports to check:
- 22: SSH
- 80: HTTP
- 443: HTTPS
- 3306: MySQL
- 5432: PostgreSQL
- 6379: Redis
- 27017: MongoDB
This method is more reliable in restrictive network environments where firewalls drop ICMP packets.
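To drive check_tcp from a list file instead of hard-coded calls, each line can carry an optional port. The sketch below assumes a host:port line format, which is not something the scripts above define, and it does not handle bare IPv6 literals because they contain colons. The names parse_target and check_targets are illustrative.

```shell
#!/bin/bash
# Split "host:port" into "host port"; fall back to a default port (22).
parse_target() {
    local target="$1" default_port="${2:-22}"
    if [[ "$target" == *:* ]]; then
        echo "${target%%:*} ${target##*:}"
    else
        echo "$target $default_port"
    fi
}

# Read targets from stdin and hand each one to a checker function.
# The checker defaults to check_tcp, which must already be defined.
check_targets() {
    local checker="${1:-check_tcp}" line host port
    while IFS= read -r line || [[ -n "$line" ]]; do
        [[ -z "$line" || "$line" =~ ^# ]] && continue
        read -r host port <<< "$(parse_target "$line")"
        "$checker" "$host" "$port"
    done
}
```

With check_tcp defined as above, check_targets < targets.txt would test every listed host on its own port, where targets.txt is a hypothetical file with lines such as web01.example.com:443.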
Export Results as JSON
For integration with monitoring systems or log aggregation, export results as JSON:
#!/bin/bash

HOST_FILE="$1"

check_host() {
    local host="$1"
    local status="down"

    if ping -W 1 -c 1 "$host" &>/dev/null; then
        status="up"
    fi

    printf '{"host":"%s","status":"%s","timestamp":"%s"}\n' \
        "$host" "$status" "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
}

while IFS= read -r host || [[ -n "$host" ]]; do
    [[ -z "$host" || "$host" =~ ^# ]] && continue
    check_host "$host"
done < "$HOST_FILE"
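Save this variant as its own script; the examples below assume it was saved as check-alive-servers.sh in place of the sequential version. Pipe results to jq for filtering:
./check-alive-servers.sh server-list.txt | jq 'select(.status=="down")'
Or save to a file for later analysis:
./check-alive-servers.sh server-list.txt > server-status.jsonl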
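The printf in the script inserts the host string into the JSON unescaped, so a stray double quote or backslash in the list file would produce a line that jq rejects. A minimal sketch of escaping in pure Bash follows; json_escape and json_record are illustrative names, and only backslashes and double quotes are handled, which is assumed to be enough for hostname-like input.

```shell
#!/bin/bash
# Escape backslashes and double quotes so a value is safe inside a JSON string.
# Control characters are not handled; this is a sketch for hostname-like input.
json_escape() {
    local s="$1"
    s="${s//\\/\\\\}"   # \  becomes  \\
    s="${s//\"/\\\"}"   # "  becomes  \"
    printf '%s' "$s"
}

# Print one JSON object per call. The timestamp is an optional third argument
# so the output is reproducible; it defaults to the current UTC time.
json_record() {
    local host="$1" status="$2"
    local ts="${3:-$(date -u +%Y-%m-%dT%H:%M:%SZ)}"
    printf '{"host":"%s","status":"%s","timestamp":"%s"}\n' \
        "$(json_escape "$host")" "$(json_escape "$status")" "$ts"
}
```

Inside check_host, the final printf could then be replaced with json_record "$host" "$status".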
DNS Resolution and Error Handling
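The scripts resolve hostnames through the system resolver, which on Linux typically consults /etc/hosts and the DNS servers listed in /etc/resolv.conf. In air-gapped environments or when custom DNS is required, validate resolution first so that a lookup failure is not reported as a down host: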
resolve_host() {
    local host="$1"
    if getent hosts "$host" &>/dev/null; then
        return 0
    else
        echo "✗ $host (DNS resolution failed)"
        return 1
    fi
}
For scripts running in containers or chroot environments, verify /etc/resolv.conf is properly mounted or populated. Test with:
cat /etc/resolv.conf
nslookup example.com
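One way to combine the resolution check with the ping check is to report three states rather than two. The sketch below is illustrative: host_state, resolve_quiet, and ping_once are hypothetical names, and the resolver and probe are passed as optional arguments so the branching can be tried without network access.

```shell
#!/bin/bash
# Default resolver and probe, matching the commands used in this article.
resolve_quiet() { getent hosts "$1" &>/dev/null; }
ping_once()     { ping -W 1 -c 1 "$1" &>/dev/null; }

# Print "unresolved", "up", or "down" for one host.
# $2 and $3 optionally override the resolver and probe commands.
host_state() {
    local host="$1"
    local resolve_cmd="${2:-resolve_quiet}" ping_cmd="${3:-ping_once}"
    if ! "$resolve_cmd" "$host"; then
        echo "unresolved"   # name lookup failed; the host itself was never probed
    elif "$ping_cmd" "$host"; then
        echo "up"
    else
        echo "down"
    fi
}
```

Calling host_state "$host" inside the read loop keeps name lookup problems visibly separate from hosts that resolve but do not answer.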
Performance Characteristics
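- Sequential mode: up to ~1 second for each host that does not answer (the 1-second timeout); hosts that answer typically take only a few milliseconds
- Parallel mode (10 jobs): roughly 0.1 to 0.2 seconds per host averaged over the list, because up to ten timeouts overlap
- TCP mode: often faster than ICMP where firewalls drop pings, since a connection to an open port returns immediately instead of waiting out the timeout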
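The figures above follow from simple arithmetic. In the worst case every host times out, so the runtime is the number of batches of concurrent checks multiplied by the timeout. The sketch below uses a hypothetical helper, worst_case_seconds, and ignores DNS lookups and the job-polling sleep.

```shell
#!/bin/bash
# Worst-case runtime in whole seconds, assuming every host times out.
# Usage: worst_case_seconds HOSTS [JOBS] [TIMEOUT_SECONDS]
worst_case_seconds() {
    local hosts="$1" jobs="${2:-1}" timeout="${3:-1}"
    # Ceiling division: how many batches of concurrent checks are needed
    local batches=$(( (hosts + jobs - 1) / jobs ))
    echo $(( batches * timeout ))
}
```

For example, worst_case_seconds 1000 10 1 prints 100, compared with 1000 for a sequential run over the same list.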
For inventories exceeding 1000 hosts or when you need continuous monitoring, use dedicated monitoring systems like Prometheus with node_exporter, Icinga2, or Zabbix. For ad-hoc operational tasks and smaller deployments, these scripts are lightweight, portable, and need only standard utilities (ping, timeout, and optionally jq).
