AB
Amin Boostani
Infrastructure Engineer · Monitoring Specialist
Real-Time
Production
Live Infrastructure
Monitoring Sentinel
A distributed real-time monitoring system that collects metrics from dozens of servers across multiple providers (AWS, Hetzner), visualizes health via dual interfaces (terminal TUI + web dashboard), and alerts on anomalies, all with sub-second refresh rates.
NODES
🖥️
AWS Instances
eu-north · eu-west
🖥️
Hetzner Nodes
Dedicated Servers
🖥️
Service Nodes
Application Hosts
▲
FastAPI :880 → JSON metrics (CPU, RAM, Net, Uptime, Traffic)
▼
COLLECT
📥
Data Collectors
Multiprocessing Pool
📄
CSV Store
Per-Provider Files
CSV Read · 0.5s Refresh
▼
DISPLAY
🖥️
Terminal TUI
Curses · Color-coded
🌐
Web Dashboard
PHP · Long-Polling
🔔
Alert Engine
Threshold Rules
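The CSV hand-off between the collectors and the display layer could look roughly like the sketch below. It is an illustration only: the column set, the `<provider>.csv` file naming, and the write-then-rename step (so a reader polling every 0.5s never sees a half-written file) are assumptions, not details taken from the project.

```python
import csv
import os
import tempfile

# Hypothetical column set; the real files may carry different fields.
FIELDS = ["host", "cpu", "ram", "net_mbps", "uptime_s"]


def write_provider_csv(directory, provider, rows):
    """Write one provider's snapshot and swap it into place atomically."""
    path = os.path.join(directory, f"{provider}.csv")  # assumed naming scheme
    # Write to a temp file in the same directory, then rename over the target.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    os.replace(tmp_path, path)
    return path


def read_provider_csv(path):
    """Read the latest snapshot; every value comes back as a string."""
    with open(path, newline="") as fh:
        return list(csv.DictReader(fh))
```

A display loop would call `read_provider_csv` for each provider file on every refresh tick and convert the string values back to numbers before applying thresholds.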
⚡
Parallel Data Collection
A multiprocessing pool fetches metrics from all servers in parallel instead of polling them one by one. Runs against 30+ nodes without the collector becoming the bottleneck.
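A minimal sketch of pool-based collection, under stated assumptions: the `/metrics` path, the `fetch_metrics` and `collect_all` names, the timeout, and the worker count are all hypothetical; only the agent port (880) and the use of a multiprocessing pool come from the description above. The pool class is injectable so the same logic can be exercised with a thread pool.

```python
import json
import multiprocessing
import urllib.request


def fetch_metrics(host, port=880, timeout=2.0):
    """Fetch one node's JSON metrics; return an error record on failure."""
    url = f"http://{host}:{port}/metrics"  # endpoint path is an assumption
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return {"host": host, "ok": True, **json.load(resp)}
    except (OSError, ValueError, TypeError) as exc:
        # Unreachable node, bad JSON, or a non-object payload.
        return {"host": host, "ok": False, "error": str(exc)}


def collect_all(hosts, fetch=fetch_metrics, workers=8,
                pool_cls=multiprocessing.Pool):
    """Run `fetch` over every host in parallel, preserving input order."""
    if not hosts:
        return []
    with pool_cls(min(workers, len(hosts))) as pool:
        return pool.map(fetch, hosts)
```

Because failures are returned as records rather than raised, one unreachable node does not abort the whole pass. When using the default process pool from a script, call `collect_all` under an `if __name__ == "__main__":` guard.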
🖥️
Curses Terminal UI
Full-screen ncurses dashboard with color-coded health indicators, scrollable sections, and 0.5s auto-refresh.
📊
Multi-Provider View
Unified view across AWS, Hetzner, and custom nodes, each with provider-specific health thresholds and logic.
🚨
Smart Anomaly Rules
Context-aware alerts: high CPU with low traffic signals a problem, while high CPU with high traffic is treated as normal load. This keeps routine traffic spikes from raising false alarms.
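The rule described here can be sketched as a small classifier. The threshold values, the `Sample` type, and the label names are illustrative assumptions; the project's actual cut-offs are not stated.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    cpu_percent: float
    traffic_mbps: float


# Hypothetical thresholds, chosen only for illustration.
CPU_HIGH = 85.0
TRAFFIC_LOW = 5.0


def classify(sample, cpu_high=CPU_HIGH, traffic_low=TRAFFIC_LOW):
    """Return 'ok', 'busy', or 'alert' for one CPU/traffic reading."""
    if sample.cpu_percent < cpu_high:
        return "ok"
    if sample.traffic_mbps < traffic_low:
        return "alert"  # CPU is high but traffic does not explain it
    return "busy"  # CPU is high and traffic accounts for the load
```

For example, `classify(Sample(95, 1))` yields `"alert"`, while `classify(Sample(95, 200))` yields `"busy"`.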
🔄
FastAPI Resource Agent
Async agent on each node exposing CPU, RAM, network speed, traffic, and service uptime via a REST endpoint.
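As a rough illustration of what such an agent might return, the sketch below assembles a metrics payload from raw readings. To stay dependency-free it leaves out the FastAPI route and the psutil calls; in a real agent the inputs would plausibly come from `psutil.cpu_percent()`, `psutil.virtual_memory().percent`, `psutil.net_io_counters()`, and `psutil.boot_time()`. The field names and function names are assumptions.

```python
import time


def rate_mbps(prev_bytes, cur_bytes, interval_s):
    """Megabits per second between two byte-counter samples."""
    if interval_s <= 0 or cur_bytes < prev_bytes:
        return 0.0  # bad interval, or the counter was reset
    return (cur_bytes - prev_bytes) * 8 / interval_s / 1_000_000


def build_payload(cpu_percent, ram_percent, prev_net_bytes, cur_net_bytes,
                  interval_s, boot_time, now=None):
    """Assemble the JSON-serializable dict an endpoint could return."""
    now = time.time() if now is None else now
    return {
        "cpu": cpu_percent,
        "ram": ram_percent,
        "net_mbps": round(rate_mbps(prev_net_bytes, cur_net_bytes, interval_s), 3),
        "traffic_bytes": cur_net_bytes,  # cumulative counter
        "uptime_s": int(now - boot_time),
    }
```

Network speed is derived from the difference between two cumulative counter readings, which is why the agent needs to keep the previous sample between requests.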
🛡️
Auto-Healing Triggers
Automatic service restarts when RAM exceeds 96%. Service management endpoints for remote reboot and restart.
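A hedged sketch of the restart trigger: only the 96% RAM threshold comes from the description above. Restarting through `systemctl`, the `maybe_restart` name, and the injectable `run` callable are assumptions made for illustration.

```python
import subprocess

RAM_RESTART_THRESHOLD = 96.0  # percent, as stated in the feature description


def maybe_restart(ram_percent, service, run=subprocess.run,
                  threshold=RAM_RESTART_THRESHOLD):
    """Restart `service` when RAM usage exceeds the threshold.

    Returns True if a restart was issued, False otherwise.
    """
    if ram_percent <= threshold:
        return False
    # Assumes a systemd host; other init systems need a different command.
    run(["systemctl", "restart", service], check=True)
    return True
```

Passing `run` as a parameter lets the trigger be tested without touching real services; a production version would likely also rate-limit restarts so a persistent leak does not cause a restart loop.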
Python · FastAPI · asyncio · Multiprocessing · ncurses · Pandas · psutil · uvicorn · REST API · CSV Pipeline · Subprocess Mgmt · Telegram Alerts
30+
Servers
3
Providers
0.5s
Refresh
6
Metrics
24/7
Uptime