Project Showcase — Live Infrastructure Sentinel

Live Infrastructure Monitoring Sentinel

A distributed real-time monitoring system that collects metrics from 30+ servers across multiple providers (AWS, Hetzner), visualizes health through two interfaces (a terminal TUI and a web dashboard), and alerts on anomalies, all on a 0.5s refresh cycle.

System Architecture — 3 Layers

NODES

🖥️

AWS Instances

eu-north · eu-west

🖥️

Hetzner Nodes

Dedicated Servers

🖥️

Service Nodes

Application Hosts

▲

FastAPI :880 → JSON metrics (CPU, RAM, Net, Uptime, Traffic)

▼

COLLECT

📥

Data Collectors

Multiprocessing Pool

📄

CSV Store

Per-Provider Files

CSV Read · 0.5s Refresh

▼

DISPLAY

🖥️

Terminal TUI

Curses · Color-coded

🌐

Web Dashboard

PHP · Long-Polling

🔔

Alert Engine

Threshold Rules
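
The hand-off between the COLLECT and DISPLAY layers in the diagram above is a set of per-provider CSV files that are re-read every 0.5s. A minimal sketch of that hand-off follows, assuming one file per provider and a column set taken from the metric names in the diagram. The function names, the column names, and the write-then-rename step are illustrative assumptions, and the standard `csv` module stands in for Pandas to keep the sketch dependency-free.

```python
import csv
from pathlib import Path

# Assumed column set, based on the metrics named in the diagram.
FIELDS = ["node", "cpu", "ram", "net", "uptime", "traffic"]


def write_provider_csv(directory, provider, rows):
    """Write one provider's rows to <directory>/<provider>.csv.

    The data is written to a temporary file first and then renamed, so a
    reader polling every 0.5s never sees a half-written file.
    """
    path = Path(directory) / f"{provider}.csv"
    tmp = path.with_suffix(".tmp")
    with tmp.open("w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    tmp.replace(path)  # replaces the old file in one step
    return path


def read_provider_csv(path):
    """Return the rows of a provider file as a list of dicts (values are strings)."""
    with Path(path).open(newline="") as fh:
        return list(csv.DictReader(fh))
```

A display loop would call `read_provider_csv` once per refresh tick for each provider file and convert the string values back to numbers before rendering.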

Key Features

⚡

Parallel Data Collection

A multiprocessing Pool fetches metrics from all servers in parallel rather than one at a time. Currently runs against 30+ nodes.
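
As a rough illustration of the parallel fetch described above, the sketch below maps a fetch function over a list of agent URLs with `multiprocessing.Pool`. The names `fetch_metrics` and `collect_all`, the `pool_cls` parameter, the worker count, and the timeout are assumptions made for the example, not details of the actual collectors.

```python
import json
from multiprocessing import Pool
from urllib.request import urlopen


def fetch_metrics(url, timeout=2.0):
    """Fetch one node's JSON metrics.

    Failures are returned as an error record instead of raised, so one
    unreachable node cannot abort the whole collection round.
    """
    try:
        with urlopen(url, timeout=timeout) as resp:
            return {"url": url, "ok": True, "metrics": json.load(resp)}
    except Exception as exc:
        return {"url": url, "ok": False, "error": str(exc)}


def collect_all(urls, fetch=fetch_metrics, pool_cls=Pool, workers=8):
    """Run `fetch` over all URLs in parallel; results keep the input order."""
    if not urls:
        return []
    with pool_cls(min(workers, len(urls))) as pool:
        return pool.map(fetch, urls)
```

On platforms that start worker processes with spawn rather than fork, the call to `collect_all` needs to sit under an `if __name__ == "__main__":` guard. Passing `multiprocessing.pool.ThreadPool` as `pool_cls` gives the same interface with threads, which is often sufficient for network-bound fetching.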

🖥️

Curses Terminal UI

Full-screen ncurses dashboard with color-coded health indicators, scrollable sections, and 0.5s auto-refresh.
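
The color coding and the 0.5s refresh could be wired together roughly as below. This is a sketch only: the warn and crit cut-offs, the row format, the `load_rows` callback, and quitting on `q` are all assumptions, and scrolling is left out.

```python
import curses

# Color pair numbers for each health level (assumed mapping).
LEVEL_PAIR = {"ok": 1, "warn": 2, "crit": 3}


def health_level(cpu, ram, warn=75.0, crit=90.0):
    """Classify a node by the worse of its CPU and RAM percentages."""
    worst = max(cpu, ram)
    if worst >= crit:
        return "crit"
    if worst >= warn:
        return "warn"
    return "ok"


def run_dashboard(stdscr, load_rows):
    """Redraw every 0.5s until 'q' is pressed.

    `load_rows` is a callable returning (name, cpu, ram) tuples.
    Intended to be started through curses.wrapper, which enables colors.
    """
    curses.init_pair(1, curses.COLOR_GREEN, curses.COLOR_BLACK)
    curses.init_pair(2, curses.COLOR_YELLOW, curses.COLOR_BLACK)
    curses.init_pair(3, curses.COLOR_RED, curses.COLOR_BLACK)
    stdscr.timeout(500)  # getch() waits at most 500 ms, which sets the refresh rate
    while True:
        stdscr.erase()
        height, width = stdscr.getmaxyx()
        for y, (name, cpu, ram) in enumerate(load_rows()[: height - 1]):
            line = f"{name:<20} CPU {cpu:5.1f}%  RAM {ram:5.1f}%"
            pair = LEVEL_PAIR[health_level(cpu, ram)]
            stdscr.addstr(y, 0, line[: width - 1], curses.color_pair(pair))
        stdscr.refresh()
        if stdscr.getch() == ord("q"):
            break
```

It would be launched with `curses.wrapper(run_dashboard, load_rows)`, which also restores the terminal if the loop exits with an exception.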

📊

Multi-Provider View

Unified view across AWS, Hetzner, and custom nodes — each with provider-specific health thresholds and logic.

🚨

Smart Anomaly Rules

Context-aware alerts: high CPU with low traffic signals a problem, while high CPU with high traffic is treated as normal load. This cuts down on false positives.
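
The rule above reduces to a two-signal check. A minimal sketch follows, with threshold values and label names chosen purely for illustration; the real system's per-provider thresholds are not given here.

```python
def classify(cpu_percent, traffic_mbps, cpu_high=85.0, traffic_low=5.0):
    """Combine CPU and traffic into one verdict.

    High CPU only counts as an anomaly when traffic is too low to explain it.
    Both thresholds are hypothetical defaults.
    """
    if cpu_percent < cpu_high:
        return "ok"
    if traffic_mbps < traffic_low:
        return "alert"  # busy CPU with nothing to serve: likely a problem
    return "busy"  # busy CPU under real load: expected


# Example verdicts
print(classify(95.0, 1.0))    # alert
print(classify(95.0, 200.0))  # busy
print(classify(20.0, 1.0))    # ok
```

Provider-specific behavior could be layered on by passing different `cpu_high` and `traffic_low` values per provider.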

🔄

FastAPI Resource Agent

Async agent on each node exposing CPU, RAM, network speed, traffic, and service uptime via REST endpoint.
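
One way the agent's JSON body could be assembled is sketched below. Network speed is derived from two byte-counter samples taken some time apart. In a real agent the counters would plausibly come from `psutil.net_io_counters()` and the resulting dict would be returned from a FastAPI route, but the field names, the `NetSample` type, and the function name here are assumptions, and the sketch uses only the standard library.

```python
from dataclasses import dataclass


@dataclass
class NetSample:
    """Cumulative network counters at one point in time."""
    timestamp: float  # seconds
    bytes_sent: int
    bytes_recv: int


def build_payload(prev, curr, cpu_percent, ram_percent, service_started_at):
    """Build the metrics dict from two counter samples plus CPU/RAM readings."""
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be taken at increasing timestamps")
    return {
        "cpu": cpu_percent,
        "ram": ram_percent,
        # Speed is the counter difference divided by the sampling interval.
        "net_up_bps": (curr.bytes_sent - prev.bytes_sent) / elapsed,
        "net_down_bps": (curr.bytes_recv - prev.bytes_recv) / elapsed,
        # Traffic is the cumulative total in both directions.
        "traffic_bytes": curr.bytes_sent + curr.bytes_recv,
        "uptime_s": curr.timestamp - service_started_at,
    }
```

Keeping the payload builder free of I/O makes it easy to check the arithmetic separately from the web framework.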

🛡️

Auto-Healing Triggers

Automatic service restarts when RAM exceeds 96%. Service management endpoints for remote reboot and restart.
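
The restart trigger can be pictured as a small guard around a subprocess call. In this sketch the 96% threshold comes from the description above, while the use of `systemctl`, the function name, and the injectable `run` parameter are assumptions.

```python
import subprocess

RAM_RESTART_THRESHOLD = 96.0  # percent, as stated in the feature description


def maybe_restart(ram_percent, service, run=subprocess.run,
                  threshold=RAM_RESTART_THRESHOLD):
    """Restart `service` when RAM usage exceeds the threshold.

    Returns True if a restart was issued. `run` is injectable so the
    decision logic can be exercised without touching a real service.
    Assumes a systemd host; other init systems need a different command.
    """
    if ram_percent <= threshold:
        return False
    run(["systemctl", "restart", service], check=True)
    return True
```

A production version would likely add a cooldown so that a service stuck above the threshold is not restarted on every refresh cycle.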

Tech Stack

Python · FastAPI · asyncio · Multiprocessing · ncurses · Pandas · psutil · uvicorn · REST API · CSV Pipeline · Subprocess Mgmt · Telegram Alerts

System Scale

30+

Servers

Providers

0.5s

Refresh

Metrics

24/7

Uptime