
πŸ₯ Health Monitoring Dashboard v1.2


The integrated health monitoring system provides live system metrics, component health checks, performance tracking, and alerting for CrawlLama.

🌟 Features

📊 Live System Metrics

🔍 Component Health Checks

📈 Performance Tracking

🚨 Alert System

🎨 Rich Terminal UI

🚀 Usage

Unified Health Dashboard

The Health Dashboard offers two modes in one application:

Interactive Menu:

# Windows
health-dashboard.bat

# Linux/Mac
./health-dashboard.sh

# Direct with Python
python health-dashboard.py

Direct Start Modes:

# Live System Monitor
python health-dashboard.py --monitor

# Test Dashboard
python health-dashboard.py --tests
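
For orientation, the flag handling above could be wired up roughly as follows. This is a minimal sketch using `argparse`, not the actual contents of `health-dashboard.py`; the function name `select_mode` and the returned mode strings are assumptions made for illustration.

```python
import argparse


def select_mode(argv: list[str]) -> str:
    """Return "monitor", "tests", or "menu" depending on the flags given."""
    parser = argparse.ArgumentParser(prog="health-dashboard.py")
    # The two direct start modes cannot be combined.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--monitor", action="store_true", help="live system monitor")
    group.add_argument("--tests", action="store_true", help="test dashboard")
    args = parser.parse_args(argv)

    if args.monitor:
        return "monitor"
    if args.tests:
        return "tests"
    # No flag given: fall back to the interactive menu.
    return "menu"


print(select_mode(["--monitor"]))  # prints: monitor
```

Keeping the flag parsing in a function that takes `argv` explicitly makes the dispatch easy to test without launching either dashboard.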

Mode 1: Terminal-based Live Monitoring

Real-time monitoring with the Rich terminal UI.

Mode 2: GUI Test Dashboard

Tkinter-based GUI for test management.

💻 Programmatic Usage

System Monitor

from core.health import SystemMonitor

# Initialize
monitor = SystemMonitor(update_interval=1.0)
monitor.start()

# Get metrics
metrics = monitor.get_latest_metrics()
print(f"CPU: {metrics.cpu_percent}%")
print(f"Memory: {metrics.memory_used_gb}/{metrics.memory_total_gb} GB")
print(f"Disk: {metrics.disk_percent}%")

# Stop
monitor.stop()
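
Depending on how `SystemMonitor` samples in the background, the first reading may not be available immediately after `start()`. Whether `get_latest_metrics()` returns `None` in that case is an assumption here, not documented behavior. Under that assumption, a small polling helper (the names `wait_for_metrics` and `FakeMonitor` are hypothetical) might look like this:

```python
import time


def wait_for_metrics(monitor, timeout: float = 5.0, poll_interval: float = 0.1):
    """Poll monitor.get_latest_metrics() until it returns a value or the timeout expires.

    Assumes get_latest_metrics() returns None while no sample exists yet.
    """
    deadline = time.monotonic() + timeout
    while True:
        metrics = monitor.get_latest_metrics()
        if metrics is not None:
            return metrics
        if time.monotonic() >= deadline:
            raise TimeoutError("no metrics received within timeout")
        time.sleep(poll_interval)


class FakeMonitor:
    """Stand-in for SystemMonitor that has no sample for the first two polls."""

    def __init__(self):
        self.calls = 0

    def get_latest_metrics(self):
        self.calls += 1
        return None if self.calls < 3 else {"cpu_percent": 12.5}


print(wait_for_metrics(FakeMonitor(), timeout=1.0, poll_interval=0.01))
```

With the real monitor, the call would sit between `monitor.start()` and the first `print`.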

Component Health Checker

from pathlib import Path
from core.health import ComponentHealthChecker, HealthStatus

# Initialize
checker = ComponentHealthChecker(Path.cwd())

# Check all components
health = checker.check_all()

# Display results
for name, status in health.items():
    print(f"{name}: {status.status.value} - {status.message}")
    print(f"  Response Time: {status.response_time_ms:.2f}ms")

Performance Tracker

from core.health import PerformanceTracker, PerformanceTimer

# Initialize
tracker = PerformanceTracker()

# Track operation
with PerformanceTimer(tracker, "llm_query") as timer:
    # Your operation here
    result = expensive_operation()

# Get statistics
stats = tracker.get_stats("llm_query")
print(f"Average: {stats.avg_duration_ms:.2f}ms")
print(f"P95: {stats.p95_duration_ms:.2f}ms")
print(f"Success Rate: {stats.success_rate:.1f}%")
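
The statistics printed above can be reproduced by hand from raw durations. The sketch below uses the nearest-rank definition of a percentile; which percentile method `PerformanceTracker` actually uses is not stated here, so treat `summarize` as an illustration rather than the tracker's implementation.

```python
import math


def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples <= it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(durations_ms: list[float], failures: int = 0) -> dict:
    """Average, P95, and success rate (in percent) for one operation."""
    total = len(durations_ms)
    if total == 0:
        raise ValueError("no samples")
    return {
        "avg_duration_ms": sum(durations_ms) / total,
        "p95_duration_ms": percentile(durations_ms, 95),
        "success_rate": 100.0 * (total - failures) / total,
    }


# 100 samples of 1 ms .. 100 ms, two of which failed
stats = summarize([float(ms) for ms in range(1, 101)], failures=2)
print(stats)
```

P95 is more useful than the average for alerting because a handful of very slow calls moves it noticeably, while the average can hide them.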

Alert System

from core.health import AlertSystem, AlertLevel

# Initialize
alerts = AlertSystem()

# Register alert callback
def on_alert(alert):
    print(f"[{alert.level.value}] {alert.component}: {alert.message}")

alerts.register_callback(on_alert)

# Check system data (monitor, checker, and tracker are the objects created in the examples above)
alerts.check_alerts({
    'system_metrics': monitor.get_latest_metrics(),
    'component_health': checker.check_all(),
    'performance_stats': tracker.get_all_stats()
})

# Get active alerts
active = alerts.get_alerts(unacknowledged_only=True)
for alert in active:
    print(f"{alert.level.value}: {alert.message}")
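
Following the callback pattern above, alerts can be forwarded to Python's standard `logging` module. This is a sketch: the logger name and the level mapping are assumptions, and the alert is only assumed to expose `level.value`, `component`, and `message` as used in the example above.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger("health.alerts")  # logger name is an arbitrary choice

# Assumed mapping from alert level names to logging levels
_LEVELS = {
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def log_alert(alert) -> None:
    """Alert callback that writes the alert to the logging system."""
    level_name = str(alert.level.value).upper()
    # Unknown level names fall back to INFO instead of raising.
    logger.log(_LEVELS.get(level_name, logging.INFO), "%s: %s", alert.component, alert.message)


# Stand-in alert object for demonstration; real alerts come from AlertSystem.
demo = SimpleNamespace(
    level=SimpleNamespace(value="warning"),
    component="cpu",
    message="High CPU usage: 87.5%",
)
log_alert(demo)
```

In real code the function would be registered with `alerts.register_callback(log_alert)`.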

Rich Terminal Dashboard

from pathlib import Path
from core.health import RichHealthDashboard

# Start dashboard
dashboard = RichHealthDashboard(
    project_root=Path.cwd(),
    update_interval=2.0  # Seconds
)

dashboard.start()  # Blocks until Ctrl+C

🔧 Integration in Your Code

LLM Client with Performance Tracking

from core.llm_client import LLMClient
from core.health import PerformanceTracker, PerformanceTimer

tracker = PerformanceTracker()
client = LLMClient("config.json")

# Wrapper function
def tracked_query(prompt: str):
    with PerformanceTimer(tracker, "llm_query") as timer:
        try:
            response = client.generate(prompt)
            return response
        except Exception:
            timer.mark_failure()
            raise

# Use
response = tracked_query("What is AI?")

# Display statistics
stats = tracker.get_stats("llm_query")
print(f"Average response time: {stats.avg_duration_ms:.2f}ms")

Web Search with Monitoring

from tools.web_search import web_search
from core.health import PerformanceTracker, PerformanceTimer

tracker = PerformanceTracker()

def monitored_search(query: str):
    with PerformanceTimer(tracker, "web_search"):
        return web_search(query)

# Use
results = monitored_search("Python tutorials")
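
When many functions need the same wrapper, a decorator avoids repeating the `with` block. The sketch below is self-contained and uses a plain dictionary as a stand-in recorder instead of `PerformanceTracker`; the names `timed`, `RECORDED`, and `fake_search` are hypothetical. In real code the body of `wrapper` would use `PerformanceTimer` as in the examples above.

```python
import functools
import time
from collections import defaultdict

# Stand-in for a tracker: operation name -> list of (duration_ms, succeeded)
RECORDED: dict[str, list[tuple[float, bool]]] = defaultdict(list)


def timed(operation: str):
    """Decorator that records the duration and outcome of each call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            succeeded = False
            try:
                result = func(*args, **kwargs)
                succeeded = True
                return result
            finally:
                # Runs on success and on exceptions; the exception still propagates.
                duration_ms = (time.perf_counter() - start) * 1000.0
                RECORDED[operation].append((duration_ms, succeeded))
        return wrapper
    return decorator


@timed("web_search")
def fake_search(query: str) -> list[str]:
    # Placeholder for a real search call
    return [f"result for {query}"]


print(fake_search("Python tutorials"))
print(len(RECORDED["web_search"]), "call(s) recorded")
```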

📋 Alert Rules

Default Rules

| Rule             | Threshold | Level    | Description            |
|------------------|-----------|----------|------------------------|
| CPU Warning      | 85%       | WARNING  | High CPU usage         |
| CPU Error        | 95%       | ERROR    | Critical CPU usage     |
| Memory Warning   | 85%       | WARNING  | High RAM usage         |
| Memory Error     | 95%       | ERROR    | Critical RAM usage     |
| Disk Warning     | 5 GB free | WARNING  | Low storage space      |
| Disk Critical    | 1 GB free | CRITICAL | Very low storage space |
| Component Health | Unhealthy | ERROR    | Component failure      |
| Performance      | P95 > 5s  | WARNING  | Slow performance       |
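
As a worked illustration of the table, the CPU, memory, and disk thresholds can be expressed as simple comparisons. This sketch hard-codes the default values listed above; the function names are hypothetical, and whether the built-in rules include the boundary value itself is an assumption.

```python
from typing import Optional


def usage_level(percent: float, warn: float = 85.0, error: float = 95.0) -> Optional[str]:
    """Classify CPU or memory usage (in percent) against the default thresholds."""
    if percent >= error:
        return "ERROR"
    if percent >= warn:
        return "WARNING"
    return None


def disk_level(free_gb: float, warn_gb: float = 5.0, critical_gb: float = 1.0) -> Optional[str]:
    """Classify free disk space (in GB) against the default thresholds."""
    if free_gb <= critical_gb:
        return "CRITICAL"
    if free_gb <= warn_gb:
        return "WARNING"
    return None


print(usage_level(87.5))  # prints: WARNING
print(disk_level(0.5))    # prints: CRITICAL
```

Note the direction of the comparison: usage alerts fire when the value rises above a threshold, while disk alerts fire when free space drops below one.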

Custom Alert Rules

from core.health import AlertRule, AlertLevel

class CustomAlertRule(AlertRule):
    def __init__(self):
        super().__init__(
            name="Custom Rule",
            level=AlertLevel.WARNING,
            cooldown_minutes=10
        )

    def check(self, data: dict) -> str | None:
        # Your custom logic: replace this placeholder with a real test on `data`
        some_condition = False
        if some_condition:
            return "Custom alert message"
        return None

# Add rule
alerts.add_rule(CustomAlertRule())
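
The `cooldown_minutes` argument suggests that a rule which keeps matching does not fire on every check. How `AlertRule` implements this is not shown here; the following self-contained sketch (the class `Cooldown` is hypothetical) demonstrates one common way to suppress repeats within a time window.

```python
import time
from typing import Callable, Optional


class Cooldown:
    """Allow an event at most once per cooldown window."""

    def __init__(self, minutes: float, clock: Callable[[], float] = time.monotonic):
        self._window_s = minutes * 60.0
        self._clock = clock  # injectable so the logic can be tested without waiting
        self._last_fired: Optional[float] = None

    def ready(self) -> bool:
        """Return True and start a new window if the cooldown has elapsed."""
        now = self._clock()
        if self._last_fired is None or now - self._last_fired >= self._window_s:
            self._last_fired = now
            return True
        return False


cooldown = Cooldown(minutes=10)
print(cooldown.ready())  # prints: True  (first event always passes)
print(cooldown.ready())  # prints: False (still inside the 10-minute window)
```

Using a monotonic clock keeps the window stable even if the system time is adjusted while the dashboard is running.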

Production Environment

Development Environment

πŸ› Troubleshooting

Dashboard Won’t Start

Problem: ModuleNotFoundError: No module named 'rich'

Solution:

pip install rich psutil

No System Metrics

Problem: Metrics not displayed

Solution: Ensure psutil is installed:

pip install psutil

Component Checks Fail

Problem: All components show "Unhealthy"

Solution:

  1. Check config.json
  2. Ensure all directories exist:
    mkdir -p data/cache data/embeddings logs
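
The `mkdir -p` command above is written for a Unix shell. A platform-independent equivalent can be sketched in Python with `pathlib`; the helper name `ensure_dirs` is hypothetical, and the directory list is taken from the command above.

```python
import tempfile
from pathlib import Path

REQUIRED_DIRS = ("data/cache", "data/embeddings", "logs")


def ensure_dirs(root: Path) -> list[Path]:
    """Create the required directories under root if they do not exist yet."""
    created = []
    for rel in REQUIRED_DIRS:
        path = root / rel
        # parents=True creates intermediate directories, exist_ok=True makes reruns safe.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


# Demo in a temporary directory; pass the project root in practice.
with tempfile.TemporaryDirectory() as tmp:
    for path in ensure_dirs(Path(tmp)):
        print(path.relative_to(tmp).as_posix(), path.is_dir())
```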
    

Performance Data Missing

Problem: No performance statistics

Solution: Integrate PerformanceTimer in your code (see examples above)

📊 Dashboard Layout

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ 🦙 CrawlLama Health Dashboard | 2025-10-24 14:30:00      ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

┌─────────────────────────┬─────────────────────────┐
│ 📊 System Metrics       │ 📈 Performance          │
│                         │                         │
│ CPU      45.2%  ████░░  │ llm_query   1250ms  ✓   │
│ Memory   62.1%  ██████░ │ web_search   850ms  ✓   │
│ Disk     38.5%  ███░░░  │ cache_read    25ms  ✓   │
│ Network  ↓1.2/↑0.3 MB/s │                         │
├─────────────────────────┤                         │
│ 🔍 Component Health     │                         │
│                         │                         │
│ LLM Client      ✓ 45ms  │                         │
│ Cache System    ✓ 12ms  │                         │
│ RAG System      ✓ 89ms  │                         │
│ Search Tools    ✓ 23ms  │                         │
└─────────────────────────┴─────────────────────────┘
┌───────────────────────────────────────────────────┐
│ 🚨 Alerts (2)                                     │
│                                                   │
│ 🟡 High CPU usage: 87.5% (threshold: 85.0%)       │
│ 🟠 Slow operations: llm_query (P95: 5200ms)       │
└───────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────┐
│ Alerts: 🔴 0 🟠 1 🟡 1 | Press Ctrl+C to exit     │
└───────────────────────────────────────────────────┘

πŸ” Best Practices

  1. Regular Monitoring: Start the dashboard during development
  2. Performance Integration: Use PerformanceTimer for critical operations
  3. Alert Callbacks: Implement logging or notifications
  4. Adjust Thresholds: Adapt alerts to your environment
  5. Historical Data: Regularly export performance statistics
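
For the last point, a periodic export can be as simple as writing a timestamped JSON file. The sketch below assumes that `tracker.get_all_stats()` returns a mapping from operation name to a stats object with the attributes used in the examples in this document; that return shape is an assumption, and `build_snapshot` and `export_stats` are hypothetical helper names.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from types import SimpleNamespace

# Attribute names taken from the examples in this document.
FIELDS = ("avg_duration_ms", "p95_duration_ms", "success_rate")


def build_snapshot(all_stats: dict) -> dict:
    """Convert per-operation stats objects into a JSON-serializable dict."""
    return {
        "exported_at": datetime.now(timezone.utc).isoformat(),
        "operations": {
            name: {field: getattr(stats, field) for field in FIELDS}
            for name, stats in all_stats.items()
        },
    }


def export_stats(all_stats: dict, path: Path) -> None:
    """Write a timestamped JSON snapshot of the statistics to path."""
    path.write_text(json.dumps(build_snapshot(all_stats), indent=2), encoding="utf-8")


# Demo with a stand-in stats object; in real code pass tracker.get_all_stats().
demo_stats = {
    "llm_query": SimpleNamespace(
        avg_duration_ms=1250.0, p95_duration_ms=5200.0, success_rate=99.5
    )
}
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "performance.json"
    export_stats(demo_stats, target)
    print(target.read_text(encoding="utf-8"))
```

Including the export time in each file makes it possible to compare snapshots across runs later.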

πŸ“ Changelog

v1.2.0 (2025-10-24)

v1.0.0

📚 Further Resources