Iliyan Angelov 6b247e5b9f Updates

2025-09-19 11:58:53 +03:00

11 KiB

ETB-API Monitoring System Deployment Guide

Overview

This guide walks through deploying the monitoring system for the ETB-API platform. The system runs health checks, collects metrics, evaluates alert rules, and generates status reports, all as periodic Celery tasks.

Prerequisites

System Requirements

Python 3.10+ (the minimum version supported by Django 5.2)
Django 5.2+
PostgreSQL 12+ (recommended) or SQLite (development)
Redis 6+ (for Celery)
Celery 5.3+

Dependencies

psutil>=5.9.0
requests>=2.31.0
celery>=5.3.0
redis>=4.5.0
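
Before continuing, it can help to confirm that the active environment meets these minimums. The following is a minimal sketch using only the standard library; the `parse` and `check` helpers are illustrative names, not part of ETB-API, and the version comparison is deliberately simplified (it ignores pre-release suffixes).

```python
from importlib import metadata

# Minimum versions copied from the dependency list above.
MINIMUMS = {
    "psutil": "5.9.0",
    "requests": "2.31.0",
    "celery": "5.3.0",
    "redis": "4.5.0",
}


def parse(version):
    """Turn '5.9.0' into (5, 9, 0); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check(minimums=MINIMUMS):
    """Return {package: problem} for every missing or outdated package."""
    problems = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if parse(installed) < parse(minimum):
            problems[name] = f"{installed} < {minimum}"
    return problems


if __name__ == "__main__":
    for name, problem in check().items():
        print(f"{name}: {problem}")
```

An empty result from `check()` means every listed package is present at or above its minimum.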

Installation Steps

1. Install Dependencies

# Install Python dependencies
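# (quote each requirement so the shell does not treat ">" as output redirection)
pip install "psutil>=5.9.0" "requests>=2.31.0" "celery>=5.3.0" "redis>=4.5.0"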

# Install Redis (Ubuntu/Debian)
sudo apt-get install redis-server

# Install Redis (CentOS/RHEL)
sudo yum install redis

2. Database Setup

# Create and run migrations
python manage.py makemigrations monitoring
python manage.py migrate

# Create superuser (if not exists)
python manage.py createsuperuser

3. Initialize Monitoring Configuration

# Set up default monitoring targets, metrics, and alert rules
python manage.py setup_monitoring --admin-user admin

4. Configure Celery

Create or update celery.py in your project:

import os

from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'core.settings')

app = Celery('core')
app.config_from_object('django.conf:settings', namespace='CELERY')

# Add monitoring tasks schedule
app.conf.beat_schedule = {
    'health-checks': {
        'task': 'monitoring.tasks.execute_health_checks',
        'schedule': 60.0,  # Every minute
    },
    'metrics-collection': {
        'task': 'monitoring.tasks.collect_metrics',
        'schedule': 300.0,  # Every 5 minutes
    },
    'alert-evaluation': {
        'task': 'monitoring.tasks.evaluate_alerts',
        'schedule': 60.0,  # Every minute
    },
    'data-cleanup': {
        'task': 'monitoring.tasks.cleanup_old_data',
        'schedule': 86400.0,  # Daily
    },
    'system-status-report': {
        'task': 'monitoring.tasks.generate_system_status_report',
        'schedule': 300.0,  # Every 5 minutes
    },
}

app.autodiscover_tasks()

5. Environment Configuration

Create a .env file or set the equivalent environment variables:

# Monitoring Settings
MONITORING_ENABLED=true
MONITORING_HEALTH_CHECK_INTERVAL=60
MONITORING_METRICS_COLLECTION_INTERVAL=300
MONITORING_ALERT_EVALUATION_INTERVAL=60

# Alerting Settings
ALERTING_EMAIL_FROM=monitoring@yourcompany.com
ALERTING_SLACK_WEBHOOK_URL=https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK
ALERTING_WEBHOOK_URL=https://your-webhook-url.com/alerts

# Performance Thresholds
PERFORMANCE_API_RESPONSE_THRESHOLD=2000
PERFORMANCE_CPU_THRESHOLD=80
PERFORMANCE_MEMORY_THRESHOLD=80
PERFORMANCE_DISK_THRESHOLD=80

# Email Configuration (for alerts)
EMAIL_HOST=smtp.gmail.com
EMAIL_PORT=587
EMAIL_USE_TLS=True
EMAIL_HOST_USER=your-email@gmail.com
EMAIL_HOST_PASSWORD=your-app-password
DEFAULT_FROM_EMAIL=monitoring@yourcompany.com
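
The `beat_schedule` in step 4 hard-codes its intervals, and this guide does not show how the project reads the `MONITORING_*_INTERVAL` variables. One possible way to connect the two is sketched below. The `build_beat_schedule` helper is hypothetical, and it assumes the interval values are in seconds.

```python
import os

# Task paths are the ones used in the celery.py example; the defaults mirror
# the values shown in the .env sample. Entries: (task, variable, default).
INTERVAL_SETTINGS = {
    "health-checks": (
        "monitoring.tasks.execute_health_checks",
        "MONITORING_HEALTH_CHECK_INTERVAL",
        60.0,
    ),
    "metrics-collection": (
        "monitoring.tasks.collect_metrics",
        "MONITORING_METRICS_COLLECTION_INTERVAL",
        300.0,
    ),
    "alert-evaluation": (
        "monitoring.tasks.evaluate_alerts",
        "MONITORING_ALERT_EVALUATION_INTERVAL",
        60.0,
    ),
}


def build_beat_schedule(env=None):
    """Build beat_schedule entries from interval variables (assumed seconds)."""
    env = os.environ if env is None else env
    schedule = {}
    for entry, (task, variable, default) in INTERVAL_SETTINGS.items():
        seconds = float(env.get(variable, default))
        if seconds <= 0:
            raise ValueError(f"{variable} must be positive, got {seconds}")
        schedule[entry] = {"task": task, "schedule": seconds}
    return schedule
```

In `celery.py`, the result could be merged into the existing schedule with `app.conf.beat_schedule.update(build_beat_schedule())`, leaving the daily cleanup and status report entries as they are.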

6. Start Services

# Start Redis first (if not running as service); Celery needs the broker
redis-server

# Start Django development server (in separate terminal)
python manage.py runserver

# Start Celery worker (in separate terminal)
celery -A core worker -l info

# Start Celery beat scheduler (in separate terminal)
celery -A core beat -l info

Production Deployment

1. Database Configuration

For production, use PostgreSQL:

# settings.py
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'etb_api_monitoring',
        'USER': 'monitoring_user',
        'PASSWORD': 'secure_password',
        'HOST': 'localhost',
        'PORT': '5432',
    }
}
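
Hard-coding the password in settings.py is convenient for an example but risky in a real deployment. A minimal sketch of reading the same settings from the environment follows. The `DB_*` variable names and the `database_config` helper are assumptions made for illustration; they are not defined anywhere in this guide.

```python
import os


def database_config(env=None):
    """Return a Django DATABASES dict built from environment variables.

    The DB_* variable names are hypothetical; rename them to suit your setup.
    """
    env = os.environ if env is None else env
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": env.get("DB_NAME", "etb_api_monitoring"),
            "USER": env.get("DB_USER", "monitoring_user"),
            # No default: fail loudly rather than start with a blank password.
            "PASSWORD": env["DB_PASSWORD"],
            "HOST": env.get("DB_HOST", "localhost"),
            "PORT": env.get("DB_PORT", "5432"),
        }
    }
```

In settings.py this would be used as `DATABASES = database_config()`, which raises `KeyError` at startup if `DB_PASSWORD` is not set.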

2. Redis Configuration

# settings.py
CELERY_BROKER_URL = 'redis://localhost:6379/0'
CELERY_RESULT_BACKEND = 'redis://localhost:6379/0'

3. Static Files and Media

# Collect static files
python manage.py collectstatic

# Configure the web server (example Nginx server block; this belongs in the Nginx site configuration, not the shell)
server {
    listen 80;
    server_name your-domain.com;
    
    location /static/ {
        alias /path/to/your/static/files/;
    }
    
    location /media/ {
        alias /path/to/your/media/files/;
    }
    
    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

4. Process Management

Use systemd services for production:

Django Service (/etc/systemd/system/etb-api.service):

[Unit]
Description=ETB-API Django Application
After=network.target

[Service]
Type=simple
User=www-data
Group=www-data
WorkingDirectory=/path/to/etb-api
Environment=PATH=/path/to/etb-api/venv/bin
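# runserver is Django's development server and is not meant for production.
# This line assumes Gunicorn is installed in the virtualenv (pip install gunicorn)
# and that the project exposes the standard core/wsgi.py module.
ExecStart=/path/to/etb-api/venv/bin/gunicorn core.wsgi:application --bind 127.0.0.1:8000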
Restart=always

[Install]
WantedBy=multi-user.target

Celery Worker Service (/etc/systemd/system/etb-celery.service):

[Unit]
Description=ETB-API Celery Worker
After=network.target

[Service]
Type=simple
User=www-data
Group=www-data
WorkingDirectory=/path/to/etb-api
Environment=PATH=/path/to/etb-api/venv/bin
ExecStart=/path/to/etb-api/venv/bin/celery -A core worker -l info
Restart=always

[Install]
WantedBy=multi-user.target

Celery Beat Service (/etc/systemd/system/etb-celery-beat.service):

[Unit]
Description=ETB-API Celery Beat Scheduler
After=network.target

[Service]
Type=simple
User=www-data
Group=www-data
WorkingDirectory=/path/to/etb-api
Environment=PATH=/path/to/etb-api/venv/bin
ExecStart=/path/to/etb-api/venv/bin/celery -A core beat -l info
Restart=always

[Install]
WantedBy=multi-user.target

5. Enable Services

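# Reload systemd so it picks up the new unit files
sudo systemctl daemon-reload

# Enable services at boot (the Redis unit is named redis-server on Debian/Ubuntu)
sudo systemctl enable redis
sudo systemctl enable etb-api
sudo systemctl enable etb-celery
sudo systemctl enable etb-celery-beat

# Start Redis first, since the Celery services need the broker
sudo systemctl start redis
sudo systemctl start etb-api
sudo systemctl start etb-celery
sudo systemctl start etb-celery-beat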

Monitoring Configuration

1. Customize Monitoring Targets

Access the admin interface at http://your-domain.com/admin/monitoring/ to:

Add custom monitoring targets
Configure health check intervals
Set up external service monitoring
Customize alert thresholds

2. Configure Alert Rules

Create alert rules for:

Performance Alerts: High response times, error rates
Business Alerts: SLA breaches, incident volume spikes
Security Alerts: Failed logins, security events
Infrastructure Alerts: High CPU, memory, disk usage
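
As an illustration of how an infrastructure alert rule might compare readings against the `PERFORMANCE_*` thresholds from the environment configuration, here is a minimal sketch. It is not the evaluation logic of the monitoring app, which this guide does not show. The `load_thresholds` and `breaches` helpers are hypothetical, and the thresholds are assumed to be percentages.

```python
import os

# Metric name -> (environment variable, default). Defaults mirror the
# PERFORMANCE_* values in the .env sample and are assumed to be percentages.
DEFAULT_THRESHOLDS = {
    "cpu": ("PERFORMANCE_CPU_THRESHOLD", 80.0),
    "memory": ("PERFORMANCE_MEMORY_THRESHOLD", 80.0),
    "disk": ("PERFORMANCE_DISK_THRESHOLD", 80.0),
}


def load_thresholds(env=None):
    """Read each threshold from the environment, falling back to the default."""
    env = os.environ if env is None else env
    return {
        name: float(env.get(variable, default))
        for name, (variable, default) in DEFAULT_THRESHOLDS.items()
    }


def breaches(sample, thresholds):
    """Return the sorted names of metrics whose reading exceeds its threshold."""
    return sorted(
        name
        for name, limit in thresholds.items()
        if sample.get(name, 0.0) > limit
    )
```

The `sample` dictionary could be filled from psutil, for example with `psutil.cpu_percent()`, `psutil.virtual_memory().percent`, and `psutil.disk_usage("/").percent`.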

3. Set Up Notification Channels

Configure notification channels:

Email: Set up SMTP configuration
Slack: Configure webhook URLs
Webhooks: Set up external alerting systems
PagerDuty: Integrate with incident management

4. Create Custom Dashboards

Design dashboards for different user roles:

Executive Dashboard: High-level KPIs and trends
Operations Dashboard: Real-time system status
Security Dashboard: Security metrics and alerts
Development Dashboard: Application performance metrics

Verification

1. Check System Health

# Check health check summary
curl -H "Authorization: Token your-token" \
     http://localhost:8000/api/monitoring/health-checks/summary/

# Check system overview
curl -H "Authorization: Token your-token" \
     http://localhost:8000/api/monitoring/overview/
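
The same checks can be scripted, for example as a post-deployment smoke test. The sketch below uses only the standard library and the endpoints and Token header from the curl commands above. The helper names are illustrative, and it assumes the API responds with JSON.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def build_request(path, token, base_url=BASE_URL):
    """Create a GET request carrying the same Token header as the curl calls."""
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        headers={"Authorization": f"Token {token}"},
    )


def fetch_json(path, token, timeout=5.0):
    """Send the request and decode the body, assuming the API returns JSON."""
    request = build_request(path, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    for path in (
        "/api/monitoring/health-checks/summary/",
        "/api/monitoring/overview/",
    ):
        status, _body = fetch_json(path, "your-token")
        print(path, status)
```

`urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses, so an invalid token surfaces as an exception rather than a status code.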

2. Verify Celery Tasks

# Check Celery worker status
celery -A core inspect active

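# List tasks with an ETA or countdown held by workers
# (this does not display the beat schedule; check the beat log for periodic tasks)
celery -A core inspect scheduled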

3. Test Alerting

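# Trigger a test alert
python manage.py shell
>>> from monitoring.models import AlertRule
>>> from monitoring.tasks import evaluate_alerts
>>> rule = AlertRule.objects.first()  # confirm that at least one rule exists
>>> evaluate_alerts.delay()  # queue an evaluation pass instead of waiting for beat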

Maintenance

1. Data Cleanup

The cleanup_old_data task runs daily through Celery beat. To queue it manually (a worker must be running to pick it up):

python manage.py shell
>>> from monitoring.tasks import cleanup_old_data
>>> cleanup_old_data.delay()

2. Performance Tuning

Monitor and tune:

Health check intervals
Metrics collection frequency
Alert evaluation intervals
Data retention periods
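
When tuning the retention period, it helps to estimate how many records a given setting would remove. The sketch below shows the underlying arithmetic only; the actual behaviour of `cleanup_old_data` is not documented here, and both helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def retention_cutoff(now, retention_days):
    """Records older than the returned timestamp fall outside the window."""
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    return now - timedelta(days=retention_days)


def count_expired(timestamps, now, retention_days):
    """Count how many timestamps a cleanup pass with this window would remove."""
    cutoff = retention_cutoff(now, retention_days)
    return sum(1 for ts in timestamps if ts < cutoff)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    ages_in_days = (1, 10, 31, 90)
    stamps = [now - timedelta(days=d) for d in ages_in_days]
    for days in (7, 30):
        print(f"{days}-day retention removes {count_expired(stamps, now, days)} records")
```

Shorter windows reduce table size and query time at the cost of losing historical trends, so the value is best chosen alongside the weekly and monthly reviews described later.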

3. Scaling

For high-volume environments:

Use multiple Celery workers
Implement Redis clustering
Use database read replicas
Consider time-series databases for metrics

Troubleshooting

Common Issues

Health Checks Failing

# Check logs
tail -f /var/log/etb-api.log

# Test individual targets
python manage.py shell
>>> from monitoring.services.health_checks import HealthCheckService
>>> service = HealthCheckService()
>>> service.execute_all_health_checks()

Celery Tasks Not Running

# Check Celery status
celery -A core inspect active

# Check Redis connection
redis-cli ping

# Restart services
sudo systemctl restart etb-celery
sudo systemctl restart etb-celery-beat
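
If redis-cli is not installed on the host running the check, the same reachability test can be done from Python. This is a minimal sketch that sends an inline PING over a plain socket. It assumes Redis listens on localhost:6379 without a password or TLS; a server that requires authentication answers with an error, which this helper reports as False.

```python
import socket


def redis_ping(host="localhost", port=6379, timeout=2.0):
    """Return True if the server answers an inline PING with +PONG."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")
            reply = conn.recv(64)
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
    return reply.startswith(b"+PONG")


if __name__ == "__main__":
    print("Redis reachable:", redis_ping())
```

A False result with Redis running usually points to a firewall rule, a non-default port, or a bind address that excludes the calling host.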

Alerts Not Sending

# Check email configuration
python manage.py shell
>>> from django.core.mail import send_mail
>>> send_mail('Test', 'Test message', 'from@example.com', ['to@example.com'])

# Check Slack webhook
curl -X POST -H 'Content-type: application/json' \
     --data '{"text":"Test message"}' \
     YOUR_SLACK_WEBHOOK_URL
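
The Slack check can also be run from Python, which is useful when debugging from inside the application environment. The sketch below builds the same JSON POST as the curl command using only the standard library. The function names are illustrative, and the webhook URL is read from `ALERTING_SLACK_WEBHOOK_URL` as defined in the environment configuration.

```python
import json
import os
import urllib.request


def build_slack_request(webhook_url, text):
    """Build the JSON POST that the curl command above sends."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_slack_message(webhook_url, text, timeout=5.0):
    """Send the message and return the HTTP status code."""
    request = build_slack_request(webhook_url, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    url = os.environ["ALERTING_SLACK_WEBHOOK_URL"]
    print(send_slack_message(url, "Test message"))
```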

Log Locations

Django logs: /var/log/etb-api.log
Celery logs: /var/log/celery.log
Nginx logs: /var/log/nginx/
System logs: /var/log/syslog

Security Considerations

1. Authentication

Use strong authentication tokens
Implement token rotation
Use HTTPS in production
Restrict admin access

2. Data Protection

Encrypt sensitive configuration data
Use secure database connections
Implement data retention policies
Run regular security audits

3. Network Security

Use firewalls to restrict access
Implement rate limiting
Monitor for suspicious activity
Apply security updates regularly

Backup and Recovery

1. Database Backup

# PostgreSQL backup
pg_dump etb_api_monitoring > backup_$(date +%Y%m%d_%H%M%S).sql

# Automated backup script
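#!/bin/bash
# Stop on the first error so a failed dump is not followed by pruning old backups
set -euo pipefail
BACKUP_DIR="/backups/monitoring"
DATE=$(date +%Y%m%d_%H%M%S)
mkdir -p "$BACKUP_DIR"
pg_dump etb_api_monitoring > "$BACKUP_DIR/backup_$DATE.sql"
# Delete backups older than 7 days
find "$BACKUP_DIR" -name "backup_*.sql" -mtime +7 -delete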

2. Configuration Backup

# Backup configuration files
tar -czf monitoring_config_$(date +%Y%m%d).tar.gz \
    /path/to/etb-api/core/settings.py \
    /path/to/etb-api/.env \
    /etc/systemd/system/etb-*.service

3. Recovery Procedures

Restore database from backup
Restore configuration files
Restart services
Verify monitoring functionality
Check alert rules and thresholds

Support and Maintenance

Regular Tasks

Daily: Check system health and alerts
Weekly: Review metrics trends and thresholds
Monthly: Update monitoring configuration
Quarterly: Review and optimize performance

Monitoring the Monitor

Set up external monitoring for the monitoring system
Monitor Celery worker health
Track database performance
Monitor disk space usage
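
One simple form of external monitoring is a staleness check: a separate process confirms that a periodic task has reported recently. The sketch below shows the comparison only. How the last run time is obtained (a database row, a file timestamp, an API field) depends on your setup, and `is_stale` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def is_stale(last_run, now, interval_seconds, missed_runs_allowed=2):
    """Return True when a periodic task has missed more runs than allowed.

    With a 60-second interval and two missed runs allowed, the task is
    considered stale once more than 180 seconds have passed.
    """
    allowed_gap = timedelta(seconds=interval_seconds * (missed_runs_allowed + 1))
    return now - last_run > allowed_gap


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    last_health_check = now - timedelta(seconds=240)  # example value
    print("health checks stale:", is_stale(last_health_check, now, 60))
```

Running this check from a different host, or from cron rather than Celery, avoids the case where a stopped worker also silences the check that should report it.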

This guide covers a baseline deployment of monitoring for ETB-API. Adjust the intervals, thresholds, and service definitions to match your requirements and infrastructure.
