# ETB-API Monitoring System Deployment Guide

## Overview

This guide walks through deploying the monitoring system for the ETB-API platform: installing dependencies, configuring the Celery-driven health checks, metrics collection, and alert evaluation, and running the services in production.

## Prerequisites

### System Requirements
- Python 3.10+ (required by Django 5.2)
- Django 5.2+
- PostgreSQL 14+ (recommended; the minimum version Django 5.2 supports) or SQLite (development)
- Redis 6+ (for Celery)
- Celery 5.3+

### Dependencies
- psutil>=5.9.0
- requests>=2.31.0
- celery>=5.3.0
- redis>=4.5.0

## Installation Steps

### 1. Install Dependencies

```bash
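# Install Python dependencies
# (quote each specifier so the shell does not treat ">" as an output redirect)
pip install "psutil>=5.9.0" "requests>=2.31.0" "celery>=5.3.0" "redis>=4.5.0"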

# Install Redis (Ubuntu/Debian)
sudo apt-get install redis-server

# Install Redis (CentOS/RHEL)
sudo yum install redis
```
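Once the packages are installed, a short script can confirm that the installed versions meet the minimums listed under Dependencies. The following is a minimal sketch using only the standard library; the helper names `version_tuple` and `check_dependencies` are illustrative, and the version parsing is deliberately simplified (pre-release suffixes such as `rc1` are ignored).

```python
from importlib import metadata

# Minimum versions copied from the Dependencies list in this guide.
MINIMUMS = {
    "psutil": "5.9.0",
    "requests": "2.31.0",
    "celery": "5.3.0",
    "redis": "4.5.0",
}


def version_tuple(version: str) -> tuple:
    """Turn '5.9.0' into (5, 9, 0), ignoring suffixes and padding to three parts."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_dependencies(minimums: dict) -> dict:
    """Return {package: problem} for every missing or outdated package."""
    problems = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if version_tuple(installed) < version_tuple(minimum):
            problems[name] = f"{installed} is older than {minimum}"
    return problems


if __name__ == "__main__":
    for package, problem in check_dependencies(MINIMUMS).items():
        print(f"{package}: {problem}")
```

An empty result from `check_dependencies(MINIMUMS)` means every listed package is present at an acceptable version.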

### 2. Database Setup

```bash
# Create and run migrations
python manage.py makemigrations monitoring
python manage.py migrate

# Create superuser (if not exists)
python manage.py createsuperuser
```

### 3. Initialize Monitoring Configuration

```bash
# Set up default monitoring targets, metrics, and alert rules
python manage.py setup_monitoring --admin-user admin
```

### 4. Configure Celery

Create or update `celery.py` in your project:

```python
import os

from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'core.settings')

app = Celery('core')
app.config_from_object('django.conf:settings', namespace='CELERY')

# Add monitoring tasks schedule
app.conf.beat_schedule = {
    'health-checks': {
        'task': 'monitoring.tasks.execute_health_checks',
        'schedule': 60.0,  # Every minute
    },
    'metrics-collection': {
        'task': 'monitoring.tasks.collect_metrics',
        'schedule': 300.0,  # Every 5 minutes
    },
    'alert-evaluation': {
        'task': 'monitoring.tasks.evaluate_alerts',
        'schedule': 60.0,  # Every minute
    },
    'data-cleanup': {
        'task': 'monitoring.tasks.cleanup_old_data',
        'schedule': 86400.0,  # Daily
    },
    'system-status-report': {
        'task': 'monitoring.tasks.generate_system_status_report',
        'schedule': 300.0,  # Every 5 minutes
    },
}

app.autodiscover_tasks()
```

### 5. Environment Configuration

Create a `.env` file or set the following environment variables:

```bash
# Monitoring Settings
MONITORING_ENABLED=true
MONITORING_HEALTH_CHECK_INTERVAL=60
MONITORING_METRICS_COLLECTION_INTERVAL=300
MONITORING_ALERT_EVALUATION_INTERVAL=60

# Alerting Settings
ALERTING_EMAIL_FROM=monitoring@yourcompany.com
ALERTING_SLACK_WEBHOOK_URL=https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK
ALERTING_WEBHOOK_URL=https://your-webhook-url.com/alerts

# Performance Thresholds
PERFORMANCE_API_RESPONSE_THRESHOLD=2000
PERFORMANCE_CPU_THRESHOLD=80
PERFORMANCE_MEMORY_THRESHOLD=80
PERFORMANCE_DISK_THRESHOLD=80

# Email Configuration (for alerts)
EMAIL_HOST=smtp.gmail.com
EMAIL_PORT=587
EMAIL_USE_TLS=True
EMAIL_HOST_USER=your-email@gmail.com
EMAIL_HOST_PASSWORD=your-app-password
DEFAULT_FROM_EMAIL=monitoring@yourcompany.com
```
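The `celery.py` example in step 4 hard-codes its intervals, so the `MONITORING_*_INTERVAL` variables only take effect if something reads them. One possible way to connect the two is sketched below. The task names and variable names come from this guide; the `build_beat_schedule` helper and the idea of deriving the schedule from the environment are assumptions, not something the monitoring app is known to do.

```python
import os

# name -> (task path, environment variable, default interval in seconds).
# Task paths and variable names are taken from this guide; wiring them
# together like this is an assumption.
INTERVAL_SETTINGS = {
    "health-checks": (
        "monitoring.tasks.execute_health_checks",
        "MONITORING_HEALTH_CHECK_INTERVAL",
        60.0,
    ),
    "metrics-collection": (
        "monitoring.tasks.collect_metrics",
        "MONITORING_METRICS_COLLECTION_INTERVAL",
        300.0,
    ),
    "alert-evaluation": (
        "monitoring.tasks.evaluate_alerts",
        "MONITORING_ALERT_EVALUATION_INTERVAL",
        60.0,
    ),
}


def build_beat_schedule(env=None) -> dict:
    """Build Celery beat entries, taking each interval from the environment."""
    env = os.environ if env is None else env
    schedule = {}
    for name, (task, variable, default) in INTERVAL_SETTINGS.items():
        seconds = float(env.get(variable, default))
        if seconds <= 0:
            raise ValueError(f"{variable} must be positive, got {seconds}")
        schedule[name] = {"task": task, "schedule": seconds}
    return schedule
```

In `celery.py`, the result could be merged into the existing schedule with `app.conf.beat_schedule.update(build_beat_schedule())`, leaving the fixed daily cleanup entry untouched.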

### 6. Start Services

```bash
# Start Django development server
python manage.py runserver

# Start Celery worker (in separate terminal)
celery -A core worker -l info

# Start Celery beat scheduler (in separate terminal)
celery -A core beat -l info

# Start Redis (if not running as service)
redis-server
```
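The Celery worker and beat processes need the Redis broker to be reachable, so it can help to check Redis before starting them. The sketch below sends a `PING` over a plain socket using the Redis inline command form and looks for `+PONG`. The function names are illustrative, and a Redis instance that requires authentication will answer with an error instead, which this check reports as `False`.

```python
import socket


def is_pong(reply: bytes) -> bool:
    """Return True if the reply is Redis's simple-string answer to PING."""
    return reply.startswith(b"+PONG")


def redis_ping(host: str = "localhost", port: int = 6379, timeout: float = 2.0) -> bool:
    """Return True if a Redis server at host:port answers PING with +PONG."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")  # inline command, no client library needed
            reply = conn.recv(64)
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
    return is_pong(reply)


if __name__ == "__main__":
    print("Redis reachable:", redis_ping())
```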

## Production Deployment

### 1. Database Configuration

For production, use PostgreSQL:

```python
# settings.py
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'etb_api_monitoring',
        'USER': 'monitoring_user',
        'PASSWORD': 'secure_password',
        'HOST': 'localhost',
        'PORT': '5432',
    }
}
```
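To keep the database password out of `settings.py`, the same dictionary can be built from environment variables. This is a sketch only: the `DB_*` variable names and the `database_settings` helper are hypothetical and do not appear in the `.env` example in this guide.

```python
import os


def database_settings(env=None) -> dict:
    """Build Django's DATABASES dict from environment variables.

    The DB_* names are illustrative. DB_PASSWORD has no default, so a
    missing value raises KeyError at startup instead of falling back to
    a hard-coded secret.
    """
    env = os.environ if env is None else env
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": env.get("DB_NAME", "etb_api_monitoring"),
            "USER": env.get("DB_USER", "monitoring_user"),
            "PASSWORD": env["DB_PASSWORD"],
            "HOST": env.get("DB_HOST", "localhost"),
            "PORT": env.get("DB_PORT", "5432"),
        }
    }
```

In `settings.py` this would be used as `DATABASES = database_settings()`.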

### 2. Redis Configuration

```python
# settings.py
CELERY_BROKER_URL = 'redis://localhost:6379/0'
CELERY_RESULT_BACKEND = 'redis://localhost:6379/0'
```

### 3. Static Files and Media

```bash
# Collect static files
python manage.py collectstatic
```

Configure the web server (Nginx example):

```nginx
server {
    listen 80;
    server_name your-domain.com;
    
    location /static/ {
        alias /path/to/your/static/files/;
    }
    
    location /media/ {
        alias /path/to/your/media/files/;
    }
    
    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```

### 4. Process Management

Use systemd services for production:

**Django Service** (`/etc/systemd/system/etb-api.service`):
```ini
[Unit]
Description=ETB-API Django Application
After=network.target

[Service]
Type=simple
User=www-data
Group=www-data
WorkingDirectory=/path/to/etb-api
Environment=PATH=/path/to/etb-api/venv/bin
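# runserver is Django's development server and is not suitable for production.
# Gunicorn and the core.wsgi module path are assumptions; install with `pip install gunicorn`.
ExecStart=/path/to/etb-api/venv/bin/gunicorn core.wsgi:application --bind 127.0.0.1:8000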
Restart=always

[Install]
WantedBy=multi-user.target
```

**Celery Worker Service** (`/etc/systemd/system/etb-celery.service`):
```ini
[Unit]
Description=ETB-API Celery Worker
After=network.target

[Service]
Type=simple
User=www-data
Group=www-data
WorkingDirectory=/path/to/etb-api
Environment=PATH=/path/to/etb-api/venv/bin
ExecStart=/path/to/etb-api/venv/bin/celery -A core worker -l info
Restart=always

[Install]
WantedBy=multi-user.target
```

**Celery Beat Service** (`/etc/systemd/system/etb-celery-beat.service`):
```ini
[Unit]
Description=ETB-API Celery Beat Scheduler
After=network.target

[Service]
Type=simple
User=www-data
Group=www-data
WorkingDirectory=/path/to/etb-api
Environment=PATH=/path/to/etb-api/venv/bin
ExecStart=/path/to/etb-api/venv/bin/celery -A core beat -l info
Restart=always

[Install]
WantedBy=multi-user.target
```

### 5. Enable Services

```bash
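# Reload systemd so it picks up the new unit files
sudo systemctl daemon-reload

# Enable the services at boot
# (the Redis unit is typically named redis-server on Debian/Ubuntu and redis on CentOS/RHEL)
sudo systemctl enable redis
sudo systemctl enable etb-api
sudo systemctl enable etb-celery
sudo systemctl enable etb-celery-beat

# Start Redis first, because the Celery worker and beat depend on the broker
sudo systemctl start redis
sudo systemctl start etb-api
sudo systemctl start etb-celery
sudo systemctl start etb-celery-beat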
```

## Monitoring Configuration

### 1. Customize Monitoring Targets

Access the admin interface at `http://your-domain.com/admin/monitoring/` to:

- Add custom monitoring targets
- Configure health check intervals
- Set up external service monitoring
- Customize alert thresholds

### 2. Configure Alert Rules

Create alert rules for:

- **Performance Alerts**: High response times, error rates
- **Business Alerts**: SLA breaches, incident volume spikes
- **Security Alerts**: Failed logins, security events
- **Infrastructure Alerts**: High CPU, memory, disk usage
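As an illustration of how an infrastructure alert could compare readings against the thresholds from the environment configuration, here is a minimal sketch. The `find_breaches` helper and the reading names are hypothetical; the numbers mirror the `PERFORMANCE_*_THRESHOLD` defaults, and treating the API response threshold as milliseconds is an assumption.

```python
# Defaults mirror the PERFORMANCE_* values from the environment example.
# Percentages for cpu/memory/disk; milliseconds assumed for api_response_ms.
THRESHOLDS = {
    "cpu": 80.0,
    "memory": 80.0,
    "disk": 80.0,
    "api_response_ms": 2000.0,
}


def find_breaches(readings: dict, thresholds: dict) -> list:
    """Return the sorted names of readings that exceed their threshold.

    Readings without a configured threshold are ignored, and a reading
    exactly equal to its threshold does not count as a breach.
    """
    return sorted(
        name
        for name, value in readings.items()
        if name in thresholds and value > thresholds[name]
    )


if __name__ == "__main__":
    sample = {"cpu": 91.5, "memory": 40.0, "disk": 80.0, "api_response_ms": 2500.0}
    print(find_breaches(sample, THRESHOLDS))
```

On a live host the readings could come from `psutil`, for example `psutil.cpu_percent(interval=1)`, `psutil.virtual_memory().percent`, and `psutil.disk_usage("/").percent`.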

### 3. Set Up Notification Channels

Configure notification channels:

- **Email**: Set up SMTP configuration
- **Slack**: Configure webhook URLs
- **Webhooks**: Set up external alerting systems
- **PagerDuty**: Integrate with incident management
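For the Slack channel, a message is a JSON body with a `text` field posted to the webhook URL, the same shape as the `curl` test in the Troubleshooting section. The sketch below builds and sends that request with the standard library; the function names are illustrative and are not part of the monitoring app.

```python
import json
import urllib.request


def build_slack_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Build a POST request carrying a Slack incoming-webhook message."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_slack_message(webhook_url: str, text: str, timeout: float = 5.0) -> int:
    """Send the message and return the HTTP status code.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses, so a
    normal return means the webhook accepted the request.
    """
    request = build_slack_request(webhook_url, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Calling `send_slack_message(os.environ["ALERTING_SLACK_WEBHOOK_URL"], "Test message")` after `import os` would exercise the webhook configured in the environment file.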

### 4. Create Custom Dashboards

Design dashboards for different user roles:

- **Executive Dashboard**: High-level KPIs and trends
- **Operations Dashboard**: Real-time system status
- **Security Dashboard**: Security metrics and alerts
- **Development Dashboard**: Application performance metrics

## Verification

### 1. Check System Health

```bash
# Check health check summary
curl -H "Authorization: Token your-token" \
     http://localhost:8000/api/monitoring/health-checks/summary/

# Check system overview
curl -H "Authorization: Token your-token" \
     http://localhost:8000/api/monitoring/overview/
```
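The same checks can be scripted, which is convenient for a smoke test after each deployment. This sketch reuses the endpoint paths and the `Token` authorization scheme from the `curl` commands above; the helper names are illustrative, and the shape of the JSON response is not assumed.

```python
import json
import urllib.request


def build_monitoring_request(base_url: str, path: str, token: str) -> urllib.request.Request:
    """Build an authenticated GET request for a monitoring API endpoint."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Token {token}"},
        method="GET",
    )


def fetch_json(request: urllib.request.Request, timeout: float = 5.0):
    """Send the request and decode the JSON body, whatever its structure."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    for path in ("/api/monitoring/health-checks/summary/", "/api/monitoring/overview/"):
        request = build_monitoring_request("http://localhost:8000", path, "your-token")
        print(path, fetch_json(request))
```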

### 2. Verify Celery Tasks

```bash
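# List the tasks workers are currently executing (also confirms that workers respond)
celery -A core inspect active

# List tasks with an ETA or countdown held by workers (this does not show the beat schedule)
celery -A core inspect scheduled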
```

### 3. Test Alerting

```bash
# Trigger a test alert
python manage.py shell
>>> from monitoring.models import AlertRule
>>> rule = AlertRule.objects.first()
>>> # Manually trigger alert for testing
```

## Maintenance

### 1. Data Cleanup

The `data-cleanup` beat task removes old data once a day. To queue the same task manually:

```bash
python manage.py shell
>>> from monitoring.tasks import cleanup_old_data
>>> cleanup_old_data.delay()
```

### 2. Performance Tuning

Monitor and tune:

- Health check intervals
- Metrics collection frequency
- Alert evaluation intervals
- Data retention periods
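When tuning these values, a rough estimate of stored rows shows how intervals and retention interact. The sketch below assumes each run writes one row per monitored target or metric, which may not match how the monitoring app actually stores results; the function names are illustrative.

```python
SECONDS_PER_DAY = 86_400


def rows_per_day(interval_seconds: float, series: int) -> int:
    """Rows written per day if every run stores one row per series."""
    return int(SECONDS_PER_DAY // interval_seconds) * series


def retained_rows(interval_seconds: float, series: int, retention_days: int) -> int:
    """Approximate table size once the retention window is full."""
    return rows_per_day(interval_seconds, series) * retention_days


if __name__ == "__main__":
    # 10 targets checked every 60 s, kept for 30 days.
    print(retained_rows(60, 10, 30))
```

With the default 60-second health check interval, 10 targets produce 14,400 rows per day and 432,000 rows over a 30-day retention window. Doubling the interval halves both figures.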

### 3. Scaling

For high-volume environments:

- Use multiple Celery workers
- Implement Redis clustering
- Use database read replicas
- Consider time-series databases for metrics

## Troubleshooting

### Common Issues

1. **Health Checks Failing**
   ```bash
   # Check logs
   tail -f /var/log/etb-api.log
   
   # Test individual targets
   python manage.py shell
   >>> from monitoring.services.health_checks import HealthCheckService
   >>> service = HealthCheckService()
   >>> service.execute_all_health_checks()
   ```

2. **Celery Tasks Not Running**
   ```bash
   # Check Celery status
   celery -A core inspect active
   
   # Check Redis connection
   redis-cli ping
   
   # Restart services
   sudo systemctl restart etb-celery
   sudo systemctl restart etb-celery-beat
   ```

3. **Alerts Not Sending**
   ```bash
   # Check email configuration
   python manage.py shell
   >>> from django.core.mail import send_mail
   >>> send_mail('Test', 'Test message', 'from@example.com', ['to@example.com'])
   
   # Check Slack webhook
   curl -X POST -H 'Content-type: application/json' \
        --data '{"text":"Test message"}' \
        YOUR_SLACK_WEBHOOK_URL
   ```

### Log Locations

- Django logs: `/var/log/etb-api.log`
- Celery logs: `/var/log/celery.log`
- Nginx logs: `/var/log/nginx/`
- System logs: `/var/log/syslog`

## Security Considerations

### 1. Authentication
- Use strong authentication tokens
- Implement token rotation
- Use HTTPS in production
- Restrict admin access

### 2. Data Protection
- Encrypt sensitive configuration data
- Use secure database connections
- Implement data retention policies
- Run regular security audits

### 3. Network Security
- Use firewalls to restrict access
- Implement rate limiting
- Monitor for suspicious activity
- Apply security updates regularly

## Backup and Recovery

### 1. Database Backup

```bash
# PostgreSQL backup
pg_dump etb_api_monitoring > "backup_$(date +%Y%m%d_%H%M%S).sql"

# Automated backup script (save the lines below as their own file and schedule it with cron)
#!/bin/bash
set -euo pipefail
BACKUP_DIR="/backups/monitoring"
DATE=$(date +%Y%m%d_%H%M%S)
pg_dump etb_api_monitoring > "$BACKUP_DIR/backup_$DATE.sql"
# Delete backups older than 7 days
find "$BACKUP_DIR" -name "backup_*.sql" -mtime +7 -delete
```
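The retention step of the backup script can also be expressed in Python, for example to preview which files would be removed before deleting anything. This is a sketch with illustrative function names; it uses a simple age cutoff, so its boundary behavior may differ slightly from `find -mtime +7`.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86_400


def select_expired(mtimes: dict, now: float, max_age_days: int = 7) -> list:
    """Return the sorted names whose modification time is older than the cutoff."""
    cutoff = now - max_age_days * SECONDS_PER_DAY
    return sorted(name for name, mtime in mtimes.items() if mtime < cutoff)


def expired_backups(backup_dir: str, max_age_days: int = 7) -> list:
    """List backup_*.sql files in backup_dir that are past the retention age."""
    mtimes = {
        str(path): path.stat().st_mtime
        for path in Path(backup_dir).glob("backup_*.sql")
    }
    return select_expired(mtimes, time.time(), max_age_days)


if __name__ == "__main__":
    # Preview only; nothing is deleted here.
    for name in expired_backups("/backups/monitoring"):
        print("would delete:", name)
```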

### 2. Configuration Backup

```bash
# Backup configuration files
tar -czf monitoring_config_$(date +%Y%m%d).tar.gz \
    /path/to/etb-api/core/settings.py \
    /path/to/etb-api/.env \
    /etc/systemd/system/etb-*.service
```

### 3. Recovery Procedures

1. Restore database from backup
2. Restore configuration files
3. Restart services
4. Verify monitoring functionality
5. Check alert rules and thresholds

## Support and Maintenance

### Regular Tasks

- **Daily**: Check system health and alerts
- **Weekly**: Review metrics trends and thresholds
- **Monthly**: Update monitoring configuration
- **Quarterly**: Review and optimize performance

### Monitoring the Monitor

- Set up external monitoring for the monitoring system
- Monitor Celery worker health
- Track database performance
- Monitor disk space usage
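As one example of monitoring the monitor, disk usage on the host that stores metrics and backups can be checked with the standard library alone. The function names and the 80 percent default (matching `PERFORMANCE_DISK_THRESHOLD`) are illustrative; the percentage is computed as used over total bytes, which can differ slightly from what `df` reports.

```python
import shutil


def disk_usage_percent(path: str = "/") -> float:
    """Return used space on the filesystem containing path, as a percentage."""
    usage = shutil.disk_usage(path)
    return round(usage.used / usage.total * 100, 1)


def disk_is_full(path: str = "/", threshold: float = 80.0) -> bool:
    """Return True if usage on the filesystem exceeds the threshold."""
    return disk_usage_percent(path) > threshold


if __name__ == "__main__":
    print(f"/ is {disk_usage_percent('/')}% used; over threshold: {disk_is_full('/')}")
```

Running this from an external scheduler or a separate host keeps the check independent of the Celery workers it is meant to watch.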

Adjust the intervals, thresholds, paths, and service definitions in this guide to match your own requirements and infrastructure.