This commit is contained in:
Iliyan Angelov
2025-09-19 11:58:53 +03:00
parent 306b20e24a
commit 6b247e5b9f
11423 changed files with 1500615 additions and 778 deletions

View File

@@ -0,0 +1,633 @@
# Analytics & Predictive Insights API Documentation
## Overview
The Analytics & Predictive Insights module provides comprehensive analytics capabilities for incident management, including advanced KPIs, predictive analytics, ML-based anomaly detection, and cost impact analysis.
## Features
- **Advanced KPIs**: MTTA, MTTR, incident recurrence rate, availability metrics
- **Predictive Analytics**: ML-based incident prediction, severity prediction, resolution time prediction
- **Anomaly Detection**: Statistical, temporal, and pattern-based anomaly detection
- **Cost Analysis**: Downtime cost, lost revenue, penalty cost analysis
- **Dashboards**: Configurable dashboards with heatmaps and visualizations
- **Heatmaps**: Time-based incident frequency, resolution time, and cost impact visualizations
## API Endpoints
### Base URL
```
/api/analytics/
```
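A minimal authenticated request against this base URL might look like the following sketch (the host and token are placeholders; see Authentication below):
```python
import requests

# Placeholder host and token -- substitute your deployment's values
BASE_URL = 'https://api.example.com/api/analytics'
HEADERS = {'Authorization': 'Token your-token-here'}

response = requests.get(f'{BASE_URL}/kpi-metrics/', headers=HEADERS)
response.raise_for_status()
print(response.json()['count'])  # list endpoints return paginated results with a total count
```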
## KPI Metrics
### List KPI Metrics
```http
GET /api/analytics/kpi-metrics/
```
**Query Parameters:**
- `metric_type`: Filter by metric type (MTTA, MTTR, INCIDENT_COUNT, etc.)
- `is_active`: Filter by active status (true/false)
- `is_system_metric`: Filter by system metric status (true/false)
- `created_after`: Filter by creation date (ISO 8601)
- `created_before`: Filter by creation date (ISO 8601)
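For example, to list only active MTTA metrics created after a given date (a sketch; the parameter values are illustrative):
```python
import requests

response = requests.get(
    'https://api.example.com/api/analytics/kpi-metrics/',
    params={
        'metric_type': 'MTTA',
        'is_active': 'true',
        'created_after': '2024-01-01T00:00:00Z',
    },
    headers={'Authorization': 'Token your-token-here'},
)
metrics = response.json()['results']
```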
**Response:**
```json
{
"count": 10,
"next": null,
"previous": null,
"results": [
{
"id": "uuid",
"name": "Mean Time to Acknowledge",
"description": "Average time to acknowledge incidents",
"metric_type": "MTTA",
"aggregation_type": "AVERAGE",
"incident_categories": ["Infrastructure", "Application"],
"incident_severities": ["HIGH", "CRITICAL"],
"incident_priorities": ["P1", "P2"],
"calculation_formula": null,
"time_window_hours": 24,
"is_active": true,
"is_system_metric": true,
"created_by_username": "admin",
"created_at": "2024-01-01T00:00:00Z",
"updated_at": "2024-01-01T00:00:00Z",
"measurement_count": 100,
"latest_measurement": {
"value": "15.5",
"unit": "minutes",
"calculated_at": "2024-01-01T12:00:00Z",
"incident_count": 25
}
}
]
}
```
### Get KPI Metric Details
```http
GET /api/analytics/kpi-metrics/{id}/
```
### Create KPI Metric
```http
POST /api/analytics/kpi-metrics/
```
**Request Body:**
```json
{
"name": "Custom MTTR",
"description": "Custom Mean Time to Resolve metric",
"metric_type": "MTTR",
"aggregation_type": "AVERAGE",
"incident_categories": ["Infrastructure"],
"incident_severities": ["HIGH", "CRITICAL"],
"time_window_hours": 48,
"is_active": true
}
```
### Get KPI Measurements
```http
GET /api/analytics/kpi-metrics/{id}/measurements/
```
**Query Parameters:**
- `start_date`: Filter by measurement period start (ISO 8601)
- `end_date`: Filter by measurement period end (ISO 8601)
### Get KPI Summary
```http
GET /api/analytics/kpi-metrics/summary/
```
**Response:**
```json
[
{
"metric_type": "MTTA",
"metric_name": "Mean Time to Acknowledge",
"current_value": "15.5",
"unit": "minutes",
"trend": "down",
"trend_percentage": "-5.2",
"period_start": "2024-01-01T00:00:00Z",
"period_end": "2024-01-01T24:00:00Z",
"incident_count": 25,
"target_value": null,
"target_met": true
}
]
```
## KPI Measurements
### List KPI Measurements
```http
GET /api/analytics/kpi-measurements/
```
**Query Parameters:**
- `metric_id`: Filter by metric ID
- `start_date`: Filter by measurement period start
- `end_date`: Filter by measurement period end
## Incident Recurrence Analysis
### List Recurrence Analyses
```http
GET /api/analytics/recurrence-analyses/
```
**Query Parameters:**
- `recurrence_type`: Filter by recurrence type
- `min_confidence`: Filter by minimum confidence score
- `is_resolved`: Filter by resolution status
### Get Unresolved Recurrence Analyses
```http
GET /api/analytics/recurrence-analyses/unresolved/
```
## Predictive Models
### List Predictive Models
```http
GET /api/analytics/predictive-models/
```
**Query Parameters:**
- `model_type`: Filter by model type
- `status`: Filter by model status
### Create Predictive Model
```http
POST /api/analytics/predictive-models/
```
**Request Body:**
```json
{
"name": "Incident Severity Predictor",
"description": "Predicts incident severity based on historical data",
"model_type": "SEVERITY_PREDICTION",
"algorithm_type": "RANDOM_FOREST",
"model_config": {
"n_estimators": 100,
"max_depth": 10
},
"feature_columns": ["title_length", "description_length", "category"],
"target_column": "severity",
"training_data_period_days": 90,
"min_training_samples": 100
}
```
### Train Model
```http
POST /api/analytics/predictive-models/{id}/train/
```
### Get Model Performance
```http
GET /api/analytics/predictive-models/{id}/performance/
```
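Combining the two endpoints, a client might trigger training and then poll until it completes. A sketch, assuming the model detail endpoint reflects the `status` values listed under Data Models; the exact shape of the performance payload is not specified above:
```python
import time

import requests

BASE = 'https://api.example.com/api/analytics/predictive-models'
HEADERS = {'Authorization': 'Token your-token-here'}
model_id = 'your-model-uuid'  # placeholder

# Kick off training
requests.post(f'{BASE}/{model_id}/train/', headers=HEADERS).raise_for_status()

# Poll the model detail endpoint until it leaves the TRAINING state
for _ in range(60):  # up to ~30 minutes
    model = requests.get(f'{BASE}/{model_id}/', headers=HEADERS).json()
    if model['status'] != 'TRAINING':
        break
    time.sleep(30)

# Fetch the recorded performance metrics
performance = requests.get(f'{BASE}/{model_id}/performance/', headers=HEADERS).json()
print(model['status'], performance)
```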
## Anomaly Detection
### List Anomaly Detections
```http
GET /api/analytics/anomaly-detections/
```
**Query Parameters:**
- `anomaly_type`: Filter by anomaly type
- `severity`: Filter by severity level
- `status`: Filter by status
- `start_date`: Filter by detection date
- `end_date`: Filter by detection date
### Get Anomaly Summary
```http
GET /api/analytics/anomaly-detections/summary/
```
**Response:**
```json
{
"total_anomalies": 50,
"critical_anomalies": 5,
"high_anomalies": 15,
"medium_anomalies": 20,
"low_anomalies": 10,
"unresolved_anomalies": 12,
"false_positive_rate": "8.5",
"average_resolution_time": "2:30:00"
}
```
### Acknowledge Anomaly
```http
POST /api/analytics/anomaly-detections/{id}/acknowledge/
```
### Resolve Anomaly
```http
POST /api/analytics/anomaly-detections/{id}/resolve/
```
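A typical triage flow acknowledges an anomaly and later resolves it. A sketch; whether these endpoints accept a request body, and what fields it would contain, is an assumption:
```python
import requests

BASE = 'https://api.example.com/api/analytics/anomaly-detections'
HEADERS = {'Authorization': 'Token your-token-here'}
anomaly_id = 'your-anomaly-uuid'  # placeholder

# Mark the anomaly as being investigated
requests.post(f'{BASE}/{anomaly_id}/acknowledge/', headers=HEADERS).raise_for_status()

# ...investigate, then resolve; the body shape here is a guess
requests.post(
    f'{BASE}/{anomaly_id}/resolve/',
    json={'note': 'Root cause: expired certificate on the edge proxy'},
    headers=HEADERS,
).raise_for_status()
```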
## Cost Impact Analysis
### List Cost Analyses
```http
GET /api/analytics/cost-analyses/
```
**Query Parameters:**
- `cost_type`: Filter by cost type
- `is_validated`: Filter by validation status
- `start_date`: Filter by creation date
- `end_date`: Filter by creation date
### Get Cost Summary
```http
GET /api/analytics/cost-analyses/summary/
```
**Response:**
```json
{
"total_cost": "125000.00",
"currency": "USD",
"downtime_cost": "75000.00",
"lost_revenue": "40000.00",
"penalty_cost": "10000.00",
"resource_cost": "0.00",
"total_downtime_hours": "150.5",
"total_affected_users": 5000,
"cost_per_hour": "830.56",
"cost_per_user": "25.00"
}
```
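The derived fields follow from the totals: in the sample above, `cost_per_hour` is `total_cost / total_downtime_hours` and `cost_per_user` is `total_cost / total_affected_users` (this derivation is inferred from the sample values, not stated by the API):
```python
from decimal import ROUND_HALF_UP, Decimal

total_cost = Decimal('125000.00')
downtime_hours = Decimal('150.5')
affected_users = Decimal('5000')

cost_per_hour = (total_cost / downtime_hours).quantize(Decimal('0.01'), ROUND_HALF_UP)
cost_per_user = (total_cost / affected_users).quantize(Decimal('0.01'), ROUND_HALF_UP)

print(cost_per_hour)  # 830.56 -- matches the sample response
print(cost_per_user)  # 25.00
```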
## Dashboard Configurations
### List Dashboard Configurations
```http
GET /api/analytics/dashboard-configurations/
```
**Query Parameters:**
- `dashboard_type`: Filter by dashboard type
- `is_active`: Filter by active status
### Create Dashboard Configuration
```http
POST /api/analytics/dashboard-configurations/
```
**Request Body:**
```json
{
"name": "Executive Dashboard",
"description": "High-level metrics for executives",
"dashboard_type": "EXECUTIVE",
"layout_config": {
"rows": 2,
"columns": 3
},
"widget_configs": [
{
"type": "kpi_summary",
"position": {"row": 0, "column": 0},
"size": {"width": 2, "height": 1}
},
{
"type": "anomaly_summary",
"position": {"row": 0, "column": 2},
"size": {"width": 1, "height": 1}
}
],
"is_public": false,
"allowed_roles": ["executive", "manager"],
"auto_refresh_enabled": true,
"refresh_interval_seconds": 300
}
```
### Get Dashboard Data
```http
GET /api/analytics/dashboard/{id}/data/
```
**Response:**
```json
{
"kpi_summary": [...],
"anomaly_summary": {...},
"cost_summary": {...},
"insight_summary": {...},
"recent_anomalies": [...],
"recent_insights": [...],
"heatmap_data": [...],
"last_updated": "2024-01-01T12:00:00Z"
}
```
## Heatmap Data
### List Heatmap Data
```http
GET /api/analytics/heatmap-data/
```
**Query Parameters:**
- `heatmap_type`: Filter by heatmap type
- `time_granularity`: Filter by time granularity
## Predictive Insights
### List Predictive Insights
```http
GET /api/analytics/predictive-insights/
```
**Query Parameters:**
- `insight_type`: Filter by insight type
- `confidence_level`: Filter by confidence level
- `is_acknowledged`: Filter by acknowledgment status
- `is_validated`: Filter by validation status
- `include_expired`: Include expired insights (true/false)
### Acknowledge Insight
```http
POST /api/analytics/predictive-insights/{id}/acknowledge/
```
### Get Insight Summary
```http
GET /api/analytics/predictive-insights/summary/
```
**Response:**
```json
{
"total_insights": 25,
"high_confidence_insights": 8,
"medium_confidence_insights": 12,
"low_confidence_insights": 5,
"acknowledged_insights": 15,
"validated_insights": 10,
"expired_insights": 3,
"average_accuracy": "0.85",
"active_models": 4
}
```
## Data Models
### KPI Metric
```json
{
"id": "uuid",
"name": "string",
"description": "string",
"metric_type": "MTTA|MTTR|MTBF|MTBSI|AVAILABILITY|INCIDENT_COUNT|RESOLUTION_RATE|ESCALATION_RATE|CUSTOM",
"aggregation_type": "AVERAGE|MEDIAN|MIN|MAX|SUM|COUNT|PERCENTILE_95|PERCENTILE_99",
"incident_categories": ["string"],
"incident_severities": ["string"],
"incident_priorities": ["string"],
"calculation_formula": "string",
"time_window_hours": "integer",
"is_active": "boolean",
"is_system_metric": "boolean",
"created_by": "uuid",
"created_at": "datetime",
"updated_at": "datetime"
}
```
### Predictive Model
```json
{
"id": "uuid",
"name": "string",
"description": "string",
"model_type": "ANOMALY_DETECTION|INCIDENT_PREDICTION|SEVERITY_PREDICTION|RESOLUTION_TIME_PREDICTION|ESCALATION_PREDICTION|COST_PREDICTION",
"algorithm_type": "ISOLATION_FOREST|LSTM|RANDOM_FOREST|XGBOOST|SVM|NEURAL_NETWORK|ARIMA|PROPHET",
"model_config": "object",
"feature_columns": ["string"],
"target_column": "string",
"training_data_period_days": "integer",
"min_training_samples": "integer",
"accuracy_score": "float",
"precision_score": "float",
"recall_score": "float",
"f1_score": "float",
"status": "TRAINING|ACTIVE|INACTIVE|RETRAINING|ERROR",
"version": "string",
"model_file_path": "string",
"last_trained_at": "datetime",
"training_duration_seconds": "integer",
"training_samples_count": "integer",
"auto_retrain_enabled": "boolean",
"retrain_frequency_days": "integer",
"performance_threshold": "float",
"created_by": "uuid",
"created_at": "datetime",
"updated_at": "datetime"
}
```
### Anomaly Detection
```json
{
"id": "uuid",
"model": "uuid",
"anomaly_type": "STATISTICAL|TEMPORAL|PATTERN|THRESHOLD|BEHAVIORAL",
"severity": "LOW|MEDIUM|HIGH|CRITICAL",
"status": "DETECTED|INVESTIGATING|CONFIRMED|FALSE_POSITIVE|RESOLVED",
"confidence_score": "float",
"anomaly_score": "float",
"threshold_used": "float",
"detected_at": "datetime",
"time_window_start": "datetime",
"time_window_end": "datetime",
"related_incidents": ["uuid"],
"affected_services": ["string"],
"affected_metrics": ["string"],
"description": "string",
"root_cause_analysis": "string",
"impact_assessment": "string",
"actions_taken": ["string"],
"resolved_at": "datetime",
"resolved_by": "uuid",
"metadata": "object"
}
```
### Cost Impact Analysis
```json
{
"id": "uuid",
"incident": "uuid",
"cost_type": "DOWNTIME|LOST_REVENUE|PENALTY|RESOURCE_COST|REPUTATION_COST|COMPLIANCE_COST",
"cost_amount": "decimal",
"currency": "string",
"calculation_method": "string",
"calculation_details": "object",
"downtime_hours": "decimal",
"affected_users": "integer",
"revenue_impact": "decimal",
"business_unit": "string",
"service_tier": "string",
"is_validated": "boolean",
"validated_by": "uuid",
"validated_at": "datetime",
"validation_notes": "string",
"created_at": "datetime",
"updated_at": "datetime"
}
```
## Management Commands
### Calculate KPIs
```bash
python manage.py calculate_kpis [--metric-id METRIC_ID] [--time-window HOURS] [--force]
```
### Run Anomaly Detection
```bash
python manage.py run_anomaly_detection [--model-id MODEL_ID] [--time-window HOURS]
```
### Train Predictive Models
```bash
python manage.py train_predictive_models [--model-id MODEL_ID] [--force]
```
## Error Handling
All endpoints return appropriate HTTP status codes and error messages:
- `400 Bad Request`: Invalid request data
- `401 Unauthorized`: Authentication required
- `403 Forbidden`: Insufficient permissions
- `404 Not Found`: Resource not found
- `500 Internal Server Error`: Server error
**Error Response Format:**
```json
{
"error": "Error message",
"details": "Additional error details",
"code": "ERROR_CODE"
}
```
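A small client-side helper can centralize this contract. A sketch; it assumes error bodies follow the JSON format above, which may not hold for every failure (for example, a proxy-level 502):
```python
import requests

class AnalyticsAPIError(Exception):
    """Raised when the analytics API returns a non-2xx response."""

def request_json(method, url, **kwargs):
    """Issue a request and return parsed JSON, raising a descriptive error on failure."""
    response = requests.request(method, url, **kwargs)
    if response.ok:
        return response.json()
    try:
        body = response.json()  # expected shape: {"error", "details", "code"}
        message = f"{body.get('code')}: {body.get('error')} ({body.get('details')})"
    except ValueError:
        message = response.text  # non-JSON error body, e.g. from a proxy
    raise AnalyticsAPIError(f'HTTP {response.status_code} - {message}')
```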
## Authentication
All endpoints require authentication. Use one of the following methods:
1. **Token Authentication**: Include `Authorization: Token <token>` header
2. **Session Authentication**: Use Django session authentication
3. **SSO Authentication**: Use configured SSO providers
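With token authentication, a `requests.Session` keeps the header in one place (a sketch; the token is a placeholder):
```python
import requests

session = requests.Session()
session.headers['Authorization'] = 'Token your-token-here'

# Every call on the session reuses the header
summary = session.get('https://api.example.com/api/analytics/kpi-metrics/summary/').json()
```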
## Rate Limiting
API endpoints are rate-limited to prevent abuse:
- 1000 requests per hour per user
- 100 requests per minute per user
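Clients should back off when a limit is hit. A minimal sketch, assuming throttled requests return HTTP 429 and may include a `Retry-After` header (neither behavior is confirmed above):
```python
import time

import requests

def get_with_backoff(url, headers, max_retries=5):
    """GET with exponential backoff on HTTP 429 responses."""
    delay = 1.0
    for _ in range(max_retries):
        response = requests.get(url, headers=headers)
        if response.status_code != 429:
            return response
        # Prefer the server's hint when present, otherwise back off exponentially
        time.sleep(float(response.headers.get('Retry-After', delay)))
        delay *= 2
    raise RuntimeError(f'Still rate-limited after {max_retries} retries: {url}')
```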
## Pagination
List endpoints support pagination:
- `page`: Page number (default: 1)
- `page_size`: Items per page (default: 20, max: 100)
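List responses carry `count`, `next`, and `previous` fields (see the KPI metrics example above), so all pages can be walked by following `next`:
```python
import requests

def iter_results(url, headers, page_size=100):
    """Yield every item from a paginated list endpoint by following `next` links."""
    response = requests.get(url, params={'page_size': page_size}, headers=headers)
    while True:
        data = response.json()
        yield from data['results']
        if not data['next']:
            break
        response = requests.get(data['next'], headers=headers)
```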
## Filtering and Sorting
Most list endpoints support:
- **Filtering**: Use query parameters to filter results
- **Sorting**: Use `ordering` parameter (e.g., `ordering=-created_at`)
- **Search**: Use `search` parameter for text search
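For example, to search KPI metrics and sort them newest-first (a sketch; the search term is illustrative):
```python
import requests

response = requests.get(
    'https://api.example.com/api/analytics/kpi-metrics/',
    params={'search': 'resolve', 'ordering': '-created_at'},
    headers={'Authorization': 'Token your-token-here'},
)
```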
## Webhooks
The analytics module supports webhooks for real-time notifications:
- **Anomaly Detected**: Triggered when new anomalies are detected
- **KPI Threshold Breached**: Triggered when KPI values exceed thresholds
- **Model Training Completed**: Triggered when model training finishes
- **Cost Threshold Exceeded**: Triggered when cost impact exceeds thresholds
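On the receiving end, a minimal Django handler might look like this sketch. The payload shape, including the `event` field and its naming, is an assumption, since the webhook format is not documented above:
```python
import json

from django.http import HttpResponse
from django.views.decorators.csrf import csrf_exempt

@csrf_exempt
def analytics_webhook(request):
    """Accept analytics webhook POSTs and route them by event type."""
    payload = json.loads(request.body)
    event = payload.get('event')  # e.g. 'anomaly.detected' -- assumed naming
    if event == 'anomaly.detected':
        ...  # page the on-call engineer, open a ticket, etc.
    return HttpResponse(status=204)
```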
## Integration Examples
### Python Client Example
```python
import requests
# Get KPI summary
response = requests.get(
'https://api.example.com/api/analytics/kpi-metrics/summary/',
headers={'Authorization': 'Token your-token-here'}
)
kpi_summary = response.json()
# Create predictive model
model_data = {
'name': 'Incident Predictor',
'description': 'Predicts incident occurrence',
'model_type': 'INCIDENT_PREDICTION',
'algorithm_type': 'RANDOM_FOREST',
'model_config': {'n_estimators': 100}
}
response = requests.post(
'https://api.example.com/api/analytics/predictive-models/',
json=model_data,
headers={'Authorization': 'Token your-token-here'}
)
model = response.json()
```
### JavaScript Client Example
```javascript
// Get dashboard data
fetch('/api/analytics/dashboard/123/data/', {
headers: {
'Authorization': 'Token your-token-here',
'Content-Type': 'application/json'
}
})
.then(response => response.json())
.then(data => {
console.log('Dashboard data:', data);
// Update dashboard UI
});
```
## Best Practices
1. **Use appropriate time windows** for KPI calculations
2. **Monitor model performance** and retrain when accuracy drops
3. **Validate cost analyses** before using them for business decisions
4. **Set up alerts** for critical anomalies and threshold breaches
5. **Clean up regularly**, removing expired insights and old measurements
6. **Use pagination** for large datasets
7. **Cache frequently accessed data** to improve performance (see the sketch below)
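For the caching point, Django's cache framework is one option (a sketch; the key name and 60-second TTL are arbitrary choices):
```python
from django.core.cache import cache

def get_kpi_summary_cached(fetch_summary, ttl_seconds=60):
    """Return the KPI summary, reusing a cached copy for up to `ttl_seconds`."""
    summary = cache.get('analytics:kpi_summary')
    if summary is None:
        summary = fetch_summary()  # e.g. a call to /api/analytics/kpi-metrics/summary/
        cache.set('analytics:kpi_summary', summary, ttl_seconds)
    return summary
```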
## Support
For technical support or questions about the Analytics & Predictive Insights API:
- **Documentation**: Refer to this API documentation
- **Issues**: Report issues through the project repository
- **Contact**: Reach out to the development team for assistance

View File

@@ -0,0 +1,338 @@
"""
Admin configuration for analytics_predictive_insights app
"""
from django.contrib import admin
from django.utils.html import format_html
from django.urls import reverse
from django.utils.safestring import mark_safe
from .models import (
KPIMetric, KPIMeasurement, IncidentRecurrenceAnalysis, PredictiveModel,
AnomalyDetection, CostImpactAnalysis, DashboardConfiguration,
HeatmapData, PredictiveInsight
)
@admin.register(KPIMetric)
class KPIMetricAdmin(admin.ModelAdmin):
"""Admin interface for KPI metrics"""
list_display = [
'name', 'metric_type', 'aggregation_type', 'is_active',
'is_system_metric', 'created_by', 'created_at'
]
list_filter = [
'metric_type', 'aggregation_type', 'is_active',
'is_system_metric', 'created_at'
]
search_fields = ['name', 'description']
readonly_fields = ['id', 'created_at', 'updated_at']
fieldsets = (
('Basic Information', {
'fields': ('id', 'name', 'description', 'metric_type', 'aggregation_type')
}),
('Targeting Criteria', {
'fields': ('incident_categories', 'incident_severities', 'incident_priorities')
}),
('Configuration', {
'fields': ('calculation_formula', 'time_window_hours', 'is_active', 'is_system_metric')
}),
('Metadata', {
'fields': ('created_by', 'created_at', 'updated_at'),
'classes': ('collapse',)
})
)
@admin.register(KPIMeasurement)
class KPIMeasurementAdmin(admin.ModelAdmin):
"""Admin interface for KPI measurements"""
list_display = [
'metric', 'value', 'unit', 'incident_count',
'measurement_period_start', 'calculated_at'
]
list_filter = [
'metric__metric_type', 'unit', 'calculated_at'
]
search_fields = ['metric__name']
readonly_fields = ['id', 'calculated_at']
fieldsets = (
('Measurement Details', {
'fields': ('id', 'metric', 'value', 'unit')
}),
('Time Period', {
'fields': ('measurement_period_start', 'measurement_period_end')
}),
('Context', {
'fields': ('incident_count', 'sample_size', 'metadata')
}),
('Metadata', {
'fields': ('calculated_at',),
'classes': ('collapse',)
})
)
@admin.register(IncidentRecurrenceAnalysis)
class IncidentRecurrenceAnalysisAdmin(admin.ModelAdmin):
"""Admin interface for incident recurrence analysis"""
list_display = [
'primary_incident', 'recurrence_type', 'confidence_score',
'recurrence_rate', 'is_resolved', 'created_at'
]
list_filter = [
'recurrence_type', 'is_resolved', 'created_at'
]
search_fields = ['primary_incident__title']
readonly_fields = ['id', 'created_at', 'updated_at']
fieldsets = (
('Analysis Details', {
'fields': ('id', 'primary_incident', 'recurring_incidents', 'recurrence_type', 'confidence_score', 'recurrence_rate')
}),
('Pattern Characteristics', {
'fields': ('common_keywords', 'common_categories', 'time_pattern')
}),
('Impact Analysis', {
'fields': ('total_affected_users', 'total_downtime_hours', 'estimated_cost_impact')
}),
('Recommendations', {
'fields': ('prevention_recommendations', 'automation_opportunities')
}),
('Status', {
'fields': ('is_resolved', 'resolution_actions')
}),
('Metadata', {
'fields': ('created_at', 'updated_at', 'model_version'),
'classes': ('collapse',)
})
)
@admin.register(PredictiveModel)
class PredictiveModelAdmin(admin.ModelAdmin):
"""Admin interface for predictive models"""
list_display = [
'name', 'model_type', 'algorithm_type', 'status',
'accuracy_score', 'last_trained_at', 'created_at'
]
list_filter = [
'model_type', 'algorithm_type', 'status', 'created_at'
]
search_fields = ['name', 'description']
readonly_fields = ['id', 'created_at', 'updated_at']
fieldsets = (
('Basic Information', {
'fields': ('id', 'name', 'description', 'model_type', 'algorithm_type')
}),
('Model Configuration', {
'fields': ('model_config', 'feature_columns', 'target_column')
}),
('Training Configuration', {
'fields': ('training_data_period_days', 'min_training_samples')
}),
('Performance Metrics', {
'fields': ('accuracy_score', 'precision_score', 'recall_score', 'f1_score')
}),
('Status and Metadata', {
'fields': ('status', 'version', 'model_file_path', 'last_trained_at', 'training_duration_seconds', 'training_samples_count')
}),
('Retraining Configuration', {
'fields': ('auto_retrain_enabled', 'retrain_frequency_days', 'performance_threshold')
}),
('Metadata', {
'fields': ('created_by', 'created_at', 'updated_at'),
'classes': ('collapse',)
})
)
@admin.register(AnomalyDetection)
class AnomalyDetectionAdmin(admin.ModelAdmin):
"""Admin interface for anomaly detection results"""
list_display = [
'anomaly_type', 'severity', 'status', 'confidence_score',
'detected_at', 'resolved_at'
]
list_filter = [
'anomaly_type', 'severity', 'status', 'detected_at'
]
search_fields = ['description', 'model__name']
readonly_fields = ['id', 'detected_at']
fieldsets = (
('Detection Details', {
'fields': ('id', 'model', 'anomaly_type', 'severity', 'status')
}),
('Detection Metrics', {
'fields': ('confidence_score', 'anomaly_score', 'threshold_used')
}),
('Time Context', {
'fields': ('detected_at', 'time_window_start', 'time_window_end')
}),
('Related Data', {
'fields': ('related_incidents', 'affected_services', 'affected_metrics')
}),
('Analysis', {
'fields': ('description', 'root_cause_analysis', 'impact_assessment')
}),
('Actions', {
'fields': ('actions_taken', 'resolved_at', 'resolved_by')
}),
('Metadata', {
'fields': ('metadata',),
'classes': ('collapse',)
})
)
@admin.register(CostImpactAnalysis)
class CostImpactAnalysisAdmin(admin.ModelAdmin):
"""Admin interface for cost impact analysis"""
list_display = [
'incident', 'cost_type', 'cost_amount', 'currency',
'is_validated', 'created_at'
]
list_filter = [
'cost_type', 'currency', 'is_validated', 'created_at'
]
search_fields = ['incident__title', 'business_unit']
readonly_fields = ['id', 'created_at', 'updated_at']
fieldsets = (
('Cost Details', {
'fields': ('id', 'incident', 'cost_type', 'cost_amount', 'currency')
}),
('Calculation Details', {
'fields': ('calculation_method', 'calculation_details')
}),
('Impact Metrics', {
'fields': ('downtime_hours', 'affected_users', 'revenue_impact')
}),
('Business Context', {
'fields': ('business_unit', 'service_tier')
}),
('Validation', {
'fields': ('is_validated', 'validated_by', 'validated_at', 'validation_notes')
}),
('Metadata', {
'fields': ('created_at', 'updated_at'),
'classes': ('collapse',)
})
)
@admin.register(DashboardConfiguration)
class DashboardConfigurationAdmin(admin.ModelAdmin):
"""Admin interface for dashboard configurations"""
list_display = [
'name', 'dashboard_type', 'is_active', 'is_public',
'auto_refresh_enabled', 'created_by', 'created_at'
]
list_filter = [
'dashboard_type', 'is_active', 'is_public', 'auto_refresh_enabled', 'created_at'
]
search_fields = ['name', 'description']
readonly_fields = ['id', 'created_at', 'updated_at']
fieldsets = (
('Basic Information', {
'fields': ('id', 'name', 'description', 'dashboard_type')
}),
('Configuration', {
'fields': ('layout_config', 'widget_configs')
}),
('Access Control', {
'fields': ('is_public', 'allowed_users', 'allowed_roles')
}),
('Refresh Configuration', {
'fields': ('auto_refresh_enabled', 'refresh_interval_seconds')
}),
('Status', {
'fields': ('is_active',)
}),
('Metadata', {
'fields': ('created_by', 'created_at', 'updated_at'),
'classes': ('collapse',)
})
)
@admin.register(HeatmapData)
class HeatmapDataAdmin(admin.ModelAdmin):
"""Admin interface for heatmap data"""
list_display = [
'name', 'heatmap_type', 'time_granularity',
'aggregation_method', 'created_at'
]
list_filter = [
'heatmap_type', 'time_granularity', 'aggregation_method', 'created_at'
]
search_fields = ['name']
readonly_fields = ['id', 'created_at', 'updated_at']
fieldsets = (
('Basic Information', {
'fields': ('id', 'name', 'heatmap_type')
}),
('Time Configuration', {
'fields': ('time_period_start', 'time_period_end', 'time_granularity')
}),
('Data Configuration', {
'fields': ('data_points', 'color_scheme', 'aggregation_method')
}),
('Metadata', {
'fields': ('created_at', 'updated_at'),
'classes': ('collapse',)
})
)
@admin.register(PredictiveInsight)
class PredictiveInsightAdmin(admin.ModelAdmin):
"""Admin interface for predictive insights"""
list_display = [
'title', 'insight_type', 'confidence_level', 'confidence_score',
'is_acknowledged', 'is_validated', 'generated_at'
]
list_filter = [
'insight_type', 'confidence_level', 'is_acknowledged',
'is_validated', 'generated_at'
]
search_fields = ['title', 'description', 'model__name']
readonly_fields = ['id', 'generated_at']
fieldsets = (
('Insight Details', {
'fields': ('id', 'model', 'insight_type', 'title', 'description', 'confidence_level', 'confidence_score')
}),
('Prediction Details', {
'fields': ('predicted_value', 'prediction_horizon', 'prediction_date')
}),
('Context', {
'fields': ('input_features', 'supporting_evidence', 'related_incidents', 'affected_services')
}),
('Recommendations', {
'fields': ('recommendations', 'risk_assessment')
}),
('Status', {
'fields': ('is_acknowledged', 'acknowledged_by', 'acknowledged_at')
}),
('Validation', {
'fields': ('is_validated', 'actual_value', 'validation_accuracy')
}),
('Metadata', {
'fields': ('generated_at', 'expires_at'),
'classes': ('collapse',)
})
)

View File

@@ -0,0 +1,20 @@
"""
Analytics & Predictive Insights app configuration
"""
from django.apps import AppConfig
class AnalyticsPredictiveInsightsConfig(AppConfig):
"""Configuration for the analytics_predictive_insights app"""
default_auto_field = 'django.db.models.BigAutoField'
name = 'analytics_predictive_insights'
verbose_name = 'Analytics & Predictive Insights'
def ready(self):
"""Initialize the app when Django starts"""
# Import signal handlers
try:
import analytics_predictive_insights.signals
except ImportError:
pass

View File

@@ -0,0 +1 @@
# Management commands for analytics_predictive_insights

View File

@@ -0,0 +1 @@
# Management commands

View File

@@ -0,0 +1,216 @@
"""
Management command to calculate KPI measurements
"""
from django.core.management.base import BaseCommand, CommandError
from django.utils import timezone
from datetime import timedelta
from analytics_predictive_insights.models import KPIMetric, KPIMeasurement
from incident_intelligence.models import Incident
class Command(BaseCommand):
"""Calculate KPI measurements for all active metrics"""
help = 'Calculate KPI measurements for all active metrics'
def add_arguments(self, parser):
parser.add_argument(
'--metric-id',
type=str,
help='Calculate KPI for a specific metric ID only'
)
parser.add_argument(
'--time-window',
type=int,
default=24,
help='Time window in hours for KPI calculation (default: 24)'
)
parser.add_argument(
'--force',
action='store_true',
help='Force recalculation even if recent measurement exists'
)
def handle(self, *args, **options):
"""Handle the command execution"""
metric_id = options.get('metric_id')
time_window = options.get('time_window', 24)
force = options.get('force', False)
try:
if metric_id:
metrics = KPIMetric.objects.filter(id=metric_id, is_active=True)
if not metrics.exists():
raise CommandError(f'No active metric found with ID: {metric_id}')
else:
metrics = KPIMetric.objects.filter(is_active=True)
self.stdout.write(f'Calculating KPIs for {metrics.count()} metrics...')
total_calculated = 0
for metric in metrics:
try:
calculated = self._calculate_metric(metric, time_window, force)
if calculated:
total_calculated += 1
self.stdout.write(
self.style.SUCCESS(f'✓ Calculated KPI for {metric.name}')
)
else:
self.stdout.write(
self.style.WARNING(f'⚠ Skipped KPI for {metric.name} (recent measurement exists)')
)
except Exception as e:
self.stdout.write(
self.style.ERROR(f'✗ Error calculating KPI for {metric.name}: {str(e)}')
)
self.stdout.write(
self.style.SUCCESS(f'Successfully calculated {total_calculated} KPIs')
)
except Exception as e:
raise CommandError(f'Error executing command: {str(e)}')
def _calculate_metric(self, metric, time_window_hours, force=False):
"""Calculate KPI measurement for a specific metric"""
end_time = timezone.now()
start_time = end_time - timedelta(hours=time_window_hours)
# Check if recent measurement exists
if not force:
recent_measurement = KPIMeasurement.objects.filter(
metric=metric,
calculated_at__gte=end_time - timedelta(hours=1)
).first()
if recent_measurement:
return False
# Get incidents in the time window
incidents = Incident.objects.filter(
created_at__gte=start_time,
created_at__lte=end_time
)
# Apply metric filters
if metric.incident_categories:
incidents = incidents.filter(category__in=metric.incident_categories)
if metric.incident_severities:
incidents = incidents.filter(severity__in=metric.incident_severities)
if metric.incident_priorities:
incidents = incidents.filter(priority__in=metric.incident_priorities)
# Calculate metric value based on type
if metric.metric_type == 'MTTA':
value, unit = self._calculate_mtta(incidents)
elif metric.metric_type == 'MTTR':
value, unit = self._calculate_mttr(incidents)
elif metric.metric_type == 'INCIDENT_COUNT':
value, unit = incidents.count(), 'count'
elif metric.metric_type == 'RESOLUTION_RATE':
value, unit = self._calculate_resolution_rate(incidents)
elif metric.metric_type == 'AVAILABILITY':
value, unit = self._calculate_availability(incidents)
else:
value, unit = incidents.count(), 'count'
# Create or update measurement
measurement, created = KPIMeasurement.objects.get_or_create(
metric=metric,
measurement_period_start=start_time,
measurement_period_end=end_time,
defaults={
'value': value,
'unit': unit,
'incident_count': incidents.count(),
'sample_size': incidents.count()
}
)
if not created:
measurement.value = value
measurement.unit = unit
measurement.incident_count = incidents.count()
measurement.sample_size = incidents.count()
measurement.save()
return True
def _calculate_mtta(self, incidents):
"""Calculate Mean Time to Acknowledge"""
acknowledged_incidents = incidents.filter(
status__in=['IN_PROGRESS', 'RESOLVED', 'CLOSED']
).exclude(assigned_to__isnull=True)
if not acknowledged_incidents.exists():
return 0, 'minutes'
total_time = timedelta()
count = 0
for incident in acknowledged_incidents:
# Simplified calculation - in practice, you'd track acknowledgment time
if incident.updated_at and incident.created_at:
time_diff = incident.updated_at - incident.created_at
total_time += time_diff
count += 1
if count > 0:
avg_time = total_time / count
return avg_time.total_seconds() / 60, 'minutes' # Convert to minutes
return 0, 'minutes'
def _calculate_mttr(self, incidents):
"""Calculate Mean Time to Resolve"""
resolved_incidents = incidents.filter(
status__in=['RESOLVED', 'CLOSED'],
resolved_at__isnull=False
)
if not resolved_incidents.exists():
return 0, 'hours'
total_time = timedelta()
count = 0
for incident in resolved_incidents:
if incident.resolved_at and incident.created_at:
time_diff = incident.resolved_at - incident.created_at
total_time += time_diff
count += 1
if count > 0:
avg_time = total_time / count
return avg_time.total_seconds() / 3600, 'hours' # Convert to hours
return 0, 'hours'
def _calculate_resolution_rate(self, incidents):
"""Calculate resolution rate"""
total_incidents = incidents.count()
if total_incidents == 0:
return 0, 'percentage'
resolved_incidents = incidents.filter(
status__in=['RESOLVED', 'CLOSED']
).count()
rate = (resolved_incidents / total_incidents) * 100
return rate, 'percentage'
def _calculate_availability(self, incidents):
"""Calculate service availability"""
# Simplified availability calculation
# In practice, you'd need more sophisticated uptime tracking
total_incidents = incidents.count()
if total_incidents == 0:
return 100, 'percentage'
# Assume availability decreases with incident count
# This is a simplified calculation
availability = max(0, 100 - (total_incidents * 0.1))
return availability, 'percentage'

View File

@@ -0,0 +1,63 @@
"""
Management command to run anomaly detection
"""
from django.core.management.base import BaseCommand, CommandError
from analytics_predictive_insights.ml.anomaly_detection import AnomalyDetectionService
class Command(BaseCommand):
"""Run anomaly detection using active models"""
help = 'Run anomaly detection using all active anomaly detection models'
def add_arguments(self, parser):
parser.add_argument(
'--model-id',
type=str,
help='Run anomaly detection for a specific model ID only'
)
parser.add_argument(
'--time-window',
type=int,
default=24,
help='Time window in hours for anomaly detection (default: 24)'
)
def handle(self, *args, **options):
"""Handle the command execution"""
model_id = options.get('model_id')
time_window = options.get('time_window', 24)
try:
# Initialize anomaly detection service
anomaly_service = AnomalyDetectionService()
self.stdout.write('Starting anomaly detection...')
# Run anomaly detection
total_anomalies = anomaly_service.run_anomaly_detection(model_id)
if total_anomalies > 0:
self.stdout.write(
self.style.SUCCESS(f'✓ Detected {total_anomalies} anomalies')
)
else:
self.stdout.write(
self.style.WARNING('⚠ No anomalies detected')
)
# Get summary
summary = anomaly_service.get_anomaly_summary(time_window)
self.stdout.write('\nAnomaly Summary:')
self.stdout.write(f' Total anomalies: {summary["total_anomalies"]}')
self.stdout.write(f' Critical: {summary["critical_anomalies"]}')
self.stdout.write(f' High: {summary["high_anomalies"]}')
self.stdout.write(f' Medium: {summary["medium_anomalies"]}')
self.stdout.write(f' Low: {summary["low_anomalies"]}')
self.stdout.write(f' Unresolved: {summary["unresolved_anomalies"]}')
self.stdout.write(f' False positive rate: {summary["false_positive_rate"]:.2f}%')
except Exception as e:
raise CommandError(f'Error running anomaly detection: {str(e)}')

View File

@@ -0,0 +1,108 @@
"""
Management command to train predictive models
"""
from django.core.management.base import BaseCommand, CommandError
from analytics_predictive_insights.models import PredictiveModel
from analytics_predictive_insights.ml.predictive_models import PredictiveModelService
class Command(BaseCommand):
"""Train predictive models"""
help = 'Train predictive models that are in training status'
def add_arguments(self, parser):
parser.add_argument(
'--model-id',
type=str,
help='Train a specific model ID only'
)
parser.add_argument(
'--force',
action='store_true',
help='Force retraining of active models'
)
def handle(self, *args, **options):
"""Handle the command execution"""
model_id = options.get('model_id')
force = options.get('force', False)
try:
# Initialize predictive model service
model_service = PredictiveModelService()
# Get models to train
if model_id:
models = PredictiveModel.objects.filter(id=model_id)
if not models.exists():
raise CommandError(f'No model found with ID: {model_id}')
else:
if force:
models = PredictiveModel.objects.filter(
model_type__in=[
'INCIDENT_PREDICTION',
'SEVERITY_PREDICTION',
'RESOLUTION_TIME_PREDICTION',
'COST_PREDICTION'
]
)
else:
models = PredictiveModel.objects.filter(status='TRAINING')
self.stdout.write(f'Training {models.count()} models...')
total_trained = 0
total_failed = 0
for model in models:
try:
self.stdout.write(f'Training model: {model.name}...')
result = model_service.train_model(str(model.id))
if result['success']:
total_trained += 1
self.stdout.write(
self.style.SUCCESS(f'✓ Successfully trained {model.name}')
)
# Display metrics
if 'metrics' in result:
metrics = result['metrics']
self.stdout.write(f' Accuracy: {metrics.get("accuracy", "N/A")}')
self.stdout.write(f' Precision: {metrics.get("precision", "N/A")}')
self.stdout.write(f' Recall: {metrics.get("recall", "N/A")}')
self.stdout.write(f' F1 Score: {metrics.get("f1_score", "N/A")}')
self.stdout.write(f' R2 Score: {metrics.get("r2_score", "N/A")}')
self.stdout.write(f' Training samples: {result.get("training_samples", "N/A")}')
self.stdout.write(f' Training duration: {result.get("training_duration", "N/A")} seconds')
else:
total_failed += 1
self.stdout.write(
self.style.ERROR(f'✗ Failed to train {model.name}: {result.get("error", "Unknown error")}')
)
except Exception as e:
total_failed += 1
self.stdout.write(
self.style.ERROR(f'✗ Error training {model.name}: {str(e)}')
)
self.stdout.write('\nTraining Summary:')
self.stdout.write(f' Successfully trained: {total_trained}')
self.stdout.write(f' Failed: {total_failed}')
if total_trained > 0:
self.stdout.write(
self.style.SUCCESS(f'✓ Training completed successfully')
)
else:
self.stdout.write(
self.style.WARNING('⚠ No models were successfully trained')
)
except Exception as e:
raise CommandError(f'Error executing command: {str(e)}')

View File

@@ -0,0 +1,311 @@
# Generated by Django 5.2.6 on 2025-09-18 17:16
import django.core.validators
import django.db.models.deletion
import uuid
from django.conf import settings
from django.db import migrations, models
class Migration(migrations.Migration):
initial = True
dependencies = [
('incident_intelligence', '0004_incident_oncall_assignment_incident_sla_override_and_more'),
migrations.swappable_dependency(settings.AUTH_USER_MODEL),
]
operations = [
migrations.CreateModel(
name='HeatmapData',
fields=[
('id', models.UUIDField(default=uuid.uuid4, editable=False, primary_key=True, serialize=False)),
('name', models.CharField(max_length=200)),
('heatmap_type', models.CharField(choices=[('INCIDENT_FREQUENCY', 'Incident Frequency'), ('RESOLUTION_TIME', 'Resolution Time'), ('COST_IMPACT', 'Cost Impact'), ('ANOMALY_DENSITY', 'Anomaly Density'), ('SLA_PERFORMANCE', 'SLA Performance')], max_length=20)),
('time_period_start', models.DateTimeField()),
('time_period_end', models.DateTimeField()),
('time_granularity', models.CharField(choices=[('HOUR', 'Hour'), ('DAY', 'Day'), ('WEEK', 'Week'), ('MONTH', 'Month')], max_length=20)),
('data_points', models.JSONField(help_text='Heatmap data points with coordinates and values')),
('color_scheme', models.CharField(default='viridis', help_text='Color scheme for the heatmap', max_length=50)),
('aggregation_method', models.CharField(choices=[('SUM', 'Sum'), ('AVERAGE', 'Average'), ('COUNT', 'Count'), ('MAX', 'Maximum'), ('MIN', 'Minimum')], max_length=20)),
('created_at', models.DateTimeField(auto_now_add=True)),
('updated_at', models.DateTimeField(auto_now=True)),
],
options={
'ordering': ['-created_at'],
'indexes': [models.Index(fields=['heatmap_type', 'time_period_start'], name='analytics_p_heatmap_61786e_idx'), models.Index(fields=['time_granularity'], name='analytics_p_time_gr_6c8e73_idx')],
},
),
migrations.CreateModel(
name='KPIMetric',
fields=[
('id', models.UUIDField(default=uuid.uuid4, editable=False, primary_key=True, serialize=False)),
('name', models.CharField(max_length=200)),
('description', models.TextField()),
('metric_type', models.CharField(choices=[('MTTA', 'Mean Time to Acknowledge'), ('MTTR', 'Mean Time to Resolve'), ('MTBF', 'Mean Time Between Failures'), ('MTBSI', 'Mean Time Between Service Incidents'), ('AVAILABILITY', 'Service Availability'), ('INCIDENT_COUNT', 'Incident Count'), ('RESOLUTION_RATE', 'Resolution Rate'), ('ESCALATION_RATE', 'Escalation Rate'), ('CUSTOM', 'Custom Metric')], max_length=20)),
('aggregation_type', models.CharField(choices=[('AVERAGE', 'Average'), ('MEDIAN', 'Median'), ('MIN', 'Minimum'), ('MAX', 'Maximum'), ('SUM', 'Sum'), ('COUNT', 'Count'), ('PERCENTILE_95', '95th Percentile'), ('PERCENTILE_99', '99th Percentile')], max_length=20)),
('incident_categories', models.JSONField(default=list, help_text='List of incident categories this metric applies to')),
('incident_severities', models.JSONField(default=list, help_text='List of incident severities this metric applies to')),
('incident_priorities', models.JSONField(default=list, help_text='List of incident priorities this metric applies to')),
('calculation_formula', models.TextField(blank=True, help_text='Custom calculation formula for complex metrics', null=True)),
('time_window_hours', models.PositiveIntegerField(default=24, help_text='Time window for metric calculation in hours')),
('is_active', models.BooleanField(default=True)),
('is_system_metric', models.BooleanField(default=False, help_text='Whether this is a system-defined metric')),
('created_at', models.DateTimeField(auto_now_add=True)),
('updated_at', models.DateTimeField(auto_now=True)),
('created_by', models.ForeignKey(null=True, on_delete=django.db.models.deletion.SET_NULL, to=settings.AUTH_USER_MODEL)),
],
options={
'ordering': ['name'],
},
),
migrations.CreateModel(
name='KPIMeasurement',
fields=[
('id', models.UUIDField(default=uuid.uuid4, editable=False, primary_key=True, serialize=False)),
('value', models.DecimalField(decimal_places=4, max_digits=15)),
('unit', models.CharField(help_text='Unit of measurement (minutes, hours, percentage, etc.)', max_length=50)),
('measurement_period_start', models.DateTimeField()),
('measurement_period_end', models.DateTimeField()),
('incident_count', models.PositiveIntegerField(default=0, help_text='Number of incidents included in this measurement')),
('sample_size', models.PositiveIntegerField(default=0, help_text='Total sample size for this measurement')),
('metadata', models.JSONField(default=dict, help_text='Additional metadata for this measurement')),
('calculated_at', models.DateTimeField(auto_now_add=True)),
('metric', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, related_name='measurements', to='analytics_predictive_insights.kpimetric')),
],
options={
'ordering': ['-calculated_at'],
},
),
migrations.CreateModel(
name='PredictiveModel',
fields=[
('id', models.UUIDField(default=uuid.uuid4, editable=False, primary_key=True, serialize=False)),
('name', models.CharField(max_length=200)),
('description', models.TextField()),
('model_type', models.CharField(choices=[('ANOMALY_DETECTION', 'Anomaly Detection'), ('INCIDENT_PREDICTION', 'Incident Prediction'), ('SEVERITY_PREDICTION', 'Severity Prediction'), ('RESOLUTION_TIME_PREDICTION', 'Resolution Time Prediction'), ('ESCALATION_PREDICTION', 'Escalation Prediction'), ('COST_PREDICTION', 'Cost Impact Prediction')], max_length=30)),
('algorithm_type', models.CharField(choices=[('ISOLATION_FOREST', 'Isolation Forest'), ('LSTM', 'Long Short-Term Memory'), ('RANDOM_FOREST', 'Random Forest'), ('XGBOOST', 'XGBoost'), ('SVM', 'Support Vector Machine'), ('NEURAL_NETWORK', 'Neural Network'), ('ARIMA', 'ARIMA'), ('PROPHET', 'Prophet')], max_length=20)),
('model_config', models.JSONField(default=dict, help_text='Model-specific configuration parameters')),
('feature_columns', models.JSONField(default=list, help_text='List of feature columns used by the model')),
('target_column', models.CharField(help_text='Target column for prediction', max_length=100)),
('training_data_period_days', models.PositiveIntegerField(default=90, help_text='Number of days of training data to use')),
('min_training_samples', models.PositiveIntegerField(default=100, help_text='Minimum number of samples required for training')),
('accuracy_score', models.FloatField(blank=True, null=True, validators=[django.core.validators.MinValueValidator(0.0), django.core.validators.MaxValueValidator(1.0)])),
('precision_score', models.FloatField(blank=True, null=True, validators=[django.core.validators.MinValueValidator(0.0), django.core.validators.MaxValueValidator(1.0)])),
('recall_score', models.FloatField(blank=True, null=True, validators=[django.core.validators.MinValueValidator(0.0), django.core.validators.MaxValueValidator(1.0)])),
('f1_score', models.FloatField(blank=True, null=True, validators=[django.core.validators.MinValueValidator(0.0), django.core.validators.MaxValueValidator(1.0)])),
('status', models.CharField(choices=[('TRAINING', 'Training'), ('ACTIVE', 'Active'), ('INACTIVE', 'Inactive'), ('RETRAINING', 'Retraining'), ('ERROR', 'Error')], default='TRAINING', max_length=20)),
('version', models.CharField(default='1.0', max_length=20)),
('model_file_path', models.CharField(blank=True, help_text='Path to the trained model file', max_length=500, null=True)),
('last_trained_at', models.DateTimeField(blank=True, null=True)),
('training_duration_seconds', models.PositiveIntegerField(blank=True, null=True)),
('training_samples_count', models.PositiveIntegerField(blank=True, null=True)),
('auto_retrain_enabled', models.BooleanField(default=True)),
('retrain_frequency_days', models.PositiveIntegerField(default=7)),
('performance_threshold', models.FloatField(default=0.8, help_text='Performance threshold below which model should be retrained', validators=[django.core.validators.MinValueValidator(0.0), django.core.validators.MaxValueValidator(1.0)])),
('created_at', models.DateTimeField(auto_now_add=True)),
('updated_at', models.DateTimeField(auto_now=True)),
('created_by', models.ForeignKey(null=True, on_delete=django.db.models.deletion.SET_NULL, to=settings.AUTH_USER_MODEL)),
],
options={
'ordering': ['-created_at'],
},
),
migrations.CreateModel(
name='PredictiveInsight',
fields=[
('id', models.UUIDField(default=uuid.uuid4, editable=False, primary_key=True, serialize=False)),
('insight_type', models.CharField(choices=[('INCIDENT_PREDICTION', 'Incident Prediction'), ('SEVERITY_PREDICTION', 'Severity Prediction'), ('RESOLUTION_TIME_PREDICTION', 'Resolution Time Prediction'), ('COST_PREDICTION', 'Cost Prediction'), ('TREND_ANALYSIS', 'Trend Analysis'), ('PATTERN_DETECTION', 'Pattern Detection')], max_length=30)),
('title', models.CharField(max_length=200)),
('description', models.TextField()),
('confidence_level', models.CharField(choices=[('LOW', 'Low Confidence'), ('MEDIUM', 'Medium Confidence'), ('HIGH', 'High Confidence'), ('VERY_HIGH', 'Very High Confidence')], max_length=20)),
('confidence_score', models.FloatField(validators=[django.core.validators.MinValueValidator(0.0), django.core.validators.MaxValueValidator(1.0)])),
('predicted_value', models.JSONField(help_text='Predicted value or values')),
('prediction_horizon', models.PositiveIntegerField(help_text='Prediction horizon in hours')),
('prediction_date', models.DateTimeField(help_text='When the prediction is for')),
('input_features', models.JSONField(help_text='Input features used for the prediction')),
('supporting_evidence', models.JSONField(default=list, help_text='Supporting evidence for the prediction')),
('affected_services', models.JSONField(default=list, help_text='Services that may be affected')),
('recommendations', models.JSONField(default=list, help_text='AI-generated recommendations based on the insight')),
('risk_assessment', models.TextField(blank=True, help_text='Risk assessment based on the prediction', null=True)),
('is_acknowledged', models.BooleanField(default=False)),
('acknowledged_at', models.DateTimeField(blank=True, null=True)),
('is_validated', models.BooleanField(default=False)),
('actual_value', models.JSONField(blank=True, help_text='Actual value when prediction is validated', null=True)),
('validation_accuracy', models.FloatField(blank=True, null=True, validators=[django.core.validators.MinValueValidator(0.0), django.core.validators.MaxValueValidator(1.0)])),
('generated_at', models.DateTimeField(auto_now_add=True)),
('expires_at', models.DateTimeField(help_text='When this insight expires')),
('acknowledged_by', models.ForeignKey(blank=True, null=True, on_delete=django.db.models.deletion.SET_NULL, related_name='acknowledged_insights', to=settings.AUTH_USER_MODEL)),
('related_incidents', models.ManyToManyField(blank=True, related_name='predictive_insights', to='incident_intelligence.incident')),
('model', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, related_name='insights', to='analytics_predictive_insights.predictivemodel')),
],
options={
'ordering': ['-generated_at'],
},
),
migrations.CreateModel(
name='AnomalyDetection',
fields=[
('id', models.UUIDField(default=uuid.uuid4, editable=False, primary_key=True, serialize=False)),
('anomaly_type', models.CharField(choices=[('STATISTICAL', 'Statistical Anomaly'), ('TEMPORAL', 'Temporal Anomaly'), ('PATTERN', 'Pattern Anomaly'), ('THRESHOLD', 'Threshold Breach'), ('BEHAVIORAL', 'Behavioral Anomaly')], max_length=20)),
('severity', models.CharField(choices=[('LOW', 'Low'), ('MEDIUM', 'Medium'), ('HIGH', 'High'), ('CRITICAL', 'Critical')], max_length=20)),
('status', models.CharField(choices=[('DETECTED', 'Detected'), ('INVESTIGATING', 'Investigating'), ('CONFIRMED', 'Confirmed'), ('FALSE_POSITIVE', 'False Positive'), ('RESOLVED', 'Resolved')], default='DETECTED', max_length=20)),
('confidence_score', models.FloatField(validators=[django.core.validators.MinValueValidator(0.0), django.core.validators.MaxValueValidator(1.0)])),
('anomaly_score', models.FloatField(help_text='Raw anomaly score from the model')),
('threshold_used', models.FloatField(help_text='Threshold used for anomaly detection')),
('detected_at', models.DateTimeField(auto_now_add=True)),
('time_window_start', models.DateTimeField()),
('time_window_end', models.DateTimeField()),
('affected_services', models.JSONField(default=list, help_text='Services affected by this anomaly')),
('affected_metrics', models.JSONField(default=list, help_text='Metrics that showed anomalous behavior')),
('description', models.TextField(help_text='Description of the anomaly')),
('root_cause_analysis', models.TextField(blank=True, help_text='Root cause analysis of the anomaly', null=True)),
('impact_assessment', models.TextField(blank=True, help_text="Assessment of the anomaly's impact", null=True)),
('actions_taken', models.JSONField(default=list, help_text='Actions taken in response to the anomaly')),
('resolved_at', models.DateTimeField(blank=True, null=True)),
('metadata', models.JSONField(default=dict, help_text='Additional metadata for this anomaly')),
('related_incidents', models.ManyToManyField(blank=True, related_name='anomaly_detections', to='incident_intelligence.incident')),
('resolved_by', models.ForeignKey(blank=True, null=True, on_delete=django.db.models.deletion.SET_NULL, related_name='resolved_anomalies', to=settings.AUTH_USER_MODEL)),
('model', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, related_name='anomaly_detections', to='analytics_predictive_insights.predictivemodel')),
],
options={
'ordering': ['-detected_at'],
},
),
migrations.CreateModel(
name='CostImpactAnalysis',
fields=[
('id', models.UUIDField(default=uuid.uuid4, editable=False, primary_key=True, serialize=False)),
('cost_type', models.CharField(choices=[('DOWNTIME', 'Downtime Cost'), ('LOST_REVENUE', 'Lost Revenue'), ('PENALTY', 'Penalty Cost'), ('RESOURCE_COST', 'Resource Cost'), ('REPUTATION_COST', 'Reputation Cost'), ('COMPLIANCE_COST', 'Compliance Cost')], max_length=20)),
('cost_amount', models.DecimalField(decimal_places=2, help_text='Cost amount in USD', max_digits=15)),
('currency', models.CharField(default='USD', max_length=3)),
('calculation_method', models.CharField(help_text='Method used to calculate the cost', max_length=50)),
('calculation_details', models.JSONField(default=dict, help_text='Detailed breakdown of cost calculation')),
('downtime_hours', models.DecimalField(blank=True, decimal_places=2, help_text='Total downtime in hours', max_digits=10, null=True)),
('affected_users', models.PositiveIntegerField(blank=True, help_text='Number of users affected', null=True)),
('revenue_impact', models.DecimalField(blank=True, decimal_places=2, help_text='Revenue impact in USD', max_digits=15, null=True)),
('business_unit', models.CharField(blank=True, help_text='Business unit affected', max_length=100, null=True)),
('service_tier', models.CharField(blank=True, help_text='Service tier (e.g., Premium, Standard)', max_length=50, null=True)),
('is_validated', models.BooleanField(default=False)),
('validated_at', models.DateTimeField(blank=True, null=True)),
('validation_notes', models.TextField(blank=True, null=True)),
('created_at', models.DateTimeField(auto_now_add=True)),
('updated_at', models.DateTimeField(auto_now=True)),
('incident', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, related_name='cost_analyses', to='incident_intelligence.incident')),
('validated_by', models.ForeignKey(blank=True, null=True, on_delete=django.db.models.deletion.SET_NULL, related_name='validated_cost_analyses', to=settings.AUTH_USER_MODEL)),
],
options={
'ordering': ['-created_at'],
'indexes': [models.Index(fields=['incident', 'cost_type'], name='analytics_p_inciden_c66cda_idx'), models.Index(fields=['cost_amount'], name='analytics_p_cost_am_92cb70_idx'), models.Index(fields=['is_validated'], name='analytics_p_is_vali_bf5116_idx')],
},
),
migrations.CreateModel(
name='DashboardConfiguration',
fields=[
('id', models.UUIDField(default=uuid.uuid4, editable=False, primary_key=True, serialize=False)),
('name', models.CharField(max_length=200)),
('description', models.TextField()),
('dashboard_type', models.CharField(choices=[('EXECUTIVE', 'Executive Dashboard'), ('OPERATIONAL', 'Operational Dashboard'), ('TECHNICAL', 'Technical Dashboard'), ('CUSTOM', 'Custom Dashboard')], max_length=20)),
('layout_config', models.JSONField(default=dict, help_text='Dashboard layout configuration')),
('widget_configs', models.JSONField(default=list, help_text='Configuration for dashboard widgets')),
('is_public', models.BooleanField(default=False)),
('allowed_roles', models.JSONField(default=list, help_text='List of roles that can access this dashboard')),
('auto_refresh_enabled', models.BooleanField(default=True)),
('refresh_interval_seconds', models.PositiveIntegerField(default=300)),
('is_active', models.BooleanField(default=True)),
('created_at', models.DateTimeField(auto_now_add=True)),
('updated_at', models.DateTimeField(auto_now=True)),
('allowed_users', models.ManyToManyField(blank=True, related_name='accessible_dashboards', to=settings.AUTH_USER_MODEL)),
('created_by', models.ForeignKey(null=True, on_delete=django.db.models.deletion.SET_NULL, to=settings.AUTH_USER_MODEL)),
],
options={
'ordering': ['name'],
'indexes': [models.Index(fields=['dashboard_type', 'is_active'], name='analytics_p_dashboa_a8155f_idx'), models.Index(fields=['is_public'], name='analytics_p_is_publ_c4c7bd_idx')],
},
),
migrations.CreateModel(
name='IncidentRecurrenceAnalysis',
fields=[
('id', models.UUIDField(default=uuid.uuid4, editable=False, primary_key=True, serialize=False)),
('recurrence_type', models.CharField(choices=[('EXACT_DUPLICATE', 'Exact Duplicate'), ('SIMILAR_PATTERN', 'Similar Pattern'), ('SEASONAL', 'Seasonal Recurrence'), ('TREND', 'Trend-based Recurrence'), ('CASCADE', 'Cascade Effect')], max_length=20)),
('confidence_score', models.FloatField(validators=[django.core.validators.MinValueValidator(0.0), django.core.validators.MaxValueValidator(1.0)])),
('recurrence_rate', models.FloatField(help_text='Rate of recurrence (incidents per time period)')),
('common_keywords', models.JSONField(default=list, help_text='Common keywords across recurring incidents')),
('common_categories', models.JSONField(default=list, help_text='Common categories across recurring incidents')),
('time_pattern', models.JSONField(default=dict, help_text='Time-based pattern analysis')),
('total_affected_users', models.PositiveIntegerField(default=0)),
('total_downtime_hours', models.DecimalField(decimal_places=2, default=0, max_digits=10)),
('estimated_cost_impact', models.DecimalField(decimal_places=2, default=0, max_digits=15)),
('prevention_recommendations', models.JSONField(default=list, help_text='AI-generated recommendations to prevent recurrence')),
('automation_opportunities', models.JSONField(default=list, help_text='Potential automation opportunities identified')),
('is_resolved', models.BooleanField(default=False)),
('resolution_actions', models.JSONField(default=list, help_text='Actions taken to resolve the recurrence pattern')),
('created_at', models.DateTimeField(auto_now_add=True)),
('updated_at', models.DateTimeField(auto_now=True)),
('model_version', models.CharField(default='v1.0', max_length=50)),
('primary_incident', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, related_name='recurrence_analyses_as_primary', to='incident_intelligence.incident')),
('recurring_incidents', models.ManyToManyField(related_name='recurrence_analyses_as_recurring', to='incident_intelligence.incident')),
],
options={
'ordering': ['-confidence_score', '-created_at'],
'indexes': [models.Index(fields=['recurrence_type', 'confidence_score'], name='analytics_p_recurre_420fe9_idx'), models.Index(fields=['is_resolved'], name='analytics_p_is_reso_cdecdd_idx')],
},
),
migrations.AddIndex(
model_name='kpimetric',
index=models.Index(fields=['metric_type', 'is_active'], name='analytics_p_metric__8e1291_idx'),
),
migrations.AddIndex(
model_name='kpimetric',
index=models.Index(fields=['incident_categories'], name='analytics_p_inciden_fcc290_idx'),
),
migrations.AddIndex(
model_name='kpimetric',
index=models.Index(fields=['incident_severities'], name='analytics_p_inciden_601d71_idx'),
),
migrations.AddIndex(
model_name='kpimeasurement',
index=models.Index(fields=['metric', 'measurement_period_start'], name='analytics_p_metric__5c1184_idx'),
),
migrations.AddIndex(
model_name='kpimeasurement',
index=models.Index(fields=['calculated_at'], name='analytics_p_calcula_e8b072_idx'),
),
migrations.AddIndex(
model_name='predictivemodel',
index=models.Index(fields=['model_type', 'status'], name='analytics_p_model_t_b1e3f4_idx'),
),
migrations.AddIndex(
model_name='predictivemodel',
index=models.Index(fields=['algorithm_type'], name='analytics_p_algorit_1f51a1_idx'),
),
migrations.AddIndex(
model_name='predictivemodel',
index=models.Index(fields=['status'], name='analytics_p_status_ad4300_idx'),
),
migrations.AddIndex(
model_name='predictiveinsight',
index=models.Index(fields=['insight_type', 'confidence_score'], name='analytics_p_insight_ac65ec_idx'),
),
migrations.AddIndex(
model_name='predictiveinsight',
index=models.Index(fields=['prediction_date'], name='analytics_p_predict_d606fb_idx'),
),
migrations.AddIndex(
model_name='predictiveinsight',
index=models.Index(fields=['is_acknowledged'], name='analytics_p_is_ackn_16014e_idx'),
),
migrations.AddIndex(
model_name='anomalydetection',
index=models.Index(fields=['anomaly_type', 'severity'], name='analytics_p_anomaly_d51ee4_idx'),
),
migrations.AddIndex(
model_name='anomalydetection',
index=models.Index(fields=['status', 'detected_at'], name='analytics_p_status_c15b14_idx'),
),
migrations.AddIndex(
model_name='anomalydetection',
index=models.Index(fields=['confidence_score'], name='analytics_p_confide_c99920_idx'),
),
]

View File

@@ -0,0 +1,32 @@
# Generated by Django 5.2.6 on 2025-09-18 17:19
import django.db.models.deletion
from django.db import migrations, models
class Migration(migrations.Migration):
dependencies = [
('analytics_predictive_insights', '0001_initial'),
('automation_orchestration', '0002_autoremediationexecution_sla_instance_and_more'),
('security', '0002_user_emergency_contact_user_oncall_preferences_and_more'),
('sla_oncall', '0001_initial'),
]
operations = [
migrations.AddField(
model_name='costimpactanalysis',
name='sla_instance',
field=models.ForeignKey(blank=True, help_text='Related SLA instance for cost calculation', null=True, on_delete=django.db.models.deletion.SET_NULL, related_name='cost_analyses', to='sla_oncall.slainstance'),
),
migrations.AddField(
model_name='incidentrecurrenceanalysis',
name='suggested_runbooks',
field=models.ManyToManyField(blank=True, help_text='Runbooks suggested to prevent recurrence', related_name='recurrence_analyses', to='automation_orchestration.runbook'),
),
migrations.AddField(
model_name='predictiveinsight',
name='data_classification',
field=models.ForeignKey(blank=True, help_text='Data classification level for this insight', null=True, on_delete=django.db.models.deletion.SET_NULL, to='security.dataclassification'),
),
]
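# Hedged usage sketch of the relations this migration introduces (comments only;
# `cost_analysis`, `sla_instance`, `analysis`, and `runbook` are assumed to be
# existing instances in a configured Django shell):
#
#   cost_analysis.sla_instance = sla_instance      # nulled if the SLA instance is deleted
#   cost_analysis.save(update_fields=['sla_instance'])
#
#   analysis.suggested_runbooks.add(runbook)       # reverse accessor: runbook.recurrence_analyses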

View File

@@ -0,0 +1 @@
# ML components for analytics and predictive insights

View File

@@ -0,0 +1,491 @@
"""
ML-based anomaly detection for incident management
Implements various anomaly detection algorithms for identifying unusual patterns
"""
import numpy as np
import pandas as pd
from typing import Dict, List, Tuple, Optional, Any
from datetime import datetime, timedelta
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA
from scipy import stats
import logging
from django.utils import timezone
from django.db.models import Q, Avg, Count, Sum
from incident_intelligence.models import Incident
from ..models import AnomalyDetection, PredictiveModel
logger = logging.getLogger(__name__)
class AnomalyDetector:
"""Base class for anomaly detection algorithms"""
def __init__(self, model_config: Dict[str, Any] = None):
self.model_config = model_config or {}
self.scaler = StandardScaler()
self.is_fitted = False
def fit(self, data: pd.DataFrame) -> None:
"""Fit the anomaly detection model"""
raise NotImplementedError
def predict(self, data: pd.DataFrame) -> np.ndarray:
"""Predict anomalies in the data"""
raise NotImplementedError
def get_anomaly_scores(self, data: pd.DataFrame) -> np.ndarray:
"""Get anomaly scores for the data"""
raise NotImplementedError
class StatisticalAnomalyDetector(AnomalyDetector):
"""Statistical anomaly detection using z-score and IQR methods"""
def __init__(self, model_config: Dict[str, Any] = None):
super().__init__(model_config)
self.z_threshold = self.model_config.get('z_threshold', 3.0)
self.iqr_multiplier = self.model_config.get('iqr_multiplier', 1.5)
self.stats_cache = {}
def fit(self, data: pd.DataFrame) -> None:
"""Calculate statistical parameters for anomaly detection"""
for column in data.columns:
if data[column].dtype in ['int64', 'float64']:
values = data[column].dropna()
if len(values) > 0:
self.stats_cache[column] = {
'mean': values.mean(),
'std': values.std(),
'q1': values.quantile(0.25),
'q3': values.quantile(0.75),
'iqr': values.quantile(0.75) - values.quantile(0.25)
}
self.is_fitted = True
def predict(self, data: pd.DataFrame) -> np.ndarray:
"""Predict anomalies using statistical methods"""
if not self.is_fitted:
raise ValueError("Model must be fitted before prediction")
anomaly_flags = np.zeros(len(data), dtype=bool)
for column in data.columns:
if column in self.stats_cache and data[column].dtype in ['int64', 'float64']:
values = data[column].dropna()
if len(values) > 0:
                    col_stats = self.stats_cache[column]  # renamed to avoid shadowing scipy's `stats`
                    # Z-score method (epsilon guards constant columns with std == 0)
                    z_scores = np.abs((values - col_stats['mean']) / (col_stats['std'] + 1e-8))
                    z_anomalies = z_scores > self.z_threshold
                    # IQR method
                    lower_bound = col_stats['q1'] - self.iqr_multiplier * col_stats['iqr']
                    upper_bound = col_stats['q3'] + self.iqr_multiplier * col_stats['iqr']
                    iqr_anomalies = (values < lower_bound) | (values > upper_bound)
                    # Combine both methods; positions assume the frame's default integer index
                    column_anomalies = z_anomalies | iqr_anomalies
                    anomaly_flags[values.index] |= column_anomalies.to_numpy()
return anomaly_flags
def get_anomaly_scores(self, data: pd.DataFrame) -> np.ndarray:
"""Get anomaly scores based on z-scores"""
if not self.is_fitted:
raise ValueError("Model must be fitted before prediction")
scores = np.zeros(len(data))
for column in data.columns:
if column in self.stats_cache and data[column].dtype in ['int64', 'float64']:
values = data[column].dropna()
if len(values) > 0:
                    col_stats = self.stats_cache[column]
                    z_scores = np.abs((values - col_stats['mean']) / (col_stats['std'] + 1e-8))
                    scores[values.index] += z_scores.to_numpy()
return scores
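# Worked example of the two rules above (illustrative numbers only): with
# mean=10, std=2 and z_threshold=3.0, a value of 17 gives z = |17 - 10| / 2 = 3.5
# and is flagged by the z-score rule. With q1=8, q3=12 (iqr=4) and
# iqr_multiplier=1.5, the IQR bounds are [8 - 6, 12 + 6] = [2, 18], so 17 falls
# inside them; the value is still flagged because the two rules are OR-ed.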
class IsolationForestAnomalyDetector(AnomalyDetector):
"""Isolation Forest anomaly detection"""
def __init__(self, model_config: Dict[str, Any] = None):
super().__init__(model_config)
self.contamination = self.model_config.get('contamination', 0.1)
self.n_estimators = self.model_config.get('n_estimators', 100)
self.model = IsolationForest(
contamination=self.contamination,
n_estimators=self.n_estimators,
random_state=42
)
def fit(self, data: pd.DataFrame) -> None:
"""Fit the Isolation Forest model"""
# Select numeric columns only
numeric_data = data.select_dtypes(include=[np.number])
if numeric_data.empty:
raise ValueError("No numeric columns found in data")
# Handle missing values
numeric_data = numeric_data.fillna(numeric_data.median())
# Scale the data
scaled_data = self.scaler.fit_transform(numeric_data)
# Fit the model
self.model.fit(scaled_data)
self.is_fitted = True
def predict(self, data: pd.DataFrame) -> np.ndarray:
"""Predict anomalies using Isolation Forest"""
if not self.is_fitted:
raise ValueError("Model must be fitted before prediction")
# Select numeric columns only
numeric_data = data.select_dtypes(include=[np.number])
if numeric_data.empty:
return np.zeros(len(data), dtype=bool)
# Handle missing values
numeric_data = numeric_data.fillna(numeric_data.median())
# Scale the data
scaled_data = self.scaler.transform(numeric_data)
# Predict anomalies (-1 for anomalies, 1 for normal)
predictions = self.model.predict(scaled_data)
return predictions == -1
def get_anomaly_scores(self, data: pd.DataFrame) -> np.ndarray:
"""Get anomaly scores from Isolation Forest"""
if not self.is_fitted:
raise ValueError("Model must be fitted before prediction")
# Select numeric columns only
numeric_data = data.select_dtypes(include=[np.number])
if numeric_data.empty:
return np.zeros(len(data))
# Handle missing values
numeric_data = numeric_data.fillna(numeric_data.median())
# Scale the data
scaled_data = self.scaler.transform(numeric_data)
# Get anomaly scores
scores = self.model.decision_function(scaled_data)
# Convert to positive scores (higher = more anomalous)
return -scores
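# Note on tuning: `contamination` is the expected anomaly fraction, so with the
# default of 0.1 IsolationForest sets its threshold to flag roughly 10% of the
# fitted window as anomalous regardless of how extreme the points are; lower it
# for quieter feeds.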
class TemporalAnomalyDetector(AnomalyDetector):
"""Temporal anomaly detection for time series data"""
def __init__(self, model_config: Dict[str, Any] = None):
super().__init__(model_config)
self.window_size = self.model_config.get('window_size', 24) # hours
self.threshold_multiplier = self.model_config.get('threshold_multiplier', 2.0)
self.temporal_stats = {}
def fit(self, data: pd.DataFrame) -> None:
"""Calculate temporal statistics for anomaly detection"""
if 'timestamp' not in data.columns:
raise ValueError("Timestamp column is required for temporal anomaly detection")
# Sort by timestamp
data_sorted = data.sort_values('timestamp')
# Calculate rolling statistics
for column in data_sorted.columns:
if column != 'timestamp' and data_sorted[column].dtype in ['int64', 'float64']:
# Calculate rolling mean and std
rolling_mean = data_sorted[column].rolling(window=self.window_size, min_periods=1).mean()
rolling_std = data_sorted[column].rolling(window=self.window_size, min_periods=1).std()
self.temporal_stats[column] = {
'rolling_mean': rolling_mean,
'rolling_std': rolling_std
}
self.is_fitted = True
def predict(self, data: pd.DataFrame) -> np.ndarray:
"""Predict temporal anomalies"""
if not self.is_fitted:
raise ValueError("Model must be fitted before prediction")
if 'timestamp' not in data.columns:
return np.zeros(len(data), dtype=bool)
# Sort by timestamp
data_sorted = data.sort_values('timestamp')
anomaly_flags = np.zeros(len(data_sorted), dtype=bool)
        for column in data_sorted.columns:
            if column in self.temporal_stats and column != 'timestamp':
                values = data_sorted[column]
                # Note: the stored rolling statistics are index-aligned, so this
                # assumes predict() sees the same frame that fit() was given, as
                # AnomalyDetectionEngine.detect_anomalies does.
                rolling_mean = self.temporal_stats[column]['rolling_mean']
                rolling_std = self.temporal_stats[column]['rolling_std']
                # Calculate z-scores based on rolling statistics
                z_scores = np.abs((values - rolling_mean) / (rolling_std + 1e-8))
                column_anomalies = z_scores > self.threshold_multiplier
                anomaly_flags |= column_anomalies.to_numpy()
        # Map flags back to the caller's original row order before returning
        return pd.Series(anomaly_flags, index=data_sorted.index).loc[data.index].to_numpy()
def get_anomaly_scores(self, data: pd.DataFrame) -> np.ndarray:
"""Get temporal anomaly scores"""
if not self.is_fitted:
raise ValueError("Model must be fitted before prediction")
if 'timestamp' not in data.columns:
return np.zeros(len(data))
# Sort by timestamp
data_sorted = data.sort_values('timestamp')
scores = np.zeros(len(data_sorted))
for column in data_sorted.columns:
if column in self.temporal_stats and column != 'timestamp':
values = data_sorted[column]
rolling_mean = self.temporal_stats[column]['rolling_mean']
rolling_std = self.temporal_stats[column]['rolling_std']
# Calculate z-scores based on rolling statistics
z_scores = np.abs((values - rolling_mean) / (rolling_std + 1e-8))
                scores += z_scores.fillna(0).to_numpy()
        # Return scores in the caller's original row order
        return pd.Series(scores, index=data_sorted.index).loc[data.index].to_numpy()
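# Worked example of the rolling threshold (illustrative numbers only): with
# window_size=24 and threshold_multiplier=2.0, an hourly series whose trailing
# 24h mean is 5 with std 1 flags a new reading of 9 (z = |9 - 5| / 1 = 4 > 2)
# but not a reading of 6 (z = 1).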
class AnomalyDetectionEngine:
"""Main engine for anomaly detection"""
def __init__(self):
self.detectors = {
'statistical': StatisticalAnomalyDetector,
'isolation_forest': IsolationForestAnomalyDetector,
'temporal': TemporalAnomalyDetector
}
def create_detector(self, algorithm_type: str, model_config: Dict[str, Any] = None) -> AnomalyDetector:
"""Create an anomaly detector instance"""
if algorithm_type not in self.detectors:
raise ValueError(f"Unknown algorithm type: {algorithm_type}")
return self.detectors[algorithm_type](model_config)
def prepare_incident_data(self, time_window_hours: int = 24) -> pd.DataFrame:
"""Prepare incident data for anomaly detection"""
end_time = timezone.now()
start_time = end_time - timedelta(hours=time_window_hours)
# Get incidents from the time window
incidents = Incident.objects.filter(
created_at__gte=start_time,
created_at__lte=end_time
).values(
'id', 'created_at', 'severity', 'category', 'subcategory',
'affected_users', 'estimated_downtime', 'status'
)
if not incidents:
return pd.DataFrame()
# Convert to DataFrame
df = pd.DataFrame(list(incidents))
# Convert datetime to timestamp
df['timestamp'] = pd.to_datetime(df['created_at']).astype('int64') // 10**9
# Encode categorical variables
severity_mapping = {'LOW': 1, 'MEDIUM': 2, 'HIGH': 3, 'CRITICAL': 4, 'EMERGENCY': 5}
df['severity_encoded'] = df['severity'].map(severity_mapping).fillna(0)
# Convert estimated_downtime to hours
df['downtime_hours'] = df['estimated_downtime'].apply(
lambda x: x.total_seconds() / 3600 if x else 0
)
# Create time-based features
df['hour_of_day'] = pd.to_datetime(df['created_at']).dt.hour
df['day_of_week'] = pd.to_datetime(df['created_at']).dt.dayofweek
return df
def detect_anomalies(self, model: PredictiveModel, time_window_hours: int = 24) -> List[Dict[str, Any]]:
"""Detect anomalies using the specified model"""
try:
# Prepare data
data = self.prepare_incident_data(time_window_hours)
if data.empty:
logger.warning("No incident data found for anomaly detection")
return []
# Create detector
detector = self.create_detector(
model.algorithm_type,
model.model_config
)
# Fit the model
detector.fit(data)
# Predict anomalies
anomaly_flags = detector.predict(data)
anomaly_scores = detector.get_anomaly_scores(data)
# Process results
anomalies = []
for idx, is_anomaly in enumerate(anomaly_flags):
if is_anomaly:
incident_data = data.iloc[idx]
anomaly_data = {
'model': model,
'anomaly_type': self._determine_anomaly_type(model.algorithm_type),
'severity': self._determine_severity(anomaly_scores[idx]),
'confidence_score': min(1.0, max(0.0, anomaly_scores[idx] / 10.0)),
'anomaly_score': float(anomaly_scores[idx]),
'threshold_used': self._get_threshold(model.algorithm_type, model.model_config),
'time_window_start': timezone.now() - timedelta(hours=time_window_hours),
'time_window_end': timezone.now(),
'description': self._generate_description(incident_data, anomaly_scores[idx]),
'affected_services': [incident_data.get('category', 'Unknown')],
'affected_metrics': ['incident_frequency', 'severity_distribution'],
'metadata': {
'incident_id': str(incident_data['id']),
'detection_algorithm': model.algorithm_type,
'time_window_hours': time_window_hours
}
}
anomalies.append(anomaly_data)
return anomalies
except Exception as e:
logger.error(f"Error in anomaly detection: {str(e)}")
return []
def _determine_anomaly_type(self, algorithm_type: str) -> str:
"""Determine anomaly type based on algorithm"""
mapping = {
'statistical': 'STATISTICAL',
'isolation_forest': 'PATTERN',
'temporal': 'TEMPORAL'
}
return mapping.get(algorithm_type, 'STATISTICAL')
def _determine_severity(self, anomaly_score: float) -> str:
"""Determine severity based on anomaly score"""
if anomaly_score >= 5.0:
return 'CRITICAL'
elif anomaly_score >= 3.0:
return 'HIGH'
elif anomaly_score >= 2.0:
return 'MEDIUM'
else:
return 'LOW'
def _get_threshold(self, algorithm_type: str, model_config: Dict[str, Any]) -> float:
"""Get threshold used for anomaly detection"""
if algorithm_type == 'statistical':
return model_config.get('z_threshold', 3.0)
elif algorithm_type == 'isolation_forest':
return model_config.get('contamination', 0.1)
elif algorithm_type == 'temporal':
return model_config.get('threshold_multiplier', 2.0)
return 1.0
def _generate_description(self, incident_data: pd.Series, anomaly_score: float) -> str:
"""Generate description for the anomaly"""
severity = incident_data.get('severity', 'Unknown')
category = incident_data.get('category', 'Unknown')
affected_users = incident_data.get('affected_users', 0)
return f"Anomalous incident detected: {severity} severity incident in {category} category affecting {affected_users} users. Anomaly score: {anomaly_score:.2f}"
class AnomalyDetectionService:
"""Service for managing anomaly detection"""
def __init__(self):
self.engine = AnomalyDetectionEngine()
def run_anomaly_detection(self, model_id: str = None) -> int:
"""Run anomaly detection for all active models or a specific model"""
if model_id:
models = PredictiveModel.objects.filter(
id=model_id,
model_type='ANOMALY_DETECTION',
status='ACTIVE'
)
else:
models = PredictiveModel.objects.filter(
model_type='ANOMALY_DETECTION',
status='ACTIVE'
)
total_anomalies = 0
for model in models:
try:
# Detect anomalies
anomalies = self.engine.detect_anomalies(model)
# Save anomalies to database
for anomaly_data in anomalies:
AnomalyDetection.objects.create(**anomaly_data)
total_anomalies += 1
logger.info(f"Detected {len(anomalies)} anomalies using model {model.name}")
except Exception as e:
logger.error(f"Error running anomaly detection for model {model.name}: {str(e)}")
return total_anomalies
def get_anomaly_summary(self, time_window_hours: int = 24) -> Dict[str, Any]:
"""Get summary of recent anomalies"""
end_time = timezone.now()
start_time = end_time - timedelta(hours=time_window_hours)
anomalies = AnomalyDetection.objects.filter(
detected_at__gte=start_time,
detected_at__lte=end_time
)
return {
'total_anomalies': anomalies.count(),
'critical_anomalies': anomalies.filter(severity='CRITICAL').count(),
'high_anomalies': anomalies.filter(severity='HIGH').count(),
'medium_anomalies': anomalies.filter(severity='MEDIUM').count(),
'low_anomalies': anomalies.filter(severity='LOW').count(),
'unresolved_anomalies': anomalies.filter(
status__in=['DETECTED', 'INVESTIGATING']
).count(),
'false_positive_rate': self._calculate_false_positive_rate(anomalies),
'average_confidence': anomalies.aggregate(
avg=Avg('confidence_score')
)['avg'] or 0.0
}
def _calculate_false_positive_rate(self, anomalies) -> float:
"""Calculate false positive rate"""
total_anomalies = anomalies.count()
if total_anomalies == 0:
return 0.0
false_positives = anomalies.filter(status='FALSE_POSITIVE').count()
return (false_positives / total_anomalies) * 100
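if __name__ == "__main__":  # pragma: no cover
    # Hedged demo sketch: exercises StatisticalAnomalyDetector standalone on
    # synthetic data. Running it requires a configured Django environment,
    # because this module imports Django models at import time.
    rng = np.random.default_rng(42)
    demo = pd.DataFrame({"affected_users": rng.normal(100, 10, 200)})
    demo.loc[199, "affected_users"] = 500  # inject one obvious outlier
    det = StatisticalAnomalyDetector({"z_threshold": 3.0})
    det.fit(demo)
    # Expect the injected outlier to be flagged (a stray tail point may be too)
    print(int(det.predict(demo).sum()), "anomalies flagged")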

View File

@@ -0,0 +1,684 @@
"""
ML-based predictive models for incident management
Implements various predictive algorithms for incident prediction, severity prediction, and cost analysis
"""
import numpy as np
import pandas as pd
from typing import Dict, List, Tuple, Optional, Any, Union
from datetime import datetime, timedelta
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, mean_squared_error, r2_score
import joblib
import logging
from django.utils import timezone
from django.db.models import Q, Avg, Count, Sum, Max, Min
from incident_intelligence.models import Incident
from ..models import PredictiveModel, PredictiveInsight, CostImpactAnalysis
logger = logging.getLogger(__name__)
class BasePredictiveModel:
"""Base class for predictive models"""
def __init__(self, model_config: Dict[str, Any] = None):
self.model_config = model_config or {}
self.scaler = StandardScaler()
self.label_encoders = {}
self.is_fitted = False
self.feature_columns = []
self.target_column = None
def prepare_features(self, data: pd.DataFrame) -> pd.DataFrame:
"""Prepare features for model training/prediction"""
raise NotImplementedError
def fit(self, X: pd.DataFrame, y: pd.Series) -> Dict[str, float]:
"""Fit the model and return performance metrics"""
raise NotImplementedError
def predict(self, X: pd.DataFrame) -> np.ndarray:
"""Make predictions"""
raise NotImplementedError
def get_feature_importance(self) -> Dict[str, float]:
"""Get feature importance scores"""
raise NotImplementedError
class IncidentPredictionModel(BasePredictiveModel):
"""Model for predicting incident occurrence"""
def __init__(self, model_config: Dict[str, Any] = None):
super().__init__(model_config)
self.model = RandomForestClassifier(
n_estimators=self.model_config.get('n_estimators', 100),
max_depth=self.model_config.get('max_depth', 10),
random_state=42
)
def prepare_features(self, data: pd.DataFrame) -> pd.DataFrame:
"""Prepare features for incident prediction"""
features = pd.DataFrame()
# Time-based features
if 'timestamp' in data.columns:
timestamp = pd.to_datetime(data['timestamp'])
features['hour_of_day'] = timestamp.dt.hour
features['day_of_week'] = timestamp.dt.dayofweek
features['day_of_month'] = timestamp.dt.day
features['month'] = timestamp.dt.month
features['is_weekend'] = (timestamp.dt.dayofweek >= 5).astype(int)
features['is_business_hours'] = ((timestamp.dt.hour >= 9) & (timestamp.dt.hour <= 17)).astype(int)
# Historical incident features
if 'incident_count_1h' in data.columns:
features['incident_count_1h'] = data['incident_count_1h']
if 'incident_count_24h' in data.columns:
features['incident_count_24h'] = data['incident_count_24h']
if 'avg_severity_24h' in data.columns:
features['avg_severity_24h'] = data['avg_severity_24h']
# System metrics (if available)
system_metrics = ['cpu_usage', 'memory_usage', 'disk_usage', 'network_usage']
for metric in system_metrics:
if metric in data.columns:
features[metric] = data[metric]
# Service-specific features
if 'service_name' in data.columns:
# Encode service names
if 'service_name' not in self.label_encoders:
self.label_encoders['service_name'] = LabelEncoder()
features['service_encoded'] = self.label_encoders['service_name'].fit_transform(data['service_name'])
else:
features['service_encoded'] = self.label_encoders['service_name'].transform(data['service_name'])
return features
def fit(self, X: pd.DataFrame, y: pd.Series) -> Dict[str, float]:
"""Fit the incident prediction model"""
# Prepare features
X_processed = self.prepare_features(X)
self.feature_columns = X_processed.columns.tolist()
# Scale features
X_scaled = self.scaler.fit_transform(X_processed)
# Split data for validation
X_train, X_val, y_train, y_val = train_test_split(
X_scaled, y, test_size=0.2, random_state=42, stratify=y
)
# Fit model
self.model.fit(X_train, y_train)
# Evaluate model
y_pred = self.model.predict(X_val)
y_pred_proba = self.model.predict_proba(X_val)[:, 1]
metrics = {
'accuracy': accuracy_score(y_val, y_pred),
'precision': precision_score(y_val, y_pred, average='weighted'),
'recall': recall_score(y_val, y_pred, average='weighted'),
'f1_score': f1_score(y_val, y_pred, average='weighted')
}
self.is_fitted = True
return metrics
def predict(self, X: pd.DataFrame) -> np.ndarray:
"""Predict incident probability"""
if not self.is_fitted:
raise ValueError("Model must be fitted before prediction")
X_processed = self.prepare_features(X)
X_scaled = self.scaler.transform(X_processed)
# Return probability of incident occurrence
return self.model.predict_proba(X_scaled)[:, 1]
def get_feature_importance(self) -> Dict[str, float]:
"""Get feature importance scores"""
if not self.is_fitted:
return {}
importance_scores = self.model.feature_importances_
return dict(zip(self.feature_columns, importance_scores))
class SeverityPredictionModel(BasePredictiveModel):
"""Model for predicting incident severity"""
def __init__(self, model_config: Dict[str, Any] = None):
super().__init__(model_config)
self.model = RandomForestClassifier(
n_estimators=self.model_config.get('n_estimators', 100),
max_depth=self.model_config.get('max_depth', 10),
random_state=42
)
self.severity_mapping = {
'LOW': 1, 'MEDIUM': 2, 'HIGH': 3, 'CRITICAL': 4, 'EMERGENCY': 5
}
self.reverse_severity_mapping = {v: k for k, v in self.severity_mapping.items()}
def prepare_features(self, data: pd.DataFrame) -> pd.DataFrame:
"""Prepare features for severity prediction"""
features = pd.DataFrame()
# Text-based features
if 'title' in data.columns:
features['title_length'] = data['title'].str.len()
features['title_word_count'] = data['title'].str.split().str.len()
if 'description' in data.columns:
features['description_length'] = data['description'].str.len()
features['description_word_count'] = data['description'].str.split().str.len()
# Categorical features
if 'category' in data.columns:
if 'category' not in self.label_encoders:
self.label_encoders['category'] = LabelEncoder()
features['category_encoded'] = self.label_encoders['category'].fit_transform(data['category'])
else:
features['category_encoded'] = self.label_encoders['category'].transform(data['category'])
if 'subcategory' in data.columns:
if 'subcategory' not in self.label_encoders:
self.label_encoders['subcategory'] = LabelEncoder()
features['subcategory_encoded'] = self.label_encoders['subcategory'].fit_transform(data['subcategory'])
else:
features['subcategory_encoded'] = self.label_encoders['subcategory'].transform(data['subcategory'])
# Impact features
if 'affected_users' in data.columns:
features['affected_users'] = data['affected_users']
features['affected_users_log'] = np.log1p(data['affected_users'])
# Time-based features
if 'created_at' in data.columns:
timestamp = pd.to_datetime(data['created_at'])
features['hour_of_day'] = timestamp.dt.hour
features['day_of_week'] = timestamp.dt.dayofweek
features['is_weekend'] = (timestamp.dt.dayofweek >= 5).astype(int)
features['is_business_hours'] = ((timestamp.dt.hour >= 9) & (timestamp.dt.hour <= 17)).astype(int)
# Historical features
        if 'reporter' in data.columns:
            # Count of previous incidents by reporter (the values() queryset exposes the FK as 'reporter')
            features['reporter_incident_count'] = data.groupby('reporter')['reporter'].transform('count')
return features
def fit(self, X: pd.DataFrame, y: pd.Series) -> Dict[str, float]:
"""Fit the severity prediction model"""
# Prepare features
X_processed = self.prepare_features(X)
self.feature_columns = X_processed.columns.tolist()
# Encode target variable
y_encoded = y.map(self.severity_mapping)
# Scale features
X_scaled = self.scaler.fit_transform(X_processed)
# Split data for validation
X_train, X_val, y_train, y_val = train_test_split(
X_scaled, y_encoded, test_size=0.2, random_state=42, stratify=y_encoded
)
# Fit model
self.model.fit(X_train, y_train)
# Evaluate model
y_pred = self.model.predict(X_val)
metrics = {
'accuracy': accuracy_score(y_val, y_pred),
'precision': precision_score(y_val, y_pred, average='weighted'),
'recall': recall_score(y_val, y_pred, average='weighted'),
'f1_score': f1_score(y_val, y_pred, average='weighted')
}
self.is_fitted = True
return metrics
def predict(self, X: pd.DataFrame) -> np.ndarray:
"""Predict incident severity"""
if not self.is_fitted:
raise ValueError("Model must be fitted before prediction")
X_processed = self.prepare_features(X)
X_scaled = self.scaler.transform(X_processed)
# Get predicted severity levels
y_pred_encoded = self.model.predict(X_scaled)
# Convert back to severity labels
return np.array([self.reverse_severity_mapping.get(level, 'MEDIUM') for level in y_pred_encoded])
def get_feature_importance(self) -> Dict[str, float]:
"""Get feature importance scores"""
if not self.is_fitted:
return {}
importance_scores = self.model.feature_importances_
return dict(zip(self.feature_columns, importance_scores))
class ResolutionTimePredictionModel(BasePredictiveModel):
"""Model for predicting incident resolution time"""
def __init__(self, model_config: Dict[str, Any] = None):
super().__init__(model_config)
self.model = RandomForestRegressor(
n_estimators=self.model_config.get('n_estimators', 100),
max_depth=self.model_config.get('max_depth', 10),
random_state=42
)
def prepare_features(self, data: pd.DataFrame) -> pd.DataFrame:
"""Prepare features for resolution time prediction"""
features = pd.DataFrame()
# Severity features
if 'severity' in data.columns:
severity_mapping = {'LOW': 1, 'MEDIUM': 2, 'HIGH': 3, 'CRITICAL': 4, 'EMERGENCY': 5}
features['severity_encoded'] = data['severity'].map(severity_mapping).fillna(2)
# Categorical features
if 'category' in data.columns:
if 'category' not in self.label_encoders:
self.label_encoders['category'] = LabelEncoder()
features['category_encoded'] = self.label_encoders['category'].fit_transform(data['category'])
else:
features['category_encoded'] = self.label_encoders['category'].transform(data['category'])
# Impact features
if 'affected_users' in data.columns:
features['affected_users'] = data['affected_users']
features['affected_users_log'] = np.log1p(data['affected_users'])
# Time-based features
if 'created_at' in data.columns:
timestamp = pd.to_datetime(data['created_at'])
features['hour_of_day'] = timestamp.dt.hour
features['day_of_week'] = timestamp.dt.dayofweek
features['is_weekend'] = (timestamp.dt.dayofweek >= 5).astype(int)
features['is_business_hours'] = ((timestamp.dt.hour >= 9) & (timestamp.dt.hour <= 17)).astype(int)
# Historical features
if 'assigned_to' in data.columns:
# Average resolution time for assignee
features['assignee_avg_resolution_time'] = data.groupby('assigned_to')['resolution_time_hours'].transform('mean')
# Text features
if 'title' in data.columns:
features['title_length'] = data['title'].str.len()
if 'description' in data.columns:
features['description_length'] = data['description'].str.len()
return features
def fit(self, X: pd.DataFrame, y: pd.Series) -> Dict[str, float]:
"""Fit the resolution time prediction model"""
# Prepare features
X_processed = self.prepare_features(X)
self.feature_columns = X_processed.columns.tolist()
# Scale features
X_scaled = self.scaler.fit_transform(X_processed)
# Split data for validation
X_train, X_val, y_train, y_val = train_test_split(
X_scaled, y, test_size=0.2, random_state=42
)
# Fit model
self.model.fit(X_train, y_train)
# Evaluate model
y_pred = self.model.predict(X_val)
metrics = {
'mse': mean_squared_error(y_val, y_pred),
'rmse': np.sqrt(mean_squared_error(y_val, y_pred)),
'r2_score': r2_score(y_val, y_pred)
}
self.is_fitted = True
return metrics
def predict(self, X: pd.DataFrame) -> np.ndarray:
"""Predict resolution time in hours"""
if not self.is_fitted:
raise ValueError("Model must be fitted before prediction")
X_processed = self.prepare_features(X)
X_scaled = self.scaler.transform(X_processed)
return self.model.predict(X_scaled)
def get_feature_importance(self) -> Dict[str, float]:
"""Get feature importance scores"""
if not self.is_fitted:
return {}
importance_scores = self.model.feature_importances_
return dict(zip(self.feature_columns, importance_scores))
class CostPredictionModel(BasePredictiveModel):
"""Model for predicting incident cost impact"""
def __init__(self, model_config: Dict[str, Any] = None):
super().__init__(model_config)
self.model = RandomForestRegressor(
n_estimators=self.model_config.get('n_estimators', 100),
max_depth=self.model_config.get('max_depth', 10),
random_state=42
)
def prepare_features(self, data: pd.DataFrame) -> pd.DataFrame:
"""Prepare features for cost prediction"""
features = pd.DataFrame()
# Severity features
if 'severity' in data.columns:
severity_mapping = {'LOW': 1, 'MEDIUM': 2, 'HIGH': 3, 'CRITICAL': 4, 'EMERGENCY': 5}
features['severity_encoded'] = data['severity'].map(severity_mapping).fillna(2)
# Impact features
if 'affected_users' in data.columns:
features['affected_users'] = data['affected_users']
features['affected_users_log'] = np.log1p(data['affected_users'])
if 'downtime_hours' in data.columns:
features['downtime_hours'] = data['downtime_hours']
features['downtime_hours_log'] = np.log1p(data['downtime_hours'])
# Categorical features
if 'category' in data.columns:
if 'category' not in self.label_encoders:
self.label_encoders['category'] = LabelEncoder()
features['category_encoded'] = self.label_encoders['category'].fit_transform(data['category'])
else:
features['category_encoded'] = self.label_encoders['category'].transform(data['category'])
# Business context
if 'business_unit' in data.columns:
if 'business_unit' not in self.label_encoders:
self.label_encoders['business_unit'] = LabelEncoder()
features['business_unit_encoded'] = self.label_encoders['business_unit'].fit_transform(data['business_unit'])
else:
features['business_unit_encoded'] = self.label_encoders['business_unit'].transform(data['business_unit'])
# Time-based features
if 'created_at' in data.columns:
timestamp = pd.to_datetime(data['created_at'])
features['hour_of_day'] = timestamp.dt.hour
features['day_of_week'] = timestamp.dt.dayofweek
features['is_weekend'] = (timestamp.dt.dayofweek >= 5).astype(int)
features['is_business_hours'] = ((timestamp.dt.hour >= 9) & (timestamp.dt.hour <= 17)).astype(int)
return features
def fit(self, X: pd.DataFrame, y: pd.Series) -> Dict[str, float]:
"""Fit the cost prediction model"""
# Prepare features
X_processed = self.prepare_features(X)
self.feature_columns = X_processed.columns.tolist()
# Scale features
X_scaled = self.scaler.fit_transform(X_processed)
# Split data for validation
X_train, X_val, y_train, y_val = train_test_split(
X_scaled, y, test_size=0.2, random_state=42
)
# Fit model
self.model.fit(X_train, y_train)
# Evaluate model
y_pred = self.model.predict(X_val)
metrics = {
'mse': mean_squared_error(y_val, y_pred),
'rmse': np.sqrt(mean_squared_error(y_val, y_pred)),
'r2_score': r2_score(y_val, y_pred)
}
self.is_fitted = True
return metrics
def predict(self, X: pd.DataFrame) -> np.ndarray:
"""Predict cost impact in USD"""
if not self.is_fitted:
raise ValueError("Model must be fitted before prediction")
X_processed = self.prepare_features(X)
X_scaled = self.scaler.transform(X_processed)
return self.model.predict(X_scaled)
def get_feature_importance(self) -> Dict[str, float]:
"""Get feature importance scores"""
if not self.is_fitted:
return {}
importance_scores = self.model.feature_importances_
return dict(zip(self.feature_columns, importance_scores))
class PredictiveModelFactory:
"""Factory for creating predictive models"""
@staticmethod
def create_model(model_type: str, model_config: Dict[str, Any] = None) -> BasePredictiveModel:
"""Create a predictive model instance"""
models = {
'INCIDENT_PREDICTION': IncidentPredictionModel,
'SEVERITY_PREDICTION': SeverityPredictionModel,
'RESOLUTION_TIME_PREDICTION': ResolutionTimePredictionModel,
'COST_PREDICTION': CostPredictionModel
}
if model_type not in models:
raise ValueError(f"Unknown model type: {model_type}")
return models[model_type](model_config)
class PredictiveModelService:
"""Service for managing predictive models"""
def __init__(self):
self.factory = PredictiveModelFactory()
def prepare_training_data(self, model_type: str, days_back: int = 90) -> Tuple[pd.DataFrame, pd.Series]:
"""Prepare training data for the specified model type"""
end_date = timezone.now()
start_date = end_date - timedelta(days=days_back)
# Get incidents from the time period
incidents = Incident.objects.filter(
created_at__gte=start_date,
created_at__lte=end_date
).values(
'id', 'title', 'description', 'severity', 'category', 'subcategory',
'affected_users', 'estimated_downtime', 'created_at', 'resolved_at',
'assigned_to', 'reporter', 'status'
)
if not incidents:
return pd.DataFrame(), pd.Series()
df = pd.DataFrame(list(incidents))
# Prepare target variable based on model type
        if model_type == 'INCIDENT_PREDICTION':
            # For incident prediction we really need time-series data; as a simplified
            # placeholder, label high-impact incidents positive so the classifier sees
            # both classes (a constant target breaks predict_proba's second column)
            y = df['severity'].isin(['HIGH', 'CRITICAL', 'EMERGENCY']).astype(int)
elif model_type == 'SEVERITY_PREDICTION':
y = df['severity']
elif model_type == 'RESOLUTION_TIME_PREDICTION':
# Calculate resolution time in hours
df['resolved_at'] = pd.to_datetime(df['resolved_at'])
df['created_at'] = pd.to_datetime(df['created_at'])
df['resolution_time_hours'] = (df['resolved_at'] - df['created_at']).dt.total_seconds() / 3600
y = df['resolution_time_hours'].fillna(df['resolution_time_hours'].median())
elif model_type == 'COST_PREDICTION':
# Get cost data
cost_analyses = CostImpactAnalysis.objects.filter(
incident_id__in=df['id']
).values('incident_id', 'cost_amount')
cost_df = pd.DataFrame(list(cost_analyses))
if not cost_df.empty:
df = df.merge(cost_df, left_on='id', right_on='incident_id', how='left')
y = df['cost_amount'].fillna(df['cost_amount'].median())
else:
y = pd.Series([0] * len(df))
else:
raise ValueError(f"Unknown model type: {model_type}")
return df, y
def train_model(self, model_id: str) -> Dict[str, Any]:
"""Train a predictive model"""
try:
model = PredictiveModel.objects.get(id=model_id)
# Prepare training data
X, y = self.prepare_training_data(model.model_type, model.training_data_period_days)
if X.empty or len(y) < model.min_training_samples:
return {
'success': False,
'error': f'Insufficient training data. Need at least {model.min_training_samples} samples, got {len(y)}'
}
# Create model instance
ml_model = self.factory.create_model(model.model_type, model.model_config)
# Train the model
start_time = timezone.now()
metrics = ml_model.fit(X, y)
end_time = timezone.now()
# Update model with performance metrics
            # Clamp so regression R^2 (which can be negative) satisfies the 0-1 field validators
            _score = metrics.get('accuracy', metrics.get('r2_score'))
            model.accuracy_score = max(0.0, min(1.0, _score)) if _score is not None else None
model.precision_score = metrics.get('precision')
model.recall_score = metrics.get('recall')
model.f1_score = metrics.get('f1_score')
model.status = 'ACTIVE'
model.last_trained_at = end_time
            model.training_duration_seconds = int((end_time - start_time).total_seconds())
model.training_samples_count = len(y)
model.feature_columns = ml_model.feature_columns
# Save model (in a real implementation, you'd save the actual model file)
model.model_file_path = f"models/{model.id}_{model.version}.joblib"
model.save()
return {
'success': True,
'metrics': metrics,
'training_samples': len(y),
'training_duration': model.training_duration_seconds
}
except Exception as e:
logger.error(f"Error training model {model_id}: {str(e)}")
return {
'success': False,
'error': str(e)
}
def generate_predictions(self, model_id: str, prediction_horizon_hours: int = 24) -> List[Dict[str, Any]]:
"""Generate predictions using a trained model"""
try:
model = PredictiveModel.objects.get(id=model_id, status='ACTIVE')
            # Create model instance
            ml_model = self.factory.create_model(model.model_type, model.model_config)
            # Prepare prediction data; in a real implementation the trained model
            # would be loaded from model.model_file_path (e.g. via joblib.load)
            # instead of being refit here
            X, y = self.prepare_training_data(model.model_type, 7)  # Last 7 days
            if X.empty:
                return []
            # Refit as a stand-in for loading, then predict for the 10 most recent incidents
            # (the previous code predicted with an unfitted model, which raises ValueError)
            ml_model.fit(X, y)
            recent = X.tail(10)
            predictions = ml_model.predict(recent)
# Create insight objects
insights = []
for i, prediction in enumerate(predictions):
insight_data = {
'model': model,
'insight_type': model.model_type,
'title': f"Prediction for {model.model_type.replace('_', ' ').title()}",
'description': f"Model predicts {prediction} for upcoming incidents",
'confidence_level': 'MEDIUM', # Could be calculated based on model confidence
'confidence_score': 0.7, # Placeholder
'predicted_value': {'value': float(prediction)},
'prediction_horizon': prediction_horizon_hours,
'prediction_date': timezone.now() + timedelta(hours=prediction_horizon_hours),
                    'input_features': recent.iloc[i].to_dict(),
                    'supporting_evidence': [],
                    'affected_services': [recent.iloc[i].get('category', 'Unknown')],
'recommendations': self._generate_recommendations(model.model_type, prediction),
'expires_at': timezone.now() + timedelta(hours=prediction_horizon_hours * 2)
}
insights.append(insight_data)
return insights
except Exception as e:
logger.error(f"Error generating predictions for model {model_id}: {str(e)}")
return []
def _generate_recommendations(self, model_type: str, prediction: Any) -> List[str]:
"""Generate recommendations based on prediction"""
recommendations = []
if model_type == 'INCIDENT_PREDICTION':
if prediction > 0.7:
recommendations.append("High probability of incident occurrence - consider proactive monitoring")
recommendations.append("Ensure on-call team is ready for potential incidents")
elif prediction > 0.4:
recommendations.append("Moderate probability of incident - monitor system metrics closely")
elif model_type == 'SEVERITY_PREDICTION':
if prediction in ['CRITICAL', 'EMERGENCY']:
recommendations.append("High severity incident predicted - prepare escalation procedures")
recommendations.append("Ensure senior staff are available for response")
elif prediction == 'HIGH':
recommendations.append("High severity incident predicted - review response procedures")
elif model_type == 'RESOLUTION_TIME_PREDICTION':
if prediction > 24:
recommendations.append("Long resolution time predicted - consider additional resources")
recommendations.append("Review escalation procedures for complex incidents")
elif prediction > 8:
recommendations.append("Extended resolution time predicted - prepare for extended response")
elif model_type == 'COST_PREDICTION':
if prediction > 10000:
recommendations.append("High cost impact predicted - prepare cost mitigation strategies")
recommendations.append("Consider business continuity measures")
elif prediction > 5000:
recommendations.append("Significant cost impact predicted - review cost control measures")
return recommendations
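if __name__ == "__main__":  # pragma: no cover
    # Hedged demo sketch: fits SeverityPredictionModel on synthetic incidents via
    # the factory. Running it requires a configured Django environment, because
    # this module imports Django models at import time.
    demo = pd.DataFrame({
        "title": ["DB down"] * 40 + ["Slow page"] * 40,
        "description": ["Primary database unreachable"] * 40 + ["Dashboard loads slowly"] * 40,
        "category": ["Infrastructure"] * 40 + ["Application"] * 40,
        "affected_users": [500] * 40 + [20] * 40,
        "created_at": pd.date_range("2024-01-01", periods=80, freq="h"),
    })
    demo_y = pd.Series(["CRITICAL"] * 40 + ["LOW"] * 40)
    demo_model = PredictiveModelFactory.create_model("SEVERITY_PREDICTION")
    print(demo_model.fit(demo, demo_y))       # accuracy / precision / recall / f1
    print(demo_model.predict(demo.head(3)))   # severity labels, e.g. ['CRITICAL' ...]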

View File

@@ -0,0 +1,828 @@
"""
Analytics & Predictive Insights models for Enterprise Incident Management API
Implements advanced KPIs, predictive analytics, ML-based anomaly detection, and cost analysis
"""
import uuid
import json
from datetime import datetime, timedelta, time
from typing import Dict, Any, Optional, List
from decimal import Decimal
from django.db import models
from django.contrib.auth import get_user_model
from django.core.validators import MinValueValidator, MaxValueValidator
from django.utils import timezone
from django.core.exceptions import ValidationError
User = get_user_model()
class KPIMetric(models.Model):
"""Base model for KPI metrics tracking"""
METRIC_TYPES = [
('MTTA', 'Mean Time to Acknowledge'),
('MTTR', 'Mean Time to Resolve'),
('MTBF', 'Mean Time Between Failures'),
('MTBSI', 'Mean Time Between Service Incidents'),
('AVAILABILITY', 'Service Availability'),
('INCIDENT_COUNT', 'Incident Count'),
('RESOLUTION_RATE', 'Resolution Rate'),
('ESCALATION_RATE', 'Escalation Rate'),
('CUSTOM', 'Custom Metric'),
]
AGGREGATION_TYPES = [
('AVERAGE', 'Average'),
('MEDIAN', 'Median'),
('MIN', 'Minimum'),
('MAX', 'Maximum'),
('SUM', 'Sum'),
('COUNT', 'Count'),
('PERCENTILE_95', '95th Percentile'),
('PERCENTILE_99', '99th Percentile'),
]
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
name = models.CharField(max_length=200)
description = models.TextField()
metric_type = models.CharField(max_length=20, choices=METRIC_TYPES)
aggregation_type = models.CharField(max_length=20, choices=AGGREGATION_TYPES)
# Targeting criteria
incident_categories = models.JSONField(
default=list,
help_text="List of incident categories this metric applies to"
)
incident_severities = models.JSONField(
default=list,
help_text="List of incident severities this metric applies to"
)
incident_priorities = models.JSONField(
default=list,
help_text="List of incident priorities this metric applies to"
)
# Calculation configuration
calculation_formula = models.TextField(
blank=True,
null=True,
help_text="Custom calculation formula for complex metrics"
)
time_window_hours = models.PositiveIntegerField(
default=24,
help_text="Time window for metric calculation in hours"
)
# Status and metadata
is_active = models.BooleanField(default=True)
is_system_metric = models.BooleanField(
default=False,
help_text="Whether this is a system-defined metric"
)
created_by = models.ForeignKey(User, on_delete=models.SET_NULL, null=True)
created_at = models.DateTimeField(auto_now_add=True)
updated_at = models.DateTimeField(auto_now=True)
class Meta:
ordering = ['name']
indexes = [
models.Index(fields=['metric_type', 'is_active']),
models.Index(fields=['incident_categories']),
models.Index(fields=['incident_severities']),
]
def __str__(self):
return f"{self.name} ({self.metric_type})"
class KPIMeasurement(models.Model):
"""Individual measurements of KPI metrics"""
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
metric = models.ForeignKey(KPIMetric, on_delete=models.CASCADE, related_name='measurements')
# Measurement details
value = models.DecimalField(max_digits=15, decimal_places=4)
unit = models.CharField(max_length=50, help_text="Unit of measurement (minutes, hours, percentage, etc.)")
# Time period
measurement_period_start = models.DateTimeField()
measurement_period_end = models.DateTimeField()
# Context
incident_count = models.PositiveIntegerField(
default=0,
help_text="Number of incidents included in this measurement"
)
sample_size = models.PositiveIntegerField(
default=0,
help_text="Total sample size for this measurement"
)
# Additional metadata
metadata = models.JSONField(
default=dict,
help_text="Additional metadata for this measurement"
)
# Timestamps
calculated_at = models.DateTimeField(auto_now_add=True)
class Meta:
ordering = ['-calculated_at']
indexes = [
models.Index(fields=['metric', 'measurement_period_start']),
models.Index(fields=['calculated_at']),
]
def __str__(self):
return f"{self.metric.name}: {self.value} {self.unit}"
class IncidentRecurrenceAnalysis(models.Model):
"""Analysis of incident recurrence patterns"""
RECURRENCE_TYPES = [
('EXACT_DUPLICATE', 'Exact Duplicate'),
('SIMILAR_PATTERN', 'Similar Pattern'),
('SEASONAL', 'Seasonal Recurrence'),
('TREND', 'Trend-based Recurrence'),
('CASCADE', 'Cascade Effect'),
]
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
# Related incidents
primary_incident = models.ForeignKey(
'incident_intelligence.Incident',
on_delete=models.CASCADE,
related_name='recurrence_analyses_as_primary'
)
recurring_incidents = models.ManyToManyField(
'incident_intelligence.Incident',
related_name='recurrence_analyses_as_recurring'
)
# Analysis details
recurrence_type = models.CharField(max_length=20, choices=RECURRENCE_TYPES)
confidence_score = models.FloatField(
validators=[MinValueValidator(0.0), MaxValueValidator(1.0)]
)
recurrence_rate = models.FloatField(
help_text="Rate of recurrence (incidents per time period)"
)
# Pattern characteristics
common_keywords = models.JSONField(
default=list,
help_text="Common keywords across recurring incidents"
)
common_categories = models.JSONField(
default=list,
help_text="Common categories across recurring incidents"
)
time_pattern = models.JSONField(
default=dict,
help_text="Time-based pattern analysis"
)
# Impact analysis
total_affected_users = models.PositiveIntegerField(default=0)
total_downtime_hours = models.DecimalField(max_digits=10, decimal_places=2, default=0)
estimated_cost_impact = models.DecimalField(max_digits=15, decimal_places=2, default=0)
# Recommendations
prevention_recommendations = models.JSONField(
default=list,
help_text="AI-generated recommendations to prevent recurrence"
)
automation_opportunities = models.JSONField(
default=list,
help_text="Potential automation opportunities identified"
)
# Automation integration
suggested_runbooks = models.ManyToManyField(
'automation_orchestration.Runbook',
blank=True,
related_name='recurrence_analyses',
help_text="Runbooks suggested to prevent recurrence"
)
# Status
is_resolved = models.BooleanField(default=False)
resolution_actions = models.JSONField(
default=list,
help_text="Actions taken to resolve the recurrence pattern"
)
# Metadata
created_at = models.DateTimeField(auto_now_add=True)
updated_at = models.DateTimeField(auto_now=True)
model_version = models.CharField(max_length=50, default='v1.0')
class Meta:
ordering = ['-confidence_score', '-created_at']
indexes = [
models.Index(fields=['recurrence_type', 'confidence_score']),
models.Index(fields=['is_resolved']),
]
def __str__(self):
return f"Recurrence Analysis: {self.primary_incident.title} ({self.recurrence_type})"
class PredictiveModel(models.Model):
"""ML models for predictive analytics"""
MODEL_TYPES = [
('ANOMALY_DETECTION', 'Anomaly Detection'),
('INCIDENT_PREDICTION', 'Incident Prediction'),
('SEVERITY_PREDICTION', 'Severity Prediction'),
('RESOLUTION_TIME_PREDICTION', 'Resolution Time Prediction'),
('ESCALATION_PREDICTION', 'Escalation Prediction'),
('COST_PREDICTION', 'Cost Impact Prediction'),
]
ALGORITHM_TYPES = [
('ISOLATION_FOREST', 'Isolation Forest'),
('LSTM', 'Long Short-Term Memory'),
('RANDOM_FOREST', 'Random Forest'),
('XGBOOST', 'XGBoost'),
('SVM', 'Support Vector Machine'),
('NEURAL_NETWORK', 'Neural Network'),
('ARIMA', 'ARIMA'),
('PROPHET', 'Prophet'),
]
STATUS_CHOICES = [
('TRAINING', 'Training'),
('ACTIVE', 'Active'),
('INACTIVE', 'Inactive'),
('RETRAINING', 'Retraining'),
('ERROR', 'Error'),
]
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
name = models.CharField(max_length=200)
description = models.TextField()
model_type = models.CharField(max_length=30, choices=MODEL_TYPES)
algorithm_type = models.CharField(max_length=20, choices=ALGORITHM_TYPES)
# Model configuration
model_config = models.JSONField(
default=dict,
help_text="Model-specific configuration parameters"
)
feature_columns = models.JSONField(
default=list,
help_text="List of feature columns used by the model"
)
target_column = models.CharField(
max_length=100,
help_text="Target column for prediction"
)
# Training data
training_data_period_days = models.PositiveIntegerField(
default=90,
help_text="Number of days of training data to use"
)
min_training_samples = models.PositiveIntegerField(
default=100,
help_text="Minimum number of samples required for training"
)
# Performance metrics
accuracy_score = models.FloatField(
null=True, blank=True,
validators=[MinValueValidator(0.0), MaxValueValidator(1.0)]
)
precision_score = models.FloatField(
null=True, blank=True,
validators=[MinValueValidator(0.0), MaxValueValidator(1.0)]
)
recall_score = models.FloatField(
null=True, blank=True,
validators=[MinValueValidator(0.0), MaxValueValidator(1.0)]
)
f1_score = models.FloatField(
null=True, blank=True,
validators=[MinValueValidator(0.0), MaxValueValidator(1.0)]
)
# Status and metadata
status = models.CharField(max_length=20, choices=STATUS_CHOICES, default='TRAINING')
version = models.CharField(max_length=20, default='1.0')
model_file_path = models.CharField(
max_length=500,
blank=True,
null=True,
help_text="Path to the trained model file"
)
# Training metadata
last_trained_at = models.DateTimeField(null=True, blank=True)
training_duration_seconds = models.PositiveIntegerField(null=True, blank=True)
training_samples_count = models.PositiveIntegerField(null=True, blank=True)
# Retraining configuration
auto_retrain_enabled = models.BooleanField(default=True)
retrain_frequency_days = models.PositiveIntegerField(default=7)
performance_threshold = models.FloatField(
default=0.8,
validators=[MinValueValidator(0.0), MaxValueValidator(1.0)],
help_text="Performance threshold below which model should be retrained"
)
# Metadata
created_by = models.ForeignKey(User, on_delete=models.SET_NULL, null=True)
created_at = models.DateTimeField(auto_now_add=True)
updated_at = models.DateTimeField(auto_now=True)
class Meta:
ordering = ['-created_at']
indexes = [
models.Index(fields=['model_type', 'status']),
models.Index(fields=['algorithm_type']),
models.Index(fields=['status']),
]
def __str__(self):
return f"{self.name} ({self.model_type})"
class AnomalyDetection(models.Model):
"""Anomaly detection results and alerts"""
ANOMALY_TYPES = [
('STATISTICAL', 'Statistical Anomaly'),
('TEMPORAL', 'Temporal Anomaly'),
('PATTERN', 'Pattern Anomaly'),
('THRESHOLD', 'Threshold Breach'),
('BEHAVIORAL', 'Behavioral Anomaly'),
]
SEVERITY_CHOICES = [
('LOW', 'Low'),
('MEDIUM', 'Medium'),
('HIGH', 'High'),
('CRITICAL', 'Critical'),
]
STATUS_CHOICES = [
('DETECTED', 'Detected'),
('INVESTIGATING', 'Investigating'),
('CONFIRMED', 'Confirmed'),
('FALSE_POSITIVE', 'False Positive'),
('RESOLVED', 'Resolved'),
]
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
model = models.ForeignKey(PredictiveModel, on_delete=models.CASCADE, related_name='anomaly_detections')
# Anomaly details
anomaly_type = models.CharField(max_length=20, choices=ANOMALY_TYPES)
severity = models.CharField(max_length=20, choices=SEVERITY_CHOICES)
status = models.CharField(max_length=20, choices=STATUS_CHOICES, default='DETECTED')
# Detection details
confidence_score = models.FloatField(
validators=[MinValueValidator(0.0), MaxValueValidator(1.0)]
)
anomaly_score = models.FloatField(
help_text="Raw anomaly score from the model"
)
threshold_used = models.FloatField(
help_text="Threshold used for anomaly detection"
)
# Context
detected_at = models.DateTimeField(auto_now_add=True)
time_window_start = models.DateTimeField()
time_window_end = models.DateTimeField()
# Related data
related_incidents = models.ManyToManyField(
'incident_intelligence.Incident',
blank=True,
related_name='anomaly_detections'
)
affected_services = models.JSONField(
default=list,
help_text="Services affected by this anomaly"
)
affected_metrics = models.JSONField(
default=list,
help_text="Metrics that showed anomalous behavior"
)
# Analysis
description = models.TextField(help_text="Description of the anomaly")
root_cause_analysis = models.TextField(
blank=True,
null=True,
help_text="Root cause analysis of the anomaly"
)
impact_assessment = models.TextField(
blank=True,
null=True,
help_text="Assessment of the anomaly's impact"
)
# Actions taken
actions_taken = models.JSONField(
default=list,
help_text="Actions taken in response to the anomaly"
)
resolved_at = models.DateTimeField(null=True, blank=True)
resolved_by = models.ForeignKey(
User,
on_delete=models.SET_NULL,
null=True,
blank=True,
related_name='resolved_anomalies'
)
# Metadata
metadata = models.JSONField(
default=dict,
help_text="Additional metadata for this anomaly"
)
class Meta:
ordering = ['-detected_at']
indexes = [
models.Index(fields=['anomaly_type', 'severity']),
models.Index(fields=['status', 'detected_at']),
models.Index(fields=['confidence_score']),
]
def __str__(self):
return f"Anomaly: {self.anomaly_type} - {self.severity} ({self.detected_at})"
class CostImpactAnalysis(models.Model):
"""Cost impact analysis for incidents and downtime"""
COST_TYPES = [
('DOWNTIME', 'Downtime Cost'),
('LOST_REVENUE', 'Lost Revenue'),
('PENALTY', 'Penalty Cost'),
('RESOURCE_COST', 'Resource Cost'),
('REPUTATION_COST', 'Reputation Cost'),
('COMPLIANCE_COST', 'Compliance Cost'),
]
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
# Related incident
incident = models.ForeignKey(
'incident_intelligence.Incident',
on_delete=models.CASCADE,
related_name='cost_analyses'
)
# SLA integration
sla_instance = models.ForeignKey(
'sla_oncall.SLAInstance',
on_delete=models.SET_NULL,
null=True,
blank=True,
related_name='cost_analyses',
help_text="Related SLA instance for cost calculation"
)
# Cost breakdown
cost_type = models.CharField(max_length=20, choices=COST_TYPES)
cost_amount = models.DecimalField(
max_digits=15,
decimal_places=2,
help_text="Cost amount in USD"
)
currency = models.CharField(max_length=3, default='USD')
# Cost calculation details
calculation_method = models.CharField(
max_length=50,
help_text="Method used to calculate the cost"
)
calculation_details = models.JSONField(
default=dict,
help_text="Detailed breakdown of cost calculation"
)
# Impact metrics
downtime_hours = models.DecimalField(
max_digits=10,
decimal_places=2,
null=True,
blank=True,
help_text="Total downtime in hours"
)
affected_users = models.PositiveIntegerField(
null=True,
blank=True,
help_text="Number of users affected"
)
revenue_impact = models.DecimalField(
max_digits=15,
decimal_places=2,
null=True,
blank=True,
help_text="Revenue impact in USD"
)
# Business context
business_unit = models.CharField(
max_length=100,
blank=True,
null=True,
help_text="Business unit affected"
)
service_tier = models.CharField(
max_length=50,
blank=True,
null=True,
help_text="Service tier (e.g., Premium, Standard)"
)
# Validation and approval
is_validated = models.BooleanField(default=False)
validated_by = models.ForeignKey(
User,
on_delete=models.SET_NULL,
null=True,
blank=True,
related_name='validated_cost_analyses'
)
validated_at = models.DateTimeField(null=True, blank=True)
validation_notes = models.TextField(blank=True, null=True)
# Metadata
created_at = models.DateTimeField(auto_now_add=True)
updated_at = models.DateTimeField(auto_now=True)
class Meta:
ordering = ['-created_at']
indexes = [
models.Index(fields=['incident', 'cost_type']),
models.Index(fields=['cost_amount']),
models.Index(fields=['is_validated']),
]
def __str__(self):
return f"Cost Analysis: {self.incident.title} - {self.cost_type} (${self.cost_amount})"
class DashboardConfiguration(models.Model):
"""Dashboard configuration for analytics visualization"""
DASHBOARD_TYPES = [
('EXECUTIVE', 'Executive Dashboard'),
('OPERATIONAL', 'Operational Dashboard'),
('TECHNICAL', 'Technical Dashboard'),
('CUSTOM', 'Custom Dashboard'),
]
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
name = models.CharField(max_length=200)
description = models.TextField()
dashboard_type = models.CharField(max_length=20, choices=DASHBOARD_TYPES)
# Dashboard configuration
layout_config = models.JSONField(
default=dict,
help_text="Dashboard layout configuration"
)
widget_configs = models.JSONField(
default=list,
help_text="Configuration for dashboard widgets"
)
# Access control
is_public = models.BooleanField(default=False)
allowed_users = models.ManyToManyField(
User,
blank=True,
related_name='accessible_dashboards'
)
allowed_roles = models.JSONField(
default=list,
help_text="List of roles that can access this dashboard"
)
# Refresh configuration
auto_refresh_enabled = models.BooleanField(default=True)
refresh_interval_seconds = models.PositiveIntegerField(default=300)
# Status and metadata
is_active = models.BooleanField(default=True)
created_by = models.ForeignKey(User, on_delete=models.SET_NULL, null=True)
created_at = models.DateTimeField(auto_now_add=True)
updated_at = models.DateTimeField(auto_now=True)
class Meta:
ordering = ['name']
indexes = [
models.Index(fields=['dashboard_type', 'is_active']),
models.Index(fields=['is_public']),
]
def __str__(self):
return f"{self.name} ({self.dashboard_type})"
class HeatmapData(models.Model):
"""Heatmap data for visualization"""
HEATMAP_TYPES = [
('INCIDENT_FREQUENCY', 'Incident Frequency'),
('RESOLUTION_TIME', 'Resolution Time'),
('COST_IMPACT', 'Cost Impact'),
('ANOMALY_DENSITY', 'Anomaly Density'),
('SLA_PERFORMANCE', 'SLA Performance'),
]
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
name = models.CharField(max_length=200)
heatmap_type = models.CharField(max_length=20, choices=HEATMAP_TYPES)
# Data configuration
time_period_start = models.DateTimeField()
time_period_end = models.DateTimeField()
time_granularity = models.CharField(
max_length=20,
choices=[
('HOUR', 'Hour'),
('DAY', 'Day'),
('WEEK', 'Week'),
('MONTH', 'Month'),
]
)
# Heatmap data
data_points = models.JSONField(
help_text="Heatmap data points with coordinates and values"
)
color_scheme = models.CharField(
max_length=50,
default='viridis',
help_text="Color scheme for the heatmap"
)
# Aggregation settings
aggregation_method = models.CharField(
max_length=20,
choices=[
('SUM', 'Sum'),
('AVERAGE', 'Average'),
('COUNT', 'Count'),
('MAX', 'Maximum'),
('MIN', 'Minimum'),
]
)
# Metadata
created_at = models.DateTimeField(auto_now_add=True)
updated_at = models.DateTimeField(auto_now=True)
class Meta:
ordering = ['-created_at']
indexes = [
models.Index(fields=['heatmap_type', 'time_period_start']),
models.Index(fields=['time_granularity']),
]
def __str__(self):
return f"Heatmap: {self.name} ({self.heatmap_type})"
class PredictiveInsight(models.Model):
"""Predictive insights generated by ML models"""
INSIGHT_TYPES = [
('INCIDENT_PREDICTION', 'Incident Prediction'),
('SEVERITY_PREDICTION', 'Severity Prediction'),
('RESOLUTION_TIME_PREDICTION', 'Resolution Time Prediction'),
('COST_PREDICTION', 'Cost Prediction'),
('TREND_ANALYSIS', 'Trend Analysis'),
('PATTERN_DETECTION', 'Pattern Detection'),
]
CONFIDENCE_LEVELS = [
('LOW', 'Low Confidence'),
('MEDIUM', 'Medium Confidence'),
('HIGH', 'High Confidence'),
('VERY_HIGH', 'Very High Confidence'),
]
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
model = models.ForeignKey(PredictiveModel, on_delete=models.CASCADE, related_name='insights')
# Security integration
data_classification = models.ForeignKey(
'security.DataClassification',
on_delete=models.SET_NULL,
null=True,
blank=True,
help_text="Data classification level for this insight"
)
# Insight details
insight_type = models.CharField(max_length=30, choices=INSIGHT_TYPES)
title = models.CharField(max_length=200)
description = models.TextField()
confidence_level = models.CharField(max_length=20, choices=CONFIDENCE_LEVELS)
confidence_score = models.FloatField(
validators=[MinValueValidator(0.0), MaxValueValidator(1.0)]
)
# Prediction details
predicted_value = models.JSONField(
help_text="Predicted value or values"
)
prediction_horizon = models.PositiveIntegerField(
help_text="Prediction horizon in hours"
)
prediction_date = models.DateTimeField(
help_text="When the prediction is for"
)
# Context
input_features = models.JSONField(
help_text="Input features used for the prediction"
)
supporting_evidence = models.JSONField(
default=list,
help_text="Supporting evidence for the prediction"
)
# Related data
related_incidents = models.ManyToManyField(
'incident_intelligence.Incident',
blank=True,
related_name='predictive_insights'
)
affected_services = models.JSONField(
default=list,
help_text="Services that may be affected"
)
# Recommendations
recommendations = models.JSONField(
default=list,
help_text="AI-generated recommendations based on the insight"
)
risk_assessment = models.TextField(
blank=True,
null=True,
help_text="Risk assessment based on the prediction"
)
# Status
is_acknowledged = models.BooleanField(default=False)
acknowledged_by = models.ForeignKey(
User,
on_delete=models.SET_NULL,
null=True,
blank=True,
related_name='acknowledged_insights'
)
acknowledged_at = models.DateTimeField(null=True, blank=True)
# Validation
is_validated = models.BooleanField(default=False)
actual_value = models.JSONField(
null=True,
blank=True,
help_text="Actual value when prediction is validated"
)
validation_accuracy = models.FloatField(
null=True,
blank=True,
validators=[MinValueValidator(0.0), MaxValueValidator(1.0)]
)
# Metadata
generated_at = models.DateTimeField(auto_now_add=True)
expires_at = models.DateTimeField(
help_text="When this insight expires"
)
class Meta:
ordering = ['-generated_at']
indexes = [
models.Index(fields=['insight_type', 'confidence_score']),
models.Index(fields=['prediction_date']),
models.Index(fields=['is_acknowledged']),
]
def __str__(self):
return f"Insight: {self.title} ({self.insight_type})"
@property
def is_expired(self):
"""Check if this insight has expired"""
return timezone.now() > self.expires_at
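# Queryset equivalent of the per-instance `is_expired` property (this is the
# filter the views in this commit use to exclude expired insights):
#
#   active = PredictiveInsight.objects.filter(expires_at__gt=timezone.now())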

View File

@@ -0,0 +1 @@
# Analytics & Predictive Insights serializers

View File

@@ -0,0 +1,404 @@
"""
Analytics & Predictive Insights serializers for Enterprise Incident Management API
Provides comprehensive serialization for KPIs, metrics, predictive models, and insights
"""
from rest_framework import serializers
from django.contrib.auth import get_user_model
from ..models import (
KPIMetric, KPIMeasurement, IncidentRecurrenceAnalysis, PredictiveModel,
AnomalyDetection, CostImpactAnalysis, DashboardConfiguration,
HeatmapData, PredictiveInsight
)
User = get_user_model()
class KPIMetricSerializer(serializers.ModelSerializer):
"""Serializer for KPI metrics"""
created_by_username = serializers.CharField(source='created_by.username', read_only=True)
measurement_count = serializers.SerializerMethodField()
latest_measurement = serializers.SerializerMethodField()
class Meta:
model = KPIMetric
fields = [
'id', 'name', 'description', 'metric_type', 'aggregation_type',
'incident_categories', 'incident_severities', 'incident_priorities',
'calculation_formula', 'time_window_hours', 'is_active',
'is_system_metric', 'created_by_username', 'created_at', 'updated_at',
'measurement_count', 'latest_measurement'
]
read_only_fields = ['id', 'created_at', 'updated_at']
def get_measurement_count(self, obj):
"""Get the number of measurements for this metric"""
return obj.measurements.count()
def get_latest_measurement(self, obj):
"""Get the latest measurement for this metric"""
latest = obj.measurements.first()
if latest:
return {
'value': str(latest.value),
'unit': latest.unit,
'calculated_at': latest.calculated_at,
'incident_count': latest.incident_count
}
return None
class KPIMeasurementSerializer(serializers.ModelSerializer):
"""Serializer for KPI measurements"""
metric_name = serializers.CharField(source='metric.name', read_only=True)
metric_type = serializers.CharField(source='metric.metric_type', read_only=True)
class Meta:
model = KPIMeasurement
fields = [
'id', 'metric', 'metric_name', 'metric_type', 'value', 'unit',
'measurement_period_start', 'measurement_period_end',
'incident_count', 'sample_size', 'metadata', 'calculated_at'
]
read_only_fields = ['id', 'calculated_at']
class IncidentRecurrenceAnalysisSerializer(serializers.ModelSerializer):
"""Serializer for incident recurrence analysis"""
primary_incident_title = serializers.CharField(source='primary_incident.title', read_only=True)
primary_incident_severity = serializers.CharField(source='primary_incident.severity', read_only=True)
recurring_incident_count = serializers.SerializerMethodField()
recurring_incident_titles = serializers.SerializerMethodField()
class Meta:
model = IncidentRecurrenceAnalysis
fields = [
'id', 'primary_incident', 'primary_incident_title', 'primary_incident_severity',
'recurring_incidents', 'recurring_incident_count', 'recurring_incident_titles',
'recurrence_type', 'confidence_score', 'recurrence_rate',
'common_keywords', 'common_categories', 'time_pattern',
'total_affected_users', 'total_downtime_hours', 'estimated_cost_impact',
'prevention_recommendations', 'automation_opportunities',
'is_resolved', 'resolution_actions', 'created_at', 'updated_at', 'model_version'
]
read_only_fields = ['id', 'created_at', 'updated_at']
def get_recurring_incident_count(self, obj):
"""Get the number of recurring incidents"""
return obj.recurring_incidents.count()
def get_recurring_incident_titles(self, obj):
"""Get titles of recurring incidents"""
return [incident.title for incident in obj.recurring_incidents.all()]
class PredictiveModelSerializer(serializers.ModelSerializer):
"""Serializer for predictive models"""
created_by_username = serializers.CharField(source='created_by.username', read_only=True)
insight_count = serializers.SerializerMethodField()
anomaly_detection_count = serializers.SerializerMethodField()
performance_summary = serializers.SerializerMethodField()
class Meta:
model = PredictiveModel
fields = [
'id', 'name', 'description', 'model_type', 'algorithm_type',
'model_config', 'feature_columns', 'target_column',
'training_data_period_days', 'min_training_samples',
'accuracy_score', 'precision_score', 'recall_score', 'f1_score',
'status', 'version', 'model_file_path',
'last_trained_at', 'training_duration_seconds', 'training_samples_count',
'auto_retrain_enabled', 'retrain_frequency_days', 'performance_threshold',
'created_by_username', 'created_at', 'updated_at',
'insight_count', 'anomaly_detection_count', 'performance_summary'
]
read_only_fields = ['id', 'created_at', 'updated_at']
def get_insight_count(self, obj):
"""Get the number of insights generated by this model"""
return obj.insights.count()
def get_anomaly_detection_count(self, obj):
"""Get the number of anomaly detections by this model"""
return obj.anomaly_detections.count()
def get_performance_summary(self, obj):
"""Get a summary of model performance metrics"""
return {
'accuracy': obj.accuracy_score,
'precision': obj.precision_score,
'recall': obj.recall_score,
'f1_score': obj.f1_score,
'overall_health': self._calculate_overall_health(obj)
}
def _calculate_overall_health(self, obj):
"""Calculate overall model health score"""
if not all([obj.accuracy_score, obj.precision_score, obj.recall_score, obj.f1_score]):
return 'Unknown'
avg_score = (obj.accuracy_score + obj.precision_score + obj.recall_score + obj.f1_score) / 4
if avg_score >= 0.9:
return 'Excellent'
elif avg_score >= 0.8:
return 'Good'
elif avg_score >= 0.7:
return 'Fair'
else:
return 'Poor'
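# Worked example: scores (0.92, 0.90, 0.88, 0.89) average to 0.8975,
# which falls in the 0.8-0.9 band and maps to 'Good'.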
class AnomalyDetectionSerializer(serializers.ModelSerializer):
"""Serializer for anomaly detection results"""
model_name = serializers.CharField(source='model.name', read_only=True)
model_type = serializers.CharField(source='model.model_type', read_only=True)
resolved_by_username = serializers.CharField(source='resolved_by.username', read_only=True)
related_incident_count = serializers.SerializerMethodField()
related_incident_titles = serializers.SerializerMethodField()
time_since_detection = serializers.SerializerMethodField()
class Meta:
model = AnomalyDetection
fields = [
'id', 'model', 'model_name', 'model_type',
'anomaly_type', 'severity', 'status',
'confidence_score', 'anomaly_score', 'threshold_used',
'detected_at', 'time_window_start', 'time_window_end',
'related_incidents', 'related_incident_count', 'related_incident_titles',
'affected_services', 'affected_metrics',
'description', 'root_cause_analysis', 'impact_assessment',
'actions_taken', 'resolved_at', 'resolved_by', 'resolved_by_username',
'metadata', 'time_since_detection'
]
read_only_fields = ['id', 'detected_at']
def get_related_incident_count(self, obj):
"""Get the number of related incidents"""
return obj.related_incidents.count()
def get_related_incident_titles(self, obj):
"""Get titles of related incidents"""
return [incident.title for incident in obj.related_incidents.all()]
def get_time_since_detection(self, obj):
"""Get time elapsed since anomaly detection"""
from django.utils import timezone
return timezone.now() - obj.detected_at
class CostImpactAnalysisSerializer(serializers.ModelSerializer):
"""Serializer for cost impact analysis"""
incident_title = serializers.CharField(source='incident.title', read_only=True)
incident_severity = serializers.CharField(source='incident.severity', read_only=True)
validated_by_username = serializers.CharField(source='validated_by.username', read_only=True)
cost_per_hour = serializers.SerializerMethodField()
cost_per_user = serializers.SerializerMethodField()
class Meta:
model = CostImpactAnalysis
fields = [
'id', 'incident', 'incident_title', 'incident_severity',
'cost_type', 'cost_amount', 'currency',
'calculation_method', 'calculation_details',
'downtime_hours', 'affected_users', 'revenue_impact',
'business_unit', 'service_tier',
'is_validated', 'validated_by', 'validated_by_username',
'validated_at', 'validation_notes',
'created_at', 'updated_at',
'cost_per_hour', 'cost_per_user'
]
read_only_fields = ['id', 'created_at', 'updated_at']
def get_cost_per_hour(self, obj):
"""Calculate cost per hour of downtime"""
if obj.downtime_hours and obj.downtime_hours > 0:
return float(obj.cost_amount / obj.downtime_hours)
return None
def get_cost_per_user(self, obj):
"""Calculate cost per affected user"""
if obj.affected_users and obj.affected_users > 0:
return float(obj.cost_amount / obj.affected_users)
return None
class DashboardConfigurationSerializer(serializers.ModelSerializer):
"""Serializer for dashboard configurations"""
created_by_username = serializers.CharField(source='created_by.username', read_only=True)
allowed_user_count = serializers.SerializerMethodField()
allowed_usernames = serializers.SerializerMethodField()
widget_count = serializers.SerializerMethodField()
class Meta:
model = DashboardConfiguration
fields = [
'id', 'name', 'description', 'dashboard_type',
'layout_config', 'widget_configs', 'widget_count',
'is_public', 'allowed_users', 'allowed_user_count', 'allowed_usernames',
'allowed_roles', 'auto_refresh_enabled', 'refresh_interval_seconds',
'is_active', 'created_by_username', 'created_at', 'updated_at'
]
read_only_fields = ['id', 'created_at', 'updated_at']
def get_allowed_user_count(self, obj):
"""Get the number of allowed users"""
return obj.allowed_users.count()
def get_allowed_usernames(self, obj):
"""Get usernames of allowed users"""
return [user.username for user in obj.allowed_users.all()]
def get_widget_count(self, obj):
"""Get the number of widgets in this dashboard"""
return len(obj.widget_configs) if obj.widget_configs else 0
class HeatmapDataSerializer(serializers.ModelSerializer):
"""Serializer for heatmap data"""
data_point_count = serializers.SerializerMethodField()
time_span_hours = serializers.SerializerMethodField()
class Meta:
model = HeatmapData
fields = [
'id', 'name', 'heatmap_type', 'time_period_start', 'time_period_end',
'time_granularity', 'data_points', 'data_point_count',
'color_scheme', 'aggregation_method', 'time_span_hours',
'created_at', 'updated_at'
]
read_only_fields = ['id', 'created_at', 'updated_at']
def get_data_point_count(self, obj):
"""Get the number of data points"""
return len(obj.data_points) if obj.data_points else 0
def get_time_span_hours(self, obj):
"""Get the time span in hours"""
delta = obj.time_period_end - obj.time_period_start
return delta.total_seconds() / 3600
class PredictiveInsightSerializer(serializers.ModelSerializer):
"""Serializer for predictive insights"""
model_name = serializers.CharField(source='model.name', read_only=True)
model_type = serializers.CharField(source='model.model_type', read_only=True)
acknowledged_by_username = serializers.CharField(source='acknowledged_by.username', read_only=True)
related_incident_count = serializers.SerializerMethodField()
related_incident_titles = serializers.SerializerMethodField()
time_until_expiry = serializers.SerializerMethodField()
is_expired = serializers.SerializerMethodField()
class Meta:
model = PredictiveInsight
fields = [
'id', 'model', 'model_name', 'model_type',
'insight_type', 'title', 'description',
'confidence_level', 'confidence_score',
'predicted_value', 'prediction_horizon', 'prediction_date',
'input_features', 'supporting_evidence',
'related_incidents', 'related_incident_count', 'related_incident_titles',
'affected_services', 'recommendations', 'risk_assessment',
'is_acknowledged', 'acknowledged_by', 'acknowledged_by_username',
'acknowledged_at', 'is_validated', 'actual_value', 'validation_accuracy',
'generated_at', 'expires_at', 'time_until_expiry', 'is_expired'
]
read_only_fields = ['id', 'generated_at']
def get_related_incident_count(self, obj):
"""Get the number of related incidents"""
return obj.related_incidents.count()
def get_related_incident_titles(self, obj):
"""Get titles of related incidents"""
return [incident.title for incident in obj.related_incidents.all()]
def get_time_until_expiry(self, obj):
"""Get time until insight expires"""
from django.utils import timezone
return obj.expires_at - timezone.now()
def get_is_expired(self, obj):
"""Check if insight has expired"""
return obj.is_expired
# Summary and aggregation serializers
class KPISummarySerializer(serializers.Serializer):
"""Serializer for KPI summary data"""
metric_type = serializers.CharField()
metric_name = serializers.CharField()
current_value = serializers.DecimalField(max_digits=15, decimal_places=4)
unit = serializers.CharField()
trend = serializers.CharField() # 'up', 'down', 'stable'
trend_percentage = serializers.DecimalField(max_digits=5, decimal_places=2)
period_start = serializers.DateTimeField()
period_end = serializers.DateTimeField()
incident_count = serializers.IntegerField()
target_value = serializers.DecimalField(max_digits=15, decimal_places=4, allow_null=True)
target_met = serializers.BooleanField()
class AnomalySummarySerializer(serializers.Serializer):
"""Serializer for anomaly summary data"""
total_anomalies = serializers.IntegerField()
critical_anomalies = serializers.IntegerField()
high_anomalies = serializers.IntegerField()
medium_anomalies = serializers.IntegerField()
low_anomalies = serializers.IntegerField()
unresolved_anomalies = serializers.IntegerField()
false_positive_rate = serializers.DecimalField(max_digits=5, decimal_places=2)
average_resolution_time = serializers.DurationField()
class CostSummarySerializer(serializers.Serializer):
"""Serializer for cost summary data"""
total_cost = serializers.DecimalField(max_digits=15, decimal_places=2)
currency = serializers.CharField()
downtime_cost = serializers.DecimalField(max_digits=15, decimal_places=2)
lost_revenue = serializers.DecimalField(max_digits=15, decimal_places=2)
penalty_cost = serializers.DecimalField(max_digits=15, decimal_places=2)
resource_cost = serializers.DecimalField(max_digits=15, decimal_places=2)
total_downtime_hours = serializers.DecimalField(max_digits=10, decimal_places=2)
total_affected_users = serializers.IntegerField()
cost_per_hour = serializers.DecimalField(max_digits=10, decimal_places=2)
cost_per_user = serializers.DecimalField(max_digits=10, decimal_places=2)
class PredictiveInsightSummarySerializer(serializers.Serializer):
"""Serializer for predictive insight summary data"""
total_insights = serializers.IntegerField()
high_confidence_insights = serializers.IntegerField()
medium_confidence_insights = serializers.IntegerField()
low_confidence_insights = serializers.IntegerField()
acknowledged_insights = serializers.IntegerField()
validated_insights = serializers.IntegerField()
expired_insights = serializers.IntegerField()
average_accuracy = serializers.DecimalField(max_digits=5, decimal_places=2)
active_models = serializers.IntegerField()
class DashboardDataSerializer(serializers.Serializer):
"""Serializer for complete dashboard data"""
kpi_summary = KPISummarySerializer(many=True)
anomaly_summary = AnomalySummarySerializer()
cost_summary = CostSummarySerializer()
insight_summary = PredictiveInsightSummarySerializer()
recent_anomalies = AnomalyDetectionSerializer(many=True)
recent_insights = PredictiveInsightSerializer(many=True)
heatmap_data = HeatmapDataSerializer(many=True)
last_updated = serializers.DateTimeField()
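# These summary serializers are plain Serializer subclasses with no model
# binding; the views in views/analytics.py feed them pre-aggregated dicts, e.g.:
#
#   KPISummarySerializer(summaries, many=True).data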

View File

@@ -0,0 +1,238 @@
"""
Signals for analytics_predictive_insights app
Handles automatic KPI calculations and analytics updates
"""
from django.db.models.signals import post_save, post_delete
from django.dispatch import receiver
from django.utils import timezone
from django.db import models
from datetime import timedelta
import logging
from incident_intelligence.models import Incident
from .models import KPIMetric, KPIMeasurement, CostImpactAnalysis, PredictiveModel
from .ml.anomaly_detection import AnomalyDetectionService
from .ml.predictive_models import PredictiveModelService
logger = logging.getLogger(__name__)
@receiver(post_save, sender=Incident)
def update_kpi_measurements_on_incident_change(sender, instance, created, **kwargs):
"""Update KPI measurements when incidents are created or updated"""
try:
# Only process newly created incidents or incidents that have been resolved
if not created and not instance.is_resolved:
return
# Get active KPI metrics that apply to this incident
applicable_metrics = KPIMetric.objects.filter(
is_active=True
).filter(
# Check if metric applies to this incident; an empty or null filter list
# means the metric applies to all incidents
models.Q(incident_categories__contains=[instance.category]) |
models.Q(incident_severities__contains=[instance.severity]) |
models.Q(incident_priorities__contains=[instance.priority]) |
models.Q(incident_categories=[]) | models.Q(incident_categories__isnull=True) |
models.Q(incident_severities=[]) | models.Q(incident_severities__isnull=True) |
models.Q(incident_priorities=[]) | models.Q(incident_priorities__isnull=True)
)
for metric in applicable_metrics:
# Calculate and update KPI measurement
_calculate_kpi_measurement(metric, instance)
except Exception as e:
logger.error(f"Error updating KPI measurements for incident {instance.id}: {str(e)}")
@receiver(post_save, sender=Incident)
def trigger_anomaly_detection_on_incident(sender, instance, created, **kwargs):
"""Trigger anomaly detection when new incidents are created"""
try:
if created:
# Run anomaly detection for active models
anomaly_service = AnomalyDetectionService()
anomaly_service.run_anomaly_detection()
except Exception as e:
logger.error(f"Error running anomaly detection for incident {instance.id}: {str(e)}")
@receiver(post_save, sender=CostImpactAnalysis)
def update_cost_analytics_on_cost_change(sender, instance, created, **kwargs):
"""Update cost analytics when cost analysis is created or updated"""
try:
# Trigger cost-related KPI updates
cost_metrics = KPIMetric.objects.filter(
is_active=True,
metric_type='COST_IMPACT'
)
for metric in cost_metrics:
_calculate_kpi_measurement(metric, instance.incident)
except Exception as e:
logger.error(f"Error updating cost analytics for cost analysis {instance.id}: {str(e)}")
def _calculate_kpi_measurement(metric, incident):
"""Recalculate a KPI measurement for a metric over its time window (the incident argument is the change that triggered the recalculation)"""
try:
# Determine time window for calculation
end_time = timezone.now()
start_time = end_time - timedelta(hours=metric.time_window_hours)
# Get incidents in the time window that match the metric criteria
incidents = Incident.objects.filter(
created_at__gte=start_time,
created_at__lte=end_time
)
# Apply metric filters
if metric.incident_categories:
incidents = incidents.filter(category__in=metric.incident_categories)
if metric.incident_severities:
incidents = incidents.filter(severity__in=metric.incident_severities)
if metric.incident_priorities:
incidents = incidents.filter(priority__in=metric.incident_priorities)
# Calculate metric value based on type
if metric.metric_type == 'MTTA':
# Mean Time to Acknowledge
acknowledged_incidents = incidents.filter(
status__in=['IN_PROGRESS', 'RESOLVED', 'CLOSED']
).exclude(assigned_to__isnull=True)
if acknowledged_incidents.exists():
# Calculate average time to acknowledgment
total_time = timedelta()
count = 0
for inc in acknowledged_incidents:
# This is simplified - in practice, you'd need to track acknowledgment time
if inc.updated_at and inc.created_at:
time_diff = inc.updated_at - inc.created_at
total_time += time_diff
count += 1
if count > 0:
avg_time = total_time / count
value = avg_time.total_seconds() / 60 # Convert to minutes
unit = 'minutes'
else:
value = 0
unit = 'minutes'
else:
value = 0
unit = 'minutes'
elif metric.metric_type == 'MTTR':
# Mean Time to Resolve
resolved_incidents = incidents.filter(
status__in=['RESOLVED', 'CLOSED'],
resolved_at__isnull=False
)
if resolved_incidents.exists():
total_time = timedelta()
count = 0
for inc in resolved_incidents:
if inc.resolved_at and inc.created_at:
time_diff = inc.resolved_at - inc.created_at
total_time += time_diff
count += 1
if count > 0:
avg_time = total_time / count
value = avg_time.total_seconds() / 3600 # Convert to hours
unit = 'hours'
else:
value = 0
unit = 'hours'
else:
value = 0
unit = 'hours'
elif metric.metric_type == 'INCIDENT_COUNT':
# Incident Count
value = incidents.count()
unit = 'count'
elif metric.metric_type == 'RESOLUTION_RATE':
# Resolution Rate
total_incidents = incidents.count()
resolved_incidents = incidents.filter(
status__in=['RESOLVED', 'CLOSED']
).count()
if total_incidents > 0:
value = (resolved_incidents / total_incidents) * 100
unit = 'percentage'
else:
value = 0
unit = 'percentage'
else:
# Default calculation
value = incidents.count()
unit = 'count'
# Create or update KPI measurement
measurement, created = KPIMeasurement.objects.get_or_create(
metric=metric,
measurement_period_start=start_time,
measurement_period_end=end_time,
defaults={
'value': value,
'unit': unit,
'incident_count': incidents.count(),
'sample_size': incidents.count()
}
)
if not created:
measurement.value = value
measurement.unit = unit
measurement.incident_count = incidents.count()
measurement.sample_size = incidents.count()
measurement.save()
logger.info(f"Updated KPI measurement for {metric.name}: {value} {unit}")
except Exception as e:
logger.error(f"Error calculating KPI measurement for {metric.name}: {str(e)}")
# Management command signals for scheduled tasks
@receiver(post_save, sender=PredictiveModel)
def schedule_model_training(sender, instance, created, **kwargs):
"""Schedule model training when a new predictive model is created"""
try:
if created and instance.status == 'TRAINING':
# In a real implementation, you would schedule a background task
# For now, we'll just log the event
logger.info(f"Scheduled training for model {instance.name}")
except Exception as e:
logger.error(f"Error scheduling model training for {instance.name}: {str(e)}")
@receiver(post_save, sender=PredictiveModel)
def trigger_model_retraining(sender, instance, created, **kwargs):
"""Trigger model retraining when performance drops below threshold"""
try:
if not created and instance.auto_retrain_enabled and instance.status != 'RETRAINING':
# Check if model performance is below threshold
if (instance.accuracy_score and
instance.accuracy_score < instance.performance_threshold):
# Update via queryset so this post_save handler is not re-triggered
PredictiveModel.objects.filter(pk=instance.pk).update(status='RETRAINING')
logger.info(f"Triggered retraining for model {instance.name} due to low performance")
except Exception as e:
logger.error(f"Error triggering model retraining for {instance.name}: {str(e)}")

View File

@@ -0,0 +1,3 @@
from django.test import TestCase
# Create your tests here.

View File

@@ -0,0 +1,50 @@
"""
URL configuration for analytics_predictive_insights app
"""
from django.urls import path, include
from rest_framework.routers import DefaultRouter
from .views.analytics import (
KPIMetricViewSet, KPIMeasurementViewSet, IncidentRecurrenceAnalysisViewSet,
PredictiveModelViewSet, AnomalyDetectionViewSet, CostImpactAnalysisViewSet,
DashboardConfigurationViewSet, HeatmapDataViewSet, PredictiveInsightViewSet
)
# Create router and register viewsets
router = DefaultRouter()
router.register(r'kpi-metrics', KPIMetricViewSet)
router.register(r'kpi-measurements', KPIMeasurementViewSet)
router.register(r'recurrence-analyses', IncidentRecurrenceAnalysisViewSet)
router.register(r'predictive-models', PredictiveModelViewSet)
router.register(r'anomaly-detections', AnomalyDetectionViewSet)
router.register(r'cost-analyses', CostImpactAnalysisViewSet)
router.register(r'dashboard-configurations', DashboardConfigurationViewSet)
router.register(r'heatmap-data', HeatmapDataViewSet)
router.register(r'predictive-insights', PredictiveInsightViewSet)
app_name = 'analytics_predictive_insights'
urlpatterns = [
# Include router URLs
path('', include(router.urls)),
# Additional custom endpoints (the summary routes below duplicate the
# router-generated action URLs but provide stable names)
path('dashboard/<uuid:pk>/data/',
DashboardConfigurationViewSet.as_view({'get': 'data'}),
name='dashboard-data'),
path('kpi-metrics/summary/',
KPIMetricViewSet.as_view({'get': 'summary'}),
name='kpi-summary'),
path('anomaly-detections/summary/',
AnomalyDetectionViewSet.as_view({'get': 'summary'}),
name='anomaly-summary'),
path('cost-analyses/summary/',
CostImpactAnalysisViewSet.as_view({'get': 'summary'}),
name='cost-summary'),
path('predictive-insights/summary/',
PredictiveInsightViewSet.as_view({'get': 'summary'}),
name='insight-summary'),
]
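# Named-URL usage sketch (names match the patterns above; the resolved path
# assumes this app is mounted under /api/analytics/ as documented):
#
#   from django.urls import reverse
#   reverse('analytics_predictive_insights:kpi-summary')
#   # -> '/api/analytics/kpi-metrics/summary/'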

View File

@@ -0,0 +1,3 @@
from django.shortcuts import render
# Create your views here.

View File

@@ -0,0 +1 @@
# Analytics & Predictive Insights views

View File

@@ -0,0 +1,714 @@
"""
Analytics & Predictive Insights views for Enterprise Incident Management API
Implements comprehensive analytics endpoints for KPIs, predictive insights, and dashboards
"""
from rest_framework import viewsets, status, permissions
from rest_framework.decorators import action
from rest_framework.response import Response
from rest_framework.pagination import PageNumberPagination
from django_filters.rest_framework import DjangoFilterBackend
from django_filters import rest_framework as filters
from django.db.models import F, Q, Avg, Count, Sum, Max, Min
from django.utils import timezone
from datetime import datetime, timedelta
from decimal import Decimal
from ..models import (
KPIMetric, KPIMeasurement, IncidentRecurrenceAnalysis, PredictiveModel,
AnomalyDetection, CostImpactAnalysis, DashboardConfiguration,
HeatmapData, PredictiveInsight
)
from ..serializers.analytics import (
KPIMetricSerializer, KPIMeasurementSerializer, IncidentRecurrenceAnalysisSerializer,
PredictiveModelSerializer, AnomalyDetectionSerializer, CostImpactAnalysisSerializer,
DashboardConfigurationSerializer, HeatmapDataSerializer, PredictiveInsightSerializer,
KPISummarySerializer, AnomalySummarySerializer, CostSummarySerializer,
PredictiveInsightSummarySerializer, DashboardDataSerializer
)
class StandardResultsSetPagination(PageNumberPagination):
"""Standard pagination for analytics endpoints"""
page_size = 20
page_size_query_param = 'page_size'
max_page_size = 100
class KPIMetricFilter(filters.FilterSet):
"""Filter for KPI metrics"""
metric_type = filters.ChoiceFilter(choices=KPIMetric.METRIC_TYPES)
is_active = filters.BooleanFilter()
is_system_metric = filters.BooleanFilter()
created_after = filters.DateTimeFilter(field_name='created_at', lookup_expr='gte')
created_before = filters.DateTimeFilter(field_name='created_at', lookup_expr='lte')
class Meta:
model = KPIMetric
fields = ['metric_type', 'is_active', 'is_system_metric', 'created_after', 'created_before']
class KPIMetricViewSet(viewsets.ModelViewSet):
"""ViewSet for KPI metrics management"""
queryset = KPIMetric.objects.all()
serializer_class = KPIMetricSerializer
pagination_class = StandardResultsSetPagination
filter_backends = [DjangoFilterBackend]
filterset_class = KPIMetricFilter
permission_classes = [permissions.IsAuthenticated]
def get_queryset(self):
"""Filter queryset based on user permissions"""
queryset = super().get_queryset()
# Add any additional filtering based on user permissions
if not self.request.user.is_staff:
# Non-staff users can only see active metrics
queryset = queryset.filter(is_active=True)
return queryset.order_by('-created_at')
@action(detail=True, methods=['get'])
def measurements(self, request, pk=None):
"""Get measurements for a specific KPI metric"""
metric = self.get_object()
measurements = metric.measurements.all().order_by('-calculated_at')
# Apply date filtering if provided
start_date = request.query_params.get('start_date')
end_date = request.query_params.get('end_date')
if start_date:
measurements = measurements.filter(measurement_period_start__gte=start_date)
if end_date:
measurements = measurements.filter(measurement_period_end__lte=end_date)
# Paginate results
paginator = StandardResultsSetPagination()
page = paginator.paginate_queryset(measurements, request)
if page is not None:
serializer = KPIMeasurementSerializer(page, many=True)
return paginator.get_paginated_response(serializer.data)
serializer = KPIMeasurementSerializer(measurements, many=True)
return Response(serializer.data)
@action(detail=False, methods=['get'])
def summary(self, request):
"""Get summary of all KPI metrics"""
metrics = self.get_queryset()
# Get latest measurements for each metric
summaries = []
for metric in metrics:
latest_measurement = metric.measurements.first()
if latest_measurement:
# Calculate trend (simplified - compare with the previous measurement);
# a negative percentage indicates a downward trend, matching the API docs
previous_measurement = metric.measurements.all()[1:2].first()
trend = 'stable'
trend_percentage = Decimal('0.00')
if previous_measurement:
if latest_measurement.value > previous_measurement.value:
trend = 'up'
elif latest_measurement.value < previous_measurement.value:
trend = 'down'
# Guard against division by zero when the previous value is 0
if trend != 'stable' and previous_measurement.value != 0:
trend_percentage = ((latest_measurement.value - previous_measurement.value) / previous_measurement.value) * 100
summary_data = {
'metric_type': metric.metric_type,
'metric_name': metric.name,
'current_value': latest_measurement.value,
'unit': latest_measurement.unit,
'trend': trend,
'trend_percentage': trend_percentage,
'period_start': latest_measurement.measurement_period_start,
'period_end': latest_measurement.measurement_period_end,
'incident_count': latest_measurement.incident_count,
'target_value': None, # Could be added to metric model
'target_met': True # Could be calculated based on target
}
summaries.append(summary_data)
serializer = KPISummarySerializer(summaries, many=True)
return Response(serializer.data)
class KPIMeasurementViewSet(viewsets.ReadOnlyModelViewSet):
"""ViewSet for KPI measurements (read-only)"""
queryset = KPIMeasurement.objects.all()
serializer_class = KPIMeasurementSerializer
pagination_class = StandardResultsSetPagination
permission_classes = [permissions.IsAuthenticated]
def get_queryset(self):
"""Filter queryset based on query parameters"""
queryset = super().get_queryset()
# Filter by metric
metric_id = self.request.query_params.get('metric_id')
if metric_id:
queryset = queryset.filter(metric_id=metric_id)
# Filter by date range
start_date = self.request.query_params.get('start_date')
end_date = self.request.query_params.get('end_date')
if start_date:
queryset = queryset.filter(measurement_period_start__gte=start_date)
if end_date:
queryset = queryset.filter(measurement_period_end__lte=end_date)
return queryset.order_by('-calculated_at')
class IncidentRecurrenceAnalysisViewSet(viewsets.ReadOnlyModelViewSet):
"""ViewSet for incident recurrence analysis"""
queryset = IncidentRecurrenceAnalysis.objects.all()
serializer_class = IncidentRecurrenceAnalysisSerializer
pagination_class = StandardResultsSetPagination
permission_classes = [permissions.IsAuthenticated]
def get_queryset(self):
"""Filter queryset based on query parameters"""
queryset = super().get_queryset()
# Filter by recurrence type
recurrence_type = self.request.query_params.get('recurrence_type')
if recurrence_type:
queryset = queryset.filter(recurrence_type=recurrence_type)
# Filter by confidence score
min_confidence = self.request.query_params.get('min_confidence')
if min_confidence:
queryset = queryset.filter(confidence_score__gte=float(min_confidence))
# Filter by resolution status
is_resolved = self.request.query_params.get('is_resolved')
if is_resolved is not None:
queryset = queryset.filter(is_resolved=is_resolved.lower() == 'true')
return queryset.order_by('-confidence_score', '-created_at')
@action(detail=False, methods=['get'])
def unresolved(self, request):
"""Get unresolved recurrence analyses"""
queryset = self.get_queryset().filter(is_resolved=False)
paginator = StandardResultsSetPagination()
page = paginator.paginate_queryset(queryset, request)
if page is not None:
serializer = self.get_serializer(page, many=True)
return paginator.get_paginated_response(serializer.data)
serializer = self.get_serializer(queryset, many=True)
return Response(serializer.data)
class PredictiveModelViewSet(viewsets.ModelViewSet):
"""ViewSet for predictive models management"""
queryset = PredictiveModel.objects.all()
serializer_class = PredictiveModelSerializer
pagination_class = StandardResultsSetPagination
permission_classes = [permissions.IsAuthenticated]
def get_queryset(self):
"""Filter queryset based on user permissions"""
queryset = super().get_queryset()
# Filter by model type
model_type = self.request.query_params.get('model_type')
if model_type:
queryset = queryset.filter(model_type=model_type)
# Filter by status
status_filter = self.request.query_params.get('status')
if status_filter:
queryset = queryset.filter(status=status_filter)
return queryset.order_by('-created_at')
@action(detail=True, methods=['post'])
def train(self, request, pk=None):
"""Trigger model training"""
model = self.get_object()
# Update model status to training
model.status = 'TRAINING'
model.save()
# Here you would typically trigger the actual training process
# For now, we'll just return a success response
return Response({
'message': f'Training started for model {model.name}',
'model_id': str(model.id),
'status': model.status
}, status=status.HTTP_202_ACCEPTED)
@action(detail=True, methods=['get'])
def performance(self, request, pk=None):
"""Get model performance metrics"""
model = self.get_object()
performance_data = {
'accuracy': model.accuracy_score,
'precision': model.precision_score,
'recall': model.recall_score,
'f1_score': model.f1_score,
'training_samples': model.training_samples_count,
'last_trained': model.last_trained_at,
'training_duration': model.training_duration_seconds,
'insight_count': model.insights.count(),
'anomaly_detection_count': model.anomaly_detections.count()
}
return Response(performance_data)
class AnomalyDetectionViewSet(viewsets.ReadOnlyModelViewSet):
"""ViewSet for anomaly detection results"""
queryset = AnomalyDetection.objects.all()
serializer_class = AnomalyDetectionSerializer
pagination_class = StandardResultsSetPagination
permission_classes = [permissions.IsAuthenticated]
def get_queryset(self):
"""Filter queryset based on query parameters"""
queryset = super().get_queryset()
# Filter by anomaly type
anomaly_type = self.request.query_params.get('anomaly_type')
if anomaly_type:
queryset = queryset.filter(anomaly_type=anomaly_type)
# Filter by severity
severity = self.request.query_params.get('severity')
if severity:
queryset = queryset.filter(severity=severity)
# Filter by status
status_filter = self.request.query_params.get('status')
if status_filter:
queryset = queryset.filter(status=status_filter)
# Filter by date range
start_date = self.request.query_params.get('start_date')
end_date = self.request.query_params.get('end_date')
if start_date:
queryset = queryset.filter(detected_at__gte=start_date)
if end_date:
queryset = queryset.filter(detected_at__lte=end_date)
return queryset.order_by('-detected_at')
@action(detail=False, methods=['get'])
def summary(self, request):
"""Get anomaly detection summary"""
queryset = self.get_queryset()
# Calculate summary statistics
total_anomalies = queryset.count()
critical_anomalies = queryset.filter(severity='CRITICAL').count()
high_anomalies = queryset.filter(severity='HIGH').count()
medium_anomalies = queryset.filter(severity='MEDIUM').count()
low_anomalies = queryset.filter(severity='LOW').count()
unresolved_anomalies = queryset.filter(status__in=['DETECTED', 'INVESTIGATING']).count()
# Calculate false positive rate (simplified)
false_positives = queryset.filter(status='FALSE_POSITIVE').count()
false_positive_rate = (false_positives / total_anomalies * 100) if total_anomalies > 0 else 0
# Calculate average resolution time
resolved_anomalies = queryset.filter(status='RESOLVED', resolved_at__isnull=False)
if resolved_anomalies.exists():
avg_resolution_time = resolved_anomalies.aggregate(
avg_time=Avg(F('resolved_at') - F('detected_at'))
)['avg_time']
else:
avg_resolution_time = None
summary_data = {
'total_anomalies': total_anomalies,
'critical_anomalies': critical_anomalies,
'high_anomalies': high_anomalies,
'medium_anomalies': medium_anomalies,
'low_anomalies': low_anomalies,
'unresolved_anomalies': unresolved_anomalies,
'false_positive_rate': Decimal(str(false_positive_rate)),
'average_resolution_time': avg_resolution_time
}
serializer = AnomalySummarySerializer(summary_data)
return Response(serializer.data)
@action(detail=True, methods=['post'])
def acknowledge(self, request, pk=None):
"""Acknowledge an anomaly detection"""
anomaly = self.get_object()
if anomaly.status == 'DETECTED':
anomaly.status = 'INVESTIGATING'
anomaly.save()
return Response({
'message': 'Anomaly acknowledged and moved to investigating status',
'anomaly_id': str(anomaly.id),
'status': anomaly.status
})
return Response({
'error': 'Anomaly is not in DETECTED status'
}, status=status.HTTP_400_BAD_REQUEST)
@action(detail=True, methods=['post'])
def resolve(self, request, pk=None):
"""Resolve an anomaly detection"""
anomaly = self.get_object()
if anomaly.status in ['DETECTED', 'INVESTIGATING', 'CONFIRMED']:
anomaly.status = 'RESOLVED'
anomaly.resolved_at = timezone.now()
anomaly.resolved_by = request.user
anomaly.save()
return Response({
'message': 'Anomaly resolved',
'anomaly_id': str(anomaly.id),
'status': anomaly.status,
'resolved_at': anomaly.resolved_at
})
return Response({
'error': 'Anomaly cannot be resolved in current status'
}, status=status.HTTP_400_BAD_REQUEST)
class CostImpactAnalysisViewSet(viewsets.ReadOnlyModelViewSet):
"""ViewSet for cost impact analysis"""
queryset = CostImpactAnalysis.objects.all()
serializer_class = CostImpactAnalysisSerializer
pagination_class = StandardResultsSetPagination
permission_classes = [permissions.IsAuthenticated]
def get_queryset(self):
"""Filter queryset based on query parameters"""
queryset = super().get_queryset()
# Filter by cost type
cost_type = self.request.query_params.get('cost_type')
if cost_type:
queryset = queryset.filter(cost_type=cost_type)
# Filter by validation status
is_validated = self.request.query_params.get('is_validated')
if is_validated is not None:
queryset = queryset.filter(is_validated=is_validated.lower() == 'true')
# Filter by date range
start_date = self.request.query_params.get('start_date')
end_date = self.request.query_params.get('end_date')
if start_date:
queryset = queryset.filter(created_at__gte=start_date)
if end_date:
queryset = queryset.filter(created_at__lte=end_date)
return queryset.order_by('-created_at')
@action(detail=False, methods=['get'])
def summary(self, request):
"""Get cost impact summary"""
queryset = self.get_queryset()
# Calculate summary statistics
total_cost = queryset.aggregate(total=Sum('cost_amount'))['total'] or Decimal('0')
downtime_cost = queryset.filter(cost_type='DOWNTIME').aggregate(total=Sum('cost_amount'))['total'] or Decimal('0')
lost_revenue = queryset.filter(cost_type='LOST_REVENUE').aggregate(total=Sum('cost_amount'))['total'] or Decimal('0')
penalty_cost = queryset.filter(cost_type='PENALTY').aggregate(total=Sum('cost_amount'))['total'] or Decimal('0')
resource_cost = queryset.filter(cost_type='RESOURCE_COST').aggregate(total=Sum('cost_amount'))['total'] or Decimal('0')
total_downtime_hours = queryset.aggregate(total=Sum('downtime_hours'))['total'] or Decimal('0')
total_affected_users = queryset.aggregate(total=Sum('affected_users'))['total'] or 0
# Calculate derived metrics
cost_per_hour = (total_cost / total_downtime_hours) if total_downtime_hours > 0 else Decimal('0')
cost_per_user = (total_cost / total_affected_users) if total_affected_users > 0 else Decimal('0')
summary_data = {
'total_cost': total_cost,
'currency': 'USD',
'downtime_cost': downtime_cost,
'lost_revenue': lost_revenue,
'penalty_cost': penalty_cost,
'resource_cost': resource_cost,
'total_downtime_hours': total_downtime_hours,
'total_affected_users': total_affected_users,
'cost_per_hour': cost_per_hour,
'cost_per_user': cost_per_user
}
serializer = CostSummarySerializer(summary_data)
return Response(serializer.data)
class DashboardConfigurationViewSet(viewsets.ModelViewSet):
"""ViewSet for dashboard configurations"""
queryset = DashboardConfiguration.objects.all()
serializer_class = DashboardConfigurationSerializer
pagination_class = StandardResultsSetPagination
permission_classes = [permissions.IsAuthenticated]
def get_queryset(self):
"""Filter queryset based on user permissions"""
queryset = super().get_queryset()
# Filter by dashboard type
dashboard_type = self.request.query_params.get('dashboard_type')
if dashboard_type:
queryset = queryset.filter(dashboard_type=dashboard_type)
# Filter by active status
is_active = self.request.query_params.get('is_active')
if is_active is not None:
queryset = queryset.filter(is_active=is_active.lower() == 'true')
# Filter by public dashboards or user's accessible dashboards
if not self.request.user.is_staff:
queryset = queryset.filter(
Q(is_public=True) | Q(allowed_users=self.request.user)
)
return queryset.order_by('name')
@action(detail=True, methods=['get'])
def data(self, request, pk=None):
"""Get dashboard data"""
dashboard = self.get_object()
# Check if user has access to this dashboard
if not dashboard.is_public and not dashboard.allowed_users.filter(pk=request.user.pk).exists():
return Response({
'error': 'Access denied to this dashboard'
}, status=status.HTTP_403_FORBIDDEN)
# Get KPI summary
kpi_metrics = KPIMetric.objects.filter(is_active=True)
kpi_summaries = []
for metric in kpi_metrics:
latest_measurement = metric.measurements.first()
if latest_measurement:
kpi_summaries.append({
'metric_type': metric.metric_type,
'metric_name': metric.name,
'current_value': latest_measurement.value,
'unit': latest_measurement.unit,
'trend': 'stable', # Simplified
'trend_percentage': Decimal('0.00'),
'period_start': latest_measurement.measurement_period_start,
'period_end': latest_measurement.measurement_period_end,
'incident_count': latest_measurement.incident_count,
'target_value': None,
'target_met': True
})
# Get anomaly summary
anomalies = AnomalyDetection.objects.all()
anomaly_summary = {
'total_anomalies': anomalies.count(),
'critical_anomalies': anomalies.filter(severity='CRITICAL').count(),
'high_anomalies': anomalies.filter(severity='HIGH').count(),
'medium_anomalies': anomalies.filter(severity='MEDIUM').count(),
'low_anomalies': anomalies.filter(severity='LOW').count(),
'unresolved_anomalies': anomalies.filter(status__in=['DETECTED', 'INVESTIGATING']).count(),
'false_positive_rate': Decimal('0.00'), # Simplified
'average_resolution_time': None
}
# Get cost summary
cost_analyses = CostImpactAnalysis.objects.all()
cost_summary = {
'total_cost': cost_analyses.aggregate(total=Sum('cost_amount'))['total'] or Decimal('0'),
'currency': 'USD',
'downtime_cost': cost_analyses.filter(cost_type='DOWNTIME').aggregate(total=Sum('cost_amount'))['total'] or Decimal('0'),
'lost_revenue': cost_analyses.filter(cost_type='LOST_REVENUE').aggregate(total=Sum('cost_amount'))['total'] or Decimal('0'),
'penalty_cost': cost_analyses.filter(cost_type='PENALTY').aggregate(total=Sum('cost_amount'))['total'] or Decimal('0'),
'resource_cost': cost_analyses.filter(cost_type='RESOURCE_COST').aggregate(total=Sum('cost_amount'))['total'] or Decimal('0'),
'total_downtime_hours': cost_analyses.aggregate(total=Sum('downtime_hours'))['total'] or Decimal('0'),
'total_affected_users': cost_analyses.aggregate(total=Sum('affected_users'))['total'] or 0,
'cost_per_hour': Decimal('0.00'),
'cost_per_user': Decimal('0.00')
}
# Get insight summary
insights = PredictiveInsight.objects.all()
insight_summary = {
'total_insights': insights.count(),
'high_confidence_insights': insights.filter(confidence_level='HIGH').count(),
'medium_confidence_insights': insights.filter(confidence_level='MEDIUM').count(),
'low_confidence_insights': insights.filter(confidence_level='LOW').count(),
'acknowledged_insights': insights.filter(is_acknowledged=True).count(),
'validated_insights': insights.filter(is_validated=True).count(),
'expired_insights': insights.filter(expires_at__lt=timezone.now()).count(),
'average_accuracy': Decimal('0.00'),
'active_models': PredictiveModel.objects.filter(status='ACTIVE').count()
}
# Get recent data
recent_anomalies = anomalies.order_by('-detected_at')[:5]
recent_insights = insights.order_by('-generated_at')[:5]
heatmap_data = HeatmapData.objects.all()[:3]
dashboard_data = {
'kpi_summary': kpi_summaries,
'anomaly_summary': anomaly_summary,
'cost_summary': cost_summary,
'insight_summary': insight_summary,
'recent_anomalies': AnomalyDetectionSerializer(recent_anomalies, many=True).data,
'recent_insights': PredictiveInsightSerializer(recent_insights, many=True).data,
'heatmap_data': HeatmapDataSerializer(heatmap_data, many=True).data,
'last_updated': timezone.now()
}
serializer = DashboardDataSerializer(dashboard_data)
return Response(serializer.data)
class HeatmapDataViewSet(viewsets.ReadOnlyModelViewSet):
"""ViewSet for heatmap data"""
queryset = HeatmapData.objects.all()
serializer_class = HeatmapDataSerializer
pagination_class = StandardResultsSetPagination
permission_classes = [permissions.IsAuthenticated]
def get_queryset(self):
"""Filter queryset based on query parameters"""
queryset = super().get_queryset()
# Filter by heatmap type
heatmap_type = self.request.query_params.get('heatmap_type')
if heatmap_type:
queryset = queryset.filter(heatmap_type=heatmap_type)
# Filter by time granularity
time_granularity = self.request.query_params.get('time_granularity')
if time_granularity:
queryset = queryset.filter(time_granularity=time_granularity)
return queryset.order_by('-created_at')
class PredictiveInsightViewSet(viewsets.ReadOnlyModelViewSet):
"""ViewSet for predictive insights"""
queryset = PredictiveInsight.objects.all()
serializer_class = PredictiveInsightSerializer
pagination_class = StandardResultsSetPagination
permission_classes = [permissions.IsAuthenticated]
def get_queryset(self):
"""Filter queryset based on query parameters"""
queryset = super().get_queryset()
# Filter by insight type
insight_type = self.request.query_params.get('insight_type')
if insight_type:
queryset = queryset.filter(insight_type=insight_type)
# Filter by confidence level
confidence_level = self.request.query_params.get('confidence_level')
if confidence_level:
queryset = queryset.filter(confidence_level=confidence_level)
# Filter by acknowledgment status
is_acknowledged = self.request.query_params.get('is_acknowledged')
if is_acknowledged is not None:
queryset = queryset.filter(is_acknowledged=is_acknowledged.lower() == 'true')
# Filter by validation status
is_validated = self.request.query_params.get('is_validated')
if is_validated is not None:
queryset = queryset.filter(is_validated=is_validated.lower() == 'true')
# Filter by expiry
include_expired = self.request.query_params.get('include_expired', 'false')
if include_expired.lower() != 'true':
queryset = queryset.filter(expires_at__gt=timezone.now())
return queryset.order_by('-generated_at')
@action(detail=True, methods=['post'])
def acknowledge(self, request, pk=None):
"""Acknowledge a predictive insight"""
insight = self.get_object()
if not insight.is_acknowledged:
insight.is_acknowledged = True
insight.acknowledged_by = request.user
insight.acknowledged_at = timezone.now()
insight.save()
return Response({
'message': 'Insight acknowledged',
'insight_id': str(insight.id),
'acknowledged_at': insight.acknowledged_at
})
return Response({
'error': 'Insight is already acknowledged'
}, status=status.HTTP_400_BAD_REQUEST)
@action(detail=False, methods=['get'])
def summary(self, request):
"""Get predictive insight summary"""
queryset = self.get_queryset()
# Calculate summary statistics
total_insights = queryset.count()
high_confidence_insights = queryset.filter(confidence_level='HIGH').count()
medium_confidence_insights = queryset.filter(confidence_level='MEDIUM').count()
low_confidence_insights = queryset.filter(confidence_level='LOW').count()
acknowledged_insights = queryset.filter(is_acknowledged=True).count()
validated_insights = queryset.filter(is_validated=True).count()
expired_insights = queryset.filter(expires_at__lt=timezone.now()).count()
# Calculate average accuracy
validated_insights_with_accuracy = queryset.filter(
is_validated=True,
validation_accuracy__isnull=False
)
if validated_insights_with_accuracy.exists():
avg_accuracy = validated_insights_with_accuracy.aggregate(
avg=Avg('validation_accuracy')
)['avg']
else:
avg_accuracy = None
active_models = PredictiveModel.objects.filter(status='ACTIVE').count()
summary_data = {
'total_insights': total_insights,
'high_confidence_insights': high_confidence_insights,
'medium_confidence_insights': medium_confidence_insights,
'low_confidence_insights': low_confidence_insights,
'acknowledged_insights': acknowledged_insights,
'validated_insights': validated_insights,
'expired_insights': expired_insights,
'average_accuracy': avg_accuracy,
'active_models': active_models
}
serializer = PredictiveInsightSummarySerializer(summary_data)
return Response(serializer.data)
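# Quick verification sketch for the summary endpoint above, using DRF's test
# client (`some_user` is a placeholder; the path assumes the /api/analytics/
# mount documented for this app):
#
#   from rest_framework.test import APIClient
#   client = APIClient()
#   client.force_authenticate(user=some_user)
#   resp = client.get('/api/analytics/predictive-insights/summary/', {'confidence_level': 'HIGH'})
#   assert resp.status_code == 200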