Iliyan Angelov
2025-09-19 11:58:53 +03:00
parent 306b20e24a
commit 6b247e5b9f
11423 changed files with 1500615 additions and 778 deletions

@@ -0,0 +1,477 @@
# Automation & Orchestration API Documentation
## Overview
The Automation & Orchestration module automates incident management. It covers runbooks, integrations with external systems, ChatOps commands, auto-remediation, maintenance windows, and reusable workflow templates.
## Features
### 1. Runbook Automation
- **Predefined Response Steps**: Create and manage automated response procedures
- **Multiple Trigger Types**: Manual, automatic, scheduled, webhook, and ChatOps triggers
- **Execution Tracking**: Monitor runbook execution status and performance
- **Version Control**: Track runbook versions and changes
### 2. External System Integrations
- **ITSM Tools**: Jira and ServiceNow
- **CI/CD Tools**: GitHub, Jenkins, Ansible, Terraform
- **Chat Platforms**: Slack, Microsoft Teams, Discord, Mattermost
- **Generic APIs**: Webhook and API integrations
- **Health Monitoring**: Integration health checks and status tracking
### 3. ChatOps Integration
- **Command Execution**: Trigger workflows from chat platforms
- **Security Controls**: User and channel-based access control
- **Command History**: Track and audit ChatOps commands
- **Multi-Platform Support**: Slack, Teams, Discord, Mattermost
### 4. Auto-Remediation
- **Automatic Response**: Trigger remediation actions based on incident conditions
- **Safety Controls**: Approval workflows and execution limits
- **Multiple Remediation Types**: Service restart, deployment rollback, scaling, etc.
- **Execution Tracking**: Monitor remediation success rates and performance
### 5. Maintenance Windows
- **Scheduled Suppression**: Suppress alerts during planned maintenance
- **Service-Specific**: Target specific services and components
- **Flexible Configuration**: Control incident creation, notifications, and escalations
- **Status Management**: Automatic status updates based on schedule
### 6. Workflow Templates
- **Reusable Workflows**: Create templates for common automation scenarios
- **Parameterized Execution**: Support for input parameters and output schemas
- **Template Types**: Incident response, deployment, maintenance, scaling, monitoring
- **Usage Tracking**: Monitor template usage and performance
## API Endpoints
### Runbooks
#### List Runbooks
```
GET /api/automation/runbooks/
```
**Query Parameters:**
- `status`: Filter by status (DRAFT, ACTIVE, INACTIVE, DEPRECATED)
- `trigger_type`: Filter by trigger type (MANUAL, AUTOMATIC, SCHEDULED, WEBHOOK, CHATOPS)
- `category`: Filter by category
- `is_public`: Filter by public/private status
- `search`: Search in name, description, category
**Response:**
```json
{
"count": 10,
"next": null,
"previous": null,
"results": [
{
"id": "uuid",
"name": "Database Service Restart",
"description": "Automated runbook for restarting database services",
"version": "1.0",
"trigger_type": "AUTOMATIC",
"trigger_conditions": {
"severity": ["CRITICAL", "EMERGENCY"],
"category": "database"
},
"steps": [...],
"estimated_duration": "00:05:00",
"category": "database",
"tags": ["database", "restart", "automation"],
"status": "ACTIVE",
"is_public": true,
"execution_count": 5,
"success_rate": 0.8,
"can_trigger": true,
"created_at": "2024-01-15T10:00:00Z",
"updated_at": "2024-01-15T10:00:00Z"
}
]
}
```
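The query parameters above combine into an ordinary query string. The following is a minimal sketch of building the list URL, not a client shipped with this module; `BASE_URL` and `runbook_list_url` are hypothetical names introduced for illustration.

```python
from urllib.parse import urlencode

# Hypothetical base URL; replace with the address of your deployment.
BASE_URL = "https://example.com"
RUNBOOKS_PATH = "/api/automation/runbooks/"


def runbook_list_url(**filters):
    """Build the runbook list URL, dropping filters that are left as None."""
    params = {key: value for key, value in filters.items() if value is not None}
    query = urlencode(params)
    url = f"{BASE_URL}{RUNBOOKS_PATH}"
    return f"{url}?{query}" if query else url
```

For example, `runbook_list_url(status="ACTIVE", trigger_type="AUTOMATIC")` targets only active runbooks that trigger automatically.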
#### Create Runbook
```
POST /api/automation/runbooks/
```
**Request Body:**
```json
{
"name": "New Runbook",
"description": "Description of the runbook",
"version": "1.0",
"trigger_type": "MANUAL",
"trigger_conditions": {
"severity": ["HIGH", "CRITICAL"]
},
"steps": [
{
"name": "Step 1",
"action": "check_status",
"timeout": 30,
"parameters": {"service": "web"}
}
],
"estimated_duration": "00:05:00",
"category": "web",
"tags": ["web", "restart"],
"status": "DRAFT",
"is_public": true
}
```
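Each entry in `steps` in the request body above carries a `name`, an `action`, a `timeout`, and `parameters`. A client may want to sanity-check that shape before sending the request. The sketch below assumes all four keys are required and that `timeout` is a positive integer; the server-side validation rules are not shown in this document, and `validate_steps` is a hypothetical helper.

```python
# Assumed required keys, taken from the example request body above.
REQUIRED_STEP_KEYS = {"name", "action", "timeout", "parameters"}


def validate_steps(steps):
    """Return a list of problems found; an empty list means the steps look sane."""
    problems = []
    for index, step in enumerate(steps, start=1):
        missing = REQUIRED_STEP_KEYS - step.keys()
        if missing:
            problems.append(f"step {index}: missing {sorted(missing)}")
            continue
        if not isinstance(step["timeout"], int) or step["timeout"] <= 0:
            problems.append(f"step {index}: timeout must be a positive integer")
    return problems
```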
#### Execute Runbook
```
POST /api/automation/runbooks/{id}/execute/
```
**Request Body:**
```json
{
"trigger_data": {
"incident_id": "uuid",
"context": "additional context"
}
}
```
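As a rough illustration, the execute request above can be assembled with the Python standard library. This sketch only builds the request object and does not send it. The `Authorization: Token ...` header format is an assumption (the document mentions token authentication but not its header format), and `build_execute_request` is a hypothetical helper.

```python
import json
from urllib.request import Request


def build_execute_request(base_url, runbook_id, incident_id, token, context=""):
    """Build (but do not send) a POST request that executes a runbook."""
    body = {"trigger_data": {"incident_id": incident_id, "context": context}}
    return Request(
        url=f"{base_url}/api/automation/runbooks/{runbook_id}/execute/",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed header format; adjust to your authentication setup.
            "Authorization": f"Token {token}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; keeping construction separate makes the request easy to inspect in tests.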
### Integrations
#### List Integrations
```
GET /api/automation/integrations/
```
**Query Parameters:**
- `integration_type`: Filter by type (JIRA, GITHUB, JENKINS, etc.)
- `status`: Filter by status (ACTIVE, INACTIVE, ERROR, CONFIGURING)
- `health_status`: Filter by health status (HEALTHY, WARNING, ERROR, UNKNOWN)
#### Test Integration Connection
```
POST /api/automation/integrations/{id}/test_connection/
```
#### Perform Health Check
```
POST /api/automation/integrations/{id}/health_check/
```
### ChatOps
#### List ChatOps Integrations
```
GET /api/automation/chatops-integrations/
```
#### List ChatOps Commands
```
GET /api/automation/chatops-commands/
```
**Query Parameters:**
- `status`: Filter by execution status
- `chatops_integration`: Filter by integration
- `command`: Filter by command name
- `user_id`: Filter by user ID
- `channel_id`: Filter by channel ID
### Auto-Remediation
#### List Auto-Remediations
```
GET /api/automation/auto-remediations/
```
**Query Parameters:**
- `remediation_type`: Filter by type (SERVICE_RESTART, DEPLOYMENT_ROLLBACK, etc.)
- `trigger_condition_type`: Filter by trigger condition type
- `is_active`: Filter by active status
- `requires_approval`: Filter by approval requirement
#### Approve Auto-Remediation Execution
```
POST /api/automation/auto-remediation-executions/{id}/approve/
```
**Request Body:**
```json
{
"approval_notes": "Approved for execution"
}
```
#### Reject Auto-Remediation Execution
```
POST /api/automation/auto-remediation-executions/{id}/reject/
```
**Request Body:**
```json
{
"rejection_notes": "Rejected due to risk concerns"
}
```
### Maintenance Windows
#### List Maintenance Windows
```
GET /api/automation/maintenance-windows/
```
#### Get Active Maintenance Windows
```
GET /api/automation/maintenance-windows/active/
```
#### Get Upcoming Maintenance Windows
```
GET /api/automation/maintenance-windows/upcoming/
```
### Workflow Templates
#### List Workflow Templates
```
GET /api/automation/workflow-templates/
```
**Query Parameters:**
- `template_type`: Filter by type (INCIDENT_RESPONSE, DEPLOYMENT, etc.)
- `is_public`: Filter by public/private status
## Data Models
### Runbook
- **id**: UUID primary key
- **name**: Unique name for the runbook
- **description**: Detailed description
- **version**: Version string
- **trigger_type**: How the runbook is triggered
- **trigger_conditions**: JSON conditions for triggering
- **steps**: JSON array of execution steps
- **estimated_duration**: Expected execution time
- **category**: Categorization
- **tags**: JSON array of tags
- **status**: Current status
- **is_public**: Public/private visibility
- **execution_count**: Number of executions
- **success_rate**: Success rate (0.0-1.0)
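The relationship between `execution_count` and `success_rate` can be sketched as a simple ratio. This assumes the rate is successful executions divided by total executions and is `0.0` before the first run; the actual server-side calculation is not shown in this document.

```python
def success_rate(successful_executions, execution_count):
    """Return the fraction of successful executions, in the range 0.0 to 1.0."""
    if execution_count == 0:
        # Assumption: a runbook that has never run reports 0.0.
        return 0.0
    return successful_executions / execution_count
```

Under this assumption, the `success_rate` of `0.8` with an `execution_count` of `5` in the list response corresponds to four successful runs.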
### Integration
- **id**: UUID primary key
- **name**: Unique name for the integration
- **integration_type**: Type of integration (JIRA, GITHUB, etc.)
- **description**: Description
- **configuration**: JSON configuration data
- **authentication_config**: JSON authentication data
- **status**: Integration status
- **health_status**: Health status
- **request_count**: Number of requests made
- **last_used_at**: Last usage timestamp
### ChatOpsIntegration
- **id**: UUID primary key
- **name**: Unique name
- **platform**: Chat platform (SLACK, TEAMS, etc.)
- **webhook_url**: Webhook URL
- **bot_token**: Bot authentication token
- **channel_id**: Default channel ID
- **command_prefix**: Command prefix character
- **available_commands**: JSON array of available commands
- **allowed_users**: JSON array of allowed user IDs
- **allowed_channels**: JSON array of allowed channel IDs
- **is_active**: Active status
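The `command_prefix`, `allowed_users`, and `allowed_channels` fields suggest a two-stage check: parse the chat message, then decide whether the sender may run it. The sketch below is one possible reading, not the module's implementation. It assumes that an empty allow-list means "no restriction", which the document does not confirm; `parse_command` and `is_allowed` are hypothetical helpers.

```python
import shlex


def parse_command(text, prefix="!"):
    """Split a chat message into (command, args); return None if it is not a command."""
    if not text.startswith(prefix):
        return None
    # shlex keeps quoted arguments such as "title" together as one token.
    parts = shlex.split(text[len(prefix):])
    if not parts:
        return None
    return parts[0], parts[1:]


def is_allowed(user_id, channel_id, allowed_users, allowed_channels):
    """Check the sender against both allow-lists (empty list = unrestricted, by assumption)."""
    user_ok = not allowed_users or user_id in allowed_users
    channel_ok = not allowed_channels or channel_id in allowed_channels
    return user_ok and channel_ok
```

A deployment that prefers deny-by-default would flip the empty-list rule so that an empty allow-list rejects everyone.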
### AutoRemediation
- **id**: UUID primary key
- **name**: Unique name
- **description**: Description
- **remediation_type**: Type of remediation action
- **trigger_conditions**: JSON trigger conditions
- **trigger_condition_type**: Type of trigger condition
- **remediation_config**: JSON remediation configuration
- **timeout_seconds**: Execution timeout
- **requires_approval**: Whether approval is required
- **approval_users**: Many-to-many relationship with users
- **max_executions_per_incident**: Maximum executions per incident
- **is_active**: Active status
- **execution_count**: Number of executions
- **success_count**: Number of successful executions
### MaintenanceWindow
- **id**: UUID primary key
- **name**: Name of the maintenance window
- **description**: Description
- **start_time**: Start datetime
- **end_time**: End datetime
- **timezone**: Timezone
- **affected_services**: JSON array of affected services
- **affected_components**: JSON array of affected components
- **suppress_incident_creation**: Whether to suppress incident creation
- **suppress_notifications**: Whether to suppress notifications
- **suppress_escalations**: Whether to suppress escalations
- **status**: Current status
- **incidents_suppressed**: Count of suppressed incidents
- **notifications_suppressed**: Count of suppressed notifications
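How `start_time`, `end_time`, `affected_services`, and `suppress_incident_creation` might interact can be sketched as follows. The phase labels, the exclusive end bound, and the dictionary layout are assumptions made for illustration; the module's own status values and matching rules are not shown here.

```python
from datetime import datetime, timedelta, timezone


def window_phase(start_time, end_time, now):
    """Classify `now` relative to a window; the end bound is treated as exclusive."""
    if now < start_time:
        return "upcoming"
    if now < end_time:
        return "active"
    return "past"


def suppresses_incident(window, service, now):
    """Return True when an incident for `service` falls inside an active, suppressing window."""
    return (
        window["suppress_incident_creation"]
        and service in window["affected_services"]
        and window_phase(window["start_time"], window["end_time"], now) == "active"
    )
```

Comparing timezone-aware datetimes, as in this sketch, avoids the off-by-hours errors that naive local times tend to cause around maintenance schedules.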
### WorkflowTemplate
- **id**: UUID primary key
- **name**: Unique name
- **description**: Description
- **template_type**: Type of workflow template
- **workflow_steps**: JSON array of workflow steps
- **input_parameters**: JSON array of input parameters
- **output_schema**: JSON output schema
- **usage_count**: Number of times used
- **is_public**: Public/private visibility
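Because `input_parameters` is a JSON array of parameter descriptions, a caller can check its inputs before starting a workflow. The sketch below assumes each entry has a `name` and an optional boolean `required` flag; `missing_parameters` is a hypothetical helper, not part of the module.

```python
def missing_parameters(input_parameters, supplied):
    """Return the names of required parameters that are absent from `supplied`."""
    return [
        param["name"]
        for param in input_parameters
        if param.get("required") and param["name"] not in supplied
    ]
```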
## Security Features
### Access Control
- **User Permissions**: Role-based access control
- **Data Classification**: Integration with security module
- **Audit Logging**: Comprehensive audit trails
- **API Authentication**: Token and session authentication
### ChatOps Security
- **User Whitelisting**: Restrict commands to specific users
- **Channel Restrictions**: Limit commands to specific channels
- **Command Validation**: Validate command parameters
- **Execution Logging**: Log all command executions
### Auto-Remediation Safety
- **Approval Workflows**: Require manual approval for sensitive actions
- **Execution Limits**: Limit executions per incident
- **Timeout Controls**: Prevent runaway executions
- **Rollback Capabilities**: Support for rollback operations
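The safety controls above can be read as a short decision procedure evaluated before each remediation attempt. This is a sketch of that reading only. The returned labels (`SKIP`, `AWAIT_APPROVAL`, `EXECUTE`) and the function name are hypothetical; the field names come from the AutoRemediation data model.

```python
def next_action(is_active, requires_approval, executions_for_incident,
                max_executions_per_incident, approved=False):
    """Decide what to do with a triggered remediation (labels are illustrative)."""
    if not is_active:
        return "SKIP"
    if executions_for_incident >= max_executions_per_incident:
        # Execution limit reached for this incident.
        return "SKIP"
    if requires_approval and not approved:
        return "AWAIT_APPROVAL"
    return "EXECUTE"
```

Checking the execution limit before the approval flag means an approver is never asked to sign off on a remediation that could not run anyway.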
## Integration with Other Modules
### Incident Intelligence Integration
- **Automatic Triggering**: Trigger runbooks based on incident characteristics
- **AI Suggestions**: AI-driven runbook recommendations
- **Correlation**: Link automation actions to incident patterns
- **Maintenance Suppression**: Suppress incidents during maintenance windows
### Security Module Integration
- **Access Control**: Use security module for authentication and authorization
- **Data Classification**: Apply data classification to automation data
- **Audit Integration**: Integrate with security audit trails
- **MFA Support**: Support multi-factor authentication for sensitive operations
## Best Practices
### Runbook Design
1. **Clear Steps**: Define clear, atomic steps
2. **Error Handling**: Include error handling and rollback procedures
3. **Timeout Management**: Set appropriate timeouts for each step
4. **Documentation**: Provide clear documentation for each step
5. **Testing**: Test runbooks in non-production environments
### Integration Management
1. **Health Monitoring**: Regularly monitor integration health
2. **Credential Management**: Securely store and rotate credentials
3. **Rate Limiting**: Implement appropriate rate limiting
4. **Error Handling**: Handle integration failures gracefully
5. **Monitoring**: Monitor integration usage and performance
### Auto-Remediation
1. **Conservative Approach**: Start with low-risk remediations
2. **Approval Workflows**: Use approval workflows for high-risk actions
3. **Monitoring**: Monitor remediation success rates
4. **Documentation**: Document all remediation actions
5. **Testing**: Test remediations in controlled environments
### Maintenance Windows
1. **Communication**: Communicate maintenance windows to stakeholders
2. **Scope Definition**: Clearly define affected services and components
3. **Rollback Plans**: Have rollback plans for maintenance activities
4. **Monitoring**: Monitor system health during maintenance
5. **Documentation**: Document maintenance activities and outcomes
## Error Handling
### Common Error Scenarios
1. **Integration Failures**: Handle external system unavailability
2. **Authentication Errors**: Handle credential expiration
3. **Timeout Errors**: Handle execution timeouts
4. **Permission Errors**: Handle insufficient permissions
5. **Data Validation Errors**: Handle invalid input data
### Error Response Format
```json
{
"error": "Error message",
"code": "ERROR_CODE",
"details": {
"field": "specific field error"
},
"timestamp": "2024-01-15T10:00:00Z"
}
```
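A client can map this error format onto a small typed structure. The sketch below assumes `error` and `code` are always present and treats `details` and `timestamp` as optional; `ApiError` and `parse_error` are hypothetical names.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ApiError:
    message: str
    code: str
    details: dict = field(default_factory=dict)
    timestamp: str = ""


def parse_error(body):
    """Parse a JSON error body into an ApiError."""
    payload = json.loads(body)
    return ApiError(
        message=payload["error"],
        code=payload["code"],
        details=payload.get("details", {}),
        timestamp=payload.get("timestamp", ""),
    )
```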
## Rate Limiting
### Default Limits
- **API Requests**: 1000 requests per hour per user
- **Runbook Executions**: 10 executions per hour per user
- **Integration Calls**: 100 calls per hour per integration
- **ChatOps Commands**: 50 commands per hour per user
### Custom Limits
- Configure custom rate limits per user role
- Set different limits for different integration types
- Implement burst allowances for emergency situations
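To make the hourly limits concrete, here is a minimal fixed-window counter. It illustrates the idea of "N requests per hour per key" only; the document does not say which algorithm the module uses, and `FixedWindowLimiter` is a hypothetical class.

```python
class FixedWindowLimiter:
    """Allow at most `limit` events per key within each fixed window."""

    def __init__(self, limit, window_seconds=3600):
        self.limit = limit
        self.window_seconds = window_seconds
        self._windows = {}  # key -> (window_start, count)

    def allow(self, key, now):
        """Record an event at time `now` (seconds) and report whether it is allowed."""
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self.window_seconds:
            # The previous window has expired; start a fresh one.
            start, count = now, 0
        if count >= self.limit:
            self._windows[key] = (start, count)
            return False
        self._windows[key] = (start, count + 1)
        return True
```

With `FixedWindowLimiter(limit=10)` keyed by user ID, this would mirror the default runbook execution limit listed above. A burst allowance could be modeled by temporarily raising `limit` for selected keys.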
## Monitoring and Alerting
### Key Metrics
- **Runbook Success Rate**: Track runbook execution success
- **Integration Health**: Monitor integration availability
- **Auto-Remediation Effectiveness**: Track remediation success
- **ChatOps Usage**: Monitor ChatOps command usage
- **Maintenance Window Impact**: Track maintenance window effectiveness
### Alerting
- **Integration Failures**: Alert on integration health issues
- **Runbook Failures**: Alert on runbook execution failures
- **Auto-Remediation Issues**: Alert on remediation failures
- **Rate Limit Exceeded**: Alert on rate limit violations
- **Security Issues**: Alert on security-related events
## Troubleshooting
### Common Issues
1. **Runbook Execution Failures**: Check step configurations and permissions
2. **Integration Connection Issues**: Verify credentials and network connectivity
3. **ChatOps Command Failures**: Check user permissions and command syntax
4. **Auto-Remediation Not Triggering**: Verify trigger conditions and permissions
5. **Maintenance Window Not Working**: Check timezone and schedule configuration
### Debug Information
- Enable debug logging for detailed execution information
- Use execution logs to trace runbook and workflow execution
- Check integration health status and error messages
- Review audit logs for security and access issues
- Monitor system metrics for performance issues
## Future Enhancements
### Planned Features
1. **Visual Workflow Builder**: Drag-and-drop workflow creation
2. **Advanced AI Integration**: Enhanced AI-driven automation suggestions
3. **Multi-Cloud Support**: Support for multiple cloud providers
4. **Advanced Analytics**: Enhanced reporting and analytics capabilities
5. **Mobile Support**: Mobile app for automation management
### Integration Roadmap
1. **Additional ITSM Tools**: ServiceNow, Remedy, etc.
2. **Cloud Platforms**: AWS, Azure, GCP integrations
3. **Monitoring Tools**: Prometheus, Grafana, DataDog
4. **Communication Platforms**: Additional chat platforms
5. **Development Tools**: GitLab, Bitbucket, CircleCI

@@ -0,0 +1,149 @@
"""
Admin configuration for automation_orchestration app
"""
from django.contrib import admin

from .models import (
    Runbook,
    RunbookExecution,
    Integration,
    ChatOpsIntegration,
    ChatOpsCommand,
    AutoRemediation,
    AutoRemediationExecution,
    MaintenanceWindow,
    WorkflowTemplate,
    WorkflowExecution,
)


@admin.register(Runbook)
class RunbookAdmin(admin.ModelAdmin):
    list_display = [
        'name', 'version', 'trigger_type', 'status', 'category',
        'execution_count', 'success_rate', 'is_public', 'created_by', 'created_at'
    ]
    list_filter = ['status', 'trigger_type', 'category', 'is_public', 'created_at']
    search_fields = ['name', 'description', 'category']
    readonly_fields = ['id', 'execution_count', 'success_rate', 'created_at', 'updated_at', 'last_executed_at']
    fieldsets = (
        ('Basic Information', {
            'fields': ('id', 'name', 'description', 'version', 'category', 'tags')
        }),
        ('Trigger Configuration', {
            'fields': ('trigger_type', 'trigger_conditions')
        }),
        ('Content', {
            'fields': ('steps', 'estimated_duration')
        }),
        ('Status & Permissions', {
            'fields': ('status', 'is_public')
        }),
        ('Metadata', {
            'fields': ('created_by', 'last_modified_by', 'created_at', 'updated_at', 'last_executed_at'),
            'classes': ('collapse',)
        }),
        ('Statistics', {
            'fields': ('execution_count', 'success_rate'),
            'classes': ('collapse',)
        }),
    )


@admin.register(RunbookExecution)
class RunbookExecutionAdmin(admin.ModelAdmin):
    list_display = [
        'runbook', 'triggered_by', 'trigger_type', 'status',
        'current_step', 'total_steps', 'started_at', 'duration'
    ]
    list_filter = ['status', 'trigger_type', 'started_at']
    search_fields = ['runbook__name', 'triggered_by__username', 'incident__title']
    readonly_fields = ['id', 'started_at', 'completed_at', 'duration']


@admin.register(Integration)
class IntegrationAdmin(admin.ModelAdmin):
    list_display = [
        'name', 'integration_type', 'status', 'health_status',
        'request_count', 'last_used_at', 'created_by'
    ]
    list_filter = ['integration_type', 'status', 'health_status', 'created_at']
    search_fields = ['name', 'description']
    readonly_fields = ['id', 'request_count', 'last_used_at', 'created_at', 'updated_at', 'last_health_check']


@admin.register(ChatOpsIntegration)
class ChatOpsIntegrationAdmin(admin.ModelAdmin):
    list_display = [
        'name', 'platform', 'is_active', 'last_activity', 'created_by'
    ]
    list_filter = ['platform', 'is_active', 'created_at']
    search_fields = ['name']
    readonly_fields = ['id', 'last_activity', 'created_at', 'updated_at']


@admin.register(ChatOpsCommand)
class ChatOpsCommandAdmin(admin.ModelAdmin):
    list_display = [
        'command', 'chatops_integration', 'user_id', 'status',
        'executed_at', 'completed_at'
    ]
    list_filter = ['status', 'chatops_integration__platform', 'executed_at']
    search_fields = ['command', 'user_id', 'channel_id']
    readonly_fields = ['id', 'executed_at', 'completed_at']


@admin.register(AutoRemediation)
class AutoRemediationAdmin(admin.ModelAdmin):
    list_display = [
        'name', 'remediation_type', 'trigger_condition_type',
        'is_active', 'requires_approval', 'execution_count', 'success_count'
    ]
    list_filter = ['remediation_type', 'trigger_condition_type', 'is_active', 'requires_approval']
    search_fields = ['name', 'description']
    readonly_fields = ['id', 'execution_count', 'success_count', 'last_executed_at', 'created_at', 'updated_at']


@admin.register(AutoRemediationExecution)
class AutoRemediationExecutionAdmin(admin.ModelAdmin):
    list_display = [
        'auto_remediation', 'incident', 'status', 'triggered_at',
        'approved_by', 'completed_at'
    ]
    list_filter = ['status', 'triggered_at', 'auto_remediation__remediation_type']
    search_fields = ['auto_remediation__name', 'incident__title', 'approved_by__username']
    readonly_fields = ['id', 'triggered_at', 'started_at', 'completed_at', 'duration']


@admin.register(MaintenanceWindow)
class MaintenanceWindowAdmin(admin.ModelAdmin):
    list_display = [
        'name', 'start_time', 'end_time', 'status',
        'incidents_suppressed', 'notifications_suppressed', 'created_by'
    ]
    list_filter = ['status', 'start_time', 'end_time']
    search_fields = ['name', 'description']
    readonly_fields = ['id', 'incidents_suppressed', 'notifications_suppressed', 'created_at', 'updated_at']


@admin.register(WorkflowTemplate)
class WorkflowTemplateAdmin(admin.ModelAdmin):
    list_display = [
        'name', 'template_type', 'usage_count', 'is_public', 'created_by'
    ]
    list_filter = ['template_type', 'is_public', 'created_at']
    search_fields = ['name', 'description']
    readonly_fields = ['id', 'usage_count', 'created_at', 'updated_at']


@admin.register(WorkflowExecution)
class WorkflowExecutionAdmin(admin.ModelAdmin):
    list_display = [
        'name', 'workflow_template', 'triggered_by', 'status',
        'current_step', 'total_steps', 'started_at', 'duration'
    ]
    list_filter = ['status', 'trigger_type', 'started_at']
    search_fields = ['name', 'workflow_template__name', 'triggered_by__username']
    readonly_fields = ['id', 'started_at', 'completed_at', 'duration']

@@ -0,0 +1,11 @@
from django.apps import AppConfig


class AutomationOrchestrationConfig(AppConfig):
    default_auto_field = 'django.db.models.BigAutoField'
    name = 'automation_orchestration'
    verbose_name = 'Automation & Orchestration'

    def ready(self):
        """Import signal handlers when the app is ready"""
        import automation_orchestration.signals

@@ -0,0 +1,433 @@
"""
Management command to set up automation & orchestration module
"""
from datetime import timedelta

from django.contrib.auth import get_user_model
from django.core.management.base import BaseCommand
from django.utils import timezone

from automation_orchestration.models import (
    Runbook,
    Integration,
    ChatOpsIntegration,
    AutoRemediation,
    MaintenanceWindow,
    WorkflowTemplate,
)

User = get_user_model()


class Command(BaseCommand):
    help = 'Set up automation & orchestration module with sample data'

    def add_arguments(self, parser):
        parser.add_argument(
            '--reset',
            action='store_true',
            help='Reset existing data before creating new data',
        )

    def handle(self, *args, **options):
        if options['reset']:
            self.stdout.write('Resetting existing automation data...')
            self.reset_data()

        self.stdout.write('Setting up automation & orchestration module...')

        # Create sample runbooks
        self.create_sample_runbooks()

        # Create sample integrations
        self.create_sample_integrations()

        # Create sample ChatOps integrations
        self.create_sample_chatops_integrations()

        # Create sample auto-remediations
        self.create_sample_auto_remediations()

        # Create sample maintenance windows
        self.create_sample_maintenance_windows()

        # Create sample workflow templates
        self.create_sample_workflow_templates()

        self.stdout.write(
            self.style.SUCCESS('Successfully set up automation & orchestration module!')
        )

    def reset_data(self):
        """Reset existing automation data"""
        Runbook.objects.all().delete()
        Integration.objects.all().delete()
        ChatOpsIntegration.objects.all().delete()
        AutoRemediation.objects.all().delete()
        MaintenanceWindow.objects.all().delete()
        WorkflowTemplate.objects.all().delete()

    def create_sample_runbooks(self):
        """Create sample runbooks"""
        self.stdout.write('Creating sample runbooks...')

        # Get or create a superuser for sample data
        admin_user = User.objects.filter(is_superuser=True).first()
        if not admin_user:
            # NOTE: placeholder credentials for local sample data only;
            # change them before using this on any shared environment.
            admin_user = User.objects.create_superuser(
                username='admin',
                email='admin@example.com',
                password='admin123'
            )

        # Sample runbook 1: Database restart
        runbook1, created1 = Runbook.objects.get_or_create(
            name='Database Service Restart',
            defaults={
                'description': 'Automated runbook for restarting database services',
                'version': '1.0',
                'trigger_type': 'AUTOMATIC',
                'trigger_conditions': {
                    'severity': ['CRITICAL', 'EMERGENCY'],
                    'category': 'database',
                    'keywords': ['database', 'connection', 'timeout']
                },
                'steps': [
                    {
                        'name': 'Check database status',
                        'action': 'check_service_status',
                        'timeout': 30,
                        'parameters': {'service': 'postgresql'}
                    },
                    {
                        'name': 'Stop database service',
                        'action': 'stop_service',
                        'timeout': 60,
                        'parameters': {'service': 'postgresql'}
                    },
                    {
                        'name': 'Start database service',
                        'action': 'start_service',
                        'timeout': 120,
                        'parameters': {'service': 'postgresql'}
                    },
                    {
                        'name': 'Verify database connectivity',
                        'action': 'verify_connectivity',
                        'timeout': 30,
                        'parameters': {'host': 'localhost', 'port': 5432}
                    }
                ],
                'estimated_duration': timedelta(minutes=5),
                'category': 'database',
                'tags': ['database', 'restart', 'automation'],
                'status': 'ACTIVE',
                'is_public': True,
                'created_by': admin_user
            }
        )

        # Sample runbook 2: Web server scaling
        runbook2, created2 = Runbook.objects.get_or_create(
            name='Web Server Scale Up',
            defaults={
                'description': 'Automated runbook for scaling up web servers',
                'version': '1.0',
                'trigger_type': 'AUTOMATIC',
                'trigger_conditions': {
                    'severity': ['HIGH', 'CRITICAL'],
                    'category': 'performance',
                    'metrics': {'cpu_usage': '>80', 'response_time': '>2000'}
                },
                'steps': [
                    {
                        'name': 'Check current load',
                        'action': 'check_metrics',
                        'timeout': 30,
                        'parameters': {'metrics': ['cpu', 'memory', 'response_time']}
                    },
                    {
                        'name': 'Scale up instances',
                        'action': 'scale_instances',
                        'timeout': 300,
                        'parameters': {'count': 2, 'instance_type': 'web'}
                    },
                    {
                        'name': 'Update load balancer',
                        'action': 'update_load_balancer',
                        'timeout': 60,
                        'parameters': {'new_instances': True}
                    },
                    {
                        'name': 'Verify scaling',
                        'action': 'verify_scaling',
                        'timeout': 120,
                        'parameters': {'expected_instances': '+2'}
                    }
                ],
                'estimated_duration': timedelta(minutes=10),
                'category': 'scaling',
                'tags': ['scaling', 'performance', 'web'],
                'status': 'ACTIVE',
                'is_public': True,
                'created_by': admin_user
            }
        )

        # Report each runbook separately; a single shared `created` flag
        # would only reflect the last get_or_create() call.
        for runbook, was_created in ((runbook1, created1), (runbook2, created2)):
            if was_created:
                self.stdout.write(f' Created runbook: {runbook.name}')
            else:
                self.stdout.write(f' Runbook already exists: {runbook.name}')

    def create_sample_integrations(self):
        """Create sample integrations"""
        self.stdout.write('Creating sample integrations...')

        admin_user = User.objects.filter(is_superuser=True).first()

        # Jira integration
        jira_integration, jira_created = Integration.objects.get_or_create(
            name='Jira Production',
            defaults={
                'integration_type': 'JIRA',
                'description': 'Jira integration for production environment',
                'configuration': {
                    'base_url': 'https://company.atlassian.net',
                    'project_key': 'PROD',
                    'issue_type': 'Bug'
                },
                'authentication_config': {
                    'auth_type': 'basic',
                    'username': 'jira_user',
                    'api_token': 'encrypted_token_here'
                },
                'status': 'ACTIVE',
                'health_status': 'HEALTHY',
                'created_by': admin_user
            }
        )

        # GitHub integration
        github_integration, github_created = Integration.objects.get_or_create(
            name='GitHub Main Repository',
            defaults={
                'integration_type': 'GITHUB',
                'description': 'GitHub integration for main repository',
                'configuration': {
                    'repository': 'company/main-repo',
                    'branch': 'main',
                    'webhook_secret': 'webhook_secret_here'
                },
                'authentication_config': {
                    'auth_type': 'token',
                    'access_token': 'encrypted_token_here'
                },
                'status': 'ACTIVE',
                'health_status': 'HEALTHY',
                'created_by': admin_user
            }
        )

        # Report each integration separately; a single shared `created` flag
        # would only reflect the last get_or_create() call.
        for integration, was_created in (
            (jira_integration, jira_created),
            (github_integration, github_created),
        ):
            if was_created:
                self.stdout.write(f' Created integration: {integration.name}')
            else:
                self.stdout.write(f' Integration already exists: {integration.name}')

    def create_sample_chatops_integrations(self):
        """Create sample ChatOps integrations"""
        self.stdout.write('Creating sample ChatOps integrations...')

        admin_user = User.objects.filter(is_superuser=True).first()

        # Slack integration
        slack_integration, created = ChatOpsIntegration.objects.get_or_create(
            name='Production Slack',
            defaults={
                'platform': 'SLACK',
                'webhook_url': 'https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX',
                'bot_token': 'xoxb-0000000000000-0000000000000-XXXXXXXXXXXXXXXXXXXXXXXX',
                'channel_id': 'C0000000000',
                'command_prefix': '!',
                'available_commands': [
                    {
                        'name': 'incident',
                        'description': 'Create or manage incidents',
                        'usage': '!incident create "title" "description"'
                    },
                    {
                        'name': 'status',
                        'description': 'Check system status',
                        'usage': '!status [service]'
                    },
                    {
                        'name': 'runbook',
                        'description': 'Execute runbooks',
                        'usage': '!runbook execute <runbook_name>'
                    }
                ],
                'allowed_users': ['U0000000000', 'U0000000001'],
                'allowed_channels': ['C0000000000', 'C0000000001'],
                'is_active': True,
                'created_by': admin_user
            }
        )

        if created:
            self.stdout.write(f' Created ChatOps integration: {slack_integration.name}')
        else:
            self.stdout.write(' Sample ChatOps integrations already exist')

    def create_sample_auto_remediations(self):
        """Create sample auto-remediations"""
        self.stdout.write('Creating sample auto-remediations...')

        admin_user = User.objects.filter(is_superuser=True).first()

        # Service restart remediation
        service_restart, restart_created = AutoRemediation.objects.get_or_create(
            name='Auto Restart Database Service',
            defaults={
                'description': 'Automatically restart database service when connection issues are detected',
                'remediation_type': 'SERVICE_RESTART',
                'trigger_conditions': {
                    'severity': ['CRITICAL', 'EMERGENCY'],
                    'category': 'database',
                    'error_patterns': ['connection timeout', 'connection refused', 'database unavailable']
                },
                'trigger_condition_type': 'CATEGORY',
                'remediation_config': {
                    'service_name': 'postgresql',
                    'restart_command': 'systemctl restart postgresql',
                    'verify_command': 'systemctl is-active postgresql',
                    'max_restart_attempts': 3
                },
                'timeout_seconds': 300,
                'requires_approval': False,
                'max_executions_per_incident': 1,
                'is_active': True,
                'created_by': admin_user
            }
        )

        # Deployment rollback remediation
        rollback_remediation, rollback_created = AutoRemediation.objects.get_or_create(
            name='Auto Rollback Failed Deployment',
            defaults={
                'description': 'Automatically rollback deployment when critical errors are detected',
                'remediation_type': 'DEPLOYMENT_ROLLBACK',
                'trigger_conditions': {
                    'severity': ['CRITICAL', 'EMERGENCY'],
                    'category': 'deployment',
                    'error_rate_threshold': 0.1,
                    'time_window_minutes': 5
                },
                'trigger_condition_type': 'SEVERITY',
                'remediation_config': {
                    'rollback_to_version': 'previous',
                    'rollback_command': 'kubectl rollout undo deployment/web-app',
                    'verify_command': 'kubectl get pods -l app=web-app',
                    'notification_channels': ['slack', 'email']
                },
                'timeout_seconds': 600,
                'requires_approval': True,
                'max_executions_per_incident': 1,
                'is_active': True,
                'created_by': admin_user
            }
        )

        # Report each remediation separately; a single shared `created` flag
        # would only reflect the last get_or_create() call.
        for remediation, was_created in (
            (service_restart, restart_created),
            (rollback_remediation, rollback_created),
        ):
            if was_created:
                self.stdout.write(f' Created auto-remediation: {remediation.name}')
            else:
                self.stdout.write(f' Auto-remediation already exists: {remediation.name}')

    def create_sample_maintenance_windows(self):
        """Create sample maintenance windows"""
        self.stdout.write('Creating sample maintenance windows...')

        admin_user = User.objects.filter(is_superuser=True).first()

        # Compute the start once so the window is exactly two hours long.
        start_time = timezone.now() + timedelta(days=1)

        # Weekly maintenance window
        weekly_maintenance, created = MaintenanceWindow.objects.get_or_create(
            name='Weekly System Maintenance',
            defaults={
                'description': 'Weekly maintenance window for system updates and patches',
                'start_time': start_time,
                'end_time': start_time + timedelta(hours=2),
                'timezone': 'UTC',
                'affected_services': ['web-app', 'api-service', 'database'],
                'affected_components': ['load-balancer', 'cache', 'monitoring'],
                'suppress_incident_creation': True,
                'suppress_notifications': True,
                'suppress_escalations': True,
                'status': 'SCHEDULED',
                'created_by': admin_user
            }
        )

        if created:
            self.stdout.write(f' Created maintenance window: {weekly_maintenance.name}')
        else:
            self.stdout.write(' Sample maintenance windows already exist')


    def create_sample_workflow_templates(self):
        """Create sample workflow templates"""
        self.stdout.write('Creating sample workflow templates...')

        admin_user = User.objects.filter(is_superuser=True).first()

        # Incident response workflow
        incident_workflow, created = WorkflowTemplate.objects.get_or_create(
            name='Standard Incident Response',
            defaults={
                'description': 'Standard workflow for incident response and resolution',
                'template_type': 'INCIDENT_RESPONSE',
                'workflow_steps': [
                    {
                        'name': 'Initial Assessment',
                        'action': 'assess_incident',
                        'conditions': {'severity': ['HIGH', 'CRITICAL', 'EMERGENCY']},
                        'timeout': 300
                    },
                    {
                        'name': 'Notify Stakeholders',
                        'action': 'notify_stakeholders',
                        'conditions': {'auto_notify': True},
                        'timeout': 60
                    },
                    {
                        'name': 'Execute Runbook',
                        'action': 'execute_runbook',
                        'conditions': {'has_runbook': True},
                        'timeout': 1800
                    },
                    {
                        'name': 'Update Status',
                        'action': 'update_incident_status',
                        'conditions': {'always': True},
                        'timeout': 30
                    }
                ],
                'input_parameters': [
                    {'name': 'incident_id', 'type': 'string', 'required': True},
                    {'name': 'severity', 'type': 'string', 'required': True},
                    {'name': 'category', 'type': 'string', 'required': False}
                ],
                'output_schema': {
                    'type': 'object',
                    'properties': {
                        'status': {'type': 'string'},
                        'resolution_time': {'type': 'string'},
                        'actions_taken': {'type': 'array'}
                    }
                },
                'is_public': True,
                'created_by': admin_user
            }
        )

        if created:
            self.stdout.write(f' Created workflow template: {incident_workflow.name}')
        else:
            self.stdout.write(' Sample workflow templates already exist')
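
Both seed methods above use `get_or_create`, which is what makes the command safe to re-run: the lookup uses only the `name` keyword, and the `defaults` dictionary is applied only when a new row is created. The sketch below mimics that contract with an in-memory dictionary. `FakeManager` is a hypothetical stand-in written for illustration; it is not Django's implementation and is not part of this commit.

```python
from typing import Any, Dict, Optional, Tuple


class FakeManager:
    """Hypothetical in-memory stand-in for a Django manager (illustration only)."""

    def __init__(self) -> None:
        self._rows: Dict[str, Dict[str, Any]] = {}

    def get_or_create(
        self, name: str, defaults: Optional[Dict[str, Any]] = None
    ) -> Tuple[Dict[str, Any], bool]:
        # Look up by the identifying field only; defaults are not part of the lookup.
        if name in self._rows:
            return self._rows[name], False
        # Defaults are applied only when a new row is created.
        row = {"name": name, **(defaults or {})}
        self._rows[name] = row
        return row, True


manager = FakeManager()

# First call: nothing stored yet, so the row is created from name + defaults.
first, created_first = manager.get_or_create(
    "Standard Incident Response", defaults={"template_type": "INCIDENT_RESPONSE"}
)

# Second call: the row exists, so the new defaults are ignored.
second, created_second = manager.get_or_create(
    "Standard Incident Response", defaults={"template_type": "CUSTOM"}
)
```

One consequence of this contract is that editing the `defaults` in the command and re-running it will not update rows that already exist; they would need to be updated or deleted separately.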

@@ -0,0 +1,349 @@
# Generated by Django 5.2.6 on 2025-09-18 15:29

import django.core.validators
import django.db.models.deletion
import uuid
from django.conf import settings
from django.db import migrations, models


class Migration(migrations.Migration):

    initial = True

    dependencies = [
        ('incident_intelligence', '0003_incident_auto_remediation_attempted_and_more'),
        migrations.swappable_dependency(settings.AUTH_USER_MODEL),
    ]

    operations = [
        migrations.CreateModel(
            name='AutoRemediation',
            fields=[
                ('id', models.UUIDField(default=uuid.uuid4, editable=False, primary_key=True, serialize=False)),
                ('name', models.CharField(max_length=200, unique=True)),
                ('description', models.TextField()),
                ('remediation_type', models.CharField(choices=[('SERVICE_RESTART', 'Service Restart'), ('DEPLOYMENT_ROLLBACK', 'Deployment Rollback'), ('SCALE_UP', 'Scale Up Resources'), ('SCALE_DOWN', 'Scale Down Resources'), ('CACHE_CLEAR', 'Clear Cache'), ('CONFIG_UPDATE', 'Configuration Update'), ('CUSTOM_SCRIPT', 'Custom Script'), ('WEBHOOK', 'Webhook Call')], max_length=30)),
                ('trigger_conditions', models.JSONField(default=dict, help_text='Conditions that trigger this remediation')),
                ('trigger_condition_type', models.CharField(choices=[('SEVERITY', 'Incident Severity'), ('CATEGORY', 'Incident Category'), ('SERVICE', 'Affected Service'), ('DURATION', 'Incident Duration'), ('PATTERN', 'Pattern Match')], max_length=20)),
                ('remediation_config', models.JSONField(default=dict, help_text='Configuration for the remediation action')),
                ('timeout_seconds', models.PositiveIntegerField(default=300, help_text='Timeout for remediation action')),
                ('requires_approval', models.BooleanField(default=False, help_text='Whether manual approval is required')),
                ('max_executions_per_incident', models.PositiveIntegerField(default=1, help_text='Max times this can run per incident')),
                ('is_active', models.BooleanField(default=True)),
                ('created_at', models.DateTimeField(auto_now_add=True)),
                ('updated_at', models.DateTimeField(auto_now=True)),
                ('execution_count', models.PositiveIntegerField(default=0)),
                ('success_count', models.PositiveIntegerField(default=0)),
                ('last_executed_at', models.DateTimeField(blank=True, null=True)),
                ('approval_users', models.ManyToManyField(blank=True, help_text='Users who can approve this remediation', related_name='approvable_remediations', to=settings.AUTH_USER_MODEL)),
                ('created_by', models.ForeignKey(null=True, on_delete=django.db.models.deletion.SET_NULL, related_name='created_auto_remediations', to=settings.AUTH_USER_MODEL)),
            ],
            options={
                'ordering': ['name'],
            },
        ),
        migrations.CreateModel(
            name='AutoRemediationExecution',
            fields=[
                ('id', models.UUIDField(default=uuid.uuid4, editable=False, primary_key=True, serialize=False)),
                ('status', models.CharField(choices=[('PENDING', 'Pending'), ('APPROVED', 'Approved'), ('EXECUTING', 'Executing'), ('COMPLETED', 'Completed'), ('FAILED', 'Failed'), ('CANCELLED', 'Cancelled'), ('TIMEOUT', 'Timeout'), ('REJECTED', 'Rejected')], default='PENDING', max_length=20)),
                ('trigger_data', models.JSONField(default=dict, help_text='Data that triggered the remediation')),
                ('approved_at', models.DateTimeField(blank=True, null=True)),
                ('approval_notes', models.TextField(blank=True, null=True)),
                ('execution_log', models.JSONField(default=list, help_text='Detailed execution log')),
                ('output_data', models.JSONField(default=dict, help_text='Output data from remediation')),
                ('error_message', models.TextField(blank=True, null=True)),
                ('triggered_at', models.DateTimeField(auto_now_add=True)),
                ('started_at', models.DateTimeField(blank=True, null=True)),
                ('completed_at', models.DateTimeField(blank=True, null=True)),
                ('duration', models.DurationField(blank=True, null=True)),
                ('approved_by', models.ForeignKey(blank=True, null=True, on_delete=django.db.models.deletion.SET_NULL, related_name='approved_remediations', to=settings.AUTH_USER_MODEL)),
                ('auto_remediation', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, related_name='executions', to='automation_orchestration.autoremediation')),
                ('incident', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, related_name='auto_remediations', to='incident_intelligence.incident')),
            ],
            options={
                'ordering': ['-triggered_at'],
            },
        ),
        migrations.CreateModel(
            name='ChatOpsIntegration',
            fields=[
                ('id', models.UUIDField(default=uuid.uuid4, editable=False, primary_key=True, serialize=False)),
                ('name', models.CharField(max_length=200, unique=True)),
                ('platform', models.CharField(choices=[('SLACK', 'Slack'), ('TEAMS', 'Microsoft Teams'), ('DISCORD', 'Discord'), ('MATTERMOST', 'Mattermost')], max_length=20)),
                ('webhook_url', models.URLField(help_text='Webhook URL for the chat platform')),
                ('bot_token', models.CharField(help_text='Bot authentication token', max_length=500)),
                ('channel_id', models.CharField(help_text='Default channel ID', max_length=100)),
                ('command_prefix', models.CharField(default='!', help_text='Command prefix (e.g., !, /)', max_length=10)),
                ('available_commands', models.JSONField(default=list, help_text='List of available commands and their descriptions')),
                ('allowed_users', models.JSONField(default=list, help_text='List of user IDs allowed to use commands')),
                ('allowed_channels', models.JSONField(default=list, help_text='List of channel IDs where commands are allowed')),
                ('is_active', models.BooleanField(default=True)),
                ('last_activity', models.DateTimeField(blank=True, null=True)),
                ('created_at', models.DateTimeField(auto_now_add=True)),
                ('updated_at', models.DateTimeField(auto_now=True)),
                ('created_by', models.ForeignKey(null=True, on_delete=django.db.models.deletion.SET_NULL, to=settings.AUTH_USER_MODEL)),
            ],
            options={
                'ordering': ['name'],
            },
        ),
        migrations.CreateModel(
            name='Integration',
            fields=[
                ('id', models.UUIDField(default=uuid.uuid4, editable=False, primary_key=True, serialize=False)),
                ('name', models.CharField(max_length=200, unique=True)),
                ('integration_type', models.CharField(choices=[('JIRA', 'Jira'), ('GITHUB', 'GitHub'), ('JENKINS', 'Jenkins'), ('SERVICENOW', 'ServiceNow'), ('ANSIBLE', 'Ansible'), ('TERRAFORM', 'Terraform'), ('SLACK', 'Slack'), ('TEAMS', 'Microsoft Teams'), ('WEBHOOK', 'Generic Webhook'), ('API', 'Generic API')], max_length=20)),
                ('description', models.TextField(blank=True, null=True)),
                ('configuration', models.JSONField(default=dict, help_text='Integration-specific configuration (API keys, URLs, etc.)')),
                ('authentication_config', models.JSONField(default=dict, help_text='Authentication configuration (OAuth, API keys, etc.)')),
                ('status', models.CharField(choices=[('ACTIVE', 'Active'), ('INACTIVE', 'Inactive'), ('ERROR', 'Error'), ('CONFIGURING', 'Configuring')], default='CONFIGURING', max_length=20)),
                ('last_health_check', models.DateTimeField(blank=True, null=True)),
                ('health_status', models.CharField(choices=[('HEALTHY', 'Healthy'), ('WARNING', 'Warning'), ('ERROR', 'Error'), ('UNKNOWN', 'Unknown')], default='UNKNOWN', max_length=20)),
                ('error_message', models.TextField(blank=True, null=True)),
                ('request_count', models.PositiveIntegerField(default=0)),
                ('last_used_at', models.DateTimeField(blank=True, null=True)),
                ('created_at', models.DateTimeField(auto_now_add=True)),
                ('updated_at', models.DateTimeField(auto_now=True)),
                ('created_by', models.ForeignKey(null=True, on_delete=django.db.models.deletion.SET_NULL, to=settings.AUTH_USER_MODEL)),
            ],
            options={
                'ordering': ['name'],
            },
        ),
        migrations.CreateModel(
            name='MaintenanceWindow',
            fields=[
                ('id', models.UUIDField(default=uuid.uuid4, editable=False, primary_key=True, serialize=False)),
                ('name', models.CharField(max_length=200)),
                ('description', models.TextField()),
                ('start_time', models.DateTimeField(help_text='When maintenance window starts')),
                ('end_time', models.DateTimeField(help_text='When maintenance window ends')),
                ('timezone', models.CharField(default='UTC', max_length=50)),
                ('affected_services', models.JSONField(default=list, help_text='List of services affected by this maintenance')),
                ('affected_components', models.JSONField(default=list, help_text='List of components affected by this maintenance')),
                ('suppress_incident_creation', models.BooleanField(default=True)),
                ('suppress_notifications', models.BooleanField(default=True)),
                ('suppress_escalations', models.BooleanField(default=True)),
                ('status', models.CharField(choices=[('SCHEDULED', 'Scheduled'), ('ACTIVE', 'Active'), ('COMPLETED', 'Completed'), ('CANCELLED', 'Cancelled')], default='SCHEDULED', max_length=20)),
                ('created_at', models.DateTimeField(auto_now_add=True)),
                ('updated_at', models.DateTimeField(auto_now=True)),
                ('incidents_suppressed', models.PositiveIntegerField(default=0)),
                ('notifications_suppressed', models.PositiveIntegerField(default=0)),
                ('created_by', models.ForeignKey(null=True, on_delete=django.db.models.deletion.SET_NULL, to=settings.AUTH_USER_MODEL)),
            ],
            options={
                'ordering': ['start_time'],
            },
        ),
        migrations.CreateModel(
            name='Runbook',
            fields=[
                ('id', models.UUIDField(default=uuid.uuid4, editable=False, primary_key=True, serialize=False)),
                ('name', models.CharField(max_length=200, unique=True)),
                ('description', models.TextField()),
                ('version', models.CharField(default='1.0', max_length=20)),
                ('trigger_type', models.CharField(choices=[('MANUAL', 'Manual Trigger'), ('AUTOMATIC', 'Automatic Trigger'), ('SCHEDULED', 'Scheduled Trigger'), ('WEBHOOK', 'Webhook Trigger'), ('CHATOPS', 'ChatOps Trigger')], default='MANUAL', max_length=20)),
                ('trigger_conditions', models.JSONField(default=dict, help_text='Conditions that trigger this runbook (incident severity, category, etc.)')),
                ('steps', models.JSONField(default=list, help_text='List of steps to execute in order')),
                ('estimated_duration', models.DurationField(help_text='Estimated time to complete')),
                ('category', models.CharField(blank=True, max_length=100, null=True)),
                ('tags', models.JSONField(default=list, help_text='Tags for categorization and search')),
                ('status', models.CharField(choices=[('DRAFT', 'Draft'), ('ACTIVE', 'Active'), ('INACTIVE', 'Inactive'), ('DEPRECATED', 'Deprecated')], default='DRAFT', max_length=20)),
                ('is_public', models.BooleanField(default=True, help_text='Whether this runbook is available to all users')),
                ('execution_count', models.PositiveIntegerField(default=0)),
                ('success_rate', models.FloatField(default=0.0, help_text='Success rate of runbook executions (0.0-1.0)', validators=[django.core.validators.MinValueValidator(0.0), django.core.validators.MaxValueValidator(1.0)])),
                ('created_at', models.DateTimeField(auto_now_add=True)),
                ('updated_at', models.DateTimeField(auto_now=True)),
                ('last_executed_at', models.DateTimeField(blank=True, null=True)),
                ('created_by', models.ForeignKey(null=True, on_delete=django.db.models.deletion.SET_NULL, related_name='created_runbooks', to=settings.AUTH_USER_MODEL)),
                ('last_modified_by', models.ForeignKey(null=True, on_delete=django.db.models.deletion.SET_NULL, related_name='modified_runbooks', to=settings.AUTH_USER_MODEL)),
            ],
            options={
                'ordering': ['name'],
            },
        ),
        migrations.CreateModel(
            name='ChatOpsCommand',
            fields=[
                ('id', models.UUIDField(default=uuid.uuid4, editable=False, primary_key=True, serialize=False)),
                ('command', models.CharField(help_text='The command that was executed', max_length=100)),
                ('arguments', models.JSONField(default=list, help_text='Command arguments')),
                ('user_id', models.CharField(help_text='User ID from chat platform', max_length=100)),
                ('channel_id', models.CharField(help_text='Channel ID where command was executed', max_length=100)),
                ('status', models.CharField(choices=[('PENDING', 'Pending'), ('EXECUTING', 'Executing'), ('COMPLETED', 'Completed'), ('FAILED', 'Failed'), ('CANCELLED', 'Cancelled')], default='PENDING', max_length=20)),
                ('response_message', models.TextField(blank=True, null=True)),
                ('execution_log', models.JSONField(default=list, help_text='Detailed execution log')),
                ('error_message', models.TextField(blank=True, null=True)),
                ('executed_at', models.DateTimeField(auto_now_add=True)),
                ('completed_at', models.DateTimeField(blank=True, null=True)),
                ('related_incident', models.ForeignKey(blank=True, null=True, on_delete=django.db.models.deletion.SET_NULL, related_name='chatops_commands', to='incident_intelligence.incident')),
                ('chatops_integration', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, related_name='commands', to='automation_orchestration.chatopsintegration')),
                ('triggered_runbook', models.ForeignKey(blank=True, null=True, on_delete=django.db.models.deletion.SET_NULL, related_name='chatops_triggers', to='automation_orchestration.runbook')),
            ],
            options={
                'ordering': ['-executed_at'],
            },
        ),
        migrations.CreateModel(
            name='RunbookExecution',
            fields=[
                ('id', models.UUIDField(default=uuid.uuid4, editable=False, primary_key=True, serialize=False)),
                ('trigger_type', models.CharField(choices=[('MANUAL', 'Manual Trigger'), ('AUTOMATIC', 'Automatic Trigger'), ('SCHEDULED', 'Scheduled Trigger'), ('WEBHOOK', 'Webhook Trigger'), ('CHATOPS', 'ChatOps Trigger')], max_length=20)),
                ('trigger_data', models.JSONField(default=dict, help_text='Data that triggered the execution')),
                ('status', models.CharField(choices=[('PENDING', 'Pending'), ('RUNNING', 'Running'), ('COMPLETED', 'Completed'), ('FAILED', 'Failed'), ('CANCELLED', 'Cancelled'), ('TIMEOUT', 'Timeout')], default='PENDING', max_length=20)),
                ('current_step', models.PositiveIntegerField(default=0)),
                ('total_steps', models.PositiveIntegerField()),
                ('execution_log', models.JSONField(default=list, help_text='Detailed execution log')),
                ('error_message', models.TextField(blank=True, null=True)),
                ('output_data', models.JSONField(default=dict, help_text='Output data from execution')),
                ('started_at', models.DateTimeField(auto_now_add=True)),
                ('completed_at', models.DateTimeField(blank=True, null=True)),
                ('duration', models.DurationField(blank=True, null=True)),
                ('incident', models.ForeignKey(blank=True, null=True, on_delete=django.db.models.deletion.SET_NULL, related_name='runbook_executions', to='incident_intelligence.incident')),
                ('runbook', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, related_name='executions', to='automation_orchestration.runbook')),
                ('triggered_by', models.ForeignKey(null=True, on_delete=django.db.models.deletion.SET_NULL, to=settings.AUTH_USER_MODEL)),
            ],
            options={
                'ordering': ['-started_at'],
            },
        ),
        migrations.CreateModel(
            name='WorkflowTemplate',
            fields=[
                ('id', models.UUIDField(default=uuid.uuid4, editable=False, primary_key=True, serialize=False)),
                ('name', models.CharField(max_length=200, unique=True)),
                ('description', models.TextField()),
                ('template_type', models.CharField(choices=[('INCIDENT_RESPONSE', 'Incident Response'), ('DEPLOYMENT', 'Deployment'), ('MAINTENANCE', 'Maintenance'), ('SCALING', 'Scaling'), ('MONITORING', 'Monitoring'), ('CUSTOM', 'Custom')], max_length=30)),
                ('workflow_steps', models.JSONField(default=list, help_text='List of workflow steps with conditions and actions')),
                ('input_parameters', models.JSONField(default=list, help_text='Required input parameters for the workflow')),
                ('output_schema', models.JSONField(default=dict, help_text='Expected output schema')),
                ('usage_count', models.PositiveIntegerField(default=0)),
                ('is_public', models.BooleanField(default=True)),
                ('created_at', models.DateTimeField(auto_now_add=True)),
                ('updated_at', models.DateTimeField(auto_now=True)),
                ('created_by', models.ForeignKey(null=True, on_delete=django.db.models.deletion.SET_NULL, to=settings.AUTH_USER_MODEL)),
            ],
            options={
                'ordering': ['name'],
            },
        ),
        migrations.CreateModel(
            name='WorkflowExecution',
            fields=[
                ('id', models.UUIDField(default=uuid.uuid4, editable=False, primary_key=True, serialize=False)),
                ('name', models.CharField(help_text='Name for this execution instance', max_length=200)),
                ('trigger_type', models.CharField(choices=[('MANUAL', 'Manual Trigger'), ('AUTOMATIC', 'Automatic Trigger'), ('SCHEDULED', 'Scheduled Trigger'), ('WEBHOOK', 'Webhook Trigger'), ('CHATOPS', 'ChatOps Trigger')], max_length=20)),
                ('status', models.CharField(choices=[('PENDING', 'Pending'), ('RUNNING', 'Running'), ('COMPLETED', 'Completed'), ('FAILED', 'Failed'), ('CANCELLED', 'Cancelled'), ('PAUSED', 'Paused')], default='PENDING', max_length=20)),
                ('current_step', models.PositiveIntegerField(default=0)),
                ('total_steps', models.PositiveIntegerField()),
                ('input_data', models.JSONField(default=dict, help_text='Input data for the workflow')),
                ('output_data', models.JSONField(default=dict, help_text='Output data from the workflow')),
                ('execution_log', models.JSONField(default=list, help_text='Detailed execution log')),
                ('error_message', models.TextField(blank=True, null=True)),
                ('started_at', models.DateTimeField(auto_now_add=True)),
                ('completed_at', models.DateTimeField(blank=True, null=True)),
                ('duration', models.DurationField(blank=True, null=True)),
                ('related_incident', models.ForeignKey(blank=True, null=True, on_delete=django.db.models.deletion.SET_NULL, related_name='workflow_executions', to='incident_intelligence.incident')),
                ('related_maintenance', models.ForeignKey(blank=True, null=True, on_delete=django.db.models.deletion.SET_NULL, related_name='workflow_executions', to='automation_orchestration.maintenancewindow')),
                ('triggered_by', models.ForeignKey(null=True, on_delete=django.db.models.deletion.SET_NULL, to=settings.AUTH_USER_MODEL)),
                ('workflow_template', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, related_name='executions', to='automation_orchestration.workflowtemplate')),
            ],
            options={
                'ordering': ['-started_at'],
            },
        ),
        migrations.AddIndex(
            model_name='autoremediation',
            index=models.Index(fields=['remediation_type', 'is_active'], name='automation__remedia_3c1fa8_idx'),
        ),
        migrations.AddIndex(
            model_name='autoremediation',
            index=models.Index(fields=['trigger_condition_type'], name='automation__trigger_264d7b_idx'),
        ),
        migrations.AddIndex(
            model_name='autoremediationexecution',
            index=models.Index(fields=['auto_remediation', 'status'], name='automation__auto_re_e8a9e2_idx'),
        ),
        migrations.AddIndex(
            model_name='autoremediationexecution',
            index=models.Index(fields=['incident', 'status'], name='automation__inciden_a63d49_idx'),
        ),
        migrations.AddIndex(
            model_name='autoremediationexecution',
            index=models.Index(fields=['triggered_at'], name='automation__trigger_8ef9fa_idx'),
        ),
        migrations.AddIndex(
            model_name='chatopsintegration',
            index=models.Index(fields=['platform', 'is_active'], name='automation__platfor_7d4e29_idx'),
        ),
        migrations.AddIndex(
            model_name='integration',
            index=models.Index(fields=['integration_type', 'status'], name='automation__integra_6734a8_idx'),
        ),
        migrations.AddIndex(
            model_name='integration',
            index=models.Index(fields=['status', 'health_status'], name='automation__status_30ecd5_idx'),
        ),
        migrations.AddIndex(
            model_name='maintenancewindow',
            index=models.Index(fields=['start_time', 'end_time'], name='automation__start_t_b3c4cd_idx'),
        ),
        migrations.AddIndex(
            model_name='maintenancewindow',
            index=models.Index(fields=['status'], name='automation__status_da957b_idx'),
        ),
        migrations.AddIndex(
            model_name='runbook',
            index=models.Index(fields=['status', 'trigger_type'], name='automation__status_bfcafe_idx'),
        ),
        migrations.AddIndex(
            model_name='runbook',
            index=models.Index(fields=['category'], name='automation__categor_dd8bc8_idx'),
        ),
        migrations.AddIndex(
            model_name='runbook',
            index=models.Index(fields=['created_at'], name='automation__created_ad879a_idx'),
        ),
        migrations.AddIndex(
            model_name='chatopscommand',
            index=models.Index(fields=['chatops_integration', 'status'], name='automation__chatops_3b0b3a_idx'),
        ),
        migrations.AddIndex(
            model_name='chatopscommand',
            index=models.Index(fields=['user_id', 'executed_at'], name='automation__user_id_390588_idx'),
        ),
        migrations.AddIndex(
            model_name='chatopscommand',
            index=models.Index(fields=['channel_id', 'executed_at'], name='automation__channel_35c09f_idx'),
        ),
        migrations.AddIndex(
            model_name='runbookexecution',
            index=models.Index(fields=['runbook', 'status'], name='automation__runbook_534aaf_idx'),
        ),
        migrations.AddIndex(
            model_name='runbookexecution',
            index=models.Index(fields=['triggered_by', 'started_at'], name='automation__trigger_05e907_idx'),
        ),
        migrations.AddIndex(
            model_name='runbookexecution',
            index=models.Index(fields=['incident', 'status'], name='automation__inciden_4231a4_idx'),
        ),
        migrations.AddIndex(
            model_name='workflowtemplate',
            index=models.Index(fields=['template_type', 'is_public'], name='automation__templat_3aecbb_idx'),
        ),
        migrations.AddIndex(
            model_name='workflowexecution',
            index=models.Index(fields=['workflow_template', 'status'], name='automation__workflo_1a0d89_idx'),
        ),
        migrations.AddIndex(
            model_name='workflowexecution',
            index=models.Index(fields=['triggered_by', 'started_at'], name='automation__trigger_072811_idx'),
        ),
        migrations.AddIndex(
            model_name='workflowexecution',
            index=models.Index(fields=['related_incident', 'status'], name='automation__related_08164b_idx'),
        ),
    ]
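
The `MaintenanceWindow` model created by this migration stores a time range, a list of affected services, and three suppress flags, but the migration does not show how suppression is decided. Below is a minimal sketch of one plausible check. It assumes a window applies while `start_time <= now < end_time`, that `CANCELLED` and `COMPLETED` windows never suppress anything, and that an alert is matched by its service name appearing in `affected_services`. The `Window` dataclass and `should_suppress_incident` function are hypothetical names used for illustration and are not part of this commit.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import List


@dataclass
class Window:
    """Hypothetical plain-Python mirror of a few MaintenanceWindow fields."""
    start_time: datetime
    end_time: datetime
    status: str = "SCHEDULED"
    affected_services: List[str] = field(default_factory=list)
    suppress_incident_creation: bool = True


def should_suppress_incident(window: Window, service: str, now: datetime) -> bool:
    # Assumption: cancelled or completed windows never suppress anything.
    if window.status in ("CANCELLED", "COMPLETED"):
        return False
    # Assumption: the window covers the half-open interval [start_time, end_time).
    in_range = window.start_time <= now < window.end_time
    return (
        in_range
        and window.suppress_incident_creation
        and service in window.affected_services
    )


start = datetime(2025, 9, 20, 2, 0, tzinfo=timezone.utc)
window = Window(
    start_time=start,
    end_time=start + timedelta(hours=2),
    affected_services=["web-app", "api-service", "database"],
)

# Inside the window, for a listed service.
during = should_suppress_incident(window, "web-app", start + timedelta(minutes=30))
# After the window has ended.
after = should_suppress_incident(window, "web-app", start + timedelta(hours=3))
# Inside the window, but for a service that is not listed.
other = should_suppress_incident(window, "billing", start + timedelta(minutes=30))
```

The composite index on `start_time` and `end_time` added above would suit a database query shaped like this range test, though the actual lookup used by the project is not shown in this section.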

@@ -0,0 +1,25 @@
# Generated by Django 5.2.6 on 2025-09-18 15:51

import django.db.models.deletion
from django.db import migrations, models


class Migration(migrations.Migration):

    dependencies = [
        ('automation_orchestration', '0001_initial'),
        ('sla_oncall', '0001_initial'),
    ]

    operations = [
        migrations.AddField(
            model_name='autoremediationexecution',
            name='sla_instance',
            field=models.ForeignKey(blank=True, help_text='SLA instance related to this auto-remediation', null=True, on_delete=django.db.models.deletion.SET_NULL, related_name='auto_remediations', to='sla_oncall.slainstance'),
        ),
        migrations.AddField(
            model_name='runbookexecution',
            name='sla_instance',
            field=models.ForeignKey(blank=True, help_text='SLA instance that triggered this runbook execution', null=True, on_delete=django.db.models.deletion.SET_NULL, related_name='runbook_executions', to='sla_oncall.slainstance'),
        ),
    ]

@@ -0,0 +1,680 @@
"""
Automation & Orchestration models for Enterprise Incident Management API
Implements runbooks, integrations, ChatOps, auto-remediation, and maintenance scheduling
"""
import uuid
import json
from datetime import datetime, timedelta
from typing import Dict, Any, Optional, List
from django.db import models
from django.contrib.auth import get_user_model
from django.core.validators import MinValueValidator, MaxValueValidator
from django.utils import timezone
from django.core.exceptions import ValidationError
User = get_user_model()


class Runbook(models.Model):
    """Predefined response steps for incident automation"""

    TRIGGER_TYPES = [
        ('MANUAL', 'Manual Trigger'),
        ('AUTOMATIC', 'Automatic Trigger'),
        ('SCHEDULED', 'Scheduled Trigger'),
        ('WEBHOOK', 'Webhook Trigger'),
        ('CHATOPS', 'ChatOps Trigger'),
    ]

    STATUS_CHOICES = [
        ('DRAFT', 'Draft'),
        ('ACTIVE', 'Active'),
        ('INACTIVE', 'Inactive'),
        ('DEPRECATED', 'Deprecated'),
    ]

    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    name = models.CharField(max_length=200, unique=True)
    description = models.TextField()
    version = models.CharField(max_length=20, default='1.0')

    # Trigger configuration
    trigger_type = models.CharField(max_length=20, choices=TRIGGER_TYPES, default='MANUAL')
    trigger_conditions = models.JSONField(
        default=dict,
        help_text="Conditions that trigger this runbook (incident severity, category, etc.)"
    )

    # Runbook content
    steps = models.JSONField(
        default=list,
        help_text="List of steps to execute in order"
    )
    estimated_duration = models.DurationField(help_text="Estimated time to complete")

    # Categorization
    category = models.CharField(max_length=100, blank=True, null=True)
    tags = models.JSONField(default=list, help_text="Tags for categorization and search")

    # Status and metadata
    status = models.CharField(max_length=20, choices=STATUS_CHOICES, default='DRAFT')
    is_public = models.BooleanField(default=True, help_text="Whether this runbook is available to all users")
    created_by = models.ForeignKey(User, on_delete=models.SET_NULL, null=True, related_name='created_runbooks')
    last_modified_by = models.ForeignKey(User, on_delete=models.SET_NULL, null=True, related_name='modified_runbooks')

    # Execution tracking
    execution_count = models.PositiveIntegerField(default=0)
    success_rate = models.FloatField(
        validators=[MinValueValidator(0.0), MaxValueValidator(1.0)],
        default=0.0,
        help_text="Success rate of runbook executions (0.0-1.0)"
    )

    # Timestamps
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)
    last_executed_at = models.DateTimeField(null=True, blank=True)

    class Meta:
        ordering = ['name']
        indexes = [
            models.Index(fields=['status', 'trigger_type']),
            models.Index(fields=['category']),
            models.Index(fields=['created_at']),
        ]

    def __str__(self):
        return f"{self.name} v{self.version}"

    def can_be_triggered_by(self, user: User) -> bool:
        """Check if user can trigger this runbook"""
        if not self.is_public and self.created_by != user:
            return False
        return self.status == 'ACTIVE'


class RunbookExecution(models.Model):
    """Execution log for runbook runs"""

    STATUS_CHOICES = [
        ('PENDING', 'Pending'),
        ('RUNNING', 'Running'),
        ('COMPLETED', 'Completed'),
        ('FAILED', 'Failed'),
        ('CANCELLED', 'Cancelled'),
        ('TIMEOUT', 'Timeout'),
    ]

    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    runbook = models.ForeignKey(Runbook, on_delete=models.CASCADE, related_name='executions')

    # Execution context
    triggered_by = models.ForeignKey(User, on_delete=models.SET_NULL, null=True)
    trigger_type = models.CharField(max_length=20, choices=Runbook.TRIGGER_TYPES)
    trigger_data = models.JSONField(default=dict, help_text="Data that triggered the execution")

    # Related incident (if applicable)
    incident = models.ForeignKey(
        'incident_intelligence.Incident',
        on_delete=models.SET_NULL,
        null=True,
        blank=True,
        related_name='runbook_executions'
    )

    # SLA Integration
    sla_instance = models.ForeignKey(
        'sla_oncall.SLAInstance',
        on_delete=models.SET_NULL,
        null=True,
        blank=True,
        related_name='runbook_executions',
        help_text="SLA instance that triggered this runbook execution"
    )

    # Execution details
    status = models.CharField(max_length=20, choices=STATUS_CHOICES, default='PENDING')
    current_step = models.PositiveIntegerField(default=0)
    total_steps = models.PositiveIntegerField()

    # Results
    execution_log = models.JSONField(default=list, help_text="Detailed execution log")
    error_message = models.TextField(blank=True, null=True)
    output_data = models.JSONField(default=dict, help_text="Output data from execution")

    # Performance metrics
    started_at = models.DateTimeField(auto_now_add=True)
    completed_at = models.DateTimeField(null=True, blank=True)
    duration = models.DurationField(null=True, blank=True)

    class Meta:
        ordering = ['-started_at']
        indexes = [
            models.Index(fields=['runbook', 'status']),
            models.Index(fields=['triggered_by', 'started_at']),
            models.Index(fields=['incident', 'status']),
        ]

    def __str__(self):
        return f"Execution of {self.runbook.name} - {self.status}"

    @property
    def is_running(self):
        return self.status == 'RUNNING'

    @property
    def is_completed(self):
        return self.status in ['COMPLETED', 'FAILED', 'CANCELLED', 'TIMEOUT']


class Integration(models.Model):
    """External system integrations (ITSM/CI/CD tools)"""

    INTEGRATION_TYPES = [
        ('JIRA', 'Jira'),
        ('GITHUB', 'GitHub'),
        ('JENKINS', 'Jenkins'),
        ('SERVICENOW', 'ServiceNow'),
        ('ANSIBLE', 'Ansible'),
        ('TERRAFORM', 'Terraform'),
        ('SLACK', 'Slack'),
        ('TEAMS', 'Microsoft Teams'),
        ('WEBHOOK', 'Generic Webhook'),
        ('API', 'Generic API'),
    ]

    STATUS_CHOICES = [
        ('ACTIVE', 'Active'),
        ('INACTIVE', 'Inactive'),
        ('ERROR', 'Error'),
        ('CONFIGURING', 'Configuring'),
    ]

    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    name = models.CharField(max_length=200, unique=True)
    integration_type = models.CharField(max_length=20, choices=INTEGRATION_TYPES)
    description = models.TextField(blank=True, null=True)

    # Configuration
    configuration = models.JSONField(
        default=dict,
        help_text="Integration-specific configuration (API keys, URLs, etc.)"
    )
    authentication_config = models.JSONField(
        default=dict,
        help_text="Authentication configuration (OAuth, API keys, etc.)"
    )

    # Status and health
    status = models.CharField(max_length=20, choices=STATUS_CHOICES, default='CONFIGURING')
    last_health_check = models.DateTimeField(null=True, blank=True)
    health_status = models.CharField(
        max_length=20,
        choices=[
            ('HEALTHY', 'Healthy'),
            ('WARNING', 'Warning'),
            ('ERROR', 'Error'),
            ('UNKNOWN', 'Unknown'),
        ],
        default='UNKNOWN'
    )
    error_message = models.TextField(blank=True, null=True)

    # Usage tracking
    request_count = models.PositiveIntegerField(default=0)
    last_used_at = models.DateTimeField(null=True, blank=True)

    # Metadata
    created_by = models.ForeignKey(User, on_delete=models.SET_NULL, null=True)
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)

    class Meta:
        ordering = ['name']
        indexes = [
            models.Index(fields=['integration_type', 'status']),
            models.Index(fields=['status', 'health_status']),
        ]

    def __str__(self):
        return f"{self.name} ({self.integration_type})"

    def is_healthy(self) -> bool:
        """Check if integration is healthy and ready to use"""
        return self.status == 'ACTIVE' and self.health_status == 'HEALTHY'


class ChatOpsIntegration(models.Model):
    """ChatOps integration for triggering workflows from chat platforms"""

    PLATFORM_CHOICES = [
        ('SLACK', 'Slack'),
        ('TEAMS', 'Microsoft Teams'),
        ('DISCORD', 'Discord'),
        ('MATTERMOST', 'Mattermost'),
    ]

    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    name = models.CharField(max_length=200, unique=True)
    platform = models.CharField(max_length=20, choices=PLATFORM_CHOICES)

    # Platform configuration
    webhook_url = models.URLField(help_text="Webhook URL for the chat platform")
    bot_token = models.CharField(max_length=500, help_text="Bot authentication token")
    channel_id = models.CharField(max_length=100, help_text="Default channel ID")

    # Command configuration
    command_prefix = models.CharField(max_length=10, default='!', help_text="Command prefix (e.g., !, /)")
    available_commands = models.JSONField(
        default=list,
        help_text="List of available commands and their descriptions"
    )

    # Security
    allowed_users = models.JSONField(
        default=list,
        help_text="List of user IDs allowed to use commands"
    )
    allowed_channels = models.JSONField(
        default=list,
        help_text="List of channel IDs where commands are allowed"
    )

    # Status
    is_active = models.BooleanField(default=True)
    last_activity = models.DateTimeField(null=True, blank=True)

    # Metadata
    created_by = models.ForeignKey(User, on_delete=models.SET_NULL, null=True)
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)

    class Meta:
        ordering = ['name']
        indexes = [
            models.Index(fields=['platform', 'is_active']),
        ]

    def __str__(self):
        return f"{self.name} ({self.platform})"


class ChatOpsCommand(models.Model):
    """Individual ChatOps commands and their execution"""

    STATUS_CHOICES = [
        ('PENDING', 'Pending'),
        ('EXECUTING', 'Executing'),
        ('COMPLETED', 'Completed'),
        ('FAILED', 'Failed'),
        ('CANCELLED', 'Cancelled'),
    ]

    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    chatops_integration = models.ForeignKey(ChatOpsIntegration, on_delete=models.CASCADE, related_name='commands')

    # Command details
    command = models.CharField(max_length=100, help_text="The command that was executed")
    arguments = models.JSONField(default=list, help_text="Command arguments")
    user_id = models.CharField(max_length=100, help_text="User ID from chat platform")
    channel_id = models.CharField(max_length=100, help_text="Channel ID where command was executed")

    # Execution context
    triggered_runbook = models.ForeignKey(
        Runbook,
        on_delete=models.SET_NULL,
        null=True,
        blank=True,
        related_name='chatops_triggers'
    )
    related_incident = models.ForeignKey(
        'incident_intelligence.Incident',
        on_delete=models.SET_NULL,
        null=True,
        blank=True,
        related_name='chatops_commands'
    )

    # Execution results
    status = models.CharField(max_length=20, choices=STATUS_CHOICES, default='PENDING')
    response_message = models.TextField(blank=True, null=True)
    execution_log = models.JSONField(default=list, help_text="Detailed execution log")
    error_message = models.TextField(blank=True, null=True)

    # Timestamps
    executed_at = models.DateTimeField(auto_now_add=True)
    completed_at = models.DateTimeField(null=True, blank=True)

    class Meta:
        ordering = ['-executed_at']
        indexes = [
            models.Index(fields=['chatops_integration', 'status']),
            models.Index(fields=['user_id', 'executed_at']),
            models.Index(fields=['channel_id', 'executed_at']),
        ]

    def __str__(self):
        return f"{self.command} by {self.user_id} - {self.status}"
class AutoRemediation(models.Model):
"""Auto-remediation hooks for automatic incident response"""
REMEDIATION_TYPES = [
('SERVICE_RESTART', 'Service Restart'),
('DEPLOYMENT_ROLLBACK', 'Deployment Rollback'),
('SCALE_UP', 'Scale Up Resources'),
('SCALE_DOWN', 'Scale Down Resources'),
('CACHE_CLEAR', 'Clear Cache'),
('CONFIG_UPDATE', 'Configuration Update'),
('CUSTOM_SCRIPT', 'Custom Script'),
('WEBHOOK', 'Webhook Call'),
]
TRIGGER_CONDITIONS = [
('SEVERITY', 'Incident Severity'),
('CATEGORY', 'Incident Category'),
('SERVICE', 'Affected Service'),
('DURATION', 'Incident Duration'),
('PATTERN', 'Pattern Match'),
]
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
name = models.CharField(max_length=200, unique=True)
description = models.TextField()
remediation_type = models.CharField(max_length=30, choices=REMEDIATION_TYPES)
# Trigger configuration
trigger_conditions = models.JSONField(
default=dict,
help_text="Conditions that trigger this remediation"
)
trigger_condition_type = models.CharField(max_length=20, choices=TRIGGER_CONDITIONS)
# Remediation configuration
remediation_config = models.JSONField(
default=dict,
help_text="Configuration for the remediation action"
)
timeout_seconds = models.PositiveIntegerField(default=300, help_text="Timeout for remediation action")
# Safety and approval
requires_approval = models.BooleanField(default=False, help_text="Whether manual approval is required")
approval_users = models.ManyToManyField(User, blank=True, related_name='approvable_remediations', help_text="Users who can approve this remediation")
max_executions_per_incident = models.PositiveIntegerField(default=1, help_text="Max times this can run per incident")
# Status and metadata
is_active = models.BooleanField(default=True)
created_by = models.ForeignKey(User, on_delete=models.SET_NULL, null=True, related_name='created_auto_remediations')
created_at = models.DateTimeField(auto_now_add=True)
updated_at = models.DateTimeField(auto_now=True)
# Execution tracking
execution_count = models.PositiveIntegerField(default=0)
success_count = models.PositiveIntegerField(default=0)
last_executed_at = models.DateTimeField(null=True, blank=True)
class Meta:
ordering = ['name']
indexes = [
models.Index(fields=['remediation_type', 'is_active']),
models.Index(fields=['trigger_condition_type']),
]
def __str__(self):
return f"{self.name} ({self.remediation_type})"
@property
def success_rate(self):
if self.execution_count == 0:
return 0.0
return self.success_count / self.execution_count
class AutoRemediationExecution(models.Model):
"""Execution log for auto-remediation actions"""
STATUS_CHOICES = [
('PENDING', 'Pending'),
('APPROVED', 'Approved'),
('EXECUTING', 'Executing'),
('COMPLETED', 'Completed'),
('FAILED', 'Failed'),
('CANCELLED', 'Cancelled'),
('TIMEOUT', 'Timeout'),
('REJECTED', 'Rejected'),
]
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
auto_remediation = models.ForeignKey(AutoRemediation, on_delete=models.CASCADE, related_name='executions')
# Related incident
incident = models.ForeignKey(
'incident_intelligence.Incident',
on_delete=models.CASCADE,
related_name='auto_remediations'
)
# SLA Integration
sla_instance = models.ForeignKey(
'sla_oncall.SLAInstance',
on_delete=models.SET_NULL,
null=True,
blank=True,
related_name='auto_remediations',
help_text="SLA instance related to this auto-remediation"
)
# Execution details
status = models.CharField(max_length=20, choices=STATUS_CHOICES, default='PENDING')
trigger_data = models.JSONField(default=dict, help_text="Data that triggered the remediation")
# Approval workflow
approved_by = models.ForeignKey(User, on_delete=models.SET_NULL, null=True, blank=True, related_name='approved_remediations')
approved_at = models.DateTimeField(null=True, blank=True)
approval_notes = models.TextField(blank=True, null=True)
# Execution results
execution_log = models.JSONField(default=list, help_text="Detailed execution log")
output_data = models.JSONField(default=dict, help_text="Output data from remediation")
error_message = models.TextField(blank=True, null=True)
# Timestamps
triggered_at = models.DateTimeField(auto_now_add=True)
started_at = models.DateTimeField(null=True, blank=True)
completed_at = models.DateTimeField(null=True, blank=True)
duration = models.DurationField(null=True, blank=True)
class Meta:
ordering = ['-triggered_at']
indexes = [
models.Index(fields=['auto_remediation', 'status']),
models.Index(fields=['incident', 'status']),
models.Index(fields=['triggered_at']),
]
def __str__(self):
return f"Remediation {self.auto_remediation.name} for {self.incident.title} - {self.status}"
class MaintenanceWindow(models.Model):
"""Scheduled maintenance windows to suppress alerts"""
STATUS_CHOICES = [
('SCHEDULED', 'Scheduled'),
('ACTIVE', 'Active'),
('COMPLETED', 'Completed'),
('CANCELLED', 'Cancelled'),
]
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
name = models.CharField(max_length=200)
description = models.TextField()
# Schedule
start_time = models.DateTimeField(help_text="When maintenance window starts")
end_time = models.DateTimeField(help_text="When maintenance window ends")
timezone = models.CharField(max_length=50, default='UTC')
# Scope
affected_services = models.JSONField(
default=list,
help_text="List of services affected by this maintenance"
)
affected_components = models.JSONField(
default=list,
help_text="List of components affected by this maintenance"
)
# Alert suppression
suppress_incident_creation = models.BooleanField(default=True)
suppress_notifications = models.BooleanField(default=True)
suppress_escalations = models.BooleanField(default=True)
# Status and metadata
status = models.CharField(max_length=20, choices=STATUS_CHOICES, default='SCHEDULED')
created_by = models.ForeignKey(User, on_delete=models.SET_NULL, null=True)
created_at = models.DateTimeField(auto_now_add=True)
updated_at = models.DateTimeField(auto_now=True)
# Execution tracking
incidents_suppressed = models.PositiveIntegerField(default=0)
notifications_suppressed = models.PositiveIntegerField(default=0)
class Meta:
ordering = ['start_time']
indexes = [
models.Index(fields=['start_time', 'end_time']),
models.Index(fields=['status']),
]
def __str__(self):
return f"{self.name} ({self.start_time} - {self.end_time})"
def is_active(self) -> bool:
"""Check if maintenance window is currently active"""
now = timezone.now()
return self.start_time <= now <= self.end_time and self.status == 'ACTIVE'
def is_scheduled(self) -> bool:
"""Check if maintenance window is scheduled for the future"""
now = timezone.now()
return self.start_time > now and self.status == 'SCHEDULED'
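    def clean(self):
        """Validate maintenance window data"""
        # Either value can be missing if field-level validation already failed
        if self.start_time and self.end_time and self.start_time >= self.end_time:
            raise ValidationError("Start time must be before end time")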
class WorkflowTemplate(models.Model):
"""Reusable workflow templates for common automation scenarios"""
TEMPLATE_TYPES = [
('INCIDENT_RESPONSE', 'Incident Response'),
('DEPLOYMENT', 'Deployment'),
('MAINTENANCE', 'Maintenance'),
('SCALING', 'Scaling'),
('MONITORING', 'Monitoring'),
('CUSTOM', 'Custom'),
]
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
name = models.CharField(max_length=200, unique=True)
description = models.TextField()
template_type = models.CharField(max_length=30, choices=TEMPLATE_TYPES)
# Template content
workflow_steps = models.JSONField(
default=list,
help_text="List of workflow steps with conditions and actions"
)
input_parameters = models.JSONField(
default=list,
help_text="Required input parameters for the workflow"
)
output_schema = models.JSONField(
default=dict,
help_text="Expected output schema"
)
# Usage and metadata
usage_count = models.PositiveIntegerField(default=0)
is_public = models.BooleanField(default=True)
created_by = models.ForeignKey(User, on_delete=models.SET_NULL, null=True)
created_at = models.DateTimeField(auto_now_add=True)
updated_at = models.DateTimeField(auto_now=True)
class Meta:
ordering = ['name']
indexes = [
models.Index(fields=['template_type', 'is_public']),
]
def __str__(self):
return f"{self.name} ({self.template_type})"
class WorkflowExecution(models.Model):
"""Execution of workflow templates"""
STATUS_CHOICES = [
('PENDING', 'Pending'),
('RUNNING', 'Running'),
('COMPLETED', 'Completed'),
('FAILED', 'Failed'),
('CANCELLED', 'Cancelled'),
('PAUSED', 'Paused'),
]
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
workflow_template = models.ForeignKey(WorkflowTemplate, on_delete=models.CASCADE, related_name='executions')
# Execution context
name = models.CharField(max_length=200, help_text="Name for this execution instance")
triggered_by = models.ForeignKey(User, on_delete=models.SET_NULL, null=True)
trigger_type = models.CharField(max_length=20, choices=Runbook.TRIGGER_TYPES)
# Related objects
related_incident = models.ForeignKey(
'incident_intelligence.Incident',
on_delete=models.SET_NULL,
null=True,
blank=True,
related_name='workflow_executions'
)
related_maintenance = models.ForeignKey(
MaintenanceWindow,
on_delete=models.SET_NULL,
null=True,
blank=True,
related_name='workflow_executions'
)
# Execution state
status = models.CharField(max_length=20, choices=STATUS_CHOICES, default='PENDING')
current_step = models.PositiveIntegerField(default=0)
total_steps = models.PositiveIntegerField()
# Input/Output
input_data = models.JSONField(default=dict, help_text="Input data for the workflow")
output_data = models.JSONField(default=dict, help_text="Output data from the workflow")
execution_log = models.JSONField(default=list, help_text="Detailed execution log")
error_message = models.TextField(blank=True, null=True)
# Timestamps
started_at = models.DateTimeField(auto_now_add=True)
completed_at = models.DateTimeField(null=True, blank=True)
duration = models.DurationField(null=True, blank=True)
class Meta:
ordering = ['-started_at']
indexes = [
models.Index(fields=['workflow_template', 'status']),
models.Index(fields=['triggered_by', 'started_at']),
models.Index(fields=['related_incident', 'status']),
]
def __str__(self):
return f"Workflow {self.name} - {self.status}"
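
The `MaintenanceWindow.is_active()` and `is_scheduled()` checks combine a time range with a status flag. The sketch below mirrors that logic in plain Python, without Django, so it can be run in isolation. `Window` and `should_suppress` are hypothetical names used only for illustration; in particular, matching on `affected_services` is an assumption about how suppression might be applied and is not taken from the module.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import List


@dataclass
class Window:
    """Hypothetical stand-in for the MaintenanceWindow model."""
    start_time: datetime
    end_time: datetime
    status: str = "SCHEDULED"
    affected_services: List[str] = field(default_factory=list)
    suppress_incident_creation: bool = True

    def is_active(self, now: datetime) -> bool:
        # Same rule as the model: inside the time range AND flagged ACTIVE.
        return self.start_time <= now <= self.end_time and self.status == "ACTIVE"

    def is_scheduled(self, now: datetime) -> bool:
        # Starts in the future and is still flagged SCHEDULED.
        return self.start_time > now and self.status == "SCHEDULED"


def should_suppress(window: Window, service: str, now: datetime) -> bool:
    """Assumed suppression rule: an active window that lists the service."""
    return (
        window.is_active(now)
        and window.suppress_incident_creation
        and service in window.affected_services
    )


now = datetime(2025, 9, 19, 12, 0, tzinfo=timezone.utc)
window = Window(
    start_time=now - timedelta(minutes=30),
    end_time=now + timedelta(minutes=30),
    status="ACTIVE",
    affected_services=["payments-api"],
)

print(should_suppress(window, "payments-api", now))  # True
print(should_suppress(window, "search-api", now))    # False
```

Note that a window whose time range contains the current time but whose status is still `SCHEDULED` is not treated as active, so the status transition has to happen before any suppression based on `is_active()` takes effect.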

@@ -0,0 +1,29 @@
"""
Serializers for Automation & Orchestration module
"""
from .automation import (
RunbookSerializer,
RunbookExecutionSerializer,
IntegrationSerializer,
ChatOpsIntegrationSerializer,
ChatOpsCommandSerializer,
AutoRemediationSerializer,
AutoRemediationExecutionSerializer,
MaintenanceWindowSerializer,
WorkflowTemplateSerializer,
WorkflowExecutionSerializer,
)
__all__ = [
'RunbookSerializer',
'RunbookExecutionSerializer',
'IntegrationSerializer',
'ChatOpsIntegrationSerializer',
'ChatOpsCommandSerializer',
'AutoRemediationSerializer',
'AutoRemediationExecutionSerializer',
'MaintenanceWindowSerializer',
'WorkflowTemplateSerializer',
'WorkflowExecutionSerializer',
]

@@ -0,0 +1,308 @@
"""
Serializers for Automation & Orchestration models
"""
from rest_framework import serializers
from django.contrib.auth import get_user_model
from ..models import (
Runbook,
RunbookExecution,
Integration,
ChatOpsIntegration,
ChatOpsCommand,
AutoRemediation,
AutoRemediationExecution,
MaintenanceWindow,
WorkflowTemplate,
WorkflowExecution,
)
User = get_user_model()
class RunbookSerializer(serializers.ModelSerializer):
"""Serializer for Runbook model"""
created_by_username = serializers.CharField(source='created_by.username', read_only=True)
last_modified_by_username = serializers.CharField(source='last_modified_by.username', read_only=True)
can_trigger = serializers.SerializerMethodField()
class Meta:
model = Runbook
fields = [
'id', 'name', 'description', 'version', 'trigger_type', 'trigger_conditions',
'steps', 'estimated_duration', 'category', 'tags', 'status', 'is_public',
'created_by', 'created_by_username', 'last_modified_by', 'last_modified_by_username',
'execution_count', 'success_rate', 'created_at', 'updated_at', 'last_executed_at',
'can_trigger'
]
read_only_fields = [
'id', 'created_by_username', 'last_modified_by_username', 'execution_count',
'success_rate', 'created_at', 'updated_at', 'last_executed_at', 'can_trigger'
]
def get_can_trigger(self, obj):
"""Check if current user can trigger this runbook"""
request = self.context.get('request')
if request and request.user:
return obj.can_be_triggered_by(request.user)
return False
def validate_steps(self, value):
"""Validate runbook steps"""
if not isinstance(value, list):
raise serializers.ValidationError("Steps must be a list")
for i, step in enumerate(value):
if not isinstance(step, dict):
raise serializers.ValidationError(f"Step {i+1} must be a dictionary")
required_fields = ['name', 'action', 'timeout']
for field in required_fields:
if field not in step:
raise serializers.ValidationError(f"Step {i+1} missing required field: {field}")
return value
class RunbookExecutionSerializer(serializers.ModelSerializer):
"""Serializer for RunbookExecution model"""
runbook_name = serializers.CharField(source='runbook.name', read_only=True)
triggered_by_username = serializers.CharField(source='triggered_by.username', read_only=True)
incident_title = serializers.CharField(source='incident.title', read_only=True)
is_running = serializers.BooleanField(read_only=True)
is_completed = serializers.BooleanField(read_only=True)
class Meta:
model = RunbookExecution
fields = [
'id', 'runbook', 'runbook_name', 'triggered_by', 'triggered_by_username',
'trigger_type', 'trigger_data', 'incident', 'incident_title', 'status',
'current_step', 'total_steps', 'execution_log', 'error_message', 'output_data',
'started_at', 'completed_at', 'duration', 'is_running', 'is_completed'
]
read_only_fields = [
'id', 'runbook_name', 'triggered_by_username', 'incident_title',
'is_running', 'is_completed', 'started_at', 'completed_at', 'duration'
]
class IntegrationSerializer(serializers.ModelSerializer):
"""Serializer for Integration model"""
created_by_username = serializers.CharField(source='created_by.username', read_only=True)
is_healthy = serializers.BooleanField(read_only=True)
class Meta:
model = Integration
fields = [
'id', 'name', 'integration_type', 'description', 'configuration',
'authentication_config', 'status', 'last_health_check', 'health_status',
'error_message', 'request_count', 'last_used_at', 'created_by',
'created_by_username', 'created_at', 'updated_at', 'is_healthy'
]
read_only_fields = [
'id', 'created_by_username', 'last_health_check', 'health_status',
'error_message', 'request_count', 'last_used_at', 'created_at',
'updated_at', 'is_healthy'
]
def validate_configuration(self, value):
"""Validate integration configuration"""
if not isinstance(value, dict):
raise serializers.ValidationError("Configuration must be a dictionary")
return value
def validate_authentication_config(self, value):
"""Validate authentication configuration"""
if not isinstance(value, dict):
raise serializers.ValidationError("Authentication configuration must be a dictionary")
return value
class ChatOpsIntegrationSerializer(serializers.ModelSerializer):
"""Serializer for ChatOpsIntegration model"""
created_by_username = serializers.CharField(source='created_by.username', read_only=True)
    class Meta:
        model = ChatOpsIntegration
        fields = [
            'id', 'name', 'platform', 'webhook_url', 'bot_token', 'channel_id',
            'command_prefix', 'available_commands', 'allowed_users', 'allowed_channels',
            'is_active', 'last_activity', 'created_by', 'created_by_username',
            'created_at', 'updated_at'
        ]
        read_only_fields = [
            'id', 'created_by_username', 'last_activity', 'created_at', 'updated_at'
        ]
        # Accept the bot token on write but never echo it back in API responses
        extra_kwargs = {
            'bot_token': {'write_only': True},
        }
def validate_available_commands(self, value):
"""Validate available commands"""
if not isinstance(value, list):
raise serializers.ValidationError("Available commands must be a list")
for i, command in enumerate(value):
if not isinstance(command, dict):
raise serializers.ValidationError(f"Command {i+1} must be a dictionary")
required_fields = ['name', 'description']
for field in required_fields:
if field not in command:
raise serializers.ValidationError(f"Command {i+1} missing required field: {field}")
return value
class ChatOpsCommandSerializer(serializers.ModelSerializer):
"""Serializer for ChatOpsCommand model"""
chatops_integration_name = serializers.CharField(source='chatops_integration.name', read_only=True)
triggered_runbook_name = serializers.CharField(source='triggered_runbook.name', read_only=True)
related_incident_title = serializers.CharField(source='related_incident.title', read_only=True)
class Meta:
model = ChatOpsCommand
fields = [
'id', 'chatops_integration', 'chatops_integration_name', 'command', 'arguments',
'user_id', 'channel_id', 'triggered_runbook', 'triggered_runbook_name',
'related_incident', 'related_incident_title', 'status', 'response_message',
'execution_log', 'error_message', 'executed_at', 'completed_at'
]
read_only_fields = [
'id', 'chatops_integration_name', 'triggered_runbook_name',
'related_incident_title', 'executed_at', 'completed_at'
]
class AutoRemediationSerializer(serializers.ModelSerializer):
"""Serializer for AutoRemediation model"""
created_by_username = serializers.CharField(source='created_by.username', read_only=True)
approval_users_usernames = serializers.SerializerMethodField()
success_rate = serializers.FloatField(read_only=True)
class Meta:
model = AutoRemediation
fields = [
'id', 'name', 'description', 'remediation_type', 'trigger_conditions',
'trigger_condition_type', 'remediation_config', 'timeout_seconds',
'requires_approval', 'approval_users', 'approval_users_usernames',
'max_executions_per_incident', 'is_active', 'created_by', 'created_by_username',
'created_at', 'updated_at', 'execution_count', 'success_count',
'last_executed_at', 'success_rate'
]
read_only_fields = [
'id', 'created_by_username', 'approval_users_usernames', 'created_at',
'updated_at', 'execution_count', 'success_count', 'last_executed_at', 'success_rate'
]
def get_approval_users_usernames(self, obj):
"""Get usernames of approval users"""
return [user.username for user in obj.approval_users.all()]
class AutoRemediationExecutionSerializer(serializers.ModelSerializer):
"""Serializer for AutoRemediationExecution model"""
auto_remediation_name = serializers.CharField(source='auto_remediation.name', read_only=True)
incident_title = serializers.CharField(source='incident.title', read_only=True)
approved_by_username = serializers.CharField(source='approved_by.username', read_only=True)
class Meta:
model = AutoRemediationExecution
fields = [
'id', 'auto_remediation', 'auto_remediation_name', 'incident', 'incident_title',
'status', 'trigger_data', 'approved_by', 'approved_by_username', 'approved_at',
'approval_notes', 'execution_log', 'output_data', 'error_message',
'triggered_at', 'started_at', 'completed_at', 'duration'
]
read_only_fields = [
'id', 'auto_remediation_name', 'incident_title', 'approved_by_username',
'triggered_at', 'started_at', 'completed_at', 'duration'
]
class MaintenanceWindowSerializer(serializers.ModelSerializer):
"""Serializer for MaintenanceWindow model"""
created_by_username = serializers.CharField(source='created_by.username', read_only=True)
is_active = serializers.BooleanField(read_only=True)
is_scheduled = serializers.BooleanField(read_only=True)
class Meta:
model = MaintenanceWindow
fields = [
'id', 'name', 'description', 'start_time', 'end_time', 'timezone',
'affected_services', 'affected_components', 'suppress_incident_creation',
'suppress_notifications', 'suppress_escalations', 'status', 'created_by',
'created_by_username', 'created_at', 'updated_at', 'incidents_suppressed',
'notifications_suppressed', 'is_active', 'is_scheduled'
]
read_only_fields = [
'id', 'created_by_username', 'created_at', 'updated_at',
'incidents_suppressed', 'notifications_suppressed', 'is_active', 'is_scheduled'
]
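    def validate(self, data):
        """Validate maintenance window data"""
        # On partial updates fall back to the stored values, so that changing
        # only one of the two timestamps is still checked against the other.
        start_time = data.get('start_time', getattr(self.instance, 'start_time', None))
        end_time = data.get('end_time', getattr(self.instance, 'end_time', None))
        if start_time and end_time and start_time >= end_time:
            raise serializers.ValidationError("Start time must be before end time")
        return data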
class WorkflowTemplateSerializer(serializers.ModelSerializer):
"""Serializer for WorkflowTemplate model"""
created_by_username = serializers.CharField(source='created_by.username', read_only=True)
class Meta:
model = WorkflowTemplate
fields = [
'id', 'name', 'description', 'template_type', 'workflow_steps',
'input_parameters', 'output_schema', 'usage_count', 'is_public',
'created_by', 'created_by_username', 'created_at', 'updated_at'
]
read_only_fields = [
'id', 'created_by_username', 'usage_count', 'created_at', 'updated_at'
]
def validate_workflow_steps(self, value):
"""Validate workflow steps"""
if not isinstance(value, list):
raise serializers.ValidationError("Workflow steps must be a list")
for i, step in enumerate(value):
if not isinstance(step, dict):
raise serializers.ValidationError(f"Step {i+1} must be a dictionary")
required_fields = ['name', 'action', 'conditions']
for field in required_fields:
if field not in step:
raise serializers.ValidationError(f"Step {i+1} missing required field: {field}")
return value
class WorkflowExecutionSerializer(serializers.ModelSerializer):
"""Serializer for WorkflowExecution model"""
workflow_template_name = serializers.CharField(source='workflow_template.name', read_only=True)
triggered_by_username = serializers.CharField(source='triggered_by.username', read_only=True)
related_incident_title = serializers.CharField(source='related_incident.title', read_only=True)
related_maintenance_name = serializers.CharField(source='related_maintenance.name', read_only=True)
class Meta:
model = WorkflowExecution
fields = [
'id', 'workflow_template', 'workflow_template_name', 'name', 'triggered_by',
'triggered_by_username', 'trigger_type', 'related_incident', 'related_incident_title',
'related_maintenance', 'related_maintenance_name', 'status', 'current_step',
'total_steps', 'input_data', 'output_data', 'execution_log', 'error_message',
'started_at', 'completed_at', 'duration'
]
read_only_fields = [
'id', 'workflow_template_name', 'triggered_by_username', 'related_incident_title',
'related_maintenance_name', 'started_at', 'completed_at', 'duration'
]
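
`RunbookSerializer.validate_steps` and `WorkflowTemplateSerializer.validate_workflow_steps` apply the same shape check with different required keys. The sketch below restates that check as a standalone function and shows one payload that passes and one that fails. `check_steps` and the sample step values (such as `restart_service`) are hypothetical and chosen only for illustration; the serializers themselves raise `serializers.ValidationError` rather than `ValueError`.

```python
from typing import Any, Iterable, List


def check_steps(steps: Any, required_fields: Iterable[str]) -> List[dict]:
    """Return steps unchanged if every entry is a dict with the required keys."""
    if not isinstance(steps, list):
        raise ValueError("Steps must be a list")
    for i, step in enumerate(steps, start=1):
        if not isinstance(step, dict):
            raise ValueError(f"Step {i} must be a dictionary")
        for name in required_fields:
            if name not in step:
                raise ValueError(f"Step {i} missing required field: {name}")
    return steps


# Required keys for runbook steps, as listed in validate_steps.
RUNBOOK_FIELDS = ("name", "action", "timeout")

valid = [{"name": "Restart", "action": "restart_service", "timeout": 60}]
invalid = [{"name": "Restart", "action": "restart_service"}]

print(check_steps(valid, RUNBOOK_FIELDS) == valid)  # True

try:
    check_steps(invalid, RUNBOOK_FIELDS)
except ValueError as exc:
    print(exc)  # Step 1 missing required field: timeout
```

The check only confirms that the keys are present; it does not validate their values, so a step with `"timeout": "soon"` would still pass.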

@@ -0,0 +1,63 @@
"""
Signal handlers for automation_orchestration app
"""
from django.db.models.signals import post_save, pre_save
from django.dispatch import receiver
from django.utils import timezone
from .models import (
RunbookExecution,
AutoRemediationExecution,
MaintenanceWindow,
)
@receiver(post_save, sender=RunbookExecution)
def update_runbook_statistics(sender, instance, created, **kwargs):
    """Update runbook statistics when an execution reaches a terminal status"""
    terminal_statuses = ['COMPLETED', 'FAILED', 'CANCELLED', 'TIMEOUT']
    if instance.status in terminal_statuses:
        runbook = instance.runbook

        # Recompute from stored executions rather than incrementing, so that
        # re-saving a finished execution does not count it twice and failed
        # runs also lower the success rate.
        finished = RunbookExecution.objects.filter(
            runbook=runbook,
            status__in=terminal_statuses
        )
        total_executions = finished.count()
        successful_executions = finished.filter(status='COMPLETED').count()

        runbook.execution_count = total_executions
        runbook.success_rate = successful_executions / total_executions if total_executions > 0 else 0.0
        runbook.last_executed_at = instance.started_at
        runbook.save(update_fields=['execution_count', 'success_rate', 'last_executed_at'])
@receiver(post_save, sender=AutoRemediationExecution)
def update_auto_remediation_statistics(sender, instance, created, **kwargs):
    """Update auto-remediation statistics when an execution reaches a terminal status"""
    terminal_statuses = ['COMPLETED', 'FAILED', 'CANCELLED', 'TIMEOUT']
    if instance.status in terminal_statuses:
        remediation = instance.auto_remediation

        # Recompute from stored executions so that re-saving a finished
        # execution does not inflate the counters.
        finished = remediation.executions.filter(status__in=terminal_statuses)
        remediation.execution_count = finished.count()
        remediation.success_count = finished.filter(status='COMPLETED').count()
        remediation.last_executed_at = instance.triggered_at
        remediation.save(update_fields=['execution_count', 'success_count', 'last_executed_at'])
@receiver(pre_save, sender=MaintenanceWindow)
def validate_maintenance_window(sender, instance, **kwargs):
    """Validate maintenance window before saving"""
    if not (instance.start_time and instance.end_time):
        return

    if instance.start_time >= instance.end_time:
        raise ValueError("Start time must be before end time")

    # Auto-update status based on current time
    now = timezone.now()
    if instance.start_time <= now <= instance.end_time:
        if instance.status == 'SCHEDULED':
            instance.status = 'ACTIVE'
    elif instance.end_time < now:
        if instance.status in ['SCHEDULED', 'ACTIVE']:
            instance.status = 'COMPLETED'
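
`post_save` fires on every save of an execution, not only on the transition into a terminal status. The plain-Python sketch below contrasts incrementing a counter inside such a hook with recomputing the totals from the stored executions, which gives the same answer however many times the hook runs. `Execution`, `on_save_increment`, and `recompute` are hypothetical names used only for illustration; no Django APIs are involved.

```python
from dataclasses import dataclass
from typing import Dict, List

TERMINAL = {"COMPLETED", "FAILED", "CANCELLED", "TIMEOUT"}


@dataclass
class Execution:
    """Hypothetical stand-in for one execution row."""
    status: str


def on_save_increment(stats: Dict[str, int], execution: Execution) -> None:
    # Counter-style update: every call with a terminal status adds one.
    if execution.status in TERMINAL:
        stats["execution_count"] += 1


def recompute(executions: List[Execution]) -> Dict[str, float]:
    # Derive totals from the stored rows, so repeated calls are idempotent.
    finished = [e for e in executions if e.status in TERMINAL]
    succeeded = [e for e in finished if e.status == "COMPLETED"]
    total = len(finished)
    return {
        "execution_count": total,
        "success_rate": len(succeeded) / total if total else 0.0,
    }


rows = [Execution("COMPLETED"), Execution("FAILED"), Execution("RUNNING")]

# Saving the first finished execution twice (for example to attach a log)
# is counted twice by the incrementing hook.
stats = {"execution_count": 0}
on_save_increment(stats, rows[0])
on_save_increment(stats, rows[0])
on_save_increment(stats, rows[1])
print(stats["execution_count"])  # 3

# Recomputing reports two finished executions, one of them successful.
print(recompute(rows))  # {'execution_count': 2, 'success_rate': 0.5}
```

The recomputing variant costs extra queries per save in a real database, so it trades some write-time work for counters that cannot drift.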

@@ -0,0 +1,3 @@
from django.test import TestCase
# Create your tests here.

@@ -0,0 +1,36 @@
"""
URL configuration for automation_orchestration app
"""
from django.urls import path, include
from rest_framework.routers import DefaultRouter
from .views.automation import (
RunbookViewSet,
RunbookExecutionViewSet,
IntegrationViewSet,
ChatOpsIntegrationViewSet,
ChatOpsCommandViewSet,
AutoRemediationViewSet,
AutoRemediationExecutionViewSet,
MaintenanceWindowViewSet,
WorkflowTemplateViewSet,
WorkflowExecutionViewSet,
)
# Create router and register viewsets
router = DefaultRouter()
router.register(r'runbooks', RunbookViewSet)
router.register(r'runbook-executions', RunbookExecutionViewSet)
router.register(r'integrations', IntegrationViewSet)
router.register(r'chatops-integrations', ChatOpsIntegrationViewSet)
router.register(r'chatops-commands', ChatOpsCommandViewSet)
router.register(r'auto-remediations', AutoRemediationViewSet)
router.register(r'auto-remediation-executions', AutoRemediationExecutionViewSet)
router.register(r'maintenance-windows', MaintenanceWindowViewSet)
router.register(r'workflow-templates', WorkflowTemplateViewSet)
router.register(r'workflow-executions', WorkflowExecutionViewSet)
app_name = 'automation_orchestration'
urlpatterns = [
path('', include(router.urls)),
]

@@ -0,0 +1,3 @@
from django.shortcuts import render
# Create your views here.

@@ -0,0 +1,29 @@
"""
Views for Automation & Orchestration module
"""
from .automation import (
RunbookViewSet,
RunbookExecutionViewSet,
IntegrationViewSet,
ChatOpsIntegrationViewSet,
ChatOpsCommandViewSet,
AutoRemediationViewSet,
AutoRemediationExecutionViewSet,
MaintenanceWindowViewSet,
WorkflowTemplateViewSet,
WorkflowExecutionViewSet,
)
__all__ = [
'RunbookViewSet',
'RunbookExecutionViewSet',
'IntegrationViewSet',
'ChatOpsIntegrationViewSet',
'ChatOpsCommandViewSet',
'AutoRemediationViewSet',
'AutoRemediationExecutionViewSet',
'MaintenanceWindowViewSet',
'WorkflowTemplateViewSet',
'WorkflowExecutionViewSet',
]

@@ -0,0 +1,411 @@
"""
Views for Automation & Orchestration models
"""
from rest_framework import viewsets, status, permissions
from rest_framework.decorators import action
from rest_framework.response import Response
from django_filters.rest_framework import DjangoFilterBackend
from rest_framework.filters import SearchFilter, OrderingFilter
from django.utils import timezone
from django.db.models import Q
from ..models import (
Runbook,
RunbookExecution,
Integration,
ChatOpsIntegration,
ChatOpsCommand,
AutoRemediation,
AutoRemediationExecution,
MaintenanceWindow,
WorkflowTemplate,
WorkflowExecution,
)
from ..serializers.automation import (
RunbookSerializer,
RunbookExecutionSerializer,
IntegrationSerializer,
ChatOpsIntegrationSerializer,
ChatOpsCommandSerializer,
AutoRemediationSerializer,
AutoRemediationExecutionSerializer,
MaintenanceWindowSerializer,
WorkflowTemplateSerializer,
WorkflowExecutionSerializer,
)
class RunbookViewSet(viewsets.ModelViewSet):
"""ViewSet for Runbook model"""
queryset = Runbook.objects.all()
serializer_class = RunbookSerializer
permission_classes = [permissions.IsAuthenticated]
filter_backends = [DjangoFilterBackend, SearchFilter, OrderingFilter]
filterset_fields = ['status', 'trigger_type', 'category', 'is_public']
search_fields = ['name', 'description', 'category']
ordering_fields = ['name', 'created_at', 'updated_at', 'execution_count', 'success_rate']
ordering = ['-created_at']
def get_queryset(self):
"""Filter runbooks based on user permissions"""
queryset = super().get_queryset()
# Filter by public runbooks or user's own runbooks
if not self.request.user.is_staff:
queryset = queryset.filter(
Q(is_public=True) | Q(created_by=self.request.user)
)
return queryset
def perform_create(self, serializer):
"""Set the creator when creating a runbook"""
serializer.save(created_by=self.request.user)
def perform_update(self, serializer):
"""Set the last modifier when updating a runbook"""
serializer.save(last_modified_by=self.request.user)
@action(detail=True, methods=['post'])
def execute(self, request, pk=None):
"""Execute a runbook"""
runbook = self.get_object()
if not runbook.can_be_triggered_by(request.user):
return Response(
{'error': 'You do not have permission to execute this runbook'},
status=status.HTTP_403_FORBIDDEN
)
# Create execution record
execution = RunbookExecution.objects.create(
runbook=runbook,
triggered_by=request.user,
trigger_type='MANUAL',
trigger_data=request.data.get('trigger_data', {}),
total_steps=len(runbook.steps)
)
# TODO: Start actual execution in background task
serializer = RunbookExecutionSerializer(execution, context={'request': request})
return Response(serializer.data, status=status.HTTP_201_CREATED)
@action(detail=False, methods=['get'])
def available_for_trigger(self, request):
"""Get runbooks available for triggering by current user"""
queryset = self.get_queryset().filter(status='ACTIVE')
available_runbooks = [rb for rb in queryset if rb.can_be_triggered_by(request.user)]
serializer = self.get_serializer(available_runbooks, many=True)
return Response(serializer.data)
class RunbookExecutionViewSet(viewsets.ReadOnlyModelViewSet):
"""ViewSet for RunbookExecution model (read-only)"""
queryset = RunbookExecution.objects.all()
serializer_class = RunbookExecutionSerializer
permission_classes = [permissions.IsAuthenticated]
filter_backends = [DjangoFilterBackend, SearchFilter, OrderingFilter]
filterset_fields = ['status', 'trigger_type', 'runbook', 'incident']
search_fields = ['runbook__name']
ordering_fields = ['started_at', 'completed_at', 'duration']
ordering = ['-started_at']
def get_queryset(self):
"""Filter executions based on user permissions"""
queryset = super().get_queryset()
# Users can only see executions they triggered or for incidents they have access to
if not self.request.user.is_staff:
queryset = queryset.filter(
Q(triggered_by=self.request.user) |
Q(incident__assigned_to=self.request.user) |
Q(incident__reporter=self.request.user)
)
return queryset


class IntegrationViewSet(viewsets.ModelViewSet):
    """ViewSet for Integration model"""
    queryset = Integration.objects.all()
    serializer_class = IntegrationSerializer
    permission_classes = [permissions.IsAuthenticated]
    filter_backends = [DjangoFilterBackend, SearchFilter, OrderingFilter]
    filterset_fields = ['integration_type', 'status', 'health_status']
    search_fields = ['name', 'description']
    ordering_fields = ['name', 'created_at', 'last_used_at']
    ordering = ['name']

    def perform_create(self, serializer):
        """Set the creator when creating an integration"""
        serializer.save(created_by=self.request.user)

    @action(detail=True, methods=['post'])
    def test_connection(self, request, pk=None):
        """Test integration connection"""
        integration = self.get_object()
        # TODO: Implement actual connection testing
        # For now, return a mock response; no connection is attempted
        return Response({
            'status': 'success',
            'message': f'Connection test for {integration.name} completed',
            'health_status': 'HEALTHY'
        })

    @action(detail=True, methods=['post'])
    def health_check(self, request, pk=None):
        """Perform health check on integration"""
        integration = self.get_object()
        # TODO: Implement actual health check
        # For now, record the check time and mark the integration healthy
        # unconditionally
        integration.last_health_check = timezone.now()
        integration.health_status = 'HEALTHY'
        integration.save()
        return Response({
            'status': 'success',
            'health_status': integration.health_status,
            'last_health_check': integration.last_health_check
        })


class ChatOpsIntegrationViewSet(viewsets.ModelViewSet):
    """ViewSet for ChatOpsIntegration model"""
    queryset = ChatOpsIntegration.objects.all()
    serializer_class = ChatOpsIntegrationSerializer
    permission_classes = [permissions.IsAuthenticated]
    filter_backends = [DjangoFilterBackend, SearchFilter, OrderingFilter]
    filterset_fields = ['platform', 'is_active']
    search_fields = ['name']
    ordering_fields = ['name', 'created_at', 'last_activity']
    ordering = ['name']

    def perform_create(self, serializer):
        """Set the creator when creating a ChatOps integration"""
        serializer.save(created_by=self.request.user)

    @action(detail=True, methods=['post'])
    def test_webhook(self, request, pk=None):
        """Test ChatOps webhook"""
        integration = self.get_object()
        # TODO: Implement actual webhook testing
        return Response({
            'status': 'success',
            'message': f'Webhook test for {integration.name} completed'
        })


class ChatOpsCommandViewSet(viewsets.ReadOnlyModelViewSet):
    """ViewSet for ChatOpsCommand model (read-only)"""
    queryset = ChatOpsCommand.objects.all()
    serializer_class = ChatOpsCommandSerializer
    permission_classes = [permissions.IsAuthenticated]
    filter_backends = [DjangoFilterBackend, SearchFilter, OrderingFilter]
    filterset_fields = ['status', 'chatops_integration', 'command']
    search_fields = ['command', 'user_id']
    ordering_fields = ['executed_at', 'completed_at']
    ordering = ['-executed_at']


class AutoRemediationViewSet(viewsets.ModelViewSet):
    """ViewSet for AutoRemediation model"""
    queryset = AutoRemediation.objects.all()
    serializer_class = AutoRemediationSerializer
    permission_classes = [permissions.IsAuthenticated]
    filter_backends = [DjangoFilterBackend, SearchFilter, OrderingFilter]
    filterset_fields = ['remediation_type', 'trigger_condition_type', 'is_active', 'requires_approval']
    search_fields = ['name', 'description']
    ordering_fields = ['name', 'created_at', 'execution_count', 'success_count']
    ordering = ['name']

    def perform_create(self, serializer):
        """Set the creator when creating an auto-remediation"""
        serializer.save(created_by=self.request.user)

    @action(detail=True, methods=['post'])
    def test_trigger(self, request, pk=None):
        """Test auto-remediation trigger conditions"""
        remediation = self.get_object()
        # TODO: Implement actual trigger testing
        return Response({
            'status': 'success',
            'message': f'Trigger test for {remediation.name} completed',
            'trigger_conditions': remediation.trigger_conditions
        })


class AutoRemediationExecutionViewSet(viewsets.ReadOnlyModelViewSet):
    """ViewSet for AutoRemediationExecution model (read-only, plus approve/reject actions)"""
    queryset = AutoRemediationExecution.objects.all()
    serializer_class = AutoRemediationExecutionSerializer
    permission_classes = [permissions.IsAuthenticated]
    filter_backends = [DjangoFilterBackend, SearchFilter, OrderingFilter]
    filterset_fields = ['status', 'auto_remediation', 'incident']
    search_fields = ['auto_remediation__name', 'incident__title']
    ordering_fields = ['triggered_at', 'started_at', 'completed_at']
    ordering = ['-triggered_at']

    @action(detail=True, methods=['post'])
    def approve(self, request, pk=None):
        """Approve a pending auto-remediation execution"""
        execution = self.get_object()
        if execution.status != 'PENDING':
            return Response(
                {'error': 'Only pending remediations can be approved'},
                status=status.HTTP_400_BAD_REQUEST
            )
        if not execution.auto_remediation.requires_approval:
            return Response(
                {'error': 'This remediation does not require approval'},
                status=status.HTTP_400_BAD_REQUEST
            )
        # Check membership in the database instead of loading every approver
        is_approver = execution.auto_remediation.approval_users.filter(
            pk=request.user.pk
        ).exists()
        if not is_approver:
            return Response(
                {'error': 'You do not have permission to approve this remediation'},
                status=status.HTTP_403_FORBIDDEN
            )
        execution.status = 'APPROVED'
        execution.approved_by = request.user
        execution.approved_at = timezone.now()
        execution.approval_notes = request.data.get('approval_notes', '')
        execution.save()
        # TODO: Start actual remediation execution
        serializer = self.get_serializer(execution)
        return Response(serializer.data)
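
    @action(detail=True, methods=['post'])
    def reject(self, request, pk=None):
        """Reject a pending auto-remediation execution"""
        execution = self.get_object()
        if execution.status != 'PENDING':
            return Response(
                {'error': 'Only pending remediations can be rejected'},
                status=status.HTTP_400_BAD_REQUEST
            )
        # Restrict rejection to designated approvers, mirroring approve();
        # previously any authenticated user could reject.
        is_approver = execution.auto_remediation.approval_users.filter(
            pk=request.user.pk
        ).exists()
        if not is_approver:
            return Response(
                {'error': 'You do not have permission to reject this remediation'},
                status=status.HTTP_403_FORBIDDEN
            )
        execution.status = 'REJECTED'
        # The approval fields are reused to record who rejected and when
        execution.approved_by = request.user
        execution.approved_at = timezone.now()
        execution.approval_notes = request.data.get('rejection_notes', '')
        execution.save()
        serializer = self.get_serializer(execution)
        return Response(serializer.data)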


class MaintenanceWindowViewSet(viewsets.ModelViewSet):
    """ViewSet for MaintenanceWindow model"""
    queryset = MaintenanceWindow.objects.all()
    serializer_class = MaintenanceWindowSerializer
    permission_classes = [permissions.IsAuthenticated]
    filter_backends = [DjangoFilterBackend, SearchFilter, OrderingFilter]
    filterset_fields = ['status']
    search_fields = ['name', 'description']
    ordering_fields = ['name', 'start_time', 'end_time', 'created_at']
    ordering = ['start_time']

    def perform_create(self, serializer):
        """Set the creator when creating a maintenance window"""
        serializer.save(created_by=self.request.user)

    @action(detail=False, methods=['get'])
    def active(self, request):
        """Get currently active maintenance windows"""
        now = timezone.now()
        active_windows = self.get_queryset().filter(
            start_time__lte=now,
            end_time__gte=now,
            status='ACTIVE'
        )
        serializer = self.get_serializer(active_windows, many=True)
        return Response(serializer.data)

    @action(detail=False, methods=['get'])
    def upcoming(self, request):
        """Get upcoming maintenance windows"""
        now = timezone.now()
        upcoming_windows = self.get_queryset().filter(
            start_time__gt=now,
            status='SCHEDULED'
        )
        serializer = self.get_serializer(upcoming_windows, many=True)
        return Response(serializer.data)


class WorkflowTemplateViewSet(viewsets.ModelViewSet):
    """ViewSet for WorkflowTemplate model"""
    queryset = WorkflowTemplate.objects.all()
    serializer_class = WorkflowTemplateSerializer
    permission_classes = [permissions.IsAuthenticated]
    filter_backends = [DjangoFilterBackend, SearchFilter, OrderingFilter]
    filterset_fields = ['template_type', 'is_public']
    search_fields = ['name', 'description']
    ordering_fields = ['name', 'created_at', 'usage_count']
    ordering = ['name']

    def get_queryset(self):
        """Filter templates based on user permissions"""
        queryset = super().get_queryset()
        # Non-staff users see only public templates and templates they created
        if not self.request.user.is_staff:
            queryset = queryset.filter(
                Q(is_public=True) | Q(created_by=self.request.user)
            )
        return queryset

    def perform_create(self, serializer):
        """Set the creator when creating a workflow template"""
        serializer.save(created_by=self.request.user)


class WorkflowExecutionViewSet(viewsets.ReadOnlyModelViewSet):
    """ViewSet for WorkflowExecution model (read-only)"""
    queryset = WorkflowExecution.objects.all()
    serializer_class = WorkflowExecutionSerializer
    permission_classes = [permissions.IsAuthenticated]
    filter_backends = [DjangoFilterBackend, SearchFilter, OrderingFilter]
    filterset_fields = ['status', 'workflow_template', 'trigger_type']
    search_fields = ['name', 'workflow_template__name']
    ordering_fields = ['started_at', 'completed_at', 'duration']
    ordering = ['-started_at']

    def get_queryset(self):
        """Filter executions based on user permissions"""
        queryset = super().get_queryset()
        # Non-staff users see only executions they triggered, or executions
        # whose related incident they are assigned to or reported.
        if not self.request.user.is_staff:
            queryset = queryset.filter(
                Q(triggered_by=self.request.user) |
                Q(related_incident__assigned_to=self.request.user) |
                Q(related_incident__reporter=self.request.user)
            )
        return queryset