Updates

2025-09-19 11:58:53 +03:00
parent 306b20e24a
commit 6b247e5b9f
11423 changed files with 1500615 additions and 778 deletions
--- a/ETB-API/incident_intelligence/Documentations/INCIDENT_INTELLIGENCE_API.md
+++ b/ETB-API/incident_intelligence/Documentations/INCIDENT_INTELLIGENCE_API.md
@@ -0,0 +1,363 @@
+# Incident Intelligence API Documentation
+
+## Overview
+
+The Incident Intelligence module provides AI-driven capabilities for incident management, including:
+
+- **AI-driven incident classification** using NLP to categorize incidents from free text
+- **Automated severity suggestion** based on impact analysis
+- **Correlation engine** for linking related incidents and detecting underlying problems
+- **Duplication detection** for merging incidents that describe the same outage
+
+## Features
+
+### 1. AI-Driven Incident Classification
+
+Automatically classifies incidents into categories and subcategories based on their content:
+
+- **Categories**: Infrastructure, Application, Security, User Experience, Data, Integration
+- **Subcategories**: Specific types within each category (e.g., API_ISSUE, DATABASE_ISSUE)
+- **Confidence Scoring**: AI confidence level for each classification
+- **Keyword Extraction**: Identifies relevant keywords from incident text
+- **Sentiment Analysis**: Analyzes the sentiment of incident descriptions
+- **Urgency Detection**: Identifies urgency indicators in the text
+
+### 2. Automated Severity Suggestion
+
+Suggests incident severity based on multiple factors:
+
+- **User Impact Analysis**: Number of affected users and impact level
+- **Business Impact Assessment**: Revenue and operational impact
+- **Technical Impact Evaluation**: System and infrastructure impact
+- **Text Analysis**: Severity indicators in incident descriptions
+- **Confidence Scoring**: AI confidence in severity suggestions
+
+### 3. Correlation Engine
+
+Links related incidents and detects patterns:
+
+- **Correlation Types**: Same Service, Same Component, Temporal, Pattern Match, Dependency, Cascade
+- **Problem Detection**: Identifies when correlations suggest larger problems
+- **Time-based Analysis**: Considers temporal proximity of incidents
+- **Service Similarity**: Analyzes shared services and components
+- **Pattern Recognition**: Detects recurring issues and trends
+
+### 4. Duplication Detection
+
+Identifies and manages duplicate incidents:
+
+- **Duplication Types**: Exact, Near Duplicate, Similar, Potential Duplicate
+- **Similarity Analysis**: Text, temporal, and service similarity scoring
+- **Merge Recommendations**: Suggests actions (Merge, Link, Review, No Action)
+- **Confidence Scoring**: AI confidence in duplication detection
+- **Shared Elements**: Identifies common elements between incidents
+
+## API Endpoints
+
+### Incidents
+
+#### Create Incident
+```http
+POST /api/incidents/incidents/
+Content-Type: application/json
+
+{
+    "title": "Database Connection Timeout",
+    "description": "Users are experiencing timeouts when trying to access the database",
+    "free_text": "Database is down, can't connect, getting timeout errors",
+    "affected_users": 150,
+    "business_impact": "Critical business operations are affected",
+    "reporter": 1
+}
+```
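+
+The same request can be built from Python. The following is a minimal sketch using only the standard library. The base URL, the token value, and the `Token` authorization scheme are assumptions for illustration; the documentation states only that endpoints require authentication. The helper name `build_create_incident_request` is likewise hypothetical.
+
+```python
+import json
+import urllib.request
+
+# Assumptions (not from the documentation): the host and the token-based
+# Authorization header. Adjust both to match the actual deployment.
+BASE_URL = "http://localhost:8000"
+API_TOKEN = "replace-me"
+
+
+def build_create_incident_request(payload: dict) -> urllib.request.Request:
+    """Build (but do not send) the POST request for the Create Incident endpoint."""
+    return urllib.request.Request(
+        url=f"{BASE_URL}/api/incidents/incidents/",
+        data=json.dumps(payload).encode("utf-8"),
+        headers={
+            "Content-Type": "application/json",
+            "Authorization": f"Token {API_TOKEN}",  # hypothetical auth scheme
+        },
+        method="POST",
+    )
+
+
+payload = {
+    "title": "Database Connection Timeout",
+    "description": "Users are experiencing timeouts when trying to access the database",
+    "free_text": "Database is down, can't connect, getting timeout errors",
+    "affected_users": 150,
+    "business_impact": "Critical business operations are affected",
+    "reporter": 1,
+}
+request = build_create_incident_request(payload)
+
+# Sending requires a running server, so it is left commented out here:
+# with urllib.request.urlopen(request) as response:
+#     created = json.load(response)
+```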
+
+#### Get Incident Analysis
+```http
+GET /api/incidents/incidents/{id}/analysis/
+```
+
+Returns comprehensive AI analysis including:
+- Classification results
+- Severity suggestions
+- Correlations with other incidents
+- Potential duplicates
+- Associated patterns
+
+#### Trigger AI Analysis
+```http
+POST /api/incidents/incidents/{id}/analyze/
+```
+
+Manually triggers AI analysis for a specific incident.
+
+#### Get Incident Statistics
+```http
+GET /api/incidents/incidents/stats/
+```
+
+Returns statistics including:
+- Total incidents by status and severity
+- Average resolution time
+- AI processing statistics
+- Duplicate and correlation counts
+
+### Correlations
+
+#### Get Correlations
+```http
+GET /api/incidents/correlations/
+```
+
+#### Get Problem Indicators
+```http
+GET /api/incidents/correlations/problem_indicators/
+```
+
+Returns correlations that indicate larger problems.
+
+### Duplications
+
+#### Get Duplications
+```http
+GET /api/incidents/duplications/
+```
+
+#### Approve Merge
+```http
+POST /api/incidents/duplications/{id}/approve_merge/
+```
+
+#### Reject Merge
+```http
+POST /api/incidents/duplications/{id}/reject_merge/
+```
+
+### Patterns
+
+#### Get Patterns
+```http
+GET /api/incidents/patterns/
+```
+
+#### Get Active Patterns
+```http
+GET /api/incidents/patterns/active_patterns/
+```
+
+#### Resolve Pattern
+```http
+POST /api/incidents/patterns/{id}/resolve_pattern/
+```
+
+## Data Models
+
+### Incident
+- **id**: UUID primary key
+- **title**: Incident title
+- **description**: Detailed description
+- **free_text**: Original free text from user
+- **category**: AI-classified category
+- **subcategory**: AI-classified subcategory
+- **severity**: Current severity level
+- **suggested_severity**: AI-suggested severity
+- **status**: Current status (Open, In Progress, Resolved, Closed)
+- **assigned_to**: Assigned user
+- **reporter**: User who reported the incident
+- **affected_users**: Number of affected users
+- **business_impact**: Business impact description
+- **ai_processed**: Whether AI analysis has been completed
+- **is_duplicate**: Whether this is a duplicate incident
+
+### IncidentClassification
+- **incident**: Related incident
+- **predicted_category**: AI-predicted category
+- **predicted_subcategory**: AI-predicted subcategory
+- **confidence_score**: AI confidence (0.0-1.0)
+- **alternative_categories**: Alternative predictions
+- **extracted_keywords**: Keywords extracted from text
+- **sentiment_score**: Sentiment analysis score (-1.0 to 1.0)
+- **urgency_indicators**: Detected urgency indicators
+
+### SeveritySuggestion
+- **incident**: Related incident
+- **suggested_severity**: AI-suggested severity
+- **confidence_score**: AI confidence (0.0-1.0)
+- **user_impact_score**: User impact score (0.0-1.0)
+- **business_impact_score**: Business impact score (0.0-1.0)
+- **technical_impact_score**: Technical impact score (0.0-1.0)
+- **reasoning**: AI explanation for suggestion
+- **impact_factors**: Factors that influenced the severity
+
+### IncidentCorrelation
+- **primary_incident**: Primary incident in correlation
+- **related_incident**: Related incident
+- **correlation_type**: Type of correlation
+- **confidence_score**: Correlation confidence (0.0-1.0)
+- **correlation_strength**: Strength of correlation
+- **shared_keywords**: Keywords shared between incidents
+- **time_difference**: Time difference between incidents
+- **similarity_score**: Overall similarity score
+- **is_problem_indicator**: Whether this suggests a larger problem
+
+### DuplicationDetection
+- **incident_a**: First incident in pair
+- **incident_b**: Second incident in pair
+- **duplication_type**: Type of duplication
+- **similarity_score**: Overall similarity score
+- **confidence_score**: Duplication confidence (0.0-1.0)
+- **text_similarity**: Text similarity score
+- **temporal_proximity**: Temporal proximity score
+- **service_similarity**: Service similarity score
+- **recommended_action**: Recommended action (Merge, Link, Review, No Action)
+- **status**: Current status (Detected, Reviewed, Merged, Rejected)
+
+### IncidentPattern
+- **name**: Pattern name
+- **pattern_type**: Type of pattern (Recurring, Seasonal, Trend, Anomaly)
+- **description**: Pattern description
+- **frequency**: How often the pattern occurs
+- **affected_services**: Services affected by the pattern
+- **common_keywords**: Common keywords in pattern incidents
+- **incidents**: Related incidents
+- **confidence_score**: Pattern confidence (0.0-1.0)
+- **is_active**: Whether the pattern is active
+- **is_resolved**: Whether the pattern is resolved
+
+## AI Components
+
+### IncidentClassifier
+- **Categories**: Predefined categories with keywords
+- **Keyword Extraction**: Extracts relevant keywords from text
+- **Sentiment Analysis**: Analyzes sentiment of incident text
+- **Urgency Detection**: Identifies urgency indicators
+- **Confidence Scoring**: Provides confidence scores for classifications
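+
+To illustrate how predefined categories with keywords can yield both a category and a confidence score, here is a minimal sketch. It is not the module's actual `IncidentClassifier`: the keyword lists, the `classify` function, and the confidence formula (the winning category's share of all keyword hits) are assumptions made for this example.
+
+```python
+import re
+from typing import List, Optional, Tuple
+
+# Hypothetical keyword lists; the real IncidentClassifier ships its own.
+CATEGORY_KEYWORDS = {
+    "Infrastructure": {"server", "network", "disk", "cpu", "outage"},
+    "Application": {"api", "bug", "crash", "exception", "timeout"},
+    "Security": {"breach", "unauthorized", "attack", "vulnerability"},
+    "Data": {"database", "corruption", "backup", "query"},
+}
+
+
+def classify(text: str) -> Tuple[Optional[str], float, List[str]]:
+    """Return (category, confidence, matched_keywords) for free text."""
+    tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
+    hits = {cat: sorted(tokens & kws) for cat, kws in CATEGORY_KEYWORDS.items()}
+    total = sum(len(matched) for matched in hits.values())
+    if total == 0:
+        return None, 0.0, []
+    best = max(hits, key=lambda cat: len(hits[cat]))
+    # Confidence = share of all keyword hits that belong to the winning category.
+    return best, len(hits[best]) / total, hits[best]
+
+
+category, confidence, keywords = classify(
+    "Database query timeout, backup is corrupt"
+)
+```
+
+With this input, three of the four keyword hits fall under `Data`, so that category wins with a confidence of 0.75.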
+
+### SeverityAnalyzer
+- **Impact Analysis**: Analyzes user, business, and technical impact
+- **Severity Indicators**: Identifies severity keywords in text
+- **Weighted Scoring**: Combines multiple factors for severity suggestion
+- **Reasoning Generation**: Provides explanations for severity suggestions
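+
+Weighted scoring can be sketched as a linear combination of the three impact scores, each in the 0.0-1.0 range used by `SeveritySuggestion`. The weights, thresholds, severity labels, and the `suggest_severity` function below are hypothetical and chosen only to show the idea; the actual `SeverityAnalyzer` may combine factors differently.
+
+```python
+from typing import Tuple
+
+# Hypothetical weights and thresholds, chosen only for illustration.
+WEIGHTS = {"user": 0.4, "business": 0.4, "technical": 0.2}
+THRESHOLDS = [(0.8, "CRITICAL"), (0.6, "HIGH"), (0.3, "MEDIUM"), (0.0, "LOW")]
+
+
+def suggest_severity(user: float, business: float, technical: float) -> Tuple[str, float]:
+    """Combine three impact scores (each 0.0-1.0) into a severity label."""
+    score = (
+        WEIGHTS["user"] * user
+        + WEIGHTS["business"] * business
+        + WEIGHTS["technical"] * technical
+    )
+    # Thresholds are ordered from highest to lowest; the first match wins.
+    for minimum, label in THRESHOLDS:
+        if score >= minimum:
+            return label, score
+    return "LOW", score
+
+
+severity, score = suggest_severity(user=0.9, business=0.8, technical=0.5)
+```
+
+Here the combined score is 0.4 × 0.9 + 0.4 × 0.8 + 0.2 × 0.5 = 0.78, which falls in the `HIGH` band under these example thresholds.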
+
+### IncidentCorrelationEngine
+- **Similarity Analysis**: Calculates various similarity metrics
+- **Temporal Analysis**: Considers time-based correlations
+- **Service Analysis**: Analyzes service and component similarities
+- **Problem Detection**: Identifies patterns that suggest larger problems
+- **Cluster Detection**: Groups related incidents into clusters
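+
+One way to combine keyword overlap with temporal proximity is shown below. This is a sketch under stated assumptions rather than the engine's real logic: the dictionary keys `keywords` and `created_at`, the 24-hour window, and the 0.6/0.4 weighting are all hypothetical.
+
+```python
+from datetime import datetime, timedelta
+
+
+def jaccard(a: set, b: set) -> float:
+    """Share of keywords that two incidents have in common."""
+    return len(a & b) / len(a | b) if a | b else 0.0
+
+
+def temporal_score(t1: datetime, t2: datetime, window: timedelta) -> float:
+    """1.0 for simultaneous incidents, falling linearly to 0.0 at the window edge."""
+    gap = abs(t1 - t2)
+    return max(0.0, 1.0 - gap / window)
+
+
+def correlation_score(inc_a: dict, inc_b: dict) -> float:
+    # Hypothetical weighting: keywords count slightly more than timing.
+    keyword_part = jaccard(set(inc_a["keywords"]), set(inc_b["keywords"]))
+    time_part = temporal_score(
+        inc_a["created_at"], inc_b["created_at"], timedelta(hours=24)
+    )
+    return 0.6 * keyword_part + 0.4 * time_part
+
+
+a = {"keywords": ["database", "timeout"], "created_at": datetime(2025, 9, 19, 10, 0)}
+b = {"keywords": ["database", "latency"], "created_at": datetime(2025, 9, 19, 16, 0)}
+score = correlation_score(a, b)
+```
+
+The two sample incidents share one of three distinct keywords (1/3) and are six hours apart (0.75 on a 24-hour window), giving 0.6 × 1/3 + 0.4 × 0.75 = 0.5.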
+
+### DuplicationDetector
+- **Text Similarity**: Multiple text similarity algorithms
+- **Temporal Proximity**: Time-based duplication detection
+- **Service Similarity**: Service and component similarity
+- **Metadata Similarity**: Similarity based on incident metadata
+- **Merge Recommendations**: Suggests appropriate actions
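+
+The link between a similarity score and a recommended action can be sketched with the standard library's `difflib`. The thresholds and the function names `text_similarity` and `recommend_action` are assumptions for illustration; the real `DuplicationDetector` uses several similarity algorithms and its own cut-offs.
+
+```python
+from difflib import SequenceMatcher
+
+# Hypothetical thresholds; tune against real incident data.
+ACTION_THRESHOLDS = [(0.9, "Merge"), (0.75, "Link"), (0.5, "Review")]
+
+
+def text_similarity(a: str, b: str) -> float:
+    """Case-insensitive similarity ratio in [0.0, 1.0] from difflib."""
+    return SequenceMatcher(None, a.lower(), b.lower()).ratio()
+
+
+def recommend_action(similarity: float) -> str:
+    """Map a similarity score to one of the documented actions."""
+    for minimum, action in ACTION_THRESHOLDS:
+        if similarity >= minimum:
+            return action
+    return "No Action"
+
+
+same = text_similarity("Database connection timeout", "database connection timeout")
+different = text_similarity("Database connection timeout", "Login page typo")
+```
+
+Titles that differ only in letter case score 1.0 and map to `Merge`, while unrelated titles fall below every threshold and map to `No Action`.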
+
+## Background Processing
+
+The module uses Celery to run AI analysis in the background.
+
+### Tasks
+- **process_incident_ai**: Processes a single incident with AI analysis
+- **batch_process_incidents_ai**: Processes multiple incidents
+- **find_correlations**: Finds correlations for an incident
+- **find_duplicates**: Finds duplicates for an incident
+- **detect_all_duplicates**: Batch duplicate detection
+- **correlate_all_incidents**: Batch correlation analysis
+- **merge_duplicate_incidents**: Merges duplicate incidents
+
+### Processing Logs
+All AI processing activities are logged in the `AIProcessingLog` model for audit and debugging purposes.
+
+## Setup and Configuration
+
+### 1. Install Dependencies
+```bash
+pip install -r requirements.txt
+```
+
+### 2. Run Migrations
+```bash
+python manage.py makemigrations incident_intelligence
+python manage.py migrate
+```
+
+### 3. Create Sample Data
+```bash
+python manage.py setup_incident_intelligence --create-sample-data --create-patterns
+```
+
+### 4. Run AI Analysis
+```bash
+python manage.py setup_incident_intelligence --run-ai-analysis
+```
+
+### 5. Start Celery Worker
+```bash
+celery -A core worker -l info
+```
+
+## Usage Examples
+
+### Creating an Incident with AI Analysis
+```python
+from incident_intelligence.models import Incident
+from incident_intelligence.tasks import process_incident_ai
+
+# Create incident
+incident = Incident.objects.create(
+    title="API Response Slow",
+    description="The user service API is responding slowly",
+    free_text="API is slow, taking forever to respond",
+    affected_users=50,
+    business_impact="User experience is degraded"
+)
+
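+# Queue AI analysis as a background Celery task. The UUID primary key is
+# passed as a string so that it serializes cleanly as a task argument.
+process_incident_ai.delay(str(incident.id))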
+```
+
+### Finding Correlations
+```python
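+from incident_intelligence.ai.correlation import IncidentCorrelationEngine
+
+# `incident_data` is the incident under analysis and `all_incidents` the
+# candidates to compare against; both must be prepared by the caller.
+engine = IncidentCorrelationEngine()
+correlations = engine.find_related_incidents(incident_data, all_incidents)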
+```
+
+### Detecting Duplicates
+```python
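+from incident_intelligence.ai.duplication import DuplicationDetector
+
+# `incident_data` is the incident under analysis and `all_incidents` the
+# candidates to compare against; both must be prepared by the caller.
+detector = DuplicationDetector()
+duplicates = detector.find_duplicate_candidates(incident_data, all_incidents)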
+```
+
+## Performance Considerations
+
+- **Batch Processing**: Use batch operations for large datasets
+- **Caching**: Consider caching frequently accessed data
+- **Indexing**: Database indexes are configured for optimal query performance
+- **Background Tasks**: AI processing runs asynchronously to avoid blocking requests
+- **Rate Limiting**: Consider implementing rate limiting for API endpoints
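+
+For batch processing, incident IDs can be split into fixed-size chunks before they are queued. The helper below is a minimal sketch; `chunked` is a hypothetical name, and the commented-out call assumes that `batch_process_incidents_ai` accepts a list of IDs, which this documentation does not confirm.
+
+```python
+from itertools import islice
+from typing import Iterable, Iterator, List
+
+
+def chunked(ids: Iterable[str], size: int) -> Iterator[List[str]]:
+    """Yield successive lists of at most `size` incident IDs."""
+    if size < 1:
+        raise ValueError("size must be at least 1")
+    iterator = iter(ids)
+    while True:
+        batch = list(islice(iterator, size))
+        if not batch:
+            return
+        yield batch
+
+
+incident_ids = [f"incident-{n}" for n in range(7)]
+batches = list(chunked(incident_ids, size=3))
+
+# Each batch could then be handed to a background task, for example
+# (assumed signature, see the note above):
+# for batch in batches:
+#     batch_process_incidents_ai.delay(batch)
+```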
+
+## Security Considerations
+
+- **Authentication**: All endpoints require authentication
+- **Authorization**: Users can only access incidents they have permission to view
+- **Data Privacy**: Sensitive information is handled according to data classification levels
+- **Audit Logging**: All AI processing activities are logged for audit purposes
+
+## Monitoring and Maintenance
+
+- **Processing Logs**: Monitor AI processing logs for errors and performance
+- **Model Performance**: Track AI model accuracy and update as needed
+- **Database Maintenance**: Regularly clean up old processing logs and resolved incidents
+- **Health Checks**: Monitor Celery workers and Redis for background processing health
+
+## Future Enhancements
+
+- **Machine Learning Models**: Integration with more sophisticated ML models
+- **Real-time Processing**: Real-time incident analysis and correlation
+- **Advanced NLP**: More sophisticated natural language processing
+- **Predictive Analytics**: Predictive incident analysis and prevention
+- **Integration APIs**: APIs for integrating with external incident management systems