File size: 5,972 Bytes
7c4d825 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 |
flowchart TD
classDef processNode fill:#dae8fc,stroke:#6c8ebf,color:#000
classDef imageNode fill:#d5e8d4,stroke:#82b366,color:#000
classDef networkNode fill:#ffe6cc,stroke:#d79b00,color:#000
classDef errorNode fill:#f8cecc,stroke:#b85450,color:#000
classDef successNode fill:#d5e8d4,stroke:#82b366,color:#000
classDef configNode fill:#fff2cc,stroke:#d6b656,color:#000
classDef metricNode fill:#e1d5e7,stroke:#9673a6,color:#000
Input[/"Image URLs List"/]:::imageNode --> BatchProcess
BatchProcess["Batch Processing Initialization
• Setup session with retries
• Configure headers and timeouts
• Initialize batch logging"]:::processNode --> ContextLoop
ContextLoop["For Each Context:
Check skip conditions
Validate URL format"]:::processNode --> SkipCheck
SkipCheck{"Skip Conditions
skip_run OR skip_processing?"}:::processNode
SkipCheck -->|Yes| SkipLogging["Log Skip Reason
Update skip counter"]:::processNode
SkipCheck -->|No| DownloadAttempt
DownloadAttempt["Download Attempt
• Apply retry strategy
• Exponential backoff
• Validate content type"]:::networkNode --> ContentValidation
ContentValidation{"Content Validation
Is valid image?
Check MIME type
Verify file signature"}:::processNode
ContentValidation -->|Valid| ImageProcessing["Image Processing
• Open with PIL
• Extract EXIF data
• Get original dimensions
• Store content in context"]:::successNode
ContentValidation -->|Invalid| RetryLogic{"Retry Logic
Attempts < MAX_RETRIES?
Check error type"}:::errorNode
RetryLogic -->|Yes| BackoffDelay["Exponential Backoff
delay = RETRY_DELAY * (BACKOFF_MULTIPLIER ^ attempt)
Wait before retry"]:::errorNode
BackoffDelay --> DownloadAttempt
RetryLogic -->|No| MarkFailed["Mark Context as Failed
Set skip_run = True
Log failure details"]:::errorNode
ImageProcessing --> CleanupContent["Cleanup Downloaded Content
Remove _download_content
Free memory"]:::processNode
CleanupContent --> UpdateLogs["Update Batch Logs
• Record success metrics
• Store performance data
• Update counters"]:::metricNode
MarkFailed --> UpdateLogs
SkipLogging --> UpdateLogs
UpdateLogs --> CheckNext{"More Contexts?"}:::processNode
CheckNext -->|Yes| ContextLoop
CheckNext -->|No| FinalSummary
FinalSummary["Generate Final Summary
• Calculate success rate
• Analyze error distribution
• Generate performance metrics
• Check for batch abort conditions"]:::metricNode --> AbortCheck
AbortCheck{"Any Download Errors?
error_count > 0"}:::processNode
AbortCheck -->|Yes| BatchAbort["Batch Abort Logic
Mark all contexts skip_run = True
Log abort reason"]:::errorNode
AbortCheck -->|No| Output
BatchAbort --> Output
Output[/"Processing Complete
Updated contexts with images or skip flags"/]:::imageNode
subgraph RetryStrategy["Retry Strategy Configuration"]
RetryConfig["Session Retry Strategy:
• MAX_RETRIES_PER_REQUEST = 2
• Status codes: [429, 500, 502, 503, 504]
• Backoff factor: 1
• Allowed methods: [GET]"]:::configNode
CustomRetry["Application Retry Strategy:
• MAX_RETRIES = 3
• RETRY_DELAY = 2 seconds
• BACKOFF_MULTIPLIER = 1.5
• Exponential backoff calculation"]:::configNode
SessionConfig["Session Configuration:
• User-Agent: Mozilla/5.0 (compatible; ImageProcessor/1.0)
• Accept: image/*
• Accept-Encoding: gzip, deflate
• Connection: keep-alive
• Timeout: BATCH_DOWNLOAD_TIMEOUT (30s)"]:::configNode
RetryConfig --> CustomRetry --> SessionConfig
end
BatchProcess -.-> RetryStrategy
subgraph ErrorHandling["Error Categorization & Handling"]
NetworkErrors["Network Errors:
• Connection timeouts
• DNS resolution failures
• SSL certificate issues
• Socket errors"]:::errorNode
ContentErrors["Content Errors:
• Non-image MIME types
• Corrupted image data
• Empty responses
• Invalid file signatures"]:::errorNode
ServerErrors["Server Errors:
• HTTP 4xx/5xx responses
• Rate limiting (429)
• Server unavailable (503)
• Gateway errors (502, 504)"]:::errorNode
ProcessingErrors["Processing Errors:
• PIL image opening failures
• Memory allocation errors
• File format unsupported"]:::errorNode
NetworkErrors --> ContentErrors --> ServerErrors --> ProcessingErrors
end
DownloadAttempt -.-> ErrorHandling
subgraph PerformanceMetrics["Performance Tracking"]
DownloadMetrics["Download Metrics:
• Download time per image
• Content size tracking
• Attempts per success
• Bandwidth utilization"]:::metricNode
BatchMetrics["Batch Metrics:
• Total processing time
• Success rate calculation
• Error rate by category
• Resource utilization"]:::metricNode
QualityMetrics["Quality Metrics:
• Image dimensions
• File format distribution
• Content type validation
• Error pattern analysis"]:::metricNode
DownloadMetrics --> BatchMetrics --> QualityMetrics
end
UpdateLogs -.-> PerformanceMetrics
subgraph SecurityMeasures["Security & Validation"]
ContentValidation2["Content Type Validation:
• HTTP Content-Type header
• PIL format detection
• Magic byte verification
• File extension matching"]:::configNode
SecurityHeaders["Security Headers:
• User-Agent masking
• Accept header specification
• Connection management
• Timeout enforcement"]:::configNode
MemoryProtection["Memory Protection:
• Streaming downloads
• Content size limits
• Immediate cleanup
• Resource monitoring"]:::configNode
ContentValidation2 --> SecurityHeaders --> MemoryProtection
end
ContentValidation -.-> SecurityMeasures |