Delayed AWS Metrics and Events

highDatadogMonitoringMay 2, 2025 00:56Duration: 1h 16m
api
API Issue

Summary

This incident has been resolved.

Impact

major

Timeline

May 2, 2025 00:56

[investigating] We are investigating increased latency processing AWS metrics and events. As a result of this issue, some users may see delays or gaps in graphs that contain these metrics and events. To prevent spurious alerts, we have temporarily disabled monitors based on this data.

via statuspage
+23m
May 2, 2025 01:19

[identified] The issue has been identified and a fix is being implemented.

via statuspage
+33m
May 2, 2025 01:52

[identified] A fix has been implemented and recovery is in progress. To prevent spurious alerts, monitors on AWS Metrics and Events remain disabled until recovery is complete.

via statuspage
+21m
May 2, 2025 02:13

[resolved] This incident has been resolved.

via statuspage

Lessons Learned

Datadog has experienced 33 incidents in the past year. This frequency suggests systemic reliability challenges that may warrant additional monitoring.

📊Incidents related to api have occurred 185 times across all providers in the past year. This is one of the most common failure categories in cloud infrastructure.

💡This incident is categorized as: API Issue. Consider implementing preventive measures specific to this failure category.