Disruption with some GitHub services

Severity: low · Provider: GitHub · Mar 5, 2026 01:13 · Duration: 9m
Tags: network, compute, database, api, authentication, deployment, configuration, capacity, routing
Categories: Configuration Error, Database Overload, Network / Routing, Capacity Issue, Deployment Failure, Authentication Issue, API Issue

Summary

From Feb 2, 2026 17:13 UTC to Feb 2, 2026 17:36 UTC we experienced failures on ~0.02% of Git operations. While deploying an internal service, a misconfiguration caused a small subset of traffic to route to a service that was not ready. During the incident we observed the degradation and statused publicly.

To mitigate the issue, traffic was redirected to healthy instances and we resumed normal operation.

We are improving our monitoring and deployment processes in this area to avoid future routing issues.

Impact

minor

Timeline

Jan 22, 2026 14:12

[investigating] We are investigating reports of impacted performance for some GitHub services.

via statuspage
+11m
Jan 22, 2026 14:23

[investigating] Issues is experiencing degraded performance. We are continuing to investigate.

via statuspage
+4m
Jan 22, 2026 14:27

[investigating] Issues is operating normally.

via statuspage
+55m
Jan 22, 2026 15:22

[investigating] We have identified and mitigated an issue in one of our services. Services have recovered, and while the mitigation is in place we are working on a longer-term solution.

via statuspage
+0m
Jan 22, 2026 15:22

[resolved] On January 22, 2026, our authentication service experienced an issue between 14:00 UTC and 14:50 UTC, resulting in downstream disruptions for users.

From 14:00 UTC to 14:23 UTC, authenticated API requests saw higher-than-normal error rates, averaging 16.9% and occasionally peaking at 22.2%, returned as HTTP 401 responses.

From 14:00 UTC to 14:50 UTC, Git operations over HTTP were impacted, with error rates averaging 3.8% and peaking at 10.8%. As a result, some users may have been unable to run git commands as expected.

This was due to the authentication service reaching the maximum allowed number of database connections. We mitigated the incident by increasing the maximum number of database connections in the authentication service.

We are adding additional monitoring around database connection pool usage and improving our traffic projections to reduce our time to detection and mitigation of issues like this in the future.

via statuspage
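The January 22 postmortem above attributes the outage to the authentication service exhausting its database connection pool, and names pool-usage monitoring as the follow-up. A minimal sketch of such a monitor is below; the function name, `WARN_RATIO` threshold, and message formats are illustrative assumptions, not GitHub's internal tooling.

```python
# Hypothetical pool-utilization alerting, sketching the monitoring the
# postmortem says is being added. Thresholds here are made up.

WARN_RATIO = 0.8  # alert well before the hard connection limit is hit


def pool_alerts(in_use: int, max_connections: int) -> list[str]:
    """Return alert messages for the current connection pool utilization."""
    alerts = []
    ratio = in_use / max_connections
    if in_use >= max_connections:
        # The failure mode in the incident: no connections left to hand out.
        alerts.append("CRITICAL: connection pool exhausted")
    elif ratio >= WARN_RATIO:
        alerts.append(f"WARN: pool at {ratio:.0%} of {max_connections}")
    return alerts
```

Alerting on a warning ratio rather than only on exhaustion is what shortens time to detection: the pager fires while requests are still succeeding.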
+266h 11m
Feb 2, 2026 17:34

[investigating] We are investigating reports of impacted performance for some GitHub services.

via statuspage
+1m
Feb 2, 2026 17:35

[investigating] Git Operations is experiencing degraded performance. We are continuing to investigate.

via statuspage
+6m
Feb 2, 2026 17:41

[investigating] We are investigating reports of impacted performance for some GitHub services.

via statuspage
+1m
Feb 2, 2026 17:42

[investigating] We’ve observed a low rate (~0.01%) of 5xx errors for HTTP-based fetches and clones. We’re currently routing traffic away from the affected location and are seeing recovery.

via statuspage
+1m
Feb 2, 2026 17:43

[resolved] From Feb 2, 2026 17:13 UTC to Feb 2, 2026 17:36 UTC we experienced failures on ~0.02% of Git operations. While deploying an internal service, a misconfiguration caused a small subset of traffic to route to a service that was not ready. During the incident we observed the degradation and statused publicly.

To mitigate the issue, traffic was redirected to healthy instances and we resumed normal operation.

We are improving our monitoring and deployment processes in this area to avoid future routing issues.

via statuspage
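The Feb 2 postmortem above describes traffic being routed to a service instance that was not yet ready, mitigated by redirecting to healthy instances. A readiness-gated router is the usual safeguard; the sketch below is an assumption about the general technique, with `Backend` and its fields invented for illustration.

```python
# Illustrative readiness-gated round-robin routing: requests only ever
# reach backends that report ready. Not GitHub's deployment tooling.
from dataclasses import dataclass
import itertools


@dataclass
class Backend:
    name: str
    ready: bool  # e.g. result of a readiness probe


def route(backends, requests):
    """Round-robin each request across ready backends only."""
    healthy = [b for b in backends if b.ready]
    if not healthy:
        raise RuntimeError("no ready backends")
    rr = itertools.cycle(healthy)
    return [(req, next(rr).name) for req in requests]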
+15m
Feb 2, 2026 17:58

[investigating] Dependabot is currently experiencing an issue that may cause scheduled update jobs to fail when creating pull requests.<br /><br />Our team has identified the problem and deployed a fix. We’re seeing signs of recovery and expect full resolution within the next few hours.

via statuspage
+48m
Feb 2, 2026 18:46

[resolved] From Jan 31, 2026 00:30 UTC to Feb 2, 2026 18:00 UTC the Dependabot service was degraded and failed to create 10% of automated pull requests. This was due to a cluster failover that routed connections to a read-only database.

We mitigated the incident by pausing Dependabot queues until traffic was properly routed to healthy clusters. We're working on identifying and rerunning all jobs that failed during this time.

We're adding new monitors and alerts to reduce our time to detection and prevent this in the future.

via statuspage
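The Dependabot postmortem above describes jobs failing against a read-only database after a failover, mitigated by pausing the queues. One hedged sketch of that pattern: check writability before each job and requeue instead of failing. `is_writable` stands in for a real probe such as MySQL's `SELECT @@read_only`; the rest is invented for illustration.

```python
# Assumed pattern: pause/requeue queue jobs while the database connection
# points at a read-only replica, rather than letting each job fail.

def process_jobs(jobs, is_writable):
    """Run jobs only while the database accepts writes; requeue otherwise."""
    done, paused = [], []
    for job in jobs:
        if is_writable():
            done.append(job)
        else:
            paused.append(job)  # requeued, not failed: rerun after recovery
    return done, paused
```

The requeue list directly supports the "identify and rerun failed jobs" follow-up in the report: nothing is lost while writes are unavailable.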
+188h 20m
Feb 10, 2026 15:07

[investigating] We are investigating reports of impacted performance for some GitHub services.

via statuspage
+1m
Feb 10, 2026 15:08

[investigating] We are seeing intermittent timeouts on some pages and are investigating.

via statuspage
+0m
Feb 10, 2026 15:08

[investigating] Pull Requests is experiencing degraded performance. We are continuing to investigate.

via statuspage
+25m
Feb 10, 2026 15:33

[investigating] We continue investigating intermittent timeouts on some pages.

via statuspage
+14m
Feb 10, 2026 15:47

[investigating] We believe we have found the cause of the problem and are working on mitigation.

via statuspage
+4m
Feb 10, 2026 15:51

[investigating] We have deployed a mitigation for the issue and are observing what we believe is the start of recovery. We will continue to monitor.

via statuspage
+8m
Feb 10, 2026 15:58

[investigating] Pull Requests is operating normally.

via statuspage
+0m
Feb 10, 2026 15:58

[resolved] On February 10th, 2026, between 14:35 UTC and 15:58 UTC, web experiences on GitHub.com, including Pull Requests and Authentication, were degraded, resulting in intermittent 5xx errors and timeouts. The error rate on web traffic peaked at approximately 2%. This was due to increased load on a critical database, which caused significant memory pressure resulting in intermittent errors.

We mitigated the incident by applying a configuration change to the database to increase available memory on the host.

We are working to identify changes in load patterns and are reviewing the configuration of our databases to ensure there is sufficient capacity to meet growth. Additionally, we are improving monitoring and self-healing functionality for database memory issues to reduce our time to detection and mitigation.

via statuspage
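The February 10 postmortem above names self-healing for database memory pressure as a follow-up. A toy decision function for that idea is sketched below; the thresholds and action names are entirely assumed, not anything GitHub has described.

```python
# Hypothetical escalation ladder for database memory pressure: shed
# memory early, fail over only as a last resort. Thresholds are made up.

def memory_action(used_gb: float, total_gb: float) -> str:
    """Map current memory utilization to a remediation step."""
    ratio = used_gb / total_gb
    if ratio >= 0.95:
        return "failover"      # drastic: promote a healthy replica
    if ratio >= 0.85:
        return "flush-caches"  # shed memory before errors start
    return "ok"
```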
+26h 59m
Feb 11, 2026 18:58

[investigating] We are investigating reports of impacted performance for some GitHub services.

via statuspage
+2m
Feb 11, 2026 19:00

[investigating] Actions is experiencing capacity constraints with larger hosted runners, leading to high wait times. Standard hosted labels and self-hosted runners are not impacted.

We're working with the capacity provider to mitigate the impact.

via statuspage
+37m
Feb 11, 2026 19:37

[investigating] We're continuing to work toward mitigation with our capacity provider, and adding capacity.

via statuspage
+1h 56m
Feb 11, 2026 21:33

[investigating] Actions is experiencing capacity constraints with larger hosted runners, leading to high wait times. Standard hosted labels and self-hosted runners are not impacted.

The issue is mitigated and we are monitoring recovery.

via statuspage
+3h 26m
Feb 12, 2026 00:59

[resolved] On February 11 between 16:37 UTC and 00:59 UTC the following day, 4.7% of workflows running on GitHub Larger Hosted Runners were delayed by an average of 37 minutes. Standard Hosted and self-hosted runners were not impacted.

This incident was caused by capacity degradation in Central US for Larger Hosted Runners. Workloads not pinned to that region were picked up by other regions, but were delayed as those regions became saturated. Workloads configured with private networking in that region were delayed until compute capacity there recovered. The issue was mitigated by rebalancing capacity across internal and external workloads, with general increases in capacity in affected regions to speed recovery.

In addition to working with our compute partners on the core capacity degradation, we are working to ensure other regions are better able to absorb load with less delay to customer workloads. For pinned workflows using private networking, we will soon ship support for customers to fail over when private networking is configured in a paired region.

via statuspage
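The runner-capacity postmortem above distinguishes two scheduling behaviors: unpinned workloads spill over to other healthy regions, while region-pinned workloads must wait for their region to recover. A minimal sketch of that assignment rule follows; region names and the function shape are illustrative assumptions.

```python
# Assumed spillover rule: pinned workloads wait for their region,
# unpinned workloads take any healthy region. None means "must wait".

def assign_region(pinned_region, healthy_regions):
    """Pick a region for a workload, honoring pinning constraints."""
    if pinned_region is not None:
        # Pinned (e.g. private networking): only its own region will do.
        return pinned_region if pinned_region in healthy_regions else None
    # Unpinned: spill over to the first healthy region available.
    return healthy_regions[0] if healthy_regions else None
```

This is exactly why the report's paired-region failover matters: it turns the `None` (wait) case for pinned workloads into a second eligible region.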
+9h 39m
Feb 12, 2026 10:38

[investigating] We are investigating reports of impacted performance for some GitHub services.

via statuspage
+1m
Feb 12, 2026 10:39

[investigating] We are investigating an issue with downloading repository archives that include Git LFS objects.

via statuspage
+22m
Feb 12, 2026 11:01

[investigating] We have resolved the issue and are seeing full recovery.

via statuspage
+10m
Feb 12, 2026 11:12

[resolved] From Feb 12, 2026 09:16 UTC to Feb 12, 2026 11:01 UTC, users attempting to download repository archives (tar.gz/zip) that include Git LFS objects received errors. Standard repository archives without LFS objects were not affected. On average, the archive download error rate was 0.0042% and peaked at 0.0339% of requests to the service. This was caused by deploying a corrupt configuration bundle, which left the service missing data it uses for network interface connections.

We mitigated the incident by applying the correct configuration to each site. We have added checks for corruption in this deployment, and will add auto-rollback detection for this service to prevent issues like this in the future.

via statuspage
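The Feb 12 postmortem above blames a corrupt configuration bundle and says corruption checks were added to the deployment. Verifying a bundle's digest against its manifest before rollout is the standard form of that check; the sketch below assumes a hypothetical SHA-256 manifest entry, not GitHub's actual pipeline.

```python
# Hedged sketch: refuse to deploy a configuration bundle whose SHA-256
# digest does not match the manifest. The manifest format is invented.
import hashlib


def verify_bundle(bundle: bytes, expected_sha256: str) -> bool:
    """Return True only if the bundle matches its recorded digest."""
    return hashlib.sha256(bundle).hexdigest() == expected_sha256


# A known-good bundle and the digest its manifest would record:
good = b"listen: 443\n"
digest = hashlib.sha256(good).hexdigest()
```

A bundle corrupted in transit or at build time then fails the check at deploy time, before any site serves from it.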
+7h 24m
Feb 12, 2026 18:36

[investigating] We are investigating reports of impacted performance for some GitHub services.

via statuspage
+10m
Feb 12, 2026 18:46

[investigating] We are experiencing degraded availability in Australia for Copilot completions and suggestions. We are working to resolve the issue.

via statuspage
+32m
Feb 12, 2026 19:18

[investigating] We are experiencing degraded availability in Australia and Brazil for Copilot completions and suggestions. We are working to resolve the issue.

via statuspage
+42m
Feb 12, 2026 19:59

[investigating] Next Edit Suggestions availability is recovering. We are continuing to monitor until fully restored.

via statuspage
+35m
Feb 12, 2026 20:34

[resolved] Between February 11th 21:30 UTC and February 12th 15:40 UTC, users in Western Europe experienced degraded quality for all Next Edit Suggestions requests. Additionally, on February 12th, between 18:40 UTC and 20:30 UTC, users in Australia and South America experienced degraded quality and increased latency of up to 500ms for all Next Edit Suggestions requests. The root cause was a newly introduced regression in an upstream service dependency.

The incident was mitigated by failing over Next Edit Suggestions traffic to unaffected regions, which caused the increased latency. Once the regression was identified and rolled back, we restored the impacted capacity. We have improved our quality analysis tooling and are working on more robust quality impact alerting to accelerate detection of these issues in the future.

via statuspage
+484h 39m
Mar 5, 2026 01:13

[investigating] We are investigating reports of impacted performance for some GitHub services.

via statuspage
+8m
Mar 5, 2026 01:21

[investigating] Users were temporarily unable to see tasks listed in mission control surfaces. The ability to submit new tasks, view existing tasks via direct link, or manage tasks was unaffected throughout. A revert is currently being deployed and we are seeing recovery.

via statuspage
+8m
Mar 5, 2026 01:30

[investigating] Copilot coding agent mission control is fully restored. Tasks are now listed as expected.

via statuspage
+0m
Mar 5, 2026 01:30

[resolved] On March 5, 2026, between 12:53 UTC and 13:35 UTC, the Copilot mission control service was degraded. This resulted in empty responses returned for users' agent session lists across GitHub web surfaces: impacted users were unable to see their lists of current and previous agent sessions. This was caused by an incorrect database query that falsely excluded records with an absent field.

We mitigated the incident by rolling back the database query change. No data was altered or deleted during the incident.

To prevent similar issues in the future, we're improving our monitoring depth to more easily detect degradation before changes are fully rolled out.

via statuspage
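The mission-control postmortem above describes a query that "falsely excluded records that have an absent field", a classic NULL-comparison pitfall. A minimal in-memory reproduction of that bug, with invented record shapes, is below; in SQL the same mistake looks like `WHERE archived = 0`, which silently drops NULL rows.

```python
# Toy reproduction of the reported query bug: comparing a nullable field
# to a value silently excludes records where the field was never set.

sessions = [
    {"id": 1, "archived": False},
    {"id": 2, "archived": None},  # legacy record, field never populated
    {"id": 3, "archived": True},
]


def buggy_list(rows):
    # None == False is False, so the legacy record vanishes from the list
    return [r["id"] for r in rows if r["archived"] == False]


def fixed_list(rows):
    # Treat an absent field as "not archived", keeping legacy records
    return [r["id"] for r in rows if not r["archived"]]
```

The buggy variant returns only session 1, making session 2 invisible exactly as users experienced; the fix makes absence behave like the default value.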
+0m
Mar 5, 2026 01:30

[resolved] This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.

via statuspage

Lessons Learned

GitHub has experienced 39 incidents in the past year. This frequency suggests systemic reliability challenges that may warrant additional monitoring.

📊Incidents related to network, compute, database, api, authentication, deployment, configuration, capacity, routing have occurred 259 times across all providers in the past year. This is one of the most common failure categories in cloud infrastructure.

💡This incident is categorized as: Configuration Error, Database Overload, Network / Routing, Capacity Issue, Deployment Failure, Authentication Issue, API Issue. Consider implementing preventive measures specific to this failure category.