Incident with Codespaces

Severity: high · Provider: GitHub · Feb 12, 2026 07:53 · Duration: 4h 38m
Tags: network, compute, storage, database, api, authentication, deployment, configuration, routing
Categories: Configuration Error, Database Overload, Network / Routing, Capacity Issue, Deployment Failure, Authentication Issue, Storage Failure, API Issue

Summary

On February 2, 2026, GitHub Codespaces were unavailable between 18:55 and 22:20 UTC and remained degraded until the service fully recovered at 00:15 UTC on February 3, 2026. During this time, Codespaces creation and resume operations failed in all regions.

This outage was caused by a backend storage access policy change in our underlying compute provider that blocked access to critical VM metadata, causing all VM create, delete, reimage, and other operations to fail. More information is available at https://azure.status.microsoft/en-us/status/history/?trackingId=FNJ8-VQZ.

Impact

major

Timeline

Feb 2, 2026 20:17

[investigating] We are investigating reports of degraded availability for Codespaces.

via statuspage
+2m
Feb 2, 2026 20:19

[investigating] Users may see errors creating or resuming codespaces. We are investigating and will provide further updates as we have them.

via statuspage
+3h 33m
Feb 2, 2026 23:52

[investigating] Codespaces is seeing steady recovery.

via statuspage
+33m
Feb 3, 2026 00:25

[investigating] Codespaces is experiencing degraded performance. We are continuing to investigate.

via statuspage
+29m
Feb 3, 2026 00:54

[investigating] Codespaces is operating normally.

via statuspage
+0m
Feb 3, 2026 00:54

[resolved] On February 2, 2026, GitHub Codespaces were unavailable between 18:55 and 22:20 UTC and remained degraded until the service fully recovered at 00:15 UTC on February 3, 2026. During this time, Codespaces creation and resume operations failed in all regions.

This outage was caused by a backend storage access policy change in our underlying compute provider that blocked access to critical VM metadata, causing all VM create, delete, reimage, and other operations to fail. More information is available at https://azure.status.microsoft/en-us/status/history/?trackingId=FNJ8-VQZ. This was mitigated by rolling back the policy change; the rollback began at 22:15 UTC. As VMs came back online, our runners worked through the backlog of requests that hadn't timed out.

We are working with our compute provider to improve our incident response and engagement time, improve early detection before issues impact our customers, and ensure safe rollout should similar changes occur in the future. We recognize this was a significant outage for users who rely on GitHub for their workloads, and we apologize for the impact this had.

via statuspage
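The resolution above notes that, once the rollback completed, runners drained the backlog of requests that had not yet timed out. As a rough sketch of that pattern only (not GitHub's actual implementation), the Python snippet below works through a queue of pending VM operations and skips any request whose caller-side timeout has already expired; the PendingRequest type and the perform_operation callback are hypothetical names.

    import time
    from collections import deque
    from dataclasses import dataclass

    @dataclass
    class PendingRequest:
        operation: str      # e.g. "create", "resume", "reimage"
        enqueued_at: float  # unix timestamp when the request was queued
        timeout_s: float    # how long the caller is willing to wait

    def drain_backlog(backlog: deque, perform_operation) -> None:
        """Process queued VM operations, dropping any that have already timed out."""
        while backlog:
            req = backlog.popleft()
            if time.time() - req.enqueued_at > req.timeout_s:
                continue  # caller has given up; doing the work now would be wasted effort
            perform_operation(req)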
+222h 59m
Feb 12, 2026 07:53

[investigating] We are investigating reports of degraded availability for Codespaces.

via statuspage
+8m
Feb 12, 2026 08:02

[investigating] We are seeing an increase in Codespaces creation and resume failures across multiple regions, primarily in EMEA. Our team is analysing the situation and working to mitigate the impact.

While we work, customers are advised to create Codespaces in the US East and US West regions via the "New with options..." button when creating a Codespace.

More updates as we have them.

via statuspage
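The workaround above uses the "New with options..." flow in the UI. The same region preference can also be expressed when creating a codespace through the REST API. The Python sketch below is hedged: POST /repos/{owner}/{repo}/codespaces is the documented create endpoint, but the "location" field and the "WestUs2" region identifier are assumptions that should be verified against the current Codespaces API reference before use.

    import os
    import requests

    resp = requests.post(
        "https://api.github.com/repos/OWNER/REPO/codespaces",  # replace OWNER/REPO
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
        },
        # "location" and the "WestUs2" value are assumptions; confirm in the API docs.
        json={"ref": "main", "location": "WestUs2"},
    )
    resp.raise_for_status()
    print(resp.json().get("name"), resp.json().get("state"))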
+30m
Feb 12, 2026 08:32

[investigating] We now understand the source of the VM create/resume failures and are working with our partners to mitigate the impact.

via statuspage
+32m
Feb 12, 2026 09:04

[investigating] We have identified the issue causing Codespace create/resume actions to fail and are applying a fix. This is estimated to take ~2 hours to complete, but impact will begin to reduce sooner than that.

We will continue to monitor recovery progress and will report back when more information is available.

via statuspage
+35m
Feb 12, 2026 09:39

[investigating] We are seeing widespread recovery across all of our regions.

We will continue to monitor progress and will resolve the incident when we are confident in durable recovery.

via statuspage
+2m
Feb 12, 2026 09:42

[investigating] Codespaces is experiencing degraded performance. We are continuing to investigate.

via statuspage
+14m
Feb 12, 2026 09:56

[investigating] Recovery looks consistent with Codespaces creating and resuming successfully across all regions.

Thank you for your patience.

via statuspage
+0m
Feb 12, 2026 09:56

[resolved] On February 12, 2026, between 00:51 UTC and 09:35 UTC, users attempting to create or resume Codespaces experienced elevated failure rates across Europe, Asia, and Australia, peaking at a 90% failure rate.

The disconnects were triggered by a bad configuration rollout in a core networking dependency, which led to internal resource provisioning failures. We are working to improve our alerting thresholds so that issues are caught before they impact customers, and to strengthen rollout safeguards to prevent similar incidents.

via statuspage
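The write-up above commits to tighter alerting thresholds and stronger rollout safeguards. As a generic illustration of the alerting half only (the names and the threshold are hypothetical, and this is not GitHub's monitoring stack), the sketch below tracks a rolling window of request outcomes and flags when the failure rate crosses a configurable threshold.

    from collections import deque

    class FailureRateAlert:
        """Track the last `window` request outcomes and alert when the
        failure rate exceeds `threshold` (e.g. 0.05 for 5%)."""

        def __init__(self, window: int = 1000, threshold: float = 0.05):
            self.outcomes = deque(maxlen=window)  # True = failed request
            self.threshold = threshold

        def record(self, failed: bool) -> bool:
            """Record one outcome; return True when the window is full and over threshold."""
            self.outcomes.append(failed)
            rate = sum(self.outcomes) / len(self.outcomes)
            return len(self.outcomes) == self.outcomes.maxlen and rate > self.threshold

    # Usage sketch: alert = FailureRateAlert(); if alert.record(failed=True): page_oncall()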
+593h 8m
Mar 9, 2026 03:04

[investigating] We are investigating reports of degraded performance for Codespaces.

via statuspage
+0m
Mar 9, 2026 03:04

[investigating] We are seeing about 5% of new Codespace creation requests failing. We are investigating the root cause and identifying the impacted regions.

via statuspage
+28m
Mar 9, 2026 03:32

[investigating] We are seeing recovery, with the failure rate for new Codespace creation requests dropping from 5% to about 3%.

via statuspage
+19m
Mar 9, 2026 03:51

[investigating] This incident has been resolved. New Codespace creation requests are now completing successfully.

via statuspage
+0m
Mar 9, 2026 03:51

[resolved] This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.

via statuspage
+0m
Mar 9, 2026 03:51

[resolved] On March 9, 2026, between 01:23 UTC and 03:25 UTC, users attempting to create or resume codespaces in the Australia East region experienced elevated failures, peaking at a 100% failure rate for this region. Codespaces in other regions were not affected.

The create and resume failures were caused by degraded network connectivity between our control plane services and the VMs hosting the codespaces. This was resolved by redirecting traffic to an alternate site within the region. While we address the core network infrastructure issue, we have also improved our observability of components in this area to speed up detection. This will also enable our existing automated failovers to cover this failure mode. These changes will prevent similar incidents or significantly reduce the time they impact users.

via statuspage
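The final update describes mitigating by redirecting traffic to an alternate site within the region and folding this failure mode into existing automated failovers. The sketch below shows the general shape of a health-check-driven failover under assumed names; the site URLs and the /health probe are stand-ins for illustration, not GitHub infrastructure.

    import urllib.request

    SITES = [
        "https://primary.australiaeast.example.com/health",
        "https://alternate.australiaeast.example.com/health",
    ]

    def healthy(url: str, timeout: float = 2.0) -> bool:
        """Return True if the site's health endpoint answers 200 within the timeout."""
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.status == 200
        except OSError:
            return False

    def pick_site() -> str:
        """Prefer the primary site; fail over to the alternate if the probe fails."""
        for url in SITES:
            if healthy(url):
                return url
        raise RuntimeError("no healthy site in region")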

Lessons Learned

GitHub has experienced 39 incidents in the past year. This frequency suggests systemic reliability challenges that may warrant additional monitoring.

📊 Incidents related to network, compute, storage, database, api, authentication, deployment, configuration, routing have occurred 259 times across all providers in the past year. This is one of the most common failure categories in cloud infrastructure.

💡 This incident is categorized as: Configuration Error, Database Overload, Network / Routing, Capacity Issue, Deployment Failure, Authentication Issue, Storage Failure, API Issue. Consider implementing preventive measures specific to this failure category.