Railway - Elevated deployment latency – Incident details

Elevated deployment latency

Resolved
Partial outage 30 %
Started 6 days ago
Lasted about 2 hours

Affected

Deployments (Railway Metal)

Partial outage from 4:26 PM to 5:55 PM

US West (Metal / California, USA)

Partial outage from 4:26 PM to 5:55 PM

US East (Metal / Virginia, USA)

Partial outage from 4:26 PM to 5:55 PM

EU West (Metal / Amsterdam, Netherlands)

Partial outage from 4:26 PM to 5:55 PM

Southeast Asia (Metal / Singapore)

Partial outage from 4:26 PM to 5:55 PM

Updates
  • Resolved

    Deployment pipeline throughput was impacted by a resource constraint on the database backing our Temporal cluster. We have increased cluster resources and upgraded the database, and deployment throughput has been restored.

  • Monitoring

    Queue backpressure has significantly reduced. We have re-enabled non-Pro deployments and are monitoring.

  • Identified

    Deployments are now processing as we reduce the queue backpressure. We have halted some background operations to relieve deployment delays.

  • Update

    We continue to investigate the issue and have halted some active background operations to relieve the deployment delays.

  • Investigating

    We are investigating a number of deploys that are stuck initializing. Non-Pro deployments have been temporarily paused.