Back to Repositories

Validating Failed Deployment Recovery Workflow in Kamal

This integration test suite validates deployment failure scenarios in Kamal, focusing on handling bad image deployments and ensuring system stability during failed deployments. The tests verify proper rollback behavior and error handling when deploying corrupted or invalid Docker images.

Test Coverage Overview

The test suite provides comprehensive coverage of deployment failure scenarios in Kamal.

Key areas tested include:
  • Bad image deployment handling
  • Version rollback verification
  • Container state management across multiple hosts
  • Worker process coordination during failures
Integration points span across Docker container management, nginx configuration, and multi-host deployment coordination.

Implementation Analysis

The testing approach employs a systematic verification of deployment states and container behaviors across different hosts. The implementation utilizes Minitest’s assertion patterns for validating deployment outcomes, container states, and error messages.

Technical patterns include:
  • Version tracking across deployments
  • Multi-host container state verification
  • Error message pattern matching
  • Deployment process monitoring

Technical Details

Testing infrastructure includes:
  • Minitest framework for test organization
  • Custom assertions for deployment validation
  • Docker container management utilities
  • Nginx configuration verification
  • Multi-host deployment coordination
  • Version management system

Best Practices Demonstrated

The test suite exemplifies robust integration testing practices with thorough error condition handling and state verification.

Notable practices include:
  • Comprehensive error message validation
  • State verification across multiple hosts
  • Graceful failure handling checks
  • Clear test case organization
  • Effective use of helper methods for common assertions

basecamp/kamal

test/integration/broken_deploy_test.rb

            
require_relative "integration_test"

class BrokenDeployTest < IntegrationTest
  test "deploying a bad image" do
    @app = "app_with_roles"

    first_version = latest_app_version

    kamal :deploy

    assert_app_is_up version: first_version
    assert_container_running host: :vm3, name: "app_with_roles-workers-#{first_version}"

    second_version = break_app

    output = kamal :deploy, raise_on_error: false, capture: true

    assert_failed_deploy output
    assert_app_is_up version: first_version
    assert_container_running host: :vm3, name: "app_with_roles-workers-#{first_version}"
    assert_container_not_running host: :vm3, name: "app_with_roles-workers-#{second_version}"
  end

  private
    def assert_failed_deploy(output)
      assert_match "Waiting for the first healthy web container before booting workers on vm3...", output
      assert_match /First web container is unhealthy on vm[12], not booting any other roles/, output
      assert_match "First web container is unhealthy, not booting workers on vm3", output
      assert_match "nginx: [emerg] unexpected end of file, expecting \";\" or \"}\" in /etc/nginx/conf.d/default.conf:2", output
    end
end