Appendix: CI/CD Complete Reference

This appendix contains all CI/CD-related reference materials: testing guides and workflow details. For the main workflow documentation, see CI/CD Workflow.

Part 1: CI/CD Testing Guide
Part 2: CI/CD Workflow Details

Part 1: CI/CD Testing Guide

Overview

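Testing CI/CD workflows before they reach the main branch is practical and recommended. This guide covers four ways to test your GitHub Actions workflows: local runs with act, syntax validation with actionlint, manual runs on a test branch, and running individual workflow steps by hand.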

Why Test CI/CD?

✅ Catch issues early - Find workflow problems before pushing to GitHub
✅ Faster feedback - Test locally without waiting for GitHub Actions
✅ Cost savings - Avoid consuming GitHub Actions minutes during development
✅ Confidence - Ensure workflows work correctly before deployment

Testing Approaches

1. Local Testing with `act` (Recommended)

act is a tool that runs GitHub Actions workflows locally using Docker.

Installation

# Linux
curl https://raw.githubusercontent.com/nektos/act/master/install.sh | sudo bash

# macOS
brew install act

# Windows (with Chocolatey)
choco install act-cli

Prerequisites: Docker must be installed and running.

Quick Start

# List all workflows and jobs
cd tasks/04_devops_cicd
act -l

# Run all workflows (push event)
act push

# Run specific job
act push -j python-validation

# Run with verbose output
act push --verbose

Using the Test Script

We provide a convenient test script:

# Make script executable
chmod +x tasks/04_devops_cicd/scripts/test_ci_workflow.sh

# Run the test script
./tasks/04_devops_cicd/scripts/test_ci_workflow.sh

The script will:

Check if act is installed
Check if Docker is running
List available jobs
Let you choose which job to test
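
The first two checks in that list can be sketched in Python as follows. This is a minimal illustration and not the contents of test_ci_workflow.sh; the function name `preflight` and its injectable `which`/`run` parameters are assumptions made so the logic can be exercised without Docker installed.

```python
import shutil
import subprocess


def preflight(which=shutil.which, run=subprocess.run):
    """Return a list of problems that would stop `act` from running."""
    problems = []
    if which("act") is None:
        problems.append("act is not installed or not on PATH")
    if which("docker") is None:
        problems.append("docker is not installed or not on PATH")
    else:
        # `docker info` exits non-zero when the daemon is unreachable.
        result = run(["docker", "info"], capture_output=True)
        if result.returncode != 0:
            problems.append("Docker daemon is not running")
    return problems
```

Calling `preflight()` with no arguments checks the real machine; an empty list means `act -l` should be able to run.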

Limitations

Secrets: Must be provided manually (pass them with act -s NAME=value or act --secret-file)
AWS Services: Cannot test AWS-specific steps (S3, Glue) without credentials
GitHub API: Some actions require GitHub API access

Example: Testing with Secrets

# Create secrets file (keep it out of version control, e.g. via .gitignore)
cat > .secrets <<EOF
AWS_ACCESS_KEY_ID=test
AWS_SECRET_ACCESS_KEY=test
EOF

# Run with secrets
act push --secret-file .secrets

2. Workflow Syntax Validation

Validate YAML syntax and workflow structure:

# Using actionlint (recommended)
chmod +x tasks/04_devops_cicd/scripts/validate_workflow_syntax.sh
./tasks/04_devops_cicd/scripts/validate_workflow_syntax.sh

Or manually:

# Install actionlint
# Linux: https://github.com/rhysd/actionlint#installation
# macOS: brew install actionlint

# Validate workflow
actionlint tasks/04_devops_cicd/.github/workflows/ci.yml

3. Manual Testing on GitHub

For full integration testing, push to a test branch:

# Create test branch
git checkout -b test/ci-workflow

# Make a small change (e.g., add a comment)
echo "# Test CI" >> tasks/04_devops_cicd/.github/workflows/ci.yml

# Commit and push
git add .
git commit -m "test: CI workflow"
git push origin test/ci-workflow

# Check GitHub Actions tab for results

Cleanup:

git checkout main
git branch -D test/ci-workflow
git push origin --delete test/ci-workflow

4. Unit Testing Workflow Steps

Test individual workflow steps in isolation:

Example: Test Python Setup

# Test Python setup step locally
python3 -m venv test-venv
source test-venv/bin/activate
pip install --upgrade pip
pip install -r tasks/01_data_ingestion_transformation/requirements.txt
pip install -r tasks/01_data_ingestion_transformation/requirements-dev.txt

# Test linting
cd tasks/01_data_ingestion_transformation
ruff check src/ tests/

# Test unit tests
pytest tests/test_etl.py tests/test_integration.py -v

Example: Test PySpark Setup

# Test Java setup
java -version  # Should be Java 17

# Test PySpark installation
python3 -c "import pyspark; print(pyspark.__version__)"

# Run the PySpark tests
# (the JAVA_HOME path below is the Debian/Ubuntu location; adjust it for your system)
export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64
pytest tests/test_etl_spark.py -v

Testing Strategy

What to Test

✅ Workflow Syntax - YAML is valid
✅ Job Dependencies - Jobs run in correct order
✅ Step Execution - Each step completes successfully
✅ Environment Setup - Python, Java, dependencies install correctly
✅ Test Execution - Unit tests run and pass
✅ Linting - Code style checks pass
⚠️ AWS Integration - Requires AWS credentials (test separately)

What's Hard to Test Locally

❌ AWS Services (S3, Glue, Step Functions) - Require AWS credentials
❌ GitHub API - Some actions need GitHub API access
❌ Secrets Management - Must be provided manually
❌ Matrix Builds - Can be slow locally

Recommended Testing Flow

CI/CD Test Checklist

Before merging CI/CD changes:

Workflow YAML syntax is valid
All jobs can run locally (with act)
Python setup works (Python 3.10)
Java setup works (Java 17)
Dependencies install correctly
Linting passes (ruff, sqlfluff)
Unit tests pass (pytest)
Workflow runs on test branch
No secrets exposed in logs

Troubleshooting

`act` Issues

Problem: act can't find Docker

# Solution: Ensure Docker is running
docker info

Problem: act uses wrong image size

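# Solution: act prompts for a default image size on first run
act push
# Choose: micro, medium, or large

# To override the image later, map the runner label explicitly
act push -P ubuntu-latest=catthehacker/ubuntu:act-latest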

Problem: Workflow fails with "secrets not found"

# Solution: Provide secrets manually
act push --secret-file .secrets

Workflow Issues

Problem: Tests fail locally but pass on GitHub

Check Python version (should be 3.10)
Check Java version (should be 17)
Verify dependencies match requirements.txt
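
These version checks can be automated with a small helper. The sketch below is an assumption about how one might do it, not part of the project's scripts; `python_matches` and `java_major` are hypothetical names. Note that `java -version` prints to stderr, so capture that stream when feeding its output to `java_major`.

```python
import re
import sys


def python_matches(expected=(3, 10), actual=None):
    """Return True if the interpreter's major.minor equals the expected pair."""
    actual = actual if actual is not None else sys.version_info[:2]
    return tuple(actual[:2]) == tuple(expected)


def java_major(version_output):
    """Extract the major version from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x (for example "1.8.0_392").
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major
```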

Problem: Linting fails

# Run linting manually to see errors
cd tasks/01_data_ingestion_transformation
ruff check src/ tests/

Continuous Improvement

Add Workflow Tests to CI

You can even test your CI/CD workflows in CI! Add a workflow validation step:

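# .github/workflows/validate-workflows.yml
name: Validate Workflows
on:
  pull_request:
    paths:
      - '.github/workflows/**'

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Fetch actionlint with the download script from the actionlint repository
      - name: Download actionlint
        id: get_actionlint
        run: bash <(curl https://raw.githubusercontent.com/rhysd/actionlint/main/scripts/download-actionlint.bash)
        shell: bash
      # With no file arguments, actionlint checks every workflow under .github/workflows
      - name: Validate workflow syntax
        run: ${{ steps.get_actionlint.outputs.executable }} -color
        shell: bash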

Resources

act: https://github.com/nektos/act
actionlint: https://github.com/rhysd/actionlint
GitHub Actions Docs: https://docs.github.com/en/actions

Summary

✅ CI/CD testing is practical and recommended!

Use act for local testing
Validate syntax with actionlint
Test individual steps manually
Push to test branch for full integration
Test AWS-specific steps separately

Time Investment: ~10 minutes to set up, saves hours of debugging later!

Related Documentation

Task 4 Documentation

CI/CD Workflow Design - Complete workflow design and architecture
Test Suite Summary - Test suite implementation details
Test Suite Documentation - Detailed test documentation
Scripts Documentation - CI/CD testing scripts

Related Tasks

ETL Pipeline - Code tested by this CI/CD
Data Lake Architecture - Infrastructure provisioned by this CI/CD
SQL Query - Code validated by this CI/CD

Technical Documentation

Unified Testing Convention - Testing standards
Testing Guide - Comprehensive testing documentation

Part 2: CI/CD Workflow Details

This section contains detailed information referenced in the CI/CD Workflow document.

Appendix A: Failure Scenarios

Critical Rule: Failed runs never update _LATEST.json or current/ prefix.

Failure Types:

ETL Job Failure: Non-zero exit, no _SUCCESS, no data written → Alert triggers, safe rerun
Partial Write: Job crashes mid-execution → Partial files ignored, new run_id on rerun
Validation Failure: Quarantine rate > threshold → Data Quality Team reviews, fixes source, reruns
Circuit Breaker: More than 100 occurrences of the same error within an hour → Pipeline halts, Platform Team investigates
Schema Validation: Schema drift detected → Fail fast, update schema registry, rerun
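
To make the Circuit Breaker rule above concrete, here is a minimal sliding-window sketch. It is an illustration under stated assumptions, not the pipeline's implementation: the class name, the `error_key` grouping, and passing timestamps in explicitly (in seconds) are all hypothetical choices.

```python
from collections import defaultdict, deque


class CircuitBreaker:
    """Trip when one error key occurs more than `threshold` times per window."""

    def __init__(self, threshold=100, window_seconds=3600):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # error key -> timestamps in window

    def record(self, error_key, now):
        """Record one error at time `now`; return True if the pipeline should halt."""
        events = self._events[error_key]
        events.append(now)
        # Drop timestamps that have fallen out of the sliding window.
        while events and now - events[0] >= self.window_seconds:
            events.popleft()
        return len(events) > self.threshold
```

Each error key is counted independently, so a burst of one error type does not trip the breaker on behalf of another.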

Safe Rerun: Each rerun uses new run_id, failed runs preserved for audit, only successful runs promoted.

Promotion Workflow: ETL writes to isolated run_id path → _SUCCESS marker → CloudWatch alarm → Human review (Domain Analyst + Platform Team) → Approval → Promote to production.
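
The rule that only successful runs are promoted can be sketched against a local directory standing in for the data lake. The `run_id=<id>` directory layout and the JSON shape written to _LATEST.json are assumptions made for illustration; the human review and approval steps are not modelled.

```python
import json
from pathlib import Path


def promote(dataset_root, run_id):
    """Point _LATEST.json at run_id, but only if that run wrote _SUCCESS."""
    run_dir = Path(dataset_root) / f"run_id={run_id}"
    if not (run_dir / "_SUCCESS").exists():
        # Failed or partial runs stay in place for audit and are never promoted.
        return False
    latest = Path(dataset_root) / "_LATEST.json"
    tmp = latest.with_name(latest.name + ".tmp")
    tmp.write_text(json.dumps({"run_id": run_id}))
    tmp.replace(latest)  # atomic rename on the same filesystem
    return True
```

Because a failed run returns early, the previous pointer is left untouched, which mirrors the critical rule stated at the top of this appendix.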

Appendix B: Infrastructure Details

Step Functions Orchestration:

RunETL State: Invokes Glue job synchronously, auto-retries (≤3 attempts, exponential backoff)
ValidateOutput State: Checks _SUCCESS marker, retries on eventual consistency
Error Handling: Catches failures, publishes CloudWatch metrics, logs execution details
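
The retry behaviour of the RunETL state can be pictured with a plain Python loop. This is only an analogy for local reasoning, not the Step Functions definition: it assumes "≤3 attempts" means three tries in total, and the `base_delay` and `backoff_rate` values are hypothetical.

```python
import time


def run_with_retries(job, max_attempts=3, base_delay=1.0, backoff_rate=2.0,
                     sleep=time.sleep):
    """Call job() until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return job()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller
            # Exponential backoff: base_delay, then base_delay * backoff_rate, ...
            sleep(base_delay * backoff_rate ** (attempt - 1))
```

The `sleep` parameter is injectable so the backoff schedule can be checked in a unit test without waiting.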

IAM Prefix-Scoped Permissions:

ETL Job: bronze/* (read), silver/* (write), quarantine/* (write)
Platform Team: bronze/*, silver/*, quarantine/* (read/write)
Domain Teams: silver/{domain}/* (write), gold/{domain}/* (read)
Business/Analysts: gold/* (read-only via Athena)
Compliance: bronze/*, quarantine/* (read-only for audit)
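
For reasoning about who can touch which prefix, the table above can be modelled as data. The sketch below covers three of the roles and uses simple glob matching; it is a simplified stand-in for real IAM policy evaluation, and the role keys and `is_allowed` helper are hypothetical names.

```python
from fnmatch import fnmatchcase

# Subset of the prefix-scoped permissions listed above.
POLICY = {
    "etl_job": {"read": ["bronze/*"], "write": ["silver/*", "quarantine/*"]},
    "compliance": {"read": ["bronze/*", "quarantine/*"], "write": []},
    "analyst": {"read": ["gold/*"], "write": []},
}


def is_allowed(role, action, key):
    """Return True if `role` may perform `action` on the object key."""
    patterns = POLICY.get(role, {}).get(action, [])
    return any(fnmatchcase(key, pattern) for pattern in patterns)
```

Unknown roles and actions fall through to an empty pattern list, so the default is deny.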

Appendix C: Monitoring Details

Volume Metrics: run_id, input_rows, valid_rows_count, quarantined_rows_count, condemned_rows_count

Quality Metrics: quarantine_rate, validation_failure_rate, error_type_distribution

Loop Prevention: avg_attempt_count, duplicate_detection_rate, auto_condemnation_rate, circuit_breaker_triggers

Performance: rows_processed_per_run, duration_seconds, missing_partitions, runtime_anomalies
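
As an illustration of how the volume and quality metrics relate, the sketch below derives quarantine_rate from the row counts and compares it with the default 1% threshold from the Key Rules. The formula (quarantined rows divided by input rows) and the `exceeds_threshold` field are assumptions; the source does not define them.

```python
def quality_metrics(input_rows, valid_rows_count, quarantined_rows_count,
                    condemned_rows_count=0, threshold=0.01):
    """Summarise one run and flag it when the quarantine rate exceeds the threshold."""
    # Assumed definition: share of input rows that ended up in quarantine.
    quarantine_rate = quarantined_rows_count / input_rows if input_rows else 0.0
    return {
        "input_rows": input_rows,
        "valid_rows_count": valid_rows_count,
        "quarantined_rows_count": quarantined_rows_count,
        "condemned_rows_count": condemned_rows_count,
        "quarantine_rate": quarantine_rate,
        "exceeds_threshold": quarantine_rate > threshold,
    }
```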

Alert Ownership:

P1 (Immediate): Job failures, infrastructure errors, circuit breaker, SLA breaches → Data Platform Team
P2 (2-4 hours): Quarantine spikes, validation failures, high attempt counts → Data Quality Team
P3 (8 hours): Volume anomalies → Domain Teams
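
The ownership list above maps naturally onto a lookup table. The alert-type keys below are hypothetical identifiers chosen for this sketch; only the priorities and owning teams come from the list.

```python
# Alert type -> (priority, owning team), following the ownership list above.
ROUTES = {
    "job_failure": ("P1", "Data Platform Team"),
    "infrastructure_error": ("P1", "Data Platform Team"),
    "circuit_breaker": ("P1", "Data Platform Team"),
    "sla_breach": ("P1", "Data Platform Team"),
    "quarantine_spike": ("P2", "Data Quality Team"),
    "validation_failure": ("P2", "Data Quality Team"),
    "high_attempt_count": ("P2", "Data Quality Team"),
    "volume_anomaly": ("P3", "Domain Teams"),
}


def route(alert_type):
    """Return (priority, owner); raises KeyError for an unmapped alert type."""
    return ROUTES[alert_type]
```

Raising on unknown alert types, instead of guessing an owner, keeps unmapped alerts visible until someone assigns them.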

Appendix D: Governance Details

Ownership Matrix (abbreviated):

Pipeline/CI/CD/Infrastructure: Data Platform Team
Validation Rules: Domain Teams (Silver) / Business (Gold)
Data Quality: Data Quality Team
Schema: Domain Teams (Silver) / Business (Gold) approve; Platform implements
Backfill: Platform executes; Domain/Business approves

Governance Workflows:

Schema Change: Request → Layer-based review (Domain/Business) → Platform feasibility → Approval → Implementation → Versioning → Validation → Promotion
Quality Issue: Alert → Data Quality triage → Source/Validation/Platform issue → Fix → Backfill approval → Reprocess → Validate → Promote
Backfill: Request → Layer-based approval → Platform assessment → Schedule → Execute → Validate → Promote

Key Rules:

Infrastructure changes via Terraform IaC and CI/CD only
Failed runs never update _LATEST.json or current/
Run isolation via run_id mandatory
Human approval required for Silver promotion and condemned data deletion
Quarantine rate thresholds configurable per dataset (default: 1%)
Schema changes versioned via schema_v for backward compatibility
