Tutorial
Chapter 6 - Going to Production
Goal: Understand the CI/CD workflow for Daana projects and practice validation locally.
Prerequisites: You must have completed Chapter 5: Mastering DMDL.
The Real-World Workflow
In production environments, data teams follow a structured workflow to ensure quality and prevent outages. Here's how it typically works with Daana:
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   1. DEVELOP    │     │    2. TEST      │     │   3. REVIEW     │
│     (Local)     │────▶│  (Dev Schema)   │────▶│ (Pull Request)  │
│                 │     │                 │     │                 │
│ - Edit YAML     │     │ - Deploy to     │     │ - Code review   │
│ - Check syntax  │     │   dev schema    │     │ - CI tests run  │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                                                         │
                                                         ▼
                        ┌─────────────────┐     ┌─────────────────┐
                        │   5. EXECUTE    │◀────│    4. MERGE     │
                        │  (Production)   │     │  (Main Branch)  │
                        │                 │     │                 │
                        │ - Run workflow  │     │ - Auto-deploy   │
                        │ - Load data     │     │   triggered     │
                        └─────────────────┘     └─────────────────┘
Step 1: The Development Phase
When developing locally, daana-cli check workflow is your go-to validation command. It validates the model, all mappings, and the connection profile in one shot:
daana-cli check workflow
Validation in action. Introduce an intentional error in your mapping:
# Edit mappings/order-mapping.yaml and change the entity reference to something invalid:
# entity_id: ORDER_INVALID (instead of ORDER)
Now run validation:
daana-cli check workflow
You'll see output similar to this (warnings about hardcoded credentials and sslmode come from connections.yaml and are expected for the local tutorial):
Checking workflow: workflow.yaml
Workflow: BOOK_RETAILER_WORKFLOW
Workflow ID: 1424339281
✘ Errors:
Mapping Model Validation:
• order-mapping.yaml: Entity 'ORDER_INVALID' not found in model (did you mean: ORDER_LINE?) Available: [CUSTOMER, ORDER, PRODUCT, ORDER_LINE]
⚠ Warnings:
Entity Not Mapped:
• model.yaml: Entity 'ORDER' defined in model but has no mapping
Connection Profile Warning:
• connections.yaml: [dev] 'password' appears to be hardcoded. Use environment variables for security: ${PASSWORD}
• connections.yaml: [dev] 'user' appears to be hardcoded. Use environment variables for security: ${USER}
• connections.yaml: [dev] root.sslmode='disable' is insecure for production. Use 'require' or 'verify-full'
Summary: 1 error(s), 4 warning(s)
Error: workflow validation failed
The ✘ Errors section is what you must fix before proceeding. The ⚠ Warnings are advisory: the "Connection Profile" group is unrelated to your edit and will remain until you replace the hardcoded credentials with environment variables (see "Best Practices" at the end of this chapter).
Fix it by changing back to entity_id: ORDER, then verify:
daana-cli check workflow
# Workflow valid
Note:
daana-cli check validates structural correctness (entity/attribute references, YAML syntax). It does NOT validate column names against the database schema - those errors are caught at deploy/execute time.
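Warnings do not fail the check, so a team that wants to track them (for example, to keep the count from growing) has to read them from the output. The following is a minimal sketch that pulls the two numbers out of the Summary line. It assumes the summary keeps the exact shape shown in the sample output above, which is an observation from this tutorial and not a documented contract. The name summary_counts is hypothetical.

```shell
# Hypothetical helper: read check output on stdin and print "<errors> <warnings>".
# Assumes a line shaped like: Summary: 1 error(s), 4 warning(s)
# Prints nothing if no such line is present.
summary_counts() {
  sed -n 's/^Summary: \([0-9][0-9]*\) error(s), \([0-9][0-9]*\) warning(s).*/\1 \2/p'
}

# Possible usage (stderr is merged in case the summary is written there):
#   daana-cli check workflow 2>&1 | summary_counts
```

Because the parser prints nothing when the line is missing, a caller can treat empty output as "format changed" instead of silently reporting zero.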
Step 2: Testing in a Dev Schema
Before deploying to production, you deploy to a developer-specific schema. This lets you:
- Test actual SQL execution
- Verify data transformations
- Catch runtime issues
Configure Multiple Environments
See Connection Profiles for all supported fields, database types, and SSL options.
Open connections.yaml and set up separate schemas:
connections:
  # Your personal development environment
  dev:
    type: postgresql
    host: localhost
    port: 5432
    user: dev
    password: devpass
    database: customerdb
    sslmode: disable
    target_schema: daana_dw_yourname # Developer-specific schema
  # Shared production environment
  production:
    type: postgresql
    host: localhost
    port: 5432
    user: dev
    password: devpass
    database: customerdb
    sslmode: disable
    target_schema: daana_dw # Production schema
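The daana_dw_yourname value is a placeholder for a schema that only you write to. One way to pick a consistent name is to derive it from your login name. The sketch below is an illustration only: dev_schema_name is a hypothetical helper, and the 63-character cut reflects PostgreSQL's default identifier length limit.

```shell
# Hypothetical helper: build a per-developer schema name such as daana_dw_alice.
# Uses the first argument, or the current login name if none is given.
# Lowercases the result, replaces anything outside [a-z0-9_] with "_",
# and truncates to 63 characters (PostgreSQL's default identifier limit).
dev_schema_name() {
  name="${1:-$(id -un)}"
  printf 'daana_dw_%s' "$name" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -c 'a-z0-9_' '_' \
    | cut -c1-63
}
```

Run dev_schema_name with no arguments and paste the printed value into target_schema under the dev profile.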
Deploy and Test Locally
The exercises below mix CLI commands with SQL queries. Open a psql shell in a second terminal so the first stays free for daana-cli invocations:
docker exec -it daana-customerdb psql -U dev -d customerdb
Deploy and execute against your dev schema:
daana-cli deploy --connection dev
daana-cli execute --connection dev
Query your dev schema to verify (in the psql terminal):
SELECT COUNT(*) FROM daana_dw_yourname.view_customer;
Step 3: The CI/CD Pipeline
Once your changes work locally, you commit and push to trigger CI/CD.
Typical CI Configuration
Here's what a CI pipeline might look like (GitHub Actions example):
# .github/workflows/daana-ci.yml
name: Daana CI
on:
  pull_request:
    paths:
      - 'model.yaml'
      - 'workflow.yaml'
      - 'mappings/**'
      - 'connections.yaml'
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install Daana CLI
        run: |
          # Download and install daana-cli
      - name: Validate Workflow
        run: daana-cli check workflow
  deploy-staging:
    needs: validate
    runs-on: ubuntu-latest
    environment: staging
    steps:
      - uses: actions/checkout@v4
      - name: Install Daana CLI
        run: |
          # Each job runs on a fresh runner, so install daana-cli again
      - name: Deploy to Staging
        run: |
          daana-cli deploy --connection staging
          daana-cli execute --connection staging
      - name: Run Data Quality Tests
        run: |
          # Run assertions on staging data
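The Run Data Quality Tests step in the CI example is left as a placeholder. A simple first assertion is that a view is not empty after execute. The sketch below shows one possible shape for such a check. assert_rows is a hypothetical helper, and the usage comment reuses the local tutorial container and dev schema because the staging connection details are not given here.

```shell
# Hypothetical helper: fail unless the given value is a positive integer.
# Intended for the single number returned by a SELECT COUNT(*) query.
assert_rows() {
  label="$1"
  count="$2"
  case "$count" in
    ''|*[!0-9]*)
      echo "FAIL $label: not a row count: '$count'" >&2
      return 1
      ;;
  esac
  if [ "$count" -gt 0 ]; then
    echo "OK $label: $count row(s)"
  else
    echo "FAIL $label: view is empty" >&2
    return 1
  fi
}

# Possible usage against the local tutorial database
# (psql -t prints tuples only, -A disables column alignment):
#   assert_rows view_customer "$(docker exec daana-customerdb \
#     psql -U dev -d customerdb -tA \
#     -c 'SELECT COUNT(*) FROM daana_dw_yourname.view_customer;')"
```

A non-zero return from assert_rows fails the CI step, which blocks the pull request in the same way a failed check does.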
What CI Tests Catch
The CI pipeline catches different classes of error at different stages:
- Syntax errors: Invalid YAML, missing fields (validate job)
- Reference errors: Attributes referencing non-existent entities (validate job)
- SQL errors: Invalid column names, type mismatches (deploy-staging job)
- Schema drift: Source table changes that break mappings (deploy-staging job)
Step 4: Deploying to Production
After PR approval and merge, production deployment happens:
# Production deployment (typically automated)
daana-cli deploy --connection production
daana-cli execute --connection production
Simulate This Locally
Simulate the full flow:
# 1. Make a change (add a comment to model.yaml)
echo "# Updated: $(date)" >> model.yaml
# 2. Validate (CI would do this)
daana-cli check workflow
# 3. Deploy to "staging" (your dev schema)
daana-cli deploy --connection dev
# 4. Test in staging
daana-cli execute --connection dev
# 5. If all good, deploy to "production"
daana-cli deploy --connection production
daana-cli execute --connection production
# 6. Then verify production data from your psql terminal
SELECT COUNT(*) FROM daana_dw.view_order;
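The CLI steps of this simulation can be chained so that a failure at any stage stops everything after it, which is the behavior a CI/CD pipeline gives you. The sketch below assumes that each daana-cli command exits with a non-zero status on failure. The DAANA_CLI variable is a hypothetical override, added only so the sequence can be dry-run with a stub; it is not a daana-cli feature.

```shell
# Sketch: run the promotion steps in order and stop at the first failure.
# DAANA_CLI is a hypothetical override used for dry runs; it defaults to daana-cli.
DAANA_CLI="${DAANA_CLI:-daana-cli}"

promote() {
  "$DAANA_CLI" check workflow &&
  "$DAANA_CLI" deploy --connection dev &&
  "$DAANA_CLI" execute --connection dev &&
  "$DAANA_CLI" deploy --connection production &&
  "$DAANA_CLI" execute --connection production
}
```

Calling promote returns the exit status of the first command that failed, or zero if all five steps succeeded, so production is never touched when the dev run fails.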
Hands-On: The Pre-Deployment Checklist
Before deploying any changes, run through this checklist:
Exercise 1: Full Validation Sweep
# check workflow validates everything (model + mappings + connections)
daana-cli check workflow && echo "All checks passed - safe to deploy!"
Note:
check workflow validates your model, all mappings, and connection profiles in one command.
If the check fails, fix the issue before proceeding.
Exercise 2: Simulate a Runtime Error
daana-cli check catches structural errors, but some errors only appear at deploy time. Reproduce this case:
1. Edit mappings/order-mapping.yaml and change a column name to something that doesn't exist:
   - id: order_status
     transformation_expression: order_status_TYPO # This column doesn't exist!
2. Run check (it passes - check validates YAML structure, not database schema):
   daana-cli check workflow
   # ✓ Workflow valid
3. Try to deploy:
   daana-cli deploy
   deploy runs a pre-deploy validation step that issues each transformation expression against the source database. It fails before any DDL is run, with a structured message that names the file, the attribute, the missing column, and the underlying SQLSTATE:
   Error: pre-deploy validation failed: validation failed with 1 error(s):
   - order-mapping.yaml: invalid transformation expression for 'order_status': pq: column "order_status_typo" does not exist at column 8 (42703)
Key Learning:
- check validates YAML structure and references against the model.
- deploy runs a pre-deploy validation step that catches schema-level errors (missing columns, type mismatches) before applying any DDL, so a typo here cannot leave the warehouse in a half-deployed state.
- Both are important in your workflow.
Fix it by changing back to order_status before continuing.
Exercise 3: Verify Your Fix
# Redeploy with the fix
daana-cli deploy
# Execute to ensure everything works
daana-cli execute
Verify data is correct (in the psql terminal):
SELECT order_id, order_status FROM daana_dw.view_order LIMIT 3;
Best Practices
1. Always Validate Before Commit
# Add to your pre-commit hook or run manually
daana-cli check workflow
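A pre-commit hook can limit the check to commits that actually touch Daana files, so unrelated commits stay fast. The sketch below is one possible hook, not an official one. The function names are hypothetical, and the file patterns mirror the paths list from the CI example, assuming the project files sit at the repository root.

```shell
#!/bin/sh
# Sketch of .git/hooks/pre-commit: validate only when Daana files are staged.

# Reads file paths on stdin; succeeds if any of them is a Daana project file.
touches_daana_files() {
  grep -Eq '^(model\.yaml|workflow\.yaml|connections\.yaml|mappings/.+)$'
}

# Runs the check when relevant; a non-zero return aborts the commit.
pre_commit() {
  if git diff --cached --name-only | touches_daana_files; then
    daana-cli check workflow
  fi
}
```

To install it, save the script as .git/hooks/pre-commit, add a final line that calls pre_commit, and make the file executable.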
2. Use Descriptive Schema Names
# Good - clear ownership and purpose
target_schema: daana_dw_alice_feature123
target_schema: daana_dw_staging
target_schema: daana_dw_prod
# Bad - ambiguous
target_schema: dw
target_schema: test
3. Environment Variables for Secrets
Never commit passwords. Use environment variables:
connections:
  production:
    type: postgresql
    host: "${PROD_DB_HOST}"
    user: "${PROD_DB_USER}"
    password: "${PROD_DB_PASSWORD}"
    database: "${PROD_DB_NAME}"
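With this profile, a deploy depends on four variables being present in the environment. A small guard that runs first can report a missing one by name before the CLI is invoked. This is a sketch only: require_env is a hypothetical helper, and it assumes the variables are exported to the process that runs daana-cli.

```shell
# Hypothetical helper: succeed only if every named variable is exported and non-empty.
# Prints the names of missing variables to stderr, never their values.
require_env() {
  missing=0
  for var in "$@"; do
    if [ -z "$(printenv "$var")" ]; then
      echo "missing environment variable: $var" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Possible usage before a production deploy:
#   require_env PROD_DB_HOST PROD_DB_USER PROD_DB_PASSWORD PROD_DB_NAME \
#     && daana-cli deploy --connection production
```

The guard checks all names before returning, so a single run lists every missing variable.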
4. Version Control Everything
Your Daana project should be in Git:
my-project/
├── model.yaml # Version controlled
├── workflow.yaml # Version controlled
├── mappings/ # Version controlled
├── connections.yaml # Version controlled (no secrets!)
└── .github/workflows/ # CI/CD configuration
Summary
You've learned the production workflow:
- Develop locally with daana-cli check for fast feedback
- Test in dev schema to validate actual SQL execution
- CI validates on pull request
- Deploy to production after merge
This workflow ensures:
- Errors caught early (structural errors before reaching any database, schema errors before any DDL is applied)
- Changes tested in isolation (dev schemas)
- Code review and approval gates
- Automated, repeatable deployments
One topic remains: building a dimensional consumption layer (dimensions and facts) from Daana's output views.