Tutorial
5. Mastering DMDL

Chapter 5: Mastering DMDL

Goal: Expand your model with new entities (PRODUCT, ORDER_LINE) to master Relationships and advanced Attribute Types.

Prerequisites: You must have completed Chapter 4: Loading Strategies.


The Scenario

Marketing comes to you with a request: "We need to know which Books are selling the best."

Currently, your model only has CUSTOMER and ORDER. We need to add PRODUCT and the link between them (ORDER_LINE).

Step 1: Define the New Entities

Edit model.yaml and add these two new entities after your existing CUSTOMER and ORDER:

    # ... existing entities (CUSTOMER, ORDER) ...
 
    - id: "PRODUCT"
      name: "PRODUCT"
      definition: "Items available for sale"
      description: "Book catalog"
      attributes:
        - id: "PRODUCT_ID"
          name: "PRODUCT_ID"
          definition: "Unique product identifier"
          type: "STRING"
          effective_timestamp: true
 
        - id: "PRODUCT_TITLE"
          name: "PRODUCT_TITLE"
          definition: "Title of the book"
          type: "STRING"
          effective_timestamp: true
 
        - id: "PRODUCT_FORMAT"
          name: "PRODUCT_FORMAT"
          definition: "Format of the book (Hardcover, Paperback)"
          type: "STRING"
          effective_timestamp: true
 
    - id: "ORDER_LINE"
      name: "ORDER_LINE"
      definition: "Individual item within an order"
      description: "Link table between Orders and Products"
      attributes:
        - id: "ORDER_LINE_ID"
          name: "ORDER_LINE_ID"
          definition: "Unique ID for the line item"
          type: "STRING"
          effective_timestamp: true
 
        - id: "QUANTITY"
          name: "QUANTITY"
          definition: "Number of units purchased"
          type: "NUMBER"
          effective_timestamp: true
 
        - id: "LINE_AMOUNT"
          name: "LINE_AMOUNT"
          definition: "Total value of this line item"
          type: "NUMBER"
          effective_timestamp: true

Step 2: Add the Relationships

Add these relationships to your existing relationships section:

  # ... existing relationship
 
    # NEW: Order line references a product
    - name: "IS_FOR_PRODUCT"
      definition: "Line item references a product"
      source_entity_id: "ORDER_LINE"
      target_entity_id: "PRODUCT"
 
    # NEW: Order contains order lines
    - name: "BELONGS_TO"
      definition: "Order line belongs to an order"
      source_entity_id: "ORDER_LINE"
      target_entity_id: "ORDER"

Key Insight: Relationships are defined from the "many" side to the "one" side. An ORDER_LINE belongs to exactly one ORDER, but an ORDER can have many ORDER_LINEs.

Step 3: Generate Mappings for the New Entities

Generate mapping templates for all entities that don't have mappings yet:

daana-cli generate mapping

This command scans your model and generates templates only for new entities. Existing mappings are safely skipped:

Generating mapping templates...
 
Model: model.yaml
Output: mappings/
 
✓ Generated 2 mapping template(s)
  Generated PRODUCT: product-mapping.yaml
  Generated ORDER_LINE: order_line-mapping.yaml
 
⊘ Skipped 2 existing mapping(s)
  Skipped CUSTOMER: customer-mapping.yaml already exists
  Skipped ORDER: order-mapping.yaml already exists
 
To update existing mappings, use: daana-cli merge mapping

Note: If you need to update existing mappings after model changes, use daana-cli merge mapping instead - it preserves your transformation expressions while adding new attributes.

Step 4: Edit the Product Mapping

Edit mappings/product-mapping.yaml to configure the source table:

entity_id: PRODUCT
 
mapping_groups:
  - name: default_mapping_group
    allow_multiple_identifiers: false
    tables:
      - connection: dev
        table: stage.products
        primary_keys:
          - product_id
        ingestion_strategy: FULL
        entity_effective_timestamp_expression: updated_at
        attributes:
          - id: PRODUCT_ID
            transformation_expression: product_id
          - id: PRODUCT_TITLE
            transformation_expression: title
          - id: PRODUCT_FORMAT
            transformation_expression: format

Validate your mappings to check progress:

daana-cli check mapping
Checking 4 mapping(s) in directory: mappings
 
→ customer-mapping.yaml ✓
→ order-mapping.yaml ✓
→ order_line-mapping.yaml ✘
→ product-mapping.yaml ✓
 
✘ order_line-mapping.yaml:
    • [schema] Table 'public.order_lines' not found in database
 
Summary:
  Total: 4
  Valid: 3
  Invalid: 1

The product mapping is now valid. Let's fix the order line mapping next.

Step 5: Edit the Order Line Mapping

Edit mappings/order_line-mapping.yaml:

entity_id: ORDER_LINE
 
mapping_groups:
  - name: default_mapping_group
    allow_multiple_identifiers: false
    tables:
      - connection: dev
        table: stage.sales_order_lines
        primary_keys:
          - CONCAT(CAST(order_id AS TEXT), '-', CAST(order_line_id AS TEXT))
        ingestion_strategy: FULL
        entity_effective_timestamp_expression: CURRENT_TIMESTAMP
        attributes:
          - id: ORDER_LINE_ID
            transformation_expression: CONCAT(CAST(order_id AS TEXT), '-', CAST(order_line_id AS TEXT))
          - id: QUANTITY
            transformation_expression: quantity
          - id: LINE_AMOUNT
            transformation_expression: (quantity * unit_price) - discount_amount
 
    relationships:
      # ORDER_LINE → PRODUCT relationship
      - id: ORDER_LINE_IS_FOR_PRODUCT_PRODUCT
        atomic_context: ORDER_LINE_IS_FOR_PRODUCT_PRODUCT
        source_table: stage.sales_order_lines
        target_transformation_expression: product_id
 
      # ORDER_LINE → ORDER relationship
      - id: ORDER_LINE_BELONGS_TO_ORDER
        atomic_context: ORDER_LINE_BELONGS_TO_ORDER
        source_table: stage.sales_order_lines
        target_transformation_expression: order_id

Important:

  • The relationship id in mappings follows the pattern: SOURCE_ENTITY_RELATIONSHIP_NAME_TARGET_ENTITY.
  • Composite Primary Keys: In real-world scenarios, order_line_id typically restarts for each order (1, 2, 3...). To create a unique identifier, we use a CONCAT expression in primary_keys combining order_id and order_line_id.

Validate all mappings are now correct:

daana-cli check mapping
Checking 4 mapping(s) in directory: mappings
 
→ customer-mapping.yaml ✓
→ order-mapping.yaml ✓
→ order_line-mapping.yaml ✓
→ product-mapping.yaml ✓
 
Summary:
  Total: 4
  Valid: 4
  Invalid: 0

All mappings are valid. Now let's deploy.

Step 6: Update Workflow & Deploy

Add the new mappings to workflow.yaml:

  mappings:
    - "mappings/customer-mapping.yaml"
    - "mappings/order-mapping.yaml"
    - "mappings/product-mapping.yaml"       # NEW
    - "mappings/order_line-mapping.yaml"    # NEW

Validate your workflow:

daana-cli check workflow
Checking workflow: workflow.yaml
 
Workflow: BOOK_RETAILER_WORKFLOW
  Workflow ID: 1424339281
  Model: ✓
 
Entity Coverage:
  - CUSTOMER: mapped ✓
  - ORDER: mapped ✓
  - PRODUCT: mapped ✓
  - ORDER_LINE: mapped ✓
 
✓ Workflow valid

All four entities are now mapped. Deploy and execute:

daana-cli deploy
daana-cli execute

Step 7: Query Your Complete Model

Now you can answer Marketing's question:

docker exec -it daana-customerdb psql -U dev -d customerdb
-- Which books are selling the best?
SELECT
  p."PRODUCT_TITLE",
  p."PRODUCT_FORMAT",
  SUM(ol."QUANTITY") as total_units_sold,
  SUM(ol."LINE_AMOUNT") as total_revenue
FROM daana_dw.view_product p
JOIN daana_dw.view_order_line_with_rel ol  -- use the generated _with_rel view to access related entities' keys
  ON ol.product_key = p.product_key
GROUP BY p."PRODUCT_TITLE", p."PRODUCT_FORMAT"
ORDER BY total_revenue DESC
LIMIT 10;

Result:

          PRODUCT_TITLE           | PRODUCT_FORMAT | total_units_sold | total_revenue
----------------------------------+----------------+------------------+---------------
 The Way of Kings                 | Hardcover      |                2 |         65.98
 Becoming                         | Hardcover      |                2 |         56.98
 Children of Blood and Bone       | Hardcover      |                3 |         51.97
 It Ends with Us                  | Paperback      |                3 |         49.97
 American Gods                    | Hardcover      |                2 |         45.98
 The Ocean at the End of the Lane | Hardcover      |                2 |         31.98
 Americanah                       | Paperback      |                2 |         28.48
 Norwegian Wood                   | Paperback      |                2 |         25.18
 The Testaments                   | Hardcover      |                1 |         24.99
 Words of Radiance                | Hardcover      |                1 |         23.99
(10 rows)

Summary of Key DMDL Concepts

Attribute Types

  • STRING - Text values
  • NUMBER - Numeric values (integers, decimals)
  • START_TIMESTAMP - When an entity lifecycle begins
  • END_TIMESTAMP - When an entity lifecycle ends
  • UNIT - Measurement units (currency, weight units, etc.)

Ingestion Strategies

  • FULL - Complete table refresh
  • INCREMENTAL - Delta processing with watermark
  • FULL_LOG - Snapshot series for SCD2
  • TRANSACTIONAL - Append-only events

Relationship Direction

Relationships are defined from the child (many) to the parent (one):

  • ORDER_LINE → PRODUCT (many lines can reference one product)
  • ORDER_LINE → ORDER (many lines belong to one order)
  • ORDER → CUSTOMER (many orders belong to one customer)

Group Attributes

  • Bundle related attributes (Amount + Currency)
  • Enforce data quality at the model level
  • Semantic clarity in your data warehouse

Next: Chapter 6 - Going to Production