Tutorial
Chapter 5 - Mastering DMDL
Goal: Expand your model with new entities (PRODUCT, ORDER_LINE) to master Relationships and advanced Attribute Types.
Prerequisites: You must have completed Chapter 4: Loading Strategies.
The Scenario
Marketing comes to you with a request: "We need to know which Books are selling the best."
Currently, your model only has CUSTOMER and ORDER. We need to add PRODUCT and the link between them (ORDER_LINE).
Step 1: Define the New Entities
Edit model.yaml and add these two new entities after your existing CUSTOMER and ORDER:
# ... existing entities (CUSTOMER, ORDER) ...
- id: PRODUCT
name: PRODUCT
definition: Items available for sale
description: Book catalog
attributes:
- id: product_id
name: product_id
definition: Unique product identifier
type: STRING
- id: product_title
name: product_title
definition: Title of the book
type: STRING
effective_timestamp: true
- id: product_format
name: product_format
definition: Format of the book (Hardcover, Paperback)
type: STRING
effective_timestamp: true
- id: ORDER_LINE
name: ORDER_LINE
definition: Individual item within an order
description: Link table between Orders and Products
attributes:
- id: order_line_id
name: order_line_id
definition: Unique ID for the line item
type: STRING
- id: quantity
name: quantity
definition: Number of units purchased
type: NUMBER
effective_timestamp: true
- id: line_amount
name: line_amount
definition: Total value of this line item
type: NUMBER
effective_timestamp: true
Step 2: Add the Relationships
Add these relationships to your existing relationships section:
# ... existing relationship
# NEW: Order line references a product
- name: IS_FOR_PRODUCT
definition: Line item references a product
source_entity_id: ORDER_LINE
target_entity_id: PRODUCT
# NEW: Order contains order lines
- name: BELONGS_TO
definition: Order line belongs to an order
source_entity_id: ORDER_LINE
target_entity_id: ORDER
Key Insight: Relationships are defined from the "many" side to the "one" side. An ORDER_LINE belongs to exactly one ORDER, but an ORDER can have many ORDER_LINEs.
Step 3: Generate Mappings for the New Entities
Generate mapping templates for all entities that don't have mappings yet:
daana-cli generate mapping
This command scans your model and generates templates only for new entities. Existing mappings are safely skipped:
Generating mapping templates...
Model: model.yaml
Output: mappings/
✓ Generated 2 mapping template(s)
Generated PRODUCT: product-mapping.yaml
Generated ORDER_LINE: order_line-mapping.yaml
⊘ Skipped 2 existing mapping(s)
Skipped CUSTOMER: customer-mapping.yaml already exists
Skipped ORDER: order-mapping.yaml already exists
To update existing mappings, use: daana-cli merge mapping
Note: If you need to update existing mappings after model changes, use
daana-cli merge mappinginstead - it preserves your transformation expressions while adding new attributes.
Step 4: Edit the Product Mapping
Edit mappings/product-mapping.yaml to configure the source table:
entity_id: PRODUCT
mapping_groups:
- name: default_mapping_group
tables:
- connection: dev
table: stage.products
primary_keys:
- product_id
ingestion_strategy: FULL
entity_effective_timestamp_expression: updated_at
attributes:
- id: product_id
transformation_expression: product_id
- id: product_title
transformation_expression: title
- id: product_format
transformation_expression: format
Run daana-cli check mapping to see progress. The check is supposed to fail partially at this point: you've only updated the product mapping; order_line-mapping.yaml still points at a non-existent table until you edit it in Step 5. Use the output to confirm which files pass (including product-mapping.yaml) and that the remaining error is only the order line schema issue shown below.
daana-cli check mapping
Checking 4 mapping(s) in directory: mappings
→ customer-mapping.yaml ✓
→ order-mapping.yaml ✓
→ order_line-mapping.yaml ✘
→ product-mapping.yaml ✓
✘ order_line-mapping.yaml:
• [schema] Table 'public.order_lines' not found in database
Summary:
Total: 4
Valid: 3
Invalid: 1
The product mapping is now valid. Fix the order line mapping next.
Step 5: Edit the Order Line Mapping
Edit mappings/order_line-mapping.yaml:
entity_id: ORDER_LINE
mapping_groups:
- name: default_mapping_group
tables:
- connection: dev
table: stage.sales_order_lines
primary_keys:
- CONCAT(CAST(order_id AS TEXT), '-', CAST(order_line_id AS TEXT))
ingestion_strategy: FULL
entity_effective_timestamp_expression: CURRENT_TIMESTAMP
attributes:
- id: order_line_id
transformation_expression: CONCAT(CAST(order_id AS TEXT), '-', CAST(order_line_id AS TEXT))
- id: quantity
transformation_expression: quantity
- id: line_amount
transformation_expression: (quantity * unit_price) - discount_amount
relationships:
# ORDER_LINE -> PRODUCT relationship
- id: ORDER_LINE_IS_FOR_PRODUCT_PRODUCT
source_table: stage.sales_order_lines
target_transformation_expression: product_id
# ORDER_LINE -> ORDER relationship
- id: ORDER_LINE_BELONGS_TO_ORDER
source_table: stage.sales_order_lines
target_transformation_expression: order_id
Important:
- The relationship
idin mappings follows the pattern:SOURCE_ENTITY_RELATIONSHIP_NAME_TARGET_ENTITY.- Composite Primary Keys: In real-world scenarios,
order_line_idtypically restarts for each order (1, 2, 3...). To create a unique identifier, we use aCONCATexpression inprimary_keyscombiningorder_idandorder_line_id.See Relationship Mappings for the full field reference.
Validate all mappings are now correct:
daana-cli check mapping
Checking 4 mapping(s) in directory: mappings
→ customer-mapping.yaml ✓
→ order-mapping.yaml ✓
→ order_line-mapping.yaml ✓
→ product-mapping.yaml ✓
Summary:
Total: 4
Valid: 4
Invalid: 0
All mappings are valid. Deploy them next.
Step 6: Update Workflow & Deploy
Add the new mappings to workflow.yaml:
mappings:
- mappings/customer-mapping.yaml
- mappings/order-mapping.yaml
- mappings/product-mapping.yaml # NEW
- mappings/order_line-mapping.yaml # NEW
Validate your workflow:
daana-cli check workflow
Checking workflow: workflow.yaml
Workflow: BOOK_RETAILER_WORKFLOW
Workflow ID: 1424339281
Model: ✓
Entity Coverage:
- CUSTOMER: mapped ✓
- ORDER: mapped ✓
- PRODUCT: mapped ✓
- ORDER_LINE: mapped ✓
✓ Workflow valid
All four entities are now mapped. Deploy and execute:
daana-cli deploy
daana-cli execute
Step 7: Query Your Complete Model
Open a psql shell against the local container so you can run the upcoming SQL queries:
docker exec -it daana-customerdb psql -U dev -d customerdb
Now you can answer Marketing's question:
-- Which books are selling the best?
SELECT
product_title,
product_format,
SUM(quantity)::int AS total_units_sold,
ROUND(SUM(line_amount)::numeric, 2) AS total_revenue
FROM daana_dw.view_product_with_rel
WHERE order_line_key IS NOT NULL
GROUP BY product_title, product_format
ORDER BY total_revenue DESC
LIMIT 10;
Daana stores NUMBER attributes as full-precision NUMERIC, so SUM(quantity) returns 3.00000000 and SUM(line_amount) returns 98.97000000. The ::int cast and ROUND(..., 2) are pure cosmetic formatting for the report.
Why view_product_with_rel and not view_order_line_with_rel?
Daana materializes related-entity columns on the target side of each relationship. ORDER_LINE is the source side of both its relationships (to PRODUCT and to ORDER), so view_order_line_with_rel carries only ORDER_LINE's own attributes, no joined product or order columns. The symmetrical target-side view (here view_product_with_rel) is where the order line attributes live alongside the product attributes.
Result:
product_title | product_format | total_units_sold | total_revenue
----------------------------------+----------------+------------------+---------------
The Way of Kings | Hardcover | 3 | 98.97
Children of Blood and Bone | Hardcover | 4 | 68.96
Words of Radiance | Hardcover | 2 | 58.98
Becoming | Hardcover | 2 | 56.98
The Testaments | Hardcover | 2 | 49.98
It Ends with Us | Paperback | 3 | 49.97
American Gods | Hardcover | 2 | 45.98
Norwegian Wood | Paperback | 3 | 39.17
The Ocean at the End of the Lane | Hardcover | 2 | 31.98
Americanah | Paperback | 2 | 28.48
(10 rows)
Order lines next to product info, no JOIN
The point of the target-side _with_rel views is to eliminate joins for the common case. view_product_with_rel already pairs each product with the order lines that reference it, so reading order line data alongside product data is a single-table query:
SELECT
order_line_id,
quantity::int AS quantity,
product_title,
product_format,
ROUND(line_amount::numeric, 2) AS line_amount
FROM daana_dw.view_product_with_rel
WHERE order_line_id LIKE '1-%'
ORDER BY order_line_id;
order_line_id | quantity | product_title | product_format | line_amount
---------------+----------+---------------------+----------------+-------------
1-1 | 1 | The Handmaid's Tale | Hardcover | 16.99
1-2 | 1 | American Gods | Hardcover | 22.99
1-3 | 1 | It Ends with Us | Paperback | 15.99
(3 rows)
The same applies to view_order_with_rel (carries order line attributes alongside ORDER) and view_customer_with_rel (carries ORDER attributes alongside CUSTOMER). The rule is simple: query the target-side view of the relationship you care about and the joined columns are already there.
Traversing the full chain
The _with_rel views compose. To answer "which customer ordered which book on which date", chain the three target-side views together: each one adds the next step. Two JOINs cover the full customer -> order -> order_line -> product chain.
SELECT
c.customer_name,
c.order_id,
c.order_purchase_ts,
p.product_title,
p.quantity::int AS quantity,
ROUND(p.line_amount::numeric, 2) AS line_amount
FROM daana_dw.view_customer_with_rel c
JOIN daana_dw.view_order_with_rel o
ON c.order_key = o.order_key
JOIN daana_dw.view_product_with_rel p
ON o.order_line_key = p.order_line_key
ORDER BY c.order_id::int, p.order_line_id
LIMIT 8;
customer_name | order_id | order_purchase_ts | product_title | quantity | line_amount
---------------+----------+---------------------+---------------------+----------+-------------
John Smith | 1 | 2024-11-01 10:30:00 | The Handmaid's Tale | 1 | 16.99
John Smith | 1 | 2024-11-01 10:30:00 | American Gods | 1 | 22.99
John Smith | 1 | 2024-11-01 10:30:00 | It Ends with Us | 1 | 15.99
Emily Johnson | 2 | 2024-11-02 14:15:00 | Sapiens | 1 | 22.49
Michael Brown | 3 | 2024-11-03 09:20:00 | Americanah | 1 | 13.49
Michael Brown | 3 | 2024-11-03 09:20:00 | Norwegian Wood | 2 | 25.18
Michael Brown | 3 | 2024-11-03 09:20:00 | Becoming | 1 | 26.99
Sarah Davis | 4 | 2024-11-04 16:45:00 | The Way of Kings | 1 | 32.99
(8 rows)
Each JOIN spans one relationship by its shared key (order_key, then order_line_key). No junction views, no manual key resolution. The _with_rel view of the entity at each end of the chain carries the entity's own attributes; the views in the middle (here view_order_with_rel) connect adjacent relationships and also expose useful columns of their own.
The per-relationship junction views (e.g. view_order_line_is_for_product_product) expose the related entity keys plus their activation timestamps and nothing else. Reach for them when you need only the keys, for example when joining Daana's output to an external table that does not live in the model.
Summary of Key DMDL Concepts
Attribute Types
STRING- Text valuesNUMBER- Numeric values (integers, decimals)START_TIMESTAMP- When an entity lifecycle beginsEND_TIMESTAMP- When an entity lifecycle endsUNIT- Measurement units (currency, weight units, etc.)
Ingestion Strategies
FULL- Complete table refreshINCREMENTAL- Delta processing with watermarkFULL_LOG- Snapshot series for attribute-level historyTRANSACTIONAL- Append-only events
Relationship Direction
Relationships are defined from the child (many) to the parent (one):
ORDER_LINE → PRODUCT(many lines can reference one product)ORDER_LINE → ORDER(many lines belong to one order)ORDER → CUSTOMER(many orders belong to one customer)
Group Attributes
- Bundle related attributes (Amount + Currency)
- Enforce data quality at the model level
- Semantic clarity in your data warehouse