Data Flow
The Problem
Understanding how data moves through WRIT, from raw CSV files to questions and actions, is essential for debugging unexpected results, designing effective domain models, and explaining the system to stakeholders.
How WRIT Handles It
WRIT processes data in a unidirectional pipeline: each stage consumes the output of the previous stage and transforms it into a more refined form.
The Pipeline
graph TD
A["Attestation Sources<br/>(CSV, APIs)"] --> B["Entity Sets<br/>(collections of entities)"]
B --> C["FluentFacts<br/>(time-aware attribute values with 3vl)"]
C --> D["Partitions<br/>(classify entities via predicates)"]
D --> E["Expectations<br/>(measure alignment)"]
E --> F["Actions<br/>(false → expectee)"]
E --> G["Questions<br/>(unknown → attestor)"]
style A fill:#ecfeff,stroke:#06B6D4,color:#0e7490
style B fill:#ecfeff,stroke:#06B6D4,color:#0e7490
style C fill:#ecfeff,stroke:#06B6D4,color:#0e7490
style D fill:#ecfeff,stroke:#06B6D4,color:#0e7490
style E fill:#ecfeff,stroke:#06B6D4,color:#0e7490
style F fill:#fef3c7,stroke:#f59e0b,color:#92400e
style G fill:#fef3c7,stroke:#f59e0b,color:#92400e
Stage by Stage
1. Attestation Sources → Entity Sets
Raw data from CSV files (or other sources) is loaded into entity sets. Each row becomes an entity. The entity type schema determines which fields map to which attributes.
services_data.csv:
name,owner_email,tier
CRM,alice@co.com,critical
Email,,standard
HR Portal,N/A,critical
→ Entity Set "all_it_services" with 3 entities
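As a rough illustration of this stage, the sketch below loads the same CSV into a list of entities using only the Python standard library. The `Entity` class and `load_entity_set` function are hypothetical names chosen for this example and are not WRIT's actual API; the real loader also applies the entity type schema, which is omitted here.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Entity:
    """Hypothetical stand-in for a WRIT entity: a name plus raw attribute strings."""
    name: str
    attributes: dict


def load_entity_set(csv_text, key_field="name"):
    """Turn each CSV row into one Entity, keeping the raw string values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        Entity(
            name=row[key_field],
            attributes={k: v for k, v in row.items() if k != key_field},
        )
        for row in reader
    ]


SERVICES_CSV = """name,owner_email,tier
CRM,alice@co.com,critical
Email,,standard
HR Portal,N/A,critical
"""

all_it_services = load_entity_set(SERVICES_CSV)
```

Note that the empty cell and the "N/A" cell are kept as raw strings at this point; interpreting them is left to the next stage.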
2. Entity Sets → FluentFacts
Each attribute value is wrapped in a FluentFact — a time-aware descriptor that:
- Records when the data was attested (timestamp from the data source).
- Applies sentinel mappings ("N/A" → Unknown, "DELETED" → Void).
- Checks the validity period (is the data stale?).
- Produces a result under three-valued logic (True, False, or Unknown), with Void as an additional outcome, plus a reason chain explaining how the value was derived.
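The steps in this list can be sketched as a small function. Everything here is an assumption made for illustration: the names `State`, `Fact`, and `make_fact` are hypothetical, the sentinel table includes the empty string only because the worked example on this page treats it as Unknown, and mapping stale data to Unknown is a guess about what the validity check does.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum


class State(Enum):
    KNOWN = "known"
    UNKNOWN = "unknown"
    VOID = "void"


# Assumed sentinel table for this sketch.
SENTINELS = {"N/A": State.UNKNOWN, "": State.UNKNOWN, "DELETED": State.VOID}


@dataclass(frozen=True)
class Fact:
    """Hypothetical, simplified stand-in for a FluentFact."""
    attribute: str
    value: str
    state: State
    reasons: tuple


def make_fact(attribute, raw, source, attested_at, now, max_age):
    """Wrap a raw value: record provenance, map sentinels, check staleness."""
    reasons = (f"{attribute}={raw} from {source}",)
    state = SENTINELS.get(raw, State.KNOWN)
    if state is not State.KNOWN:
        reasons += (f"{raw!r} is a sentinel -> {state.value}",)
    elif now - attested_at > max_age:
        # Assumption: data past its validity period is treated as Unknown.
        state = State.UNKNOWN
        reasons += (f"attested {attested_at:%Y-%m-%d} is stale -> unknown",)
    return Fact(attribute, raw, state, reasons)


now = datetime(2024, 6, 1)
fact = make_fact("owner_email", "N/A", "services_data",
                 attested_at=now - timedelta(days=1), now=now,
                 max_age=timedelta(days=90))
```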
3. FluentFacts → Partitions
Partition predicates evaluate the FluentFact values of each entity. For example, the predicate "owner_email" is known checks whether the FluentFact for owner_email holds a known value rather than Unknown or Void.
Each entity lands in one of five branches: is_true, is_false, is_unknown, is_void, or the convenience branch is_known.
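A minimal sketch of this classification step follows. The `partition` and `owner_is_known` functions are hypothetical, and the input is simplified to a mapping from entity name to a state string. The predicate mirrors the worked example on this page, where missing data lands in is_unknown rather than is_false. The is_known convenience branch is left out because its exact contents are not specified here.

```python
BRANCHES = ("is_true", "is_false", "is_unknown", "is_void")


def owner_is_known(state):
    """Hypothetical predicate: map an attribute state to a branch name."""
    if state == "known":
        return "is_true"
    if state == "void":
        return "is_void"
    return "is_unknown"


def partition(states_by_entity, predicate):
    """Place every entity into exactly one of the four primary branches."""
    branches = {name: [] for name in BRANCHES}
    for entity_name, state in states_by_entity.items():
        branches[predicate(state)].append(entity_name)
    return branches


has_owner = partition(
    {"CRM": "known", "Email": "unknown", "HR Portal": "unknown"},
    owner_is_known,
)
```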
4. Partitions → Expectations
Expectations assert that specific partition branches should be empty. "Expect empty: is_false" means the is_false branch of the partition should contain zero entities.
The expectation evaluates to:
- Met (true) — The branch is empty.
- Not met (false) — The branch contains entities. Actions are generated.
- Uncertain (unknown) — The is_unknown branch contains entities. Questions are generated.
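The three outcomes can be sketched as a single function over partition branches. The function name `evaluate_expect_empty` is hypothetical, and the precedence is an assumption: this sketch reports "false" whenever the target branch is non-empty, even if is_unknown also contains entities. WRIT may well generate both actions and questions in that case.

```python
def evaluate_expect_empty(branches, target="is_false"):
    """Return 'true', 'false', or 'unknown' for an 'expect empty' check."""
    if branches[target]:
        return "false"    # not met: the target branch contains entities
    if branches["is_unknown"]:
        return "unknown"  # uncertain: some entities could not be classified
    return "true"         # met: the target branch is empty and nothing is unknown


branches = {
    "is_true": ["CRM"],
    "is_false": [],
    "is_unknown": ["Email", "HR Portal"],
    "is_void": [],
}
result = evaluate_expect_empty(branches)
```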
5. Expectations → Questions & Actions
The final outputs are directed at specific people:
- Actions go to the expectee (found via relationship traversal from the failing entity).
- Questions go to the attestor (the person responsible for the data source).
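The routing rule above can be sketched as follows. All names are hypothetical: `Output` and `route_outputs` are invented for this example, the `expectee_of` dictionary stands in for WRIT's relationship traversal, and the action wording is made up (only the question wording follows the worked example on this page).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Output:
    kind: str        # "action" or "question"
    recipient: str
    text: str


def route_outputs(branches, attribute, attestor, expectee_of):
    """Send actions to each entity's expectee and questions to the attestor."""
    outputs = []
    for name in branches["is_false"]:
        # expectee_of is a plain lookup standing in for relationship traversal.
        outputs.append(Output("action", expectee_of[name],
                              f"Resolve {attribute} for {name}"))
    for name in branches["is_unknown"]:
        outputs.append(Output("question", attestor,
                              f"What is the {attribute} for {name}?"))
    return outputs


outputs = route_outputs(
    {"is_true": ["CRM"], "is_false": [], "is_unknown": ["Email", "HR Portal"], "is_void": []},
    attribute="owner_email",
    attestor="attestor of services_data",
    expectee_of={},
)
```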
Reason Chains
Every value in the pipeline carries a reason chain — a tuple of strings explaining how it was derived:
('owner_email=alice@co.com from services_data',
'owner_email (alice@co.com) is known -> True')
These chains provide full provenance for debugging. When a result is unexpected, the reason chain shows exactly which data and which logical steps produced it.
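One way to picture how such a chain accumulates is sketched below: each step returns a new tuple with one more explanation appended, so earlier provenance is never overwritten. The functions `attest` and `check_known` are hypothetical, and the sentinel set is an assumption taken from the examples on this page.

```python
def attest(attribute, value, source):
    """Start a reason chain with the provenance of the raw value."""
    return value, (f"{attribute}={value} from {source}",)


def check_known(attribute, value, chain, sentinels=("", "N/A")):
    """Append the logical step to the chain; None stands for Unknown here."""
    if value in sentinels:
        return None, chain + (f"{attribute} ({value}) is a sentinel -> Unknown",)
    return True, chain + (f"{attribute} ({value}) is known -> True",)


value, chain = attest("owner_email", "alice@co.com", "services_data")
result, chain = check_known("owner_email", value, chain)
```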
Campaign Scoping
The pipeline runs within the scope of a campaign. Before evaluation begins, WRIT builds a scoped registry containing only the entity types, entity sets, and data sources relevant to the campaign's expectations. This means:
- Unrelated data is not loaded.
- Unrelated partitions are not evaluated.
- Reports are focused on the campaign's goals.
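The scoping idea can be sketched as a walk from the campaign's expectations back to the data they depend on. The three lookup tables and the `build_scope` function are hypothetical, and the entries "has_manager", "all_employees", and "employees.csv" are invented purely to show unrelated data being excluded.

```python
# Hypothetical registry tables: each maps an item to what it depends on.
EXPECTATION_TO_PARTITION = {
    "all_services_owned": "has_owner",
    "all_staff_managed": "has_manager",
}
PARTITION_TO_ENTITY_SET = {
    "has_owner": "all_it_services",
    "has_manager": "all_employees",
}
ENTITY_SET_TO_SOURCE = {
    "all_it_services": "services_data.csv",
    "all_employees": "employees.csv",
}


def build_scope(campaign_expectations):
    """Collect only the partitions, entity sets, and sources the campaign needs."""
    partitions = {EXPECTATION_TO_PARTITION[e] for e in campaign_expectations}
    entity_sets = {PARTITION_TO_ENTITY_SET[p] for p in partitions}
    sources = {ENTITY_SET_TO_SOURCE[s] for s in entity_sets}
    return {"partitions": partitions, "entity_sets": entity_sets, "sources": sources}


scope = build_scope(["all_services_owned"])
```

With this scope, only services_data.csv would be loaded and only the has_owner partition evaluated.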
Worked Example
End-to-end flow for "all IT services must have an owner":
1. services_data.csv loaded → 3 entities in "all_it_services"
2. FluentFacts created:
CRM.owner_email = "alice@co.com" (known, True)
Email.owner_email = "" (sentinel → Unknown)
HR Portal.owner_email = "N/A" (sentinel → Unknown)
3. Partition "has_owner" evaluated:
is_true: [CRM] (owner is known)
is_false: [] (none definitely lack an owner)
is_unknown: [Email, HR Portal] (ownership data missing)
is_void: []
4. Expectation "all_services_owned" checked:
expect empty: "is_false" → is_false is empty ✓
But is_unknown has 2 entities → Questions generated
5. Outputs:
Question: "What is the owner_email for Email?" → attestor of services_data
Question: "What is the owner_email for HR Portal?" → attestor of services_data
Common Pitfalls
- Debugging without reason chains — When a result is unexpected, always check the reason chain first. It shows exactly which data values and logical operations produced the result.
- Forgetting campaign scope — If an expectation is not being evaluated, check that it is included in the campaign's scope tree (via step towards).
- Assuming data loads instantly — Entity sets are loaded from files at evaluation time. If the source file has changed, re-evaluation is needed.
What to Read Next
This is the capstone page of the Core Concepts learning path. From here, continue with:
- Domain Modelling — Detailed guide to building domain models.
- Writing Expectations (DSL) — Full DSL syntax reference.
- Glossary — All terms in one place.