DSL Reference
This document is the syntax reference for kbpy's Domain Definition Language — the .kb files used to define entity types, partitions, expectations, and everything else in a domain. For the conceptual motivation behind these constructs, see Domain Modeling. For foundational theory, see Core Concepts. For running and operating kbpy, see Serving & Operations.
The DSL Approach
kbpy domains are defined using an English-like DSL inspired by FO(.) (the IDP-Z3 knowledge base language), rather than Python API calls. This design choice brings several benefits:
- Eliminates boilerplate: A 52-line Python predicate function with `isinstance(x, VoidLookup)` defensive checks becomes 5 lines of DSL. Void and Unknown propagation happens automatically through kbpy's three-valued logic system.
- Enforces consistent patterns: Every entity type, partition, and expectation follows the same structure. There's no risk of forgetting to handle edge cases that the DSL handles automatically.
- Declarative and auditable: `.kb` files read like specifications, not code. A domain expert can review an expectation definition without understanding Python.
- Safe evaluation: The predicate interpreter evaluates DSL expressions against entity instances without `exec` or `eval`, eliminating a class of security and correctness issues.
File Layout
src/<domain_name>.kb Single-file domain
src/<domain_name>/domain.kb Multi-file domain (definitions)
src/<domain_name>/stakeholders.kb Multi-file domain (expectations)
Multiple .kb files in the same domain directory are merged during parsing. Use separate files to organise large domains — for example, entity type definitions in domain.kb and stakeholder-facing expectations in stakeholders.kb.
Entity Type
All 8 fields are mandatory.
entity type: "Person"
display name: "Person"
display format: "{name}"
source entity sets: "all_people"
sink entity set: "all_people"
predicate: true
exhaustive: true
identifying attributes: "name"
| Clause | Purpose |
|---|---|
| `display name` | Human-readable name shown in the web interface |
| `display format` | Format string for entity display (e.g., `"{name}"`) |
| `source entity sets` | Set expression using `union`, `intersect`, or `except` (see below) |
| `sink entity set` | Name of the entity set produced by this entity type |
| `predicate` | Filter expression applied to source entities (use `true` to accept all) |
| `exhaustive` | Whether this is a complete (`true`) or partial (`false`) set. Ignored for `intersect` and `except`, where it is derived from source partiality |
| `identifying attributes` | Attribute(s) used to identify unique entities, comma-separated |
Entity types derive from source entity sets via a set expression and produce a sink entity set. Attributes, relationships, and attestation sources are separate top-level blocks. See Set Operations on Entity Types for use cases.
Set Expressions in Entity Types
The source entity sets: field accepts a full set expression using three infix keywords:
| Syntax | Operation | Semantics |
|---|---|---|
| `"set_a" union "set_b"` | Union | Entities in at least one source, deduplicated by identity |
| `"set_a" intersect "set_b"` | Intersection | Entities whose identity appears in all listed sources |
| `"set_a" except "set_b"` | Difference | Entities in the primary source not in the secondary source |
Single set (leaf expression):
source entity sets: "all_people"
Union — combine two sources:
source entity sets: "employees_from_hr" union "employees_from_payroll"
Entities from all sources are combined and deduplicated by identifying attributes. When the same entity appears with conflicting attribute values across sources, the conflicting attributes become Unknown (see Domain Modeling: Set Operations).
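The deduplication and conflict rule can be illustrated in plain Python. This is a minimal sketch, not kbpy's implementation: the `UNKNOWN` marker, the `union_sets` helper, and the dict-based entity records are all assumptions made for illustration.

```python
# Hypothetical sketch of union semantics; none of these names are kbpy APIs.
UNKNOWN = "<Unknown>"  # stand-in for kbpy's Unknown value


def union_sets(sources, identifying):
    """Merge entity records from several sources, keyed by identity.

    Attributes whose values disagree across sources become UNKNOWN.
    """
    merged = {}
    for source in sources:
        for record in source:
            key = tuple(record[attr] for attr in identifying)
            if key not in merged:
                merged[key] = dict(record)
                continue
            target = merged[key]
            for attr, value in record.items():
                if attr in target and target[attr] != value:
                    target[attr] = UNKNOWN  # conflicting values across sources
                else:
                    target.setdefault(attr, value)
    return list(merged.values())


hr = [{"name": "Ada", "salary": 100}, {"name": "Bob", "salary": 80}]
payroll = [{"name": "Ada", "salary": 120}, {"name": "Cy", "salary": 70}]
employees = union_sets([hr, payroll], identifying=["name"])
```

Here "Ada" appears in both sources with different salaries, so the merged record keeps her identity but carries an Unknown salary, while "Bob" and "Cy" pass through unchanged.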
Intersection — entities that appear in all sources:
source entity sets: "cloud_services" intersect "cmdb_services"
Only entities whose identity matches in every listed source are retained. Useful for "entities we can confirm from multiple sources." exhaustive: is ignored for intersection types — is_partial is derived from source partiality.
Intersection — three sources:
source entity sets: "cloud_assets" intersect "cmdb_assets" intersect "finance_assets"
Difference — primary minus secondary:
source entity sets: "all_clusters" except "decommissioned_clusters"
Entities from the primary source (all_clusters) that do not appear in the secondary source (decommissioned_clusters). Attributes from the secondary source are not available on the sink. exhaustive: is ignored for difference types — is_partial is derived from source partiality.
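Intersection and difference can be sketched the same way. The helpers below are hypothetical and only model identity matching. Taking the surviving records from the first listed source in an intersection is an assumption of this sketch, since the reference does not say which source supplies attribute values.

```python
# Hypothetical sketch; kbpy's real data structures are not shown in this document.
def identity(record, identifying):
    return tuple(record[attr] for attr in identifying)


def intersect_sets(sources, identifying):
    """Keep records (taken from the first source) whose identity is in every source."""
    first, *rest = sources
    other_keys = [{identity(r, identifying) for r in src} for src in rest]
    return [r for r in first
            if all(identity(r, identifying) in keys for keys in other_keys)]


def except_sets(primary, secondary, identifying):
    """Keep primary records whose identity is absent from the secondary source."""
    excluded = {identity(r, identifying) for r in secondary}
    return [r for r in primary if identity(r, identifying) not in excluded]


all_clusters = [{"name": "c1"}, {"name": "c2"}, {"name": "c3"}]
decommissioned = [{"name": "c2", "retired_on": "2024-01-01"}]
cmdb = [{"name": "c1"}, {"name": "c2"}]

active = except_sets(all_clusters, decommissioned, ["name"])
confirmed = intersect_sets([all_clusters, cmdb], ["name"])
```

Because `except_sets` only ever returns records from the primary source, attributes that exist solely on the secondary source (such as `retired_on` here) never reach the result.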
Compound exclusion (parentheses):
source entity sets: "all_services" except ("decommissioned_services" union "excluded_services")
Composed expression:
source entity sets: ("cloud_services" union "on_prem_services") intersect "cmdb_services"
Mixing operators requires parentheses. "A" union "B" intersect "C" is a parse error — write ("A" union "B") intersect "C" or "A" union ("B" intersect "C") to disambiguate. "A" except "B" except "C" is also a parse error — write "A" except ("B" union "C").
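The parenthesisation rule can be expressed as a small recursive-descent check. This is an illustrative validator, not kbpy's parser. It assumes that chaining the same operator is allowed for `union` and `intersect` (the three-source intersection above suggests so) and rejects mixed operators and chained `except`.

```python
import re

# Hypothetical validator for the parenthesisation rule; not kbpy's parser.
TOKEN = re.compile(r'\s*("[^"]*"|\(|\)|union|intersect|except)')
OPERATORS = ("union", "intersect", "except")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def _term(tokens):
    if not tokens:
        raise SyntaxError("expected a set name or '('")
    head, rest = tokens[0], tokens[1:]
    if head == "(":
        tree, rest = _expr(rest)
        if not rest or rest[0] != ")":
            raise SyntaxError("missing ')'")
        return tree, rest[1:]
    if head.startswith('"'):
        return head.strip('"'), rest
    raise SyntaxError(f"unexpected token {head!r}")


def _expr(tokens):
    left, rest = _term(tokens)
    operator, operands = None, [left]
    while rest and rest[0] in OPERATORS:
        if operator is not None and rest[0] != operator:
            raise SyntaxError("mixing operators requires parentheses")
        if operator == "except":
            raise SyntaxError("chained 'except' requires parentheses")
        operator = rest[0]
        term, rest = _term(rest[1:])
        operands.append(term)
    tree = (operator, *operands) if operator else left
    return tree, rest


def parse(text):
    """Return a nested tuple for a valid expression, or raise SyntaxError."""
    tree, rest = _expr(tokenize(text))
    if rest:
        raise SyntaxError(f"unexpected token {rest[0]!r}")
    return tree
```

For example, `parse('"A" except ("B" union "C")')` yields a nested tuple, while `parse('"A" union "B" intersect "C"')` raises `SyntaxError`.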
Attribute (top-level, linked via of)
attribute: "salary"
of: "all_people"
type: int
| Clause | Purpose |
|---|---|
| `of` | Name of the entity set this attribute belongs to |
| `type` | One of: `string`, `int`, `float`, `bool` |
Attributes are pure schema declarations. Data field mapping, sentinel values, and validity are configured in the attestation source definition.
Relationship (top-level, linked via of)
relationship: "department"
of: "all_employees"
from: "department_name"
to: "all_departments" of "department"
match on: "name"
| Clause | Purpose |
|---|---|
| `of` | Name of the entity set this relationship belongs to |
| `from` | Source attribute on this entity |
| `to` | Target entity set (with optional `of` module qualifier) |
| `match on` | Attribute on the target entity to match against |
Attestation Source
Attestation sources define how external data maps onto entity sets. They carry the field mapping, sentinel values, and validity settings that were previously on attribute definitions.
attestation source: "people_data"
description:
data_source: "HR system export"
method: "CSV download from HR portal"
returns: "Employee names, salaries, and departments"
entity set: "all_people"
exhaustive: true
unknown if missing: "salary"
void if missing: "__EMPTY__"
attestor: "all_stakeholders" ? "id" = "HR_TEAM"
type: csv
from data field: "name" -> "name"
returns: "Employee full name"
unknowns: "N/A", ""
voids: "__EMPTY__"
trues: "__EMPTY__"
falses: "__EMPTY__"
validity: 365 days
from data field: "salary" -> "salary"
returns: "Annual salary in GBP"
unknowns: "N/A", ""
voids: "__EMPTY__"
trues: "__EMPTY__"
falses: "__EMPTY__"
validity: 365 days
match on: "name"
| Clause | Purpose |
|---|---|
| `description:` | Structured description block (required) |
| `data_source` | Human-readable name of the external data source |
| `method` | How to obtain or access the data |
| `returns` | What attributes or information the source provides |
| `entity set` | Which entity set this source populates |
| `exhaustive` | Whether this source provides a complete data set |
| `unknown if missing` | Attributes that return Unknown when absent (`"__EMPTY__"` if none) |
| `void if missing` | Attributes that return Void when absent (`"__EMPTY__"` if none) |
| `attestor` | Stakeholder responsible for this data source |
| `type` | Source type (e.g., `csv`, `sqlite`, `inline`) |
| `from data field` | Maps a source column to an attribute (`"col" -> "attr"`) |
| `returns` (field) | Per-field description of what the field provides |
| `unknowns` | Per-field sentinel values that map to Unknown |
| `voids` | Per-field sentinel values that map to Void |
| `trues` / `falses` | Per-field sentinel values for bool attributes |
| `validity` | Per-field staleness window (e.g., `365 days`) or `__ONGOING__` |
| `match on` | Field used to match records to entities |
| `file` | File path for file-based sources (optional) |
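To make the sentinel clauses concrete, the following sketch maps one raw cell to a value, Unknown, or Void. It is an assumption-laden illustration rather than kbpy's loader: `read_field`, `active_sentinels`, and the `UNKNOWN`/`VOID` markers are hypothetical, and "missing" is modelled simply as the column being absent from the row.

```python
# Hypothetical sketch of per-field sentinel handling; not kbpy's loader.
UNKNOWN = "<Unknown>"
VOID = "<Void>"
EMPTY = "__EMPTY__"  # placeholder meaning "no values configured"


def active_sentinels(values):
    """A list holding only "__EMPTY__" means there are no sentinel values."""
    values = list(values)
    return [] if values == [EMPTY] else values


def read_field(row, column, unknowns, voids, on_missing):
    """Map one raw cell to its value, UNKNOWN, or VOID.

    on_missing is UNKNOWN or VOID, depending on whether the attribute is
    listed under 'unknown if missing' or 'void if missing'.
    """
    if column not in row:
        return on_missing
    raw = row[column]
    if raw in active_sentinels(unknowns):
        return UNKNOWN
    if raw in active_sentinels(voids):
        return VOID
    return raw


row = {"name": "Ada", "salary": "N/A"}
salary = read_field(row, "salary", ["N/A", ""], [EMPTY], on_missing=UNKNOWN)
name = read_field(row, "name", ["N/A", ""], [EMPTY], on_missing=VOID)
```

With the sentinels from the example above, the `"N/A"` salary resolves to Unknown and the name passes through as a regular value.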
Partition
All 4 branches are mandatory. Use "__IRRELEVANT__" for branches that are not meaningful. At least one branch must be a real entity set name.
partition: "is_old"
source entity sets: "all_people"
predicate: "age" is greater than 65
when true: "elderly"
when false: "youth"
when unknown: "__IRRELEVANT__"
when void: "__IRRELEVANT__"
| Clause | Purpose |
|---|---|
| `source entity sets` | The parent entity set(s) to partition (comma-separated) |
| `predicate` | Expression that classifies each entity |
| `when true` | Entity set name for entities where predicate is true (`"__IRRELEVANT__"` if unused) |
| `when false` | Entity set name for entities where predicate is false (`"__IRRELEVANT__"` if unused) |
| `when unknown` | Entity set name for entities where predicate is unknown (`"__IRRELEVANT__"` if unused) |
| `when void` | Entity set name for entities where predicate is void (`"__IRRELEVANT__"` if unused) |
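The four-branch routing can be pictured with a short Python sketch. It is not kbpy code: the `partition` function, the string outcomes, and the choice to drop entities that fall into an `"__IRRELEVANT__"` branch are assumptions for illustration.

```python
# Hypothetical sketch of four-way partition routing; not kbpy's implementation.
IRRELEVANT = "__IRRELEVANT__"
OUTCOMES = ("true", "false", "unknown", "void")


def partition(entities, predicate, branches):
    """Route each entity to the sink set named for its predicate outcome."""
    missing = [o for o in OUTCOMES if o not in branches]
    if missing:
        raise ValueError(f"missing branches: {missing}")
    if all(branches[o] == IRRELEVANT for o in OUTCOMES):
        raise ValueError("at least one branch must name a real entity set")
    sinks = {branches[o]: [] for o in OUTCOMES if branches[o] != IRRELEVANT}
    for entity in entities:
        sink = branches[predicate(entity)]
        if sink != IRRELEVANT:
            sinks[sink].append(entity)
    return sinks


def is_old(person):
    """Mirror of: "age" is greater than 65, with a missing age as unknown."""
    age = person.get("age")
    if age is None:
        return "unknown"
    return "true" if age > 65 else "false"


people = [{"name": "Ada", "age": 70}, {"name": "Bob", "age": 30}, {"name": "Cy"}]
result = partition(people, is_old, {
    "true": "elderly", "false": "youth",
    "unknown": IRRELEVANT, "void": IRRELEVANT,
})
```

In this run "Ada" lands in `elderly`, "Bob" in `youth`, and "Cy" (no age) falls into the unnamed unknown branch.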
Predicate Expressions
Predicates use an English subset of FO(.) syntax:
| Syntax | Meaning |
|---|---|
| `"salary"` | Attribute access on current entity |
| `"legacy_flag" of "dr_set"` | Relationship traversal |
| `is`, `is not` | Equality comparisons |
| `is greater than`, `is less than` | Numeric comparisons |
| `is at least`, `is at most` | Numeric comparisons (inclusive) |
| `is known`, `is not void` | Three-valued logic checks |
| `and`, `or`, `not` | Boolean connectives |
| `there is a x in "set" of "module": ...` | Existential quantifier |
| `if ... then ... else ...` | Conditional |
| `true`, `false` | Constants |
The predicate interpreter (src/kbpy/predicate.py) evaluates these expressions against entity instances, returning Bool3vl with automatic void/unknown propagation through kbpy's existing 3VL system. This eliminates the defensive isinstance(x, VoidLookup) boilerplate needed in Python predicates.
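As a rough illustration of what automatic propagation means, the sketch below implements strong Kleene three-valued connectives. It is an assumption that kbpy's `Bool3vl` behaves this way. The `B3` enum and helper names are hypothetical, and Void handling is left out because this reference does not spell out its rules.

```python
from enum import Enum


class B3(Enum):
    """Stand-in for a three-valued boolean; not kbpy's Bool3vl class."""
    FALSE = 0
    UNKNOWN = 1
    TRUE = 2


def and3(a, b):
    # Kleene conjunction: the "least true" operand wins.
    return B3(min(a.value, b.value))


def or3(a, b):
    # Kleene disjunction: the "most true" operand wins.
    return B3(max(a.value, b.value))


def not3(a):
    # Negation swaps TRUE and FALSE and leaves UNKNOWN unchanged.
    return B3(2 - a.value)


def greater_than(value, threshold):
    """Comparison that reports UNKNOWN when the value is missing (None)."""
    if value is None:
        return B3.UNKNOWN
    return B3.TRUE if value > threshold else B3.FALSE


salary_check = greater_than(None, 100000)
combined = and3(salary_check, B3.TRUE)
```

The point of the sketch is that no call site needs an explicit "is this value missing?" branch: the missing salary turns into `UNKNOWN` once and then flows through `and3`, `or3`, and `not3`.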
Common Predicate Patterns
Simple attribute check
predicate: "salary" is greater than 100000
Tests a single attribute. If salary is Unknown, the result is Unknown and the entity lands in is_unknown.
Relationship traversal
predicate: "budget" of "department" is greater than 250000
Follows the department relationship and accesses budget on the target entity. If the relationship lookup fails (Void or Unknown), the result propagates.
Existential quantifier
predicate: there is a ns in "all_namespaces" of "namespace":
"cluster_key" of ns is "cluster_key"
Checks whether any entity in the target set satisfies the condition. Useful for testing membership or existence of related entities.
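One plausible reading of the quantifier under three-valued logic is sketched below: a single true witness makes the result true, and otherwise any unknown comparison keeps the result unknown. This is an assumption, not documented kbpy behaviour. `exists3` is a hypothetical helper and `None` stands in for Unknown.

```python
def exists3(items, condition):
    """Three-valued 'there is a x in set: condition(x)'.

    condition returns True, False, or None (None stands in for Unknown).
    """
    saw_unknown = False
    for item in items:
        result = condition(item)
        if result is True:
            return True  # one witness is enough
        if result is None:
            saw_unknown = True
    return None if saw_unknown else False


def same_cluster(cluster_key):
    """Build the condition: "cluster_key" of ns is <cluster_key>."""
    def condition(ns):
        key = ns.get("cluster_key")
        return None if key is None else key == cluster_key
    return condition


namespaces = [{"name": "ns1", "cluster_key": "c1"}, {"name": "ns2"}]
has_namespace_c1 = exists3(namespaces, same_cluster("c1"))
has_namespace_c9 = exists3(namespaces, same_cluster("c9"))
```

Here `has_namespace_c1` is true because `ns1` matches, while `has_namespace_c9` is unknown because `ns2` has no cluster key and might match.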
Multi-condition check
predicate: "status" is "active" and "owner" is known
Combines conditions. Remember: if either side is Void, the result follows the propagation rules.
Conditional predicate
predicate: if "is_critical" then "has_backup_plan" else true
Tests different conditions based on entity attributes. Non-critical entities trivially satisfy this.
Partition reference in predicate
predicate: "form_team" of "sport"
References a partition result on a related entity. This checks whether the related sport entity is in the is_true branch of the form_team partition.
Steward Declaration
Use steward: to declare set-level responsibility:
steward: "all_clusters" of "cluster"
by: "all_stakeholders"
match on: "id"
match value: "PLATFORM_LEAD"
After this declaration, the entity set has a steward attribute that resolves to the matched stakeholder entity.
Expectation
Expectations assert that certain partition branches should be empty. They connect an expector (the person who holds the expectation) to a measurable condition, and identify an expectee (who should address misalignment).
Defining an Expectation
expectation: "all_services_owned"
expector: "all_stakeholders" ? "id" = "SERVICE_OWNER_LEAD"
of partition: "has_owner" of "all_it_services" of "it_service"
expect empty: "is_false"
when aligned: "__IRRELEVANT__"
when misaligned: "__IRRELEVANT__"
when uncertain: "__IRRELEVANT__"
expectee: SELF->"owner"
actions:
- "Contact GBGF leads to assign service owners"
steps towards: "__CAMPAIGN__"
All fields are mandatory. Use "__CAMPAIGN__" for steps towards on campaign root expectations. Use - "__EMPTY__" for actions when there are no actions. Named sink entity sets (non-"__IRRELEVANT__") are registered in the entity set registry and can be used as source entity sets for downstream entity types and partitions (see Expectations as Entity Set Creators).
| Clause | Purpose |
|---|---|
| `expector` | Entity who holds the expectation (`entity_set ? field = value`) |
| `of partition` | Which partition to evaluate (`partition_name of entity_set of module`) |
| `expect empty` | Which branch(es) should be empty (comma-separated) |
| `when aligned` | Sink entity set for aligned entities (`"__IRRELEVANT__"` if unused) |
| `when misaligned` | Sink entity set for misaligned entities (`"__IRRELEVANT__"` if unused) |
| `when uncertain` | Sink entity set for uncertain entities (`"__IRRELEVANT__"` if unused) |
| `expectee` | `SELF->` traversal to the entity responsible for addressing misalignment |
| `actions` | List of action descriptions (`- "__EMPTY__"` if none) |
| `steps towards` | Parent expectation(s), comma-separated (`"__CAMPAIGN__"` for roots) |
Expectation Design Guide
A good expectation has four components:
- A clear expector: A specific stakeholder who cares about this condition. Don't create expectations without a named owner.
- A measurable partition: A predicate that cleanly divides entities into branches. The expected-empty branch should contain entities that represent the misalignment.
- An actionable expectee: A person or team who can act on misaligned entities. The `SELF->` traversal must resolve to a real person.
- Concrete actions: Specific steps to address misalignment, not vague directives.
Worked example: A business requirement says "All IT services must have an owner."
// 1. Define entity type (attributes are separate top-level blocks;
//    field mapping and missing-value handling live in the attestation source)
entity type: "ITService"
display name: "IT Service"
display format: "{name}"
source entity sets: "all_it_services"
sink entity set: "all_it_services"
predicate: true
exhaustive: false
identifying attributes: "name"
// 2. Define relationship to owner
relationship: "owner"
of: "all_it_services"
from: "owner_email"
to: "all_people" of "person"
match on: "email"
// 3. Create partition that classifies services by ownership
partition: "has_owner"
source entity sets: "all_it_services"
predicate: "owner" is known
when true: "owned_services"
when false: "unowned_services"
when unknown: "__IRRELEVANT__"
when void: "__IRRELEVANT__"
// 4. Write expectation
expectation: "all_services_owned"
expector: "all_stakeholders" ? "id" = "SERVICE_GOVERNANCE_LEAD"
of partition: "has_owner" of "all_it_services" of "it_service"
expect empty: "is_false"
when aligned: "__IRRELEVANT__"
when misaligned: "__IRRELEVANT__"
when uncertain: "__IRRELEVANT__"
expectee: SELF->{"all_it_services"}->"steward"
actions:
- "Assign an owner from the relevant GBGF"
steps towards: "__CAMPAIGN__"
Anti-pattern: Overly broad expectations that generate too many questions. If your expectation produces hundreds of misaligned entities, consider whether the partition is too coarse or the data is too incomplete. Split into focused expectations that a single team can act on.
Expectee Traversal
The expectee uses SELF-> arrow syntax to traverse from the subject entity (the entity in the expected-empty branch) to the expectee. The traversal chain supports three step types:
| Step | Level transition | Example |
|---|---|---|
| `attribute` | entity → entity | `SELF->"owner"` |
| `{set_name}` | entity → entity set | `SELF->{"all_clusters"}` |
| `?field=value` | entity set → entity | `->"steward"` or `->?id=SVC1` |
# Single hop: service → owner (Person)
SELF->"owner"
# Multi-hop: migration → namespace → service → owner (Person)
SELF->"source_namespace"->"service"->"owner"
# Set attribute: cluster → all_clusters entity set → steward
SELF->{"all_clusters"}->"steward"
{set_name} step: Navigates from the current entity to a named entity set. The entity must be a member of the set (membership is validated). After this step, the current context is the entity set itself, and you can access attributes declared via steward declarations.
?field=value step: Filters the current entity set to find a specific entity where field == value. Useful for navigating from an entity set back to a specific entity.
attribute step: Regular attribute traversal. Works on both entities (via relationships) and entity sets (via steward attributes).
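The three step types can be modelled as a small resolver. This is a hypothetical sketch: the `EntitySet` dataclass, the tuple encoding of steps, and the dict-based entities are illustrative assumptions and do not reflect kbpy's internal classes.

```python
from dataclasses import dataclass, field


@dataclass
class EntitySet:
    """Illustrative entity set with set-level attributes such as a steward."""
    name: str
    members: list
    attrs: dict = field(default_factory=dict)


def resolve(entity, steps, entity_sets):
    """Walk a SELF-> chain.

    steps: ("attr", name), ("set", set_name), or ("filter", field, value).
    """
    current = entity
    for step in steps:
        kind = step[0]
        if kind == "attr":
            # Entity sets expose set-level attributes; entities expose their own.
            source = current.attrs if isinstance(current, EntitySet) else current
            current = source[step[1]]
        elif kind == "set":
            target = entity_sets[step[1]]
            if current not in target.members:
                raise ValueError(f"entity is not a member of {step[1]!r}")
            current = target
        elif kind == "filter":
            _, field_name, value = step
            matches = [m for m in current.members if m.get(field_name) == value]
            if len(matches) != 1:
                raise LookupError(f"expected one match for {field_name}={value!r}")
            current = matches[0]
        else:
            raise ValueError(f"unknown step kind {kind!r}")
    return current


lead = {"id": "PLATFORM_LEAD"}
c1 = {"name": "c1", "owner": {"name": "Ada"}}
sets = {"all_clusters": EntitySet("all_clusters", [c1], {"steward": lead})}

owner = resolve(c1, [("attr", "owner")], sets)                             # SELF->"owner"
steward = resolve(c1, [("set", "all_clusters"), ("attr", "steward")], sets)  # SELF->{"all_clusters"}->"steward"
```

The membership check in the `"set"` branch mirrors the validation described above: a chain cannot hop to a set that the current entity does not belong to.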
Expectation Status
The is_met property returns a Bool3vl:
| Status | Meaning |
|---|---|
| True | All expected-empty branches are empty |
| False | At least one expected-empty branch has entities |
| Unknown | Parent entity set is partial — cannot confirm |
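The status table can be read as a small decision function. The sketch below is an interpretation rather than kbpy's code: it assumes that a non-empty expected-empty branch yields False even when the parent set is partial (a counterexample has already been found), and it uses `None` to stand in for Unknown.

```python
def is_met(branch_sizes, expect_empty, parent_is_partial):
    """Return True, False, or None (None stands in for Unknown).

    branch_sizes: {branch_name: number of entities in that branch}
    expect_empty: branch names that the expectation wants empty
    """
    if any(branch_sizes[name] > 0 for name in expect_empty):
        return False  # ASSUMPTION: a counterexample wins over partiality
    if parent_is_partial:
        return None  # unseen entities might still land in the branch
    return True


status_complete = is_met({"is_true": 12, "is_false": 0}, ["is_false"], False)
status_partial = is_met({"is_true": 12, "is_false": 0}, ["is_false"], True)
status_failed = is_met({"is_true": 12, "is_false": 3}, ["is_false"], True)
```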
Auto-generated Descriptions
Descriptions are auto-generated from the expectation's components:
{expector name} expects of {expectee name} that there are no {entity set names}
The entity set names are taken from the partition's named branches (e.g. when true: "legacy_clusters"). If a branch is "__IRRELEVANT__", it is excluded.
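The template can be mimicked with a few lines of Python. This is only a sketch of the string assembly: the `describe` helper is hypothetical, and joining several branch names with a comma is an assumption, since the reference does not say how multiple names are combined.

```python
IRRELEVANT = "__IRRELEVANT__"


def describe(expector_name, expectee_name, branch_set_names):
    """Fill the description template, skipping "__IRRELEVANT__" branches."""
    named = [name for name in branch_set_names if name != IRRELEVANT]
    joined = ", ".join(named)  # ASSUMPTION: separator for multiple names
    return f"{expector_name} expects of {expectee_name} that there are no {joined}"


description = describe("CTO", "Platform Lead", ["legacy_clusters", IRRELEVANT])
```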
Alias
Aliases create local names for definitions from other namespaces (other .kb files). This enables cross-file references without fully qualified names.
alias: "XYZCluster"
for: "domain.XYZCluster"
| Clause | Purpose |
|---|---|
| `alias` | The local name to use in this file |
| `for` | The fully qualified target (`namespace.name`) |
After declaring an alias, the local name can be used in source entity sets: clauses, entity set references, and other places where a definition name is expected.
Campaigns (steps towards)
A campaign is an expectation that other expectations point towards. Campaigns are not declared with any special syntax — any expectation becomes a campaign when other expectations reference it via steps towards.
The steps towards clause
Point an expectation's steps towards at a parent expectation to indicate that it contributes to a larger campaign:
expectation: "no_legacy_clusters"
expector: "all_stakeholders" ? "id" = "PLATFORM_LEAD"
of partition: "is_legacy" of "all_clusters" of "cluster"
expect empty: "is_true"
when aligned: "__IRRELEVANT__"
when misaligned: "__IRRELEVANT__"
when uncertain: "__IRRELEVANT__"
expectee: SELF->{"all_clusters"}->"steward"
actions:
- "Migrate off legacy clusters"
steps towards: "platform_modernized"
Here, "no_legacy_clusters" is a step towards the "platform_modernized" expectation. The "platform_modernized" expectation uses "__CAMPAIGN__" as its steps towards, marking it as a campaign root:
expectation: "platform_modernized"
expector: "all_stakeholders" ? "id" = "CTO"
of partition: "uses_modern_stack" of "all_clusters" of "cluster"
expect empty: "is_false"
when aligned: "__IRRELEVANT__"
when misaligned: "__IRRELEVANT__"
when uncertain: "__IRRELEVANT__"
expectee: SELF->{"all_clusters"}->"steward"
actions:
- "Review platform modernization progress"
steps towards: "__CAMPAIGN__"
Scoping behaviour
When scoping to a campaign, all expectations whose steps towards points to that campaign come into scope, recursively. If expectation A steps towards B, and B steps towards C, then scoping to campaign C brings A, B, and C itself into scope. An expectation whose steps towards is "__CAMPAIGN__" and that no other expectation references is a standalone campaign containing only itself.
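The recursive scoping rule amounts to a reachability search over the steps towards links. The following is a minimal sketch under that reading. `campaign_scope` and the dict encoding of expectations are hypothetical and not part of kbpy.

```python
CAMPAIGN = "__CAMPAIGN__"


def campaign_scope(steps_towards, campaign):
    """Return the campaign plus every expectation that leads to it, recursively.

    steps_towards: {expectation_name: [parent names]} where roots use
    ["__CAMPAIGN__"].
    """
    # Invert the links so each parent knows which expectations step towards it.
    children = {}
    for name, parents in steps_towards.items():
        for parent in parents:
            if parent != CAMPAIGN:
                children.setdefault(parent, []).append(name)

    scope, stack = set(), [campaign]
    while stack:
        current = stack.pop()
        if current in scope:
            continue  # already visited; also guards against accidental cycles
        scope.add(current)
        stack.extend(children.get(current, []))
    return scope


links = {
    "A": ["B"],
    "B": ["C"],
    "C": [CAMPAIGN],
    "D": [CAMPAIGN],  # root that nothing else references
}
scope_c = campaign_scope(links, "C")
scope_d = campaign_scope(links, "D")
```

Scoping to `"C"` pulls in `"A"` and `"B"` transitively, while `"D"` forms a standalone campaign containing only itself.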
Example: Complete Workflow
A complete .kb domain definition:
// Entity types
entity type: "Department"
display name: "Department"
display format: "{name}"
source entity sets: "all_departments"
sink entity set: "all_departments"
predicate: true
exhaustive: true
identifying attributes: "name"
entity type: "Employee"
display name: "Employee"
display format: "{name}"
source entity sets: "all_employees"
sink entity set: "all_employees"
predicate: true
exhaustive: false
identifying attributes: "name"
// Attributes (top-level, linked to entity sets via 'of')
attribute: "name"
of: "all_departments"
type: string
attribute: "budget"
of: "all_departments"
type: int
attribute: "name"
of: "all_employees"
type: string
attribute: "salary"
of: "all_employees"
type: int
attribute: "department_name"
of: "all_employees"
type: string
// Attestation sources (data mapping and sentinels)
attestation source: "departments_data"
entity set: "all_departments"
exhaustive: true
unknown if missing: "__EMPTY__"
void if missing: "name", "budget"
attestor: "all_stakeholders" ? "id" = "HR_TEAM"
type: csv
from data field: "name" -> "name"
from data field: "budget" -> "budget"
validity: 365 days
attestation source: "employees_data"
entity set: "all_employees"
exhaustive: false
unknown if missing: "salary"
void if missing: "name"
attestor: "all_stakeholders" ? "id" = "HR_TEAM"
type: csv
from data field: "name" -> "name"
validity: 365 days
from data field: "salary" -> "salary"
unknowns: "N/A", ""
validity: 365 days
from data field: "department" -> "department_name"
validity: 365 days
// Relationships
relationship: "department"
of: "all_employees"
from: "department_name"
to: "all_departments" of "department"
match on: "name"
// Partitions (all 4 branches mandatory)
partition: "well_paid"
source entity sets: "all_employees"
predicate: "salary" is greater than 100000
when true: "well_paid"
when false: "not_well_paid"
when unknown: "salary_unknown"
when void: "__IRRELEVANT__"
partition: "high_budget_dept"
source entity sets: "all_employees"
predicate: "budget" of "department" is greater than 300000
when true: "high_budget_employees"
when false: "low_budget_employees"
when unknown: "__IRRELEVANT__"
when void: "__IRRELEVANT__"
Serve this domain:
uv run kbpy history <repo-path> -o data.sqlite
uv run kbpy serve --db data.sqlite
See src/example_1.kb for a simple example and src/example_3/domain.kb for a complex domain with relationships, hierarchical partitions, and existential quantifiers.