#DSL Reference

This document is the syntax reference for kbpy's Domain Definition Language — the .kb files used to define entity types, partitions, expectations, and everything else in a domain. For the conceptual motivation behind these constructs, see Domain Modeling. For foundational theory, see Core Concepts. For running and operating kbpy, see Serving & Operations.

#The DSL Approach

kbpy domains are defined using an English-like DSL inspired by FO(.) (the IDP-Z3 knowledge base language), rather than Python API calls. This design choice brings several benefits:

Eliminates boilerplate: A 52-line Python predicate function with isinstance(x, VoidLookup) defensive checks becomes 5 lines of DSL. Void and Unknown propagation happens automatically through kbpy's three-valued logic system.
Enforces consistent patterns: Every entity type, partition, and expectation follows the same structure. There's no risk of forgetting to handle edge cases that the DSL handles automatically.
Declarative and auditable: .kb files read like specifications, not code. A domain expert can review an expectation definition without understanding Python.
Safe evaluation: The predicate interpreter evaluates DSL expressions against entity instances without exec or eval, eliminating a class of security and correctness issues.

#File Layout

src/<domain_name>.kb                 Single-file domain
src/<domain_name>/domain.kb          Multi-file domain (definitions)
src/<domain_name>/stakeholders.kb    Multi-file domain (expectations)

Multiple .kb files in the same domain directory are merged during parsing. Use separate files to organise large domains — for example, entity type definitions in domain.kb and stakeholder-facing expectations in stakeholders.kb.

#Entity Type

All 8 fields are mandatory: the entity type name plus the seven clauses described in the table below.

entity type: "Person"
    display name: "Person"
    display format: "{name}"
    source entity sets: "all_people"
    sink entity set: "all_people"
    predicate: true
    exhaustive: true
    identifying attributes: "name"

Clause	Purpose
`display name`	Human-readable name shown in the web interface
`display format`	Format string for entity display (e.g., `"{name}"`)
`source entity sets`	Set expression using `union`, `intersect`, or `except` — see below
`sink entity set`	Name of the entity set produced by this entity type
`predicate`	Filter expression applied to source entities (use `true` to accept all)
`exhaustive`	Whether this is a complete (`true`) or partial (`false`) set (ignored for `intersect` and `except` — derived from source partiality)
`identifying attributes`	Attribute(s) used to identify unique entities, comma-separated

Entity types derive from source entity sets via a set expression and produce a sink entity set. Attributes, relationships, and attestation sources are separate top-level blocks. See Set Operations on Entity Types for use cases.

#Set Expressions in Entity Types

The source entity sets: field accepts a full set expression using three infix keywords:

Syntax	Operation	Semantics
`"set_a" union "set_b"`	Union	Entities in at least one source — deduplicated by identity
`"set_a" intersect "set_b"`	Intersection	Entities whose identity appears in all listed sources
`"set_a" except "set_b"`	Difference	Entities in the primary source not in the secondary source

Single set (leaf expression):

source entity sets: "all_people"

Union — combine two sources:

source entity sets: "employees_from_hr" union "employees_from_payroll"

Entities from all sources are combined and deduplicated by identifying attributes. When the same entity appears with conflicting attribute values across sources, the conflicting attributes become Unknown (see Domain Modeling: Set Operations).
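The deduplication and conflict rule above can be sketched in Python. This is an illustration only, not kbpy's implementation: the `UNKNOWN` sentinel, the `union_by_identity` helper, and the dict-based entity representation are all assumptions made for the example.

```python
# Illustrative sketch only; not kbpy's implementation.
UNKNOWN = "<Unknown>"  # hypothetical stand-in for kbpy's Unknown value


def union_by_identity(sources, identifying_attribute):
    """Merge entity dicts from several sources, deduplicating by identity."""
    merged = {}
    for source in sources:
        for entity in source:
            key = entity[identifying_attribute]
            if key not in merged:
                merged[key] = dict(entity)
                continue
            current = merged[key]
            for attr, value in entity.items():
                if attr in current and current[attr] != value:
                    current[attr] = UNKNOWN  # conflicting values become Unknown
                else:
                    current.setdefault(attr, value)
    return list(merged.values())


hr = [{"name": "Ada", "salary": 100}, {"name": "Bob", "salary": 80}]
payroll = [{"name": "Ada", "salary": 120}, {"name": "Cy", "salary": 90}]
employees = union_by_identity([hr, payroll], "name")
```

Here "Ada" appears in both sources with different salaries, so her merged `salary` is the `UNKNOWN` placeholder, while "Bob" and "Cy" keep the values from their single source.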

Intersection — entities that appear in all sources:

source entity sets: "cloud_services" intersect "cmdb_services"

Only entities whose identity matches in every listed source are retained. This is useful for "entities we can confirm from multiple sources." The exhaustive: field is ignored for intersection types; is_partial is derived from source partiality.

Intersection — three sources:

source entity sets: "cloud_assets" intersect "cmdb_assets" intersect "finance_assets"

Difference — primary minus secondary:

source entity sets: "all_clusters" except "decommissioned_clusters"

The result contains the entities from the primary source (all_clusters) that do not appear in the secondary source (decommissioned_clusters). Attributes from the secondary source are not available on the sink. The exhaustive: field is ignored for difference types; is_partial is derived from source partiality.
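Intersection and difference both compare entities by identity. The following Python sketch shows that comparison under simplifying assumptions: entities are plain dicts, identity is a single attribute, and the helper names (`identities`, `intersect_ids`, `except_entities`) are hypothetical rather than part of kbpy.

```python
# Illustrative sketch only; not kbpy's implementation.
def identities(source, key):
    """Return the set of identity values found in one source."""
    return {entity[key] for entity in source}


def intersect_ids(sources, key):
    """Identities that appear in every listed source."""
    id_sets = [identities(source, key) for source in sources]
    return set.intersection(*id_sets)


def except_entities(primary, secondary, key):
    """Primary entities whose identity is absent from the secondary source.

    Only primary-source attributes are carried over to the result.
    """
    excluded = identities(secondary, key)
    return [dict(e) for e in primary if e[key] not in excluded]


all_clusters = [{"name": "c1", "region": "eu"}, {"name": "c2", "region": "us"}]
decommissioned = [{"name": "c2", "retired_on": "2024-01-01"}]
active = except_entities(all_clusters, decommissioned, "name")
confirmed = intersect_ids([all_clusters, decommissioned], "name")
```

The `retired_on` attribute exists only on the secondary source, so it never reaches the `active` result.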

Compound exclusion (parentheses):

source entity sets: "all_services" except ("decommissioned_services" union "excluded_services")

Composed expression:

source entity sets: ("cloud_services" union "on_prem_services") intersect "cmdb_services"

Mixing operators requires parentheses. "A" union "B" intersect "C" is a parse error — write ("A" union "B") intersect "C" or "A" union ("B" intersect "C") to disambiguate. "A" except "B" except "C" is also a parse error — write "A" except ("B" union "C").
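To make the parenthesisation rule concrete, here is a small recursive-descent parser in Python that accepts the expressions shown above and rejects the two error cases. It is a sketch under assumptions, not kbpy's parser: the token pattern, the tuple-based tree, and the error messages are invented for illustration, and it assumes that chaining the same `union` or `intersect` operator is allowed (as in the three-source intersection example).

```python
# Illustrative sketch only; not kbpy's parser.
import re

TOKEN = re.compile(r'\s*(?:"([^"]*)"|(union|intersect|except)\b|([()]))')


def tokenize(text):
    """Split a set expression into (kind, value) tokens."""
    text, tokens, pos = text.strip(), [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected input at position {pos}")
        name, op, paren = m.groups()
        if name is not None:
            tokens.append(("name", name))
        elif op:
            tokens.append(("op", op))
        else:
            tokens.append(("paren", paren))
        pos = m.end()
    return tokens


def parse_operand(tokens):
    """Parse a quoted set name or a parenthesised sub-expression."""
    if not tokens:
        raise SyntaxError("unexpected end of expression")
    kind, value = tokens[0]
    if kind == "name":
        return value, tokens[1:]
    if (kind, value) == ("paren", "("):
        tree, rest = parse_expr(tokens[1:])
        if not rest or rest[0] != ("paren", ")"):
            raise SyntaxError("missing closing parenthesis")
        return tree, rest[1:]
    raise SyntaxError(f"unexpected token {value!r}")


def parse_expr(tokens):
    """Parse operands joined by one operator; reject mixes and chained except."""
    first, rest = parse_operand(tokens)
    operands, seen_op = [first], None
    while rest and rest[0][0] == "op":
        op = rest[0][1]
        if seen_op is not None and op != seen_op:
            raise SyntaxError("mixing operators requires parentheses")
        if seen_op == "except":
            raise SyntaxError("chained 'except' requires parentheses")
        seen_op = op
        operand, rest = parse_operand(rest[1:])
        operands.append(operand)
    return (operands[0] if seen_op is None else (seen_op, *operands)), rest


def parse(text):
    """Parse a full set expression into a nested tuple tree."""
    tree, rest = parse_expr(tokenize(text))
    if rest:
        raise SyntaxError(f"unexpected token {rest[0][1]!r}")
    return tree
```

For example, `parse('"A" except ("B" union "C")')` yields the tree `("except", "A", ("union", "B", "C"))`, while `parse('"A" union "B" intersect "C"')` raises `SyntaxError`.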

#Attribute (top-level, linked via `of`)

attribute: "salary"
    of: "all_people"
    type: int

Clause	Purpose
`of`	Name of the entity set this attribute belongs to
`type`	One of: `string`, `int`, `float`, `bool`

Attributes are pure schema declarations. Data field mapping, sentinel values, and validity are configured in the attestation source definition.

#Relationship (top-level, linked via `of`)

relationship: "department"
    of: "all_employees"
    from: "department_name"
    to: "all_departments" of "department"
    match on: "name"

Clause	Purpose
`of`	Name of the entity set this relationship belongs to
`from`	Source attribute on this entity
`to`	Target entity set (with optional `of` module qualifier)
`match on`	Attribute on the target entity to match against
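Conceptually, a relationship is a lookup: the value of the `from` attribute on the source entity is compared with the `match on` attribute of each entity in the target set. The Python sketch below illustrates that idea only. The `resolve_relationship` helper and the dict-based entities are assumptions, and `None` is used as a placeholder where kbpy would produce Void or Unknown.

```python
# Illustrative sketch only; not kbpy's implementation.
def resolve_relationship(entity, targets, from_attr, match_on):
    """Find the target whose `match_on` value equals the entity's `from_attr` value."""
    wanted = entity.get(from_attr)
    if wanted is None:
        return None  # nothing to match on
    for target in targets:
        if target.get(match_on) == wanted:
            return target
    return None  # placeholder for a failed lookup (Void or Unknown in kbpy)


departments = [{"name": "Research", "budget": 300000}]
employee = {"name": "Ada", "department_name": "Research"}
department = resolve_relationship(employee, departments, "department_name", "name")
```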

#Attestation Source

Attestation sources define how external data maps onto entity sets. They carry the field mapping, sentinel values, and validity settings that were previously on attribute definitions.

attestation source: "people_data"
    description:
        data_source: "HR system export"
        method: "CSV download from HR portal"
        returns: "Employee names, salaries, and departments"
    entity set: "all_people"
    exhaustive: true
    unknown if missing: "salary"
    void if missing: "__EMPTY__"
    attestor: "all_stakeholders" ? "id" = "HR_TEAM"
    type: csv
    from data field: "name" -> "name"
        returns: "Employee full name"
        unknowns: "N/A", ""
        voids: "__EMPTY__"
        trues: "__EMPTY__"
        falses: "__EMPTY__"
        validity: 365 days
    from data field: "salary" -> "salary"
        returns: "Annual salary in GBP"
        unknowns: "N/A", ""
        voids: "__EMPTY__"
        trues: "__EMPTY__"
        falses: "__EMPTY__"
        validity: 365 days
    match on: "name"

Clause	Purpose
`description:`	Structured description block (required)
`data_source`	Human-readable name of the external data source
`method`	How to obtain or access the data
`returns`	What attributes or information the source provides
`entity set`	Which entity set this source populates
`exhaustive`	Whether this source provides a complete data set
`unknown if missing`	Attributes that return Unknown when absent (`"__EMPTY__"` if none)
`void if missing`	Attributes that return Void when absent (`"__EMPTY__"` if none)
`attestor`	Stakeholder responsible for this data source
`type`	Source type (e.g., `csv`, `sqlite`, `inline`)
`from data field`	Maps a source column to an attribute (`"col" -> "attr"`)
`returns` (field)	Per-field description of what the field provides
`unknowns`	Per-field sentinel values that map to Unknown
`voids`	Per-field sentinel values that map to Void
`trues` / `falses`	Per-field sentinel values for bool attributes
`validity`	Per-field staleness window (e.g., `365 days`) or `__ONGOING__`
`match on`	Field used to match records to entities
`file`	File path for file-based sources (optional)
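The per-field sentinel lists and the validity window can be pictured as two small functions: one that maps a raw source value to Unknown, Void, or itself, and one that checks whether an attested value has outlived its window. The Python below is a sketch under assumptions. The `UNKNOWN` and `VOID` placeholders and the helper names are hypothetical, and what kbpy does with a stale value is not covered here.

```python
# Illustrative sketch only; not kbpy's implementation.
from datetime import date, timedelta

UNKNOWN = "<Unknown>"  # hypothetical stand-in for kbpy's Unknown
VOID = "<Void>"        # hypothetical stand-in for kbpy's Void
EMPTY = "__EMPTY__"    # DSL marker meaning "no sentinel values"


def sentinels(*values):
    """Turn a DSL sentinel list into a tuple; "__EMPTY__" means none."""
    return () if values == (EMPTY,) else values


def map_field(raw, unknowns=(), voids=()):
    """Map a raw source value to Unknown, Void, or the value itself."""
    if raw in unknowns:
        return UNKNOWN
    if raw in voids:
        return VOID
    return raw


def is_stale(attested_on, validity_days, today):
    """True when the attestation is older than its validity window."""
    return today - attested_on > timedelta(days=validity_days)


salary_unknowns = sentinels("N/A", "")
salary_voids = sentinels("__EMPTY__")
value = map_field("N/A", salary_unknowns, salary_voids)
```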

#Partition

All 4 branches are mandatory. Use "__IRRELEVANT__" for branches that are not meaningful. At least one branch must be a real entity set name.

partition: "is_old"
    source entity sets: "all_people"
    predicate: "age" is greater than 65
    when true: "elderly"
    when false: "youth"
    when unknown: "__IRRELEVANT__"
    when void: "__IRRELEVANT__"

Clause	Purpose
`source entity sets`	The parent entity set(s) to partition (comma-separated)
`predicate`	Expression that classifies each entity
`when true`	Entity set name for entities where predicate is true (`"__IRRELEVANT__"` if unused)
`when false`	Entity set name for entities where predicate is false (`"__IRRELEVANT__"` if unused)
`when unknown`	Entity set name for entities where predicate is unknown (`"__IRRELEVANT__"` if unused)
`when void`	Entity set name for entities where predicate is void (`"__IRRELEVANT__"` if unused)
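A partition routes each source entity to one of four branches according to the predicate result, and branches marked "__IRRELEVANT__" produce no entity set. The following Python sketch mirrors the is_old example above. The `Outcome` enum and `partition` helper are hypothetical names, and treating a missing `age` as Unknown is an assumption made for the example (in kbpy that choice is configured on the attestation source).

```python
# Illustrative sketch only; not kbpy's implementation.
from enum import Enum


class Outcome(Enum):
    TRUE = "when true"
    FALSE = "when false"
    UNKNOWN = "when unknown"
    VOID = "when void"


IRRELEVANT = "__IRRELEVANT__"


def partition(entities, predicate, branches):
    """Route each entity to the sink named for its predicate outcome."""
    sinks = {name: [] for name in branches.values() if name != IRRELEVANT}
    for entity in entities:
        sink = branches[predicate(entity)]
        if sink != IRRELEVANT:
            sinks[sink].append(entity)
    return sinks


def is_old(person):
    """Predicate: "age" is greater than 65 (missing age treated as Unknown)."""
    age = person.get("age")
    if age is None:
        return Outcome.UNKNOWN
    return Outcome.TRUE if age > 65 else Outcome.FALSE


people = [{"name": "Ada", "age": 70}, {"name": "Bob", "age": 30}, {"name": "Cy"}]
sinks = partition(people, is_old, {
    Outcome.TRUE: "elderly",
    Outcome.FALSE: "youth",
    Outcome.UNKNOWN: IRRELEVANT,
    Outcome.VOID: IRRELEVANT,
})
```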

#Predicate Expressions

Predicates use an English subset of FO(.) syntax:

Syntax	Meaning
`"salary"`	Attribute access on current entity
`"legacy_flag" of "dr_set"`	Relationship traversal
`is`, `is not`	Equality comparisons
`is greater than`, `is less than`	Numeric comparisons
`is at least`, `is at most`	Numeric comparisons (inclusive)
`is known`, `is not void`	Three-valued logic checks
`and`, `or`, `not`	Boolean connectives
`there is a x in "set" of "module": ...`	Existential quantifier
`if ... then ... else ...`	Conditional
`true`, `false`	Constants

The predicate interpreter (src/kbpy/predicate.py) evaluates these expressions against entity instances, returning Bool3vl with automatic void/unknown propagation through kbpy's existing 3VL system. This eliminates the defensive isinstance(x, VoidLookup) boilerplate needed in Python predicates.
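As a rough mental model of Unknown propagation, the Python sketch below implements Kleene's strong three-valued logic with `None` standing in for Unknown. This is an assumption for illustration: kbpy's Bool3vl also handles Void, which is not modelled here, and its exact propagation rules may differ from these helpers.

```python
# Illustrative sketch only; Kleene three-valued logic, not kbpy's Bool3vl.
# None stands in for Unknown; Void is deliberately not modelled.
def greater_than(value, threshold):
    """Comparison that yields Unknown when either operand is Unknown."""
    if value is None or threshold is None:
        return None
    return value > threshold


def and3(a, b):
    """False dominates; otherwise Unknown if either side is Unknown."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def or3(a, b):
    """True dominates; otherwise Unknown if either side is Unknown."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def not3(a):
    """Negation leaves Unknown unchanged."""
    return None if a is None else not a
```

With these helpers no defensive type checks are needed at the call site: `greater_than(None, 100000)` is simply `None`, and that result flows through `and3`, `or3`, and `not3`.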

#Common Predicate Patterns

#Simple attribute check

predicate: "salary" is greater than 100000

Tests a single attribute. If salary is Unknown, the result is Unknown and the entity lands in the is_unknown branch.

#Relationship traversal

predicate: "budget" of "department" is greater than 250000

Follows the department relationship and accesses budget on the target entity. If the relationship lookup fails (Void or Unknown), the result propagates.

#Existential quantifier

predicate: there is a ns in "all_namespaces" of "namespace":
    "cluster_key" of ns is "cluster_key"

Checks whether any entity in the target set satisfies the condition. Useful for testing membership or existence of related entities.
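One way to think about an existential quantifier under three-valued logic is sketched below in Python. The result is true as soon as one candidate satisfies the condition, Unknown if no candidate does but at least one is undetermined, and false otherwise. This follows Kleene-style semantics with `None` as Unknown, which is an assumption; `exists3` and `in_cluster` are hypothetical names, and kbpy's handling (including Void) may differ.

```python
# Illustrative sketch only; not kbpy's implementation.
def exists3(items, condition):
    """Three-valued 'there is an x in items such that condition(x)'."""
    saw_unknown = False
    for item in items:
        result = condition(item)
        if result is True:
            return True
        if result is None:
            saw_unknown = True
    return None if saw_unknown else False


def in_cluster(cluster_key):
    """Build a condition comparing a namespace's cluster_key with ours."""
    def condition(namespace):
        key = namespace["cluster_key"]
        return None if key is None else key == cluster_key
    return condition


namespaces = [{"cluster_key": "c1"}, {"cluster_key": None}]
has_namespace = exists3(namespaces, in_cluster("c1"))
undetermined = exists3(namespaces, in_cluster("c9"))
```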

#Multi-condition check

predicate: "status" is "active" and "owner" is known

Combines conditions with and. If either side is Void or Unknown, the result follows the three-valued propagation rules rather than defaulting to true or false.

#Conditional predicate

predicate: if "is_critical" then "has_backup_plan" else true

Tests different conditions based on entity attributes. Non-critical entities trivially satisfy this.

#Partition reference in predicate

predicate: "form_team" of "sport"

References a partition result on a related entity. This checks whether the related sport entity is in the is_true branch of the form_team partition.

#Steward Declaration

Use steward: to declare set-level responsibility:

steward: "all_clusters" of "cluster"
    by: "all_stakeholders"
    match on: "id"
    match value: "PLATFORM_LEAD"

After this declaration, the entity set has a steward attribute that resolves to the matched stakeholder entity.

#Expectation

Expectations assert that certain partition branches should be empty. They connect an expector (the person who holds the expectation) to a measurable condition, and identify an expectee (who should address misalignment).

#Defining an Expectation

expectation: "all_services_owned"
    expector: "all_stakeholders" ? "id" = "SERVICE_OWNER_LEAD"
    of partition: "has_owner" of "all_it_services" of "it_service"
    expect empty: "is_false"
    when aligned: "__IRRELEVANT__"
    when misaligned: "__IRRELEVANT__"
    when uncertain: "__IRRELEVANT__"
    expectee: SELF->{"all_it_services"}->"steward"
    actions:
        - "Contact GBGF leads to assign service owners"
    steps towards: "__CAMPAIGN__"

All fields are mandatory. Use "__CAMPAIGN__" for steps towards on campaign root expectations. Use - "__EMPTY__" for actions when there are no actions. Named sink entity sets (non-"__IRRELEVANT__") are registered in the entity set registry and can be used as source entity sets for downstream entity types and partitions (see Expectations as Entity Set Creators).

Clause	Purpose
`expector`	Entity who holds the expectation (`entity_set ? field = value`)
`of partition`	Which partition to evaluate (`partition_name of entity_set of module`)
`expect empty`	Which branch(es) should be empty (comma-separated)
`when aligned`	Sink entity set for aligned entities (`"__IRRELEVANT__"` if unused)
`when misaligned`	Sink entity set for misaligned entities (`"__IRRELEVANT__"` if unused)
`when uncertain`	Sink entity set for uncertain entities (`"__IRRELEVANT__"` if unused)
`expectee`	`SELF->` traversal to the entity responsible for addressing misalignment
`actions`	List of action descriptions (`- "__EMPTY__"` if none)
`steps towards`	Parent expectation(s), comma-separated (`"__CAMPAIGN__"` for roots)

#Expectation Design Guide

A good expectation has four components:

A clear expector — A specific stakeholder who cares about this condition. Don't create expectations without a named owner.
A measurable partition — A predicate that cleanly divides entities into branches. The expected-empty branch should contain entities that represent the misalignment.
An actionable expectee — A person or team who can act on misaligned entities. The SELF-> traversal must resolve to a real person.
Concrete actions — Specific steps to address misalignment, not vague directives.

Worked example: A business requirement says "All IT services must have an owner."

// 1. Define entity type (attributes and attestation sources are separate top-level blocks)
entity type: "ITService"
    display name: "IT Service"
    display format: "{name}"
    source entity sets: "all_it_services"
    sink entity set: "all_it_services"
    predicate: true
    exhaustive: false
    identifying attributes: "name"

// 2. Define relationship to owner
relationship: "owner"
    of: "all_it_services"
    from: "owner_email"
    to: "all_people" of "person"
    match on: "email"

// 3. Create partition that classifies services by ownership
partition: "has_owner"
    source entity sets: "all_it_services"
    predicate: "owner" is known
    when true: "owned_services"
    when false: "unowned_services"
    when unknown: "__IRRELEVANT__"
    when void: "__IRRELEVANT__"

// 4. Write expectation
expectation: "all_services_owned"
    expector: "all_stakeholders" ? "id" = "SERVICE_GOVERNANCE_LEAD"
    of partition: "has_owner" of "all_it_services" of "it_service"
    expect empty: "is_false"
    when aligned: "__IRRELEVANT__"
    when misaligned: "__IRRELEVANT__"
    when uncertain: "__IRRELEVANT__"
    expectee: SELF->{"all_it_services"}->"steward"
    actions:
        - "Assign an owner from the relevant GBGF"
    steps towards: "__CAMPAIGN__"

Anti-pattern: Overly broad expectations that generate too many questions. If your expectation produces hundreds of misaligned entities, consider whether the partition is too coarse or the data is too incomplete. Split into focused expectations that a single team can act on.

#Expectee Traversal

The expectee uses SELF-> arrow syntax to traverse from the subject entity (the entity in the expected-empty branch) to the expectee. The traversal chain supports three step types:

Step	Level transition	Example
`attribute`	entity → entity, or entity set → entity (steward attributes)	`SELF->"owner"`, `->"steward"`
`{set_name}`	entity → entity set	`SELF->{"all_clusters"}`
`?field=value`	entity set → entity	`->?id=SVC1`

# Single hop: service → owner (Person)
SELF->"owner"

# Multi-hop: migration → namespace → service → owner (Person)
SELF->"source_namespace"->"service"->"owner"

# Set attribute: cluster → all_clusters entity set → steward
SELF->{"all_clusters"}->"steward"

{set_name} step: Navigates from the current entity to a named entity set. The entity must be a member of the set (membership is validated). After this step, the current context is the entity set itself, and you can access attributes declared via steward declarations.

?field=value step: Filters the current entity set to find a specific entity where field == value. Useful for navigating from an entity set back to a specific entity.

attribute step: Regular attribute traversal. Works on both entities (via relationships) and entity sets (via steward attributes).
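The three step types can be modelled as a small interpreter that walks the chain one step at a time. The Python sketch below is illustrative only: the `EntitySet` class, the tuple encoding of steps, and the `traverse` function are assumptions, and relationships are represented as already-resolved nested dicts.

```python
# Illustrative sketch only; not kbpy's implementation.
from dataclasses import dataclass, field


@dataclass
class EntitySet:
    name: str
    members: list
    attributes: dict = field(default_factory=dict)  # e.g. steward declarations


def traverse(subject, steps, entity_sets):
    """Follow a SELF-> chain given as ("attr" | "set" | "filter", ...) tuples."""
    current = subject
    for step in steps:
        kind = step[0]
        if kind == "set":        # {set_name}: entity -> entity set
            entity_set = entity_sets[step[1]]
            if current not in entity_set.members:
                raise ValueError(f"entity is not a member of {step[1]!r}")
            current = entity_set
        elif kind == "filter":   # ?field=value: entity set -> entity
            _, name, value = step
            matches = [e for e in current.members if e.get(name) == value]
            if len(matches) != 1:
                raise LookupError(f"expected one match for {name}={value}")
            current = matches[0]
        elif kind == "attr":     # attribute: works on entities and entity sets
            if isinstance(current, EntitySet):
                current = current.attributes[step[1]]
            else:
                current = current[step[1]]
        else:
            raise ValueError(f"unknown step type {kind!r}")
    return current


lead = {"id": "PLATFORM_LEAD"}
cluster = {"name": "c1"}
registry = {"all_clusters": EntitySet("all_clusters", [cluster], {"steward": lead})}

# Equivalent of SELF->{"all_clusters"}->"steward"
expectee = traverse(cluster, [("set", "all_clusters"), ("attr", "steward")], registry)
```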

#Expectation Status

The is_met property returns a Bool3vl:

Status	Meaning
True	All expected-empty branches are empty
False	At least one expected-empty branch has entities
Unknown	Parent entity set is partial — cannot confirm
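The status table can be read as a small decision function. The Python sketch below encodes one plausible reading, with `None` standing in for Unknown. It assumes that a non-empty expected-empty branch yields False even when the parent set is partial, since the offending entities are already known; that ordering, the `is_met` signature, and the branch-size dict are assumptions rather than kbpy's documented behaviour.

```python
# Illustrative sketch only; not kbpy's implementation.
def is_met(branch_sizes, expect_empty, parent_is_partial):
    """Return True, False, or None (Unknown) for an expectation."""
    if any(branch_sizes[branch] > 0 for branch in expect_empty):
        return False  # at least one expected-empty branch has entities
    if parent_is_partial:
        return None   # unseen entities might still fall in the branch
    return True       # complete parent set and all expected-empty branches empty


sizes = {"is_true": 12, "is_false": 0, "is_unknown": 0, "is_void": 0}
status = is_met(sizes, ["is_false"], parent_is_partial=False)
```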

#Auto-generated Descriptions

Descriptions are auto-generated from the expectation's components:

{expector name} expects of {expectee name} that there are no {entity set names}

The entity set names are taken from the partition's named branches (e.g. when true: "legacy_clusters"). If a branch is "__IRRELEVANT__", it is excluded.
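The template above amounts to simple string formatting once the "__IRRELEVANT__" branches are filtered out. A minimal Python sketch follows. The `describe` helper is hypothetical, and joining several entity set names with commas is an assumption, since the template does not say how multiple names are combined.

```python
# Illustrative sketch only; not kbpy's implementation.
IRRELEVANT = "__IRRELEVANT__"


def describe(expector, expectee, branch_names):
    """Fill the description template, skipping irrelevant branches."""
    named = [name for name in branch_names if name != IRRELEVANT]
    return f"{expector} expects of {expectee} that there are no {', '.join(named)}"


text = describe("Platform Lead", "Cluster Steward", ["legacy_clusters", IRRELEVANT])
```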

#Alias

Aliases create local names for definitions from other namespaces (other .kb files). This enables cross-file references without fully qualified names.

alias: "XYZCluster"
    for: "domain.XYZCluster"

Clause	Purpose
`alias`	The local name to use in this file
`for`	The fully qualified target (namespace.name)

After declaring an alias, the local name can be used in source entity sets: clauses, entity set references, and other places where a definition name is expected.

#Campaigns (steps towards)

A campaign is an expectation that other expectations point towards. Campaigns are not declared with any special syntax — any expectation becomes a campaign when other expectations reference it via steps towards.

#The `steps towards` clause

Set steps towards to the name of a parent expectation to indicate that this expectation contributes to a larger campaign:

expectation: "no_legacy_clusters"
    expector: "all_stakeholders" ? "id" = "PLATFORM_LEAD"
    of partition: "is_legacy" of "all_clusters" of "cluster"
    expect empty: "is_true"
    when aligned: "__IRRELEVANT__"
    when misaligned: "__IRRELEVANT__"
    when uncertain: "__IRRELEVANT__"
    expectee: SELF->{"all_clusters"}->"steward"
    actions:
        - "Migrate off legacy clusters"
    steps towards: "platform_modernized"

Here, "no_legacy_clusters" is a step towards the "platform_modernized" expectation. The "platform_modernized" expectation uses "__CAMPAIGN__" as its steps towards, marking it as a campaign root:

expectation: "platform_modernized"
    expector: "all_stakeholders" ? "id" = "CTO"
    of partition: "uses_modern_stack" of "all_clusters" of "cluster"
    expect empty: "is_false"
    when aligned: "__IRRELEVANT__"
    when misaligned: "__IRRELEVANT__"
    when uncertain: "__IRRELEVANT__"
    expectee: SELF->{"all_clusters"}->"steward"
    actions:
        - "Review platform modernization progress"
    steps towards: "__CAMPAIGN__"

#Scoping behaviour

When scoping to a campaign, all expectations whose steps towards points to that campaign come into scope, recursively. If expectation A steps towards B, and B steps towards C, then scoping to campaign C brings A, B, and C itself into scope. An expectation whose steps towards is "__CAMPAIGN__" and that no other expectation references via steps towards is a standalone campaign containing only itself.
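Recursive scoping is a transitive closure over the steps towards links. The Python sketch below computes it for a mapping from each expectation name to its list of parents. The `campaign_scope` function and the mapping format are assumptions for illustration, not kbpy's API.

```python
# Illustrative sketch only; not kbpy's implementation.
def campaign_scope(root, steps_towards):
    """Return the root plus every expectation that steps towards it, transitively.

    steps_towards maps an expectation name to the list of its parent names.
    """
    in_scope = {root}
    changed = True
    while changed:  # repeat until no new expectation is pulled into scope
        changed = False
        for name, parents in steps_towards.items():
            if name not in in_scope and any(p in in_scope for p in parents):
                in_scope.add(name)
                changed = True
    return in_scope


links = {
    "A": ["B"],
    "B": ["C"],
    "C": ["__CAMPAIGN__"],
    "D": ["__CAMPAIGN__"],
}
scope_c = campaign_scope("C", links)
scope_d = campaign_scope("D", links)
```

Scoping to "C" brings in "A" and "B" as well, while "D" is a standalone campaign containing only itself.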

#Example: Complete Workflow

A complete .kb domain definition:

// Entity types
entity type: "Department"
    display name: "Department"
    display format: "{name}"
    source entity sets: "all_departments"
    sink entity set: "all_departments"
    predicate: true
    exhaustive: true
    identifying attributes: "name"

entity type: "Employee"
    display name: "Employee"
    display format: "{name}"
    source entity sets: "all_employees"
    sink entity set: "all_employees"
    predicate: true
    exhaustive: false
    identifying attributes: "name"

// Attributes (top-level, linked to entity sets via 'of')
attribute: "name"
    of: "all_departments"
    type: string

attribute: "budget"
    of: "all_departments"
    type: int

attribute: "name"
    of: "all_employees"
    type: string

attribute: "salary"
    of: "all_employees"
    type: int

attribute: "department_name"
    of: "all_employees"
    type: string

// Attestation sources (data mapping and sentinels)
attestation source: "departments_data"
    description:
        data_source: "HR system export"
        method: "CSV download from HR portal"
        returns: "Department names and budgets"
    entity set: "all_departments"
    exhaustive: true
    unknown if missing: "__EMPTY__"
    void if missing: "name", "budget"
    attestor: "all_stakeholders" ? "id" = "HR_TEAM"
    type: csv
    from data field: "name" -> "name"
    from data field: "budget" -> "budget"
        validity: 365 days
    match on: "name"

attestation source: "employees_data"
    description:
        data_source: "HR system export"
        method: "CSV download from HR portal"
        returns: "Employee names, salaries, and departments"
    entity set: "all_employees"
    exhaustive: false
    unknown if missing: "salary"
    void if missing: "name"
    attestor: "all_stakeholders" ? "id" = "HR_TEAM"
    type: csv
    from data field: "name" -> "name"
        validity: 365 days
    from data field: "salary" -> "salary"
        unknowns: "N/A", ""
        validity: 365 days
    from data field: "department" -> "department_name"
        validity: 365 days
    match on: "name"

// Relationships
relationship: "department"
    of: "all_employees"
    from: "department_name"
    to: "all_departments" of "department"
    match on: "name"

// Partitions (all 4 branches mandatory)
partition: "well_paid"
    source entity sets: "all_employees"
    predicate: "salary" is greater than 100000
    when true: "well_paid"
    when false: "not_well_paid"
    when unknown: "salary_unknown"
    when void: "__IRRELEVANT__"

partition: "high_budget_dept"
    source entity sets: "all_employees"
    predicate: "budget" of "department" is greater than 300000
    when true: "high_budget_employees"
    when false: "low_budget_employees"
    when unknown: "__IRRELEVANT__"
    when void: "__IRRELEVANT__"

Serve this domain:

uv run kbpy history <repo-path> -o data.sqlite
uv run kbpy serve --db data.sqlite

See src/example_1.kb for a simple example and src/example_3/domain.kb for a complex domain with relationships, hierarchical partitions, and existential quantifiers.