Vadalog Rules Frontend: Per-Space Normalization UI

A point-and-click UI replacing the curl-only workflow for editing per-space normalization rules — aliases, label maps, predicate maps, and CSV field maps — with dry-run validation and a full round-trip contract.

Dawson Bauer

May 21, 2026

What it does

Per-space normalization rules (.vada facts) used to be editable only via curl: users had to write Datalog-style syntax by hand and POST it to /api/v1/graph/rules. This article documents the end-to-end frontend stack that replaces that workflow with a point-and-click UI, plus the backend patches that made it possible.

The work ships in three layers:

Layer 1 — Per-space rules panel. A modal in ThoughtSpace (header dropdown → "Space Rules") with tab editors for aliases, label maps, predicate maps, and CSV field maps.
Layer 2 — Datasource onboarding wizard. Three-step flow (Connect → Map → Ingest) for registering external CSV/JSON feeds. Supports remote URL sources and browser CSV uploads.
Layer 3 — Raw .vada editor. Plain monospaced textarea for power users, backed by a new POST /rules?dry_run=true flag that validates without persisting and returns a per-line error list.

Architecture

The stack spans three services. The ThoughtSpace frontend, built on Next.js, holds the rules-panel modal, the datasource wizard, the raw editor, and the Vadalog renderer. It talks over HTTP to Orchard, a FastAPI service that handles the rules routes and proxies requests onward. Orchard forwards to Anatypical — also FastAPI, with Celery for background work — which runs the graph routes and the Vadalog rule engine. Each space's rules are persisted as text on a SpaceConfig node in Neo4j, and any uploaded datasource files are stored in S3.

Endpoint Reference

Rules (save / get / delete)

Method	Orchard path	Notes
GET	`/rules?space_id=`	Returns `parsed_summary` and `parsed_rules`
POST	`/rules` `{space_id,content}`	`?dry_run=true` validates only
DELETE	`/rules?space_id=`	Clears space rules (reverts to global)
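
As a rough illustration of the dry-run flow, the sketch below builds (but does not send) the POST request for the rules route. The base URL, the JSON encoding of the `{space_id,content}` body, and the helper name `build_save_rules_request` are assumptions made for illustration; only the `/rules` path and the `?dry_run=true` flag come from the table above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_save_rules_request(base_url, space_id, content, dry_run=False):
    """Build a POST /rules request; hypothetical helper, not part of Orchard."""
    url = base_url.rstrip("/") + "/rules"
    if dry_run:
        # Ask the service to validate only, without persisting the rules.
        url += "?" + urlencode({"dry_run": "true"})
    # Assumption: the {space_id, content} body is sent as JSON.
    body = json.dumps({"space_id": space_id, "content": content}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. With `dry_run=True` the request targets the validate-only variant of the route, so nothing is saved.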

Datasource wizard

Method	Orchard path	Purpose
POST	`/rules/datasource/preview`	Fetch first N rows
POST	`/rules/datasource/upload`	Browser CSV → S3 + `datasource()` rule
POST	`/rules/datasource/ingest`	Queue `datasource_ingest_task`

`.vada` Syntax Reference

Every rule is a single-line literal fact ending with `).`, and its arguments are parsed as CSV (RFC 4180). Lines starting with % are comments.

Rule	Signature	Behavior
`canonical`	`canonical("alias", "Canonical").`	Merge alias → canonical entity name
`label_map`	`label_map("surface", "canonical").`	Normalize GLiNER surface labels
`predicate_map`	`predicate_map("raw phrase", "CANONICAL").`	Rewrite relation predicates
`field_map`	`field_map("ExtCol", "text", "entity_label").`	Map CSV/JSON column → TextNode field
`datasource`	`datasource("name", "csv|json", "url").`	Register an external CSV or JSON feed
`datasource_auth`	`datasource_auth("name", "Header", "value").`	HTTP header for fetching source
`group_by`	`group_by("field", "label", 20).`	Bucket datasource rows by column
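
A minimal sketch of how text in this syntax could be parsed, assuming Python's `csv` module is an acceptable stand-in for the RFC 4180 argument parsing. The function name `parse_vada` and the error message format are hypothetical; the actual Anatypical parser may behave differently.

```python
import csv
import re

# One fact per line: name(args...).  The greedy group keeps any ")" that
# appears inside a quoted argument.
FACT_RE = re.compile(r"(\w+)\((.*)\)\.")


def parse_vada(text):
    """Return a list of (rule_name, args) tuples, skipping blanks and comments."""
    facts = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # blank line or % comment
        match = FACT_RE.fullmatch(line)
        if match is None:
            raise ValueError(f"line {lineno}: expected name(...). but got {line!r}")
        name, raw_args = match.groups()
        # skipinitialspace lets the reader accept the space after each comma.
        args = next(csv.reader([raw_args], skipinitialspace=True))
        facts.append((name, args))
    return facts
```

For example, `parse_vada('% aliases\ncanonical("IBM Corp", "IBM").')` yields a single `("canonical", ["IBM Corp", "IBM"])` tuple, and a line missing its trailing `).` raises a `ValueError` that names the line number.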

Round-Trip Contract

The UI edits a structured state tree, not raw text; saves go through renderVada(state): string. The invariant the renderer must preserve is a lossless round trip: parsing a rules fixture, rendering it back to text, and parsing that result again must produce exactly the same structure as the original.

This is checked in Anatypical/unit_tests/test_vadalog_roundtrip.py. Any change to the TS renderer must update the Python mirror in the test.

Three specific invariants:

CSV-quote escaping. Values with " inside must round-trip via RFC 4180 doubling.
Key casing. The parser lowercases the first arg of canonical, label_map, and predicate_map.
Numbers are CSV-quoted. The renderer emits group_by("field", "label", "20"), not a bare numeric token.
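
The three invariants above can be sketched in Python as a tiny render/parse pair. This is an illustrative stand-in, not the code in test_vadalog_roundtrip.py or the TypeScript renderVada; the names `render_fact`, `parse_fact`, and `LOWERCASED_KEY_RULES` are hypothetical.

```python
import csv

# Rules whose first argument the parser lowercases.
LOWERCASED_KEY_RULES = {"canonical", "label_map", "predicate_map"}


def quote(value):
    """Quote a value as a CSV field, doubling embedded quotes (RFC 4180)."""
    return '"' + str(value).replace('"', '""') + '"'


def render_fact(name, args):
    """Render one fact; every argument, including numbers, is quoted."""
    return f"{name}({', '.join(quote(a) for a in args)})."


def parse_fact(line):
    """Parse one rendered fact back into (name, args)."""
    name, _, rest = line.strip().partition("(")
    if not rest.endswith(")."):
        raise ValueError(f"not a fact: {line!r}")
    args = next(csv.reader([rest[:-2]], skipinitialspace=True))
    if name in LOWERCASED_KEY_RULES and args:
        args[0] = args[0].lower()
    return name, args


def round_trips(line):
    """Check parse(render(parse(line))) == parse(line)."""
    first = parse_fact(line)
    return parse_fact(render_fact(*first)) == first
```

Because the key is lowercased at parse time, the comparison is between two parsed structures rather than between two text strings, which matches how the contract is phrased.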

Notable Fixes

Pre-existing ingestion gap was patched. Before this work, .vada rules saved via the API were silently ignored during regular file uploads. The fix added load_space_rules_sync and wired it into both preprocess_text_node and _extract_relations.
Per-upload compression moved out of memify. extract_graph_task now calls compress_entities_llm directly on every ingestion.
Datasource preview URL validation. fetch_datasource_rows runs _validate_datasource_url, which rejects SSRF targets (private IPs, loopback, link-local, metadata endpoints).
Recluster button semantics. Now wired to /clustering/recluster (memify only), not /clustering/full_recluster.
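
To make the datasource URL validation fix concrete, here is a hedged sketch of an SSRF check built on Python's standard `ipaddress` module. It is not the actual _validate_datasource_url; the function names, the injectable `resolve` parameter, and the exact set of rejected ranges are assumptions. Cloud metadata endpoints such as 169.254.169.254 fall under the link-local check.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def _resolve(host):
    """Resolve a hostname to the set of IP address strings it maps to."""
    return sorted({info[4][0] for info in socket.getaddrinfo(host, None)})


def is_internal(address):
    """True for addresses a datasource fetch should never reach."""
    ip = ipaddress.ip_address(address)
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_multicast
        or ip.is_unspecified
    )


def validate_datasource_url(url, resolve=_resolve):
    """Return url unchanged if it looks safe to fetch, else raise ValueError."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    host = parts.hostname
    if not host:
        raise ValueError("URL has no host")
    try:
        # IP literal: check it directly.
        addresses = [str(ipaddress.ip_address(host))]
    except ValueError:
        # Hostname: check every address it resolves to.
        addresses = resolve(host)
    for address in addresses:
        if is_internal(address):
            raise ValueError(f"blocked address for {host!r}: {address}")
    return url
```

Taking the resolver as a parameter keeps the check testable without network access. A production validator would also need to make sure the fetch uses the same address that was validated, since a hostname can resolve differently between the check and the request.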
