Vadalog Rules Frontend: Per-Space Normalization UI
A point-and-click UI replacing the curl-only workflow for editing per-space normalization rules — aliases, label maps, predicate maps, and CSV field maps — with dry-run validation and a full round-trip contract.
What it does
Per-space normalization rules (.vada facts) used to be editable only via curl: users had to write Datalog-style syntax by hand and POST it to /api/v1/graph/rules. This article documents the end-to-end frontend stack that replaces that workflow with a point-and-click UI, plus the backend patches that made it possible.
The work ships in three layers:
- Layer 1 — Per-space rules panel. A modal in ThoughtSpace (header dropdown → "Space Rules") with tab editors for aliases, label maps, predicate maps, and CSV field maps.
- Layer 2 — Datasource onboarding wizard. Three-step flow (Connect → Map → Ingest) for registering external CSV/JSON feeds. Supports remote URL sources and browser CSV uploads.
- Layer 3 — Raw .vada editor. Plain monospaced textarea for power users, backed by a new POST /rules?dry_run=true flag that validates without persisting and returns a per-line error list.
Architecture
The stack spans three services. The ThoughtSpace frontend, built on Next.js, holds the rules-panel modal, the datasource wizard, the raw editor, and the Vadalog renderer. It talks over HTTP to Orchard, a FastAPI service that handles the rules routes and proxies requests onward. Orchard forwards to Barnyard — also FastAPI, with Celery for background work — which runs the graph routes and the Vadalog rule engine. Each space's rules are persisted as text on a SpaceConfig node in Neo4j, and any uploaded datasource files are stored in S3.
Endpoint Reference
Rules (save / get / delete)
| Method | Orchard path | Notes |
|---|---|---|
| GET | /rules?space_id= | Returns parsed_summary and parsed_rules |
| POST | /rules {space_id,content} | ?dry_run=true validates only |
| DELETE | /rules?space_id= | Clears space rules (reverts to global) |
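
As a rough illustration of the dry-run save described in the table above, the sketch below builds (but does not send) a POST /rules request. The base URL, the JSON encoding of the {space_id, content} body, and the helper name build_rules_request are assumptions made for this example; the article does not specify them, and the response shape is deliberately left out.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical base URL: the real Orchard host and path prefix are not given here.
ORCHARD_BASE = "http://localhost:8000"


def build_rules_request(space_id: str, content: str, dry_run: bool = False) -> urllib.request.Request:
    """Build (but do not send) a POST /rules request for one space."""
    url = f"{ORCHARD_BASE}/rules"
    if dry_run:
        # dry_run=true asks the backend to validate without persisting.
        url += "?" + urllib.parse.urlencode({"dry_run": "true"})
    # Assumption: the {space_id, content} body is sent as JSON.
    body = json.dumps({"space_id": space_id, "content": content}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_rules_request("space-123", 'canonical("nyc", "New York City").', dry_run=True)
print(req.get_method(), req.full_url)
```

Passing the result to urllib.request.urlopen would send it; keeping construction separate from sending makes the request easy to inspect in a test.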
Datasource wizard
| Method | Orchard path | Purpose |
|---|---|---|
| POST | /rules/datasource/preview | Fetch first N rows |
| POST | /rules/datasource/upload | Browser CSV → S3 + datasource() rule |
| POST | /rules/datasource/ingest | Queue datasource_ingest_task |
.vada Syntax Reference
Every rule is a single-line literal fact ending with a closing parenthesis and a period: ). Arguments are CSV-parsed (RFC 4180). Lines starting with % are comments.
| Rule | Signature | Behavior |
|---|---|---|
| canonical | canonical("alias", "Canonical"). | Merge alias → canonical entity name |
| label_map | label_map("surface", "canonical"). | Normalize GLiNER surface labels |
| predicate_map | predicate_map("raw phrase", "CANONICAL"). | Rewrite relation predicates |
| field_map | field_map("ExtCol", "text", "entity_label"). | Map CSV/JSON column → TextNode field |
| datasource | `datasource("name", "csv\|json", "url").` | Register an external CSV or JSON feed |
| datasource_auth | datasource_auth("name", "Header", "value"). | HTTP header for fetching source |
| group_by | group_by("field", "label", 20). | Bucket datasource rows by column |
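
To make the syntax rules above concrete, here is a minimal sketch of a line parser with a per-line error list in the spirit of the dry-run flag. It is not Barnyard's parser: the regex, the function names parse_fact and validate, and the error message wording are all assumptions for illustration. It relies on Python's csv module, which decodes doubled quotes inside quoted fields.

```python
import csv
import re

# Matches name(args). on one line; the args string is handed to the csv module.
FACT_RE = re.compile(r"^(\w+)\((.*)\)\.$")


def parse_fact(line: str):
    """Return (rule_name, args) for a fact line, or None for blanks and % comments."""
    text = line.strip()
    if not text or text.startswith("%"):
        return None
    match = FACT_RE.match(text)
    if match is None:
        raise ValueError(f"not a single-line fact ending with ').': {text!r}")
    name, raw_args = match.groups()
    # skipinitialspace accepts the space after each comma; doubled quotes ("")
    # inside a quoted value decode to a single quote character.
    args = next(csv.reader([raw_args], skipinitialspace=True))
    return name, args


def validate(content: str):
    """Collect (line_number, message) pairs instead of stopping at the first error."""
    errors = []
    for number, line in enumerate(content.splitlines(), start=1):
        try:
            parse_fact(line)
        except ValueError as exc:
            errors.append((number, str(exc)))
    return errors


print(parse_fact('canonical("alias", "Canonical").'))
print(validate('canonical("a", "b").\nbroken line\n% a comment'))
```

A real validator would also check rule names and argument counts against the signature table; this sketch only checks the outer shape of each line.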
Round-Trip Contract
The UI edits a structured state tree, not raw text; saves go through renderVada(state): string. The renderer's core invariant is that a round trip must be lossless: parsing a rules fixture, rendering it back to text, and parsing that result again must produce exactly the same structure as the first parse.
This is checked in Barnyard/unit_tests/test_vadalog_roundtrip.py. Any change to the TS renderer must update the Python mirror in the test.
Three specific invariants:
- CSV-quote escaping. Values with " inside must round-trip via RFC 4180 doubling.
- Key casing. The parser lowercases the first arg of canonical, label_map, and predicate_map.
- Numbers are CSV-quoted. The renderer emits group_by("field", "label", "20"), not bare numeric tokens.
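The round-trip property and the three invariants above can be sketched in a few lines of Python. This is an illustrative stand-in, not the renderer or the test from the repository: the list-of-tuples state shape and the names render_fact, render_vada, parse_fact, and parse_vada are hypothetical, since the article does not describe the real state tree.

```python
import csv
import re

FACT_RE = re.compile(r"^(\w+)\((.*)\)\.$")
# Rules whose first argument the parser lowercases (per the key-casing invariant).
LOWERCASED_KEYS = {"canonical", "label_map", "predicate_map"}


def quote(value) -> str:
    """CSV-quote one argument, doubling embedded quotes (RFC 4180 style)."""
    return '"' + str(value).replace('"', '""') + '"'


def render_fact(name: str, args) -> str:
    """Render one fact; every argument is quoted, numbers included."""
    return f"{name}({', '.join(quote(a) for a in args)})."


def render_vada(facts) -> str:
    return "\n".join(render_fact(name, args) for name, args in facts)


def parse_fact(line: str):
    match = FACT_RE.match(line.strip())
    if match is None:
        raise ValueError(f"malformed fact: {line!r}")
    name, raw_args = match.groups()
    args = next(csv.reader([raw_args], skipinitialspace=True))
    if name in LOWERCASED_KEYS and args:
        args[0] = args[0].lower()
    return name, args


def parse_vada(text: str):
    lines = (ln for ln in text.splitlines() if ln.strip() and not ln.lstrip().startswith("%"))
    return [parse_fact(ln) for ln in lines]


fixture = 'canonical("NYC ""Big Apple""", "New York City").\ngroup_by("field", "label", 20).'
first = parse_vada(fixture)
# The invariant: parse -> render -> parse yields the same structure.
assert parse_vada(render_vada(first)) == first
```

Note that the comparison is between parsed structures, not between texts: the rendered text may differ from the fixture (lowercased keys, quoted numbers) while the structure stays identical.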
Notable Fixes
- Pre-existing ingestion gap was patched. Before this work, .vada rules saved via the API were silently ignored during regular file uploads. The fix added load_space_rules_sync and wired it into both preprocess_text_node and _extract_relations.
- Per-upload compression moved out of memify. extract_graph_task now calls compress_entities_llm directly on every ingestion.
- Datasource preview URL validation. fetch_datasource_rows runs _validate_datasource_url, which rejects SSRF targets (private IPs, loopback, link-local, metadata endpoints).
- Recluster button semantics. Now wired to /clustering/recluster (memify only), not /clustering/full_recluster.
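For readers unfamiliar with the SSRF check mentioned in the list above, the following is a minimal sketch of that kind of validation using only the standard library. It is not the body of _validate_datasource_url, which the article does not show; check_datasource_url and is_allowed are hypothetical names, and the sketch only inspects IP literals and the name localhost. A production check would also resolve hostnames and validate the resolved addresses.

```python
import ipaddress
from urllib.parse import urlparse


def is_blocked_address(host: str) -> bool:
    """True if host is an IP literal in a private, loopback, or link-local range."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # not an IP literal; DNS resolution is out of scope here
    # Cloud metadata endpoints such as 169.254.169.254 fall in the link-local range.
    return addr.is_private or addr.is_loopback or addr.is_link_local


def check_datasource_url(url: str) -> None:
    """Raise ValueError for URLs that point at obviously internal targets."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {parsed.scheme!r}")
    host = parsed.hostname
    if not host:
        raise ValueError("URL has no host")
    if host == "localhost" or is_blocked_address(host):
        raise ValueError(f"blocked host: {host}")


def is_allowed(url: str) -> bool:
    """Boolean wrapper around check_datasource_url."""
    try:
        check_datasource_url(url)
    except ValueError:
        return False
    return True


print(is_allowed("https://example.com/feed.csv"))
print(is_allowed("http://169.254.169.254/latest/meta-data/"))
```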