Core Concepts¶
All Terse concepts explained in plain English.
The Three Laws¶
Law I¶
"Knowledge is not stored, it is organized. Retrieval is not lookup, it is reconstruction."
Knowledge in Terse isn't a database you query. It's a graph of relationships you traverse. When you ask Terse what a dog is, it doesn't look up a record — it reconstructs the answer by following associations.
Law II¶
"Capability is not authorization."
Just because a function can do something doesn't mean it should. Terse separates what code is capable of from what it is permitted to do. The ethics keyword encodes this distinction at the language level. The NCI Ethics Core chip enforces it in silicon.
Law III¶
"The compiler works harder so you don't have to."
Memory allocation, type layout, tensor representation, hardware targeting — these are compiler problems, not programmer problems. Terse code reads simply. The compiler handles the complexity underneath.
Knowledge Graph¶
The fundamental data structure in Terse. Not an array. Not a hash map. A graph of nodes connected by typed, weighted edges.
Analogy
A mind map. When you think of "dog", you don't retrieve a database record — you activate a cluster of associations: animal, fur, loyal, chases cats. Terse stores knowledge the same way.
Node¶
A concept in the knowledge graph. Created with know.
Edge¶
A relationship between two nodes. Created with relationship syntax.
Weight¶
A numeric strength on a fact or edge. Higher weight = stronger association.
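To make nodes, typed edges, and weights concrete, here is a minimal Python sketch of the structure described above. It is an illustration only, not Terse's internal representation: the `Edge` and `KnowledgeGraph` classes, the `associations` helper, and the example weights are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    relation: str        # edge type, e.g. "is" or "has"
    target: str          # name of the node the edge points to
    weight: float = 1.0  # association strength; higher means stronger


@dataclass
class KnowledgeGraph:
    # Maps each node name to its outgoing edges.
    edges: dict[str, list[Edge]] = field(default_factory=dict)

    def know(self, subject: str, relation: str, target: str, weight: float = 1.0) -> None:
        """Create both nodes if needed and connect them with a typed, weighted edge."""
        self.edges.setdefault(subject, []).append(Edge(relation, target, weight))
        self.edges.setdefault(target, [])

    def associations(self, subject: str) -> list[Edge]:
        """Return a node's edges, strongest association first."""
        return sorted(self.edges.get(subject, []), key=lambda e: e.weight, reverse=True)


graph = KnowledgeGraph()
graph.know("dog", "is", "animal", weight=0.9)   # hypothetical weights
graph.know("dog", "has", "fur", weight=0.8)
graph.know("dog", "chases", "cats", weight=0.6)

strongest = graph.associations("dog")[0]
print(strongest.relation, strongest.target)  # is animal
```

Asking for a node's associations is a traversal over its edges rather than a record lookup, which is the distinction Law I draws.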
Inference¶
The process of deriving new facts from existing ones using rules.
Given a rule such as when has fur then is mammal, running infer dog while dog has fur is true makes Terse derive dog is mammal. No explicit code is needed; the rule fires automatically.
Analogy
A detective. Sherlock doesn't just recall facts — he chains them. Tan line → outdoors a lot → army doctor. Terse inference works the same way.
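The chaining described above is commonly implemented as forward chaining: keep applying rules until nothing new can be derived. The Python sketch below shows that technique under stated assumptions. The triple encoding of facts, the pair encoding of rules, and the second rule (mammal implies animal) are invented for illustration and are not taken from the Terse implementation.

```python
# A fact is a (subject, relation, value) triple, e.g. ("dog", "has", "fur").
# A rule is (condition, conclusion), each a (relation, value) pair:
# "when has fur then is mammal" becomes (("has", "fur"), ("is", "mammal")).

def infer(facts, rules, subject):
    """Forward-chain over `rules` until no new fact about `subject` appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for (cond_rel, cond_val), (then_rel, then_val) in rules:
            derived = (subject, then_rel, then_val)
            if (subject, cond_rel, cond_val) in known and derived not in known:
                known.add(derived)
                changed = True
    return known


facts = {("dog", "has", "fur")}
rules = [
    (("is", "mammal"), ("is", "animal")),  # hypothetical second rule, listed first on purpose
    (("has", "fur"), ("is", "mammal")),
]
derived = infer(facts, rules, "dog")
print(sorted(derived - facts))
# [('dog', 'is', 'animal'), ('dog', 'is', 'mammal')]
```

The loop repeats until a full pass adds nothing, so rule order does not matter: the mammal fact derived on the first pass lets the animal rule fire on the second.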
Static Type System¶
Terse is a statically typed language. Variables, function arguments, and function return values have known types at compile time.
| Terse type | What it holds | Example |
|---|---|---|
| Int | Whole numbers | count = 42 |
| Float | Decimal numbers | score = 0.92 |
| Bool | True or false | found = contains(s, "x") |
| String | Text | greeting = "hello" |
A boolean and an integer are distinguishable at compile time. The compiler refuses to let them mix without explicit conversion.
Why static typing
Terse is intended for AI systems where correctness, performance, and safety guarantees matter. Dynamic typing trades these for surface-level convenience. Static typing matches the seriousness of the language's intended use — and lets the compiler catch bugs before they touch data.
Type checking enforcement arrives in Phase 3.4. The infrastructure for it lives in the compiler today.
Strings — Text or Concept?¶
A string in Terse can serve two roles depending on how it's used. The same literal "dog" can be a sequence of characters when measured with length(), or a reference to the "dog" node in the knowledge graph when used with infer.
// "dog" as text: three characters
size = length("dog")
// "dog" as concept: a node in the graph
know dog is mammal
infer dog
The function being called determines which interpretation applies. There is no separate syntax for "this is a string" versus "this is a concept" — Terse trusts the context.
Why this works
Law I — knowledge is organized, not stored. The same identifier can function as text or as a concept depending on context. Forcing syntactic distinction would create two parallel string types that the user has to manage. Terse keeps them unified and lets the function being called decide.
The formal bridge between text strings and concept references — the syntax for converting one to the other when needed — arrives in Phase 3.4 with the standard library.
Markov Chain Sequence Learning¶
Terse can learn probabilistic sequences — "given this concept, what comes next?"
After two training sequences, predict after chases returns cat with a confidence score, because cat always follows chases in the training data.
Analogy
Autocomplete, but predicting the next concept from learned transitions between concepts rather than from statistics over raw text.
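A first-order Markov chain can be sketched in a few lines of Python: count which concept follows which, then report the most frequent follower and its share of the observations as a confidence. The `SequenceModel` class, its method names, and the two training sequences are assumptions for illustration; the document does not show the sequences it trained on or how Terse computes confidence.

```python
from collections import Counter, defaultdict


class SequenceModel:
    """First-order Markov chain over concept names."""

    def __init__(self):
        # transitions[current][following] = times `following` came right after `current`
        self.transitions = defaultdict(Counter)

    def learn(self, sequence):
        for current, following in zip(sequence, sequence[1:]):
            self.transitions[current][following] += 1

    def predict_after(self, concept):
        """Return (most likely next concept, confidence), or (None, 0.0) if unseen."""
        counts = self.transitions.get(concept)
        if not counts:
            return None, 0.0
        best, hits = counts.most_common(1)[0]
        return best, hits / sum(counts.values())


model = SequenceModel()
model.learn(["dog", "chases", "cat"])    # hypothetical training sequence
model.learn(["puppy", "chases", "cat"])  # hypothetical training sequence

print(model.predict_after("chases"))  # ('cat', 1.0)
```

Because cat follows chases in both sequences, the confidence is 2 out of 2. A third sequence ending in "chases ball" would lower it to 2 out of 3.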
Semantic Compression¶
Terse includes an original compression algorithm designed specifically for knowledge graphs. Unlike general-purpose compressors (gzip, zstd) which compress bytes without understanding meaning, Terse compression understands structure — nodes, facts, relationships, weights — and uses that understanding to compress intelligently.
The algorithm runs in four phases:
Phase 1: Structural deduplication. Facts shared across multiple nodes are stored once in a shared pool. Every node that has fur points to the same pool entry instead of storing its own copy of the string.
Phase 2 — Weight scoring. Each fact is scored by importance — how many nodes share it, whether it appears in inference rules. High-weight facts are foundational. Low-weight facts are candidates for pruning.
Phase 3 — Inference pruning. Any fact that can be reconstructed from existing rules is removed from storage. If dog has fur exists and the rule when has fur then is mammal exists, then dog is mammal doesn't need to be stored. The expander pre-resolves these facts so expansion is instant — no inference engine needed at expand time.
Phase 4 — Signature generation. A SHA256 fingerprint of the compressed state. Same knowledge always produces the same signature. Verified at expand time to guarantee integrity.
Design principle
Compress slow, expand fast. All intelligence goes into the compressor. The expander just merges two dictionaries.
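The four phases can be sketched end to end in Python. This is a toy model under stated assumptions, not the Terse algorithm: the `compress` and `expand` functions, the one-condition rule format, and the JSON serialization are invented here. Weight scoring is reduced to counting how many nodes share a fact, and the scores are returned without being used for pruning. Keeping the pre-resolved facts in a second dictionary is one reading of "merges two dictionaries".

```python
import hashlib
import json


def compress(nodes, rules):
    """nodes: {name: set of facts}; rules: {condition fact: derived fact}."""
    # Phase 1: structural deduplication. Each distinct fact gets one pool slot.
    pool = sorted({fact for facts in nodes.values() for fact in facts})
    index = {fact: i for i, fact in enumerate(pool)}

    # Phase 2: weight scoring. Here a score is just how many nodes share the fact.
    scores = {fact: sum(fact in facts for facts in nodes.values()) for fact in pool}

    # Phase 3: inference pruning. A fact a rule can rebuild from another fact on
    # the same node leaves the stored set and is recorded as pre-resolved.
    stored, resolved = {}, {}
    for name, facts in nodes.items():
        rebuilt = {rules[f] for f in facts if f in rules} & facts
        stored[name] = sorted(index[f] for f in facts - rebuilt)
        resolved[name] = sorted(index[f] for f in rebuilt)

    # Phase 4: signature generation over a canonical serialization.
    state = {"pool": pool, "stored": stored, "resolved": resolved}
    payload = json.dumps(state, sort_keys=True).encode("utf-8")
    state["signature"] = hashlib.sha256(payload).hexdigest()
    return state, scores


def expand(state):
    """Verify the signature, then merge stored and pre-resolved facts per node."""
    body = {k: v for k, v in state.items() if k != "signature"}
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    if hashlib.sha256(payload).hexdigest() != state["signature"]:
        raise ValueError("signature mismatch: compressed state was modified")
    pool = state["pool"]
    return {
        name: {pool[i] for i in state["stored"][name] + state["resolved"][name]}
        for name in state["stored"]
    }


nodes = {
    "dog": {"has fur", "is mammal", "is loyal"},
    "cat": {"has fur", "is mammal"},
}
rules = {"has fur": "is mammal"}

state, scores = compress(nodes, rules)
assert expand(state) == nodes  # lossless round trip
print(scores["has fur"])       # 2
```

Expansion does no rule evaluation: it checks the fingerprint and unions two index lists per node, which is the "compress slow, expand fast" trade in miniature.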
Sealed Blocks¶
The sealed keyword cryptographically locks a block of ethics rules or knowledge. It makes ethical constraints mathematical rather than social.
sealed ethics_core
    ethics rule protect_children
        when target is minor
        when action is harmful
        then deny with reason: "Absolute limit"
How sealing works:
The author runs seal.py once, which generates a SHA256 signature of the block content. That signature is hardcoded as a constant in the project. At every boot, the runtime recomputes the signature and compares. Any modification to the block — removing a rule, weakening a condition, changing a reason string — produces a different signature and the system refuses to start.
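The boot-time check described above amounts to hashing the block and comparing against a recorded constant. The Python sketch below shows that idea with the standard `hashlib` module. The function names, the error type, and the way the block text is held in a string are assumptions for illustration; the actual contents of seal.py are not shown in this document.

```python
import hashlib


def seal(block_text: str) -> str:
    """Return the SHA-256 hex digest of a block's exact text."""
    return hashlib.sha256(block_text.encode("utf-8")).hexdigest()


def verify_at_boot(block_text: str, expected_signature: str) -> None:
    """Refuse to start if the block no longer matches its recorded signature."""
    if seal(block_text) != expected_signature:
        raise RuntimeError("sealed block modified: refusing to start")


SEALED_BLOCK = '''sealed ethics_core
ethics rule protect_children
when target is minor
when action is harmful
then deny with reason: "Absolute limit"
'''

# In a real project this constant would be generated once and committed;
# it is computed inline here only to keep the sketch self-contained.
EXPECTED_SIGNATURE = seal(SEALED_BLOCK)

verify_at_boot(SEALED_BLOCK, EXPECTED_SIGNATURE)  # unchanged block: passes silently

# Removing one condition changes the digest, so verification would fail.
weakened = SEALED_BLOCK.replace("when action is harmful\n", "")
print(seal(weakened) == EXPECTED_SIGNATURE)  # False
```

Any edit to the text, including whitespace, changes the digest, so the signature covers the reason strings as well as the rules.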
Analogy
A wax seal on a letter. If the seal is broken, you know the letter was tampered with. Terse sealed blocks are cryptographic wax seals: faking one would mean finding different content with the same SHA256 signature, which is computationally infeasible.
Why this matters
Most AI ethics systems are social contracts — policies written in documents, enforced by people. People leave. Companies get acquired. Priorities change. A sealed block doesn't care. A SHA256 hash doesn't negotiate. The system either boots or it doesn't.
The threat model in plain terms:
Someone wants to remove the child protection rules from an NCI deployment. Here's what they face:
- Edit the sealed block → signature mismatch → boot failure
- Update the hardcoded signature → visible git commit, auditable forever
- Remove the verification call → requires gutting the interpreter itself
- Patch memory at runtime → verification runs before any execution
There is no quiet path. Every attempt leaves a trail or causes a hard stop.
Graph Semantics, Tensor Performance¶
Terse presents a graph-shaped programming model to the developer. Under the hood, the compiler represents knowledge structures as tensors for performance. The programmer writes intuitive graph code. The compiler generates fast tensor operations.
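One common way to give a graph a tensor form is a weighted adjacency matrix, where following edges from every node at once becomes a vector-matrix product. The Python sketch below illustrates that general idea with plain lists. The node names, weights, and `step` function are assumptions for illustration, and nothing here describes how the Terse compiler actually lays out its tensors.

```python
nodes = ["dog", "mammal", "animal"]
position = {name: i for i, name in enumerate(nodes)}

# Graph view: (source, target, weight) edges, as the programmer thinks of them.
edges = [("dog", "mammal", 0.9), ("mammal", "animal", 1.0)]  # hypothetical weights

# Tensor view: adjacency[i][j] is the weight of the edge from node i to node j.
adjacency = [[0.0] * len(nodes) for _ in nodes]
for source, target, weight in edges:
    adjacency[position[source]][position[target]] = weight


def step(activation):
    """One traversal hop for every node at once: a vector-matrix product."""
    return [
        sum(activation[i] * adjacency[i][j] for i in range(len(nodes)))
        for j in range(len(nodes))
    ]


start = [1.0, 0.0, 0.0]   # fully activate "dog"
one_hop = step(start)     # [0.0, 0.9, 0.0]
two_hops = step(one_hop)  # [0.0, 0.0, 0.9]
print(one_hop, two_hops)
```

The same product maps directly onto hardware-accelerated matrix operations, which is the kind of performance benefit the graph-to-tensor split is aiming for.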
Ethics as a Language Construct¶
Terse provides ethics, rule, deny, and allow as language-level keywords — not library calls.
This separates capability (what the function can do) from authorization (what it's permitted to do). This is Law II encoded in syntax.
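The capability and authorization split can be illustrated in Python with a wrapper that checks rules before a function body ever runs. This is an analogy, not Terse semantics: the `guarded` decorator, the `Denied` exception, the rule tuple layout, and the `send_message` function are all hypothetical names invented for this sketch.

```python
import functools


class Denied(Exception):
    """Raised when a call is capable of running but not authorized to."""


# Each rule: (name, predicate over the request context, reason given on denial).
RULES = [
    ("no_manipulation", lambda ctx: ctx.get("intent") == "manipulation", "Absolute limit"),
]


def guarded(func):
    """Wrap a capability so every call is checked against RULES first."""
    @functools.wraps(func)
    def wrapper(ctx, *args, **kwargs):
        for name, matches, reason in RULES:
            if matches(ctx):
                raise Denied(f"{name}: {reason}")
        return func(ctx, *args, **kwargs)
    return wrapper


@guarded
def send_message(ctx, text):
    # The capability itself: it can send anything it is handed.
    return f"sent: {text}"


print(send_message({"intent": "inform"}, "hello"))  # sent: hello

try:
    send_message({"intent": "manipulation"}, "hello")
except Denied as err:
    print(err)  # no_manipulation: Absolute limit
```

The function body contains no ethics logic at all. In Python that separation is a convention a caller could bypass; making it a language construct is what removes the bypass.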
The NCI Ethics Core chip takes this further — ethics rules written in Terse are compiled to silicon and executed in hardware. A capability that is not authorized never reaches the AI system. You cannot jailbreak hardware.
NCI Integration¶
Terse ethics rules run in production inside NCI (Native Compression Intelligence) on Oracle — a CM5 device running in Sundre, Alberta.
The integration works in two layers:
Declaration layer (now): Ethics rules are written in Terse .trs files, loaded at NCI startup by terse_loader.py, and registered into the NCI Ethics Engine. The reason string from each Terse rule is used verbatim in Oracle's response when a request is denied. Rules written in a human-readable language, enforced by a production AI system.
Core layer (future): Once Terse has full numeric support and LLVM performance, NCI's Python core will be rewritten in Terse. The language and the AI become one thing.
// rules/ethics_rules.trs — runs in production on Oracle
ethics rule no_manipulation
    when intent is manipulation
    then deny with reason: "Absolute limit"
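To show how a declaration layer of this kind can work, the Python sketch below parses the rule shape shown above and returns the reason string verbatim when a request matches. It is a guess at the general approach, not the contents of terse_loader.py: the `load_rules` and `check` functions, the dictionary layout, the request format, and the assumption that multiple when lines are combined with AND are all invented for illustration.

```python
import re

RULE_TEXT = '''
// hypothetical copy of the rule shown above
ethics rule no_manipulation
when intent is manipulation
then deny with reason: "Absolute limit"
'''


def load_rules(text):
    """Parse 'ethics rule / when / then deny' lines into dictionaries."""
    rules, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("//"):
            continue  # skip blank lines and comments
        if m := re.fullmatch(r"ethics rule (\w+)", line):
            current = {"name": m.group(1), "when": {}, "reason": None}
            rules.append(current)
        elif current is not None and (m := re.fullmatch(r"when (\w+) is (\w+)", line)):
            current["when"][m.group(1)] = m.group(2)
        elif current is not None and (m := re.fullmatch(r'then deny with reason: "(.*)"', line)):
            current["reason"] = m.group(1)
        else:
            raise ValueError(f"unrecognized line: {line!r}")
    return rules


def check(rules, request):
    """Return the first matching rule's reason verbatim, or None if allowed."""
    for rule in rules:
        if all(request.get(key) == value for key, value in rule["when"].items()):
            return rule["reason"]
    return None


rules = load_rules(RULE_TEXT)
print(check(rules, {"intent": "manipulation"}))  # Absolute limit
print(check(rules, {"intent": "inform"}))        # None
```

Because the reason is carried through untouched, the text a rule author writes is the text a user sees when the request is denied.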
Milestone
Session 9: a Terse ethics rule fired in production at confidence 1.0. The reason string from the .trs file appeared in Oracle's response to the user. Stage 1 integration complete.
Self-Hosting¶
The long-term goal is self-hosting. Terse is implemented in Python until it is capable of compiling itself. Once the LLVM compiler (Phase 3) is complete, Terse will be rewritten in Terse, and the compiler will bootstrap itself.
This is a milestone, not a current goal. It's included here because it's a meaningful test of language completeness.