Computer Science and Parser IF in Sharpee
This post explains the data structures and algorithms behind Sharpee, a parser-based interactive fiction engine built in TypeScript. It's written for developers who learn best by seeing how CS concepts appear in real, working code rather than in textbook abstractions.
Every decision I made from early 2024 through today was based on identifying both modern and/or the optimal data structures and design patterns to build a modern Parser-IF platform. I started with some preferences and assumptions, but those were deeply investigated (partly in the C# days) and eventually evolved to serve the end goals.
I did not go to college for computer science and some of these patterns I had to learn as I was in design/build mode and ended up rewriting the foundations of Sharpee many times in those early iterations. Big O concepts, graphs, hash maps, directed graphs, bidirectional cached indexes...these were all things I had to learn, experiment with, and find the right balance for this solution.
And yes, I do find it ironic that a core part of Sharpee is nearly identical to TADS and Inform data structures, though Sharpee has useful additions. This document does not cover the Text Service or anything Typescript specific. It covers the underlying logic of the platform: the data store, parser, grammar, traits, and behaviors.
Every section names the CS concept first, then shows how Sharpee uses it.
The World Model: Choosing the Right Data Structures
An interactive fiction world is a collection of entities — rooms, objects, characters — connected by relationships: containment, location, direction. The original design discussion considered two approaches:
- Graph database: Every entity is a node, every relationship is an edge. Flexible but expensive to query and hard to constrain.
- Immutable state tree: Entities organized in a hierarchy with clean state management and fast serialization.
Sharpee chose a hybrid that uses the right structure for each job.
The Data Structure Taxonomy
| Structure | Access Pattern | Sharpee Usage |
|---|---|---|
| Hash Map | "Get X by name" — O(1) | Entity store, trait lookup by type |
| Tree | "What's inside X?" — parent/child | Containment, spatial model |
| Graph | "What connects to X?" — arbitrary edges | Room connections, directions |
| Array | "Do these in order" — indexed | Trait lists, event handler chains |
| Set | "Is X a member?" — O(1) membership | Children sets, visited tracking |
| Queue | "First in, first out" | Semantic event processing |
Why This Combination?
The most common operations in an IF engine are:
- Find entity by ID — happens every turn, multiple times. Hash map: O(1).
- What's in this room? — happens on every LOOK. Tree children: O(1) lookup.
- Where is this item? — happens on every TAKE/DROP. Tree parent: O(1) lookup.
- Where does north go? — happens on movement. Adjacency list: O(1) lookup.
A pure graph database would make operations 2 and 3 expensive (scan all edges to find relationships of a given type). A pure tree wouldn't handle room connections (rooms connect to siblings, not just parents and children). The hybrid gives O(1) for all four.
The Entity Store
entities: Map<string, IFEntity>
"brass-lantern" → { id: "brass-lantern", traits: [...], ... }
"living-room" → { id: "living-room", traits: [...], ... }
"troll" → { id: "troll", traits: [...], ... }
This is a hash map (TypeScript Map). Looking up any entity by its string ID is O(1) regardless of how many entities exist. The tradeoff: it tells you nothing about relationships between entities. That's what the SpatialIndex and room connections handle.
Computed Properties: The Description Getter
CS Concept: Priority Chain (Chain of Responsibility)
IFEntity.description is not a simple field read — it's a computed getter that walks a priority chain of traits:
get description():
1. OpenableTrait present? → return openDescription or closedDescription based on isOpen
2. SwitchableTrait present? → return onDescription or offDescription based on isOn
3. LightSourceTrait present? → return litDescription or unlitDescription based on isLit
4. Fallback → IdentityTrait.description
Each trait in the chain is checked only if the entity has it, and only if the trait's state-specific description field is populated. If not, the chain falls through to the next trait. This eliminates the need for event handlers to mutate descriptions when state changes — the description is always derived from current state.
Tradeoff: A getter runs on every access (no caching). For IF games where descriptions are read a few times per turn, this is negligible. The benefit is that state and description can never diverge — a classic "make illegal states unrepresentable" pattern applied to derived data.
The SpatialIndex: A Bidirectional Hash Map
CS Concept: Bidirectional Index (Inverted Index)
The SpatialIndex tracks who is inside what. It's the backbone of containment in Sharpee: items inside rooms, items inside containers, items carried by the player.
The Core Trick: Two Maps Pointing Opposite Directions
class SpatialIndex {
private parentToChildren: Map<string, Set<string>>;
private childToParent: Map<string, string>;
}
The same relationship is stored twice — once in each direction:
childToParent parentToChildren
───────────── ────────────────
"lantern" → "living-room" "living-room" → { "lantern", "rug", "player" }
"rug" → "living-room" "player" → { "sword", "bottle" }
"player" → "living-room" "bottle" → { "water" }
"sword" → "player"
"bottle" → "player"
"water" → "bottle"
Both questions are instant:
| Question | Which Map | Cost |
|---|---|---|
| "Where is the lantern?" | childToParent |
O(1) |
| "What's in the living room?" | parentToChildren |
O(1) |
Without the bidirectional index, one of these would require scanning every entity in the game — O(n). For a game with hundreds of entities, that matters.
Tradeoff: Double the memory for instant lookups in both directions. For an IF game with ~1000 entities, this is negligible — a thousand extra string pointers.
Why Set Instead of Array for Children
parentToChildren maps to a Set, not an Array:
| Operation | Array | Set |
|---|---|---|
| "Is lantern in this room?" | O(n) scan | O(1) |
| "Remove lantern from room" | O(n) find + splice | O(1) |
| "Add lantern to room" | O(1) push | O(1) |
| "List everything in room" | Already an array | O(n) conversion |
Sets win on the operations that happen most (membership check, removal). The only cost is converting to an array when listing room contents, which happens less frequently.
The Move Operation: Enforcing the Tree Invariant
addChild(parentId: string, childId: string): void {
// Remove from current parent (enforces: one parent only)
const currentParent = this.childToParent.get(childId);
if (currentParent) {
this.removeChild(currentParent, childId);
}
// Add to new parent
this.parentToChildren.get(parentId)!.add(childId);
this.childToParent.set(childId, parentId);
}
CS Concept: Structural Invariant
A tree guarantees that every node has exactly one parent. The addChild method enforces this — before adding to the new parent, it removes from the old one. An item can never be in two places simultaneously. The data structure makes the illegal state unrepresentable.
This is what every TAKE, DROP, PUT IN, and PUT ON command calls under the hood. take lantern calls addChild(playerId, lanternId). drop lantern calls addChild(roomId, lanternId).
Tree Traversal: Descendants and Ancestors
CS Concept: Depth-First Traversal
getAllDescendants walks down the tree recursively:
Living Room
├── table ← depth 1
│ ├── lantern ← depth 2
│ └── newspaper ← depth 2
├── rug ← depth 1
└── player ← depth 1
└── sword ← depth 2
The visited Set prevents infinite loops if a cycle is accidentally introduced — defensive programming against corrupted state. The maxDepth parameter caps traversal at 10 levels (a lantern inside a box inside a bag inside a chest... is unlikely to nest deeper).
getAncestors walks up: sword → player → living-room → (no parent, stop). This is a linked list traversal — follow parent pointers one hop at a time. Used for scope and visibility to determine what chain of containers an item is nested inside.
Serialization: Save/Restore
toJSON(): { parentToChildren: [...], childToParent: [...] }
loadJSON(data): void // rebuilds Maps and Sets from plain arrays
Maps and Sets are not directly JSON-serializable. toJSON converts them to plain arrays; loadJSON rebuilds. This is the "clean snapshot" advantage of the tree approach — the entire spatial state of the world reduces to a JSON object for save games.
Room Connections: Directed Graph as Adjacency List
CS Concept: Adjacency List Representation of a Directed Graph
Room connections are the one genuinely graph-like structure in Sharpee. Unlike the containment tree (one parent per item), rooms connect to arbitrary other rooms via directional exits.
The Data Structure
Each room stores its own exits as a dictionary of direction to destination:
// RoomTrait on the "kitchen" entity
exits: {
north: { destination: "hallway" },
east: { destination: "pantry", via: "pantry-door" },
}
This is an adjacency list — each node holds a map of its outgoing edges. The alternative is an adjacency matrix (a 2D grid of all possible connections):
| Representation | Memory | "Where does north go?" | "What leads to kitchen?" |
|---|---|---|---|
| Adjacency list | O(edges) | O(1) lookup | O(all rooms) scan |
| Adjacency matrix | O(rooms^2) | O(1) lookup | O(rooms) scan one row |
For a Zork-sized game with 191 rooms, a matrix would be 191 x 191 = 36,481 cells, almost all empty. The adjacency list only stores edges that exist. This is the right choice for sparse graphs — where each node connects to a small fraction of all other nodes (typically 3-6 exits per room).
Bidirectional Connections
connectRooms(room1Id, room2Id, direction): void {
const opposite = getOppositeDirection(direction);
RoomBehavior.setExit(room1, direction, room2Id); // kitchen.north → hallway
RoomBehavior.setExit(room2, opposite, room1Id); // hallway.south → kitchen
}
This creates an undirected edge using two directed edges. But the system supports one-way connections too — call setExit directly for a chute you fall down but cannot climb back up. This flexibility comes free from using a directed graph.
Edges with Attributes
interface IExitInfo {
destination: string; // which room you end up in
via?: string; // a door entity you must pass through
mapHint?: IExitMapHint; // rendering info for the auto-mapper
}
CS Concept: Labeled/Weighted Edges
The via field is a conditional edge — movement requires the door entity to be open. The door itself is an entity in the containment tree (the player can see it, examine it, unlock it). But it also appears as a property on a graph edge. The entity system and the graph system reference the same thing by ID — a clean cross-reference between two data structures.
Graph Search: findPath
The WorldModel exposes findPath(fromRoomId, toRoomId): string[] | null. This is a breadth-first search (BFS) — it starts at the source room, explores all neighbors, then all their neighbors, expanding outward until it reaches the destination.
findPath("kitchen", "library")
Step 1: kitchen → neighbors: hallway, pantry
Step 2: hallway → neighbors: kitchen (visited), library ← FOUND
Path: kitchen → hallway → library
BFS guarantees the shortest path (fewest rooms). Used for NPC navigation and hint systems.
Visibility: Filtered Tree Traversal
CS Concept: Tree Traversal with Predicate Filtering
Visibility answers the physical simulation question: given the laws of IF physics, what can the player perceive?
Three Layered Systems
┌──────────────────────────────────────────────┐
│ Scope (What the parser considers) │ ← Rule engine, indexed registry
│ "Can the player refer to this in a command?"│
├──────────────────────────────────────────────┤
│ Visibility (What the player perceives) │ ← Tree traversal with filters
│ "Can the player see/feel/notice this?" │
├──────────────────────────────────────────────┤
│ SpatialIndex (Where things are) │ ← Bidirectional hash maps
│ "What contains what?" │
└──────────────────────────────────────────────┘
Each layer adds meaning on top of the one below. The SpatialIndex is pure structure — no game logic. Visibility adds IF physics. Scope adds parser intelligence.
The Walk-Up Algorithm: Line of Sight
To check if the player can see the wrench, walk up the containment tree from the wrench to the room, checking each container:
Can the player see the wrench?
wrench
└── inside toolbox ← Is toolbox a container? YES
Is it opaque? YES
Is it open? NO
BLOCKED — can't see inside closed opaque toolbox
Can the player see the sword?
sword
└── carried by player ← Is player an actor? YES
Actors don't block visibility
└── in Living Room ← Is this a room? YES — done
VISIBLE
This is a linked list traversal with predicates — follow parent pointers, check a condition at each hop. Each hop is O(1). The number of hops is bounded by nesting depth (capped at 10).
All three visibility checks (isAccessible, hasLineOfSight, isVisible) delegate to a single shared algorithm — isContainmentPathClear — which walks from an entity up to its room, returning false if any closed opaque container blocks the path. The three public-facing methods are thin wrappers that add their own pre-checks before calling the shared walker.
The Recursive Descent: getVisible
When the game needs to describe a room (LOOK command), getVisible walks down the tree from the room:
Living Room
├── mailbox (visible)
│ └── leaflet — is mailbox open? YES → leaflet visible
├── glass case (visible)
│ └── jeweled egg — is case open? NO, but transparent? YES → egg visible
├── locked safe (visible)
│ └── gold coins — is safe open? NO, transparent? NO → coins NOT visible
└── player (skip self)
└── sword (added separately as "carried items")
CS Concept: Depth-First Tree Traversal with Pruning
Closed opaque containers are pruning points — the recursion stops there, avoiding unnecessary work. The seen Set prevents duplicate entries.
Darkness: Changing the Traversal Rules
The darkness system does not change the containment tree. It changes the traversal rules:
- Lit room: Normal visibility — walk the tree, check containers.
- Dark room with no light: Short-circuit. Only two things visible: what you are carrying (you can feel it), and lit light sources in the room (they glow).
hasLightSource does a full recursive search of the room, then for each light source found, walks back up with isContainmentPathClear to check if the light can "escape" its containers. A flashlight inside a closed opaque box does not light the room.
CS Concept: Nested Traversals — a downward traversal (find lights) with an upward traversal (check accessibility) at each candidate. For an IF game with ~20 objects per room, this is negligible. For 10,000 entities in one room, you'd want a dedicated index.
Scope: A Three-Phase Pipeline
CS Concept: Pipeline Architecture with Specialized Resolvers
Scope determines what the player can refer to in a command. It overlaps with visibility but is not identical — you can see a painting across the room, but can you take it?
Three specialized resolvers handle scope at different phases of the turn cycle. They share no code and serve different callers:
┌─────────────────────────────────────────────────────────────┐
│ 1. RuleScopeEvaluator (world-model, pre-parse) │
│ Rule engine with indexed registry. Populates vocabulary.│
│ │
│ 2. GrammarScopeResolver (parser, parse phase) │
│ Delegates to WorldModel.getVisibleEntities() etc. │
│ Filters candidates for grammar slot constraints. │
│ │
│ 3. StandardScopeResolver (stdlib, validation phase) │
│ Entity resolution with scoring and disambiguation. │
│ Delegates canSee() to world.canSee(). │
└─────────────────────────────────────────────────────────────┘
Phase 1: RuleScopeEvaluator — Triple-Indexed Rule Store
CS Concept: Production Rule System with Multi-Key Indexing
class RuleScopeEvaluator {
private rules: Map<string, IScopeRule>; // primary store
private rulesByLocation: Map<string, Set<string>>; // location index
private rulesByAction: Map<string, Set<string>>; // action index
private globalRules: Set<string>; // always-apply index
}
Three indexes over the same rule set (the same bidirectional indexing idea as SpatialIndex, but with three dimensions):
rules (primary store)
────────────────────
"core.room-contents" → { rule definition... }
"core.inventory" → { rule definition... }
"story.magic-mirror" → { rule definition... }
rulesByLocation (index) rulesByAction (index)
─────────────────────── ─────────────────────
"dark-cave" → { "story.echo" } "taking" → { "core.touchable" }
"tower-top" → { "story.telescope"} "looking" → { "core.visible" }
globalRules (index)
───────────────────
{ "core.room-contents", "core.inventory" }
When evaluating scope, the evaluator:
- Grabs all global rules (always apply) — Set lookup: O(1)
- Grabs rules for the current room (location-indexed) — Map lookup: O(1)
- Filters by the current action ("taking") — Map lookup: O(1)
- Sorts by priority — O(n log n) on a small set
- Evaluates each rule's condition function
- Collects all entity IDs from passing rules into a Set
This is a mini query engine. The indexes make steps 1-3 fast. Without them, you'd scan all rules every time — acceptable for 10 rules, slow for 100.
The evaluator caches results:
private cache: Map<string, IScopeEvaluationResult>;
// Key: "actorId:locationId:actionId"
CS Concept: Memoization — if scope has been computed for this actor+location+action combination, return the cached result. The cache is invalidated after any world state mutation (move, open, close).
Phase 2: GrammarScopeResolver — Parse-Time Filtering
CS Concept: Delegation / Facade
During grammar matching, the slot consumer needs to know which entities are valid for a constraint like .touchable(). The GrammarScopeResolver delegates to WorldModel methods (getVisibleEntities, getTouchableEntities, etc.), which internally use VisibilityBehavior. It does not call the RuleScopeEvaluator — it accesses the visibility layer directly.
Phase 3: StandardScopeResolver — Validation-Time Disambiguation
The command validator resolves noun phrases to specific entity IDs. The StandardScopeResolver builds candidate lists, scores them, and handles disambiguation. It delegates visibility checks to world.canSee(), which calls through to VisibilityBehavior.canSee() — the single canonical implementation.
Scope Levels
All three resolvers use the same scope level enumeration:
CARRIED = 4 // in inventory — can manipulate freely
REACHABLE = 3 // can physically touch — required for TAKE, OPEN
VISIBLE = 2 // can see — required for EXAMINE, LOOK AT
AWARE = 1 // know it exists (heard, smelled) — for LISTEN, SMELL
UNAWARE = 0 // doesn't exist to the player
These form an ordered enumeration — each level includes all levels above it. If something is REACHABLE (3), it is also VISIBLE (2) and AWARE (1). The filter check is a simple comparison: entityScope >= requiredScope.
The Parser Pipeline: Data Structure Transformations
When the player types take brass lantern, the input passes through a pipeline where each stage transforms one data structure into another.
Stage 1: Tokenization — String to Token Array
"take brass lantern" → [
{ word: "take", normalized: "take", position: 0 },
{ word: "brass", normalized: "brass", position: 1 },
{ word: "lantern", normalized: "lantern", position: 2 }
]
CS Concept: Lexical Analysis (Tokenization)
The same first step as any compiler or interpreter. Raw text becomes structured tokens with normalized forms (lowercased, punctuation stripped) and positional metadata.
Stage 2: Grammar Matching — Token Array to Pattern Match Tree
The grammar engine tries each registered rule against the tokens:
Rule "take :item" → match! (confidence 1.0, priority 100)
Rule "take :item from :src" → fail (no "from" token)
Rule "put :item in :dest" → fail ("take" ≠ "put")
CS Concept: Pattern Matching / Unification
Each rule is a pattern that either unifies with the input or fails. The engine collects all successful matches and sorts them by confidence then priority. This is similar to how Prolog resolves queries against a knowledge base, or how a parser generator tries grammar productions.
Stage 3: Slot Consumption — Token Span to Entity Candidates
The matched slot :item needs to consume "brass lantern" and find the actual entity. The EntitySlotConsumer:
- Gets the scope constraint from the grammar rule (e.g.,
.touchable()) - Asks the world model for all touchable entities
- Matches "brass lantern" against entity names and aliases
- Returns candidates with confidence scores
CS Concept: Constraint Satisfaction
The consumer doesn't just match text — it checks whether the match makes sense given world state. The entity must exist, be in scope, and satisfy trait constraints. This is a constraint satisfaction problem (CSP) with a small search space (typically 15-20 entities in a room).
Stage 4: Parsed Command — Structured Object
{
action: "if.action.taking",
directObject: {
text: "brass lantern",
head: "lantern",
modifiers: ["brass"],
candidates: ["brass lantern"]
}
}
The parser's output is a structured intermediate representation — not yet a resolved command, but no longer raw text.
Stage 5: Validation and Disambiguation — Candidates to Resolved Entity
The command validator does definitive entity resolution:
- Search by name, type, synonym, adjective — multiple hash map lookups
- Filter by required scope level — tree traversal for visibility
- Score candidates — weighted sum of match quality factors
- Disambiguate — pick highest score or prompt the player
Scoring factors:
| Factor | Points |
|---|---|
| Exact name match | +10 |
| Entity type match | +8 |
| Alias/synonym match | +6 |
| Adjective match | +4 |
| Each matching modifier | +5 |
| In inventory | +2 |
| Reachable | +1 |
| Visible | +1 |
If scores tie, the engine produces the classic IF prompt:
> take lantern
Which lantern do you mean, the brass lantern or the crystal lantern?
CS Concept: Scoring/Ranking with Weighted Features
This is a simplified version of the same approach used in information retrieval (search engines) — compute a relevance score from multiple signals, rank candidates, pick the best.
Stage 6: Action Execution — Resolved Command to State Mutation
The action receives resolved entity IDs, executes its four-phase cycle (validate/execute/report/blocked), and calls SpatialIndex.addChild() to move entities. The cycle continues on the next turn.
Complete Pipeline
"take brass lantern"
│
▼ String
[Tokenize] ─── String → Array of tokens
│
▼ Token[]
[Grammar Match] ─── Token[] → PatternMatch[] (sorted by confidence)
│
▼ PatternMatch
[Slot Consumer] ─── Token span → Entity candidates
│ GrammarScopeResolver → SpatialIndex → VisibilityBehavior
│
▼ IParsedCommand
[Validator] ─── Noun phrases → Resolved entity IDs
│ Search (hash map) → Filter (tree traversal) → Score (sort)
│
▼ Resolved command
[Action] ─── Execute → SpatialIndex.addChild() to move entities
Every stage is a data structure transformation. The parser never touches the SpatialIndex directly — it goes through scope/visibility. The scope/visibility layers never touch raw token strings. Clean boundaries, each layer speaking its own data structure language.
The Grammar System: A Production Rule Engine
CS Concept: Production Rule System (Expert System Pattern)
The grammar system is a collection of pattern rules evaluated against input, producing ranked matches. It is a specialized pattern-matching language — a miniature programming language for describing what English sentences look like and what they mean.
The GrammarRule Structure
interface GrammarRule {
id: string; // "if.action.taking#take_:item"
pattern: string; // "take :item"
compiledPattern: CompiledPattern;
slots: Map<string, SlotConstraint>;
action: string; // "if.action.taking"
priority: number; // 100
semantics?: SemanticMapping;
}
A production rule says "this pattern of symbols produces this meaning." In Sharpee: "the pattern take :item produces the action if.action.taking with the :item slot filled by whatever the player typed."
Pattern Compilation
CS Concept: Compiled Pattern (same idea as compiled regular expressions)
The raw pattern string is compiled into a CompiledPattern — an array of PatternToken objects — at startup. This avoids re-parsing the pattern string on every player command.
Pattern: "put :item in|into|inside :container"
Compiled:
Index 0: { type: 'literal', value: 'put' }
Index 1: { type: 'slot', value: 'item', slotType: ENTITY }
Index 2: { type: 'alternates', value: 'in', alternates: ['in','into','inside'] }
Index 3: { type: 'slot', value: 'container', slotType: ENTITY }
Pattern Token Types
| Token Type | Syntax | What It Matches | Example |
|---|---|---|---|
| Literal | put |
Exactly that word | "put" and nothing else |
| Alternates | in|into|inside |
Any one of those words | "into" matches |
| Slot | :item |
One or more words, consumed by a slot consumer | "brass lantern" |
| Optional | [carefully] |
Zero or one occurrence | Can be skipped |
| Greedy Slot | :message... |
Everything until next pattern element | "hello how are you" |
The Two Builder APIs
.forAction() — The Code Generator
grammar
.forAction('if.action.attacking')
.verbs(['attack', 'kill', 'fight', 'slay', 'murder', 'hit', 'strike'])
.pattern(':target')
.build();
CS Concept: Cartesian Product / Macro Expansion
This single call generates seven grammar rules (one per verb). With two patterns, it would generate fourteen. It takes a compact specification and expands it into many rules — the same principle as macros or template metaprogramming.
The .directions() variant is even more dramatic:
grammar
.forAction('if.action.going')
.directions({
'north': ['north', 'n'],
'south': ['south', 's'],
// ... 10 more directions
})
.build();
// Generates 24 rules (12 directions × 2 aliases each)
Each generated rule carries semantic data — { direction: 'north' } — so the action knows which direction was intended regardless of whether the player typed "north" or "n".
.define() — The Explicit Form
grammar
.define('put :item in|into|inside :container')
.hasTrait('container', TraitType.CONTAINER)
.mapsTo('if.action.inserting')
.withPriority(100)
.build();
One pattern, one rule. Use .define() for:
- Phrasal verbs: "pick up :item", "put down :item" —
.forAction()expects a single-word verb - Preposition patterns: "put X in Y" has structure between slots
- One-off commands: "save", "help", "score" — no verb expansion needed
Slot Constraints: The Filter Pipeline
CS Concept: Predicate Chain / Filter Pipeline
Each :slot can have constraints that limit which entities are valid:
.where('target', scope => scope
.touchable() // base: must be reachable
.matching({ portable: true }) // property: must be portable
.hasTrait(TraitType.CONTAINER) // trait: must be a container
.orExplicitly(['special-id']) // always include this one
)
Each method narrows the candidate set:
All entities (~500)
→ .touchable() → entities you can reach (~15)
→ .matching(...) → those that are portable (~10)
→ .hasTrait(...) → those with ContainerTrait (~2)
→ .orExplicitly() → plus one specific entity (3)
This is the same pattern as .filter().filter().filter() on arrays, or SQL WHERE ... AND ... AND .... Each step reduces the search space.
Multi-Word Entity Names: The Boundary Problem
CS Concept: Greedy Parsing with Lookahead
When the parser sees put brass lantern in toolbox, it must determine where "brass lantern" ends and "in" begins.
The slot consumer looks ahead for the next non-slot pattern element (in|into|inside) and consumes tokens until it hits one of those delimiters:
Tokens: "brass" "lantern" "in" "toolbox"
^^^^^^^^^^^^^^^^^^^^^^
Consume until "in" → item = "brass lantern"
For consecutive slots without a delimiter (give :recipient :item), it tries every possible split point and picks the one where constraints are satisfied:
"give troll brass lantern"
Try: recipient="troll", item="brass lantern"
→ "troll" is an Actor? YES
→ "brass lantern" is an entity? YES → match
Try: recipient="troll brass", item="lantern"
→ "troll brass" is an Actor? NO → reject
This is a brute-force search over partition points — but with only 3-4 tokens to split, the search space is trivially small.
Priority and Confidence: Resolving Ambiguity
CS Concept: Multi-Criteria Sorting
When multiple rules match, the engine sorts by confidence first, then priority:
matches.sort((a, b) => {
if (b.confidence !== a.confidence) return b.confidence - a.confidence;
return b.rule.priority - a.rule.priority;
});
Confidence starts at 1.0 and decreases:
- Each skipped optional element: multiply by 0.9
- Uncertain slot consumption (partial entity match): reduced further
- Explicit
experimentalConfidenceoverride on the rule
Priority is set by the grammar author:
| Level | Usage |
|---|---|
| 150+ | Story-specific patterns (override stdlib) |
| 110 | Instrument patterns (with :weapon) |
| 100 | Standard patterns (default) |
| 95 | Synonyms and alternatives |
| 90 | Single-character abbreviations |
This creates a natural specificity ordering — more specific patterns (more words matched, fewer optionals skipped, higher priority) rank higher. Similar to CSS specificity: more specific selectors win.
Semantic Mappings: Preserving Intent
grammar
.define('throw :item at|to :target')
.withSemanticPrepositions({
'at': 'at', // hostile: throw AT the troll
'to': 'to' // friendly: throw ball TO the child
})
.mapsTo('if.action.throwing')
.build();
The grammar captures not just structure but intent. The action reads semantics.spatialRelation and behaves differently for "at" vs "to".
Story Grammar Extensions
CS Concept: Open/Closed Principle
Stories extend grammar using the same builder API:
extendParser(grammar: GrammarBuilder): void {
grammar
.define('incant :spell')
.fromVocabulary('spell', 'incantations')
.mapsTo('dungeo.action.incant')
.withPriority(150) // beats all stdlib rules
.build();
}
Story rules are appended after core rules but sorted by priority, so they are checked first. The grammar is closed for modification (core rules don't change) but open for extension (stories add new rules).
Grammar Catalog
A complete reference of all grammar rules currently defined in the Sharpee parser.
.forAction() Rules (Verb Expansion)
Each entry generates one rule per verb listed.
| Action | Verbs | Pattern | Constraint | Pri |
|---|---|---|---|---|
looking |
look, l | (none) | — | 100 |
examining |
examine, x, inspect | :target |
— | 100 |
taking |
take, get, grab | :item |
— | 100 |
dropping |
drop, discard | :item |
— | 100 |
eating |
eat, consume, devour | :item |
— | 100 |
drinking |
drink, sip, quaff | :item |
— | 100 |
reading |
read, peruse, study | :target |
— | 100 |
inventory |
inventory, inv, i | (none) | — | 100 |
pushing |
push, press, shove, move | :target |
— | 100 |
pulling |
pull, drag, yank | :target |
— | 100 |
lowering |
lower | :target |
— | 100 |
raising |
raise, lift | :target |
— | 100 |
waiting |
wait, z | (none) | — | 100 |
quitting |
quit, q | (none) | — | 100 |
touching |
touch, rub, feel, pat, stroke, poke, prod | :target |
— | 100 |
attacking |
attack, kill, fight, slay, murder, hit, strike | :target |
— | 100 |
switching_on |
turn, switch, flip | on :device |
hasTrait SWITCHABLE | 100 |
switching_off |
turn, switch, flip | off :device |
hasTrait SWITCHABLE | 100 |
going |
(24 direction aliases) | bare direction | direction semantic | 100 |
Direction aliases: north/n, south/s, east/e, west/w, northeast/ne, northwest/nw, southeast/se, southwest/sw, up/u, down/d, in, out.
.define() Rules — No Constraints
| Pattern | Action | Pri | Notes |
|---|---|---|---|
look at :target |
examining | 95 | phrasal verb |
look [carefully] at :target |
examining_carefully | 96 | optional adverb |
look [around] |
looking | 101 | optional word |
search [carefully] |
searching | 100 | optional adverb |
search :target |
searching | 100 | |
look in|inside :target |
searching | 100 | alternates |
look through :target |
searching | 100 | |
rummage in|through :target |
searching | 95 | synonym |
pick up :item |
taking | 100 | phrasal verb |
put down :item |
dropping | 100 | phrasal verb |
go :direction |
going | 100 | where: direction |
save |
saving | 100 | literal |
restore |
restoring | 100 | literal |
restart |
restarting | 100 | literal |
undo |
undoing | 100 | literal |
score |
scoring | 100 | literal |
version |
version | 100 | literal |
help |
help | 100 | literal |
about |
about | 100 | literal |
info |
about | 100 | synonym |
credits |
about | 100 | synonym |
say :message |
saying | 100 | text slot |
shout :message |
shouting | 100 | text slot |
write :message |
writing | 100 | text slot |
exit |
exiting | 100 | bare command |
get out |
exiting | 100 | phrasal verb |
leave |
exiting | 95 | synonym |
climb out |
exiting | 100 | phrasal verb |
disembark |
exiting | 100 | bare command |
alight |
exiting | 95 | synonym |
again |
again | 100 | repeat command |
g |
again | 90 | abbreviation |
.define() Rules — Trait Constraints
| Pattern | Action | Constrained Slot | Trait | Pri |
|---|---|---|---|---|
open :door |
opening | door | OPENABLE | 100 |
close :door |
closing | door | OPENABLE | 100 |
turn :device on |
switching_on | device | SWITCHABLE | 100 |
turn :device off |
switching_off | device | SWITCHABLE | 100 |
put :item in|into|inside :ctr |
inserting | ctr | CONTAINER | 100 |
insert :item in|into :ctr |
inserting | ctr | CONTAINER | 100 |
put :item on|onto :sup |
putting | sup | SUPPORTER | 100 |
enter :portal |
entering | portal | ENTERABLE | 100 |
get in :portal |
entering | portal | ENTERABLE | 100 |
get into :portal |
entering | portal | ENTERABLE | 100 |
climb in :portal |
entering | portal | ENTERABLE | 100 |
climb into :portal |
entering | portal | ENTERABLE | 100 |
go in :portal |
entering | portal | ENTERABLE | 100 |
go into :portal |
entering | portal | ENTERABLE | 100 |
board :vehicle |
entering | vehicle | ENTERABLE | 100 |
get on :vehicle |
entering | vehicle | ENTERABLE | 100 |
exit :container |
exiting | container | ENTERABLE | 100 |
disembark :vehicle |
exiting | vehicle | ENTERABLE | 100 |
get off :vehicle |
exiting | vehicle | ENTERABLE | 100 |
give :item to :recipient |
giving | recipient | ACTOR | 100 |
give :recipient :item |
giving | recipient | ACTOR | 95 |
offer :item to :recipient |
giving | recipient | ACTOR | 100 |
show :item to :recipient |
showing | recipient | ACTOR | 100 |
show :recipient :item |
showing | recipient | ACTOR | 95 |
tell :recipient about :topic |
telling | recipient | ACTOR | 100 |
ask :recipient about :topic |
asking | recipient | ACTOR | 100 |
say :message to :recipient |
saying_to | recipient | ACTOR | 105 |
whisper :message to :recipient |
whispering | recipient | ACTOR | 100 |
write :message on :surface |
writing_on | — | — | 105 |
.define() Rules — Instrument Slots
All instrument patterns use priority 110 to beat their simpler counterparts.
| Pattern | Action | Instrument | Extra Constraint | Pri |
|---|---|---|---|---|
take :item from :container with :tool |
taking_with | tool | — | 110 |
unlock :door with :key |
unlocking | key | — | 110 |
open :ctr with :tool |
opening_with | tool | hasTrait OPENABLE | 110 |
cut :object with :tool |
cutting | tool | — | 110 |
attack :target with :weapon |
attacking | weapon | — | 110 |
kill :target with :weapon |
attacking | weapon | — | 110 |
hit :target with :weapon |
attacking | weapon | — | 110 |
strike :target with :weapon |
attacking | weapon | — | 110 |
dig :location with :tool |
digging | tool | — | 110 |
hang :item on :hook |
putting | — | — | 110 |
throw :item at :target |
throwing | — | — | 100 |
throw :item to :recipient |
throwing | — | — | 100 |
Summary by Constraint Type
| Constraint Type | Where Used | Count |
|---|---|---|
| No constraint | Most forAction rules, bare commands, meta | ~85 |
| hasTrait OPENABLE | open, close, open_with | 3 |
| hasTrait SWITCHABLE | turn/switch/flip on/off | 4 |
| hasTrait CONTAINER | put in, insert in | 2 |
| hasTrait SUPPORTER | put on | 1 |
| hasTrait ENTERABLE | enter, board, get in/on, exit, disembark | 11 |
| hasTrait ACTOR | give, offer, show, tell, ask, say to, whisper | 9 |
| instrument() | with :key, with :weapon, with :tool | 9 |
| where (direction) | go :direction | 1 |
Trait constraints cluster around physical interaction (OPENABLE, CONTAINER, ENTERABLE) and social interaction (ACTOR). Most action verbs (take, drop, push, pull, eat, drink, read) have no grammar-level constraints — their validation happens in the action's validate phase instead, which has access to richer context.
Glossary
| Term | Definition |
|---|---|
| Adjacency list | Graph representation where each node stores its own edges |
| Bidirectional index | Storing a relationship in both directions for O(1) lookup either way |
| BFS | Breadth-first search — explore all neighbors before going deeper |
| Constraint satisfaction | Finding values that satisfy all given constraints simultaneously |
| Depth-first traversal | Tree/graph traversal that goes deep before going wide |
| Filter pipeline | Chained predicates, each narrowing the candidate set |
| Hash map | Key-value store with O(1) lookup (Map, Dictionary, Object) |
| Inverted index | Mapping from values back to keys (children to parent alongside parent to children) |
| Lexical analysis | Breaking raw text into structured tokens |
| Memoization | Caching function results to avoid recomputation |
| Production rule | A pattern that maps input to output in a rule-based system |
| Pruning | Skipping branches of a search tree that cannot lead to a valid result |
| Sparse graph | A graph where most possible edges do not exist |
| Structural invariant | A property guaranteed by the data structure itself, not by runtime checks |
| Tree | A graph where every node has exactly one parent (except the root) |
| Unification | Matching a pattern against input, binding variables to matched values |