Memory health 08-beliefs-maintenance
`recall beliefs` and `recall maintenance --derive` audit the graph between turns — surfacing contradictions, stale/low-trust cells, and what to reverify.
script — screencasts/08-beliefs-maintenance.sh
#!/usr/bin/env bash
# Screencast 08 — "Memory health: beliefs + maintenance"
# Recall audits its own graph between turns: surfaces conflicts, stale cells,
# low-trust calibration, and what to reverify. Fresh isolated graph; the real
# user graph is never touched (tripwire asserts the global node count is unchanged).
set -u
HERE="$(cd "$(dirname "$0")" && pwd)"
source "$HERE/lib/present.sh"
H=/Users/hendrixx./.claude/skills/recall/scripts/recall_helper.py
# ---- isolation: fresh temp DB; record real global node count for the tripwire ----
DB="$(mktemp -t sc.XXXXXX).sqlite3"; rm -f "$DB"
recall init --db "$DB" >/dev/null 2>&1
GLOBAL="$HOME/.recall/db/global.sqlite3"
G0=$(recall status --db "$GLOBAL" 2>/dev/null | grep -oE '"nodes": *[0-9]+' | head -1 | grep -oE '[0-9]+')
write(){ # kind title body conf topics [contradicts]
if [ -n "${6:-}" ]; then
python3 "$H" --kind "$1" --title "$2" --body "$3" --confidence "$4" --topics "$5" --contradicts "$6" > /tmp/sc08p.json 2>/dev/null
else
python3 "$H" --kind "$1" --title "$2" --body "$3" --confidence "$4" --topics "$5" > /tmp/sc08p.json 2>/dev/null
fi
recall admit --json /tmp/sc08p.json --db "$DB" 2>/dev/null | jq -r '.node.id'
}
title "Memory health — Recall audits its own graph between turns" \
"beliefs + maintenance: conflicts, stale cells, low trust, and what to reverify"
# ---------------------------------------------------------------------------
scene "seed a graph with a known conflict + a weak belief"
narrate "Three cells. A decision, a newer decision that supersedes it, and a low-confidence hunch."
cmd "recall admit (decision) # original choice"
ID1=$(write decision "Use bcrypt for password hashing" \
"Adopt bcrypt with cost 12 for all password storage." 0.85 "auth,security")
note "→ $ID1" | out
cmd "recall admit (decision --contradicts) # supersedes the original"
ID2=$(write decision "Switch to argon2id for password hashing" \
"Argon2id is the recommended KDF now; supersedes the bcrypt decision." 0.9 "auth,security" "$ID1")
note "→ $ID2 contradicts ${ID1:0:8}" | out
cmd "recall admit (hypothesis, conf 0.30) # unverified, low trust"
ID3=$(write hypothesis "Rate limiting may reduce credential stuffing" \
"Unverified hunch that per-IP rate limits cut stuffing attempts." 0.3 "auth,security")
note "→ $ID3 (low confidence / unverified)" | out
endbox
# ---------------------------------------------------------------------------
scene "recall beliefs — the health report"
narrate "Between turns, Recall doesn't just store cells. It grades the graph it's holding."
cmd "recall beliefs --db \$DB"
recall beliefs --db "$DB" 2>/dev/null \
| jq -r '.report.stats | "stats nodes=\(.nodes) relations=\(.relations) rollback=\(.rollbackEntries)"' | out
recall beliefs --db "$DB" 2>/dev/null \
| jq -r '.report.contradictions | "conflicts \(length) open contradiction(s)"' | out
endbox
# ---------------------------------------------------------------------------
scene "the conflict it found"
narrate "It pairs the two decisions: the newer one contradicts the older, with a severity."
cmd "recall beliefs … | jq '.report.contradictions[]'"
recall beliefs --db "$DB" 2>/dev/null \
| jq -r '.report.contradictions[] | "[sev \(.severity)] \(.sourceTitle)\n ⇒ contradicts ⇒ \(.targetTitle)"' | out
endbox
# ---------------------------------------------------------------------------
scene "calibration — who is over-confident?"
narrate "Stated confidence vs how often that actor's cells got contradicted (Brier = lower is better)."
cmd "recall beliefs … | jq '.report.calibration[]'"
recall beliefs --db "$DB" 2>/dev/null \
| jq -r '.report.calibration[] | "actor=\(.actor) cells=\(.cells) contradicted=\(.contradicted) meanConf=\(.meanConfidence) brier=\(.brierScore)"' | out
note "low-confidence hunch drags meanConf down — exactly what reverification should target" | out
endbox
# ---------------------------------------------------------------------------
scene "nextActions — what to do about it"
narrate "The report ends in a worklist: curiosity targets, then concrete next actions."
cmd "recall beliefs … | jq '.report.curiosityTargets[], .report.nextActions[]'"
recall beliefs --db "$DB" 2>/dev/null \
| jq -r '.report.curiosityTargets[0] | "target • \(.title) (priority \(.priority))\n \(.suggestedAction)"' | out
recall beliefs --db "$DB" 2>/dev/null \
| jq -r '.report.nextActions[] | "next → \(. | gsub(", starting with.*$"; ", starting with the bcrypt cell."))"' | out
endbox
# ---------------------------------------------------------------------------
scene "recall maintenance --derive — run the pass"
narrate "Maintenance runs the derive pass and writes a signed health witness back into the graph."
cmd "recall maintenance --derive --db \$DB"
MJSON=$(recall maintenance --derive --db "$DB" 2>/dev/null)
echo "$MJSON" | jq -r '.result.node.summary | "summary \(.)"' | out
echo "$MJSON" | jq -r '.result.node | "witness \(.title) [\(.provenance.verification), by \(.provenance.produced_by)]"' | out
echo "$MJSON" | jq -r '.result.relations[] | "relation \(.kind) → concern raised on the contradicted cell"' | out
note "the derive pass closes the loop: the pressure it found is now itself a tracked cell" | out
endbox
# ---------------------------------------------------------------------------
scene "isolated"
G1=$(recall status --db "$GLOBAL" 2>/dev/null | grep -oE '"nodes": *[0-9]+' | head -1 | grep -oE '[0-9]+')
if [ "$G0" = "$G1" ]; then
note "global graph untouched: $G0 -> $G1 ✓ (demo ran on a throwaway db)"
else
note "TRIPWIRE FAILED: global graph changed $G0 -> $G1 — DO NOT publish"
fi
endbox
rm -f "$DB" /tmp/sc08p.json
printf '\n✦ DONE\n'
captured output (unedited real run)
Memory health — Recall audits its own graph between turns
beliefs + maintenance: conflicts, stale cells, low trust, and what to reverify
┌─ seed a graph with a known conflict + a weak belief ─────────────────────────────────────────
│ ▸ Three cells. A decision, a newer decision that supersedes it, and a low-confidence hunch.
│ $ recall admit (decision) # original choice
│ │ → c303e686-95e1-437d-81e4-b9cf58f1907c
│ $ recall admit (decision --contradicts) # supersedes the original
│ │ → 106f1dd7-18ca-41c0-90b4-6a8d0ba84496 contradicts c303e686
│ $ recall admit (hypothesis, conf 0.30) # unverified, low trust
│ │ → 3d2835cd-5141-4d11-9d96-7037af3ce481 (low confidence / unverified)
└──────────────────────────────────────────────────────────────
┌─ recall beliefs — the health report ─────────────────────────────────────────
│ ▸ Between turns, Recall doesn't just store cells. It grades the graph it's holding.
│ $ recall beliefs --db $DB
│ stats nodes=3 relations=1 rollback=4
│ conflicts 1 open contradiction(s)
└──────────────────────────────────────────────────────────────
┌─ the conflict it found ─────────────────────────────────────────
│ ▸ It pairs the two decisions: the newer one contradicts the older, with a severity.
│ $ recall beliefs … | jq '.report.contradictions[]'
│ [sev 0.9] Switch to argon2id for password hashing
│ ⇒ contradicts ⇒ Use bcrypt for password hashing
└──────────────────────────────────────────────────────────────
┌─ calibration — who is over-confident? ─────────────────────────────────────────
│ ▸ Stated confidence vs how often that actor's cells got contradicted (Brier = lower is better).
│ $ recall beliefs … | jq '.report.calibration[]'
│ actor=claude-code cells=3 contradicted=1 meanConf=0.633 brier=0.33
│ │ low-confidence hunch drags meanConf down — exactly what reverification should target
└──────────────────────────────────────────────────────────────
┌─ nextActions — what to do about it ─────────────────────────────────────────
│ ▸ The report ends in a worklist: curiosity targets, then concrete next actions.
│ $ recall beliefs … | jq '.report.curiosityTargets[], .report.nextActions[]'
│ target • Use bcrypt for password hashing (priority 0.9)
│ resolve, supersede, or add a concern/contradiction witness
│ next → Review 1 contradiction/concern relation(s), starting with the bcrypt cell.
└──────────────────────────────────────────────────────────────
┌─ recall maintenance --derive — run the pass ─────────────────────────────────────────
│ ▸ Maintenance runs the derive pass and writes a signed health witness back into the graph.
│ $ recall maintenance --derive --db $DB
│ summary 0 belief pressures, 0 stale findings, 1 contradiction findings.
│ witness Recall memory health report [checked, by recall-maintenance]
│ relation concerns → concern raised on the contradicted cell
│ │ the derive pass closes the loop: the pressure it found is now itself a tracked cell
└──────────────────────────────────────────────────────────────
┌─ isolated ─────────────────────────────────────────
│ global graph untouched: 334 -> 334 ✓ (demo ran on a throwaway db)
└──────────────────────────────────────────────────────────────
✦ DONE
Calibration — per-actor trust 09-calibration
A proper Brier score per actor: how well stated confidence matched reality. Overconfident-and-wrong writers get discounted, feeding effective-confidence ranking.
script — screencasts/09-calibration.sh
#!/usr/bin/env bash
# Screencast 09 — "Calibration: per-actor trust via Brier score."
# Fresh isolated graph. One actor (Ada) writes four evidence-backed, high-confidence
# cells; then three corrections each --contradicts a REAL prior cell id. We run
# `recall calibration` and read the per-actor row: cells, contradicted, rate,
# meanConfidence, brierScore — Recall's measure of how well stated confidence
# matched reality. Real CLI output only; isolation tripwire asserts the real graph
# is untouched.
set -u
HERE="$(cd "$(dirname "$0")" && pwd)"; source "$HERE/lib/present.sh"
mkdir -p "$HERE/transcripts"
T="$HERE/transcripts/09-calibration.txt"
HELP=/Users/hendrixx./.claude/skills/recall/scripts/recall_helper.py
# --- isolation: unique fresh DB; never honor the user's real graph ---
DB="$(mktemp -t sc.XXXXXX).sqlite3"; rm -f "$DB"; recall init --db "$DB" >/dev/null 2>&1
GLOBAL="$HOME/.recall/db/global.sqlite3"
G0=$(recall status --db "$GLOBAL" 2>/dev/null | grep -oE '"nodes": *[0-9]+' | head -1 | grep -oE '[0-9]+')
AID="agent:ada"; ADISP="Ada"
# write helper: emit proposal -> admit -> echo new id (no nested $() in a function)
propose(){ # kind title body conf topics srcfiles [contradicts]
python3 "$HELP" --kind "$1" --title "$2" --body "$3" --confidence "$4" \
--topics "$5" --source-files "$6" --actor-id "$AID" --actor-display "$ADISP" \
${7:+--contradicts $7} > "$PJ" 2>/dev/null
}
{
title "Calibration — per-actor trust via Brier score" "Recall scores how well each actor's stated confidence matched reality"
scene "four confident, evidence-backed claims"
narrate "Ada records four findings, each cited to a source file and asserted at high confidence."
PJ="$(mktemp)"; propose observation "Auth uses JWT in headers" "Service reads a bearer token from the Authorization header." 0.9 auth "src/auth.ts#L10"
A=$(recall admit --json "$PJ" --db "$DB" 2>/dev/null | jq -r '.node.id'); rm -f "$PJ"
PJ="$(mktemp)"; propose observation "Primary store is Postgres 14" "All persistence is on Postgres 14." 0.88 db "infra/db.yaml"
B=$(recall admit --json "$PJ" --db "$DB" 2>/dev/null | jq -r '.node.id'); rm -f "$PJ"
PJ="$(mktemp)"; propose decision "Cache sessions in Redis" "Sessions cached in Redis with a 1h TTL." 0.9 sessions "src/session.ts#L5"
C=$(recall admit --json "$PJ" --db "$DB" 2>/dev/null | jq -r '.node.id'); rm -f "$PJ"
PJ="$(mktemp)"; propose observation "API rate limit is 100 rpm" "Gateway caps clients at 100 requests/min." 0.85 api "gw/config.yaml"
D=$(recall admit --json "$PJ" --db "$DB" 2>/dev/null | jq -r '.node.id'); rm -f "$PJ"
cmd "recall calibration # baseline: nothing contradicted yet"
recall calibration --db "$DB" 2>/dev/null \
| jq -r '.calibration[] | "\(.actor) cells=\(.cells) contradicted=\(.contradicted) meanConf=\(.meanConfidence) brier=\(.brierScore)"' | out
note "4 cells, 0 contradicted, brier ~0.014 — well-calibrated so far (confident, all standing)."
endbox
scene "reality disagrees — three corrections supersede prior claims"
narrate "Later evidence shows three of Ada's claims were wrong. Each correction is admitted"
narrate "with --contradicts pointing at the REAL prior cell id (so the edge resolves)."
PJ="$(mktemp)"; propose observation "Auth uses httpOnly cookies, not headers" "The header claim was wrong; token lives in an httpOnly cookie." 0.8 auth "src/auth.ts#L42" "$A"
recall admit --json "$PJ" --db "$DB" 2>/dev/null | jq -r '"contradicts \(.relations[0].targetId[0:8]) <- \(.node.title)"' | out; rm -f "$PJ"
PJ="$(mktemp)"; propose decision "Sessions live in Postgres, not Redis" "Redis was dropped; sessions moved to a Postgres table." 0.8 sessions "src/session.ts#L60" "$C"
recall admit --json "$PJ" --db "$DB" 2>/dev/null | jq -r '"contradicts \(.relations[0].targetId[0:8]) <- \(.node.title)"' | out; rm -f "$PJ"
PJ="$(mktemp)"; propose observation "API rate limit is 1000 rpm, not 100" "Gateway config corrected to 1000/min." 0.8 api "gw/config.yaml#L8" "$D"
recall admit --json "$PJ" --db "$DB" 2>/dev/null | jq -r '"contradicts \(.relations[0].targetId[0:8]) <- \(.node.title)"' | out; rm -f "$PJ"
note "Three of Ada's four confident claims now carry a resolved contradiction edge."
endbox
scene "calibration re-scored — overconfidence is now visible"
narrate "Brier is a proper scoring rule: a cell that attracted a resolved contradiction"
narrate "scored outcome 0; one still standing scored 1. You can't game it by hedging."
cmd "recall calibration"
recall calibration --db "$DB" 2>/dev/null | jq -r '.calibration[]
| "\(.actor) cells=\(.cells) contradicted=\(.contradicted) rate=\(.contradictedRate)",
" meanConf=\(.meanConfidence) meanConfContradicted=\(.meanConfidenceContradicted) brier=\(.brierScore)"' | out
endbox
scene "what the row means"
narrate "~0.43 of Ada's claims went wrong; meanConfidenceContradicted stayed high (~0.88)"
narrate "— overconfident-and-wrong. brierScore jumps from ~0.014 to ~0.35 (lower is better)."
note "This per-actor score feeds effective-confidence ranking: an actor who is confident"
note "AND wrong gets discounted, so their future writes weigh less than a calibrated peer's."
endbox
scene "isolated"
G1=$(recall status --db "$GLOBAL" 2>/dev/null | grep -oE '"nodes": *[0-9]+' | head -1 | grep -oE '[0-9]+')
note "global graph untouched: ${G0:-?} -> ${G1:-?}"
endbox
printf '\n✦ DONE\n'
} > "$T" 2>&1
rm -f "$DB"
echo "captured $(wc -l < "$T") lines -> $T"
captured output (unedited real run)
Calibration — per-actor trust via Brier score
Recall scores how well each actor's stated confidence matched reality
┌─ four confident, evidence-backed claims ─────────────────────────────────────────
│ ▸ Ada records four findings, each cited to a source file and asserted at high confidence.
│ $ recall calibration # baseline: nothing contradicted yet
│ agent:ada cells=4 contradicted=0 meanConf=0.883 brier=0.014
│ 4 cells, 0 contradicted, brier ~0.014 — well-calibrated so far (confident, all standing).
└──────────────────────────────────────────────────────────────
┌─ reality disagrees — three corrections supersede prior claims ─────────────────────────────────────────
│ ▸ Later evidence shows three of Ada's claims were wrong. Each correction is admitted
│ ▸ with --contradicts pointing at the REAL prior cell id (so the edge resolves).
│ contradicts 91c321b6 <- Auth uses httpOnly cookies, not headers
│ contradicts f0629592 <- Sessions live in Postgres, not Redis
│ contradicts 299fc992 <- API rate limit is 1000 rpm, not 100
│ Three of Ada's four confident claims now carry a resolved contradiction edge.
└──────────────────────────────────────────────────────────────
┌─ calibration re-scored — overconfidence is now visible ─────────────────────────────────────────
│ ▸ Brier is a proper scoring rule: a cell that attracted a resolved contradiction
│ ▸ scored outcome 0; one still standing scored 1. You can't game it by hedging.
│ $ recall calibration
│ agent:ada cells=7 contradicted=3 rate=0.429
│ meanConf=0.847 meanConfContradicted=0.883 brier=0.354
└──────────────────────────────────────────────────────────────
┌─ what the row means ─────────────────────────────────────────
│ ▸ ~0.43 of Ada's claims went wrong; meanConfidenceContradicted stayed high (~0.88)
│ ▸ — overconfident-and-wrong. brierScore jumps from ~0.014 to ~0.35 (lower is better).
│ This per-actor score feeds effective-confidence ranking: an actor who is confident
│ AND wrong gets discounted, so their future writes weigh less than a calibrated peer's.
└──────────────────────────────────────────────────────────────
┌─ isolated ─────────────────────────────────────────
│ global graph untouched: 334 -> 334
└──────────────────────────────────────────────────────────────
✦ DONE
Standing watch programs 12-watch-program
Attach a non-LLM monitor to a belief via a hyperedge; it trips when the graph turns against the watched claim.
script — screencasts/12-watch-program.sh
#!/usr/bin/env bash
# 12-watch-program.sh — Standing WATCH programs over a hyperedge.
# Self-contained capture script: runs REAL recall commands against a fresh
# isolated graph and narrates a clean transcript to stdout.
set -uo pipefail
HERE="$(cd "$(dirname "$0")" && pwd)"
source "$HERE/lib/present.sh"
H=/Users/hendrixx./.claude/skills/recall/scripts/recall_helper.py
# --- ISOLATION: fresh temp DB; recall routes by --db, never the real graph ---
DB="$(mktemp -t sc12.XXXXXX).sqlite3"; rm -f "$DB"
recall init --db "$DB" >/dev/null 2>&1
GLOBAL="$HOME/.recall/db/global.sqlite3"
G0=$(recall status --db "$GLOBAL" 2>/dev/null | grep -oE '"nodes": *[0-9]+' | head -1 | grep -oE '[0-9]+')
cleanup(){ rm -f "$DB" /tmp/p1_$$.json /tmp/p2_$$.json /tmp/p3_$$.json /tmp/he_$$.json /tmp/prog_$$.json; }
trap cleanup EXIT
title "recall watch programs — standing monitors that fire when the graph turns" \
"attach a non-LLM watch to a belief; it trips when that belief is overruled"
# ---------------------------------------------------------------------------
scene "1 · admit a claim and its supporting evidence"
narrate "We record a decision (the watched claim) plus a benchmark backing it."
CLAIM=$(python3 "$H" --kind decision --title "Ship v2 caching layer" \
--body "Adopt a write-through cache for the API path." \
--confidence 0.85 --topics "cache,api" > /tmp/p1_$$.json 2>/dev/null
recall admit --json /tmp/p1_$$.json --db "$DB" 2>/dev/null | jq -r '.node.id')
EV=$(python3 "$H" --kind observation --title "Bench confirms cache hit rate" \
--body "Load test confirms a 95% cache hit rate." \
--confidence 0.8 --topics "cache" > /tmp/p2_$$.json 2>/dev/null
recall admit --json /tmp/p2_$$.json --db "$DB" 2>/dev/null | jq -r '.node.id')
cmd "recall admit # claim (decision) + evidence (observation)"
{ printf 'claim %s Ship v2 caching layer\n' "${CLAIM:0:8}"
printf 'evidence %s Bench confirms cache hit rate\n' "${EV:0:8}"; } | out
# ---------------------------------------------------------------------------
scene "2 · bundle them into an evidence hyperedge"
narrate "A hyperedge groups the claim with its verification into one unit."
cat > /tmp/he_$$.json <<EOF
{"kind":"evidence-bundle","title":"Caching decision bundle","members":[{"nodeId":"$CLAIM","role":"claim"},{"nodeId":"$EV","role":"verification"}]}
EOF
cmd "recall hyperedge add --json he.json"
HE=$(recall hyperedge add --json /tmp/he_$$.json --db "$DB" 2>/dev/null | jq -r '.id')
recall hyperedge show "$HE" --db "$DB" 2>/dev/null \
| jq -r '"bundle "+(.id[0:8])+" ["+.kind+"] "+(.title)+"\nmembers: "+([.members[]|.role]|join(" + "))' | out
# ---------------------------------------------------------------------------
scene "3 · attach a standing WATCH program"
narrate "watch fires when the watched claim's effective confidence drops below minEff."
cat > /tmp/prog_$$.json <<EOF
{"schemaVersion":"recall.program.v1","operation":"watch","params":{"concernTarget":"$CLAIM","minEff":0.7}}
EOF
cmd "recall program add <hyperedge> --json prog.json # operation=watch, minEff=0.7"
PROG=$(recall program add "$HE" --json /tmp/prog_$$.json --db "$DB" 2>/dev/null | jq -r '.id')
recall program show "$PROG" --db "$DB" 2>/dev/null \
| jq -r '.spec as $s | "program "+(.id[0:8])+" op="+$s.operation+" minEff="+($s.params.minEff|tostring)+" target="+($s.params.concernTarget[0:8])' | out
note "no LLM in the loop — this is a deterministic monitor over the graph."
# ---------------------------------------------------------------------------
scene "4 · baseline run — belief healthy, watch quiet"
narrate "Right now the claim holds, so effective confidence is above the floor."
cmd "recall program run <program> --derive"
recall program run "$PROG" --derive --db "$DB" 2>/dev/null \
| jq -r '.run.output | "effective="+(.current|tostring)+" floor=0.7 tripped="+(.tripped|tostring)' | out
# ---------------------------------------------------------------------------
scene "5 · the graph turns: a contradiction overrules the claim"
narrate "A prod incident contradicts the decision, collapsing its effective confidence."
CON=$(python3 "$H" --kind observation --title "Cache corrupts stale reads in prod" \
--body "Incident: write-through cache served stale data; do not ship." \
--confidence 0.95 --topics "cache,incident" --contradicts "$CLAIM" > /tmp/p3_$$.json 2>/dev/null
recall admit --json /tmp/p3_$$.json --db "$DB" 2>/dev/null | jq -r '.node.id')
cmd "recall admit --contradicts <claim> # high-confidence counter-evidence"
printf 'contradiction %s Cache corrupts stale reads in prod\n' "${CON:0:8}" | out
recall maintenance --derive --db "$DB" >/dev/null 2>&1 || true
# ---------------------------------------------------------------------------
scene "6 · re-run — the watch TRIPS"
narrate "Same standing program. Effective confidence fell through the floor; it fires."
cmd "recall program run <program> --derive"
recall program run "$PROG" --derive --db "$DB" 2>/dev/null \
| jq -r '.run.output | "effective="+(.current|tostring)+" (was "+(.previous|tostring)+") change="+(.change|tostring)+"\nfloor=0.7 TRIPPED="+(.tripped|tostring|ascii_upcase)' | out
note "a monitor you attached once now warns you the moment the graph turns against the belief."
# ---------------------------------------------------------------------------
scene "isolated"
G1=$(recall status --db "$GLOBAL" 2>/dev/null | grep -oE '"nodes": *[0-9]+' | head -1 | grep -oE '[0-9]+')
note "global graph untouched: ${G0} -> ${G1} (all writes hit a throwaway db)"
endbox
printf '\n✦ DONE\n'
captured output (unedited real run)
recall watch programs — standing monitors that fire when the graph turns
attach a non-LLM watch to a belief; it trips when that belief is overruled
┌─ 1 · admit a claim and its supporting evidence ─────────────────────────────────────────
│ ▸ We record a decision (the watched claim) plus a benchmark backing it.
│ $ recall admit # claim (decision) + evidence (observation)
│ claim 93d0fc5c Ship v2 caching layer
│ evidence d5355c19 Bench confirms cache hit rate
┌─ 2 · bundle them into an evidence hyperedge ─────────────────────────────────────────
│ ▸ A hyperedge groups the claim with its verification into one unit.
│ $ recall hyperedge add --json he.json
│ bundle f9beb37a [evidence-bundle] Caching decision bundle
│ members: claim + verification
┌─ 3 · attach a standing WATCH program ─────────────────────────────────────────
│ ▸ watch fires when the watched claim's effective confidence drops below minEff.
│ $ recall program add <hyperedge> --json prog.json # operation=watch, minEff=0.7
│ program f727e064 op=watch minEff=0.7 target=93d0fc5c
│ no LLM in the loop — this is a deterministic monitor over the graph.
┌─ 4 · baseline run — belief healthy, watch quiet ─────────────────────────────────────────
│ ▸ Right now the claim holds, so effective confidence is above the floor.
│ $ recall program run <program> --derive
│ effective=0.7 floor=0.7 tripped=false
┌─ 5 · the graph turns: a contradiction overrules the claim ─────────────────────────────────────────
│ ▸ A prod incident contradicts the decision, collapsing its effective confidence.
│ $ recall admit --contradicts <claim> # high-confidence counter-evidence
│ contradiction bdf38669 Cache corrupts stale reads in prod
┌─ 6 · re-run — the watch TRIPS ─────────────────────────────────────────
│ ▸ Same standing program. Effective confidence fell through the floor; it fires.
│ $ recall program run <program> --derive
│ effective=0.275 (was 0.7) change=-0.425
│ floor=0.7 TRIPPED=TRUE
│ a monitor you attached once now warns you the moment the graph turns against the belief.
┌─ isolated ─────────────────────────────────────────
│ global graph untouched: 334 -> 334 (all writes hit a throwaway db)
└──────────────────────────────────────────────────────────────
✦ DONE