Release history — BaseVault

v0.2.0 (Latest)

May 29, 2026

# BaseVault v0.2.0

Built from b4fb753.

Compared to v0.1.49, this is a substantial release. The minor-version bump reflects two landmark changes: the ReAct chat loop is now fully wired, and local chat is functional alongside TEE (Private Cloud) as a first-class mode. The trust surface has also been materially hardened, and a long list of chat-quality improvements shipped.

Chat is meaningfully better

This release lands the multi-hop ReAct chatbot loop alongside several direct fixes. The chat can now:

Run in LOCAL mode — local chat is now functional. Embeddings + the sidecar's per-turn loop dispatch through LOCAL when you pick that mode.
Do multi-step retrieval (ReAct loop) — ask, look at what came back, refine the query, look again — instead of one shot per question.
Stream answers as they generate, instead of waiting for the full response.
Refuse cleanly when bound to an empty corpus, instead of silently grounding on an empty store.
Actually use the model you picked per stage (extract / entities / chat) — previous releases were silently using the same model across stages in some configs.
Cite cleanly — [N] bracket citations only when grounded retrieval backs them; integer-bracket refs are clickable in the answer and resolve to the actual source.
Not parrot itself — refused-turn assistant text is excluded from history so the bot doesn't loop on its own "I don't have a corpus" output across follow-up turns.
Not leak tool-call JSON into the chat bubble when the model emits prose+JSON+prose (mixed-shape wipe + onset re-detection).
Dedupe facts at retrieval time, so the same fact isn't returned multiple times.
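
The multi-step retrieval, empty-corpus refusal, and retrieval-time dedupe listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not BaseVault's implementation: `react_answer`, `retrieve`, `decide`, `Fact`, and the `max_hops` limit are all hypothetical names, and the model is stood in for by a plain callable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    fact_id: str
    text: str


REFUSAL = "I don't have a corpus to answer from."


def react_answer(question, corpus_size, retrieve, decide, max_hops=3):
    """Return (answer_or_refusal, facts_used). Hypothetical sketch.

    retrieve(query) -> list of Fact
    decide(question, facts) -> refined query string, or None when the
    (stand-in) model has seen enough to answer.
    """
    if corpus_size == 0:
        # Refuse cleanly instead of grounding on an empty store.
        return REFUSAL, []

    seen = {}
    query = question
    for _ in range(max_hops):
        for fact in retrieve(query):
            # Dedupe at retrieval time: keep the first copy of each fact.
            seen.setdefault(fact.fact_id, fact)
        # Look at what came back and either refine the query or stop.
        query = decide(question, list(seen.values()))
        if query is None:
            break

    facts = list(seen.values())
    # Emit [N] citations only for facts that retrieval actually returned.
    answer = " ".join(f"{f.text} [{i}]" for i, f in enumerate(facts, start=1))
    return answer, facts
```

The loop is bounded by `max_hops` so a model that keeps refining its query cannot run forever; the real hop limit, if any, is not stated in these notes.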

The vault dropdown also stays fresh after a pipeline run completes — newly-completed runs show up without needing manual refresh.

Trust + privacy posture hardened

Fireworks and Chutes are gone from the production app — no longer reachable via a runtime mode switch. The production code routes through Tinfoil (Private Cloud) or LOCAL only. Eval-side testing still has them under app/testing/ (never bundled into the .app).
Bundle gate at release time asserts the test/eval surface never ships in the .app.
Fail-closed chat send in Private Cloud mode when attestation is failing or in-flight (mirroring the existing run-gate). No messages leave the app while the trust contract is unproven.
Per-step attestation logging to app.log — when attestation hangs, you can now see which step is stuck.
Crash-on-unknown mode at run start — a typo in your config errors out cleanly instead of silently falling back to LOCAL.
Tinfoil HTTP wire-capture toggle in Settings → Development for trust-chain investigation (off by default).
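
As a rough illustration of the crash-on-unknown-mode and fail-closed send behaviour described above, a sketch might look like the following. The enum values, `parse_mode`, and `may_send` are hypothetical names chosen for this example; the notes do not say how the app represents modes or attestation state.

```python
from enum import Enum


class Mode(Enum):
    LOCAL = "local"
    PRIVATE_CLOUD = "private_cloud"  # assumed spelling, for illustration only


class Attestation(Enum):
    VERIFIED = "verified"
    IN_FLIGHT = "in_flight"
    FAILED = "failed"


def parse_mode(raw: str) -> Mode:
    """Raise on anything unrecognised; never fall back to LOCAL."""
    try:
        return Mode(raw.strip().lower())
    except ValueError:
        valid = [m.value for m in Mode]
        raise ValueError(f"unknown mode {raw!r}; expected one of {valid}") from None


def may_send(mode: Mode, attestation: Attestation) -> bool:
    """Fail closed: Private Cloud sends only after attestation is verified."""
    if mode is Mode.LOCAL:
        # Assumption: LOCAL traffic never leaves the machine, so no gate applies.
        return True
    return attestation is Attestation.VERIFIED
```

Treating "in flight" the same as "failed" is the point of the gate: the default answer is no until verification has positively succeeded.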

Pipeline correctness

Insight references in actions are now bundled at write time: Insight [N] in an action's "why" text now resolves to the insight's title (with a clickable link in the UI) instead of remaining a dangling positional reference.
Vision stage in LOCAL mode resolves the right model from your config instead of using a hardcoded fallback.
Graph edges are always stamped at embedding time; dangling edges are dropped.
Display vs embedding text in the vector store — facts/entities now carry a bare display layer alongside the enriched embedding layer (cleaner rendering in chat citations, no canonical-id slug leakage).
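
To make the write-time insight bundling concrete, here is a small sketch of resolving positional references to titles. It assumes insights are numbered from 1 and stored as dicts with a `title` key; both are assumptions for illustration, and `bundle_insight_refs` is a hypothetical name.

```python
import re

INSIGHT_REF = re.compile(r"Insight \[(\d+)\]")


def bundle_insight_refs(why: str, insights: list[dict]) -> str:
    """Replace positional 'Insight [N]' references with the insight's title."""

    def resolve(match: re.Match) -> str:
        n = int(match.group(1))
        if 1 <= n <= len(insights):
            return f'Insight "{insights[n - 1]["title"]}"'
        # Leave out-of-range references untouched rather than guessing.
        return match.group(0)

    return INSIGHT_REF.sub(resolve, why)
```

Resolving at write time means the stored text no longer depends on the order of the insight list, which is what makes a positional reference dangle when that list later changes.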

UI fixes

Click a fact citation in chat → the fact view scrolls to and highlights the right fact, including for in-flight (consolidating) entities where the click used to silently no-op.
WebKit paint-debt on fact-click navigation fixed (was making the click target appear blank momentarily on certain layouts).

Under the hood

New per-stage diagnostics rollup with sampled high-volume call detail.
Release-history page at basevault.ai/releases, sourced from GitHub Releases.
Eval framework reorganized + unified across the engines (query, phase, pipeline, chat); all eval outputs now consolidated under ~/.basevault/evals/<modality>/<run_id>/.
Pre-release smoke checklist + workhorse pipeline test rig — workers now have a standard pre-PR behavioral test.
Chat conversation exports from Claude.ai, Claude Code, and Codex can now be ingested as a journal-equivalent corpus.

Upgrading

Auto-update should pick this up within a few minutes of opening the app. You can also download the release manually from basevault.ai.

v0.1.49

May 26, 2026

Maintenance release — re-signed under BaseVault, Inc.'s Apple Developer ID. No functional changes from 0.1.48. The auto-updater will offer it as the latest signed build.

v0.1.48

May 25, 2026

Fixes

Journal dates — Day One entries now use each entry's own timezone (timeZoneName) instead of UTC, fixing dates that were off by a day for entries written near midnight.
Entities reliability — a schema-shaped empty extraction ({"entities": []}) is now retried instead of being recorded as a false "empty success," so real results aren't silently dropped.
Sources panel — reference labels use the full panel width and the · separators no longer wrap to the start of a line.
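
The journal-date fix can be illustrated with the standard-library `zoneinfo` module. The sketch below assumes each entry is a dict with a UTC ISO 8601 `creationDate` and the `timeZoneName` field mentioned above; the `creationDate` key and the `entry_local_date` helper are assumptions for this example.

```python
from datetime import date, datetime, timezone
from zoneinfo import ZoneInfo


def entry_local_date(entry: dict) -> date:
    """Return the calendar date of an entry in the entry's own timezone."""
    created = datetime.fromisoformat(entry["creationDate"].replace("Z", "+00:00"))
    if created.tzinfo is None:
        # Assumption: timestamps without an offset are UTC.
        created = created.replace(tzinfo=timezone.utc)
    tz = ZoneInfo(entry.get("timeZoneName", "UTC"))
    return created.astimezone(tz).date()
```

An entry written at 23:30 in Los Angeles is stored as 06:30 UTC on the following day, so reading the date straight from the UTC timestamp files it one day late.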

v0.1.47

May 25, 2026

Facts & entities

Facts now sort newest-first, each prefixed with a formatted date.
Duplicate facts consolidated — a fact appearing under multiple categories now shows as a single entry in the entity and facts views.

Run details

Local-mode fix — runs now show the actual local model in use (previously surfaced cloud models).
Embeddings observability — embeddings calls show a prompt-token estimate, grouped under a collapsible parent row.

Reliability

Empty-extraction cache fix — a retried empty extraction is re-queried instead of being served a stale "empty success" from cache, so a real failure is no longer masked.
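
One way to picture the empty-extraction handling is a cache that only ever stores non-empty results, so a retry always reaches the model. This is a sketch under assumptions: `extract_with_retry`, the dict-based cache, and the `max_attempts` default are hypothetical, and what BaseVault does with a persistently empty result is not stated here.

```python
def extract_with_retry(doc_id, call_model, cache, max_attempts=2):
    """Return an extraction result, retrying schema-shaped empty responses.

    call_model(doc_id) -> dict such as {"entities": [...]}
    cache is a plain dict keyed by doc_id.
    """
    if doc_id in cache:
        return cache[doc_id]

    result = {"entities": []}
    for _ in range(max_attempts):
        result = call_model(doc_id)
        if result.get("entities"):
            # Cache only non-empty results, so an "empty success" can never
            # be served back on a later retry.
            cache[doc_id] = result
            return result

    # Still empty after retries: hand it back uncached so the next run
    # queries the model again instead of reading a stale empty entry.
    return result
```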

Under the hood

complete() cross-cutting plumbing consolidated into one chokepoint; internal eval tooling + test-fixture repairs.

v0.1.46

May 23, 2026

What's new since v0.1.45

Local mode

Downloaded MLX model now counts as setup-done — the Local mode picker no longer stays disabled after you've downloaded a local model.
Mixed-mode presets removed — simpler, clearer per-step model selection.

Pipeline

Multi-model scheduling — generic per-stage dispatch with an optional parallel multi-model option (e.g. kimi+glm), spreading high-fan-out stages across models.
Insight numbering is consistent across the UI, the underlying records, and chat citations.
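
The parallel multi-model option could be sketched as a round-robin assignment of work items to models, run on a thread pool. The scheduling policy shown is an assumption (the notes only say that high-fan-out stages are spread across models), and `dispatch_stage` and `call` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle


def dispatch_stage(items, models, call):
    """Spread items across models round-robin and run the calls in parallel.

    call(model, item) -> result. Results come back in input order.
    """
    if not models:
        # Fail loudly rather than silently picking some default model.
        raise ValueError("stage has no models configured")

    assignments = list(zip(items, cycle(models)))
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        return list(pool.map(lambda pair: call(pair[1], pair[0]), assignments))
```

For example, `dispatch_stage(chunks, ["kimi", "glm"], call)` sends even-indexed chunks to the first model and odd-indexed chunks to the second.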

Eval tooling (internal)

Agent-drivable perf tool: run → judge → report, --models, parallel judging, boxed reports.
Chatbot eval counts a key match in the retrieved grounding block as a pass, not just the answer text.

Under the hood

Fully-green, deterministic test baseline — hermetic test config, stale-test cleanup, and fail-loud unification of per-stage dispatch.

v0.1.45

May 23, 2026

What's new since v0.1.44

Local mode

MLX crash on older macOS fixed — the bundled MLX runtime is now pinned to the minimum-supported macOS floor, so the app no longer crashes at launch on macOS < 15.0.
Local picker only offered when usable — the Easy Wizard and Settings no longer present Local mode unless a local model + MLX are actually available, and Settings/Wizard now share the same readiness check.

Eval tooling (internal)

Agent-drivable eval runner — scheduler-paced fixture groups, pluggable providers, per-cell outputs, and per-group tables.
4-score judge — grounding / quality / schema / combined scoring, with per-fixture custom judge instructions.

v0.1.44

May 23, 2026

Reliability

Large vaults no longer bog down: the entities stage no longer over-fragments into thousands of tiny model calls (~100× fewer on big corpora).
Pipeline runs are sturdier under load + slow/failing calls — centralized retry, two-tier per-call timeouts (no indefinite hangs), reasoning auto-off after a load retry, and a reliable backup model tried before any data is dropped.
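
A simplified sketch of the retry path described above: retry the primary model, switch reasoning off after the first failure, and try a backup model before giving up. The two-tier timeouts are not modelled here, and `call_with_fallback`, its parameters, and the retry count are hypothetical.

```python
def call_with_fallback(item, call, primary, backup, retries=2):
    """call(model, item, reasoning) -> result; raises on failure or timeout."""
    reasoning = True
    errors = []

    for _ in range(retries):
        try:
            return call(primary, item, reasoning)
        except Exception as exc:  # includes TimeoutError
            errors.append(exc)
            # Lighten the request before retrying under load.
            reasoning = False

    try:
        # Last resort before any data is dropped: a backup model.
        return call(backup, item, False)
    except Exception as exc:
        errors.append(exc)
        raise RuntimeError(f"all attempts failed for {item!r}: {errors}") from exc
```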

Quality

Better default synthesis routing (dedupe → gemma, patterns → kimi) for quality + speed.
Extraction now captures emotional/affect content, not just dry facts; empty extractions retry instead of silently passing.

Models & trust

Removed the unusable DeepSeek model; added GLM-5.1 as a selectable model.
When a model's secure enclave is unavailable, the app names exactly which one is down (e.g. embeddings) instead of a generic failure.
The image-transcription reasoning toggle now actually takes effect.

Onboarding & diagnostics

Streamlined Easy Wizard first-run for new users (name-only, no key setup).
Run diagnostics are now saved whenever a run ends, whether it completed, was paused or cancelled, or crashed.

Under the hood: eval tooling + test-suite hardening.

v0.1.43

May 22, 2026

Pipeline quality

New default model routing — extract + entities on gpt-oss-120b, dedupe on gemma, patterns on kimi, all with reasoning on — for better extraction coverage and synthesis quality. A restored Reset to defaults button in Settings snaps the per-stage config back to these defaults after experimenting.
Heavily-mentioned people no longer get walls of repeated text as their description; dedupe compresses them to one clean summary.
Insights/actions that come back empty now retry once instead of silently producing nothing.
More reliable handling of oversized requests (cap-hit fallback routes cleanly to a larger-context model).

Chat

Conversations show readable names ("Conversation 3 · May 21") in the picker while staying stable on disk.
New Open Chats button in Settings opens your chat folder in Finder.

Run controls & progress

Skipping a pending request stops it immediately instead of hanging for tens of seconds, with no skip-state flicker.
The progress bar no longer freezes at "N/N" after pause/resume; the elapsed timer ticks smoothly.

Diagnostics

Attestation failures now show the full trace instead of a one-line message.

Onboarding

A bundled key + name-only Easy Wizard lets new users start without manual key entry.

v0.1.42

May 19, 2026

What's new since v0.1.41

Pipeline

Faster first results — extraction emits a small first batch early instead of waiting for the full corpus split, so the first facts and entities surface sooner on big vaults.
Deterministic dedupe — same vault, same model, same outputs: dedupe now uses a total-order alias key with a pinned PYTHONHASHSEED, removing run-to-run shuffle in entity merges.
WhatsApp / per-doc token brake — the shared token brake is enforced as a true ceiling on WhatsApp-style per-doc splits, so chat-log vaults no longer overshoot it.
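
To show why a total-order key matters for deterministic dedupe, here is a minimal sketch. Iterating a Python `set` of strings can vary between processes when string hashing is randomized, so any choice that depends on iteration order can shuffle from run to run; sorting with a key that never ties removes that dependence. `alias_key` and `canonical_alias` are hypothetical names, and the normalization shown is an assumption.

```python
import unicodedata


def alias_key(alias: str) -> tuple:
    """Total-order key: normalized form first, raw string as the tie-breaker."""
    normalized = unicodedata.normalize("NFKC", alias).casefold().strip()
    # Two distinct strings never compare equal under this key, so min() and
    # sorted() give the same answer regardless of input order.
    return (normalized, alias)


def canonical_alias(aliases) -> str:
    """Pick the same representative alias on every run."""
    return min(aliases, key=alias_key)


def merge_order(aliases) -> list:
    """Return aliases in a stable order suitable for deterministic merging."""
    return sorted(aliases, key=alias_key)
```

Pinning PYTHONHASHSEED, as the note describes, fixes hash-dependent iteration order as well; the explicit key covers code paths where order would otherwise still matter.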

Chatbot

Chat bar redesign — shared two-line dropdown for model/mode, clickable [N] markers in message bodies that jump to the cited reference, and references display with dates.
LOOKUP protocol scoped to the decision turn — fresh lookups happen only when the chatbot is deciding what to fetch, not on every follow-up; the no-reuse rule was hardened so the same chunk isn't pulled twice.
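
The clickable [N] markers can be pictured as a small text transform that links only markers backed by a reference. The HTML anchor format and the `linkify_citations` helper are assumptions for illustration; the notes do not describe how the UI renders links.

```python
import html
import re

CITATION = re.compile(r"\[(\d+)\]")


def linkify_citations(text: str, n_refs: int) -> str:
    """Turn [N] into a link when reference N exists; leave other markers alone."""

    def repl(match: re.Match) -> str:
        n = int(match.group(1))
        if 1 <= n <= n_refs:
            return f'<a href="#ref-{n}">[{n}]</a>'
        return match.group(0)

    # Escape first so model output cannot inject markup of its own.
    return CITATION.sub(repl, html.escape(text))
```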

UI

Product icon in the chatbot and header — the BaseVault mark (black rounded square + emerald dot) now sits in the chatbot bar and the landing-page header.
Entities grouped by type — in the run-details tree, entities are bucketed by type (person, place, org, …) so the list is scannable on dense runs.
Progress bar — embeddings as one unit — the embeddings stage is a single collective progress unit rather than per-call rows; aborted calls are excluded from the in-flight count.
Progress bar — live running time — the "running" timer ticks live and matches the "elapsed" formatter (h/m/s) instead of freezing at snapshot time.
Live wait time on in-flight calls — the run-details view shows each pending call's wait time live, while the call is still waiting on the model.

Internals

Launch trace coalescing — per-line trace emission is gated, runs polling is merged into the coalescer, and per-tick trace markers are demoted; launch traces are quieter and cheaper on large vaults.

v0.1.41

May 17, 2026

What's new since v0.1.40

Privacy

Content-free diagnostics — shareable diagnostic exports are routed through a single guarded emitter that is, by construction, incapable of including file contents or prompt text. Earlier exports already avoided content; this release proves it structurally rather than relying on per-call discipline.
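
A guarded emitter of this kind could be sketched as an allowlist that admits only known fields, with string values restricted to short identifier-like tokens so free text cannot pass through. The field names, the token pattern, and `emit_diagnostic` are hypothetical; the actual emitter's rules are not described in these notes.

```python
import re

# Assumed field set, for illustration only.
ALLOWED_FIELDS = {
    "stage": str,
    "model": str,
    "status": str,
    "duration_ms": int,
    "prompt_tokens": int,
}

# Strings must look like identifiers, not prose: no spaces, at most 64 chars.
TOKEN = re.compile(r"[A-Za-z0-9_.:\-]{1,64}")


def emit_diagnostic(event: dict) -> dict:
    """Return a sanitized copy of event, or raise if it could carry content."""
    out = {}
    for key, value in event.items():
        expected = ALLOWED_FIELDS.get(key)
        if expected is None:
            raise KeyError(f"field {key!r} is not allowed in diagnostics")
        if type(value) is not expected:
            raise TypeError(f"field {key!r} must be {expected.__name__}")
        if expected is str and not TOKEN.fullmatch(value):
            raise ValueError(f"field {key!r} does not look like an identifier")
        out[key] = value
    return out
```

Because every export goes through one function that rejects unknown fields, adding a new call site cannot widen what leaves the machine without changing the allowlist itself.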

v0.1.40

May 17, 2026

What's new since v0.1.39

Chatbot

Conversation picker — chats now live in per-conversation directories with rename support and last-activity ordering, so you can keep multiple threads side by side.
Less-resistant persona — questions "about the user" trigger a fresh lookup instead of being deflected; LOOKUP is the default when the answer isn't already in-context.
Reasoning toggle now wired through — the chatbot's reasoning switch actually controls inference (it didn't before); the dead rerank toggle was removed.
Per-message copy button — each message has its own copy action, and each resources block is labeled with the source it came from.
Citations pinned to the run — a chat message's citations always reference the run that produced them, even after later runs change the underlying data.
Citation parity across hops — clicking a citation highlights the entity (with fade) on any hop, and highlights the whole chunk in the source view; chunker tuned to 512/64.
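
Reading "512/64" as a chunk size of 512 with an overlap of 64 (an assumption; the note does not give units or say which number is which), a sliding-window chunker might look like this sketch. `chunk_tokens` is a hypothetical name.

```python
def chunk_tokens(tokens, size=512, overlap=64):
    """Split tokens into windows of `size`, each sharing `overlap` with the last."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            # This window already reaches the end; stop before emitting a
            # trailing chunk that would only repeat the overlap.
            break
    return chunks
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of indexing some text twice.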

RAG

Fail-closed retriever — chunkless and resumed runs no longer silently retrieve zero chunks; the retriever now fails loudly instead of returning empty.
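
The difference between failing closed and returning empty can be shown with a toy retriever. The word-overlap scoring is only a placeholder, and `retrieve_chunks` and `NoChunksError` are hypothetical names.

```python
class NoChunksError(RuntimeError):
    """Raised when a run has nothing indexed to retrieve from."""


def retrieve_chunks(query, chunks, top_k=3):
    """Return the top_k chunks sharing the most words with the query."""
    if not chunks:
        # A run with no chunks is a pipeline problem, not "no results".
        raise NoChunksError("run has no indexed chunks; refusing to return empty")

    words = set(query.lower().split())
    scored = [(len(words & set(c.lower().split())), c) for c in chunks]
    # Stable sort: ties keep their original order.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [c for score, c in ranked[:top_k] if score > 0]
```

A legitimate "nothing matched" still returns an empty list; only the structurally broken case, an index with no chunks at all, raises.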

Reliability

Startup "Attestation Failed" fixed — an intermittent sigstore TUF symlink race on first launch is serialized away.

Trust chain

Attestation call sites consolidated — verification now happens at exactly three sanctioned points in the inference path; scattered ad-hoc checks were removed.

Identifiers

Stable 4-letter IDs — run IDs are now stable across restarts, and the same scheme was extended to chats so threads have durable IDs too.
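
The notes do not say how the IDs are generated. As one possible way to get IDs that survive restarts, the sketch below derives four letters from a hash of some persistent seed, such as a run's directory name; `stable_id` and the choice of seed are assumptions.

```python
import hashlib
import string


def stable_id(seed: str, length: int = 4) -> str:
    """Derive a short lowercase ID from a persistent seed string."""
    digest = hashlib.sha256(seed.encode("utf-8")).digest()
    # Map each of the first `length` bytes onto a letter a-z.
    return "".join(string.ascii_lowercase[b % 26] for b in digest[:length])
```

Four letters give only 26^4 = 456,976 possible values, so a scheme like this would still need a collision check when a new run or chat is created.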