BaseVault

Release history

What shipped in each BaseVault release, ordered newest first.

v0.2.0 (Latest)

# BaseVault v0.2.0

Built from b4fb753.

Compared to v0.1.49, this is a substantial release. The minor-version bump reflects two landmark changes: the ReAct chat loop is now fully wired, and local chat is functional alongside TEE (Private Cloud) as a real first-class mode. The trust surface is also materially hardened, and a long list of chat-quality improvements shipped.

Chat is meaningfully better

This release lands the multi-hop ReAct chatbot loop alongside several direct fixes. The chat can now:

  • Run in LOCAL mode — local chat is now functional. Embeddings + the sidecar's per-turn loop dispatch through LOCAL when you pick that mode.
  • Do multi-step retrieval (ReAct loop) — ask, look at what came back, refine the query, look again — instead of one shot per question.
  • Stream answers as they generate, instead of waiting for the full response.
  • Refuse cleanly when bound to an empty corpus, instead of silently grounding on an empty store.
  • Actually use the model you picked per stage (extract / entities / chat) — previous releases were silently using the same model across stages in some configs.
  • Cite cleanly: [N] bracket citations appear only when grounded retrieval backs them; integer-bracket refs are clickable in the answer and resolve to the actual source.
  • Not parrot itself — refused-turn assistant text is excluded from history so the bot doesn't loop on its own "I don't have a corpus" output across follow-up turns.
  • Not leak tool-call JSON into the chat bubble when the model emits prose+JSON+prose (mixed-shape wipe + onset re-detection).
  • Dedupe facts at retrieval time, so the same fact isn't returned multiple times.

The vault dropdown also stays fresh after a pipeline run completes — newly-completed runs show up without needing manual refresh.
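The multi-step retrieval and retrieval-time dedupe items in the chat list above can be illustrated with a minimal sketch. This is not BaseVault's implementation: `react_retrieve`, `search`, and `refine` are hypothetical names, and the hop limit of 3 is an assumption chosen for the example.

```python
from typing import Callable, List, Optional


def react_retrieve(
    question: str,
    search: Callable[[str], List[str]],
    refine: Callable[[str, List[str]], Optional[str]],
    max_hops: int = 3,  # assumed limit, not taken from the release notes
) -> List[str]:
    """Search, look at what came back, refine the query, and search again."""
    seen = set()
    facts: List[str] = []
    query: Optional[str] = question
    for _ in range(max_hops):
        if query is None:  # the refine step decided there is enough evidence
            break
        for fact in search(query):
            if fact not in seen:  # dedupe: the same fact is returned only once
                seen.add(fact)
                facts.append(fact)
        query = refine(question, facts)
    return facts
```

A caller would pass its own retrieval function as `search` and a model-backed step as `refine`; returning `None` from `refine` ends the loop early.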

Trust + privacy posture hardened

  • Fireworks and Chutes are gone from the production app — no longer reachable via a runtime mode switch. The production code routes through Tinfoil (Private Cloud) or LOCAL only. Eval-side testing still has them under app/testing/ (never bundled into the .app).
  • Bundle gate at release time asserts the test/eval surface never ships in the .app.
  • Fail-closed chat send in Private Cloud mode when attestation is failing or in-flight (mirroring the existing run-gate). No messages leave the app while the trust contract is unproven.
  • Per-step attestation logging to app.log — when attestation hangs, you can now see which step is stuck.
  • Crash-on-unknown mode at run start — a typo in your config errors out cleanly instead of silently falling back to LOCAL.
  • Tinfoil HTTP wire-capture toggle in Settings → Development for trust-chain investigation (off by default).
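Two of the items above, crash-on-unknown mode and fail-closed chat send, follow the same pattern: refuse rather than fall back. The sketch below shows that pattern under stated assumptions. The config strings (`"local"`, `"private_cloud"`), the `AttestationState` values, and both function names are hypothetical and not taken from the app.

```python
from enum import Enum


class Mode(Enum):
    LOCAL = "local"                  # assumed config spelling
    PRIVATE_CLOUD = "private_cloud"  # assumed config spelling


class AttestationState(Enum):
    VERIFIED = "verified"
    IN_FLIGHT = "in_flight"
    FAILED = "failed"


def parse_mode(raw: str) -> Mode:
    """Raise on an unrecognized mode instead of silently defaulting to LOCAL."""
    try:
        return Mode(raw.strip().lower())
    except ValueError:
        valid = [m.value for m in Mode]
        raise ValueError(f"unknown mode {raw!r}; expected one of {valid}") from None


def may_send(mode: Mode, attestation: AttestationState) -> bool:
    """Fail closed: Private Cloud sends require a verified attestation."""
    if mode is Mode.LOCAL:
        return True
    return attestation is AttestationState.VERIFIED
```

With this shape, a typo such as `"locl"` stops the run at startup, and a message in Private Cloud mode is held while attestation is in flight or failing.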

Pipeline correctness

  • Insight references in actions are now bundled at write time: Insight [N] in action why text now resolves to the insight's title (with a clickable link in the UI), instead of the dangling positional reference.
  • Vision stage in LOCAL mode resolves the right model from your config instead of using a hardcoded fallback.
  • Graph edges are always stamped at embedding time; dangling edges dropped.
  • Display vs embedding text in the vector store — facts/entities now carry a bare display layer alongside the enriched embedding layer (cleaner rendering in chat citations, no canonical-id slug leakage).
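As a rough illustration of the dangling-edge item above, the sketch below keeps only edges whose two endpoints exist. The data shapes (string node IDs, edges as pairs) and the function name are assumptions made for the example; the release notes do not describe the actual graph representation.

```python
from typing import Iterable, List, Tuple

Edge = Tuple[str, str]  # (source_id, target_id), an assumed shape


def drop_dangling_edges(node_ids: Iterable[str], edges: Iterable[Edge]) -> List[Edge]:
    """Keep an edge only when both of its endpoints are known nodes."""
    known = set(node_ids)
    return [(src, dst) for src, dst in edges if src in known and dst in known]
```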

UI fixes

  • Click a fact citation in chat → the fact view scrolls to and highlights the right fact, including for in-flight (consolidating) entities where the click used to silently no-op.
  • WebKit paint-debt on fact-click navigation fixed (was making the click target appear blank momentarily on certain layouts).

Under the hood

  • New per-stage diagnostics rollup with sampled high-volume call detail.
  • Release-history page at basevault.ai/releases, sourced from GitHub Releases.
  • Eval framework reorganized + unified across the engines (query, phase, pipeline, chat); all eval outputs now consolidated under ~/.basevault/evals/<modality>/<run_id>/.
  • Pre-release smoke checklist + workhorse pipeline test rig — workers now have a standard pre-PR behavioral test.
  • Chat conversation exports from Claude.ai, Claude Code, and Codex can now be ingested as journal-equivalent corpus.

Upgrading

Auto-update should pick this up within a few minutes of opening the app. You can also download it manually from basevault.ai.

v0.1.49

Maintenance release — re-signed under BaseVault, Inc.'s Apple Developer ID. No functional changes from 0.1.48. The auto-updater will offer it as the latest signed build.

v0.1.48

Fixes

  • Journal dates — Day One entries now use each entry's own timezone (timeZoneName) instead of UTC, fixing dates that were off by a day for entries written near midnight.
  • Entities reliability — a schema-shaped empty extraction ({"entities": []}) is now retried instead of being recorded as a false "empty success," so real results aren't silently dropped.
  • Sources panel — reference labels use the full panel width and the · separators no longer wrap to the start of a line.
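The journal-dates fix above is easiest to see with a concrete timestamp. The sketch below assumes an entry carries a UTC ISO 8601 timestamp plus an IANA zone name; the function and parameter names are hypothetical, and the real Day One parsing code may differ.

```python
from datetime import date, datetime
from zoneinfo import ZoneInfo


def entry_local_date(creation_utc: str, time_zone_name: str) -> date:
    """Return the calendar date of an entry in the entry's own timezone."""
    # Accept a trailing "Z" as UTC; fromisoformat handles explicit offsets.
    utc_moment = datetime.fromisoformat(creation_utc.replace("Z", "+00:00"))
    return utc_moment.astimezone(ZoneInfo(time_zone_name)).date()
```

For an entry written at 23:30 in Los Angeles on May 20, the UTC timestamp is `2024-05-21T06:30:00Z`. Reading the date in UTC gives May 21, while `entry_local_date("2024-05-21T06:30:00Z", "America/Los_Angeles")` gives May 20.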

v0.1.47

Facts & entities

  • Facts now sort newest-first, each prefixed with a formatted date.
  • Duplicate facts consolidated — a fact appearing under multiple categories now shows as a single entry in the entity and facts views.

Run details

  • Local-mode fix — runs now show the actual local model in use (previously surfaced cloud models).
  • Embeddings observability — embeddings calls show a prompt-token estimate, grouped under a collapsible parent row.

Reliability

  • Empty-extraction cache fix — a retried empty extraction is re-queried instead of being served a stale "empty success" from cache, so a real failure is no longer masked.
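The empty-extraction behavior described here and in v0.1.48 can be sketched as a cache that never stores an empty result. Everything named below (`extract_with_cache`, `call_model`, the two-attempt limit) is hypothetical; only the `{"entities": []}` shape comes from the notes.

```python
from typing import Callable, Dict


def is_empty_success(result: dict) -> bool:
    """A schema-shaped but empty extraction, e.g. {"entities": []}."""
    return not result.get("entities")


def extract_with_cache(
    key: str,
    call_model: Callable[[str], dict],
    cache: Dict[str, dict],
    max_attempts: int = 2,  # assumed retry budget
) -> dict:
    """Retry empty extractions and cache only non-empty results."""
    if key in cache:
        return cache[key]
    result: dict = {"entities": []}
    for _ in range(max_attempts):
        result = call_model(key)
        if not is_empty_success(result):
            cache[key] = result  # safe to reuse on later runs
            return result
    return result  # still empty: returned to the caller but never cached
```

Because an empty result is never written to the cache, a later run queries the model again instead of replaying a stale "empty success".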

Under the hood

  • complete() cross-cutting plumbing consolidated into one chokepoint; internal eval tooling + test-fixture repairs.

v0.1.46

What's new since v0.1.45

Local mode

  • Downloaded MLX model now counts as setup-done — the Local mode picker no longer stays disabled after you've downloaded a local model.
  • Mixed-mode presets removed — simpler, clearer per-step model selection.

Pipeline

  • Multi-model scheduling — generic per-stage dispatch with an optional parallel multi-model option (e.g. kimi+glm), spreading high-fan-out stages across models.
  • Insight numbering is consistent across the UI, the underlying records, and chat citations.
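One simple way to spread a high-fan-out stage across several models is round-robin assignment, sketched below. The notes do not say which strategy the scheduler actually uses, so treat round-robin and the `assign_models` name as assumptions.

```python
from itertools import cycle
from typing import List, Sequence, Tuple


def assign_models(calls: Sequence[str], models: Sequence[str]) -> List[Tuple[str, str]]:
    """Pair each pending call with a model, cycling through the model list."""
    if not models:
        raise ValueError("at least one model is required")
    return list(zip(calls, cycle(models)))
```

For example, `assign_models(["c1", "c2", "c3"], ["kimi", "glm"])` pairs `c1` and `c3` with `kimi` and `c2` with `glm`.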

Eval tooling (internal)

  • Agent-drivable perf tool: run → judge → report, --models, parallel judging, boxed reports.
  • Chatbot eval counts a key match in the retrieved grounding block as a pass, not just the answer text.

Under the hood

  • Fully-green, deterministic test baseline — hermetic test config, stale-test cleanup, and fail-loud unification of per-stage dispatch.

v0.1.45

What's new since v0.1.44

Local mode

  • MLX crash on older macOS fixed — the bundled MLX runtime is now pinned to the minimum-supported macOS floor, so the app no longer crashes at launch on macOS < 15.0.
  • Local picker only offered when usable — the Easy Wizard and Settings no longer present Local mode unless a local model + MLX are actually available, and Settings/Wizard now share the same readiness check.

Eval tooling (internal)

  • Agent-drivable eval runner — scheduler-paced fixture groups, pluggable providers, per-cell outputs, and per-group tables.
  • 4-score judge — grounding / quality / schema / combined scoring, with per-fixture custom judge instructions.

v0.1.44

Reliability

  • Large vaults no longer bog down: the entities stage no longer over-fragments into thousands of tiny model calls (~100× fewer on big corpora).
  • Pipeline runs are sturdier under load + slow/failing calls — centralized retry, two-tier per-call timeouts (no indefinite hangs), reasoning auto-off after a load retry, and a reliable backup model tried before any data is dropped.
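The retry-then-backup behavior above might look roughly like the following. This is a sketch under assumptions: the function name, the retry count, and the exception types are invented for illustration, and the two-tier timeouts are left to the injected `call` function rather than modeled here.

```python
from typing import Callable


def complete_with_fallback(
    prompt: str,
    call: Callable[[str, str], str],  # call(model, prompt) -> text
    primary: str,
    backup: str,
    retries: int = 2,  # assumed attempts per model
) -> str:
    """Retry the primary model, then try the backup before giving up."""
    last_error = None
    for model in (primary, backup):
        for _ in range(retries):
            try:
                return call(model, prompt)
            except (TimeoutError, RuntimeError) as exc:
                last_error = exc  # remember the failure and keep going
    raise RuntimeError("primary and backup models both failed") from last_error
```

Keeping the loop in one function mirrors the idea of centralized retry: callers do not implement their own retry logic, and data is dropped only after the backup model has also failed.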

Quality

  • Better default synthesis routing (dedupe → gemma, patterns → kimi) for quality + speed.
  • Extraction now captures emotional/affect content, not just dry facts; empty extractions retry instead of silently passing.

Models & trust

  • Removed the unusable DeepSeek model; added GLM-5.1 as a selectable model.
  • When a model's secure enclave is unavailable, the app names exactly which one is down (e.g. embeddings) instead of a generic failure.
  • The image-transcription reasoning toggle now actually takes effect.

Onboarding & diagnostics

  • Streamlined Easy Wizard first-run for new users (name-only, no key setup).
  • Run diagnostics are now saved whenever a run ends: completed, paused, cancelled, or crashed.

Under the hood: eval tooling + test-suite hardening.

v0.1.43

Pipeline quality

  • New default model routing — extract + entities on gpt-oss-120b, dedupe on gemma, patterns on kimi, all with reasoning on — for better extraction coverage and synthesis quality. A restored Reset to defaults button in Settings snaps the per-stage config back to these defaults after experimenting.
  • Heavily-mentioned people no longer get walls of repeated text as their description; dedupe compresses them to one clean summary.
  • Insights/actions that come back empty now retry once instead of silently producing nothing.
  • More reliable handling of oversized requests (cap-hit fallback routes cleanly to a larger-context model).

Chat

  • Conversations show readable names ("Conversation 3 · May 21") in the picker while staying stable on disk.
  • New Open Chats button in Settings opens your chat folder in Finder.

Run controls & progress

  • Skipping a pending request stops it immediately instead of hanging for tens of seconds, with no skip-state flicker.
  • The progress bar no longer freezes at "N/N" after pause/resume; the elapsed timer ticks smoothly.

Diagnostics

  • Attestation failures now show the full trace instead of a one-line message.

Onboarding

  • A bundled key + name-only Easy Wizard lets new users start without manual key entry.

v0.1.42

What's new since v0.1.41

Pipeline

  • Faster first results — extraction emits a small first batch early instead of waiting for the full corpus split, so the first facts and entities surface sooner on big vaults.
  • Deterministic dedupe — same vault, same model, same outputs: dedupe now uses a total-order alias key with a pinned PYTHONHASHSEED, removing run-to-run shuffle in entity merges.
  • WhatsApp / per-doc token brake — the shared token brake is enforced as a true ceiling on WhatsApp-style per-doc splits, so chat-log vaults no longer overshoot it.
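The deterministic-dedupe item depends on ordering aliases with a key that never ties. In Python, iteration order over a set of strings can change between processes unless `PYTHONHASHSEED` is fixed, so sorting with a total-order key removes that source of shuffle. The key below is an illustrative guess, not the one BaseVault uses.

```python
import unicodedata
from typing import Iterable, List, Tuple


def alias_key(alias: str) -> Tuple[str, str]:
    """Total order: compare the case-folded form first, then the raw string."""
    folded = unicodedata.normalize("NFKC", alias).casefold().strip()
    return (folded, alias)  # the raw string breaks ties between "Ana" and "ana"


def merge_order(aliases: Iterable[str]) -> List[str]:
    """Return distinct aliases in an order independent of input or hash order."""
    return sorted(set(aliases), key=alias_key)
```

Because two distinct strings always produce distinct keys, `merge_order` returns the same list whatever order the aliases arrive in.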

Chatbot

  • Chat bar redesign — shared two-line dropdown for model/mode, clickable [N] markers in message bodies that jump to the cited reference, and references display with dates.
  • LOOKUP protocol scoped to the decision turn — fresh lookups happen only when the chatbot is deciding what to fetch, not on every follow-up; the no-reuse rule was hardened so the same chunk isn't pulled twice.

UI

  • Product icon in the chatbot and header — the BaseVault mark (black rounded square + emerald dot) now sits in the chatbot bar and the landing-page header.
  • Entities grouped by type — in the run-details tree, entities are bucketed by type (person, place, org, …) so the list is scannable on dense runs.
  • Progress bar — embeddings as one unit — the embeddings stage is a single collective progress unit rather than per-call rows; aborted calls are excluded from the in-flight count.
  • Progress bar — live running time — the "running" timer ticks live and matches the "elapsed" formatter (h/m/s) instead of freezing at snapshot time.
  • Live wait time on in-flight calls — the run-details view shows each pending call's wait time live, while the call is still waiting on the model.

Internals

  • Launch trace coalescing — per-line trace emission is gated, runs polling is merged into the coalescer, and per-tick trace markers are demoted; launch traces are quieter and cheaper on large vaults.

v0.1.41

What's new since v0.1.40

Privacy

  • Content-free diagnostics — shareable diagnostic exports are routed through a single guarded emitter that is, by construction, incapable of including file contents or prompt text. Earlier exports already avoided content; this release proves it structurally rather than relying on per-call discipline.
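One way to make an emitter structurally unable to carry content is to accept only enum and integer fields, so there is no free-text slot for file contents or prompts to land in. The sketch below illustrates that idea only; the field names, enum members, and `emit` function are hypothetical and do not describe the shipped emitter.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    EXTRACT = "extract"
    ENTITIES = "entities"
    CHAT = "chat"


class Status(Enum):
    OK = "ok"
    FAILED = "failed"


@dataclass(frozen=True)
class DiagnosticEvent:
    stage: Stage
    status: Status
    duration_ms: int
    token_count: int


def emit(event: DiagnosticEvent) -> dict:
    """Serialize an event, rejecting anything that is not an enum or an int."""
    if not isinstance(event, DiagnosticEvent):
        raise TypeError("only DiagnosticEvent can be emitted")
    if not isinstance(event.stage, Stage) or not isinstance(event.status, Status):
        raise TypeError("stage and status must be enum members, not free text")
    for number in (event.duration_ms, event.token_count):
        if isinstance(number, bool) or not isinstance(number, int):
            raise TypeError("numeric fields must be plain integers")
    return {
        "stage": event.stage.value,
        "status": event.status.value,
        "duration_ms": event.duration_ms,
        "token_count": event.token_count,
    }
```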

v0.1.40

What's new since v0.1.39

Chatbot

  • Conversation picker — chats now live in per-conversation directories with rename support and last-activity ordering, so you can keep multiple threads side by side.
  • Less-resistant persona — questions "about the user" trigger a fresh lookup instead of being deflected; LOOKUP is the default when the answer isn't already in-context.
  • Reasoning toggle now wired through — the chatbot's reasoning switch actually controls inference (it didn't before); the dead rerank toggle was removed.
  • Per-message copy button — each message has its own copy action, and each resources block is labeled with the source it came from.
  • Citations pinned to the run — a chat message's citations always reference the run that produced them, even after later runs change the underlying data.
  • Citation parity across hops — clicking a citation highlights the entity (with fade) on any hop, and highlights the whole chunk in the source view; chunker tuned to 512/64.
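The "512/64" chunker setting above reads naturally as a window size of 512 with an overlap of 64, although the notes do not state the units or confirm that reading. Under that assumption, and treating the input as a list of tokens, a sliding-window chunker could look like this:

```python
from typing import List, Sequence


def chunk(tokens: Sequence, size: int = 512, overlap: int = 64) -> List[Sequence]:
    """Split tokens into windows of `size` that share `overlap` items."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):  # this window reached the end
            break
    return chunks
```

With these defaults each window starts 448 items after the previous one, so the last 64 items of one chunk are repeated at the start of the next.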

RAG

  • Fail-closed retriever — chunkless and resumed runs no longer silently retrieve zero chunks; the retriever now fails loudly instead of returning empty.
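A fail-closed retriever raises when there is nothing to search, rather than returning an empty list that looks like "no matches". The sketch below uses naive word overlap as a stand-in scorer; the exception name, function signature, and scoring are all assumptions for illustration.

```python
from typing import List, Sequence


class EmptyCorpusError(RuntimeError):
    """Raised when a run has no chunks to retrieve from."""


def retrieve(query: str, chunks: Sequence[str], top_k: int = 3) -> List[str]:
    """Return the best-matching chunks, failing loudly on an empty index."""
    if not chunks:
        raise EmptyCorpusError("no chunks indexed for this run")
    terms = set(query.lower().split())
    scored = [(len(terms & set(text.lower().split())), text) for text in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable for equal scores
    return [text for score, text in scored[:top_k] if score > 0]
```

The caller can then distinguish "the index is empty" (an error to surface) from "the index has content but nothing matched" (an empty list).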

Reliability

  • Startup "Attestation Failed" fixed — an intermittent sigstore TUF symlink race on first launch is serialized away.

Trust chain

  • Attestation call sites consolidated — verification now happens at exactly three sanctioned points in the inference path; scattered ad-hoc checks were removed.

Identifiers

  • Stable 4-letter IDs — run IDs are now stable across restarts, and the same scheme was extended to chats so threads have durable IDs too.