# Local-First Assistant - Current Plan (Mar 2026)
## Product Direction
Build a personal AI assistant that is:
- Local-first by default (local models/services whenever practical)
- Config-driven (`config/config.yaml` as the source of truth)
- Integration-based (new capabilities added as tools/integrations, not hardcoded hacks)
- Single-binary friendly for core runtime (`go run .` / `go build`)
Primary UX goals:
- Fast daily utility (calendar, memory, briefings, concise chat)
- Collaborative planning (shared scratchpad for short/long-term goals)
- Progressive multimodal expansion (audio + images)
---
## Current State (Shipped)
### Core runtime
- Go HTTP server with:
  - `POST /ask`
  - `POST /ask/stream` (SSE)
  - `GET /api/status`
- Single Ollama-compatible LLM client (`llm/llm.go`)
- Agent loop with tool-calling rounds, telemetry, and context budget controls
- Config validation and feature gating in `config/config.go`
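
To make the "tool-calling rounds" item concrete, here is a minimal sketch of a bounded agent loop. It is not the project's actual loop: the `LLM` interface, `Reply`, `ToolCall`, `Tool`, and `scriptedLLM` are hypothetical names invented for illustration, and telemetry and context budgeting are left out.

```go
package main

import "fmt"

// ToolCall, Reply, LLM and Tool are hypothetical types for this sketch only.
type ToolCall struct {
	Name string
	Args string
}

type Reply struct {
	Text  string     // final answer when Calls is empty
	Calls []ToolCall // tools the model wants to run this round
}

type LLM interface {
	Chat(history []string) Reply
}

type Tool func(args string) string

// RunAgent alternates between model replies and tool execution until the model
// answers without tool calls or maxRounds is exhausted.
func RunAgent(llm LLM, tools map[string]Tool, prompt string, maxRounds int) string {
	history := []string{"user: " + prompt}
	for round := 0; round < maxRounds; round++ {
		reply := llm.Chat(history)
		if len(reply.Calls) == 0 {
			return reply.Text
		}
		for _, call := range reply.Calls {
			result := "error: unknown tool"
			if tool, ok := tools[call.Name]; ok {
				result = tool(call.Args)
			}
			// Tool output is appended so the next round can see it.
			history = append(history, "tool "+call.Name+": "+result)
		}
	}
	return "stopped: tool round limit reached"
}

// scriptedLLM is a stand-in model: it requests one tool call, then answers.
type scriptedLLM struct{}

func (scriptedLLM) Chat(history []string) Reply {
	if len(history) == 1 {
		return Reply{Calls: []ToolCall{{Name: "calendar_list_range", Args: "2026-03-01..2026-03-07"}}}
	}
	return Reply{Text: "done after seeing: " + history[len(history)-1]}
}

func main() {
	tools := map[string]Tool{
		"calendar_list_range": func(args string) string { return "0 items in " + args },
	}
	fmt.Println(RunAgent(scriptedLLM{}, tools, "What is on this week?", 4))
}
```

The hard round limit is the important part: a model that keeps requesting tools cannot loop forever, and the caller gets an explicit stop message instead.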
### Enabled capability stack (from current config)
- Weather tools
- News tools
- Calendar tools
- Memory tools
- Git log tools
- Code tools present but disabled by config
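
As an illustration of config-driven gating (for example, code tools present but disabled), the sketch below validates a typed config and lists only the enabled capabilities. The `Config` and `ToolConfig` fields, the endpoint URLs, and the validation rules are assumptions, not the real `config/config.go` schema, and YAML parsing is omitted.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// ToolConfig is a hypothetical per-capability config entry.
type ToolConfig struct {
	Enabled  bool
	Endpoint string // assumed to be required only when Enabled is true
}

// Config mirrors a hypothetical subset of config/config.yaml.
type Config struct {
	LLMBaseURL string
	Tools      map[string]ToolConfig
}

// Validate rejects configs that enable a capability without the settings it needs.
func (c Config) Validate() error {
	if c.LLMBaseURL == "" {
		return errors.New("llm base URL is required")
	}
	for name, t := range c.Tools {
		if t.Enabled && t.Endpoint == "" {
			return fmt.Errorf("tool %q is enabled but has no endpoint", name)
		}
	}
	return nil
}

// EnabledTools returns the sorted names of capabilities the agent may register.
func (c Config) EnabledTools() []string {
	var names []string
	for name, t := range c.Tools {
		if t.Enabled {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	cfg := Config{
		LLMBaseURL: "http://localhost:11434",
		Tools: map[string]ToolConfig{
			"weather": {Enabled: true, Endpoint: "http://localhost:8081"},
			"code":    {Enabled: false},
		},
	}
	if err := cfg.Validate(); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println(cfg.EnabledTools())
}
```

Failing at startup on an enabled-but-unconfigured tool keeps the agent from advertising tools it cannot actually run.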
### Persistence and data model
- SQLite store in `memory/`
- Active tables:
  - `memories`
  - `calendar_items` (date-only scheduling via `due_at`, undated supported)
  - `calendar_pending` (approval queue)
  - `calendar_meta` (`revision` for refresh/polling)
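
One way the `revision` value could drive refresh/polling is sketched below: every write bumps a counter, and a client re-fetches only when the counter differs from the value it last saw. `RevisionCounter` is a hypothetical in-memory stand-in; the real value lives in SQLite and its exact semantics are not specified here.

```go
package main

import (
	"fmt"
	"sync"
)

// RevisionCounter is a hypothetical in-memory stand-in for the revision
// value kept in calendar_meta.
type RevisionCounter struct {
	mu  sync.Mutex
	rev int64
}

// Bump is meant to be called after every successful write.
func (r *RevisionCounter) Bump() int64 {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.rev++
	return r.rev
}

// ChangedSince returns the current revision and whether the caller's copy is stale.
func (r *RevisionCounter) ChangedSince(seen int64) (int64, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.rev, r.rev != seen
}

func main() {
	var counter RevisionCounter
	seen, _ := counter.ChangedSince(0)

	counter.Bump() // e.g. a calendar item was created

	current, stale := counter.ChangedSince(seen)
	fmt.Println("current revision:", current, "refetch needed:", stale)
}
```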
Removed legacy scope:
- Old task toolchain and task table were removed to keep the codebase slim.
---
## Calendar + Pending Workflow (Shipped)
### API surface
- `GET /api/calendar/items`
- `GET /api/calendar/items/{id}`
- `POST /api/calendar/items`
- `PUT /api/calendar/items/{id}`
- `DELETE /api/calendar/items/{id}`
- `GET /api/calendar/pending`
- `PUT /api/calendar/pending/{id}` (kept for compatibility)
- `POST /api/calendar/pending/{id}/confirm`
- `POST /api/calendar/pending/{id}/reject`
### Tooling
- `calendar_list_range`
- `calendar_propose_change` (proposal queue only, user confirms in UI)
### Behavior highlights
- Date-only scheduling with optional undated items
- Range normalization for date/datetime inputs in list queries
- Pending approvals rendered in the UI below the composer
- Day detail includes:
  - separate undated section
  - 14-day rolling agenda (today first)
  - scrollable list area
- Done items remain visible with status pill styling
- Completing an undated item assigns its due date to the selected day (falling back to today)
---
## Integration Architecture (Target Shape)
Treat every major feature as a pluggable integration:
- Config-controlled enable/disable
- Health-checkable
- Exposed to the agent through stable tools
- Local endpoint first, optional remote fallback
Suggested structure:
```text
integrations/
  scratchpad/
  stt/
  tts/
  vision/
  image_gen/
```
Each integration should define:
- Config schema
- Runtime client (local service/api)
- Minimal server endpoints (if UI uploads/playback required)
- Tool wrappers for agent use
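
A possible Go shape for this contract is sketched below: one interface covering enablement, health, and tools, plus a registry that exposes tools only from integrations that are enabled and healthy. Every name here (`Integration`, `Registry`, `Tool`, `stub`, `activeNames`) is hypothetical and meant as a starting point, not a settled design.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Tool is a hypothetical agent-facing tool descriptor.
type Tool struct {
	Name        string
	Description string
}

// Integration is a hypothetical contract every pluggable feature would satisfy.
type Integration interface {
	Name() string
	Enabled() bool                    // driven by config
	Health(ctx context.Context) error // nil means the backing service is usable
	Tools() []Tool
}

// Registry collects integrations and exposes tools only from enabled, healthy ones.
type Registry struct{ items []Integration }

func (r *Registry) Register(i Integration) { r.items = append(r.items, i) }

func (r *Registry) ActiveTools(ctx context.Context) []Tool {
	var out []Tool
	for _, i := range r.items {
		if !i.Enabled() || i.Health(ctx) != nil {
			continue
		}
		out = append(out, i.Tools()...)
	}
	return out
}

var errUnreachable = errors.New("service unreachable")

// stub is a placeholder integration used only to exercise the registry.
type stub struct {
	name      string
	enabled   bool
	healthErr error
	tool      string
}

func (s stub) Name() string                 { return s.name }
func (s stub) Enabled() bool                { return s.enabled }
func (s stub) Health(context.Context) error { return s.healthErr }
func (s stub) Tools() []Tool                { return []Tool{{Name: s.tool, Description: "placeholder"}} }

// activeNames lists the tool names the agent would currently see.
func activeNames(r *Registry) []string {
	var names []string
	for _, t := range r.ActiveTools(context.Background()) {
		names = append(names, t.Name)
	}
	return names
}

func main() {
	reg := &Registry{}
	reg.Register(stub{name: "scratchpad", enabled: true, tool: "scratchpad_read"})
	reg.Register(stub{name: "tts", enabled: false, tool: "speak_text"})
	reg.Register(stub{name: "stt", enabled: true, healthErr: errUnreachable, tool: "transcribe_audio"})
	fmt.Println(activeNames(reg))
}
```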
---
## Next Major Build: Shared Scratchpad (P1)
### Why first
- Highest leverage for collaborative planning and long/short-term goals
- Creates durable shared context between user and model
- Makes future audio/vision workflows more coherent
### MVP scope
- SQLite table(s): scratchpads + optional revisions
- API:
  - `GET /api/scratchpad`
  - `PUT /api/scratchpad`
  - optional `GET /api/scratchpad/history`
- Tools:
  - `scratchpad_read`
  - `scratchpad_update`
- UI panel with sections:
  - Now
  - Next
  - Blockers
  - Notes
- Revision counter + polling pattern similar to calendar
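
One way to use the revision counter so that user and model edits cannot silently overwrite each other is an optimistic update: a write succeeds only if it was based on the latest revision. The sketch below uses an in-memory `Store` as a stand-in for the SQLite table; the type names, `ErrStale`, and the reject-on-conflict policy are assumptions rather than the planned implementation.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrStale signals that the caller edited an outdated copy.
var ErrStale = errors.New("scratchpad changed since it was read")

// Scratchpad holds the four UI sections plus a revision counter.
type Scratchpad struct {
	Now, Next, Blockers, Notes string
	Revision                   int64
}

// Store is a hypothetical in-memory stand-in for the SQLite-backed scratchpad.
type Store struct {
	mu  sync.Mutex
	pad Scratchpad
}

func (s *Store) Read() Scratchpad {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.pad
}

// Update applies next only if baseRevision matches the stored revision;
// otherwise it returns the current state and ErrStale so the caller can re-read.
func (s *Store) Update(baseRevision int64, next Scratchpad) (Scratchpad, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if baseRevision != s.pad.Revision {
		return s.pad, ErrStale
	}
	next.Revision = s.pad.Revision + 1
	s.pad = next
	return s.pad, nil
}

func main() {
	store := &Store{}
	userView := store.Read()
	agentView := store.Read() // both start from revision 0

	userView.Now = "Ship scratchpad API"
	saved, err := store.Update(userView.Revision, userView)
	fmt.Println("user save:", saved.Revision, err)

	agentView.Next = "Draft STT scaffold"
	_, err = store.Update(agentView.Revision, agentView)
	fmt.Println("agent save is stale:", errors.Is(err, ErrStale))
}
```

On `ErrStale`, the `scratchpad_update` tool could re-read and retry, while the UI could prompt the user to merge; either choice is outside this sketch.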
### Acceptance criteria
- User and model can both update a shared artifact safely
- Changes from either side appear in the UI promptly (within one polling cycle)
- Agent can reference scratchpad reliably in subsequent turns
---
## Multimodal Roadmap
### P2: Audio (STT + TTS), local-first
Goal: hands-free interaction and spoken responses.
- STT integration options:
  - `faster-whisper` or `whisper.cpp` service
- TTS integration options:
  - Piper (default local)
- API candidates:
  - `POST /api/audio/transcribe`
  - `POST /api/audio/speak`
- Tool candidates:
  - `transcribe_audio`
  - `speak_text`
- UI:
  - push-to-talk
  - replay assistant audio
Acceptance:
- Reliable transcription for short prompts
- Low-latency local speech synthesis
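
To show how "local first, remote only by opt-in" might look for STT, the sketch below wraps two providers behind one interface and consults the remote one only if it was configured and the local call failed. `Transcriber`, `LocalFirst`, `fake`, and `transcribeOnce` are hypothetical names; no real `faster-whisper` or `whisper.cpp` client API is implied.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Transcriber is a hypothetical provider interface for speech-to-text.
type Transcriber interface {
	Transcribe(ctx context.Context, audio []byte) (string, error)
}

// LocalFirst always tries Local. Remote is used only when it is non-nil
// (the user opted in) and the local attempt failed.
type LocalFirst struct {
	Local  Transcriber
	Remote Transcriber // nil unless remote fallback is enabled in config
}

func (l LocalFirst) Transcribe(ctx context.Context, audio []byte) (string, error) {
	text, err := l.Local.Transcribe(ctx, audio)
	if err == nil {
		return text, nil
	}
	if l.Remote == nil {
		return "", fmt.Errorf("local transcription failed: %w", err)
	}
	return l.Remote.Transcribe(ctx, audio)
}

var errLocalDown = errors.New("local STT service down")

// fake is a canned provider used only to exercise the fallback logic.
type fake struct {
	text string
	err  error
}

func (f fake) Transcribe(context.Context, []byte) (string, error) { return f.text, f.err }

// transcribeOnce runs a provider against placeholder audio bytes.
func transcribeOnce(t Transcriber) (string, error) {
	return t.Transcribe(context.Background(), []byte("placeholder audio"))
}

func main() {
	_, err := transcribeOnce(LocalFirst{Local: fake{err: errLocalDown}})
	fmt.Println("local only:", err)

	text, _ := transcribeOnce(LocalFirst{
		Local:  fake{err: errLocalDown},
		Remote: fake{text: "remind me at noon"},
	})
	fmt.Println("with opt-in fallback:", text)
}
```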
### P3: Vision + Image Generation
Goal: image-aware assistant and local image creation pipeline.
- Vision (analyze images):
  - upload image -> parse/describe -> optional memory/scratchpad entry
- Image generation:
  - local SD/Flux backend via integration endpoint
- API candidates:
  - `POST /api/images/analyze`
  - `POST /api/images/generate`
- Tool candidates:
  - `analyze_image`
  - `generate_image`
- UI:
  - upload + preview
  - generation gallery/history
Acceptance:
- Agent can reason over user-provided images
- Agent can generate and return local image artifacts
---
## Engineering Principles
- Keep files small, explicit, and testable.
- Prefer typed structs over loose maps in hot paths.
- Keep DB ownership centralized in `memory/`.
- Keep tool descriptions precise to reduce model drift.
- Add capability only when config + UI + tool + telemetry are all wired.
- Preserve local-first defaults; remote services stay strictly opt-in.
---
## Immediate Execution Plan
1. Implement scratchpad storage + API + UI + tools.
2. Add scratchpad-aware prompt guidance and tool usage rules.
3. Add regression tests around calendar range semantics and pending flows.
4. Add integration scaffold for audio (STT/TTS) with one local provider each.
5. Add vision/image generation integrations behind config flags.
---
## Success Criteria (Near-Term)
- Calendar and pending flows remain stable and predictable.
- Scratchpad becomes the default workspace for ongoing goals.
- Audio loop works locally with acceptable latency.
- Image workflows are usable without cloud dependency.
- The project stays slim enough to understand and iterate on quickly.