Releases: axot/OpenSuperMLX
Release 0.0.11
What's Changed
- chore: bump version to 0.0.11
- docs: strengthen commit and push rules in release flow learnings
- docs: add release flow and commit granularity rules to learnings
- chore: remove make_release.sh, CI release workflow is the sole release path
- fix: prevent CJK phrase-level repetition loops in streaming transcription
- fix: remove 0.5s pre-buffer that leaked pre-recording audio into transcription
- fix: resolve system audio clipping, popping, and stuttering in speaker capture
- fix(ui): forward TranscriptionService.objectWillChange to ContentViewModel
- refactor: simplify download progress code
- fix(vendor): use session-level URLSession delegate for download progress
- fix(vendor): use /tmp for Hub download cache to avoid duplicate model storage
- fix(ui): only show download progress bar when progress is between 0-100%, add diagnostic logging
- docs: trim cli.md — remove details available via --help, keep only policy and quick start
- fix: auto-recover audio engine after USB mic disconnect/reconnect and sleep/wake
- docs: slim AGENTS.md CLI section, extract detail to docs/cli.md
- docs: add CLI test harness reference and pre-commit verification policy to AGENTS.md
- fix: replace AsyncParsableCommand with ParsableCommand + runAsync bridge
- refactor: code simplification pass — remove unused GRDB import from RecordingsCommand
- feat: add mic hot-swap protocol spy tests
- feat: implement benchmark and diagnose commands
- feat: implement recordings, queue, mic, and model commands
- feat: implement correct and config commands
- feat: implement stream-simulate command with file injection into ring buffer
- feat: implement transcribe command with batch file transcription
- feat: add CLIOutput JSON formatting and CLIError with 12 error codes
- feat: add ArgumentParser CLI framework with 10 stub subcommands
- refactor: simplify download progress code
- feat(onboarding): add determinate download progress bar with percentage
- feat(ui): add determinate download progress bar to ContentView model loading overlay
- feat(vendor): thread progressHandler through ModelUtils and Qwen3ASR.fromPretrained
- plan: add CLI test harness implementation plan (15 tasks)
- docs: address Momus review — add JSON schemas, error codes, completion semantics, config typing, model download behavior
- fix(models): correct built-in model size labels to match HuggingFace
- docs: add CLI test harness design spec
- docs: add mandatory debugging section to AGENTS.md and restructure debugging.md
Installation via Homebrew (Recommended)
brew tap axot/tap
brew install --cask opensupermlx
Manual Installation
- Download OpenSuperMLX.dmg below
- Open the DMG and drag OpenSuperMLX to Applications
- On first launch: right-click the app → Open (required for unsigned apps)
- Grant microphone and accessibility permissions when prompted
Requirements
- macOS 15.1 or later
- Apple Silicon (M1/M2/M3/M4) Mac
Release 0.0.9
What's Changed
- Bump version to 0.0.9
- fix: unify MACOSX_DEPLOYMENT_TARGET to 14.0 across all targets
- docs: update AGENTS.md to match current codebase
- fix(streaming): token-level delta accumulation for cross-reset text display
- wip: token-level merge for cross-reset text accumulation (in progress)
- fix(streaming): align periodic reset with C reference to fix punctuation loss
- fix(streaming): fix long recording transcription failures
- Disable VPIO automatic system volume ducking
- refactor: decouple mic/speaker selection with independent controls
- refactor: address Oracle review feedback on streaming fixes
- fix(streaming): preserve transcribed text across periodic resets using overlap-aware merging
- fix(dual-track): await system audio transcription when mic text is empty before pasting
- fix(mic): skip virtual audio devices in auto mode, fall back to physical mic
- fix(streaming): auto mode no longer overrides language with detected language
- chore: add gstack skill routing rules to AGENTS.md
- refactor: code simplifier pass and AGENTS.md updates
- fix(streaming): O(1) memory for long audio sessions
- refactor(streaming): code simplifier pass
- feat(streaming): antirez-aligned continuous chunk processor
- feat: use separate bundle ID for Debug builds to avoid TCC permission conflicts
- refactor(tests): isolate remaining test classes from UserDefaults.standard
- fix: isolate test UserDefaults to prevent wiping user LLM settings
- refactor: code simplifier cleanup for LLM and Settings code
- feat(llm): add user-facing error notifications with in-app toast and connection retry
- fix(audio): guard activateForRecording in non-streaming path for auto mode
- fix(audio): auto mode uses system default microphone
- refactor: simplify system audio capture code
- feat(audio): wire end-to-end dual-track recording flow
- feat(ui): add Auto audio source to mic picker and Screen Recording permission in Settings
- feat(audio): add dual-track transcription and merge
- feat(audio): add dual-track capture coordination to StreamingAudioService
- feat(audio): production SystemAudioService with SCStream
- feat(audio): add screen recording permission flow
- feat(audio): add Auto option to MicrophoneService
- feat(audio): implement AEC processing pipeline
- chore(llm): remove BedrockService and dead notification
- fix: prevent LLM from executing instructions found in transcription text
- feat: add Default/Custom mode for LLM correction prompt
- refactor(llm): migrate call sites from BedrockService to LLMCorrectionService
- feat(audio): add dtln-aec-coreml dependency and AECService wrapper
- feat(audio): add call detection via CoreAudio AudioProcess API
- feat(audio): add headphone/speaker detection utility
- feat(llm): add LLMCorrectionService orchestrator with provider implementations
- feat(audio): add ScreenCaptureKit POC for system audio capture
- feat(llm): add LLM AppPreferences keys and UserDefaults migration
- feat(llm): add LLMProvider protocol, error types, and provider type enum
- docs: add build verification strategy to plan conventions
- fix: restore recordingsDidUpdateNotification listener removed in c0fa8df
- fix: rebuild Chinese ITN FSTs with enable_0_to_9=False to prevent single-character number conversion
Installation via Homebrew (Recommended)
brew tap axot/tap
brew install --cask opensupermlx
Manual Installation
- Download OpenSuperMLX.dmg below
- Open the DMG and drag OpenSuperMLX to Applications
- On first launch: right-click the app → Open (required for unsigned apps)
- Grant microphone and accessibility permissions when prompted
Requirements
- macOS 15.1 or later
- Apple Silicon (M1/M2/M3/M4) Mac
Release 0.0.8
What's Changed
- chore: bump version to 0.0.8
- fix: reduce SwiftUI layout thrashing causing UI freezes
- fix: reset KV cache per segment, pass context via system role
- fix: strip 'language ...<asr_text>' prefix from non-streaming transcription output
- fix: use synthetic test data and correct assertions in MicrophoneServiceTests
- chore: remove diagnostic tokenIds and merge logs
- fix: flush preSpeechBuffer at end-of-recording, widen ring buffer drain window
- Upgrade LLM correction prompt to semantic intent extraction
- feat: pre-warm audio engine on model load to eliminate startup latency
- chore: remove diagnostic logs from streaming pipeline
- fix: replace terminal punct with comma instead of removing, add diagnostic logs
- fix: streaming truncation — graceful stop, pre-speech buffer, punctuation merge
- feat(streaming): replace fixed 2s chunking with Silero VAD pre-segmentation
- docs: add debugging methodology and consolidate Reference Docs section
- remove oracle review
- test: strengthen weak assertions — verify real behavior, not mock pass-through
- style: fix import ordering and remove print() from test files
- test: add benchmark baseline for regression detection
- test: add performance, memory, and accuracy benchmark tests
- docs: require TDD workflow for all work plans
- test: add WER/CER calculation utility with TDD tests
- ci: run unit tests on every push and PR
- test(Settings): add SettingsTests with injectable memberwise init (TDD)
- test: create hostless unit test target, move pure-logic tests
- test(RecordingStore): add CRUD tests with in-memory DB injection (TDD)
- test(TranscriptionService): add mock-based orchestration tests with injectable init (TDD)
- test: reorganize existing tests — split mega-file, remove Settings overlap, merge FillerWordPrompt
- test: remove ClipboardUtilPasteIntegrationTests (fragile, never runs in CI)
- test: add jfk.wav audio fixture to test bundle with TestFixtures helper
- test: add xctestplan files for fast and benchmark test tiers
- docs: add mandatory testing rules to AGENTS.md
- cleanup: remove obsolete suppressBlankAudio and ITN toggle settings
- docs: add native library checklist to prevent missing build steps in releases
- build: add text-processing-rs to notarize_app.sh for release signing
- feat: add English ITN via text-processing-rs
- docs: require git worktree for all plan execution
- fix(streaming): guard prefix rollback against multi-byte character splits
- fix(streaming): add repetition penalty and guard to prevent decode loops
- feat: default language to auto-detect on first install
Installation via Homebrew (Recommended)
brew tap axot/tap
brew install --cask opensupermlx
Manual Installation
- Download OpenSuperMLX.dmg below
- Open the DMG and drag OpenSuperMLX to Applications
- On first launch: right-click the app → Open (required for unsigned apps)
- Grant microphone and accessibility permissions when prompted
Requirements
- macOS 15.1 or later
- Apple Silicon (M1/M2/M3/M4) Mac
Release 0.0.7
What's Changed
- chore: bump version to 0.0.7
- docs: update AGENTS.md with comment policy, VendoredPackages note, WeTextProcessing build context
- refactor(streaming): remove dead decodeAllTokenIds method
- fix(streaming): use prefix-rollback in finishStop instead of full re-decode
- test(streaming): add prefix-rollback regression tests for large token counts
- fix(streaming): cap decode window to prevent O(n²) slowdown on long recordings
Installation via Homebrew (Recommended)
brew tap axot/tap
brew install --cask opensupermlx
Manual Installation
- Download OpenSuperMLX.dmg below
- Open the DMG and drag OpenSuperMLX to Applications
- On first launch: right-click the app → Open (required for unsigned apps)
- Grant microphone and accessibility permissions when prompted
Requirements
- macOS 15.1 or later
- Apple Silicon (M1/M2/M3/M4) Mac
Release 0.0.6
What's Changed
- fix: bundle processor_main in release and revert version to 0.0.6
- fix(ci): skip non-SPM patches in resolve_and_patch.sh instead of failing
- docs: update AGENTS.md with ITNProcessor, logging docs, release workflow, trim verbosity
- fix: address Oracle review — add missing tests, fix MARK label, remove misleading prompt tests
- fix(asr): clear Qwen3 ASR system prompt to avoid debug interference
- docs: add logging guide, update AGENTS.md to reference docs/logging.md
- fix(itn): clean duplicate punctuation left by WeTextProcessing dropping filler chars
- docs: add Plan Conventions section to AGENTS.md
- feat(itn): make Chinese ITN always-on, remove settings toggle
- refactor(itn): simplify ITNProcessor per code-simplifier review
- chore(itn): update Xcode project for ITN files
- feat(itn): integrate ITN into transcription pipeline
- feat(itn): add Settings toggle for Chinese ITN
- feat(itn): add ITNProcessor wrapper and tests
- build(itn): add WeTextProcessing submodule and build pipeline
- refactor(streaming): code simplifier pass for clarity and consistency
- fix(streaming): replace overlap+dedup with prefix-rollback algorithm, fixes CJK duplicates
- fix(streaming): simplify StreamingConfig and update StreamingAudioService
- fix(streaming): add prefix parameter to buildPrompt for rollback support
- chore: remove patch artifacts and add *.orig to gitignore
- docs: update AGENTS.md for vendored mlx-audio-swift
- chore: remove mlx-audio-swift patches (now permanent in vendored source)
- chore: vendor mlx-audio-swift as local package with patched source
- feat(asr): add system prompt to suppress filler words in Qwen3-ASR transcription
- fix: prevent permanent stuck state and crash from rapid hotkey presses
- chore: upgrade AWS SDK Swift 1.6.68 → 1.6.77 and remove SSO token patch
- docs: add usage section with hold-to-record instructions and remove experimental label from streaming
- docs: add macOS Gatekeeper security approval instructions to README
Installation via Homebrew (Recommended)
brew tap axot/tap
brew install --cask opensupermlx
Manual Installation
- Download OpenSuperMLX.dmg below
- Open the DMG and drag OpenSuperMLX to Applications
- On first launch: right-click the app → Open (required for unsigned apps)
- Grant microphone and accessibility permissions when prompted
Requirements
- macOS 15.1 or later
- Apple Silicon (M1/M2/M3/M4) Mac
Release 0.0.5
What's Changed
- fix(release): use full clone for changelog generation
- docs: add Homebrew install to README and rename to README.md
- fix(release): use env context instead of secrets in if condition
- feat(release): add dynamic changelog and filter non-release tags
- feat(release): add ad-hoc signing, Homebrew tap, and pin macos-15 runner
- fix: patch aws-sdk-swift SSO token fractional seconds parsing bug
- fix(logging): convert remaining Russian print() calls and missed error print
- feat(logging): improve error logging visibility
- docs: expand fast incremental build guidance in AGENTS.md
- docs: remove Contributing section and upstream TODO list from README
- ci: restrict build workflow to master branch and PRs only
- ci: select latest Xcode for Swift 6.2+ compatibility
- ci: add release workflow to build unsigned DMG on tag push
- feat: restore hold-to-record as always-on default
- feat(indicator): show correcting state during LLM post-processing
- feat(shortcuts): remove single modifier key feature
- feat(shortcuts): add force-LLM-correction shortcut key
- feat: remove hold-to-record feature
- fix(settings): remove dead showTimestamps feature and timestamp display
- fix(settings): always show LLM configuration fields regardless of toggle state
- feat(settings): default streaming transcription to on
- docs: refresh README with updated screenshot, features, and requirements
- docs: update AGENTS.md with complete project structure and missing details
- fix(mic): fall back to default microphone when selected device fails to activate
- fix(ui): enable Enter key to confirm delete-all dialog
- fix(streaming): prevent audio leaking between recording sessions
- fix(bedrock): apply LLM correction in streaming mode, show errors in Notification Center
- feat(streaming): add real-time streaming transcription
- fix: add Slaney mel scale to IncrementalMelSpectrogram streaming path
- feat: add AWS Bedrock LLM post-transcription correction
- docs: tighten AGENTS.md and add patches workflow section
- fix: use Slaney mel scale and increase chunk duration to match Python mlx_audio
- Fix patch application: resolve SPM packages before patching, add patches to release build
- fix long audio never finishing: per-chunk token budget and 5-min chunking
- add MLX memory patch for periodic cache clearing during token generation
- reduce MLX memory cache limit from 512MB to 4MB
- fix: reduce GPU memory usage for long audio transcription
- Bump fastlane from 2.229.0 to 2.232.2
- Add AGENTS.md with build commands, test instructions, and code style guidelines
- feat: replace whisper.cpp/FluidAudio with MLX backend
- Skip accessibility permission check in debug builds
- refactor: rename OpenSuperWhisper to OpenSuperMLX
Installation via Homebrew (Recommended)
brew tap axot/tap
brew install --cask opensupermlx
Manual Installation
- Download OpenSuperMLX.dmg below
- Open the DMG and drag OpenSuperMLX to Applications
- On first launch: right-click the app → Open (required for unsigned apps)
- Grant microphone and accessibility permissions when prompted
Requirements
- macOS 15.1 or later
- Apple Silicon (M1/M2/M3/M4) Mac