This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Swift Scribe is an AI-powered speech-to-text transcription application built exclusively for iOS 26/macOS 26+ using Apple's latest frameworks. It provides real-time voice transcription, on-device AI processing, speaker diarization, and intelligent note-taking with complete privacy protection.
Building and Running:
```bash
# Open project in Xcode
open SwiftScribe.xcodeproj

# Build from command line (requires Xcode)
xcodebuild -project SwiftScribe.xcodeproj -scheme SwiftScribe -destination 'platform=iOS Simulator,name=iPhone 15 Pro' build

# Build for macOS
xcodebuild -project SwiftScribe.xcodeproj -scheme SwiftScribe -destination 'platform=macOS' build

# Clean build folder
xcodebuild clean -project SwiftScribe.xcodeproj -scheme SwiftScribe
```

Testing:

```bash
# Run tests from command line
xcodebuild test -project SwiftScribe.xcodeproj -scheme SwiftScribe -destination 'platform=iOS Simulator,name=iPhone 15 Pro'

# Run tests for macOS
xcodebuild test -project SwiftScribe.xcodeproj -scheme SwiftScribe -destination 'platform=macOS'
```

Swift Package Manager Integration:

```bash
# Reset Swift Package Manager cache if dependencies have issues
rm -rf SwiftScribe.xcodeproj/project.xcworkspace/xcshareddata/swiftpm

# Then rebuild in Xcode to re-resolve packages
```

IMPORTANT: This project requires bleeding-edge Apple platforms:
- iOS 26 Beta or newer (will NOT work on iOS 18 or earlier)
- macOS 26 Beta or newer (will NOT work on macOS 15 or earlier)
- Xcode Beta with Swift 6.2+ toolchain
- Apple Developer Account with beta access
SwiftUI + SwiftData + Modern Concurrency Architecture:
- Built entirely with SwiftUI for cross-platform UI
- SwiftData for object persistence (Core Data successor)
- Async/await and actors for concurrent operations
- Observable state management using the Swift 5.9+ `@Observable` macro
- App Layer (`ScribeApp.swift`)
  - Main app entry point with SwiftData model container setup
  - Configures shared model context for memo persistence
- View Layer (`Views/`)
  - `ContentView.swift`: Navigation split view (memo list + transcript detail)
  - `TranscriptView.swift`: Core recording interface with live transcription
  - `SettingsView.swift`: App configuration and preferences
  - Conditional compilation for iOS vs macOS UI differences
- Model Layer (`Models/`)
  - `MemoModel.swift`: Core `Memo` class with SwiftData persistence, AI enhancement, and speaker attribution
  - `SpeakerModels.swift`: `Speaker` and `SpeakerSegment` models for diarization
  - `AppSettings.swift`: Observable settings for themes and diarization preferences
- Audio Processing (`Audio/`)
  - `Recorder.swift`: Dual `AVAudioEngine` architecture (recording + playback engines)
  - `DiarizationManager.swift`: FluidAudio integration for real-time speaker identification
- Speech & AI (`Transcription/` + `Helpers/`)
  - `Transcription.swift`: Apple Speech framework with async stream processing
  - `FoundationModelsHelper.swift`: On-device AI text generation using Apple's FoundationModels (see the sketch below)
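For orientation, here is a minimal sketch of the kind of on-device generation call `FoundationModelsHelper.swift` presumably wraps. It uses the public FoundationModels session API; the helper's actual method names are not shown in this file:

```swift
import FoundationModels

// Sketch of an on-device generation call - not the repo's helper API.
func generateTitle(for transcript: String) async throws -> String {
    // Bail out gracefully when the on-device model is unavailable.
    guard case .available = SystemLanguageModel.default.availability else {
        return "New Memo"
    }
    let session = LanguageModelSession(
        instructions: "You write short, descriptive titles for voice memos."
    )
    let response = try await session.respond(
        to: "Write a title for this transcript:\n\(transcript)"
    )
    return response.content
}
```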
Core Frameworks:
- SwiftUI: Modern declarative UI framework
- SwiftData: Object persistence with `@Model` classes
- Speech: Real-time speech recognition with streaming
- AVFoundation: Audio recording, playback, and processing
- FoundationModels: On-device AI text generation (iOS 26+/macOS 26+)
External Dependencies:
- FluidAudio: Speaker diarization library for advanced speaker separation
  - Repository: https://github.com/FluidInference/FluidAudio/
  - Provides `DiarizerManager`, `DiarizationResult`, and `TimedSpeakerSegment` (hedged usage sketch below)
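This file does not document FluidAudio's call shape, so the following is only a hypothetical sketch assembled from the three type names listed above; `performCompleteDiarization` and `segments` are assumed names to verify against the FluidAudio README:

```swift
import FluidAudio

// Hypothetical usage sketch - verify every name against FluidAudio's docs.
func diarize(_ samples: [Float]) async throws -> [TimedSpeakerSegment] {
    let diarizer = DiarizerManager()                            // type listed above
    let result: DiarizationResult =
        try await diarizer.performCompleteDiarization(samples)  // assumed method
    return result.segments                                      // assumed property
}
```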
- Dual Audio Engine Architecture
  - Separate `AVAudioEngine` instances for recording and playback
  - Prevents conflicts and allows simultaneous recording/playback
  - Real-time audio processing with buffer management (see the sketch below)
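A minimal sketch of the dual-engine idea, assuming standard AVFoundation APIs (this is not the repo's `Recorder.swift`; the class and method names here are illustrative):

```swift
import AVFoundation

// One engine taps the microphone; a second owns playback, so stopping
// playback never tears down the recording tap.
final class DualEngineAudio {
    private let recordEngine = AVAudioEngine()
    private let playEngine = AVAudioEngine()
    private let player = AVAudioPlayerNode()

    func startRecording(onBuffer: @escaping (AVAudioPCMBuffer) -> Void) throws {
        let input = recordEngine.inputNode
        let format = input.outputFormat(forBus: 0)
        input.installTap(onBus: 0, bufferSize: 4096, format: format) { buffer, _ in
            onBuffer(buffer) // hand buffers to transcription/diarization
        }
        try recordEngine.start()
    }

    func play(file: AVAudioFile) throws {
        playEngine.attach(player)
        playEngine.connect(player, to: playEngine.mainMixerNode, format: file.processingFormat)
        try playEngine.start()
        player.scheduleFile(file, at: nil)
        player.play()
    }
}
```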
- Rich Attribution System
  - Custom `AttributedString` extensions for speaker attribution (see the sketch below)
  - Color-coded text based on speaker identity
  - Confidence scoring and metadata embedding
  - Timeline-based character position mapping
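Custom attribute keys are how speaker metadata rides along with the text. A minimal sketch, with illustrative names (`SpeakerIDAttribute` is not necessarily what the repo defines):

```swift
import Foundation

// Custom attribute key carrying a speaker identifier per text run.
struct SpeakerIDAttribute: CodableAttributedStringKey {
    typealias Value = String
    static let name = "speakerID"
}

// Tag a run of transcript text with the speaker who said it.
var text = AttributedString("Hello there")
text[SpeakerIDAttribute.self] = "speaker-1"
```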
- Real-Time Processing Pipeline
  - Streaming transcription with live text updates
  - Optional concurrent speaker diarization
  - On-device AI enhancement for summaries and titles
  - Async actors for thread-safe audio processing (actor sketch below)
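An actor is what keeps concurrent audio producers and consumers from racing on shared buffers. A small sketch of the idea (illustrative; the repo's actual actor may differ):

```swift
// Serializes access to accumulated audio samples across tasks.
actor AudioAccumulator {
    private var samples: [Float] = []

    func append(_ chunk: [Float]) {
        samples.append(contentsOf: chunk)
    }

    /// Drain everything collected so far, e.g. to hand to the diarizer.
    func drain() -> [Float] {
        defer { samples.removeAll() }
        return samples
    }
}
```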
- Cross-Platform Considerations
  - Conditional compilation using `#if os(iOS)` and `#if os(macOS)` (example below)
  - Platform-specific UI adaptations (navigation styles, toolbars)
  - Shared business logic with platform-specific presentation
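The typical shape of that split, using postfix `#if` on a modifier chain (illustrative, not copied from the repo):

```swift
import SwiftUI

struct TranscriptContainer: View {
    var body: some View {
        Text("Transcript")
        #if os(iOS)
            .navigationBarTitleDisplayMode(.inline) // iOS-only modifier
        #elseif os(macOS)
            .frame(minWidth: 480)                   // give the pane a sane width
        #endif
    }
}
```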
SwiftUI Observable Pattern:
```swift
// Use @Observable classes for state management
@Observable
class AppSettings {
    var isDiarizationEnabled: Bool = false
    var selectedTheme: Theme = .automatic
}

// Inject via environment
.environment(settings)
```

SwiftData Model Pattern:
```swift
// Use @Model for persistent objects
@Model
final class Memo {
    var title: String
    var text: AttributedString

    @Transient var diarizationResult: DiarizationResult?

    // Relationships
    var speakerSegments: [SpeakerSegment] = []
}
```

Async/Await Audio Processing:
```swift
// Use structured concurrency for audio operations
Task {
    for await transcription in transcriptionStream {
        await MainActor.run {
            // Update UI on main thread
        }
    }
}
```
- AttributedString Usage: The app heavily uses `AttributedString` for rich text with speaker attribution. Custom attribute keys are defined for speaker metadata.
- Speaker Diarization Integration: Optional FluidAudio integration provides real-time speaker identification. Results are stored as separate `SpeakerSegment` entities linked to memos.
- AI Enhancement: Uses Apple's FoundationModels for on-device text generation. Never sends data to external servers - completely privacy-focused.
- Cross-Platform UI: Careful use of conditional compilation for iOS vs macOS differences while maintaining shared business logic.
- Audio Permissions: Requires microphone permissions and speech recognition authorization. Handle permission states gracefully (see the sketch below).
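A minimal sketch of those two authorization requests, using standard AVFoundation and Speech APIs (the repo's actual handling may differ):

```swift
import AVFoundation
import Speech

// Request microphone + speech recognition access before recording.
func requestCaptureAuthorizations() async -> Bool {
    let micGranted = await AVCaptureDevice.requestAccess(for: .audio)
    let speechStatus = await withCheckedContinuation { continuation in
        SFSpeechRecognizer.requestAuthorization { status in
            continuation.resume(returning: status)
        }
    }
    return micGranted && speechStatus == .authorized
}
```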
Testing Considerations:
- Test on actual devices for speech recognition accuracy
- Mock FluidAudio dependency for unit tests since it requires audio input
- Test speaker diarization accuracy with multiple voice samples
- Verify AttributedString serialization with SwiftData persistence
- Test cross-platform UI on both iOS simulator and macOS
Adding New AI Features:
- Extend `FoundationModelsHelper.swift` with new generation methods
- Update `MemoModel.swift` to store AI-generated content
- Modify `TranscriptView.swift` to trigger AI processing
Extending Speaker Diarization:
- Update `SpeakerModels.swift` for new speaker metadata
- Modify `DiarizationManager.swift` for additional FluidAudio features
- Enhance `MemoModel.swift` speaker attribution methods
UI Enhancements:
- Update both iOS and macOS code paths in views
- Test navigation and layout on different screen sizes
- Ensure accessibility support for this voice-based app
Development Environment:
- Xcode Beta with latest Swift 6.2+ toolchain required
- iOS 26 Beta/macOS 26 Beta development targets
- Apple Developer Account with beta program access for testing
Required Permissions:
- Microphone usage for audio recording
- Speech recognition authorization
- Local file system access for audio file storage
- No network permissions required (completely offline operation)
Performance Considerations:
- Optimized for Apple Silicon with MLX framework usage
- Real-time audio processing requires adequate device performance
- Speaker diarization is computationally intensive, which is why it is exposed as an optional setting
- SwiftData queries for speaker segments should be optimized for large transcripts (see the fetch sketch below)
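One way to keep segment fetches cheap, sketched with SwiftData's `FetchDescriptor`; the `startTime` property is an assumption about `SpeakerSegment`'s real schema:

```swift
import SwiftData

// Fetch only a bounded, time-sorted page of segments instead of the
// whole transcript's worth. `startTime` is an assumed property name.
func firstSegments(in context: ModelContext, limit: Int = 200) throws -> [SpeakerSegment] {
    var descriptor = FetchDescriptor<SpeakerSegment>(
        sortBy: [SortDescriptor(\.startTime)]
    )
    descriptor.fetchLimit = limit
    return try context.fetch(descriptor)
}
```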