A production-quality iOS application for recording, transcribing, and managing audio files. Built with SwiftUI, SwiftData, and MVVM architecture.
- High-quality audio recording using AVAudioEngine
- Real-time audio level visualization with responsive ring animation
- Background recording support
- Audio interruption handling (phone calls, headphone disconnections)
- Automatic pause/resume functionality
- Integration with OpenAI GPT‑4o Mini Transcribe for accurate transcription
- Automatic audio segmentation (30-second chunks)
- Offline queuing system for network interruptions
- Fallback to on-device SFSpeechRecognizer when API fails
- Exponential backoff retry logic for failed requests
- Beautiful, intuitive interface with smooth animations
- Real-time waveform visualization during recording
- Search functionality across transcriptions
- Pull-to-refresh for recordings list
- Network status indicators
- Graceful error handling with user-friendly messages
- Disk space monitoring before recording
- Network connectivity detection
- Comprehensive error recovery mechanisms
- Offline mode support with automatic queue processing
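The 30-second segmentation listed above can be illustrated with a small sketch. It only computes the time ranges for each chunk; slicing the actual audio file is left out. The names `SegmentRange` and `segmentRanges` are hypothetical and are not taken from the project's source.

```swift
import Foundation

/// A time range within a recording, in seconds (hypothetical type).
struct SegmentRange: Equatable {
    let start: TimeInterval
    let end: TimeInterval
}

/// Splits a recording of `duration` seconds into consecutive chunks of at most
/// `chunkLength` seconds. The final chunk may be shorter than the others.
func segmentRanges(duration: TimeInterval, chunkLength: TimeInterval = 30) -> [SegmentRange] {
    guard duration > 0, chunkLength > 0 else { return [] }
    var ranges: [SegmentRange] = []
    var start: TimeInterval = 0
    while start < duration {
        let end = min(start + chunkLength, duration)
        ranges.append(SegmentRange(start: start, end: end))
        start = end
    }
    return ranges
}
```

Under these assumptions, a 75-second recording yields three ranges: 0 to 30, 30 to 60, and 60 to 75 seconds.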
- iOS 17.0+
- Xcode 15.0+
- Swift 5.9+
- OpenAI API Key (for transcription service)
- Clone the repository
      git clone <repository-url>
      cd Whisper
- Open in Xcode
      open Whisper.xcodeproj
- Configure API Key
  - See API_SETUP.md for detailed instructions
  - Open Whisper/Config.plist
  - Replace YOUR_API_KEY_HERE with your OpenAI API key
  - The API key is now stored securely and won't be committed to version control
- Build and Run
  - Select your target device or simulator
  - Press Cmd+R to build and run
- Launch the app and grant microphone permissions
- Tap the record button to start recording
- The ring animation will respond to your voice level
- Tap again to stop recording
- Recordings are automatically segmented and sent for transcription
- View all recordings in the main list, grouped by date
- Use the search bar to find recordings by transcription content
- Pull down to refresh and process queued transcriptions
- Tap on a recording to view details (coming in future update)
- When offline, transcriptions are automatically queued
- Queued transcriptions will process when network becomes available
- The app shows clear offline status indicators
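The offline behaviour described above (queue while offline, process once the network returns or on pull-to-refresh) might look roughly like the following. This is a minimal in-memory sketch: `TranscriptionQueue`, its `send` closure, and the `isOnline` flag are assumptions made for illustration, and the app's real queue may be persisted and structured differently.

```swift
import Foundation

/// Hypothetical queue that holds segment IDs until an upload succeeds.
final class TranscriptionQueue {
    private(set) var pending: [UUID] = []
    /// Performs the upload; returns true when it succeeded.
    private let send: (UUID) -> Bool

    init(send: @escaping (UUID) -> Bool) {
        self.send = send
    }

    /// Sends immediately when online; otherwise (or on failure) stores the segment.
    func submit(_ id: UUID, isOnline: Bool) {
        if isOnline && send(id) { return }
        pending.append(id)
    }

    /// Call when connectivity returns or the user pulls to refresh.
    /// Segments that fail again stay queued in their original order.
    func processPending() {
        pending = pending.filter { !send($0) }
    }
}
```

Injecting `send` as a closure keeps the queue independent of the network layer, which also makes it easy to exercise without a real API call.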
The app follows the MVVM (Model-View-ViewModel) architecture pattern:
- Models
  - Recording: Core data model for audio recordings
  - TranscriptionSegment: Individual transcription segments
- Views
  - RecordingView: Main recording interface
  - SwiftUI components with modern design
- ViewModels
  - RecordingViewModel: Manages recording state and business logic
  - Handles data persistence and UI updates
- Services
  - AudioService: Manages AVAudioEngine and recording functionality
  - TranscriptionService: Handles API communication and transcription
- Recording Flow: User Action → RecordingViewModel → AudioService → AVAudioEngine
- Transcription Flow: Audio Segment → TranscriptionService → OpenAI API → SwiftData
- UI Updates: Data Changes → @Published Properties → SwiftUI Views
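To make the recording flow concrete, here is a simplified sketch of how a view model could sit between a user action and the audio layer. It is not the app's actual code: the `AudioRecording` protocol, the `RecordingState` enum, and the `onStateChange` callback are hypothetical, and the callback stands in for an `@Published` property so the example needs only Foundation.

```swift
import Foundation

/// Hypothetical abstraction over the AVAudioEngine-backed AudioService.
protocol AudioRecording {
    func startRecording() throws
    func stopRecording()
}

enum RecordingState: Equatable {
    case idle, recording, failed(String)
}

final class RecordingViewModel {
    private let audio: AudioRecording
    /// Stand-in for an @Published property: observers hear about each change.
    var onStateChange: ((RecordingState) -> Void)?
    private(set) var state: RecordingState = .idle {
        didSet { onStateChange?(state) }
    }

    init(audio: AudioRecording) {
        self.audio = audio
    }

    /// User Action → ViewModel → AudioService: one tap toggles recording.
    func toggleRecording() {
        switch state {
        case .recording:
            audio.stopRecording()
            state = .idle
        case .idle, .failed:
            do {
                try audio.startRecording()
                state = .recording
            } catch {
                state = .failed(String(describing: error))
            }
        }
    }
}
```

Because the view model depends on a protocol rather than on AVAudioEngine directly, a fake recorder can be swapped in for unit tests.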
- SwiftUI: Modern declarative UI framework
- SwiftData: Persistent data storage
- AVAudioEngine: High-performance audio recording
- SFSpeechRecognizer: On-device speech recognition fallback
- Combine: Reactive programming for data binding
The app implements comprehensive error handling:
- Network Errors: Automatic retry with exponential backoff
- API Failures: Fallback to local speech recognition
- Storage Issues: Disk space monitoring and alerts
- Audio Interruptions: Automatic pause/resume handling
- Efficient SwiftData queries with proper indexing
- Lazy loading for large datasets
- Background processing for transcriptions
- Memory management for audio buffers
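The retry-then-fallback behaviour mentioned in the error handling list (exponential backoff for network errors, local speech recognition when the API fails) can be sketched as below. The function names, the 1-second base delay, the 30-second cap, and the three-attempt limit are all assumptions chosen for illustration; the project's actual values are not stated here.

```swift
import Foundation

/// Delay before the retry that follows failed attempt `attempt` (0-based):
/// base, 2×base, 4×base, and so on, capped at `maxDelay`.
func backoffDelay(attempt: Int, base: TimeInterval = 1, maxDelay: TimeInterval = 30) -> TimeInterval {
    min(base * pow(2, Double(attempt)), maxDelay)
}

/// Tries `primary` up to `maxAttempts` times, waiting between failures,
/// then hands over to `fallback` (for example, on-device recognition).
func transcribe(
    maxAttempts: Int = 3,
    primary: () throws -> String,
    fallback: () throws -> String,
    wait: (TimeInterval) -> Void = { Thread.sleep(forTimeInterval: $0) }
) throws -> String {
    for attempt in 0..<maxAttempts {
        do {
            return try primary()
        } catch {
            // Wait only if another attempt will follow.
            if attempt < maxAttempts - 1 {
                wait(backoffDelay(attempt: attempt))
            }
        }
    }
    return try fallback()
}
```

The `wait` parameter is injectable so that tests can record the delays instead of actually sleeping. A production version would likely be `async` and add jitter to the delay.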
The app requires the following permissions:
- Microphone: For audio recording
- Speech Recognition: For fallback transcription
- Background Audio: For continuous recording
All permissions are requested with clear explanations and handled gracefully.
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
For issues and questions:
- Check the existing issues
- Create a new issue with detailed information
- Include device model, iOS version, and steps to reproduce
- Session detail view with transcription segments
- Export functionality for recordings and transcriptions
- Advanced audio editing features
- Cloud sync capabilities
- Voice commands and shortcuts
- Enhanced accessibility features