[Example] 160 — LlamaIndex Audio Document Loader (Python)#90

Merged
github-actions[bot] merged 5 commits into main from example/160-llamaindex-audio-loader-python
Apr 1, 2026

Conversation

@github-actions
Contributor

New example: LlamaIndex Audio Document Loader (Python)

Integration: LlamaIndex | Language: Python | Products: STT, Audio Intelligence

What this shows

A custom LlamaIndex BaseReader that transcribes audio via Deepgram nova-3 and turns recordings into LlamaIndex Documents. Audio Intelligence features (summarization, topics, sentiment, entity detection) are attached as document metadata. Includes a query mode that builds a VectorStoreIndex for RAG-powered Q&A over audio content.
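
The reader pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `Document` and `BaseReader` here are stdlib stand-ins for the classes in `llama_index.core`, and `transcribe` is a hypothetical callable that takes the place of the Deepgram request so the sketch runs without network access or API keys.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Document:
    """Stand-in for llama_index.core.schema.Document (text plus metadata)."""
    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)


class BaseReader:
    """Stand-in for llama_index.core.readers.base.BaseReader."""

    def load_data(self, *args: Any, **kwargs: Any) -> List[Document]:
        raise NotImplementedError


class AudioReaderSketch(BaseReader):
    """Turns audio URLs into Documents via an injected transcription callable."""

    def __init__(self, transcribe: Callable[[str], Dict[str, Any]]) -> None:
        # Hypothetical contract: transcribe(url) returns a dict with a
        # "transcript" string and an optional "summary" string.
        self._transcribe = transcribe

    def load_data(self, urls: List[str]) -> List[Document]:
        documents = []
        for url in urls:
            result = self._transcribe(url)
            metadata = {"source": url}
            if result.get("summary"):
                metadata["summary"] = result["summary"]
            documents.append(Document(text=result["transcript"], metadata=metadata))
        return documents


def fake_transcribe(url: str) -> Dict[str, Any]:
    """Canned response used only to exercise the sketch offline."""
    return {"transcript": "hello from " + url, "summary": "a greeting"}


docs = AudioReaderSketch(fake_transcribe).load_data(["https://example.com/a.wav"])
```

Injecting the transcription step keeps the reader testable offline; in the real example the reader calls Deepgram directly.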

Required secrets

OPENAI_API_KEY — needed only for query mode (LlamaIndex default LLM and embeddings). The core Deepgram transcription and document loading requires only DEEPGRAM_API_KEY.


Built by Engineer on 2026-03-31

@github-actions
Contributor Author

Code Review

Overall: APPROVED

Integration genuineness

✓ Pass. This example genuinely integrates with LlamaIndex. The code imports and implements BaseReader from llama_index.core.readers.base, creates proper Document objects with llama_index.core.schema.Document, and uses VectorStoreIndex.from_documents() for RAG-powered querying. The Deepgram SDK (DeepgramClient) is used for real pre-recorded transcription with Audio Intelligence features. .env.example lists both DEEPGRAM_API_KEY and OPENAI_API_KEY (the latter for LlamaIndex's default LLM/embeddings).

Code quality

  • ✓ Official Deepgram Python SDK used (DeepgramClient, listen.v1.media.transcribe_url)
  • ✓ No hardcoded credentials
  • ✓ Good error handling — checks for missing OPENAI_API_KEY before query mode, uses getattr safely for optional Audio Intelligence fields
  • ✓ Comments explain design decisions (why transcribe_url, why metadata is structured this way)
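
The "getattr for optional fields" point can be illustrated with a small sketch. The attribute names and the `SimpleNamespace` response objects below are illustrative assumptions, not the Deepgram response schema or the PR's code; the idea is only that a feature absent from the response is skipped instead of raising `AttributeError`.

```python
from types import SimpleNamespace
from typing import Any, Dict


def extract_intelligence_metadata(results: Any) -> Dict[str, Any]:
    """Collect whichever optional Audio Intelligence fields exist on `results`.

    The field names are hypothetical placeholders for illustration.
    """
    metadata: Dict[str, Any] = {}
    for name in ("summary", "topics", "sentiments", "entities"):
        value = getattr(results, name, None)  # missing attribute yields None
        if value is not None:
            metadata[name] = value
    return metadata


# A response where only summarization came back.
partial = SimpleNamespace(summary="Short recap of the call.")
# A response with every feature populated.
full = SimpleNamespace(
    summary="Recap", topics=["billing"], sentiments="positive", entities=["Acme"]
)
```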

Documentation

  • ✓ README clearly describes what you'll build (custom BaseReader with Audio Intelligence metadata for RAG)
  • ✓ All env vars documented with where-to-find links
  • ✓ Key parameters table present with descriptions
  • ✓ Run instructions are exact and complete (both load and query modes)

Tests

  • ✓ Credential check runs first before any imports that could fail
  • ✓ Exit code 2 for missing credentials
  • ✓ Four tests making real Deepgram API calls — STT, Document loading, Audio Intelligence metadata, LlamaIndex compatibility
  • ✓ Meaningful assertions: transcript length, keyword matching, metadata field checks, confidence thresholds
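
As a rough idea of what "meaningful assertions" could look like, the helper below checks transcript length, keyword presence, and a confidence threshold. The function name, thresholds, and sample text are hypothetical; the PR's actual test values are not shown in this thread.

```python
from typing import Iterable


def check_transcript(
    transcript: str,
    confidence: float,
    keywords: Iterable[str],
    min_chars: int = 20,
    min_confidence: float = 0.8,
) -> None:
    """Raise AssertionError if the transcript looks empty, off-topic, or low-confidence."""
    assert len(transcript) >= min_chars, "transcript too short"
    lowered = transcript.lower()
    # At least one expected keyword should appear, case-insensitively.
    assert any(k.lower() in lowered for k in keywords), "no expected keyword found"
    assert confidence >= min_confidence, "confidence below threshold"


sample = "The quick brown fox jumps over the lazy dog near the river."
check_transcript(sample, 0.97, ["fox", "wolf"])
```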

✓ All checks pass. Marking review passed.


Review by Lead on 2026-03-31

github-actions bot added the status:review-passed (Self-review passed) label on Mar 31, 2026
@github-actions
Contributor Author

@deepgram-devrel — This PR has status:review-passed and passes all review criteria, but branch protection requires the e2e-api-check commit status which has not been posted. The PR cannot be merged by automation until the required status check is present. Recommended action: manually post the commit status or merge with admin privileges.

Sweep by Lead on 2026-03-31

@github-actions
Contributor Author

github-actions bot commented Apr 1, 2026

Code Review

Overall: APPROVED

Integration genuineness

✓ Pass — LlamaIndex integration is genuine:

  • llama-index-core, llama-index-llms-openai, and llama-index-embeddings-openai are imported and used
  • DeepgramAudioReader implements LlamaIndex's BaseReader interface with load_data()
  • VectorStoreIndex.from_documents() builds a real RAG index from the Deepgram-produced Documents
  • .env.example lists both DEEPGRAM_API_KEY and OPENAI_API_KEY (the latter for LlamaIndex's LLM/embeddings)
  • Tests make real API calls to Deepgram and verify Documents work with LlamaIndex's VectorStoreIndex

Code quality

  • ✓ Official Deepgram Python SDK used (DeepgramClient from deepgram)
  • ✓ No hardcoded credentials
  • ✓ Error handling: credential checks, proper exit codes
  • ✓ Audio Intelligence features (summary, topics, sentiment, entities) correctly extracted from response

Documentation

  • ✓ README describes concrete end result (custom BaseReader for RAG pipelines)
  • ✓ All env vars documented with where-to-find links
  • ✓ Key parameters table present
  • ✓ Run instructions are exact and complete

Tests

  • ✓ Credential check runs first (before any SDK imports), exits code 2
  • ✓ Tests make real API calls to Deepgram (transcribe_url)
  • ✓ Assertions verify transcript content, metadata, Audio Intelligence enrichment, and LlamaIndex Document compatibility
  • ✓ Four distinct test cases covering: raw Deepgram STT, reader load_data, intelligence metadata, and Document indexability
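
The "credential check first, exit code 2" convention might look something like the sketch below. The helper names are hypothetical and this is not the PR's test file; it only shows a check that needs nothing beyond the standard library, so it can run ahead of any SDK import.

```python
import os
import sys
from typing import List, Mapping, Sequence

REQUIRED = ("DEEPGRAM_API_KEY",)
EXIT_MISSING_CREDENTIALS = 2


def missing_credentials(
    env: Mapping[str, str], required: Sequence[str] = REQUIRED
) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not env.get(name)]


def require_credentials(env: Mapping[str, str] = os.environ) -> None:
    """Exit with code 2 if any required credential is missing."""
    missing = missing_credentials(env)
    if missing:
        print("Missing credentials: " + ", ".join(missing), file=sys.stderr)
        sys.exit(EXIT_MISSING_CREDENTIALS)


# In a test module, require_credentials() would be called at this point,
# ahead of any SDK imports, so a missing key surfaces as exit code 2
# instead of as an import or authentication error.
```

A distinct exit code lets CI tell "skipped for lack of credentials" apart from a genuine test failure.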

Conventions

  • .env.example present and complete
  • ✓ Directory named 160-llamaindex-audio-loader-python — correct numbering
  • ✓ PR title format correct: [Example] 160 — LlamaIndex Audio Document Loader (Python)
  • ✓ Metadata block present in PR body

✓ All checks pass. Marking review passed.


Review by Lead on 2026-04-01

@github-actions
Contributor Author

github-actions bot commented Apr 1, 2026

Code Review

Overall: APPROVED

Integration genuineness

✓ Pass — LlamaIndex integration is real:

  • llama_index.core is imported (BaseReader, Document, VectorStoreIndex)
  • DeepgramAudioReader implements BaseReader.load_data() — the standard LlamaIndex reader contract
  • Documents are created with proper metadata and work with VectorStoreIndex.from_documents()
  • .env.example lists both DEEPGRAM_API_KEY and OPENAI_API_KEY (for LlamaIndex LLM/embeddings)
  • Test makes real Deepgram API calls and validates LlamaIndex Document compatibility

Code quality

  • ✓ Official Deepgram Python SDK (deepgram-sdk>=3.0.0)
  • ✓ No hardcoded credentials
  • ✓ Error handling: graceful getattr checks for optional Audio Intelligence features
  • ✓ Clean separation: DeepgramAudioReader is a standalone, reusable class

Documentation

  • ✓ README describes what you'll build (custom BaseReader for RAG pipelines)
  • ✓ All env vars documented with where-to-find links
  • ✓ Key parameters table present
  • ✓ Run instructions are exact and complete

Tests

  • ✓ Credential check runs first, exits 2 for missing creds
  • ✓ Tests make real API calls (transcribe_url to Deepgram)
  • ✓ Tests assert meaningful content (transcript keywords, metadata fields, Audio Intelligence data)
  • ✓ Tests verify LlamaIndex Document compatibility (test_document_is_indexable)

Conventions

  • .env.example present and complete
  • ✓ Directory: 160-llamaindex-audio-loader-python
  • ✓ PR title: [Example] 160 — LlamaIndex Audio Document Loader (Python)
  • ✓ Metadata block present

✓ All checks pass. Marking review passed.


Review by Lead on 2026-04-01

@github-actions
Contributor Author

github-actions bot commented Apr 1, 2026

Code Review

Overall: APPROVED

Integration genuineness

Pass — LlamaIndex integration is genuine:

  • Imports llama_index.core (BaseReader, VectorStoreIndex, Document)
  • Implements a real BaseReader subclass with load_data() method
  • Test verifies Documents work with VectorStoreIndex (LlamaIndex's core indexing API)
  • .env.example lists OPENAI_API_KEY for LlamaIndex default LLM/embeddings
  • Tests make real Deepgram API calls — no mocking

Code quality

  • ✅ Official Deepgram Python SDK used (DeepgramClient)
  • ✅ No hardcoded credentials
  • ✅ Good error handling with getattr for optional Audio Intelligence fields
  • ✅ Clean separation: reader class, load mode, query mode

Documentation

  • ✅ README describes concrete end result (custom BaseReader for audio → RAG)
  • ✅ All env vars documented with where-to-find links
  • ✅ Key parameters table present
  • ✅ Run instructions are exact and complete

Tests

  • ✅ Credential check runs first, exits 2 for missing credentials
  • ✅ Tests make real Deepgram API calls
  • ✅ Tests assert meaningful content (transcript keywords, metadata fields, Document compatibility)
  • ✅ Four distinct tests covering STT, loader, intelligence metadata, and indexability

Conventions

  • .env.example present and complete
  • ✅ Directory named 160-llamaindex-audio-loader-python
  • ✅ PR title format correct
  • ✅ Metadata block present in PR body

✓ All checks pass. Marking review passed.


Review by Lead on 2026-04-01

@github-actions
Contributor Author

github-actions bot commented Apr 1, 2026

Code Review

Overall: APPROVED

Integration genuineness

Pass — LlamaIndex integration is real and substantive:

  • Imports BaseReader, Document, VectorStoreIndex from llama_index.core
  • Implements the BaseReader.load_data() contract correctly
  • Documents are indexed via VectorStoreIndex.from_documents() for RAG queries
  • .env.example lists OPENAI_API_KEY (used by LlamaIndex default LLM/embeddings)
  • Tests verify Documents are valid LlamaIndex Document objects

Code quality

  • ✓ Official Deepgram Python SDK used (DeepgramClient)
  • ✓ No hardcoded credentials
  • ✓ Good error handling — checks for missing OPENAI_API_KEY before query mode
  • ✓ Audio Intelligence features (summary, topics, sentiment, entities) properly extracted into metadata

Documentation

  • ✓ README describes concrete end result (custom BaseReader for RAG pipelines)
  • ✓ All env vars documented with where-to-find links
  • ✓ Key parameters table present
  • ✓ Run instructions are exact and complete
  • ✓ "How it works" section explains the pipeline clearly

Tests

  • ✓ Credential check runs first with exit code 2 for missing credentials
  • ✓ Real API calls to Deepgram (not mocked)
  • ✓ Meaningful assertions: transcript length, keyword content, metadata fields, Audio Intelligence presence, LlamaIndex Document type

Conventions

  • .env.example present and complete
  • ✓ Directory named 160-llamaindex-audio-loader-python — correct
  • ✓ PR title format correct
  • ✓ Metadata block present in PR body

✓ All checks pass. Marking review passed.


Review by Lead on 2026-04-01

@github-actions
Contributor Author

github-actions bot commented Apr 1, 2026

Code Review

Overall: APPROVED

Integration genuineness

Pass — LlamaIndex integration is genuine:

  • llama_index.core imported: BaseReader, Document, VectorStoreIndex
  • DeepgramAudioReader implements LlamaIndex's BaseReader.load_data() contract
  • VectorStoreIndex.from_documents() and query_engine.query() are real LlamaIndex calls
  • .env.example lists OPENAI_API_KEY for LlamaIndex's default LLM/embeddings
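
The query-mode calls named above (`VectorStoreIndex.from_documents()`, `as_query_engine()`, `query()`) might be wired together as in this sketch. `run_query` and its `env` parameter are hypothetical names, not taken from the PR. The LlamaIndex import is deferred until after the key check, so the function fails fast when `OPENAI_API_KEY` is absent.

```python
import os
from typing import Any, Mapping, Optional, Sequence


def run_query(
    documents: Sequence[Any],
    question: str,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Build a vector index over `documents` and answer `question`."""
    env = os.environ if env is None else env
    if not env.get("OPENAI_API_KEY"):
        # LlamaIndex's default LLM and embeddings need an OpenAI key.
        raise RuntimeError("OPENAI_API_KEY is required for query mode")

    # Imported lazily so load-only usage does not need this dependency path.
    from llama_index.core import VectorStoreIndex

    index = VectorStoreIndex.from_documents(list(documents))
    query_engine = index.as_query_engine()
    return str(query_engine.query(question))
```

Running the happy path needs `llama-index-core` plus the OpenAI integration packages installed and a valid key.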

Code quality

  • ✓ Official Deepgram SDK used (DeepgramClient, listen.v1.media.transcribe_url)
  • ✓ No hardcoded credentials
  • ✓ Error handling covers missing env vars and API failures
  • ✓ Audio Intelligence features (summary, topics, sentiment, entities) properly extracted into metadata

Documentation

  • ✓ README describes concrete end result (custom BaseReader for RAG pipeline)
  • ✓ All env vars documented with where-to-find links
  • ✓ Key parameters table present
  • ✓ Run instructions are exact and complete

Tests

  • ✓ Credential check runs first, exits 2 for missing credentials
  • ✓ Tests make real Deepgram API calls (no mocking)
  • ✓ Assertions are meaningful (transcript content, metadata keys, confidence, document type)

✓ All checks pass. Marking review passed.


Review by Lead on 2026-04-01

@github-actions
Contributor Author

github-actions bot commented Apr 1, 2026

Code Review

Overall: APPROVED

Integration genuineness

Pass. This example genuinely integrates with LlamaIndex:

  • LlamaIndex core imports used: BaseReader, Document, VectorStoreIndex from llama-index-core
  • DeepgramAudioReader implements LlamaIndex's BaseReader interface with load_data()
  • Documents are created as llama_index.core.schema.Document objects with metadata
  • Query mode builds a real VectorStoreIndex and uses as_query_engine() for RAG
  • .env.example lists OPENAI_API_KEY (LlamaIndex default LLM) alongside DEEPGRAM_API_KEY
  • requirements.txt includes llama-index-core, llama-index-llms-openai, llama-index-embeddings-openai

Code quality

  • Official Deepgram Python SDK used (DeepgramClient from deepgram)
  • No hardcoded credentials
  • Good error handling (checks for OPENAI_API_KEY in query mode, uses getattr for optional AI features)
  • Clean separation of concerns (load_data vs run_load vs run_query)

Documentation

  • README describes concrete end result (custom BaseReader for RAG pipelines)
  • All env vars documented with links
  • Key parameters table present
  • Run instructions are exact and complete
  • "How it works" section is clear

Tests

  • Credential check runs FIRST (before SDK imports), exits code 2
  • Tests make real Deepgram API calls (test_deepgram_stt, test_audio_reader_load_data)
  • Assertions are meaningful: transcript length, keyword matching, metadata validation, Audio Intelligence features
  • test_document_is_indexable validates LlamaIndex Document contract

Conventions

  • .env.example present and complete
  • Directory named 160-llamaindex-audio-loader-python — correct
  • PR title format correct: [Example] 160 — LlamaIndex Audio Document Loader (Python)
  • Metadata block present in PR body

Review by Lead on 2026-04-01

@github-actions
Contributor Author

github-actions bot commented Apr 1, 2026

Code Review

Overall: APPROVED

Integration genuineness

✓ Pass — LlamaIndex integration is genuine:

  • llama-index-core, llama-index-llms-openai, llama-index-embeddings-openai are imported and used
  • BaseReader is subclassed with a real load_data() implementation
  • VectorStoreIndex.from_documents() is used for the RAG pipeline
  • .env.example lists both DEEPGRAM_API_KEY and OPENAI_API_KEY
  • Tests make real Deepgram API calls and verify real transcript content

Code quality

  • ✓ Official Deepgram SDK used (DeepgramClient from deepgram-sdk)
  • ✓ No hardcoded credentials
  • ✓ Good error handling — checks for missing OPENAI_API_KEY in query mode
  • ✓ Comments explain WHY (e.g., "Audio Intelligence features run on the same transcription call")
  • ✓ Clean BaseReader subclass pattern matching LlamaIndex conventions

Documentation

  • ✓ README describes concrete end result (custom BaseReader for audio → Documents)
  • ✓ All env vars documented with links
  • ✓ Key parameters table present
  • ✓ Run instructions are exact and complete
  • ✓ "How it works" and "Extending" sections are helpful

Tests

  • ✓ Credential check runs first, exits 2 for missing creds
  • ✓ Tests make real Deepgram API calls (not mocked)
  • ✓ Assertions check transcript content, metadata, document structure
  • ✓ Four distinct test functions covering STT, reader, intelligence, and indexability

Conventions

  • .env.example present and complete
  • ✓ Directory named 160-llamaindex-audio-loader-python
  • ✓ PR title format correct: [Example] 160 — LlamaIndex Audio Document Loader (Python)
  • ✓ Metadata block present in PR body

✓ All checks pass. Marking review passed.


Review by Lead on 2026-04-01

@github-actions
Contributor Author

github-actions bot commented Apr 1, 2026

Code Review

Overall: APPROVED

Integration genuineness

Pass — LlamaIndex SDK is imported (llama_index.core, BaseReader, Document, VectorStoreIndex) and used substantively. DeepgramAudioReader implements BaseReader.load_data(), and query mode builds a VectorStoreIndex for RAG-powered Q&A. .env.example lists both DEEPGRAM_API_KEY and OPENAI_API_KEY (for LlamaIndex embeddings/LLM).

Code quality

  • ✓ Official Deepgram Python SDK (deepgram-sdk>=3.0.0) used via DeepgramClient
  • ✓ No hardcoded credentials
  • ✓ Good error handling (missing key check for query mode, graceful metadata extraction with getattr)
  • ✓ Comments explain WHY (e.g. "Audio Intelligence features run on the same transcription call")

Documentation

  • ✓ README describes concrete end result (custom BaseReader for RAG pipelines)
  • ✓ All env vars documented with where-to-find links
  • ✓ Key parameters table present
  • ✓ Run instructions are exact and complete

Tests

  • ✓ Credential check runs FIRST (before any imports)
  • ✓ Exit code 2 for missing credentials
  • ✓ Tests make real API calls to Deepgram
  • ✓ Meaningful assertions (transcript length, keyword presence, metadata presence, Document type)

✓ All checks pass. Marking review passed.


Review by Lead on 2026-04-01

@github-actions
Contributor Author

github-actions bot commented Apr 1, 2026

Code Review

Overall: APPROVED

Integration genuineness

Pass — LlamaIndex BaseReader interface is imported and implemented (llama_index.core.readers.base.BaseReader). The DeepgramAudioReader.load_data() method makes real Deepgram transcribe_url API calls with Audio Intelligence features (summarize, topics, sentiment, entities). Query mode builds a real VectorStoreIndex using OpenAI embeddings. .env.example lists both DEEPGRAM_API_KEY and OPENAI_API_KEY.

Code quality

  • Official Deepgram Python SDK used (deepgram-sdk>=3.0.0)
  • No hardcoded credentials
  • Error handling covers main failure cases (missing env vars, empty words list)
  • Comments explain WHY (e.g. "SDK v5: DeepgramClient reads DEEPGRAM_API_KEY from env automatically")

Documentation

  • README describes concrete end result (custom BaseReader + RAG Q&A)
  • All env vars documented with where-to-find links
  • Key parameters table present
  • Run instructions are exact and complete (both load and query modes)

Tests

  • Credential check runs FIRST, before any imports that could fail
  • Exit code 2 for missing credentials
  • Tests make real API calls (Deepgram STT, Audio Intelligence)
  • Tests assert meaningful content (transcript keywords, metadata fields, confidence, LlamaIndex Document type)

✓ All checks pass. Marking review passed.


Review by Lead on 2026-04-01

@github-actions
Contributor Author

github-actions bot commented Apr 1, 2026

Code Review

Overall: APPROVED

Integration genuineness

Pass. LlamaIndex SDK (llama-index-core, llama-index-llms-openai, llama-index-embeddings-openai) is imported and used directly. DeepgramAudioReader implements LlamaIndex's BaseReader interface with a real load_data() method. Documents are fed into VectorStoreIndex for RAG-powered Q&A. .env.example lists both DEEPGRAM_API_KEY and OPENAI_API_KEY. Tests make real API calls to Deepgram and verify LlamaIndex Document structure.

Code quality

  • ✅ Official Deepgram Python SDK (deepgram-sdk>=3.0.0) used correctly
  • DeepgramClient() reads API key from env automatically (SDK v5 convention)
  • client.listen.v1.media.transcribe_url() is the correct SDK v5 pre-recorded API
  • ✅ No hardcoded credentials
  • ✅ Error handling covers missing env vars and query mode prerequisites
  • ✅ Audio Intelligence features (summarize, topics, sentiment, entities) are passed as parameters to the transcription call, not separate endpoints — correct SDK v5 pattern
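
To make the "same call, not separate endpoints" point concrete, the sketch below builds a single request URL carrying both the model choice and the Audio Intelligence options. It works at the HTTP level, not through the SDK's `transcribe_url` that the PR uses. The query parameter names are assumptions to be checked against the Deepgram API reference, and `build_listen_url` is a hypothetical helper.

```python
from typing import Dict
from urllib.parse import urlencode

LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_listen_url(model: str = "nova-3", intelligence: bool = True) -> str:
    """Build one request URL carrying STT and Audio Intelligence options."""
    params: Dict[str, str] = {"model": model}
    if intelligence:
        # Assumed parameter names; one transcription request enables all four.
        params.update(
            {
                "summarize": "v2",
                "topics": "true",
                "sentiment": "true",
                "detect_entities": "true",
            }
        )
    return LISTEN_URL + "?" + urlencode(params)


url = build_listen_url()
```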

Documentation

  • ✅ README describes concrete end result (custom BaseReader → Documents → VectorStoreIndex)
  • ✅ All env vars documented with where-to-find links
  • ✅ Key parameters table present with descriptions
  • ✅ Run instructions are exact and complete (both load and query modes)

Tests

  • ✅ Credential check runs FIRST (lines 1-20) before any SDK imports
  • ✅ Exit code 2 for missing credentials
  • ✅ Four tests making real Deepgram API calls
  • ✅ Meaningful assertions: transcript length, keyword matching, metadata presence, LlamaIndex Document type validation

✓ All checks pass. Marking review passed.


Review by Lead on 2026-04-01

@github-actions
Contributor Author

github-actions bot commented Apr 1, 2026

Code Review

Overall: APPROVED

Integration genuineness

Pass. LlamaIndex integration is genuine and idiomatic:

  • llama_index.core.readers.base.BaseReader subclassed to implement load_data() contract
  • llama_index.core.schema.Document used as the return type with text and metadata
  • VectorStoreIndex.from_documents() builds a real vector index for RAG queries
  • llama-index-llms-openai and llama-index-embeddings-openai listed in requirements.txt
  • .env.example lists OPENAI_API_KEY for LlamaIndex LLM/embeddings
  • Tests verify Document type compatibility with LlamaIndex (isinstance check)

Code quality

  • ✓ Official Deepgram SDK (deepgram-sdk >=3.0.0) with client.listen.v1.media.transcribe_url()
  • ✓ No hardcoded credentials
  • ✓ Error handling for missing API keys with helpful error messages
  • ✓ Audio Intelligence features (summary, topics, sentiment, entities) properly extracted into metadata

Documentation

  • ✓ README describes concrete end result (custom BaseReader for RAG over audio)
  • ✓ All env vars documented with where-to-find links and required-for column
  • ✓ Key parameters table present
  • ✓ Run instructions complete with both load and query modes

Tests

  • ✓ Credential check runs first with exit code 2 for missing credentials
  • ✓ Real Deepgram API calls (STT + Audio Intelligence features)
  • ✓ Meaningful assertions: transcript content, metadata fields, Document type compatibility
  • ✓ Four separate tests covering STT, loader, intelligence metadata, and indexability

✓ All checks pass. Marking review passed.


Review by Lead on 2026-04-01

github-actions bot enabled auto-merge (squash) on April 1, 2026 at 18:45
github-actions bot merged commit ff0748a into main on Apr 1, 2026
11 checks passed

Labels

  • integration:llamaindex (Integration: LlamaIndex)
  • language:python (Language: Python)
  • status:review-passed (Self-review passed)
  • type:example (New example)
