45 changes: 45 additions & 0 deletions .gitignore
@@ -0,0 +1,45 @@
# See https://help.github.com/articles/ignoring-files/ for more about ignoring files.

# dependencies
/node_modules
/.pnp
.pnp.*
.yarn/*
!.yarn/patches
!.yarn/plugins
!.yarn/releases
!.yarn/versions

# testing
/coverage

# next.js
/.next/
/out/

# production
/build

# misc
.DS_Store
*.pem

# debug
npm-debug.log*
yarn-debug.log*
yarn-error.log*
.pnpm-debug.log*

# env files (can opt-in for committing if needed)
.env*

# vercel
.vercel

# typescript
*.tsbuildinfo
next-env.d.ts
node_modules

/src/generated/prisma
*.db
50 changes: 50 additions & 0 deletions IMPLEMENTATION_PLAN.md
@@ -0,0 +1,50 @@
# Implementation Plan - Podcast Generator

## Phase 0: Git Setup
- [x] Check if the current directory is an initialized git repository.
- [x] If it is, create and checkout a new feature branch named `podcast-generator`.

## Phase 1: Environment & Project Setup
- [x] Initialize a new Next.js project with TypeScript, Tailwind CSS, and App Router.
- [x] Install backend dependencies: `prisma`, `@prisma/client` (or `better-sqlite3`), `rss-parser`, `fluent-ffmpeg`, `@google/generative-ai` (Gemini SDK), `@google-cloud/text-to-speech`.
- [x] Install frontend dependencies: `lucide-react` (for icons), `axios` or use `fetch`.
- [x] Configure Environment Variables (`.env.local`): `GOOGLE_AI_API_KEY`, `GOOGLE_APPLICATION_CREDENTIALS`, `DATABASE_URL`.
- [x] Verify `ffmpeg` installation on the local machine.

## Phase 2: Database & Data Access
- [x] Initialize Prisma with SQLite provider (`npx prisma init --datasource-provider sqlite`).
- [x] Define `Feed` model in `prisma/schema.prisma` (id, url, title, createdAt).
- [x] Define `Podcast` model in `prisma/schema.prisma` (id, title, filePath, duration, createdAt, summary).
- [x] Run migration to create the SQLite database (`npx prisma migrate dev --name init`).
- [x] Create a `db.ts` lib file to export the singleton Prisma client instance.
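
A minimal sketch of the singleton pattern that a `db.ts` like this might use. To stay runnable without `@prisma/client`, it uses a stand-in class (`FakeClient`) in place of `PrismaClient`; the names `FakeClient`, `getClient`, and the `__dbClient` cache key are illustrative assumptions, not taken from the PR.

```typescript
// Stand-in for PrismaClient so the sketch runs without any dependency (assumption).
class FakeClient {
  static instances = 0;
  constructor() {
    FakeClient.instances += 1;
  }
}

// The cache lives on globalThis so it survives module re-evaluation,
// for example during hot reload of a dev server.
const globalCache = globalThis as unknown as { __dbClient?: FakeClient };

// Return the cached client, creating it on first use only.
function getClient(): FakeClient {
  if (!globalCache.__dbClient) {
    globalCache.__dbClient = new FakeClient();
  }
  return globalCache.__dbClient;
}
```

Every caller that goes through `getClient()` shares one instance, which avoids opening a new database connection per import.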

## Phase 3: Backend Services (Core Logic)
- [x] **RSS Service**: Create `lib/services/rss.ts`. Implement `fetchLatestArticles(feeds)` using `rss-parser`. Filter articles from the last 24 hours.
- [x] **AI Service (Summarization)**: Create `lib/services/gemini.ts`. Implement `generatePodcastScript(articles)` using the Gemini API. Construct the prompt as specified in the Tech Spec.
- [x] **TTS Service (Synthesis)**: Create `lib/services/tts.ts`. Implement `synthesizeSpeech(text)` using Google Cloud TTS (Chirp model). Handle text chunking if necessary.
- [x] **Audio Service (Processing)**: Create `lib/services/audio.ts`. Implement `concatAudioSegments(segments, outputPath)` using `fluent-ffmpeg` to merge Intro + Body + Outro.
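
Two of the steps above are pure logic and can be sketched without any SDK: the 24-hour article filter and the text chunking ahead of TTS. This is a sketch under assumptions: the `FeedArticle` shape, the function names `filterRecentArticles` and `chunkText`, and the sentence-boundary splitting rule are illustrative, and the real character limit for a TTS request is not stated in the plan.

```typescript
// Hypothetical article shape; only an optional ISO date string is assumed.
interface FeedArticle {
  title: string;
  link: string;
  isoDate?: string;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Keep only articles published within the 24 hours before `now`.
// Items with a missing or unparseable date are dropped.
function filterRecentArticles(articles: FeedArticle[], now: Date = new Date()): FeedArticle[] {
  const end = now.getTime();
  const cutoff = end - DAY_MS;
  return articles.filter((article) => {
    if (!article.isoDate) return false;
    const published = Date.parse(article.isoDate);
    return !Number.isNaN(published) && published >= cutoff && published <= end;
  });
}

// Split text into chunks of at most `maxChars` characters, preferring
// sentence boundaries. A single sentence longer than `maxChars` is hard-split.
function chunkText(text: string, maxChars: number): string[] {
  if (maxChars <= 0) throw new RangeError("maxChars must be positive");
  const sentences = text.match(/[^.!?]+[.!?]*\s*/g) ?? [];
  const chunks: string[] = [];
  const pushTrimmed = (piece: string): void => {
    const trimmed = piece.trim();
    if (trimmed.length > 0) chunks.push(trimmed);
  };
  let current = "";
  for (const raw of sentences) {
    let sentence = raw;
    // Flush the current chunk if this sentence would overflow it.
    if (current.length > 0 && current.length + sentence.length > maxChars) {
      pushTrimmed(current);
      current = "";
    }
    // Hard-split sentences that do not fit in a single chunk.
    while (sentence.length > maxChars) {
      pushTrimmed(sentence.slice(0, maxChars));
      sentence = sentence.slice(maxChars);
    }
    current += sentence;
  }
  pushTrimmed(current);
  return chunks;
}
```

Passing `now` explicitly keeps the filter deterministic in tests; in the service it would default to the current time.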

## Phase 4: API Routes
- [x] **Feed Endpoints**: Create `app/api/feeds/route.ts` and `app/api/feeds/[id]/route.ts`. Implement GET (list), POST (add & validate), DELETE (remove).
- [x] **Podcast Endpoints**: Create `app/api/podcasts/route.ts`. Implement GET (list history).
- [x] **Generation Endpoint**: Create `app/api/podcasts/generate/route.ts`. Implement POST. This should orchestrate the full pipeline: Fetch RSS -> Summarize -> TTS -> Concat -> Save to DB -> Return Result.
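
The generation pipeline described above (Fetch RSS -> Summarize -> TTS -> Concat -> Save to DB) can be sketched as a single function that receives its services as parameters. This is an illustration under assumptions, not the PR's route handler: the `PipelineDeps` interface, the `savePodcast` helper, the `PipelineOptions` fields, and the returned record shape are hypothetical; only the four service function names come from the plan.

```typescript
interface PipelineArticle {
  title: string;
  link: string;
}

interface PodcastRecord {
  title: string;
  filePath: string;
  summary: string;
}

// Hypothetical service boundaries mirroring the Phase 3 functions.
interface PipelineDeps {
  fetchLatestArticles: (feedUrls: string[]) => Promise<PipelineArticle[]>;
  generatePodcastScript: (articles: PipelineArticle[]) => Promise<string>;
  synthesizeSpeech: (text: string) => Promise<string>; // resolves to an audio segment path
  concatAudioSegments: (segments: string[], outputPath: string) => Promise<void>;
  savePodcast: (record: PodcastRecord) => Promise<PodcastRecord>; // assumed DB helper
}

interface PipelineOptions {
  feedUrls: string[];
  introPath: string;
  outroPath: string;
  outputPath: string;
  title: string;
}

// Run each stage in order; any rejected stage aborts the whole run.
async function runGenerationPipeline(
  options: PipelineOptions,
  deps: PipelineDeps,
): Promise<PodcastRecord> {
  const articles = await deps.fetchLatestArticles(options.feedUrls);
  if (articles.length === 0) {
    throw new Error("No recent articles found for the configured feeds");
  }
  const script = await deps.generatePodcastScript(articles);
  const bodyPath = await deps.synthesizeSpeech(script);
  await deps.concatAudioSegments(
    [options.introPath, bodyPath, options.outroPath],
    options.outputPath,
  );
  return deps.savePodcast({
    title: options.title,
    filePath: options.outputPath,
    summary: script,
  });
}
```

Injecting the services keeps the orchestration testable with stubs, so the ordering and the empty-feed case can be checked without calling Gemini, Cloud TTS, or ffmpeg.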

## Phase 5: Frontend - UI Components
- [x] **Layout & Hero**: Update `app/page.tsx` with a responsive layout. Add the Hero section with the "Robot agent reading news" image (placeholder or generated).
- [x] **Feed Manager Component**: Create `components/FeedManager.tsx`. Implement a form to add URLs and a list to display/delete current feeds. Connect to `/api/feeds`.
- [x] **Podcast History & Player**: Create `components/PodcastPlayer.tsx`. Display a list of generated podcasts. When selected, play the audio file using the standard `<audio>` tag. Connect to `/api/podcasts`.
- [x] **Generate Button**: Create a prominent "Generate Podcast" button in `app/page.tsx`. Connect to `/api/podcasts/generate` and handle loading states (this might take a while to process).

## Phase 6: Integration & Polish
- [x] Verify the full flow: Add Feed -> Generate -> Wait -> Play Audio.
- [x] Add basic error handling (invalid RSS, API failures).
- [x] Refine the Prompt engineering for Gemini to ensure the script sounds natural.
- [x] Ensure the generated audio files are accessible publicly (saved in `public/podcasts` or served via a static route).

## Phase 7: Completion & Version Control
- [x] Verify application functionality.
- [x] Create a `README.md` file explaining the application functions, how to interact with them, the architecture, file breakdown and how to run and test it locally.
- [x] Add all changes to the repository (`git add .`).
- [x] Commit the changes (`git commit -m "Complete implementation of Podcast Generator"`).
- [x] Push the feature branch to the remote repository, creating a branch with the same name in the remote repository, using the Gemini CLI github MCP server.
- [x] Open a pull request for the feature branch using the Gemini CLI github MCP server, leave it open for review, don't merge it.
139 changes: 50 additions & 89 deletions README.md
@@ -1,94 +1,55 @@
# Spec-Driven Development with Gemini CLI

This repo has some basic assets to experiment with **Spec-Driven Development** using the Gemini CLI. You will act as a developer going from a raw Functional Specification to a deployed Pull Request in a single session.

## Assets

* `.gemini/commands/`: Contains configuration files for custom commands (`techspec`, `plan`, `build`).
* `GEMINI.md`: Contains project rules and guidelines.
* `.github/workflows`: Contains CI workflow.
* **No application code**.
# Podcast Generator

A Next.js application that generates personalized audio news podcasts from RSS feeds using Google Gemini for summarization and Google Cloud Chirp for high-quality text-to-speech.

## Features

- **RSS Feed Management**: Add and remove RSS feeds.
- **AI Summarization**: Automatically fetches and summarizes the latest articles from your feeds using Gemini 1.5 Flash.
- **Audio Generation**: Converts summaries into natural-sounding speech using Google's Chirp models.
- **Podcast Player**: Listen to your generated daily briefings directly in the app.

## Architecture

- **Frontend**: Next.js (App Router), Tailwind CSS.
- **Backend**: Next.js API Routes.
- **Database**: SQLite (via Prisma & Better-SQLite3).
- **AI Services**: Google Gemini API, Google Cloud Text-to-Speech.
- **Audio Processing**: fluent-ffmpeg.

## Setup

1. **Clone the repository**.
2. **Install dependencies**:
   ```bash
   npm install
   ```
3. **Environment Variables**:
   Create a `.env.local` file with the following:
   ```env
   GOOGLE_AI_API_KEY=your_gemini_api_key
   GOOGLE_APPLICATION_CREDENTIALS=path/to/your/google_cloud_credentials.json
   DATABASE_URL="file:./dev.db"
   ```
4. **Database Migration**:
   ```bash
   npx prisma migrate dev --name init
   ```
5. **Run Development Server**:
   ```bash
   npm run dev
   ```

## Requirements

The `GEMINI.md` configuration and custom commands require the following extensions:
* **Google Workspace**
* **Nano Banana**
* **GitHub**

---

## Step 1: The Architect Phase (/techspec)

**Goal:** Transform a Functional Spec (Google Doc) into a Technical Spec (Google Doc).

1. **Command:**
   ```
   /techspec "Name of your functional specs doc" "Your desired technology stack and requirements"
   ```

2. **What Happens:**
   * The agent searches your Drive for the doc.
   * It reads the requirements.
   * It generates a **Technical Specification** including Data Models, API Routes, and Architecture based on your inputs.
   * **Output:** It creates a *new* Google Doc titled "Technical Specification - Application name" and gives you the link.

---

## Step 2: The Planning Phase (/plan)

**Goal:** Break the Technical Spec down into an atomic Implementation Plan.

1. **Command:**
   ```
   /plan "Name of your Tech spec doc"
   ```
   *(Use the exact name of the doc generated in Step 1)*

2. **What Happens:**
   * The agent reads the Tech Spec.
   * It creates a local file `IMPLEMENTATION_PLAN.md`.
   * It breaks the project into phases (e.g., Setup, Backend, Frontend, Polish).
   * It defines the Git strategy.

---

## Step 3: The Build Phase (/build)

**Goal:** Execute the plan and write the code.

1. **Command:**
   ```
   /build IMPLEMENTATION_PLAN.md "Name of your Tech spec doc"
   ```

2. **What Happens (Iterative):**
   * **Execution:** The agent iterates through the plan, initializing the project structure and writing the application code.
   * **Visuals:** It generates necessary visual assets (images, icons) as defined in the spec.
   * **Progress:** It updates `IMPLEMENTATION_PLAN.md` as tasks are completed.

---

## Step 4: Final Delivery

**Goal:** Push the code and open a Pull Request.

1. **Action:**
   The `/build` command's final phase usually covers this, or you can manually instruct the agent to finalize the project.

2. **What Happens:**
   * The agent runs final checks (linting/formatting).
   * It creates a `README.md` for the new application.
   * It commits all changes.
   * It pushes the feature branch to GitHub.
   * It uses the GitHub extension to **Open a Pull Request**.

---
- Node.js 18+
- ffmpeg installed on the system.
- Google Cloud Project with Vertex AI / Gemini API and Text-to-Speech API enabled.

## Summary of Commands
## Usage

| Step | Command | Input | Output |
| :--- | :--- | :--- | :--- |
| **1. Spec** | `/techspec` | Functional Doc (Drive) | Tech Spec (Drive) |
| **2. Plan** | `/plan` | Tech Spec (Drive) | `IMPLEMENTATION_PLAN.md` |
| **3. Build** | `/build` | Plan + Tech Spec | Code, Assets, App |
1. Open `http://localhost:3000`.
2. Add RSS feed URLs (e.g., `https://rss.nytimes.com/services/xml/rss/nyt/HomePage.xml`).
3. Click "Generate New Episode".
4. Wait for the process to complete (this can take a minute).
5. Play the generated episode from the list.
18 changes: 18 additions & 0 deletions eslint.config.mjs
@@ -0,0 +1,18 @@
import { defineConfig, globalIgnores } from "eslint/config";
import nextVitals from "eslint-config-next/core-web-vitals";
import nextTs from "eslint-config-next/typescript";

const eslintConfig = defineConfig([
  ...nextVitals,
  ...nextTs,
  // Override default ignores of eslint-config-next.
  globalIgnores([
    // Default ignores of eslint-config-next:
    ".next/**",
    "out/**",
    "build/**",
    "next-env.d.ts",
  ]),
]);

export default eslintConfig;