Commit 194b621

committed
PRDs
1 parent 83e7a7f commit 194b621

3 files changed

Lines changed: 1224 additions & 78 deletions

Lines changed: 359 additions & 0 deletions
@@ -0,0 +1,359 @@
# 3D Model Generator PRD (Image → 3D GLB)

## Overview

- **Context & Goals**

  - Enable in-editor generation of 3D assets as `.glb` via a two-step flow: image generation (OpenRouter) → image-to-3D (Replicate Hunyuan3D).
  - Integrate outputs with existing asset folder conventions and optimization pipelines to produce LODs and compressed, runtime-ready assets.
  - Support interactive UX for prompt authoring, retry, style presets, and approval before 3D generation.
  - Ensure generated assets land under `src/game/assets/models` and are synced/optimized for runtime.

- **Current Pain Points**
  - Manual asset creation/import slows iteration and requires external tools.
  - Inconsistent folder structures and optimization steps cause runtime performance variance.
  - Lack of a guided, discoverable editor UX for asset generation and post-processing.

## Proposed Solution

- **High-level Summary**

  - Add a multi-step Model Generator modal (stepper UX) in the editor.
  - Step 1 uses `openrouter` with `openai/gpt-5-image-mini` to generate a transparent-background image; it supports retry and prompt/style tuning.
  - Step 2 calls Replicate Hunyuan3D to produce a `.glb` from the selected image.
  - Save generated assets under `src/game/assets/models/<ModelName>/` in the `glb/`, `lod/`, and `textures/` subfolders, then run optimization with `scripts/optimize.js`, optionally falling back to or augmenting with `scripts/optimize-models.js`.
  - Reuse and extend the existing `assets-api` Vite server plugin, or add a new `generator-api` plugin, to securely call external APIs and write files.

- **Architecture & Directory Structure**

  ```
  /src
  ├── editor/
  │   ├── components/
  │   │   └── assets/
  │   │       └── ModelGeneratorModal.tsx   # Stepper modal UI (like TerrainWizard pattern)
  │   ├── hooks/
  │   │   └── useModelGenerator.ts          # Business logic, prompt state, retries, flow
  │   └── services/
  │       └── generatorClient.ts            # Client for /api/generator endpoints
  ├── plugins/
  │   ├── assets-api/
  │   │   └── createAssetsApi.ts            # Existing; may extend to support model saves
  │   └── generator-api/
  │       └── createGeneratorApi.ts         # New: server-side calls to OpenRouter/Replicate
  ├── game/
  │   └── assets/
  │       └── models/
  │           └── <ModelName>/
  │               ├── glb/<fileName>.glb    # Generated base
  │               ├── lod/<fileName>.high_fidelity.glb
  │               ├── lod/<fileName>.low_fidelity.glb
  │               └── textures/             # Optional baked/preview textures
  └── scripts/
      ├── optimize.js                       # Current pipeline (MODELS_DIR=src/...)
      └── optimize-models.js                # Public assets pipeline (optional augmentation)

  .env
  ├── OPENROUTER_API_KEY
  ├── OPENROUTER_BASE_URL=https://openrouter.ai/api/v1
  ├── OPENROUTER_IMAGE_MODEL=openai/gpt-5-image-mini
  └── REPLICATE_API_TOKEN
  ```
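The folder convention above could be centralized in one small helper so that the save, optimize, and "Add to Scene" steps all agree on paths. The sketch below is illustrative only: `modelAssetPaths`, `ModelAssetPaths`, and `MODELS_ROOT` are hypothetical names, not existing code in this repository.

```typescript
// Hypothetical helper; names and return shape are assumptions for illustration.
const MODELS_ROOT = 'src/game/assets/models';

export interface ModelAssetPaths {
  baseDir: string; // <root>/<ModelName>
  glb: string; // generated base mesh
  lodHigh: string; // high-fidelity LOD variant
  lodLow: string; // low-fidelity LOD variant
  texturesDir: string; // optional baked/preview textures
}

// Builds every path the PRD's layout defines for one model.
export function modelAssetPaths(modelName: string, fileName: string): ModelAssetPaths {
  const baseDir = `${MODELS_ROOT}/${modelName}`;
  return {
    baseDir,
    glb: `${baseDir}/glb/${fileName}.glb`,
    lodHigh: `${baseDir}/lod/${fileName}.high_fidelity.glb`,
    lodLow: `${baseDir}/lod/${fileName}.low_fidelity.glb`,
    texturesDir: `${baseDir}/textures`,
  };
}
```

Callers are assumed to pass names that have already been sanitized; the helper itself does no validation.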

## Implementation Plan

- **Phase 1: Setup & Config (0.5 day)**

  1. Define `.env` variables and inject via Vite server plugin (server-only usage).
  2. Create `generator-api` Vite plugin with routes for image generation, image→3D conversion, saving, and optimization triggers.
  3. Extend or reuse `assets-api` file-writing utilities for model folder creation.

- **Phase 2: UI & Flow (1.0 day)**

  1. Implement `ModelGeneratorModal.tsx` as a multi-step wizard mirroring the `TerrainWizard` stepper.
  2. Step 1 UI: prompt editor, style selector, size controls, and a retry/regenerate action; display a gallery of the latest N attempts with selection.
  3. Step 2 UI: Replicate Hunyuan3D options (seed, steps, etc.), model naming (`ModelName` and `fileName`), generate and show progress.
  4. On success, show the save path and a button to “Add to Scene.”

- **Phase 3: Server Integrations (1.0 day)**

  1. Implement `POST /api/generator/image` (OpenRouter → image buffer with transparent background).
  2. Implement `POST /api/generator/mesh` (Replicate Hunyuan3D) to convert image → `.glb` and return the path.
  3. Implement `POST /api/generator/optimize` to invoke `scripts/optimize.js --force` (and optionally `optimize-models.js`).

- **Phase 4: Asset Pipeline Hooking (0.5 day)**

  1. Save `.glb` to `src/game/assets/models/<ModelName>/glb/<fileName>.glb`.
  2. Run optimization to produce `lod/` variants and compression.
  3. Confirm sync to `public/assets/models/...` via `scripts/sync-assets.js`.

- **Phase 5: Testing & Hardening (0.5 day)**
  1. Unit tests for the prompt builder, provider selection logic, and filename sanitization.
  2. Integration tests for server endpoints with mocked providers.
  3. Manual QA: generate a character and a prop; verify the editor loads LODs and sizes.
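
Phase 5 calls for unit-testing filename sanitization. A minimal sketch of such a sanitizer follows; the function name `sanitizeAssetName`, the allow-list, the 64-character cap, and the `Model` fallback are all assumptions chosen for illustration, not rules stated elsewhere in this PRD.

```typescript
// Hypothetical sanitizer for ModelName / fileName; the rules below are assumptions.
export function sanitizeAssetName(raw: string, fallback = 'Model'): string {
  const cleaned = raw
    .normalize('NFKD') // split accented letters into base letter + combining mark
    .replace(/[^A-Za-z0-9_-]+/g, '_') // anything outside the allow-list becomes "_"
    .replace(/_+/g, '_') // collapse runs of underscores
    .slice(0, 64) // keep folder and file names short
    .replace(/^[_-]+|[_-]+$/g, ''); // trim leading/trailing separators
  return cleaned.length > 0 ? cleaned : fallback;
}
```

Because path separators and dots fall outside the allow-list, an input such as `../../etc/passwd` collapses to `etc_passwd`, which keeps writes inside the model folder.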

## File and Directory Structures

```
/src/editor/components/assets/
└── ModelGeneratorModal.tsx
/src/editor/hooks/
└── useModelGenerator.ts
/src/editor/services/
└── generatorClient.ts
/src/plugins/generator-api/
└── createGeneratorApi.ts
/src/game/assets/models/
└── <ModelName>/
    ├── glb/<fileName>.glb
    ├── lod/<fileName>.high_fidelity.glb
    ├── lod/<fileName>.low_fidelity.glb
    └── textures/
```

## Technical Details

- **Environment Variables**

  ```bash
  # OpenRouter
  OPENROUTER_API_KEY=...
  OPENROUTER_BASE_URL=https://openrouter.ai/api/v1
  OPENROUTER_IMAGE_MODEL=openai/gpt-5-image-mini

  # Replicate (Hunyuan3D)
  REPLICATE_API_TOKEN=...
  ```
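One way the server plugin might read these variables is to validate them once at startup and fail with a clear message, which also supports the "API key missing" edge case. This is a sketch under assumptions: `loadGeneratorConfig` and `GeneratorConfig` are hypothetical names, and falling back to the documented default URL and model when those two variables are unset is an illustrative choice.

```typescript
// Hypothetical config loader; names and fallback behavior are assumptions.
export interface GeneratorConfig {
  openRouterApiKey: string;
  openRouterBaseUrl: string;
  openRouterImageModel: string;
  replicateApiToken: string;
}

export function loadGeneratorConfig(env: Record<string, string | undefined>): GeneratorConfig {
  // The two secrets have no sensible default, so they are required.
  const required = ['OPENROUTER_API_KEY', 'REPLICATE_API_TOKEN'];
  const missing = required.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return {
    openRouterApiKey: env['OPENROUTER_API_KEY'] as string,
    // Fall back to the values documented in this PRD when unset or empty.
    openRouterBaseUrl: env['OPENROUTER_BASE_URL'] || 'https://openrouter.ai/api/v1',
    openRouterImageModel: env['OPENROUTER_IMAGE_MODEL'] || 'openai/gpt-5-image-mini',
    replicateApiToken: env['REPLICATE_API_TOKEN'] as string,
  };
}
```

On the server this could be called with `process.env`, or with the object returned by Vite's `loadEnv(mode, process.cwd(), '')`. The result should stay in server-side code only.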

- **Replicate Setup (Node.js)**

  ```bash
  yarn add replicate

  # or, if using npm (not recommended in this workspace):
  # npm install replicate

  # Set token (shell)
  export REPLICATE_API_TOKEN=<paste-your-token-here>
  ```

  ```ts
  // Server-side setup
  import Replicate from 'replicate';

  const replicate = new Replicate({
    auth: process.env.REPLICATE_API_TOKEN!,
  });

  // Run Hunyuan3D
  const output = await replicate.run(
    'ndreca/hunyuan3d-2.1:895e514f953d39e8b5bfb859df9313481ad3fa3a8631e5c54c7e5c9c85a6aa9f',
    {
      input: {
        seed: 1234,
        image: 'https://example.com/image.png',
        steps: 50,
        num_chunks: 8000,
        max_facenum: 20000,
        guidance_scale: 7.5,
        generate_texture: true,
        octree_resolution: 256,
        remove_background: false,
      },
    },
  );
  ```

- **Client Types (TS)**

  ```ts
  // src/editor/services/generatorClient.ts (types)
  export interface ImageGenRequest {
    prompt: string; // user prompt
    style: string; // e.g., "low-poly", "realistic", etc.
    width?: number; // px
    height?: number; // px
    background?: 'transparent' | 'white';
    negativePrompt?: string;
    seed?: number;
  }

  export interface ImageGenResult {
    imageId: string;
    imageDataUrl: string; // data URL for preview
  }

  export interface MeshGenRequest {
    imageId: string; // reference to selected image
    modelName: string; // folder name under models
    fileName: string; // base file name for glb
    options?: {
      seed?: number;
      steps?: number;
      numChunks?: number; // maps to num_chunks
      maxFaceNum?: number; // maps to max_facenum
      guidanceScale?: number; // maps to guidance_scale
      generateTexture?: boolean; // maps to generate_texture
      octreeResolution?: number; // maps to octree_resolution
      removeBackground?: boolean; // maps to remove_background
    };
  }

  export interface MeshGenResult {
    modelPath: string; // path of the saved .glb on the server; maps to a public /assets/models/... URL
  }
  ```
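The "maps to" comments in `MeshGenRequest.options` imply a translation step on the server before calling Replicate. A minimal sketch of that mapping is below; `toReplicateInput`, `MeshGenOptions`, and `OPTION_KEY_MAP` are hypothetical names, and the snake_case keys are taken from the Replicate example in this document rather than verified against the model's schema.

```typescript
// Hypothetical mapper from the client's camelCase options to the snake_case
// input keys shown in the Replicate example above.
interface MeshGenOptions {
  seed?: number;
  steps?: number;
  numChunks?: number;
  maxFaceNum?: number;
  guidanceScale?: number;
  generateTexture?: boolean;
  octreeResolution?: number;
  removeBackground?: boolean;
}

const OPTION_KEY_MAP: Record<keyof MeshGenOptions, string> = {
  seed: 'seed',
  steps: 'steps',
  numChunks: 'num_chunks',
  maxFaceNum: 'max_facenum',
  guidanceScale: 'guidance_scale',
  generateTexture: 'generate_texture',
  octreeResolution: 'octree_resolution',
  removeBackground: 'remove_background',
};

export function toReplicateInput(
  imageUrl: string,
  options: MeshGenOptions = {},
): Record<string, string | number | boolean> {
  const input: Record<string, string | number | boolean> = { image: imageUrl };
  for (const key of Object.keys(OPTION_KEY_MAP) as Array<keyof MeshGenOptions>) {
    const value = options[key];
    // Omit unset options so the provider's own defaults apply.
    if (value !== undefined) input[OPTION_KEY_MAP[key]] = value;
  }
  return input;
}
```

Typing the map as `Record<keyof MeshGenOptions, string>` means that adding a new option without a mapping is a compile-time error.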

- **Server Routes (TS skeletons)**

  ```ts
  // src/plugins/generator-api/createGeneratorApi.ts (skeleton)
  import type { Plugin } from 'vite';

  export const createGeneratorApi = (options?: { corsOrigin?: string }): Plugin => ({
    name: 'generator-api',
    configureServer(server) {
      // POST /api/generator/image
      server.middlewares.use('/api/generator/image', async (req, res) => {
        // 1) parse body (prompt, style, size, background=transparent)
        // 2) call OpenRouter: model=openai/gpt-5-image-mini
        // 3) return { imageId, imageDataUrl }
      });

      // POST /api/generator/mesh
      server.middlewares.use('/api/generator/mesh', async (req, res) => {
        // 1) parse body (imageId, modelName, fileName, options)
        // 2) resolve selected image to a URL (or upload temp) for Replicate
        // 3) call Replicate Hunyuan3D and receive mesh output
        // 4) write GLB to src/game/assets/models/<ModelName>/glb/<fileName>.glb
        // 5) return { modelPath }
      });

      // POST /api/generator/optimize
      server.middlewares.use('/api/generator/optimize', async (req, res) => {
        // 1) spawn node scripts/optimize.js --force
        // 2) (optional) also run scripts/optimize-models.js for public assets
        // 3) return summary
      });
    },
  });
  ```
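Each route in the skeleton starts with "parse body" and ends with "return { ... }". Those two steps could share helpers along the following lines. This is a sketch: `readJsonBody`, `sendJson`, and `JsonResponder` are hypothetical names, the 1 MB limit is an arbitrary assumption, and the parameter types are deliberately minimal so that Node's request and response objects should be structurally compatible with them.

```typescript
// Hypothetical helpers for the route handlers; not part of Vite or Connect.
export interface JsonResponder {
  statusCode: number;
  setHeader(name: string, value: string): unknown;
  end(body: string): unknown;
}

// Writes a JSON response with the given HTTP status.
export function sendJson(res: JsonResponder, status: number, payload: unknown): void {
  res.statusCode = status;
  res.setHeader('Content-Type', 'application/json');
  res.end(JSON.stringify(payload));
}

// Collects the request stream and parses it as JSON; an empty body yields {}.
export async function readJsonBody(
  req: AsyncIterable<Uint8Array | string>,
  maxBytes = 1000000, // assumed limit; string chunks are counted by length
): Promise<unknown> {
  const decoder = new TextDecoder();
  let text = '';
  let size = 0;
  for await (const chunk of req) {
    size += typeof chunk === 'string' ? chunk.length : chunk.byteLength;
    if (size > maxBytes) throw new Error('Request body too large');
    text += typeof chunk === 'string' ? chunk : decoder.decode(chunk, { stream: true });
  }
  text += decoder.decode(); // flush any buffered partial character
  return text.length > 0 ? JSON.parse(text) : {};
}
```

Inside a handler this might read `const body = await readJsonBody(req)` followed by `sendJson(res, 200, { modelPath })`, with a `try`/`catch` that maps parse failures to a 400 response.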

- **UX Flow Details**

  - Step 1 (Preview Image): Build prompt as “Create a STYLE_HERE 3D model with transparent background” + user-provided details. Allow retry (re-generate) without leaving step; maintain an image history list.
  - Step 2 (Image→3D): Provide Replicate Hunyuan3D options; supply `ModelName` and `fileName`; show progress and final size.
  - After success: Persist to disk, trigger optimization, show final asset paths (base + LODs), and provide “Add to Scene” shortcut with `/assets/models/...` URLs.
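
The Step 1 prompt template can be expressed as a small pure function, which also makes it easy to unit-test. In this sketch, `buildImagePrompt` is a hypothetical name and joining the template and the user details with a period and a space is an assumption; the PRD only says the details are appended.

```typescript
// Hypothetical prompt builder for Step 1; the ". " separator is an assumption.
export function buildImagePrompt(style: string, details: string): string {
  const base = `Create a ${style.trim()} 3D model with transparent background`;
  const extra = details.trim();
  // Append the user-provided details only when there are any.
  return extra.length > 0 ? `${base}. ${extra}` : base;
}
```

For example, `buildImagePrompt('low-poly', 'a wooden crate')` yields `Create a low-poly 3D model with transparent background. a wooden crate`.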

## Usage Examples

```ts
// Step 1: Generate preview image
const img = await generatorClient.generateImage({
  prompt: userPrompt,
  style: selectedStyle,
  background: 'transparent',
});

// Step 2: Generate mesh via Replicate
const mesh = await generatorClient.generateMesh({
  imageId: img.imageId,
  modelName: 'MyModel',
  fileName: 'MyModel',
  options: { steps: 50, generateTexture: true },
});

// Step 3: Optimize
await generatorClient.optimizeModels();
```
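
The `generatorClient` used above is not defined in this PRD. One possible shape is a thin wrapper that POSTs JSON to the `/api/generator/*` routes. The sketch below is an assumption throughout: `createGeneratorClient` and `FetchLike` are hypothetical names, the fetch function is injected so the client can be tested with a fake, and the request and response types are simplified copies of the client types listed earlier.

```typescript
// Hypothetical client factory; the fetch implementation is injected for testability.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

export function createGeneratorClient(fetchImpl: FetchLike, baseUrl = '/api/generator') {
  // Shared POST helper: sends JSON and rejects on any non-2xx status.
  async function post<T>(route: string, body: unknown): Promise<T> {
    const res = await fetchImpl(`${baseUrl}/${route}`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(body),
    });
    if (!res.ok) throw new Error(`POST ${baseUrl}/${route} failed with status ${res.status}`);
    return (await res.json()) as T;
  }

  return {
    generateImage: (req: { prompt: string; style: string; background?: 'transparent' | 'white' }) =>
      post<{ imageId: string; imageDataUrl: string }>('image', req),
    generateMesh: (req: {
      imageId: string;
      modelName: string;
      fileName: string;
      options?: Record<string, unknown>;
    }) => post<{ modelPath: string }>('mesh', req),
    optimizeModels: () => post<unknown>('optimize', {}),
  };
}
```

In the editor this might be constructed once as `createGeneratorClient((url, init) => fetch(url, init))`.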

## Testing Strategy

- **Integration Tests**
  - `/api/generator/image` returns image preview (mock OpenRouter response).
  - `/api/generator/mesh` writes `.glb` to the correct folder (mock Replicate Hunyuan3D).
  - `/api/generator/optimize` triggers optimization and produces `lod/` variants.
  - Editor loads resulting `/assets/models/...` paths and LOD utils select variant URLs correctly.

## Edge Cases

| Edge Case                        | Remediation                                                                           |
| -------------------------------- | ------------------------------------------------------------------------------------- |
| API key missing or invalid       | Disable actions, show setup instructions; server returns 401 with clear message.      |
| Rate limits / provider outage    | Exponential backoff, retry UI, fallback provider if possible.                         |
| Non-transparent images returned  | Enforce `background=transparent`; if not possible, post-process (alpha mask) or warn. |
| Large or invalid GLB             | Validate before save; re-run generation with reduced complexity; show error.          |
| Duplicate `ModelName`/`fileName` | Prompt to overwrite or auto-increment a suffix.                                       |
| Long-running generation          | Show a progress indicator; enforce timeouts with cancel/retry.                        |
| Unsafe prompt content            | Client-side validation and server-side filtering per provider policy.                 |
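
Two of the remediations above lend themselves to small pure functions: validating a GLB before saving it, and computing exponential backoff delays for retries. The sketch below checks only the 12-byte GLB container header (magic, version, declared length) and does not parse the JSON or binary chunks. `validateGlbHeader`, `backoffDelayMs`, the size limit parameter, and the default delays are hypothetical choices.

```typescript
// Hypothetical checks; only the fixed 12-byte GLB header is inspected.
const GLB_MAGIC = 0x46546c67; // ASCII "glTF" read as a little-endian uint32

// Returns null when the header looks valid, otherwise a human-readable reason.
export function validateGlbHeader(bytes: Uint8Array, maxBytes: number): string | null {
  if (bytes.byteLength < 12) return 'File is shorter than the 12-byte GLB header';
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  if (view.getUint32(0, true) !== GLB_MAGIC) return 'Missing "glTF" magic';
  if (view.getUint32(4, true) !== 2) return 'Unsupported GLB container version';
  if (view.getUint32(8, true) !== bytes.byteLength) return 'Declared length does not match file size';
  if (bytes.byteLength > maxBytes) return 'File exceeds the configured size limit';
  return null;
}

// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
export function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}
```

A production retry loop would typically add random jitter to these delays; that is omitted here to keep the function deterministic.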

## Sequence Diagram

```mermaid
sequenceDiagram
    participant U as User
    participant UI as Editor UI (Modal)
    participant GA as Generator API (/api/generator)
    participant OR as OpenRouter (Image)
    participant R as Replicate (Hunyuan3D)
    participant FS as FS (src/game/assets/models)
    participant OPT as optimize.js

    U->>UI: Click “Generate 3D”
    UI->>GA: POST /image { prompt, style, background=transparent }
    GA->>OR: Create image (model=openai/gpt-5-image-mini)
    OR-->>GA: image data (transparent PNG)
    GA-->>UI: { imageId, imageDataUrl }
    U->>UI: Approve image (or Retry)
    UI->>GA: POST /mesh { imageId, modelName, fileName, options }
    GA->>R: Submit image + params → 3D
    R-->>GA: .glb
    GA->>FS: write src/game/assets/models/<ModelName>/glb/<fileName>.glb
    GA-->>UI: { modelPath }
    UI->>GA: POST /optimize
    GA->>OPT: node scripts/optimize.js --force
    OPT-->>GA: results (LOD paths)
    GA-->>UI: success { base+LOD URLs }
    UI-->>U: Show “Add to Scene”
```
319+
320+
## Risks & Mitigations
321+
322+
| Risk | Mitigation |
323+
| -------------------------------------------- | ------------------------------------------------------------------------------ |
324+
| Provider API changes or model unavailability | Feature-flag model strings; make provider calls configurable via `.env`. |
325+
| Security of API keys | Use server-only Vite plugin; never expose keys to client; restrict CORS. |
326+
| Slow or failed optimizations | Run async with progress; surface summaries; allow retry with `--force`. |
327+
| Large asset sizes | Enforce texture resizing and LOD generation; warn on thresholds before import. |
328+
| Folder structure drift | Centralize save logic; add validations and tests for path rules. |

## Timeline

- Total: ~3.5 days
  - Phase 1: 0.5 day
  - Phase 2: 1.0 day
  - Phase 3: 1.0 day
  - Phase 4: 0.5 day
  - Phase 5: 0.5 day

## Acceptance Criteria

- A “Generate 3D” button opens a two-step modal (image → 3D) with retry and style controls.
- Step 1 uses OpenRouter (`openai/gpt-5-image-mini`) to generate transparent-background images.
- Step 2 uses Replicate Hunyuan3D to convert the approved image to `.glb`.
- `REPLICATE_API_TOKEN` is required for Step 2 and is read only on the server.
- Generated `.glb` is saved to `src/game/assets/models/<ModelName>/glb/<fileName>.glb`.
- Optimization produces `lod/` variants and reduces size; results are visible under `/assets/models/...`.
- No API keys (OpenRouter or Replicate) are exposed to the client; all provider calls go through the server-side `/api/generator/*` endpoints.
- Editor can immediately add the new model to the scene from the success screen.

## Conclusion

This plan adds a guided, in-editor 3D asset generation flow integrated with robust post-processing and consistent asset organization. It leverages OpenRouter for fast previews and Replicate Hunyuan3D for high-quality `.glb` output, then ensures runtime readiness via our existing optimization and LOD pipelines.

## Assumptions & Dependencies

- OpenRouter image generation supports `background=transparent` for the selected model.
- Replicate's Hunyuan3D model provides a stable image→3D endpoint and accepts PNG/JPEG inputs.
- Development runs under Vite with server plugin support; keys are available in `.env`.
- Optimization scripts remain compatible with the produced `.glb` files.
