Docorama - AI-Powered Document Editor

Welcome to Docorama, a project that turns document creation and editing into a hands-free, voice-driven experience. It combines AI speech and language models so that writing, revising, and summarizing a document can all be done by talking.

Project Overview

Docorama is an AI-based document editor that simplifies writing and editing through voice commands. The goal is to make content creation entirely verbal, providing a smooth, hands-free user experience. The video demo linked below was generated with Captions AI and shows the technology in action.

https://github.com/user-attachments/assets/7937a4d8-b857-4e40-9606-601bc066110d

Key Features

AI-Driven Document Editing
Uses OpenAI Whisper for speech-to-text conversion, so users can dictate content, and Eleven Labs for text-to-speech, providing clear and natural audio playback.
Efficient In-Place Edits
Employs OpenAI predictive text to make instant, in-place edits within the document, keeping changes quick and focused without rewriting the entire content.
Content Summarization and Script Generation
Uses OpenAI to summarize the document and generate a script for Captions AI. Captions AI creates AI-driven content to serve as an engaging introduction to the document.
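
To make the idea of an in-place edit concrete, here is a minimal Python sketch. It assumes the model returns an edit as a small object with `find` and `replace` fields; that shape, the `Edit` type, and the `apply_edit` helper are all hypothetical and not taken from the Docorama code. The point is only that a single span is changed while the rest of the document is left untouched.

```python
from typing import TypedDict


class Edit(TypedDict):
    """Hypothetical edit shape: the text to locate and its replacement."""
    find: str
    replace: str


def apply_edit(document: str, edit: Edit) -> str:
    """Replace the first occurrence of edit['find'] and keep everything else as-is."""
    if not edit["find"]:
        raise ValueError("edit['find'] must not be empty")
    start = document.find(edit["find"])
    if start == -1:
        raise ValueError(f"text not found in document: {edit['find']!r}")
    end = start + len(edit["find"])
    # Only the matched span is rewritten; the surrounding text is copied unchanged.
    return document[:start] + edit["replace"] + document[end:]


doc = "Docorama edits text. Docorama reads text."
updated = apply_edit(doc, {"find": "edits text", "replace": "edits documents"})
```

Raising an error when the target text is missing, rather than returning the document unchanged, lets the caller ask the user to repeat or rephrase the command.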

How It Works

Verbal Document Creation: Users dictate their documents, and the system transcribes the speech using OpenAI Whisper's speech-to-text capabilities.
Hands-Free Editing: Users can verbally command edits, and OpenAI's predictive features make precise modifications.
Document Summarization: Once the document is finalized, OpenAI summarizes the content, creating a script that Captions AI can use to generate video content.
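
The three steps above can be sketched as a small session object with pluggable stages. This is an illustration under stated assumptions, not the project's implementation: `DocumentSession` and its method names are hypothetical, and the stub functions stand in for the real speech-to-text, editing, and summarization services so the example runs without any API keys.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DocumentSession:
    """Hypothetical container wiring the three stages together."""
    transcribe: Callable[[bytes], str]   # speech-to-text stage
    edit: Callable[[str, str], str]      # (document, instruction) -> new document
    summarize: Callable[[str], str]      # document -> script for video generation
    text: str = ""

    def dictate(self, audio: bytes) -> None:
        """Append transcribed speech to the document."""
        self.text = (self.text + " " + self.transcribe(audio)).strip()

    def command(self, audio: bytes) -> None:
        """Treat transcribed speech as an editing instruction."""
        self.text = self.edit(self.text, self.transcribe(audio))

    def video_script(self) -> str:
        """Summarize the finished document into a script."""
        return self.summarize(self.text)


# Stub stages (assumptions for the demo only; real services would replace them).
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_edit(document: str, instruction: str) -> str:
    return document.upper() if instruction == "uppercase" else document


def fake_summarize(document: str) -> str:
    return document.split(".")[0] + "."


session = DocumentSession(fake_transcribe, fake_edit, fake_summarize)
session.dictate(b"Hello world. More text.")
session.command(b"uppercase")
script = session.video_script()
```

Passing the stages in as callables keeps the flow testable and makes it straightforward to swap one provider for another.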

flowchart TD
    subgraph User_Interaction
        Input[User Input] --> Text[Text Input]
        Input --> Voice[Voice Input via WebSpeech API]
    end

    subgraph Natural_Language_Processing
        Text --> GPT[OpenAI GPT-4]
        Voice --> GPT
        GPT -->|JSON Response| Handler[Response Handler]
    end

    subgraph Content_Generation
        Handler -->|speak_content| Speech[ElevenLabs TTS API]
        Handler -->|edit_command| Edit[Content Editor]
        Edit --> ImageGen[DALL-E 3 API]
        Edit --> VideoGen[Luma AI API]
        Edit --> TextGen[GPT-4 Content]
    end

    subgraph Content_Display
        Speech --> Audio[Audio Playback]
        ImageGen --> Iframe[iframe Display]
        VideoGen --> Iframe
        TextGen --> Iframe
    end

    style Natural_Language_Processing fill:#f9f,stroke:#333
    style Content_Generation fill:#bbf,stroke:#333
    style User_Interaction fill:#bfb,stroke:#333
    style Content_Display fill:#fbb,stroke:#333
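
The Response Handler node in the diagram routes the model's JSON response to either speech output or the content editor. A minimal Python sketch of that routing follows. The edge labels `speak_content` and `edit_command` come from the diagram; the JSON layout (an `action` key plus `text` or `instruction`), the `handle_response` function, and the placeholder handlers are assumptions made for illustration.

```python
import json
from typing import Callable, Dict


def handle_response(raw: str, handlers: Dict[str, Callable[[dict], str]]) -> str:
    """Parse the model's JSON reply and dispatch on its (assumed) 'action' field."""
    payload = json.loads(raw)
    action = payload.get("action")
    if action not in handlers:
        raise ValueError(f"unsupported action: {action!r}")
    return handlers[action](payload)


# Placeholder handlers: a real version would call a TTS service or the editor.
def speak_content(payload: dict) -> str:
    return f"[TTS] {payload['text']}"


def edit_command(payload: dict) -> str:
    return f"[EDIT] {payload['instruction']}"


HANDLERS = {"speak_content": speak_content, "edit_command": edit_command}

spoken = handle_response('{"action": "speak_content", "text": "Hello"}', HANDLERS)
edited = handle_response(
    '{"action": "edit_command", "instruction": "add an image"}', HANDLERS
)
```

A dictionary of handlers keeps the dispatch table explicit, so an unknown action fails loudly instead of being silently ignored.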

Technologies Used

Eleven Labs: For natural, high-quality text-to-speech playback.
OpenAI Whisper: For accurate speech-to-text conversion.
OpenAI Predictive Text: For efficient and intelligent in-place editing.
Captions AI: For generating engaging AI-driven video content.

Future Goals

We envision expanding Docorama's capabilities to include more AI-driven features and further enhance the user experience. Our mission is to make document editing and content creation as smooth and efficient as possible.

User Journey

The user starts by telling the agent the topic of the article they want to generate.
The agent brainstorms with the user and starts generating content for the article. The content can comprise text, images, and videos.
The user can then ask the agent to make any desired changes to the generated content.
Once the document is generated, the user can use our summarizer tool to produce a summary.
This summary can then be used to generate a short-form video representation of the article.

Contributing

We welcome contributions from the community! Please feel free to submit issues or pull requests to help improve Docorama.

Contact

For questions or feedback, please reach out to our team at shaikabdulmalik958@gmail.com or sharan.goku19@gmail.com.
