[Upstage AI Ambassador] Personal Knowledge Management with Upstage Solar Pro 2 ✨
UpThink is a service that minimizes repetitive manual work in Personal Knowledge Management (PKM) environments, specifically Obsidian. It addresses the following bottlenecks that arise when organizing knowledge.
| Problem | Description |
|---|---|
| Image Data Processing | Manual conversion of visual information into text |
| Tag Management | Keeping tags consistent with naming conventions and style |
| Lack of Knowledge Connectivity | Search costs for finding relevant past notes |
| Unstructured Documents | Need for splitting massive notes |
UpThink automates these steps using the language understanding capabilities of Upstage Solar Pro 2, so users can leave simple repetitive tasks behind and focus on what matters most: thinking.
- Overview
- Key Features
- Tech Stack
- Architecture
- Installation
- Usage
- Project Structure
- Members & Roles
- Acknowledgements
Extracts text from images within notes and automatically generates alt text describing the image content.
- Perform OCR and extract document structure from images using Upstage Document Parse
- Generate alt text of around 50 words based on extracted text using Solar Pro 2
- Batch process all images within markdown files
- Automatically insert alt text below image links in `![[image.png]]` format
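
The last two steps in the list above (batch processing and inserting alt text below `![[image.png]]` embeds) could look roughly like the sketch below. It is not UpThink's actual code: the function name `insert_alt_text`, the `*Alt text: ...*` line format, and the `alt_texts` mapping (standing in for the Document Parse and Solar Pro 2 output) are assumptions made for illustration.

```python
import re

# Matches Obsidian image embeds such as ![[diagram.png]] or ![[diagram.png|300]].
IMAGE_EMBED = re.compile(
    r"!\[\[([^\]|]+\.(?:png|jpe?g|gif|webp))(?:\|[^\]]*)?\]\]", re.IGNORECASE
)

# Hypothetical marker used to recognise an alt-text line that was already inserted.
ALT_PREFIX = "*Alt text:"


def insert_alt_text(markdown: str, alt_texts: dict[str, str]) -> str:
    """Insert an alt-text line below every image embed listed in alt_texts.

    Lines that are already followed by an alt-text line are left untouched,
    so running the function twice does not duplicate the text.
    """
    lines = markdown.splitlines()
    out = []
    for i, line in enumerate(lines):
        out.append(line)
        next_line = lines[i + 1] if i + 1 < len(lines) else ""
        if next_line.startswith(ALT_PREFIX):
            continue  # alt text already present below this line
        for match in IMAGE_EMBED.finditer(line):
            alt = alt_texts.get(match.group(1).strip())
            if alt:
                out.append(f"{ALT_PREFIX} {alt}*")
    return "\n".join(out)
```

Checking for an existing alt-text line before inserting keeps the operation idempotent, which matters when the same note is processed more than once.
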
Analyzes note content to recommend appropriate tags and maintains consistency with existing Vault tag conventions.
- Automatically collect existing tags in Vault (supports hashtags `#tag` and YAML frontmatter)
- Set user-defined tag guidelines (language, case, separators, number of tags)
- Generate tags based on note content using Solar Pro 2
- Compare and match similarity with existing tags using Qwen Embedding model
- Automatically insert tags in YAML frontmatter format
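
As a rough illustration of the first step above, collecting tags from both inline hashtags and YAML frontmatter can be done with the standard library alone. The function names and the simplified frontmatter handling (only a `tags:` key, in list or inline form) are assumptions of this sketch, not the project's `TagExtractor` implementation.

```python
import re
from collections import Counter

# YAML frontmatter block at the very top of a note.
FRONTMATTER = re.compile(r"\A---\n(.*?)\n---\n?", re.DOTALL)
# Inline hashtags: '#' followed by a non-digit word character, e.g. #pkm or #obsidian/plugin.
HASHTAG = re.compile(r"(?<![\w#])#([^\W\d][\w/-]*)")


def frontmatter_tags(text: str) -> list[str]:
    """Read tags from a 'tags:' key written as a YAML list or an inline list."""
    match = FRONTMATTER.match(text)
    if not match:
        return []
    tags, in_tags = [], False
    for line in match.group(1).splitlines():
        stripped = line.strip()
        if stripped.startswith("tags:"):
            inline = stripped[len("tags:"):].strip()
            if inline:  # inline form: tags: [a, b]
                tags += [t.strip().strip("'\"") for t in inline.strip("[]").split(",") if t.strip()]
                in_tags = False
            else:  # block form: items follow on the next lines
                in_tags = True
        elif in_tags and stripped.startswith("- "):
            tags.append(stripped[2:].strip().strip("'\""))
        else:
            in_tags = False
    return tags


def collect_tags(text: str) -> Counter:
    """Count tags found in the frontmatter and as hashtags in the note body."""
    match = FRONTMATTER.match(text)
    body = text[match.end():] if match else text
    return Counter(frontmatter_tags(text) + HASHTAG.findall(body))
```

Summing the counters returned for every note in a Vault would give the tag frequencies that the guideline and comparison steps can build on.
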
Finds notes semantically similar to the current note and automatically connects them.
- Vectorize notes in Vault using Upstage Embedding Model and Chroma DB
- Automatically identify and batch process unembedded notes
- Process long notes through chunking
- Recommend Top 3 related notes via similarity search
- Automatically append backlinks to the `## Related Notes` section using the `[[note]]` format
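
The similarity search and backlink steps can be sketched without a vector database. In the example below, plain Python lists stand in for the vectors that the Upstage Embedding model and Chroma DB would provide; `top_related` and the simplified `append_related_links` are hypothetical helpers, and the sketch assumes `## Related Notes` is the last section of the note.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_related(query_vec: list[float], note_vecs: dict[str, list[float]],
                current: str, k: int = 3) -> list[str]:
    """Return the k note names most similar to query_vec, excluding the current note."""
    scored = [(cosine(query_vec, vec), name)
              for name, vec in note_vecs.items() if name != current]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


def append_related_links(markdown: str, related: list[str],
                         heading: str = "## Related Notes") -> str:
    """Append [[note]] links under the heading, skipping links already in the note."""
    links = [f"- [[{name}]]" for name in related if f"[[{name}]]" not in markdown]
    if not links:
        return markdown
    text = markdown.rstrip("\n")
    if heading not in text:
        text += f"\n\n{heading}"
    return text + "\n" + "\n".join(links) + "\n"
```

A vector store performs the same ranking with an approximate nearest-neighbour index instead of the linear scan shown here, which becomes important as the Vault grows.
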
Splits massive notes by topic to atomize them and build an interconnected knowledge system.
- Automatically extract topics within notes using Solar Pro 2
- Support flexible splitting strategies based on templates
- Edit, delete, or add extracted topics
- Automatically generate and save split atomic notes
- Automatically insert backlinks and a `## Generated Atomic Notes` section in the original note
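
The final two steps (saving atomic notes and linking them from the original) might be sketched as follows. The function `write_atomic_notes`, the `Source:` backlink line, and the use of topic titles as file names are assumptions for illustration; a real implementation would also need to sanitize titles that are not valid file names.

```python
from pathlib import Path


def write_atomic_notes(original: Path, topics: dict[str, str]) -> list[Path]:
    """Write one note per topic next to the original and link them both ways.

    topics maps a topic title to the text extracted for that topic.
    """
    created = []
    for title, content in topics.items():
        path = original.parent / f"{title}.md"
        # Each atomic note links back to the note it was split from.
        path.write_text(
            f"# {title}\n\n{content.strip()}\n\nSource: [[{original.stem}]]\n",
            encoding="utf-8",
        )
        created.append(path)

    # The original note gets a section listing every generated atomic note.
    links = "\n".join(f"- [[{p.stem}]]" for p in created)
    text = original.read_text(encoding="utf-8").rstrip("\n")
    original.write_text(
        f"{text}\n\n## Generated Atomic Notes\n{links}\n", encoding="utf-8"
    )
    return created
```
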
| Category | Technology |
|---|---|
| Language | Python 3.13 |
| Frontend | Streamlit |
| LLM | Upstage Solar Pro 2 |
| Document AI | Upstage Document Parse |
| Embedding | Upstage Embedding, Qwen3-Embedding-0.6B |
| Vector DB | Chroma DB |
| Framework | LangChain |
| Package Manager | uv |
| Layer | Component | Description |
|---|---|---|
| Frontend | Streamlit | Web-based User Interface |
| Backend | Python Modules | Implementation of 4 core features |
| Upstage API | Solar Pro 2, Document Parse, Embedding Model | LLM, OCR, Vector Embedding |
| Local | Qwen Embedding, Chroma DB | Tag similarity comparison, Note vector storage |
| Step | Flow | Key Backend Modules |
|---|---|---|
| 1 | Extract image links and check alt text existence | MarkdownImageProcessor._collect_images_to_process() |
| 1 | Search image file paths in Vault | MarkdownImageProcessor._find_image_in_vault() |
| 2 | Extract text from image | OCRProcessor.extract_text() |
| 2 | Generate alt text | AltTextGenerator.generate_alt_text() |
| 3 | Insert alt text below image link | MarkdownImageProcessor.process_images() |
| Step | Flow | Key Backend Modules |
|---|---|---|
| 1 | Collect and check existing tags | TagExtractor.get_unique_tags(), TagExtractor.count_tags() |
| 2 | Set tag guidelines and generate new tags | GuidelineGenerator(), TagGenerator.generate_tags() |
| 3 | Compare existing and new tags | TagComparator.compare_tags() |
| 3 | Suggest final tags | TagComparator.get_final_tags() |
| 4 | Insert YAML Frontmatter | add_yaml_frontmatter() |
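
Step 3 in the table above compares newly generated tags with existing ones. A minimal sketch of that idea is shown below, with a character-bigram counter used as a toy stand-in for the Qwen embedding model. The function `reconcile_tags` and the 0.8 threshold are assumptions, not values taken from `TagComparator`.

```python
import math
from collections import Counter
from typing import Callable


def char_bigram_embed(tag: str) -> Counter:
    """Toy embedding: character bigram counts (placeholder for a real model)."""
    t = tag.lower()
    return Counter(t[i:i + 2] for i in range(len(t) - 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def reconcile_tags(new_tags: list[str], existing_tags: list[str],
                   threshold: float = 0.8,
                   embed: Callable[[str], Counter] = char_bigram_embed) -> list[str]:
    """Replace each new tag with its closest existing tag when similar enough."""
    existing_vecs = {t: embed(t) for t in existing_tags}
    final = []
    for tag in new_tags:
        vec = embed(tag)
        best, score = None, 0.0
        for candidate, cand_vec in existing_vecs.items():
            s = cosine(vec, cand_vec)
            if s > score:
                best, score = candidate, s
        # Reuse the existing tag only above the threshold; otherwise keep the new one.
        chosen = best if best is not None and score >= threshold else tag
        if chosen not in final:
            final.append(chosen)
    return final
```

Preferring an existing tag over a near-duplicate is what keeps the Vault from accumulating variants such as `machine_learning` and `machine-learning`.
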
| Step | Flow | Key Backend Modules |
|---|---|---|
| 1 | Identify unembedded notes | Related_Note.get_unembedded_notes() |
| 2 | Preprocessing and chunking | Related_Note.clean_text(), Related_Note.chunk_text() |
| 2 | Embed notes and save to DB | Related_Note.index_unembedded_notes() |
| 3 | Search related notes | Related_Note.find_related_notes() |
| 4 | Insert backlinks | Related_Note.append_related_links() |
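
The preprocessing and chunking in step 2 could be approximated as below. This is a character-based sketch with hypothetical defaults (`chunk_size=500`, `overlap=50`); the actual `Related_Note.clean_text()` and `Related_Note.chunk_text()` may clean and split differently, for example by tokens or by headings.

```python
import re


def clean_text(markdown: str) -> str:
    """Drop leading YAML frontmatter and collapse runs of whitespace."""
    body = re.sub(r"\A---\n.*?\n---\n", "", markdown, flags=re.DOTALL)
    return re.sub(r"\s+", " ", body).strip()


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap by `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("chunk_size must be positive and overlap smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some text twice.
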
| Step | Flow | Key Backend Modules |
|---|---|---|
| 1 | Load prompt template | PromptLoader.load_template() |
| 1 | Extract Topic | UpstageClient.generate_with_template_sync() |
| 2 | Parse Topic list | ResponseParser.parse_topics_from_json() |
| 3 | Generate atomic notes | FileHandler.create_atomic_note() |
| 3 | Insert backlinks | FileHandler.insert_backlinks() |
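
Step 2 turns the model's reply into a topic list. The sketch below shows one defensive way to do that, assuming the model returns JSON that may be wrapped in a Markdown code fence. The response schema (a `topics` array of objects with a `title` key) and the function name `parse_topics` are assumptions, not the format used by `ResponseParser.parse_topics_from_json()`.

```python
import json
import re

# Three backticks, built programmatically so this example can sit inside a fenced block.
FENCE = "`" * 3
FENCED_JSON = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_topics(response: str) -> list[dict]:
    """Extract topic dicts from an LLM reply containing JSON.

    Accepts either a bare JSON document or one wrapped in a code fence,
    and either a top-level list or an object with a "topics" key.
    Entries without a title are dropped.
    """
    match = FENCED_JSON.search(response)
    payload = match.group(1) if match else response
    data = json.loads(payload)
    topics = data.get("topics", []) if isinstance(data, dict) else data
    return [t for t in topics if isinstance(t, dict) and t.get("title")]
```

Malformed JSON raises `json.JSONDecodeError` here, so a caller can catch it and ask the model to retry.
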
- macOS: `brew install uv`
- Windows (PowerShell, CMD): `powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"`
# git clone
git clone https://github.com/geminii01/product-usecase-knowledge-management-upthink.git
cd product-usecase-knowledge-management-upthink
# Environment Variable Setup (Required!)
cp .env.example .env
# Open .env file and enter API keys
# UPSTAGE_API_KEY=your_api_key_here
# TAVILY_API_KEY=your_api_key_here
# Install Python 3.13 and dependencies automatically
uv sync
streamlit run frontend/app.py
# Access via Local URL below!
# http://localhost:8501
🎬 Watch Demo Video: YouTube 🎬
- Enter the Vault path in the sidebar (the absolute path to your Obsidian Vault).
- Upload the Markdown file to process.
- Go to the desired feature page and run it.
upthink/
├── frontend/ # Streamlit Frontend
│ ├── app.py # Main App (Routing, Common Sidebar)
│ ├── home.py # Home Page
│ ├── image_ocr.py # Image Alt Text Generation UI
│ ├── tag_suggest.py # Tag Recommendation UI
│ ├── related_note.py # Related Note Recommendation UI
│ ├── note_split.py # Note Splitting UI
│ └── note_freshness.py # Freshness Check UI
│
├── backend/ # Backend Logic
│ ├── image_ocr/ # Image OCR & Alt Text Generation
│ │ ├── ocr_processor.py # Document Parse API Integration
│ │ ├── alt_text_generator.py # Solar Pro 2 Alt Text Generation
│ │ └── markdown_processor.py # Markdown Image Processing
│ │
│ ├── tag_suggest/ # Tag Suggestion
│ │ ├── tag_extractor.py # Extract 2 Tag Patterns
│ │ ├── tag_guidelines.py # Guideline Generation
│ │ ├── tag_generator.py # Solar Pro 2 Tag Generation
│ │ ├── tag_comparator.py # Qwen Embedding Similarity Comparison
│ │ └── markdown_processor.py # YAML frontmatter Processing
│ │
│ ├── related_note/ # Related Note Recommendation
│ │ └── related_note.py # Chroma DB based Similarity Search
│ │
│ ├── note_split/ # Note Splitting
│ │ ├── config.py # Config
│ │ ├── models.py # Data Models
│ │ ├── core/ # State Management, File Handling
│ │ ├── llm/ # LLM Client, Prompt Loader
│ │ └── ui/ # UI Components
│ │
│ └── note_freshness/ # Freshness Verification
│ ├── api/ # Tavily, Wikipedia API
│ ├── core/ # State Management
│ └── llm/ # LLM Integration
│
├── prompts/ # Prompt Templates (YAML)
├── pyproject.toml # Project Configuration & Dependencies
└── .env.example # Environment Variable Example
This project was conducted as part of the Upstage AI Ambassador program.
We thank Upstage for providing credits to support this project.