SciDiscover

An Advanced Scientific Discovery Platform

SciDiscover is a cutting-edge platform for scientific discovery that leverages multi-agent AI reasoning to transform biomedical research through intelligent, adaptive knowledge exploration and collaborative hypothesis generation.

Core Features

Multi-Agent Reasoning: Implements a collaborative multi-agent framework with specialized agents (Scientist, Critic, Expander) for hypothesis generation, critique, and refinement
Dynamic Knowledge Graphs: Visualizes and navigates complex relationships between scientific concepts using network analysis and concept path reasoning
Extended Thinking Capabilities: Utilizes Claude 3.7 Sonnet with up to 64K thinking tokens for deeper scientific analysis
Performance Metrics: Tracks analysis time and confidence scores for experimental validation
Debate-Driven Analysis: Simulates scientific discourse through structured multi-agent debate with multiple refinement rounds

Technologies

Backend: Python with advanced LLM integration (Claude 3.7 Sonnet-20250219)
Frontend: Streamlit interactive web interface with responsive visualizations
Data Sources: Integration with PubTator3 for biomedical entity recognition
Knowledge Integration: Custom knowledge graph construction with NetworkX

Getting Started

Prerequisites

Python 3.11+
Anthropic API key (required)
OpenAI API key (optional, used as fallback)

Installation

# Clone the repository
git clone https://github.com/aardeshir/SciDiscover.git
cd SciDiscover

# Create a virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies for your platform
# For GitHub and most environments:
pip install -r requirements-github.txt

# For development:
pip install -r requirements_dev.txt

API Key Setup

SciDiscover requires an Anthropic API key to function properly:

Create an account on Anthropic's website
Generate an API key from the Anthropic console

Set up your API key using one of these methods:

Method 1: Using .env file (recommended)

cp .env.example .env
# Edit .env file with your API key:
# ANTHROPIC_API_KEY=your_actual_key_here

Method 2: Setting environment variables directly

export ANTHROPIC_API_KEY=your_anthropic_api_key
export OPENAI_API_KEY=your_openai_api_key  # Optional

Running the Application

streamlit run main.py

The application will be available at http://localhost:5000 by default.

Docker Deployment (Optional)

A Dockerfile is provided for containerized deployment:

# Build the Docker image
docker build -t scidiscover:latest .

# Run the container
docker run -p 5000:5000 -e ANTHROPIC_API_KEY=your_key_here scidiscover:latest

GitHub Codespaces

This repository is configured for GitHub Codespaces, allowing you to start developing in a fully configured environment directly in your browser:

Click on the "Code" button on the GitHub repository
Select the "Codespaces" tab
Click "Create codespace on main"
Once launched, add your API keys as secrets in the Codespaces environment

Usage Guide

Performing Scientific Analysis

Enter your scientific query in the main text area
Adjust the novelty level slider (0: established knowledge, 1: cutting-edge)
Select your preferred analysis method:
- Standard Analysis: Direct exploration of mechanisms
- Debate-Driven Analysis: Multi-agent collaborative reasoning
Choose the appropriate thinking mode based on query complexity:
- High-Demand: 64K thinking tokens (best for complex queries)
- Low-Demand: 32K thinking tokens (balanced)
- None: Standard processing (fastest)
Click "Analyze" to initiate the discovery process

Interpreting Results

The analysis results include:

Key molecular pathways
Relevant genes and their roles
Detailed molecular mechanisms
Temporal sequence of events
Supporting experimental evidence
Clinical and therapeutic implications
Confidence score and analysis metrics

Architecture

SciDiscover is built with a modular architecture following multi-agent AI design principles:

Overall Architecture

Knowledge Layer: Entity recognition, knowledge graph construction, PubTator3 integration
Reasoning Layer: LLM-based specialized agents, hypothesis generation, validation
Orchestration Layer: Workflow management, agent coordination, debate orchestration
User Interface: Interactive visualization, configuration controls, progress tracking

Multi-Agent System

SciDiscover implements a specialized multi-agent system based on the SciAgents architecture:

Ontologist Agent: Defines key scientific concepts and their relationships
Scientist Agent: Generates detailed scientific hypotheses based on concepts
Expander Agent: Refines and expands hypotheses with additional context
Critic Agent: Evaluates hypotheses for scientific validity and limitations
Debate Orchestrator: Manages the multi-round debate process between agents

Extended Thinking

The system leverages Claude's extended thinking capabilities through:

High Thinking Mode: 64K thinking tokens, 80K output tokens
Low Thinking Mode: 32K thinking tokens, 64K output tokens
Standard Mode: No extended thinking, 32K output tokens

Development

Complete Project Structure

scidiscover/
├── collaboration/     # Collaborative hypothesis building
│   └── gamification.py  # Scoring and rewards system
├── knowledge/         # Knowledge integration components
│   ├── graph.py         # Base knowledge graph implementation
│   ├── kg_coi.py        # KG-COI reasoning implementation
│   └── pubtator.py      # PubTator3 API integration
├── orchestrator/      # Scientific workflow coordination
│   └── workflow.py      # Main workflow orchestration
├── output/            # Output formatting utilities
│   └── formatter.py     # Format for display
├── reasoning/         # AI reasoning and hypothesis generation
│   ├── agents.py        # Specialized agent implementations
│   ├── debate_orchestrator.py  # Multi-agent debate system
│   ├── hypothesis.py    # Hypothesis generation
│   ├── kg_reasoning.py  # Graph-based reasoning
│   ├── llm_manager.py   # LLM API integration
│   └── sci_agent.py     # Main scientific agent
├── ui/                # Streamlit interface components
│   ├── components.py    # Reusable UI elements
│   └── pages.py         # Page definitions
├── config.py          # Configuration settings
└── snapshot.py        # Versioning and snapshot management

Snapshot System

SciDiscover includes a sophisticated snapshot system for versioning analyses and preserving research states:

# Create a new snapshot with timestamp and metadata
python scripts/create_snapshot.py create "My Research Milestone" --description "Key findings on mechanism X"

# List all available snapshots with creation dates
python scripts/create_snapshot.py list

# Show detailed snapshot information including metadata
python scripts/create_snapshot.py show "My Research Milestone"

# Compare two snapshots to see differences
python scripts/create_snapshot.py compare "Milestone A" "Milestone B"

Advanced Configuration

For advanced users, the following configuration options are available in config.py:

Model selection for both Anthropic and OpenAI
Thinking token allocation for different modes
PubTator API endpoint configuration
Knowledge graph cache settings

License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

Disclaimer

This software is provided "AS IS" without warranty of any kind. SciDiscover utilizes Large Language Models (LLMs) and PubTator, which may generate content that is incorrect, incomplete, or misleading despite best efforts to ensure accuracy.

Important: This software is intended to assist scientific research but should not be used as the sole basis for any scientific conclusions, medical decisions, or policy recommendations.

For the full disclaimer, please see the DISCLAIMER.md file.

Acknowledgments

This research utilizes the Anthropic Claude API
PubTator3 for biomedical entity recognition
NetworkX for knowledge graph implementations

Contact

For questions or collaboration opportunities, please reach out to the ArdeshirLab organization.

Name		Name	Last commit message	Last commit date
Latest commit History 124 Commits
.github		.github
.streamlit		.streamlit
attached_assets		attached_assets
resources		resources
scidiscover		scidiscover
scripts		scripts
snapshots		snapshots
.env.example		.env.example
.gitignore		.gitignore
.replit		.replit
CONTRIBUTING.md		CONTRIBUTING.md
DEPLOY.md		DEPLOY.md
DISCLAIMER.md		DISCLAIMER.md
Dockerfile		Dockerfile
README.md		README.md
SECURITY.md		SECURITY.md
config.py		config.py
generated-icon.png		generated-icon.png
main.py		main.py
pyproject.toml		pyproject.toml
replit.nix		replit.nix
requirements-github.txt		requirements-github.txt
requirements_dev.txt		requirements_dev.txt
setup.py		setup.py
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SciDiscover

An Advanced Scientific Discovery Platform

Core Features

Technologies

Getting Started

Prerequisites

Installation

API Key Setup

Running the Application

Docker Deployment (Optional)

GitHub Codespaces

Usage Guide

Performing Scientific Analysis

Interpreting Results

Architecture

Overall Architecture

Multi-Agent System

Extended Thinking

Development

Complete Project Structure

Snapshot System

Advanced Configuration

License

Disclaimer

Acknowledgments

Contact

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

SciDiscover

An Advanced Scientific Discovery Platform

Core Features

Technologies

Getting Started

Prerequisites

Installation

API Key Setup

Running the Application

Docker Deployment (Optional)

GitHub Codespaces

Usage Guide

Performing Scientific Analysis

Interpreting Results

Architecture

Overall Architecture

Multi-Agent System

Extended Thinking

Development

Complete Project Structure

Snapshot System

Advanced Configuration

License

Disclaimer

Acknowledgments

Contact

About

Resources

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages