coreason-navigator

coreason-navigator is the "Eyes and Hands" of the CoReason platform for the World Wide Web. It bridges the gap between modern AI agents and legacy web interfaces using State-of-the-Art (SOTA) "Computer Use" techniques.

Features

Visual Navigation: Agents use a headless browser (Playwright) to "see" and interact with webpages just like a human, bypassing the need for fragile API integrations.
Vision-Language Model (VLM) Integration: Uses screenshots and Accessibility Trees (AX Tree) to ground LLM intent into precise screen coordinates.
Robust Orchestration: Handles dynamic content, session persistence, and stealth techniques (e.g., User-Agent rotation) to avoid detection.
Set-of-Marks (SoM): Overlays numeric tags on interactive elements to improve VLM accuracy.
Content Extraction: Converts noisy webpages into clean Markdown for LLM consumption.
Safety First: Includes rate limiting, domain allowlisting, and PII input protection.

Installation

pip install coreason-navigator

Or install from source:

git clone https://github.com/CoReason-AI/coreason_navigator.git
cd coreason_navigator
pip install .

You will also need to install Playwright browsers:

playwright install chromium

Usage

Here is a simple example of how to use the PlaywrightNavigator:

import asyncio
from coreason_navigator.driver import PlaywrightNavigator
from coreason_navigator.types import GotoAction, ClickAction

async def main():
    # Initialize the navigator (headless by default)
    navigator = PlaywrightNavigator(headless=True)

    try:
        # Launch the browser
        await navigator.launch()

        # Navigate to a URL
        print("Navigating to example.com...")
        state = await navigator.navigate("https://example.com")
        print(f"Title: {state.title}")

        # Take a screenshot (base64 encoded)
        # print(state.screenshot_base64[:50] + "...")

        # Extract content
        content = await navigator.extract_content(format="markdown")
        print("Page Content Summary:")
        print(content[:200])

    finally:
        # Always close resources
        await navigator.close()

if __name__ == "__main__":
    asyncio.run(main())

Architecture

Observe: Captures screenshot and Accessibility Tree.
Orient: Maps user intent to screen coordinates using VLM.
Decide: Formulates browser actions (Click, Type, Scroll).
Act: Executes actions via Playwright.

License

This software is proprietary and dual-licensed. Licensed under the Prosperity Public License 3.0. Commercial use beyond a 30-day trial requires a separate license.

Name		Name	Last commit message	Last commit date
Latest commit History 31 Commits
.github/workflows		.github/workflows
docs		docs
src/coreason_navigator		src/coreason_navigator
tests		tests
.dockerignore		.dockerignore
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
AGENTS.md		AGENTS.md
Dockerfile		Dockerfile
LICENSE		LICENSE
NOTICE		NOTICE
README.md		README.md
VIGNETTE.md		VIGNETTE.md
codecov.yml		codecov.yml
mkdocs.yml		mkdocs.yml
poetry.lock		poetry.lock
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

coreason-navigator

Features

Installation

Usage

Architecture

License

About

Uh oh!

Releases 3

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

coreason-navigator

Features

Installation

Usage

Architecture

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 3

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages