Tool: Web Navigation Agent (based on Browser-Use) #4

@rmarcacini


Objective

Develop a new Tool for the Agents4Gov (LABIC – ICMC/USP) project that enables controlled web navigation and data extraction through automated browser interaction.

The tool will be built on top of the browser-use framework and designed for secure, auditable exploration of public websites to collect structured information that can later be processed by LLMs or other Agents4Gov components.


Description

The Web Navigation Tool will provide an interface for Agents4Gov agents to:

  1. Open and interact with web pages.
  2. Click buttons, follow links, and fill forms automatically.
  3. Extract relevant content such as text, tables, or metadata.
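As a rough illustration of the table-extraction step, the sketch below pulls rows out of an HTML table. It is an assumption-laden example, not the tool's design: it uses Python's standard-library `html.parser` as a stand-in for the page content a browser-use session would return, and the names `TableRowParser` and `extract_table_rows` are hypothetical.

```python
from html.parser import HTMLParser
from typing import List, Optional


class TableRowParser(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: List[List[str]] = []
        self._row: Optional[List[str]] = None   # cells of the row being read
        self._cell: Optional[List[str]] = None  # text chunks of the open cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            # Join the chunks and collapse runs of whitespace inside the cell.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_table_rows(html: str) -> List[List[str]]:
    """Return all table rows found in `html` as lists of cell strings."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows
```

Nested tables and `rowspan`/`colspan` are not handled here; a real implementation would likely rely on `beautifulsoup4`, which the issue lists as a dependency.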

This capability will support public-sector use cases such as:

  • Monitoring official portals (e.g., procurement, transparency, environment, or health).
  • Gathering open data from regulatory agencies.
  • Validating publication or update events on public databases.

Functional Requirements

  1. Input

    • A target URL or search query.
    • Optional navigation instructions (e.g., “click the first result”, “extract all table rows”).
    • Optional configuration for maximum depth or page limit.
  2. Browser Session

    • Use browser-use headless automation with secure sandboxing.
    • Each session must be ephemeral and auditable (temporary cache, isolated context).
    • Log every action (URL visited, element clicked, text extracted).
  3. Output

    • The browser-use history for the session (the record of actions taken, URLs visited, and content extracted).
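The input requirements above could be captured in a small request object. This is a minimal sketch under stated assumptions: the class name `NavigationRequest`, its field names, and the default limits are hypothetical and do not come from the issue.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class NavigationRequest:
    """Hypothetical input for one navigation run (names and defaults are illustrative)."""

    target: str                         # a URL or a free-text search query
    instructions: Optional[str] = None  # e.g. "extract all table rows"
    max_depth: int = 2                  # assumed default, not specified in the issue
    max_pages: int = 10                 # assumed default, not specified in the issue

    def __post_init__(self) -> None:
        if not self.target.strip():
            raise ValueError("target must be a non-empty URL or search query")
        if self.max_depth < 0 or self.max_pages < 1:
            raise ValueError("max_depth must be >= 0 and max_pages >= 1")

    @property
    def is_url(self) -> bool:
        """True when the target parses as an http(s) URL rather than a query."""
        parts = urlparse(self.target.strip())
        return parts.scheme in ("http", "https") and bool(parts.netloc)

    def to_task(self) -> str:
        """Render the request as a natural-language task for the browsing agent."""
        start = f"Open {self.target}" if self.is_url else f"Search the web for: {self.target}"
        steps = f" Then {self.instructions}." if self.instructions else ""
        return f"{start}.{steps} Visit at most {self.max_pages} pages."
```

The resulting task string is what would be handed to the browser-use agent; validating limits up front keeps a malformed request from ever starting a browser session.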

Expected Behavior (User Flow)

  1. The user opens Open WebUI → Tools → Web Navigation Agent.
  2. The user provides a URL or query and an instruction.
  3. The tool runs a browser-use session, navigating according to the instructions.
  4. The user receives:
    • A structured JSON summary of the navigation.
    • A clear message about the number of pages visited and any warnings.
    • Optional LLM-generated textual summary.
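One possible shape for the result returned in step 4 is sketched below. The function name `build_summary`, the JSON keys, and the per-step record format (`url`, `action`, `warning`) are assumptions for illustration; the real structure would follow whatever the browser-use history exposes.

```python
import json
from typing import Any, Dict, List, Optional


def build_summary(steps: List[Dict[str, Any]],
                  llm_summary: Optional[str] = None) -> Dict[str, Any]:
    """Condense per-step records into a JSON-serializable navigation summary."""
    # Distinct URLs, kept in the order they were first visited.
    pages = list(dict.fromkeys(s["url"] for s in steps if s.get("url")))
    warnings = [s["warning"] for s in steps if s.get("warning")]
    result: Dict[str, Any] = {
        "pages_visited": pages,
        "actions": [s["action"] for s in steps if s.get("action")],
        "warnings": warnings,
        "message": f"Visited {len(pages)} page(s) with {len(warnings)} warning(s).",
    }
    if llm_summary:
        # The LLM-generated summary is optional, so the key is only added when present.
        result["llm_summary"] = llm_summary
    return result


if __name__ == "__main__":
    demo = [
        {"url": "https://example.org/", "action": "open"},
        {"url": "https://example.org/", "action": "click"},
        {"url": "https://example.org/data", "action": "extract", "warning": "table truncated"},
    ]
    print(json.dumps(build_summary(demo), indent=2))
```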

Configuration

  • Valve name: llm_web_analyzer (optional)
  • Dependencies: browser-use, requests, beautifulsoup4, playwright (or selenium)
  • Security: run inside a sandboxed environment with network whitelisting (only HTTP/HTTPS).
  • Logs: automatically store navigation logs and extracted text snippets in temporary storage for auditing.
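The security and logging items above might be approached along these lines. This is a sketch, not a sandbox: it only shows a scheme check plus a host allowlist (the host list is an assumption, since the issue specifies only HTTP/HTTPS) and an audit log written to temporary storage. The names `is_allowed` and `AuditLog` are hypothetical.

```python
import json
import tempfile
import time
from pathlib import Path
from typing import Any, Dict, List, Set
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}


def is_allowed(url: str, allowed_hosts: Set[str]) -> bool:
    """Accept only http(s) URLs whose host is an allowed host or one of its subdomains."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    return parts.scheme in ALLOWED_SCHEMES and any(
        host == h or host.endswith("." + h) for h in allowed_hosts
    )


class AuditLog:
    """Append-only JSON Lines log kept in a temporary directory for one session."""

    def __init__(self) -> None:
        self._dir = tempfile.TemporaryDirectory(prefix="webnav-audit-")
        self.path = Path(self._dir.name) / "actions.jsonl"

    def record(self, action: str, url: str, detail: str = "") -> None:
        entry = {"ts": time.time(), "action": action, "url": url, "detail": detail}
        with self.path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(entry, ensure_ascii=False) + "\n")

    def entries(self) -> List[Dict[str, Any]]:
        if not self.path.exists():
            return []
        with self.path.open(encoding="utf-8") as fh:
            return [json.loads(line) for line in fh]

    def close(self) -> None:
        # Deletes the temporary directory; export the log first if it must be retained.
        self._dir.cleanup()
```

Checking each URL with `is_allowed` before navigation and calling `record` after every action would give one log line per visited URL, clicked element, or extracted snippet.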

Deliverables

  • New folder: tools/browser-use/
    • main.py – orchestration of browser-use session and data extraction
    • README.md – usage, examples, safety guidelines
    • requirements.txt – dependencies
    • Optional: test_navigation.py – mock site tests
  • Update docs/README.md to include this tool and usage notes

Acceptance Criteria

  • The tool runs a controlled browser session using browser-use.
  • Accepts a URL or query and optional navigation instructions.
  • Extracts relevant text or tables from public pages.
  • Returns structured JSON results and a summary message.
  • Logs all navigation actions for transparency.
  • Respects privacy, sandboxing, and open data restrictions.
  • Works when imported and executed via Open WebUI Tools module.
