Skip to content

Implement Agentic Multi-Step Chat Subsystem and Non-Blocking Architecture#176

Merged
kargig merged 3 commits intomainfrom
feature/cardinal-direction-filtering
Mar 8, 2026
Merged

Implement Agentic Multi-Step Chat Subsystem and Non-Blocking Architecture#176
kargig merged 3 commits intomainfrom
feature/cardinal-direction-filtering

Conversation

@kargig
Copy link
Owner

@kargig kargig commented Mar 8, 2026

Summary

This PR fundamentally transforms the Divemap chatbot from a rigid, single-step intent extraction pipeline into a robust, multi-step ReAct (Reasoning and Acting) agent powered by OpenAI Tool Calling. It also modernizes the backend architecture to be fully non-blocking, resolving event loop lockups during intensive AI or database tasks.

Changes Made

Core Chat Architecture

  • Agentic ReAct Loop: Migrated from a monolithic SearchIntent schema to discrete OpenAI Tool Calling. The ChatService now utilizes a while loop allowing the LLM to execute multiple specialized tools in succession to fulfill complex queries.
  • Modularized Chat Package: Split the previous monolithic chat_service.py into a dedicated backend/app/services/chat/ package with specialized modules for tool definitions, executors, and utilities.
  • Discrete Tools: Implemented 8 specialized tools:
    • search_dive_sites: Spatial and keyword-based site discovery.
    • search_diving_centers: Find dive centers with support for gear rental Fallbacks.
    • search_marine_life: Biodiversity-focused discovery.
    • calculate_diving_physics: High-precision physics engine (MOD, SAC, EAD, etc.).
    • get_weather_suitability: Wind and safety forecast integration.
    • recommend_dive_sites: Personalized, context-aware site suggestions.
    • search_certifications: Knowledge base for diving agencies and career paths.
    • ask_user_for_clarification: Allows the agent to pause and request missing data.

Performance & Reliability

  • Non-blocking Execution: Migrated the OpenAI client to AsyncOpenAI.
  • Thread Pool Delegation: Wrapped all synchronous SQLAlchemy and physics operations in FastAPI's run_in_threadpool, ensuring the event loop remains free to serve other requests during chat processing.
  • Hallucination Prevention: Refined system prompts to strictly enforce relative markdown links, eliminating hallucinated domain names (e.g., .ai, .app).
  • Pydantic Validation: Updated SearchIntent schemas to support polymorphic data types (strings/floats) in calculator parameters.

Evaluation & Tooling

  • Tiered Quality Suite: Enhanced evaluate_chat_quality.py with 6 tiers of test cases (Knowledge -> Basic Tools -> Complex Discovery -> Calculators -> Agentic Multi-Step).
  • Comprehensive Logging: Updated evaluation reports to include full response text, structured sources, and backend base URLs for automated quality comparisons.

Testing

  • Automated Tests: All backend chat tests passed successfully (tests/test_chat_api.py, tests/test_api_chat_recommendation.py, tests/test_chat_comparison.py, etc.).
  • Regression Testing: Verified functional parity with commit 6784a87 to ensure no legacy filtering logic (snorkel heuristics, cardinal directions) was lost during refactoring.
  • Manual Verification: Verified non-blocking behavior by hitting /health endpoints while multi-step agent loops were active.
  • Quality Evaluation: Ran the new evaluate_chat_quality.py suite achieving a high pass rate across all tool categories.

Related Issues

  • Resolves: # [Add Issue Number]
  • Part of: feature/cardinal-direction-filtering

Additional Notes

  • Breaking Change: Removed the intent key from the default ChatResponse JSON as part of the move toward a tool-history-based architecture (preserved internal mappings for backward compatibility where possible).
  • Deployment: Requires a docker restart divemap_backend after deployment to initialize the new AsyncOpenAI client.
  • Reviewer Note: Focus on the logic in backend/app/services/chat/chat_service.py which manages the multi-turn agent loop.

kargig added 3 commits March 8, 2026 14:05
Implement a comprehensive context-aware search system for the AI assistant,
supporting geographical targeting, universal page context analysis, and
an agentic multi-step processing architecture.

- Refactor ChatService into an agentic loop for multi-step intent fulfillment
- Implement universal 'PageViewContext' to capture URL state on every page
- Add backend resolver to translate raw paths/IDs into descriptive summaries
- Implement logic to filter by cardinal directions (North, South, etc.)
- Add 'Region Promotion' to automatically map towns to parent regions
- Implement automatic country/region context resolution via coords or IP
- Add JSON repair and retry logic to OpenAIService for increased reliability
- Implement fuzzy location matching for specific dive site name resolution
- Add 'intermediate_steps' and 'entity_type_filter' to chat schemas
- Expand regression suite with complex geographical and contextual cases
Split the monolithic chat_service.py into a modular package to improve
maintainability and readability. Extracted intent extraction, response
generation, context resolution, weather enrichment, and search executors
into dedicated modules within backend/app/services/chat/. Updated all
routers and tests to use the new module structure while preserving exact
functional parity.
Upgrade the chat subsystem from a rigid, single-step intent extraction
pipeline to a multi-step ReAct (Reasoning and Acting) loop powered by
OpenAI Tool Calling. This transition improves answer accuracy through
self-correction and ambiguity handling while simplifying future feature
extensibility.

Adopt non-blocking architecture by migrating to AsyncOpenAI and
delegating synchronous database operations to a thread pool. This
resolves event loop lockups during intensive LLM or database tasks.

Refine system prompts to eliminate domain hallucinations and enforce
sensible safety defaults. Enhance the quality evaluation suite with
tiered test cases and comprehensive logging to better monitor agent
performance across all supported backend tools.
@kargig kargig force-pushed the feature/cardinal-direction-filtering branch from 01d264c to e3d0eda Compare March 8, 2026 17:27
@kargig kargig changed the title Refactor chatbot to agentic architecture and add geographic targeting Implement Agentic Multi-Step Chat Subsystem and Non-Blocking Architecture Mar 8, 2026
@kargig kargig merged commit 6f2083b into main Mar 8, 2026
2 checks passed
kargig added a commit that referenced this pull request Mar 10, 2026
…ture (#176)

* Refactor chatbot to agentic architecture and add geographic targeting

Implement a comprehensive context-aware search system for the AI assistant,
supporting geographical targeting, universal page context analysis, and
an agentic multi-step processing architecture.

- Refactor ChatService into an agentic loop for multi-step intent fulfillment
- Implement universal 'PageViewContext' to capture URL state on every page
- Add backend resolver to translate raw paths/IDs into descriptive summaries
- Implement logic to filter by cardinal directions (North, South, etc.)
- Add 'Region Promotion' to automatically map towns to parent regions
- Implement automatic country/region context resolution via coords or IP
- Add JSON repair and retry logic to OpenAIService for increased reliability
- Implement fuzzy location matching for specific dive site name resolution
- Add 'intermediate_steps' and 'entity_type_filter' to chat schemas
- Expand regression suite with complex geographical and contextual cases

* Refactor chat service into modular architecture

Split the monolithic chat_service.py into a modular package to improve
maintainability and readability. Extracted intent extraction, response
generation, context resolution, weather enrichment, and search executors
into dedicated modules within backend/app/services/chat/. Updated all
routers and tests to use the new module structure while preserving exact
functional parity.

* Implement agentic tool-calling loop for chatbot

Upgrade the chat subsystem from a rigid, single-step intent extraction
pipeline to a multi-step ReAct (Reasoning and Acting) loop powered by
OpenAI Tool Calling. This transition improves answer accuracy through
self-correction and ambiguity handling while simplifying future feature
extensibility.

Adopt non-blocking architecture by migrating to AsyncOpenAI and
delegating synchronous database operations to a thread pool. This
resolves event loop lockups during intensive LLM or database tasks.

Refine system prompts to eliminate domain hallucinations and enforce
sensible safety defaults. Enhance the quality evaluation suite with
tiered test cases and comprehensive logging to better monitor agent
performance across all supported backend tools.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant