Add agentic AI architecture research #227

nforro wants to merge 1 commit into packit:main from nforro:ai-architecture

# Agentic AI architecture

## Can we use the native thinking mode of models instead of the think tool with RequirementAgent? Would it be possible to make that work with BeeAI? If not, can we tune the arguments of RequirementAgent?

LiteLLM supports translating the `reasoning_effort` option into model-specific reasoning/thinking configuration,
making it possible to control the level of reasoning of models that support it. It also converts model thoughts
into `reasoning_content` (universal) and `thinking_blocks` (Anthropic-specific) attributes of assistant messages.

Docs: https://docs.litellm.ai/docs/reasoning_content
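
A minimal sketch of what this looks like on the LiteLLM side (the model name is a placeholder for any reasoning-capable model):

```python
import litellm

response = litellm.completion(
    model="anthropic/claude-sonnet-4-20250514",  # placeholder model name
    messages=[{"role": "user", "content": "What is 17 * 24?"}],
    # LiteLLM translates this option into the provider-specific thinking
    # configuration, e.g. Anthropic's `thinking` parameter.
    reasoning_effort="low",
)

message = response.choices[0].message
print(message.reasoning_content)  # the model's thoughts (provider-agnostic)
print(message.content)            # the final answer
```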

BeeAI doesn't support this, but implementing it is quite simple: it is mostly a matter of propagating
the options/attributes, combined with, for instance, a custom middleware.

So the answer is yes, this is definitely possible. However, as it turns out, Anthropic models don't support reasoning
together with tool constraints. Considering that BeeAI's `RequirementAgent` is built on tool constraints
and `ToolCallingAgent` has just been deprecated in its favor, this means Anthropic models can't work with BeeAI
with reasoning enabled.
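
To make the conflict concrete, here is a sketch of the failing combination via LiteLLM. My understanding is that
Anthropic's extended thinking only allows automatic tool choice, so forcing a specific tool (which is what tool
constraints boil down to) is rejected; the exact error type and message are assumptions and may vary:

```python
import litellm

# A hypothetical "think" tool, standing in for what RequirementAgent forces.
tools = [
    {
        "type": "function",
        "function": {
            "name": "think",
            "description": "Record a thought.",
            "parameters": {
                "type": "object",
                "properties": {"thought": {"type": "string"}},
                "required": ["thought"],
            },
        },
    }
]

try:
    litellm.completion(
        model="anthropic/claude-sonnet-4-20250514",  # placeholder model name
        messages=[{"role": "user", "content": "Plan the next step."}],
        tools=tools,
        # Forcing a specific tool call (a tool constraint)...
        tool_choice={"type": "function", "function": {"name": "think"}},
        # ...combined with extended thinking is rejected by Anthropic.
        reasoning_effort="medium",
    )
except litellm.BadRequestError as e:
    print(f"rejected: {e}")
```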

## Is BeeAI's "RequirementAgent with conditions" concept the best for our purposes? Do the benefits outweigh the model-specific issues with tool calls?

Without model-native reasoning enabled, I would say it is necessary: being able to observe model thoughts is essential
for debugging. With model-native reasoning, we no longer need the `Think` tool and the benefits of the requirements
become questionable. However, they are probably the only way to force a model to call a certain tool when it tends not to,
and I remember some instances of this happening (upstream patch verification). That being said, even without
the `Think` tool we would still have issues with models that don't want to respect the constraints.
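
For reference, a minimal sketch of forcing tool calls with `RequirementAgent` and conditions, modeled on the current
BeeAI docs; given the rapid development mentioned below, the exact module paths and signatures are assumptions:

```python
# A sketch of RequirementAgent with conditional requirements; module paths
# and parameters follow the current BeeAI docs but may shift between releases.
import asyncio

from beeai_framework.agents.experimental import RequirementAgent
from beeai_framework.agents.experimental.requirements.conditional import ConditionalRequirement
from beeai_framework.backend import ChatModel
from beeai_framework.tools.think import ThinkTool
from beeai_framework.tools.weather.openmeteo import OpenMeteoTool


async def main() -> None:
    agent = RequirementAgent(
        llm=ChatModel.from_name("ollama:granite3.3:8b"),  # placeholder model
        tools=[ThinkTool(), OpenMeteoTool()],
        requirements=[
            # Force the model to "think" first, before any other tool call...
            ConditionalRequirement(ThinkTool, force_at_step=1),
            # ...and make sure the weather tool is called at least once.
            ConditionalRequirement(OpenMeteoTool, min_invocations=1),
        ],
    )
    response = await agent.run("What is the current weather in Brno?")
    print(response.answer.text)


asyncio.run(main())
```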

## Do we need a framework at all? Would it be possible to build the agents directly on top of the VertexAI API? Would that bring us any benefits?

Strictly speaking, probably not. However, using a framework gives us more flexibility if we want to switch providers
in the future, and it gives us the benefits of API unification and caching. Also, I've just learned that, for example,
Anthropic models behave a bit differently via the VertexAI API than via the Anthropic API, so if we decided to go with
a pure VertexAI client, we could be locking ourselves out of some features.

I think preserving at least LiteLLM as a low-level layer makes sense, but I believe BeeAI has its place too.
One of its biggest disadvantages is its rapid development, which brings breaking changes; not keeping up with them
just makes it more difficult to adapt in the future. On the other hand, the upstream is very responsive and supportive,
paying attention to our issues even when they may not be strictly aligned with their focus.

I think we should decide how flexible we need to be going forward and do further research, for example by implementing
a demo agent using a VertexAI API client and an agent built on top of LiteLLM without BeeAI, and evaluating the
pros and cons of each approach. We also need to consider the previous points (tool constraints and reasoning).
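
As a rough illustration of the frameworkless direction, a plain tool-calling loop on top of LiteLLM alone could look
like this (the model name and the example tool are hypothetical placeholders):

```python
# A minimal frameworkless agent: a plain tool-calling loop on top of LiteLLM.
import json

import litellm


def get_package_version(package: str) -> str:
    """Hypothetical example tool."""
    return json.dumps({"package": package, "version": "1.2.3"})


TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_package_version",
            "description": "Get the latest version of a package.",
            "parameters": {
                "type": "object",
                "properties": {"package": {"type": "string"}},
                "required": ["package"],
            },
        },
    }
]

messages = [{"role": "user", "content": "What is the latest version of packit?"}]

while True:
    response = litellm.completion(
        model="vertex_ai/claude-sonnet-4",  # placeholder model name
        messages=messages,
        tools=TOOLS,
    )
    message = response.choices[0].message
    messages.append(message.model_dump())
    if not message.tool_calls:
        # No more tool calls requested: this is the final answer.
        print(message.content)
        break
    # Execute each requested tool call and feed the result back to the model.
    for call in message.tool_calls:
        args = json.loads(call.function.arguments)
        result = get_package_version(**args)
        messages.append(
            {"role": "tool", "tool_call_id": call.id, "content": result}
        )
```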

## How complicated would it be to get a more searchable solution when debugging agentic runs? Phoenix is amazing at visualizing the runs, but the lack of "easy search in a text file" makes debugging longer. Also, the default BeeAI middleware prints everything, which makes the output hard to consume.

It should be quite easy, just a matter of implementing a custom middleware. It could mirror what `openinference-instrumentation-beeai`
traces and Phoenix displays, or it could be something else entirely; thanks to the BeeAI emitter system, the possibilities are virtually
limitless. Even the default middleware has several filters and options that, if tweaked, could make it more usable,
but I think implementing a custom middleware is the way to go. It will also be necessary for logging model-native
reasoning content (see the first point).
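
A rough sketch of the idea, assuming the `observe()`/emitter API from the current BeeAI docs; the event matcher
pattern, payload shapes, and the log path are assumptions:

```python
# A sketch of plain-text, greppable trace logging via the BeeAI emitter
# system, as an alternative to the all-inclusive default middleware output.
import asyncio

from beeai_framework.agents.experimental import RequirementAgent
from beeai_framework.backend import ChatModel
from beeai_framework.emitter import EventMeta

LOG_FILE = "agent-run.log"  # hypothetical path


def log_event(data, meta: EventMeta) -> None:
    # One truncated line per event instead of the full message history.
    with open(LOG_FILE, "a") as f:
        f.write(f"{meta.path}: {str(data)[:500]}\n")


async def main() -> None:
    agent = RequirementAgent(
        llm=ChatModel.from_name("ollama:granite3.3:8b"),  # placeholder model
        tools=[],
    )
    response = await agent.run("Say hello").observe(
        # Subscribe to all events emitted during the run.
        lambda emitter: emitter.match("*.*", log_event)
    )
    print(response.answer.text)


asyncio.run(main())
```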

Review comment: @nforro so Loki was not usable for this purpose? You can also integrate with it directly ...

Reply (nforro): I believe this is more about what is passed to Loki (and directly consumable during local runs); the logs generated by the default middleware are extensive and really hard to navigate, because they contain everything, including the ever-growing history of messages exchanged between the model and the framework.