- m&m’s: A Benchmark to Evaluate Tool-Use for multi-step multi-modal Tasks
- multi-modal
- AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents
- ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs
- ToolBench
- Gorilla: Large Language Model Connected with Massive APIs
- code output
- AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors
- CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets
- α-UMi: Small LLMs Are Weak Tool Learners: A Multi-LLM Agent
- AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls
- WISSNYF: Tool Grounded LLM Agents for Black Box Setting
- τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
- GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction
- AI Agents That Matter
- groupchat, nestedchat example
- Custom LLM + function calling
- https://microsoft.github.io/autogen/docs/topics/non-openai-models/local-litellm-ollama/
- https://www.reddit.com/r/ollama/comments/1bacf8c/anyone_had_success_with_function_calling_its/
- https://www.reddit.com/r/AutoGenAI/comments/1b57l2t/trying_to_get_autogen_to_work_with_ollama_and/
- https://github.com/marklysze/AutoGenCodeTesting/blob/master/function_calling/function_calling_test.py
- AutoGen Discord (alt-models channel) - https://discord.com/channels/1153072414184452236/1201369716057440287
- Articles on AutoGen function calling