GitHub - r2llab/GTTA: This codebase is to reproduce the results of the paper "Grounded Test-Time Adaptation for LLM Agents".

Test-Time Adaptation for LLM Agents via Environment Interaction

This repository contains the code to reproduce the results of the paper "Test-Time Adaptation for LLM Agents via Environment Interaction". The paper proposes an adaptation framework called Grounded Test-Time Adaptation (GTTA).

[Figure panels: Syntactic Alignment (SA) | Dynamics Grounding (DG)]

WebArena

We adopt NNetnav's codebase for web navigation exploration and task evaluation. To reproduce our results on WebArena, please refer to this.

BFCLv3

For the BFCLv3 experiment, we build our method on top of the official gorilla codebase. To reproduce our results on BFCLv3, please refer to this.

Tau-Bench

For the Tau-Bench experiment, please refer to the official codebase with syntactic alignment (SA), the parametric adaptation component, enabled.

Citation

If you find this work useful, please cite:

@inproceedings{chentest,
  title={Test-Time Adaptation for LLM Agents via Environment Interaction},
  author={Chen, Arthur and Liu, Zuxin and Zhang, Jianguo and Prabhakar, Akshara and Liu, Zhiwei and Heinecke, Shelby and Savarese, Silvio and Zhong, Victor and Xiong, Caiming},
  booktitle={The Fourteenth International Conference on Learning Representations}
}

License

This work is licensed under the MIT License.

