
Optimize Gemini API Chaining and Token Usage #1

@Farhodoff

Description


Currently, AI requests are sent without caching or batching, leading to token waste and frequent 429 rate limit errors. We need to implement a caching layer and sequential batching for large requests.
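A minimal sketch of the proposed fix, assuming an in-memory cache keyed by a hash of the prompt and a fixed delay between uncached requests. The `CachedBatcher` class and the `send_fn` callable are hypothetical names: `send_fn` stands in for whatever function actually calls the Gemini API in this project.

```python
import hashlib
import time


class CachedBatcher:
    """Cache identical prompts and resolve large request sets
    sequentially, pausing between calls to stay under rate limits."""

    def __init__(self, send_fn, delay_s=1.0):
        self.send_fn = send_fn    # callable: prompt -> response text
        self.delay_s = delay_s    # pause after each uncached request
        self.cache = {}

    def _key(self, prompt):
        # Hash the prompt so the cache key stays small.
        return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    def ask(self, prompt):
        key = self._key(prompt)
        if key in self.cache:
            return self.cache[key]          # cache hit: no tokens spent
        response = self.send_fn(prompt)     # one request to the model
        self.cache[key] = response
        time.sleep(self.delay_s)            # throttle to avoid 429s
        return response

    def ask_batch(self, prompts):
        # Sequential batching: one prompt at a time, so a large job
        # never bursts past the rate limit.
        return [self.ask(p) for p in prompts]
```

Repeated prompts in a batch hit the cache and are never re-sent, which is where the token savings come from; a production version would likely add a TTL and persist the cache across processes.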

Labels: enhancement (New feature or request)
