Consider offloading batching to LiteLLM

LiteLLM supports [heterogeneous batching](https://docs.litellm.ai/docs/completion/batching), where parallelism across requests is handled entirely on the provider side rather than the application side. We should use that API to support things like #407 #388  unless it becomes clear that a model or provider we need is not supported or some important behavior is missing.