LiteLLM supports heterogeneous batching, where parallelism across requests is handled entirely on the provider side rather than the application side. We should use that API to support things like #407 and #388, unless it becomes clear that a model or provider we need is not supported or that some important behavior is missing.
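
For discussion, here is a rough sketch of what this could look like. It assumes the API in question is LiteLLM's OpenAI-style batch interface (`litellm.create_file` followed by `litellm.create_batch`); if a different LiteLLM entry point was meant, the shape will differ. The helper names `build_batch_lines` and `submit_batch`, the `request-N` ID scheme, and the `gpt-4o-mini` model name in the usage note are placeholders, not anything from our codebase. The requests vary only in their parameters because some providers restrict a single batch file to one model.

```python
import json
import tempfile
from typing import Any


def build_batch_lines(requests: list[dict[str, Any]]) -> list[str]:
    """Serialize chat requests into OpenAI-style batch JSONL lines.

    Each request body may carry its own parameters (temperature,
    max_tokens, ...), which is what makes the batch heterogeneous.
    """
    lines = []
    for i, body in enumerate(requests):
        lines.append(json.dumps({
            "custom_id": f"request-{i}",  # hypothetical ID scheme
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": body,
        }))
    return lines


def submit_batch(requests: list[dict[str, Any]], provider: str = "openai") -> str:
    """Upload the JSONL file and start a batch; returns the batch ID.

    Assumes litellm.create_file / litellm.create_batch behave as in the
    LiteLLM batch docs; verify against the installed version.
    """
    import litellm  # imported lazily so build_batch_lines works without it

    # Write the requests to a temporary JSONL file for upload.
    with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as f:
        f.write("\n".join(build_batch_lines(requests)) + "\n")
        path = f.name

    with open(path, "rb") as fh:
        file_obj = litellm.create_file(
            file=fh, purpose="batch", custom_llm_provider=provider
        )

    # From here on, scheduling and parallelism are the provider's job.
    batch = litellm.create_batch(
        completion_window="24h",
        endpoint="/v1/chat/completions",
        input_file_id=file_obj.id,
        custom_llm_provider=provider,
    )
    return batch.id
```

Usage would be something like `submit_batch([{"model": "gpt-4o-mini", "messages": [...], "temperature": 0.0}, {"model": "gpt-4o-mini", "messages": [...], "max_tokens": 64}])`, followed by polling the returned batch ID. Result retrieval and error handling are left out here.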