@@ -71,6 +71,9 @@ You can control costs using the following strategies:
* **Machine learning trained model autoscaling**: [Trained model autoscaling](/deploy-manage/autoscaling/trained-model-autoscaling.md) is always enabled and cannot be disabled, ensuring efficient resource usage, reduced costs, and optimal performance without manual configuration.

Trained model deployments automatically scale down to zero allocations after 24 hours without any inference requests. When they scale up again, they remain active for 5 minutes before they can scale down. During these cooldown periods, you will continue to be billed for the active resources.
* **Ingest VCU scaling**: Ingest VCU consumption scales with your indexing load. After 15 minutes with no ingest activity, ingest capacity scales down to zero, so ingest charges are minimal during idle periods. Continuous indexing requests prevent idle windows, so ingest VCUs remain provisioned and you continue to incur costs.

@shainaraskas (Member), Jun 18, 2026

I think maybe we should reframe this using the lever as the title "batch indexing" or similar ... this also feels like it might be part of the "indexing strategies" section

if you break these into subheadings, you could put these two things underneath so people know that both the shape of their data and their indexing rate/strategy matter

alternatively, "indexing strategies" could get a better title like "index size / structure" (not with a slash, but you get it)


Where your use case allows, batch documents into fewer, larger [`_bulk`]({{cloud-serverless-apis}}operation/operation-bulk) requests and group indexing into scheduled batches to maximize idle time and reduce ingest VCU usage.
* **Indexing strategies**: Consider your indexing strategies and how they might impact overall VCU usage and costs.
To ensure optimal performance and cost-effectiveness for your project, it's important to consider how you structure your data.
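
As a rough illustration of the batching advice under **Ingest VCU scaling**, the following sketch groups documents into fewer, larger `_bulk` request bodies. It only builds the newline-delimited JSON payloads and does not send them. The function name `bulk_bodies`, the index name `my-index`, and the batch size of 500 are assumptions made for this example, not recommended values.

```python
import json
from typing import Iterable, Iterator


def bulk_bodies(docs: Iterable[dict], index: str, batch_size: int = 500) -> Iterator[str]:
    """Yield NDJSON bodies for the _bulk API, each holding up to batch_size documents.

    Hypothetical helper: the name and the default batch size are illustrative.
    """
    lines: list[str] = []
    count = 0
    for doc in docs:
        # Each document takes two lines: an action line, then the document source.
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
        count += 1
        if count == batch_size:
            # A bulk body must end with a newline.
            yield "\n".join(lines) + "\n"
            lines, count = [], 0
    if lines:
        # Flush the final, possibly smaller, batch.
        yield "\n".join(lines) + "\n"


# 1,200 documents become 3 request bodies instead of 1,200 single-document requests.
bodies = list(bulk_bodies(({"n": i} for i in range(1200)), "my-index", batch_size=500))
```

Each body would then be sent as one `POST` to the `_bulk` endpoint with the `application/x-ndjson` content type. Running such a job on a schedule, rather than indexing continuously, is what leaves idle windows between batches.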
