From 3edb779b177b9499431aa5d103e21cc459e7c6fe Mon Sep 17 00:00:00 2001
From: Benjamin ACH
Date: Tue, 12 May 2026 14:06:41 +0200
Subject: [PATCH 01/13] feat(scaling): add container sizing guidance

---
 .../2000-01-01-choosing-container-size.md | 139 ++++++++++++++++++
 .../app/scaling/2000-01-01-scaling.md | 47 +++++-
 2 files changed, 178 insertions(+), 8 deletions(-)
 create mode 100644 src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md

diff --git a/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md b/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md
new file mode 100644
index 000000000..231c4b178
--- /dev/null
+++ b/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md
@@ -0,0 +1,139 @@
+---
+title: Choosing a Container Size
+nav: Choosing a Container Size
+modified_at: 2026-05-12 00:00:00
+tags: app scaling containers memory metrics
+index: 15
+---
+
+Choosing the right container size is a balance between safety, performance, and
+cost. A safe initial size gives your application enough memory headroom while
+you collect real metrics and validate the workload.
+
+For simple applications, the default `M` container size is often a reasonable
+starting point. Choose a larger size from the beginning when you already know
+that your application has higher memory needs, for example because it uses a
+memory-intensive runtime, high concurrency, large in-memory datasets, caches,
+background jobs processing large payloads, or unknown production traffic.
+
+See the [container sizes][container-sizes] page for the available sizes and
+their memory limits.
+
+
+## Start Safe, Then Tune
+
+When you are unsure about the right size, start with a size that is slightly
+larger than your first estimate. After deployment, use metrics, alerts, and
+load testing to adjust the size.
+
+Avoid choosing a smaller size only because the application starts successfully.
+An application can boot with low memory usage and still consume much more
+memory under real traffic, scheduled jobs, large requests, or specific user
+flows.
+
+Before downsizing, validate that the application keeps enough memory headroom
+below the limit over time.
+
+
+## Read Memory Metrics Before Changing Size
+
+Before changing the container size, inspect the memory charts in the
+[Metrics tab][metrics]. Compare the application memory usage with the memory
+quota of the selected container size.
+
+Pay attention to:
+
+- RAM and swap usage.
+- Whether memory usage returns to a stable baseline after traffic peaks.
+- Per-container details when the application runs several containers.
+- Deploy, restart, scale, traffic, and background job events around memory
+  spikes.
+
+Memory usage that keeps increasing over time without returning to a stable
+baseline can indicate a memory leak. In that case, investigate the application
+before relying only on scaling.
+
+
+## Validate With Load Testing
+
+The most reliable way to validate a container size is to test the application
+with realistic load. Use production-like traffic patterns, important endpoints,
+background jobs, and non-sensitive data.
+
+During the test, monitor:
+
+- Memory and swap usage.
+- CPU usage.
+- Response time.
+- 5xx errors.
+- Restart events.
+- Application and database bottlenecks.
+
+If the application approaches its memory limit or starts swapping heavily,
+either increase the container size, reduce memory usage, or tune the runtime
+settings that control memory and concurrency.
+
+{% note %}
+Before running intensive load tests against an application hosted on Scalingo,
+read our [external testing procedures][external-testing].
+{% endnote %}
+
+
+## Choose the Right Remediation
+
+High memory usage does not always require the same response:
+
+- Increase the container size when each container needs more memory to run
+  safely.
+- Add more containers when memory usage increases because traffic increases and
+  the workload can be distributed.
+- Tune runtime memory or concurrency settings when the runtime starts too many
+  workers, uses too large a heap, or keeps too little memory headroom.
+- Reduce application memory usage by reviewing caches, large in-memory data
+  structures, payload processing, or background job behavior.
+- Investigate a memory leak when memory usage keeps growing over time.
+
+If the application is critical or you are unsure about the safest sizing
+strategy, contact Scalingo support.
+
+
+## Monitor Memory Risk
+
+Configure [alerts][alerts] for RAM and swap usage, and keep
+[notifiers][notifiers] configured so the right people receive notifications
+before memory usage becomes critical.
+
+If the application consumes all its available memory, it can be terminated by
+the system. See the [Runtime Issues][oom-diagnosis] page for Out of Memory
+crash diagnosis and recovery guidance.
+
+
+## Runtime Tuning
+
+Scaling changes the resources available to the application. Runtime tuning
+changes how the application uses these resources. Depending on the language and
+runtime, you may need to tune memory limits, heap size, workers, or concurrency.
+
+See the language-specific documentation for the main runtime settings:
+
+- [Go][go]
+- [Java][java]
+- [Node.js][nodejs]
+- [PHP][php]
+- [Python][python]
+- [Ruby][ruby]
+
+
+[alerts]: {% post_url platform/app/2000-01-01-alerts %}
+[container-sizes]: {% post_url platform/internals/2000-01-01-container-sizes %}
+[external-testing]: {% post_url security/procedures/2000-01-01-external-testing %}#can-i-run-a-load-test-on-my-application-that-is-running-on-scalingo
+[metrics]: {% post_url platform/app/2000-01-01-metrics %}
+[notifiers]: {% post_url platform/app/2000-01-01-notifiers %}
+[oom-diagnosis]: {% post_url platform/app/troubleshooting/2000-01-01-runtime-issues %}#out-of-memory-crashes
+
+[ruby]: {% post_url languages/ruby/2000-01-01-start %}#memory-management
+[go]: {% post_url languages/go/2000-01-01-start %}#memory-management
+[java]: {% post_url languages/java/2000-01-01-start %}#memory-management
+[nodejs]: {% post_url languages/nodejs/2000-01-01-start %}#memory-management
+[php]: {% post_url languages/php/2000-01-01-start %}#memory-management
+[python]: {% post_url languages/python/2000-01-01-start %}#memory-management
diff --git a/src/_posts/platform/app/scaling/2000-01-01-scaling.md b/src/_posts/platform/app/scaling/2000-01-01-scaling.md
index da50929f0..b1f40df26 100644
--- a/src/_posts/platform/app/scaling/2000-01-01-scaling.md
+++ b/src/_posts/platform/app/scaling/2000-01-01-scaling.md
@@ -1,7 +1,7 @@
 ---
 title: Scaling Your Application
 nav: Scaling
-modified_at: 2026-01-02 12:00:00
+modified_at: 2026-05-12 00:00:00
 index: 10
 ---

@@ -71,15 +71,40 @@ Here is a quick comparison table, in the context of a Platform as a Service:
 | **When** | Lack or overuse of CPU or RAM (swap increase or decrease) | Increase or decrease in total application traffic |


-## Limitations
+### Scaling and Memory Usage

-- Vertical scaling is limited by the platform. The biggest container we can
-  currently boot is the `2XL` container, with 4GB of RAM. For a comprehensive
-  list of container sizes and corresponding specifications, please see our
+If your application is approaching its memory limit, check its
+[memory metrics][metrics] and apply the same distinction between vertical and
+horizontal scaling to the memory usage pattern.
+
+Use vertical scaling when each container needs more memory to run safely. In
+that case, choose a larger [container size][container-sizes]. Adding more
+containers can help when memory usage increases because traffic increases and
+the workload can be distributed across multiple containers. However, horizontal
+scaling will not fix an application that individually requires more memory than
+the selected container size provides.
+
+If memory usage keeps increasing over time without returning to a stable
+baseline, investigate a possible memory leak before relying only on scaling.
+
+Before choosing a smaller container size, validate your application under
+realistic load and monitor memory usage over time. We recommend keeping enough
+headroom below the memory limit to reduce the risk of
+[Out of Memory crashes][oom-diagnosis]. See
+[Choosing a Container Size][choosing-container-size] for detailed sizing
+guidance, and configure [alerts][alerts] with [app notifiers][notifiers] to be
+warned before memory usage becomes critical.
+
+
+## Scaling Boundaries
+
+- Vertical scaling currently goes up to the `2XL` container size, with 4GB of
+  RAM. For a comprehensive list of container sizes and corresponding
+  specifications, please see our
   [dedicated documentation page]({% post_url platform/internals/2000-01-01-container-sizes %}).
-- Horizontal scaling is limited by default to a maximum of 10 containers per
-  [process type]({% post_url platform/app/2000-01-01-procfile %}). This limit
-  can be increased via our support team.
+- Horizontal scaling is available by default up to 10 containers per
+  [process type]({% post_url platform/app/2000-01-01-procfile %}). This
+  boundary can be increased via our support team.


 ## Costs
@@ -233,3 +258,9 @@ To learn more about events and notifiers, please visit the page dedicated to

 [routing-requests]: {% post_url platform/networking/public/2000-01-01-routing %}#requests-distribution
 [Scalingo Autoscaler]: {% post_url platform/app/scaling/2000-01-01-scalingo-autoscaler %}
+[alerts]: {% post_url platform/app/2000-01-01-alerts %}
+[choosing-container-size]: {% post_url platform/app/scaling/2000-01-01-choosing-container-size %}
+[container-sizes]: {% post_url platform/internals/2000-01-01-container-sizes %}
+[metrics]: {% post_url platform/app/2000-01-01-metrics %}
+[notifiers]: {% post_url platform/app/2000-01-01-notifiers %}
+[oom-diagnosis]: {% post_url platform/app/troubleshooting/2000-01-01-runtime-issues %}#out-of-memory-crashes

From 018f1ce70449a7246850c572677f77f62310606d Mon Sep 17 00:00:00 2001
From: Benjamin ACH
Date: Tue, 12 May 2026 14:10:14 +0200
Subject: [PATCH 02/13] Reorganize the section order

---
 .../platform/app/scaling/2000-01-01-choosing-container-size.md | 2 +-
 src/_posts/platform/app/scaling/2000-01-01-scaling.md | 1 -
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md b/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md
index 231c4b178..87d6d3454 100644
--- a/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md
+++ b/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md
@@ -3,7 +3,7 @@ title: Choosing a Container Size
 nav: Choosing a Container Size
 modified_at: 2026-05-12 00:00:00
 tags: app scaling containers memory metrics
-index: 15
+index: 1
 ---

 Choosing the right container size is a balance between safety, performance, and
diff --git a/src/_posts/platform/app/scaling/2000-01-01-scaling.md b/src/_posts/platform/app/scaling/2000-01-01-scaling.md
index b1f40df26..8f86919c7 100644
--- a/src/_posts/platform/app/scaling/2000-01-01-scaling.md
+++ b/src/_posts/platform/app/scaling/2000-01-01-scaling.md
@@ -70,7 +70,6 @@ Here is a quick comparison table, in the context of a Platform as a Service:
 | **Flexibility** | Low, limited by physical/virtual constraints | High, limited by the application architecture |
 | **When** | Lack or overuse of CPU or RAM (swap increase or decrease) | Increase or decrease in total application traffic |

-
 ### Scaling and Memory Usage

 If your application is approaching its memory limit, check its

From 6c92fb280a27c278cbdf158ce878f17857ff7385 Mon Sep 17 00:00:00 2001
From: Benjamin ACH
Date: Tue, 12 May 2026 14:20:01 +0200
Subject: [PATCH 03/13] docs(scaling): clarify memory sizing constraint

---
 .../app/scaling/2000-01-01-choosing-container-size.md | 6 ++++--
 src/_posts/platform/app/scaling/2000-01-01-scaling.md | 2 +-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md b/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md
index 87d6d3454..ae31e582b 100644
--- a/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md
+++ b/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md
@@ -7,8 +7,10 @@ index: 1
 ---

 Choosing the right container size is a balance between safety, performance, and
-cost. A safe initial size gives your application enough memory headroom while
-you collect real metrics and validate the workload.
+cost. Memory is often the resource that most directly constrains this choice
+because each running container must stay below its own memory quota. A safe
+initial size gives your application enough memory headroom while you collect
+real metrics and validate the workload.

 For simple applications, the default `M` container size is often a reasonable
 starting point. Choose a larger size from the beginning when you already know
diff --git a/src/_posts/platform/app/scaling/2000-01-01-scaling.md b/src/_posts/platform/app/scaling/2000-01-01-scaling.md
index 8f86919c7..369b33493 100644
--- a/src/_posts/platform/app/scaling/2000-01-01-scaling.md
+++ b/src/_posts/platform/app/scaling/2000-01-01-scaling.md
@@ -70,7 +70,7 @@ Here is a quick comparison table, in the context of a Platform as a Service:
 | **Flexibility** | Low, limited by physical/virtual constraints | High, limited by the application architecture |
 | **When** | Lack or overuse of CPU or RAM (swap increase or decrease) | Increase or decrease in total application traffic |

-### Scaling and Memory Usage
+### Memory Usage and Scaling Decisions

 If your application is approaching its memory limit, check its
 [memory metrics][metrics] and apply the same distinction between vertical and

From bd36f825b58cb8160144b3b72b2afce48aa6ff56 Mon Sep 17 00:00:00 2001
From: Benjamin ACH
Date: Tue, 12 May 2026 15:02:33 +0200
Subject: [PATCH 04/13] docs(scaling): keep sizing guidance focused on scaling

---
 .../2000-01-01-choosing-container-size.md | 89 +++++--------------
 1 file changed, 20 insertions(+), 69 deletions(-)

diff --git a/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md b/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md
index ae31e582b..e729cf90d 100644
--- a/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md
+++ b/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md
@@ -18,15 +18,15 @@ that your application has higher memory needs, for example because it uses a
 memory-intensive runtime, high concurrency, large in-memory datasets, caches,
 background jobs processing large payloads, or unknown production traffic.

-See the [container sizes][container-sizes] page for the available sizes and
-their memory limits.
+See the [container sizes][container-sizes] page for the available sizes, memory
+limits, and PID limits.


-## Start Safe, Then Tune
+## Start Safe, Then Adjust

 When you are unsure about the right size, start with a size that is slightly
-larger than your first estimate. After deployment, use metrics, alerts, and
-load testing to adjust the size.
+larger than your first estimate. After deployment, use [metrics][metrics],
+[alerts][alerts], and realistic load testing to adjust the size.

 Avoid choosing a smaller size only because the application starts successfully.
 An application can boot with low memory usage and still consume much more
@@ -37,43 +37,24 @@ Before downsizing, validate that the application keeps enough memory headroom
 below the limit over time.


-## Read Memory Metrics Before Changing Size
+## Validate With Metrics and Load Testing

-Before changing the container size, inspect the memory charts in the
-[Metrics tab][metrics]. Compare the application memory usage with the memory
-quota of the selected container size.
+Before changing the container size, inspect the application charts in the
+[Metrics tab][metrics]. Compare memory usage with the memory quota of the
+selected container size, and also review CPU usage and application-level
+signals.

 Pay attention to:

+- CPU usage.
 - RAM and swap usage.
 - Whether memory usage returns to a stable baseline after traffic peaks.
-- Per-container details when the application runs several containers.
-- Deploy, restart, scale, traffic, and background job events around memory
-  spikes.
-
-Memory usage that keeps increasing over time without returning to a stable
-baseline can indicate a memory leak. In that case, investigate the application
-before relying only on scaling.
-
-
-## Validate With Load Testing
-
-The most reliable way to validate a container size is to test the application
-with realistic load. Use production-like traffic patterns, important endpoints,
-background jobs, and non-sensitive data.
-
-During the test, monitor:
-
-- Memory and swap usage.
-- CPU usage.
 - Response time.
 - 5xx errors.
 - Restart events.
-- Application and database bottlenecks.

-If the application approaches its memory limit or starts swapping heavily,
-either increase the container size, reduce memory usage, or tune the runtime
-settings that control memory and concurrency.
+If production metrics are not enough to validate a size, test the application
+with realistic load and non-sensitive data.

 {% note %}
 Before running intensive load tests against an application hosted on Scalingo,
@@ -81,51 +62,26 @@ read our [external testing procedures][external-testing].
 {% endnote %}


-## Choose the Right Remediation
-
-High memory usage does not always require the same response:
+## Adjust With Scaling

-- Increase the container size when each container needs more memory to run
-  safely.
-- Add more containers when memory usage increases because traffic increases and
-  the workload can be distributed.
-- Tune runtime memory or concurrency settings when the runtime starts too many
-  workers, uses too large a heap, or keeps too little memory headroom.
-- Reduce application memory usage by reviewing caches, large in-memory data
-  structures, payload processing, or background job behavior.
-- Investigate a memory leak when memory usage keeps growing over time.
+Use the general [scaling][scaling] guidance to decide whether the application
+needs vertical scaling, horizontal scaling, or a combination of both.

 If the application is critical or you are unsure about the safest sizing
 strategy, contact Scalingo support.


-## Monitor Memory Risk
+## Monitor Resource Usage

-Configure [alerts][alerts] for RAM and swap usage, and keep
+Configure [alerts][alerts] for critical metrics, and keep
 [notifiers][notifiers] configured so the right people receive notifications
-before memory usage becomes critical.
+before resource usage becomes critical.

 If the application consumes all its available memory, it can be terminated by
 the system. See the [Runtime Issues][oom-diagnosis] page for Out of Memory
 crash diagnosis and recovery guidance.


-## Runtime Tuning
-
-Scaling changes the resources available to the application. Runtime tuning
-changes how the application uses these resources. Depending on the language and
-runtime, you may need to tune memory limits, heap size, workers, or concurrency.
-
-See the language-specific documentation for the main runtime settings:
-
-- [Go][go]
-- [Java][java]
-- [Node.js][nodejs]
-- [PHP][php]
-- [Python][python]
-- [Ruby][ruby]
-
-
 [alerts]: {% post_url platform/app/2000-01-01-alerts %}
 [container-sizes]: {% post_url platform/internals/2000-01-01-container-sizes %}
 [external-testing]: {% post_url security/procedures/2000-01-01-external-testing %}#can-i-run-a-load-test-on-my-application-that-is-running-on-scalingo
@@ -133,9 +89,4 @@ See the language-specific documentation for the main runtime settings:
 [notifiers]: {% post_url platform/app/2000-01-01-notifiers %}
 [oom-diagnosis]: {% post_url platform/app/troubleshooting/2000-01-01-runtime-issues %}#out-of-memory-crashes

-[ruby]: {% post_url languages/ruby/2000-01-01-start %}#memory-management
-[go]: {% post_url languages/go/2000-01-01-start %}#memory-management
-[java]: {% post_url languages/java/2000-01-01-start %}#memory-management
-[nodejs]: {% post_url languages/nodejs/2000-01-01-start %}#memory-management
-[php]: {% post_url languages/php/2000-01-01-start %}#memory-management
-[python]: {% post_url languages/python/2000-01-01-start %}#memory-management
+[scaling]: {% post_url platform/app/scaling/2000-01-01-scaling %}

From 1cb0ec66105850aee4b93d5b9d6731b1ab3e5c65 Mon Sep 17 00:00:00 2001
From: Benjamin ACH
Date: Tue, 12 May 2026 15:14:30 +0200
Subject: [PATCH 05/13] Fix wording, links and anchors

---
 .../scaling/2000-01-01-scalingo-autoscaler.md | 52 +++++++++----------
 1 file changed, 26 insertions(+), 26 deletions(-)
diff --git a/src/_posts/platform/app/scaling/2000-01-01-scalingo-autoscaler.md b/src/_posts/platform/app/scaling/2000-01-01-scalingo-autoscaler.md
index 5460348c6..7274cd9d2 100644
--- a/src/_posts/platform/app/scaling/2000-01-01-scalingo-autoscaler.md
+++ b/src/_posts/platform/app/scaling/2000-01-01-scalingo-autoscaler.md
@@ -1,7 +1,7 @@
 ---
 title: Scalingo Autoscaler
 nav: Scalingo Autoscaler
-modified_at: 2025-12-29 11:04:00
+modified_at: 2026-05-12 00:00:00
 tags: app scaling autoscaling metrics autoscaler
 index: 20
 ---
@@ -21,7 +21,7 @@ metric while remaining within strict boundaries to prevent unforeseen costs.

 An Autoscaler is linked to a [process type]({% post_url platform/app/2000-01-01-procfile %}),
 which means you can have multiple Autoscalers for the same application, as long
-as each one is setup for a different process type. Each Autoscaler can be setup
+as each one is set up for a different process type. Each Autoscaler can be set up
 differently.

 When the configured metric deviates from the defined *target*, the Autoscaler
@@ -44,22 +44,22 @@ to take action:
   round, ensuring a progressive and controlled adjustment. When the metric is
   RPM per container, the Autoscaler is able to add more than one container per
   decision round when required, allowing to scale much faster. The maximum
-  number of containers limit is still honoured in such a case.
+  number of containers still applies in such a case.

 These rules are designed to prevent the application from scaling wildly. They
 make autoscaling effective at handling moderately increasing or decreasing
 metrics, but less effective at managing sudden, massive spikes. For such cases,
-and considering they are predictable, we usually adivse to manually scale the
+and considering they are predictable, we usually advise to manually scale the
 application to an appropriate container formation.


-## Chosing a Metric
+## Choosing a Metric

 An Autoscaler can depend on 6 different metrics:

 | Metric | Kind | Keyword |
 | ------------------------------------------------------------------------ | ----------- | ------------------- |
-| [Requests Per Minute (RPM) per container](rpm-per-container-recommended) | `router` | `rpm_per_container` |
+| [Requests Per Minute (RPM) per container](#rpm-per-container-recommended) | `router` | `rpm_per_container` |
 | [Response Time](#response-time) | `router` | `p95_response_time` |
 | [Number of 5xx errors](#5xx-errors) | `router` | `5XX` |
 | [CPU consumption](#cpu-consumption) | `technical` | `cpu` |
@@ -147,7 +147,7 @@ distribute the load.

 However, if high CPU usage persists despite autoscaling, it may indicate that
 your application requires more powerful containers to run properly. In such
-cases, switching for a larger plan (vertical scaling) can provide more CPU
+cases, switching to a larger plan (vertical scaling) can provide more CPU
 priority per container, thus enhancing the ability of your application to deal
 with resource-intensive scenarios.

@@ -235,7 +235,7 @@ resource-intensive endpoints of your application.

 Although the Scalingo Autoscaler itself is free, the additional containers
 started during a scale-out operation are billed like any other container (on
-the other hand, scaling-in allows to save costs).
+the other hand, scaling-in can save costs).

 Consequently, billing depends on the type of container you chose for your
 application (M is the default container size), on the maximum number of
@@ -247,7 +247,7 @@ workload.

 {% warning %}
 Adding an Autoscaler also immediately enables it!
-Since this can lead to additional costs, please make sure to chose appropriate
+Since this can lead to additional costs, please make sure to choose appropriate
 options and values before validating.
 [It can be disabled](#disabling-an-autoscaler) if needed.
 {% endwarning %}
@@ -258,7 +258,7 @@ options and values before validating.
 2. Click on the **Resources** tab
 3. Locate the **Containers** block
 4. In this block, locate the **Scale** button next to the process type for
-   which you want to setup the Autoscaler
+   which you want to set up the Autoscaler
 5. Click the down arrow next to the **Scale** button
 6. From the dropdown menu, select **Setup autoscaler**
 7. The following popup window appears:
@@ -269,15 +269,15 @@ options and values before validating.

    1. Pick the minimum number of containers for this process type (minimum is 2)
    2. Pick the maximum number of containers for this process type
-   3. Chose the metric to watch
-   4. Chose a value above which the Autoscaler considers scaling-out
+   3. Choose the metric to watch
+   4. Choose a value above which the Autoscaler considers scaling-out
    5. Validate by clicking the **Confirm** button
 9. **The Autoscaler is configured and enabled**

 ### Using the Command Line

-1. Make sure you have correctly [setup the Scalingo command line tool]({% post_url tools/cli/2000-01-01-start %})
-2. From the command line, run the following command to setup the Autoscaler:
+1. Make sure you have correctly [set up the Scalingo command line tool]({% post_url tools/cli/2000-01-01-start %})
+2. From the command line, run the following command to set up the Autoscaler:
    ```bash
    scalingo --app my-app autoscalers-add --container-type \
      --metric --target \
@@ -288,10 +288,10 @@ options and values before validating.
      Name of the process type to scale (e.g. `web`, `clock`, `scheduler`, ...)
    - `metric`\
      Name of the metric to watch.\
-     Please refer to the *Keyword* column of the [metrics table](#available-metrics)
+     Please refer to the *Keyword* column of the [metrics table](#choosing-a-metric)
      for available values.
    - `target`\
-     The value for metric that serves as boundary to trigger a scale operation
+     The metric value that serves as a boundary to trigger a scale operation
    - `min`\
      Minimum number of containers to run
    - `max`\
@@ -321,13 +321,13 @@ options and values before validating.
    `web` process type when the total number of requests received by the
    application divided by the number of running `web` containers exceeds 1000.
    It will start a maximum of 10 containers.\
-   Please refer to the *Keyword* column of the [metrics table](#available-metrics)
+   Please refer to the *Keyword* column of the [metrics table](#choosing-a-metric)
    for available values.


 ## Enabling the Autoscaler

-Enabling (or re-enabling) an Autoscaler allows to put a previously [disabled](#disabling-an-autoscaler)
+Enabling (or re-enabling) an Autoscaler allows you to put a previously [disabled](#disabling-an-autoscaler)
 Autoscaler back in action, using the saved configuration.

 When enabling an Autoscaler, and depending on the current state, the platform
@@ -346,8 +346,8 @@ may decide to either scale-out (i.e. boot up additional containers) or scale-in

 ### Using the Command Line

-1. Make sure you have correctly [setup the Scalingo command line tool]({% post_url tools/cli/2000-01-01-start %})
-2. Make sure you have [added and configured an Autoscaler](#configuring-an-autoscaler)
+1. Make sure you have correctly [set up the Scalingo command line tool]({% post_url tools/cli/2000-01-01-start %})
+2. Make sure you have [added and configured an Autoscaler](#creating-an-autoscaler)
 3. From the command line, enable the Autoscaler:
    ```bash
    scalingo --app my-app autoscalers-enable
@@ -373,14 +373,14 @@ may decide to either scale-out (i.e. boot up additional containers) or scale-in

 ## Disabling an Autoscaler

-Disabling an Autoscaler allows to put it out of action, while saving its
-configuration for later use. It can be [re-enabled](#enabling-an-autoscaler)
+Disabling an Autoscaler allows you to put it out of action, while saving its
+configuration for later use. It can be [re-enabled](#enabling-the-autoscaler)
 anytime.

 Sometimes it can be useful to temporarily disable an Autoscaler to only rely
 on manual scaling, be it for testing purposes, to handle a planned peak such as
 the Christmas period for an e-commerce website, and so on... This feature
-allows to put an Autoscaler aside for an undetermined amount of time, after
+allows you to put an Autoscaler aside for an undetermined amount of time, after
 which it can be re-enabled with the same configuration.

 When disabling an Autoscaler, the platform does not scale-in. The number of
@@ -402,8 +402,8 @@ running containers remains the same.

 ### Using the Command Line

-1. Make sure you have correctly [setup the Scalingo command line tool]({% post_url tools/cli/2000-01-01-start %})
-2. Make sure you have [added and configured an Autoscaler](#configuring-an-autoscaler)
+1. Make sure you have correctly [set up the Scalingo command line tool]({% post_url tools/cli/2000-01-01-start %})
+2. Make sure you have [added and configured an Autoscaler](#creating-an-autoscaler)
 3. From the command line, disable the Autoscaler:
    ```bash
    scalingo --app my-app autoscalers-disable
@@ -429,7 +429,7 @@ running containers remains the same.

 ## Monitoring the Autoscaler

-The following event is available to monitor the Autoscaler executions:
+The following event is available to monitor Autoscaler executions:

 | Event | Description |
 | ------------ | -------------------------------------------------------------------- |

From 0e5006d5c02267e45e192fcc9607766be9ee4040 Mon Sep 17 00:00:00 2001
From: Benjamin ACH
Date: Tue, 12 May 2026 15:14:56 +0200
Subject: [PATCH 06/13] Fix typos

---
 .../app/scaling/2000-01-01-choosing-container-size.md | 1 -
 src/_posts/platform/app/scaling/2000-01-01-scaling.md | 6 +++---
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md b/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md
index e729cf90d..3851ca5d8 100644
--- a/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md
+++ b/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md
@@ -88,5 +88,4 @@ crash diagnosis and recovery guidance.
 [metrics]: {% post_url platform/app/2000-01-01-metrics %}
 [notifiers]: {% post_url platform/app/2000-01-01-notifiers %}
 [oom-diagnosis]: {% post_url platform/app/troubleshooting/2000-01-01-runtime-issues %}#out-of-memory-crashes
-
 [scaling]: {% post_url platform/app/scaling/2000-01-01-scaling %}
diff --git a/src/_posts/platform/app/scaling/2000-01-01-scaling.md b/src/_posts/platform/app/scaling/2000-01-01-scaling.md
index 369b33493..44d09ccfc 100644
--- a/src/_posts/platform/app/scaling/2000-01-01-scaling.md
+++ b/src/_posts/platform/app/scaling/2000-01-01-scaling.md
@@ -95,15 +95,15 @@ guidance, and configure [alerts][alerts] with [app notifiers][notifiers] to be
 warned before memory usage becomes critical.


-## Scaling Boundaries
+## Scaling Limits

 - Vertical scaling currently goes up to the `2XL` container size, with 4GB of
   RAM. For a comprehensive list of container sizes and corresponding
   specifications, please see our
   [dedicated documentation page]({% post_url platform/internals/2000-01-01-container-sizes %}).
 - Horizontal scaling is available by default up to 10 containers per
-  [process type]({% post_url platform/app/2000-01-01-procfile %}). This
-  boundary can be increased via our support team.
+  [process type]({% post_url platform/app/2000-01-01-procfile %}). This limit
+  can be increased via our support team.


 ## Costs

From d15538d131656a57f5fd5573c2dc427028674bff Mon Sep 17 00:00:00 2001
From: Benjamin ACH
Date: Tue, 12 May 2026 15:20:29 +0200
Subject: [PATCH 07/13] Fix another dead anchor

---
 .../platform/app/troubleshooting/2000-01-01-runtime-issues.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/_posts/platform/app/troubleshooting/2000-01-01-runtime-issues.md b/src/_posts/platform/app/troubleshooting/2000-01-01-runtime-issues.md
index 031820d95..d33059bbb 100644
--- a/src/_posts/platform/app/troubleshooting/2000-01-01-runtime-issues.md
+++ b/src/_posts/platform/app/troubleshooting/2000-01-01-runtime-issues.md
@@ -46,8 +46,8 @@ error and the impact it has on your application:

 The very first step to mitigate the consequences of a Runtime Error is to
 ensure that you have regular, tested backups (for databases, this feature is
-included in all our *business* plans) and to setup some
-[redundancy]({% post_url platform/app/scaling/2000-01-01-scaling %}#redundancy).
+included in all our *business* plans) and to set up some
+[redundancy]({% post_url platform/app/scaling/2000-01-01-scaling %}#horizontal-scaling).

 Additionally, we generally advise to have a disaster recovery plan in place.
 This plan should ideally outline the actions to be taken in the event of such a

From 4e86b6da1e3dffe379317bea07c56c57e84f0209 Mon Sep 17 00:00:00 2001
From: Benjamin ACH
Date: Mon, 18 May 2026 18:01:55 +0200
Subject: [PATCH 08/13] docs: add architecture optimization guide

---
 .../2000-01-01-choosing-container-size.md | 13 +-
 ...-01-optimizing-application-architecture.md | 123 ++++++++++++++++++
 .../app/scaling/2000-01-01-scaling.md | 45 +++----
 3 files changed, 148 insertions(+), 33 deletions(-)
 create mode 100644 src/_posts/platform/app/scaling/2000-01-01-optimizing-application-architecture.md

diff --git a/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md b/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md
index 3851ca5d8..9db7bdd5a 100644
--- a/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md
+++ b/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md
@@ -1,7 +1,7 @@
 ---
 title: Choosing a Container Size
 nav: Choosing a Container Size
-modified_at: 2026-05-12 00:00:00
+modified_at: 2026-05-18 00:00:00
 tags: app scaling containers memory metrics
 index: 1
 ---
@@ -24,8 +24,8 @@ limits, and PID limits.

 ## Start Safe, Then Adjust

-When you are unsure about the right size, start with a size that is slightly
-larger than your first estimate. After deployment, use [metrics][metrics],
+When you are unsure about the right size, start with a size that gives your
+application enough headroom. After deployment, use [metrics][metrics],
 [alerts][alerts], and realistic load testing to adjust the size.

 Avoid choosing a smaller size only because the application starts successfully.
@@ -62,10 +62,11 @@ read our [external testing procedures][external-testing].
 {% endnote %}


-## Adjust With Scaling
+## Match Capacity to Traffic

-Use the general [scaling][scaling] guidance to decide whether the application
-needs vertical scaling, horizontal scaling, or a combination of both.
+Once you have chosen the target size for each process type, use +[Scaling Your Application][scaling] to configure the expected capacity for your +traffic and workload. If the application is critical or you are unsure about the safest sizing strategy, contact Scalingo support. diff --git a/src/_posts/platform/app/scaling/2000-01-01-optimizing-application-architecture.md b/src/_posts/platform/app/scaling/2000-01-01-optimizing-application-architecture.md new file mode 100644 index 000000000..a7f1e6823 --- /dev/null +++ b/src/_posts/platform/app/scaling/2000-01-01-optimizing-application-architecture.md @@ -0,0 +1,123 @@ +--- +title: Optimizing Application Architecture +nav: Optimized Architecture +modified_at: 2026-05-18 00:00:00 +tags: app scaling architecture containers memory metrics performance concurrency +index: 5 +--- + +An optimized application architecture makes efficient use of the resources +allocated to each container while keeping the application easy to operate. On +Scalingo, this usually means separating workloads into [process types][procfile], +keeping web requests short, tuning concurrency carefully, and using metrics to +decide when to optimize code, split work, or scale the application. + +This page focuses on how to structure the application workload. To choose a +container size, see [Choosing a Container Size][choosing-container-size]. To +change the number or size of running containers, see +[Scaling Your Application][scaling]. + + +## Design Around Process Types + +Use [process types][procfile] to separate workloads that do not have the same +operational profile. A `web` process should handle HTTP requests and return +responses quickly. Background workers, schedulers, importers, exporters, and +other resource-intensive jobs should run in dedicated process types. 
+ +This separation has several advantages: + +- each workload can have its own number of containers; +- each workload can use a container size adapted to its resource profile; +- long or heavy tasks do not block request handling; +- worker concurrency can be tuned independently from web concurrency; +- incidents are easier to diagnose from metrics and logs. + +Typical process types include: + +- `web` for HTTP traffic; +- `worker` for background jobs; +- `clock` or `scheduler` for recurring jobs; +- dedicated workers for heavy jobs such as PDF generation, image processing, + imports, exports, or batch tasks. + +For long tasks triggered by a user request, return quickly from the web process +and process the work asynchronously. See [Long Running Process][long-process] +for the general pattern. + + +## Tune Concurrency Carefully + +Concurrency lets a process handle more work in parallel, but each additional +thread, worker, or child process usually consumes more memory and may increase +database or external service pressure. + +Tune concurrency separately for each process type: + +- increase web concurrency only if response time and resource usage stay + healthy under realistic traffic; +- reduce worker concurrency if occasional jobs create memory pressure; +- keep enough database connections for the configured concurrency; +- use separate process types for jobs that have different resource profiles, + such as CPU-heavy and memory-heavy jobs. + +Some runtimes expose Scalingo-specific or buildpack-provided defaults and +environment variables. See the language pages for details: +[Ruby][ruby], [Python][python], [PHP][php], [Java][java], [Node.js][nodejs], +and [Go][go]. + + +## Handle Memory-Intensive Workloads + +Memory pressure is often caused by specific workloads rather than by every +request. 
Check whether the application consumes more memory regularly, or only +during occasional tasks such as: + +- background jobs; +- PDF generation; +- image or video processing; +- large imports or exports; +- report generation; +- scheduled batch tasks; +- large in-memory caches or datasets. + +Depending on what you observe, prefer the smallest change that addresses the +actual cause: + +- optimize the code path that allocates too much memory; +- tune runtime-specific memory settings; +- split heavy jobs into a dedicated process type; +- reduce worker or job concurrency; +- split large jobs into smaller chunks; +- isolate workloads that do not have the same resource profile. + +If each container still needs more memory after these changes, continue with +[Choosing a Container Size][choosing-container-size]. + + +## Size and Scale Your App + +Once your application is optimized for its workload: + +- [choose the right size][choosing-container-size] for each process type; +- [scale the application][scaling] to match capacity to traffic; +- read [Application Metrics][metrics] and configure [alerts][alerts] to monitor + the application after changes. + +If the application reaches its memory limit and crashes, see [Runtime +Issues][oom-diagnosis] for diagnosis and recovery guidance. 
+ + +[alerts]: {% post_url platform/app/2000-01-01-alerts %} +[choosing-container-size]: {% post_url platform/app/scaling/2000-01-01-choosing-container-size %} +[go]: {% post_url languages/go/2000-01-01-start %} +[java]: {% post_url languages/java/2000-01-01-start %} +[long-process]: {% post_url platform/app/2000-01-01-long-process %} +[metrics]: {% post_url platform/app/2000-01-01-metrics %} +[nodejs]: {% post_url languages/nodejs/2000-01-01-start %} +[oom-diagnosis]: {% post_url platform/app/troubleshooting/2000-01-01-runtime-issues %}#out-of-memory-crashes +[php]: {% post_url languages/php/2000-01-01-start %} +[procfile]: {% post_url platform/app/2000-01-01-procfile %} +[python]: {% post_url languages/python/2000-01-01-start %} +[ruby]: {% post_url languages/ruby/2000-01-01-start %} +[scaling]: {% post_url platform/app/scaling/2000-01-01-scaling %} diff --git a/src/_posts/platform/app/scaling/2000-01-01-scaling.md b/src/_posts/platform/app/scaling/2000-01-01-scaling.md index 44d09ccfc..41b4fd42c 100644 --- a/src/_posts/platform/app/scaling/2000-01-01-scaling.md +++ b/src/_posts/platform/app/scaling/2000-01-01-scaling.md @@ -1,7 +1,7 @@ --- title: Scaling Your Application nav: Scaling -modified_at: 2026-05-12 00:00:00 +modified_at: 2026-05-18 00:00:00 index: 10 --- @@ -62,13 +62,13 @@ application to traffic fluctuations. 
Here is a quick comparison table, in the context of a Platform as a Service: -| | Vertical Scaling | Horizontal Scaling | -| --------------- | --------------------------------------- | --------------------------------- | -| **Approach** | Enhancing individual instance capacity | Adding more instances | -| **Cost** | Can become expensive at higher limits | Often more cost-efficient | -| **Resilience** | Low (single point of failure) | High (distributed resources) | -| **Flexibility** | Low, limited by physical/virtual constraints | High, limited by the application architecture | -| **When** | Lack or overuse of CPU or RAM (swap increase or decrease) | Increase or decrease in total application traffic | +| | Vertical Scaling | Horizontal Scaling | +|-----------------|----------------------------------------------|---------------------------------------------------| +| **Approach** | Enhancing individual instance capacity | Adding more instances | +| **Cost** | Can become expensive at higher limits | Often more cost-efficient | +| **Resilience** | Low (single point of failure) | High (distributed resources) | +| **Flexibility** | Low, limited by physical/virtual constraints | High, limited by the application architecture | +| **When** | Lack or overuse of resources (CPU or RAM) | Increase or decrease in total application traffic | ### Memory Usage and Scaling Decisions @@ -77,22 +77,16 @@ If your application is approaching its memory limit, check its horizontal scaling to the memory usage pattern. Use vertical scaling when each container needs more memory to run safely. In -that case, choose a larger [container size][container-sizes]. Adding more -containers can help when memory usage increases because traffic increases and -the workload can be distributed across multiple containers. However, horizontal -scaling will not fix an application that individually requires more memory than -the selected container size provides. 
+that case, choose a larger size with +[Choosing a Container Size][choosing-container-size]. Adding more containers +can help when memory usage increases because traffic increases and the workload +can be distributed across multiple containers. However, horizontal scaling will +not fix an application that individually requires more memory than the selected +container size provides. -If memory usage keeps increasing over time without returning to a stable -baseline, investigate a possible memory leak before relying only on scaling. - -Before choosing a smaller container size, validate your application under -realistic load and monitor memory usage over time. We recommend keeping enough -headroom below the memory limit to reduce the risk of -[Out of Memory crashes][oom-diagnosis]. See -[Choosing a Container Size][choosing-container-size] for detailed sizing -guidance, and configure [alerts][alerts] with [app notifiers][notifiers] to be -warned before memory usage becomes critical. +If memory pressure comes from specific jobs or high concurrency, first review +the application structure with +[Optimizing Application Architecture][optimizing-architecture]. 
## Scaling Limits @@ -257,9 +251,6 @@ To learn more about events and notifiers, please visit the page dedicated to [routing-requests]: {% post_url platform/networking/public/2000-01-01-routing %}#requests-distribution [Scalingo Autoscaler]: {% post_url platform/app/scaling/2000-01-01-scalingo-autoscaler %} -[alerts]: {% post_url platform/app/2000-01-01-alerts %} [choosing-container-size]: {% post_url platform/app/scaling/2000-01-01-choosing-container-size %} -[container-sizes]: {% post_url platform/internals/2000-01-01-container-sizes %} [metrics]: {% post_url platform/app/2000-01-01-metrics %} -[notifiers]: {% post_url platform/app/2000-01-01-notifiers %} -[oom-diagnosis]: {% post_url platform/app/troubleshooting/2000-01-01-runtime-issues %}#out-of-memory-crashes +[optimizing-architecture]: {% post_url platform/app/scaling/2000-01-01-optimizing-application-architecture %} From a60b636ae37cdffea4798af2a0666b4c24035a79 Mon Sep 17 00:00:00 2001 From: Benjamin ACH Date: Mon, 18 May 2026 18:19:45 +0200 Subject: [PATCH 09/13] docs: clarify focused process guidance --- .../2000-01-01-optimizing-application-architecture.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/src/_posts/platform/app/scaling/2000-01-01-optimizing-application-architecture.md b/src/_posts/platform/app/scaling/2000-01-01-optimizing-application-architecture.md index a7f1e6823..5f6ad3d8b 100644 --- a/src/_posts/platform/app/scaling/2000-01-01-optimizing-application-architecture.md +++ b/src/_posts/platform/app/scaling/2000-01-01-optimizing-application-architecture.md @@ -25,10 +25,15 @@ operational profile. A `web` process should handle HTTP requests and return responses quickly. Background workers, schedulers, importers, exporters, and other resource-intensive jobs should run in dedicated process types. +Favor focused processes that can be scaled independently, with a clear role +and a predictable resource profile. 
This is usually easier to operate than a +single large process that handles every workload. + This separation has several advantages: - each workload can have its own number of containers; - each workload can use a container size adapted to its resource profile; +- focused containers can start and scale faster; - long or heavy tasks do not block request handling; - worker concurrency can be tuned independently from web concurrency; - incidents are easier to diagnose from metrics and logs. From 599beee977f9e16d525c19ff1dd0e59af9879513 Mon Sep 17 00:00:00 2001 From: Benjamin ACH Date: Thu, 28 May 2026 18:04:09 +0200 Subject: [PATCH 10/13] docs: clarify process type guidance --- ...-01-optimizing-application-architecture.md | 29 +++++++------------ 1 file changed, 11 insertions(+), 18 deletions(-) diff --git a/src/_posts/platform/app/scaling/2000-01-01-optimizing-application-architecture.md b/src/_posts/platform/app/scaling/2000-01-01-optimizing-application-architecture.md index 5f6ad3d8b..dc3fbd578 100644 --- a/src/_posts/platform/app/scaling/2000-01-01-optimizing-application-architecture.md +++ b/src/_posts/platform/app/scaling/2000-01-01-optimizing-application-architecture.md @@ -22,33 +22,25 @@ change the number or size of running containers, see Use [process types][procfile] to separate workloads that do not have the same operational profile. A `web` process should handle HTTP requests and return -responses quickly. Background workers, schedulers, importers, exporters, and -other resource-intensive jobs should run in dedicated process types. +responses quickly, using [background jobs][long-process] for long work. -Favor focused processes that can be scaled independently, with a clear role -and a predictable resource profile. This is usually easier to operate than a -single large process that handles every workload. 
- -This separation has several advantages: - -- each workload can have its own number of containers; -- each workload can use a container size adapted to its resource profile; -- focused containers can start and scale faster; -- long or heavy tasks do not block request handling; -- worker concurrency can be tuned independently from web concurrency; -- incidents are easier to diagnose from metrics and logs. +Run background workers, importers, exporters, and other resource-intensive jobs +in dedicated process types so each workload can use its own container count and +size, making scaling and debugging easier. Typical process types include: - `web` for HTTP traffic; - `worker` for background jobs; -- `clock` or `scheduler` for recurring jobs; +- `clock` or `scheduler` custom process types for long-running, + high-frequency, or precise recurring jobs; - dedicated workers for heavy jobs such as PDF generation, image processing, imports, exports, or batch tasks. -For long tasks triggered by a user request, return quickly from the web process -and process the work asynchronously. See [Long Running Process][long-process] -for the general pattern. +For most recurring jobs, consider the +[Scalingo Scheduler][scalingo-scheduler] with a `cron.json` file as the +built-in option for running scheduled tasks without a continuously running +process. ## Tune Concurrency Carefully @@ -126,3 +118,4 @@ Issues][oom-diagnosis] for diagnosis and recovery guidance. 
[python]: {% post_url languages/python/2000-01-01-start %} [ruby]: {% post_url languages/ruby/2000-01-01-start %} [scaling]: {% post_url platform/app/scaling/2000-01-01-scaling %} +[scalingo-scheduler]: {% post_url platform/app/task-scheduling/2000-01-01-scalingo-scheduler %} From 5e5b6b67e17cfa78b892ef0ec1cf5a35734860d4 Mon Sep 17 00:00:00 2001 From: Benjamin ACH Date: Fri, 29 May 2026 09:41:40 +0200 Subject: [PATCH 11/13] docs: clarify container sizing sections --- .../2000-01-01-choosing-container-size.md | 27 +++++++++---------- 1 file changed, 13 insertions(+), 14 deletions(-) diff --git a/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md b/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md index 9db7bdd5a..0218c638b 100644 --- a/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md +++ b/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md @@ -6,11 +6,10 @@ tags: app scaling containers memory metrics index: 1 --- -Choosing the right container size is a balance between safety, performance, and -cost. Memory is often the resource that most directly constrains this choice -because each running container must stay below its own memory quota. A safe -initial size gives your application enough memory headroom while you collect -real metrics and validate the workload. +Choosing the right container size is a balance between reliability, +performance, and cost. Memory is often the resource that most directly +constrains this choice because each running container must stay below its own +memory quota. For simple applications, the default `M` container size is often a reasonable starting point. Choose a larger size from the beginning when you already know @@ -22,22 +21,19 @@ See the [container sizes][container-sizes] page for the available sizes, memory limits, and PID limits. 
-## Start Safe, Then Adjust +## Pick an Initial Size When you are unsure about the right size, start with a size that gives your -application enough headroom. After deployment, use [metrics][metrics], -[alerts][alerts], and realistic load testing to adjust the size. +application enough headroom to handle traffic peaks, expensive requests, and +occasional jobs. Avoid choosing a smaller size only because the application starts successfully. An application can boot with low memory usage and still consume much more memory under real traffic, scheduled jobs, large requests, or specific user flows. -Before downsizing, validate that the application keeps enough memory headroom -below the limit over time. - -## Validate With Metrics and Load Testing +## Validate the Size Before changing the container size, inspect the application charts in the [Metrics tab][metrics]. Compare memory usage with the memory quota of the @@ -56,13 +52,16 @@ Pay attention to: If production metrics are not enough to validate a size, test the application with realistic load and non-sensitive data. +Before downsizing, validate that the application keeps enough memory headroom +below the limit over time. + {% note %} Before running intensive load tests against an application hosted on Scalingo, read our [external testing procedures][external-testing]. {% endnote %} -## Match Capacity to Traffic +## Adjust the Formation Once you have chosen the target size for each process type, use [Scaling Your Application][scaling] to configure the expected capacity for your @@ -72,7 +71,7 @@ If the application is critical or you are unsure about the safest sizing strategy, contact Scalingo support. 
-## Monitor Resource Usage +## Keep Monitoring Configure [alerts][alerts] for critical metrics, and keep [notifiers][notifiers] configured so the right people receive notifications From c30a2ead8d21ad6744cfae03d651ece7f31e2ee8 Mon Sep 17 00:00:00 2001 From: Benjamin ACH Date: Fri, 29 May 2026 09:50:08 +0200 Subject: [PATCH 12/13] docs: tighten architecture optimization sections --- ...-01-optimizing-application-architecture.md | 31 ++++++++++--------- 1 file changed, 16 insertions(+), 15 deletions(-) diff --git a/src/_posts/platform/app/scaling/2000-01-01-optimizing-application-architecture.md b/src/_posts/platform/app/scaling/2000-01-01-optimizing-application-architecture.md index dc3fbd578..f8dcf1f90 100644 --- a/src/_posts/platform/app/scaling/2000-01-01-optimizing-application-architecture.md +++ b/src/_posts/platform/app/scaling/2000-01-01-optimizing-application-architecture.md @@ -24,18 +24,18 @@ Use [process types][procfile] to separate workloads that do not have the same operational profile. A `web` process should handle HTTP requests and return responses quickly, using [background jobs][long-process] for long work. -Run background workers, importers, exporters, and other resource-intensive jobs -in dedicated process types so each workload can use its own container count and -size, making scaling and debugging easier. +Run background workers, importers, exporters, and scheduled jobs in process +types that match their role. This makes each workload easier to scale, size, +and debug independently. Typical process types include: - `web` for HTTP traffic; - `worker` for background jobs; -- `clock` or `scheduler` custom process types for long-running, - high-frequency, or precise recurring jobs; -- dedicated workers for heavy jobs such as PDF generation, image processing, - imports, exports, or batch tasks. 
+- `clock` or `scheduler` [custom process types][custom-clock-processes] for + long-running, high-frequency, or precise recurring jobs; +- dedicated workers for imports, exports, batch tasks, or other specialized + workloads. For most recurring jobs, consider the [Scalingo Scheduler][scalingo-scheduler] with a `cron.json` file as the @@ -64,11 +64,11 @@ environment variables. See the language pages for details: and [Go][go]. -## Handle Memory-Intensive Workloads +## Isolate Resource-Intensive Workloads -Memory pressure is often caused by specific workloads rather than by every -request. Check whether the application consumes more memory regularly, or only -during occasional tasks such as: +Resource pressure is often caused by specific workloads rather than by every +request. Check whether the application consumes more CPU or memory regularly, +or only during occasional tasks such as: - background jobs; - PDF generation; @@ -81,12 +81,12 @@ during occasional tasks such as: Depending on what you observe, prefer the smallest change that addresses the actual cause: -- optimize the code path that allocates too much memory; -- tune runtime-specific memory settings; -- split heavy jobs into a dedicated process type; +- optimize the code path that consumes too many resources; +- tune runtime-specific memory or concurrency settings; - reduce worker or job concurrency; - split large jobs into smaller chunks; -- isolate workloads that do not have the same resource profile. +- move specialized jobs to their own process type when they do not share the + same resource profile as the rest of the application. If each container still needs more memory after these changes, continue with [Choosing a Container Size][choosing-container-size]. @@ -107,6 +107,7 @@ Issues][oom-diagnosis] for diagnosis and recovery guidance. 
[alerts]: {% post_url platform/app/2000-01-01-alerts %} [choosing-container-size]: {% post_url platform/app/scaling/2000-01-01-choosing-container-size %} +[custom-clock-processes]: {% post_url platform/app/task-scheduling/2000-01-01-custom-clock-processes %} [go]: {% post_url languages/go/2000-01-01-start %} [java]: {% post_url languages/java/2000-01-01-start %} [long-process]: {% post_url platform/app/2000-01-01-long-process %} From 4ea7ef900268f3f355b83faef1118dc755ce3762 Mon Sep 17 00:00:00 2001 From: Benjamin Date: Tue, 2 Jun 2026 14:13:11 +0200 Subject: [PATCH 13/13] Apply suggestions from code review MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: François --- .../2000-01-01-choosing-container-size.md | 84 ++++++++++++------- .../app/scaling/2000-01-01-scaling.md | 2 +- .../scaling/2000-01-01-scalingo-autoscaler.md | 4 +- 3 files changed, 56 insertions(+), 34 deletions(-) diff --git a/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md b/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md index 0218c638b..8e8fc71fa 100644 --- a/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md +++ b/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md @@ -6,10 +6,7 @@ tags: app scaling containers memory metrics index: 1 --- -Choosing the right container size is a balance between reliability, -performance, and cost. Memory is often the resource that most directly -constrains this choice because each running container must stay below its own -memory quota. +Choosing the right container size is an important step when deploying an application on Scalingo. Container size determines the amount of CPU and memory available to your application, which directly affects its performance, stability, and cost. 
Consequently, evaluating your application's resource requirements and monitoring its consumption are crucial to choosing a size that matches your workload, whether you're running a small development environment or a production-critical service. For simple applications, the default `M` container size is often a reasonable starting point. Choose a larger size from the beginning when you already know @@ -31,47 +28,70 @@ Avoid choosing a smaller size only because the application starts successfully. An application can boot with low memory usage and still consume much more memory under real traffic, scheduled jobs, large requests, or specific user flows. +## Picking an Initial Size +For simple applications, the default `M` container size is often a reasonable +starting point. Choose a larger size from the beginning when you already know +that your application has higher memory needs, for example because it uses a +memory-intensive runtime, high concurrency, large in-memory datasets, caches, +background jobs processing large payloads, or unknown production traffic. -## Validate the Size +When you are unsure about the right size, start with a size that gives your +application enough headroom to handle traffic peaks, expensive requests, and +occasional jobs. -Before changing the container size, inspect the application charts in the -[Metrics tab][metrics]. Compare memory usage with the memory quota of the -selected container size, and also review CPU usage and application-level -signals. +Avoid choosing a smaller size only because the application starts successfully. +An application can boot with low memory usage and still consume much more +memory under real traffic, scheduled jobs, large requests, or specific user +flows. -Pay attention to: +See the [container sizes][container-sizes] page for the available sizes, memory +limits, and PID limits. -- CPU usage. -- RAM and swap usage. -- Whether memory usage returns to a stable baseline after traffic peaks. -- Response time. -- 5xx errors. 
-- Restart events. +## Adjusting the Container Formation + +Once you have chosen an initial size for each [process type]({% post_url platform/app/2000-01-01-procfile %}), regularly review application charts in the [Metrics tab][metrics]. + +Resource utilization trends provide valuable insight into whether the current container size is appropriate or whether additional resources are required to maintain performance and stability: + +- **CPU usage**: Sustained high CPU utilization may indicate that the application + is CPU-bound and would benefit from additional CPU resources or more container + instances ([horizontal scaling][h-scaling]). Conversely, consistently low CPU + usage may suggest that the application is overprovisioned and could be + downsized to reduce costs. +- **Memory and swap**: Monitor memory consumption to ensure the application has + sufficient headroom during normal operation and traffic peaks. Frequent use of + swap space is a strong indicator of memory pressure and can significantly + degrade performance, often signaling the need for a [larger container + size][v-scaling]. +- **Application-level signals**: Increasing response times or a growing number of + server-side errors can indicate that the application is approaching its + capacity limits, even when CPU and memory utilization appear healthy, and may + warrant [scaling out][h-scaling] to maintain service quality and a smooth user + experience. +- **Restart events**: Unexpected or recurring container restarts can point to + resource exhaustion, such as out-of-memory (OOM) conditions, application + crashes, or other operational issues. Investigating restart patterns can help + determine whether scaling up resources is necessary to improve application + stability. If production metrics are not enough to validate a size, test the application with realistic load and non-sensitive data. -Before downsizing, validate that the application keeps enough memory headroom -below the limit over time. 
+If the application is critical or you are unsure about the safest sizing +strategy, contact Scalingo support. {% note %} -Before running intensive load tests against an application hosted on Scalingo, -read our [external testing procedures][external-testing]. +- While scaling out and scaling up are generally safe options, we usually advise + taking extra care when scaling in or down. Please ensure that the application + keeps enough memory headroom below the limit over time to avoid any + downtime. +- Before running intensive load tests against an application hosted on Scalingo, + read our [external testing procedures][external-testing]. {% endnote %} -## Adjust the Formation - -Once you have chosen the target size for each process type, use -[Scaling Your Application][scaling] to configure the expected capacity for your -traffic and workload. - -If the application is critical or you are unsure about the safest sizing -strategy, contact Scalingo support. - - -## Keep Monitoring +## Monitoring Configure [alerts][alerts] for critical metrics, and keep [notifiers][notifiers] configured so the right people receive notifications @@ -89,3 +109,5 @@ crash diagnosis and recovery guidance. [notifiers]: {% post_url platform/app/2000-01-01-notifiers %} [oom-diagnosis]: {% post_url platform/app/troubleshooting/2000-01-01-runtime-issues %}#out-of-memory-crashes [scaling]: {% post_url platform/app/scaling/2000-01-01-scaling %} +[h-scaling]: {% post_url platform/app/scaling/2000-01-01-scaling %}#horizontal-scaling +[v-scaling]: {% post_url platform/app/scaling/2000-01-01-scaling %}#vertical-scaling diff --git a/src/_posts/platform/app/scaling/2000-01-01-scaling.md b/src/_posts/platform/app/scaling/2000-01-01-scaling.md index 41b4fd42c..698d9a04d 100644 --- a/src/_posts/platform/app/scaling/2000-01-01-scaling.md +++ b/src/_posts/platform/app/scaling/2000-01-01-scaling.md @@ -63,7 +63,7 @@ application to traffic fluctuations. 
Here is a quick comparison table, in the context of a Platform as a Service: | | Vertical Scaling | Horizontal Scaling | -|-----------------|----------------------------------------------|---------------------------------------------------| +| --------------- | -------------------------------------------- | ------------------------------------------------- | | **Approach** | Enhancing individual instance capacity | Adding more instances | | **Cost** | Can become expensive at higher limits | Often more cost-efficient | | **Resilience** | Low (single point of failure) | High (distributed resources) | diff --git a/src/_posts/platform/app/scaling/2000-01-01-scalingo-autoscaler.md b/src/_posts/platform/app/scaling/2000-01-01-scalingo-autoscaler.md index 7274cd9d2..63c2e06db 100644 --- a/src/_posts/platform/app/scaling/2000-01-01-scalingo-autoscaler.md +++ b/src/_posts/platform/app/scaling/2000-01-01-scalingo-autoscaler.md @@ -325,7 +325,7 @@ options and values before validating. for available values. -## Enabling the Autoscaler +## Enabling an Autoscaler Enabling (or re-enabling) an Autoscaler allows you to put a previously [disabled](#disabling-an-autoscaler) Autoscaler back in action, using the saved configuration. @@ -374,7 +374,7 @@ may decide to either scale-out (i.e. boot up additional containers) or scale-in ## Disabling an Autoscaler Disabling an Autoscaler allows you to put it out of action, while saving its -configuration for later use. It can be [re-enabled](#enabling-the-autoscaler) +configuration for later use. It can be [re-enabled](#enabling-an-autoscaler) anytime. Sometimes it can be useful to temporarily disable an Autoscaler to only rely on