Scalingo · benjaminach · May 12, 2026
diff --git a/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md b/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md
@@ -0,0 +1,113 @@
+---
+title: Choosing a Container Size
+nav: Choosing a Container Size
+modified_at: 2026-05-18 00:00:00
+tags: app scaling containers memory metrics
+index: 1
+---
+
+Choosing the right container size is an important step when deploying an application on Scalingo. Container size determines the amount of CPU and memory available to your application, which directly affects its performance, stability, and cost. Evaluating your application's resource requirements and monitoring its consumption are therefore essential to matching the size to your workload, whether you're running a small development environment or a production-critical service.
+
+## Picking an Initial Size
+
+For simple applications, the default `M` container size is often a reasonable
+starting point. Choose a larger size from the beginning when you already know
+that your application has higher memory needs, for example because it uses a
+memory-intensive runtime, high concurrency, large in-memory datasets, caches,
+background jobs processing large payloads, or unknown production traffic.
+
+When you are unsure about the right size, start with a size that gives your
+application enough headroom to handle traffic peaks, expensive requests, and
+occasional jobs.
+
+Avoid choosing a smaller size only because the application starts successfully.
+An application can boot with low memory usage and still consume much more
+memory under real traffic, scheduled jobs, large requests, or specific user
+flows.
+
+See the [container sizes][container-sizes] page for the available sizes, memory
+limits, and PID limits.
+
+## Adjusting the Container Formation
+
+Once you have chosen an initial size for each [process type]({% post_url platform/app/2000-01-01-procfile %}), regularly review the application charts in the [Metrics tab][metrics].
+
+Resource utilization trends provide valuable insight into whether the current container size is appropriate or if additional resources are required to maintain performance and stability:
+
+- **CPU usage**: Sustained high CPU utilization may indicate that the application
+  is CPU-bound and would benefit from additional CPU resources or more container
+  instances ([horizontal scaling][h-scaling]). Conversely, consistently low CPU
+  usage may suggest that the application is overprovisioned and could be
+  downsized to reduce costs.
+- **Memory and swap**: Monitor memory consumption to ensure the application has
+  sufficient headroom during normal operation and traffic peaks. Frequent use of
+  swap space is a strong indicator of memory pressure and can significantly
+  degrade performance, often signaling the need for a [larger container
+  size][v-scaling].
+- **Application-level signals**: Increasing response times or a growing number of
+  server-side errors can indicate that the application is approaching its
+  capacity limits, even when CPU and memory utilization appear healthy. These
+  signals may warrant [scaling out][h-scaling] to maintain service quality and a
+  smooth user experience.
+- **Restart events**: Unexpected or recurring container restarts can point to
+  resource exhaustion (for example out-of-memory, or OOM, conditions), to
+  application crashes, or to other operational issues. Investigating restart
+  patterns can help determine whether scaling up resources is necessary to
+  improve application stability.
+
+If production metrics are not enough to validate a size, test the application
+with realistic load and non-sensitive data.
+
+If the application is critical or you are unsure about the safest sizing
+strategy, contact Scalingo support.
+
+{% note %}
+- While scaling out and scaling up are generally safe options, we advise taking
+  extra care when scaling in or down. Make sure the application keeps enough
+  memory headroom below the limit over time to avoid any unavailability.
+- Before running intensive load tests against an application hosted on Scalingo,
+  read our [external testing procedures][external-testing].
+{% endnote %}
+
+
+## Monitoring
+
+Configure [alerts][alerts] for critical metrics, and keep
+[notifiers][notifiers] configured so the right people receive notifications
+before resource usage becomes critical.
+
+If the application consumes all its available memory, it can be terminated by
+the system. See the [Runtime Issues][oom-diagnosis] page for Out of Memory
+crash diagnosis and recovery guidance.
+
+
+[alerts]: {% post_url platform/app/2000-01-01-alerts %}
+[container-sizes]: {% post_url platform/internals/2000-01-01-container-sizes %}
+[external-testing]: {% post_url security/procedures/2000-01-01-external-testing %}#can-i-run-a-load-test-on-my-application-that-is-running-on-scalingo
+[metrics]: {% post_url platform/app/2000-01-01-metrics %}
+[notifiers]: {% post_url platform/app/2000-01-01-notifiers %}
+[oom-diagnosis]: {% post_url platform/app/troubleshooting/2000-01-01-runtime-issues %}#out-of-memory-crashes
+[scaling]: {% post_url platform/app/scaling/2000-01-01-scaling %}
+[h-scaling]: {% post_url platform/app/scaling/2000-01-01-scaling %}#horizontal-scaling
+[v-scaling]: {% post_url platform/app/scaling/2000-01-01-scaling %}#vertical-scaling
diff --git a/src/_posts/platform/app/scaling/2000-01-01-optimizing-application-architecture.md b/src/_posts/platform/app/scaling/2000-01-01-optimizing-application-architecture.md
@@ -0,0 +1,122 @@
+---
+title: Optimizing Application Architecture
+nav: Optimized Architecture
+modified_at: 2026-05-18 00:00:00
+tags: app scaling architecture containers memory metrics performance concurrency
+index: 5
+---
+
+An optimized application architecture makes efficient use of the resources
+allocated to each container while keeping the application easy to operate. On
+Scalingo, this usually means separating workloads into [process types][procfile],
+keeping web requests short, tuning concurrency carefully, and using metrics to
+decide when to optimize code, split work, or scale the application.
+
+This page focuses on how to structure the application workload. To choose a
+container size, see [Choosing a Container Size][choosing-container-size]. To
+change the number or size of running containers, see
+[Scaling Your Application][scaling].
+
+
+## Design Around Process Types
+
+Use [process types][procfile] to separate workloads that do not have the same
+operational profile. A `web` process should handle HTTP requests and return
+responses quickly, delegating long-running work to [background jobs][long-process].
+
+Run background workers, importers, exporters, and scheduled jobs in process
+types that match their role. This makes each workload easier to scale, size,
+and debug independently.
+
+Typical process types include:
+
+- `web` for HTTP traffic;
+- `worker` for background jobs;
+- `clock` or `scheduler` [custom process types][custom-clock-processes] for
+  long-running, high-frequency, or precise recurring jobs;
+- dedicated workers for imports, exports, batch tasks, or other specialized
+  workloads.
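+
+As a minimal sketch of the list above, a `Procfile` that declares one process
+type per workload might look like this. The commands are hypothetical and
+assume a Ruby application using Puma, Sidekiq, and a `clock.rb` script;
+substitute the commands your own application needs. The leading comment line
+only flags that assumption and can be dropped.
+
+```yaml
+# Hypothetical commands, for illustration only.
+web: bundle exec puma -C config/puma.rb
+worker: bundle exec sidekiq
+clock: bundle exec ruby clock.rb
+```
+
+Each line then declares a process type that can be sized and scaled on its own.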
+
+For most recurring jobs, consider the
+[Scalingo Scheduler][scalingo-scheduler] with a `cron.json` file as the
+built-in option for running scheduled tasks without a continuously running
+process.
+
+
+## Tune Concurrency Carefully
+
+Concurrency lets a process handle more work in parallel, but each additional
+thread, worker, or child process usually consumes more memory and may increase
+database or external service pressure.
+
+Tune concurrency separately for each process type:
+
+- increase web concurrency only if response time and resource usage stay
+  healthy under realistic traffic;
+- reduce worker concurrency if occasional jobs create memory pressure;
+- keep enough database connections for the configured concurrency;
+- use separate process types for jobs that have different resource profiles,
+  such as CPU-heavy and memory-heavy jobs.
+
+Some runtimes expose Scalingo-specific or buildpack-provided defaults and
+environment variables. See the language pages for details:
+[Ruby][ruby], [Python][python], [PHP][php], [Java][java], [Node.js][nodejs],
+and [Go][go].
+
+
+## Isolate Resource-Intensive Workloads
+
+Resource pressure is often caused by specific workloads rather than by every
+request. Check whether the application consumes more CPU or memory regularly,
+or only during occasional tasks such as:
+
+- background jobs;
+- PDF generation;
+- image or video processing;
+- large imports or exports;
+- report generation;
+- scheduled batch tasks;
+- large in-memory caches or datasets.
+
+Depending on what you observe, prefer the smallest change that addresses the
+actual cause:
+
+- optimize the code path that consumes too many resources;
+- tune runtime-specific memory or concurrency settings;
+- reduce worker or job concurrency;
+- split large jobs into smaller chunks;
+- move specialized jobs to their own process type when they do not share the
+  same resource profile as the rest of the application.
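+
+To illustrate the last option, here is a sketch that assumes a hypothetical
+Sidekiq-based application whose export jobs use far more memory than its other
+jobs. The `exports` process type name, the queue names, and the concurrency
+values are illustrative assumptions, not recommendations.
+
+```yaml
+# Hypothetical queues and concurrency values, for illustration only.
+worker: bundle exec sidekiq -q default -c 10
+exports: bundle exec sidekiq -q exports -c 2
+```
+
+With this split, the `exports` process type can run on a larger container size
+and with lower concurrency, while `worker` keeps its current size.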
+
+If each container still needs more memory after these changes, continue with
+[Choosing a Container Size][choosing-container-size].
+
+
+## Size and Scale Your App
+
+Once your application is optimized for its workload:
+
+- [choose the right size][choosing-container-size] for each process type;
+- [scale the application][scaling] to match capacity to traffic;
+- read [Application Metrics][metrics] and configure [alerts][alerts] to monitor
+  the application after changes.
+
+If the application reaches its memory limit and crashes, see [Runtime
+Issues][oom-diagnosis] for diagnosis and recovery guidance.
+
+
+[alerts]: {% post_url platform/app/2000-01-01-alerts %}
+[choosing-container-size]: {% post_url platform/app/scaling/2000-01-01-choosing-container-size %}
+[custom-clock-processes]: {% post_url platform/app/task-scheduling/2000-01-01-custom-clock-processes %}
+[go]: {% post_url languages/go/2000-01-01-start %}
+[java]: {% post_url languages/java/2000-01-01-start %}
+[long-process]: {% post_url platform/app/2000-01-01-long-process %}
+[metrics]: {% post_url platform/app/2000-01-01-metrics %}
+[nodejs]: {% post_url languages/nodejs/2000-01-01-start %}
+[oom-diagnosis]: {% post_url platform/app/troubleshooting/2000-01-01-runtime-issues %}#out-of-memory-crashes
+[php]: {% post_url languages/php/2000-01-01-start %}
+[procfile]: {% post_url platform/app/2000-01-01-procfile %}
+[python]: {% post_url languages/python/2000-01-01-start %}
+[ruby]: {% post_url languages/ruby/2000-01-01-start %}
+[scaling]: {% post_url platform/app/scaling/2000-01-01-scaling %}
+[scalingo-scheduler]: {% post_url platform/app/task-scheduling/2000-01-01-scalingo-scheduler %}
diff --git a/src/_posts/platform/app/scaling/2000-01-01-scaling.md b/src/_posts/platform/app/scaling/2000-01-01-scaling.md
@@ -1,7 +1,7 @@
 ---
 title: Scaling Your Application
 nav: Scaling
-modified_at: 2026-01-02 12:00:00
+modified_at: 2026-05-18 00:00:00
 index: 10
 ---
 
@@ -62,22 +62,40 @@ application to traffic fluctuations.
 
 Here is a quick comparison table, in the context of a Platform as a Service:
 
-|                 | Vertical Scaling                        | Horizontal Scaling                |
-| --------------- | --------------------------------------- | --------------------------------- |
-| **Approach**    | Enhancing individual instance capacity  | Adding more instances             |
-| **Cost**        | Can become expensive at higher limits   | Often more cost-efficient         |
-| **Resilience**  | Low (single point of failure)           | High (distributed resources)      |
-| **Flexibility** | Low, limited by physical/virtual constraints | High, limited by the application architecture |
-| **When**        | Lack or overuse of CPU or RAM (swap increase or decrease) | Increase or decrease in total application traffic |
+|                 | Vertical Scaling                             | Horizontal Scaling                                |
+| --------------- | -------------------------------------------- | ------------------------------------------------- |
+| **Approach**    | Enhancing individual instance capacity       | Adding more instances                             |
+| **Cost**        | Can become expensive at higher limits        | Often more cost-efficient                         |
+| **Resilience**  | Low (single point of failure)                | High (distributed resources)                      |
+| **Flexibility** | Low, limited by physical/virtual constraints | High, limited by the application architecture     |
+| **When**        | Lack or overuse of resources (CPU or RAM)    | Increase or decrease in total application traffic |
 
+### Memory Usage and Scaling Decisions
 
-## Limitations
+If your application is approaching its memory limit, check its
+[memory metrics][metrics] and apply the same distinction between vertical and
+horizontal scaling to the memory usage pattern.
 
-- Vertical scaling is limited by the platform. The biggest container we can
-  currently boot is the `2XL` container, with 4GB of RAM. For a comprehensive
-  list of container sizes and corresponding specifications, please see our
+Use vertical scaling when each container needs more memory to run safely. In
+that case, choose a larger size by following
+[Choosing a Container Size][choosing-container-size]. Adding more containers
+can help when memory usage increases because traffic increases and the workload
+can be distributed across multiple containers. However, horizontal scaling will
+not fix an application in which a single container requires more memory than
+the selected container size provides.
+
+If memory pressure comes from specific jobs or high concurrency, first review
+the application structure with
+[Optimizing Application Architecture][optimizing-architecture].
+
+
+## Scaling Limits
+
+- Vertical scaling currently goes up to the `2XL` container size, with 4 GB of
+  RAM. For a comprehensive list of container sizes and corresponding
+  specifications, please see our
   [dedicated documentation page]({% post_url platform/internals/2000-01-01-container-sizes %}).
-- Horizontal scaling is limited by default to a maximum of 10 containers per
+- Horizontal scaling is available by default up to 10 containers per
   [process type]({% post_url platform/app/2000-01-01-procfile %}). This limit
   can be increased via our support team.
 
@@ -233,3 +251,6 @@ To learn more about events and notifiers, please visit the page dedicated to
 
 [routing-requests]: {% post_url platform/networking/public/2000-01-01-routing %}#requests-distribution
 [Scalingo Autoscaler]: {% post_url platform/app/scaling/2000-01-01-scalingo-autoscaler %}
+[choosing-container-size]: {% post_url platform/app/scaling/2000-01-01-choosing-container-size %}
+[metrics]: {% post_url platform/app/2000-01-01-metrics %}
+[optimizing-architecture]: {% post_url platform/app/scaling/2000-01-01-optimizing-application-architecture %}