Also Fix AWS EFS AP counter to use 10,000 limit
Also add retry logic to the aws-efs AWS API calls.
knolleary left a comment
I'm nervous about the aws-efs logic that now has to query up to 9999 (rather than up to 999) endpoints just to find the one with the most space. Can we use a different heuristic that finds one with 'enough' space so we don't have to poll the whole list every time an instance is created? Or do some caching to remember the top ten with most space and only rescan if necessary?
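One way to read the "enough space" suggestion is a first-fit scan that stops at the first candidate with spare capacity rather than ranking all of them. The sketch below is only an illustration: `pickFileSystem`, `countAccessPoints`, and the `HEADROOM` margin are hypothetical names and values, and it assumes the candidates are EFS file systems that each allow up to 10,000 access points.

```javascript
// Hypothetical first-fit selection; none of these names come from the PR.
const ACCESS_POINT_LIMIT = 10000 // per-file-system limit referenced in this PR
const HEADROOM = 100 // assumed safety margin, chosen arbitrarily for the sketch

// fileSystemIds: array of candidate ids
// countAccessPoints: injected async function (id) => number of access points in use
async function pickFileSystem (fileSystemIds, countAccessPoints) {
  for (const id of fileSystemIds) {
    const used = await countAccessPoints(id)
    // Stop at the first candidate with enough room, so later ones are never polled
    if (used <= ACCESS_POINT_LIMIT - HEADROOM) {
      return id
    }
  }
  return null // nothing has enough room; the caller decides what to do next
}
```

The trade-off is that file systems fill in list order instead of evenly, which may or may not matter for this deployment.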
Pull request overview
This PR updates the Kubernetes driver and AWS EFS integration to improve retry behavior and avoid unnecessary PVC recreation, aligning error handling with `err.code` rather than `err.response.statusCode`.
Changes:
- Add `async-retry` dependency and use it to retry AWS EFS `Describe*` calls on throttling.
- Update PVC creation flow to only create a PVC when it does not already exist.
- Adjust Kubernetes API retry logic to check `err.code` for 429 responses.
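The retry-on-throttling behaviour in the list above can be sketched without any dependency. This is a hand-rolled stand-in, not the `async-retry` usage from the PR: `withRetry` and `isThrottled` are hypothetical helpers, and treating an error named `ThrottlingException` as throttling is an assumption about how the AWS SDK reports it.

```javascript
// Hypothetical helpers; the PR itself uses the async-retry package instead.
function isThrottled (err) {
  // err.code === 429 mirrors the check described in this PR for Kubernetes calls.
  // The 'ThrottlingException' name is an assumption for AWS SDK errors.
  return err?.code === 429 || err?.name === 'ThrottlingException'
}

async function withRetry (fn, { retries = 3, baseDelayMs = 100 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn(attempt)
    } catch (err) {
      // Give up immediately on non-throttling errors or once retries are spent
      if (!isThrottled(err) || attempt >= retries) throw err
      // Exponential backoff: baseDelayMs, then 2x, then 4x, and so on
      await new Promise(resolve => setTimeout(resolve, baseDelayMs * 2 ** attempt))
    }
  }
}
```

Rethrowing anything that is not throttling keeps real failures (bad credentials, missing resources) from being hidden behind repeated attempts.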
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| package.json | Adds async-retry dependency to support new retry behavior. |
| package-lock.json | Locks async-retry and its transitive dependency retry. |
| lib/aws-efs.js | Wraps EFS DescribeFileSystems / DescribeAccessPoints in retry logic for throttling cases. |
| kubernetes.js | Skips PVC creation if it already exists; changes retry condition to use err.code. |
Tested locally: with `readNamespacedPersistentVolumeClaim` now wrapped, the error is logged and the PVC is created correctly.
Description
Ensure `err.code` is checked, not `err.response.statusCode`.

It also ensures that the Persistent Storage PVC is only created if it doesn't exist (e.g. don't try and recreate for a suspended instance).
When the PVC is created, this PR also adds retry logic to the AWS EFS calls made when EFS Access Points are used to store multiple instance storage directories on a single EFS (only really used for FFC).
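The create-only-if-missing flow described above might look roughly like the following. It is a sketch under assumptions: `ensurePvc` is a hypothetical helper, the `api` object is injected, and the object-style call arguments (`{ name, namespace }`, `{ namespace, body }`) are assumed rather than taken from the PR. The point it illustrates is that a missing PVC is detected through `err.code`, and any other error is rethrown.

```javascript
// Hypothetical helper; the argument shapes passed to the api methods are assumptions.
async function ensurePvc (api, namespace, pvc) {
  try {
    await api.readNamespacedPersistentVolumeClaim({ name: pvc.metadata.name, namespace })
    return false // PVC already exists (e.g. a suspended instance), so do not recreate it
  } catch (err) {
    // Only "not found" means the PVC should be created; check err.code, not err.response.statusCode
    if (err.code !== 404) throw err
  }
  await api.createNamespacedPersistentVolumeClaim({ namespace, body: pvc })
  return true // PVC was created by this call
}
```

Returning a boolean lets the caller run the EFS access point setup only when a new PVC was actually created.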
Related Issue(s)
Checklist
flowforge.yml?
FlowFuse/helm to update ConfigMap Template
FlowFuse/CloudProject to update values for Staging/Production
Labels
area:migration label