feat: Make 429 passthrough instead return 401#1232
feat: Make 429 passthrough instead return 401#1232sboily wants to merge 2 commits intoory:masterfrom
Conversation
|
@wewelll after you approved the changes, what is the next step to move this forward? |
|
@finsterwalder i think i need to check the new tests required because they failed. But it's not clear to me if maintainers are interested about this PR. |
|
@sboily Thank you so much for you continuous effort to work on this! |
aeneasr
left a comment
There was a problem hiding this comment.
Awesome, thank you for your contribution! This looks pretty good and I have some ideas how to improve it further :)
| return body, nil | ||
| } else if res.StatusCode == http.StatusTooManyRequests { | ||
| // Handle rate limiting - pass through the response | ||
| body, _ := io.ReadAll(res.Body) |
There was a problem hiding this comment.
I wonder if it would be possible to pass through these headers and the body? Also what about the other handlers with upstream code?
There was a problem hiding this comment.
Good point! I checked the codebase and found other places that make upstream calls and would need similar 429 handling. As i said before, i was concentrated on my use case. What is your preference here, do i need to update the PR to support other places? I need to check better but seems i need to update 2 places for authenticators and 3 for authorizers.
About headers/body i can propose a modification if you want but we'll get probably a larger PR. So we have 2 options i think for this PR. 1. merge it as this to start a 429 support, where i understand from your comment it's probably not enough to have a complete full support or 2. maybe having a better specification about what we want to support here and how. I think you the best person to make this choice :).
There was a problem hiding this comment.
For the other places, i did a new commit, let me know if it what you have in your mind for the other handlers.
| body, _ := io.ReadAll(res.Body) | ||
| err := helper.ErrTooManyRequests | ||
| if len(body) > 0 { | ||
| err = err.WithReasonf("%s", string(body)) |
There was a problem hiding this comment.
The body may be some HTML, do we really want it in the error as JSON? It will look super strange
There was a problem hiding this comment.
Hello @aeneasr,
I agree, embedding the raw body isn't ideal, i was just focus on my use case. What's your preference here:
- Detect content-type and handle appropriately (parse JSON responses, skip HTML, include plain text)
or - Remove body passthrough entirely and use a simple hardcoded message
Let me know what you think would work best!
There was a problem hiding this comment.
i think we should remove this body inclusion and move the x- headers to the response body
CI is green but i dont change my code 😅, i have just sync main. |
67f65e6 to
2f47f2a
Compare
2f47f2a to
6b4b1b2
Compare
6b4b1b2 to
66ca2e0
Compare
66ca2e0 to
b44fb61
Compare
|
Hello @aeneasr, any news for this PR, what can i do to help to merge it? Thank you :) |
aeneasr
left a comment
There was a problem hiding this comment.
sorry for the long wait times, i'll now try to help get this over the line
| err = err.WithReasonf("%s", string(body)) | ||
| } | ||
| // Pass through important rate limit headers | ||
| for _, header := range []string{"Retry-After", "X-RateLimit-Limit", "X-RateLimit-Remaining", "X-RateLimit-Reset", "X-RateLimit-Type"} { |
There was a problem hiding this comment.
I think oathkeeper should return these as header
| body, _ := io.ReadAll(res.Body) | ||
| err := helper.ErrTooManyRequests | ||
| if len(body) > 0 { | ||
| err = err.WithReasonf("%s", string(body)) |
There was a problem hiding this comment.
i think we should remove this body inclusion and move the x- headers to the response body
|
During review, I found some shared mem access problems that would lead to side effects and race conditions in this pr. it will take longer to fix this up. |
Fix: Properly propagate 429 Too Many Requests responses instead of converting to 401
This pull request fixes an issue where Oathkeeper was incorrectly converting 429 (Too Many Requests) responses from upstream services into 401 (Unauthorized) errors. This behavior is problematic for API consumers as it conflates rate limiting with authentication failures, leading to incorrect client behavior.
The Problem
When an upstream service returns a 429 status code (rate limit exceeded), Oathkeeper was treating it as an authentication failure and returning 401. This causes several issues:
Retry-Afterand rate limit details are lostChanges Made
Modified error handling in
errors_hydrator.go: Updated theDefaultHydratorto properly recognize and propagate 429 status codes from upstream responses instead of converting them to 401.Preserved upstream response details: The fix ensures that:
Retry-After,X-RateLimit-*) are preservedImpact
This change improves API usability by:
Example
Before this fix:
{ "error": { "code": 401, "status": "Unauthorized", "message": "Access credentials are invalid" } }After this fix:
{ "error": { "code": 429, "status": "Too Many Requests", "reason": "{\"detail\":{\"error\":\"rate_limit_exceeded\",\"message\":\"API rate limit exceeded. You have made 71 requests in the current minute, which exceeds your limit of 70 requests per minute.\",\"limit_type\":\"minute\",\"current_usage\":71,\"limit\":70,\"retry_after_seconds\":5,\"reset_time\":1751085840,\"documentation\":\"https://docs.angany.ai/api/rate-limiting\"}}", "message": "Too many requests" } }No breaking changes are introduced by this patch.
Related issue(s)
This fixes a previously unknown bug where 429 responses were being incorrectly converted to 401 errors. The bug can be reproduced by:
Checklist
Further Comments
This fix addresses a significant usability issue for developers building applications that consume APIs through Oathkeeper. Converting rate limit errors to authentication errors violates the principle of least surprise and makes it impossible for clients to implement proper retry logic.
The fix has been tested in a dev environment with high-throughput rate limiting scenarios (100+ requests/second) and correctly propagates 429 responses without any performance impact.
This change aligns Oathkeeper with HTTP standards (RFC 6585) and common API gateway practices where rate limiting and authentication are treated as distinct concerns with different error codes.
This PR is intent to have discussion about this issue, let me know if it incorrect to address it like this. There is also an issue filled here #1167.