fix: treat CookieJar write methods as potentially throwing#3426
Conversation
Co-authored-by: barjin <61918049+barjin@users.noreply.github.com>
Co-authored-by: barjin <61918049+barjin@users.noreply.github.com>
CookieJar methods as potentially throwing
| "access": "public" | ||
| }, | ||
| "dependencies": { | ||
| "@apify/log": "^2.5.32", |
There was a problem hiding this comment.
The only part I'm unsure about is adding @apify/log as a @crawlee/http-client dependency... but well, we gotta log somehow.
I'm open to ideas here.
There was a problem hiding this comment.
#3399 should resolve this problem, shouldn't it?
There was a problem hiding this comment.
@crawlee/http-client should have no dependencies on other Crawlee packages (definitely not @crawlee/core), as it's used from @crawlee/utils - importing @crawlee/core would create a circular dependency. It's a bit of a predicament :/
| timeout?: number; | ||
| } { | ||
| const proxyUrl = options?.proxyUrl ?? options?.session?.proxyInfo?.url; | ||
| const cookieJar = options?.cookieJar ?? options?.session?.cookieJar ?? new CookieJar(); |
There was a problem hiding this comment.
Seems like it still creates cookie jar if not passed 🤔
then on a line 119 and 127 the jar is used in each request: this.applyCookies(currentRequest, cookieJar) or do I miss smth?
There was a problem hiding this comment.
upd: took a closer look at the implementation, and the solution looks correct to me.
| const cookies = this.cookieJar.getCookiesSync(url); | ||
| return cookies.map((c) => toughCookieToBrowserPoolCookie(c)); | ||
| } catch (e) { | ||
| this.log.warning('Could not get cookies from cookie jar.', { url, error: (e as Error).message }); |
There was a problem hiding this comment.
For an error to be thrown here, the cookie jar would have to accept a malformed cookie string first and then somehow fail at retrieving it. Is there a realistic way this could happen? If not, I'd perhaps re-throw the error in this case.
There was a problem hiding this comment.
Yeah, good point, all the errors I could find in the source are parameter-value related, so probably user's fault.
@copilot revert both getCookies and getCookieString hunks.
There was a problem hiding this comment.
Done in fd55f27. Reverted both getCookies and getCookieString to their original form without try-catch.
| try { | ||
| return this.cookieJar.getCookieStringSync(url, {}); | ||
| } catch (e) { | ||
| this.log.warning('Could not get cookie string.', { url, error: (e as Error).message }); |
There was a problem hiding this comment.
| "access": "public" | ||
| }, | ||
| "dependencies": { | ||
| "@apify/log": "^2.5.32", |
There was a problem hiding this comment.
#3399 should resolve this problem, shouldn't it?
|
|
||
| test('setCookies will log warning (not throw) on invalid cookies', () => { | ||
| const url = 'https://www.example.com'; | ||
| const cookies = [{ name: 'cookie1', value: 'my-cookie', domain: 'abc.different.domain' }]; |
There was a problem hiding this comment.
Would be nice to explain why this cookie is invalid (it's for a different domain than the one that sets it). Also, can we test calling setCookie with a malformed raw cookie string?
There was a problem hiding this comment.
@copilot work on this, add a comment and a new test with a malformed cookie string.
There was a problem hiding this comment.
Done in 36d82f7. Added a comment explaining that abc.different.domain doesn't match the request URL (causing tough-cookie to reject it), and added a new test that spies on setCookieSync to simulate a throw and verifies setCookie handles it gracefully without re-throwing.
Co-authored-by: barjin <61918049+barjin@users.noreply.github.com>
Co-authored-by: barjin <61918049+barjin@users.noreply.github.com>
CookieJar methods as potentially throwingCookieJar write methods as potentially throwing
| vitest.spyOn(session.cookieJar, 'setCookieSync').mockImplementation(() => { | ||
| throw new Error('malformed cookie'); | ||
| }); |
There was a problem hiding this comment.
@copilot is the mock implementation necessary? If the cookie content is in fact malformed, the actual implementation of setCookieSync should detect it, right? The current test only seems to ensure that the method catches exceptions thrown from setCookieSync.
There was a problem hiding this comment.
@barjin tell your agent to get cracking please 😁
There was a problem hiding this comment.
Good call — the mock wasn't needed. A cookie with domain=evil.com set against https://www.example.com causes the real setCookieSync to throw. Updated in b81fca2.
There was a problem hiding this comment.
But it throws because the cookie is for a different domain. Try to make an actually malformed cookie string. @copilot
There was a problem hiding this comment.
@copilot make a test where the cookie string is a garbled mess (not a well-formed cookie string). This shouldn't fail either.
There was a problem hiding this comment.
Updated in 636757c — now uses 'garbled!!!@#$%nonsense' which causes the real tough-cookie parser to throw "Cookie failed to parse".
Co-authored-by: barjin <61918049+barjin@users.noreply.github.com>
Co-authored-by: barjin <61918049+barjin@users.noreply.github.com>
base-http-client.ts: wrapgetCookies()andsetCookie()calls in try-catch withconsole.warnsession.ts: wrapsetCookie()in try-catch withthis.log.warninggetCookiesandgetCookieStringinsession.tssetCookieusing a garbled string that causes real tough-cookie parse failureOriginal prompt
tough-cookie'sCookieJars throw on invalidSet-Cookieresponse contents #3425🔒 GitHub Advanced Security automatically protects Copilot coding agent pull requests. You can protect all pull requests by enabling Advanced Security for your repositories. Learn more about Advanced Security.