Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 12 additions & 0 deletions .changeset/sigv4-path-encoding.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
---
'@tigrisdata/storage': patch
---

Fix `403 SignatureDoesNotMatch` from `copy`, `move`, and `updateObject` when the object key contains `/` or any other character that requires percent-encoding (space, `?`, `=`, etc.) and the request is signed with access-key SigV4.

Two related fixes:

- Custom HTTP client now constructs `SignatureV4` with `uriEscapePath: false`, matching S3's single-encoding canonical-path scheme. The default `true` (AWS-standard double-encoding) caused the signer to re-escape any percent sequence in the path during canonicalization, while Tigris gateway uses S3 single-encoding — producing the mismatch for every key with a special char.
- Object keys in the request path and `X-Amz-Copy-Source` header are URL-encoded per-segment via the new `encodeObjectKey` helper, preserving `/` as a separator so the wire URL stays valid.

OAuth/session-token callers were unaffected because that auth path skips SigV4 signing entirely.
84 changes: 84 additions & 0 deletions AGENTS.md
Original file line number Diff line number Diff line change
Expand Up @@ -108,3 +108,87 @@ Root-level scripts:
- The project uses Husky for Git hooks (commitlint via `commit-msg`, Biome `check` via `pre-commit`).
- Commitizen is configured for Conventional Commits (`pnpm commit`).
- Biome handles linting and formatting (replaced ESLint and Prettier).

## Known pitfalls

### SigV4 path encoding for object keys

Scope: this only applies to code that talks to the **custom HTTP
client** in `shared/http-client.ts` (used by `copy`, `move`,
`updateObject`, and the `setBucket*` family). Anything routed through
the AWS SDK (`ListObjectsV2Command`, `PutObjectCommand`,
`HeadObjectCommand`, `GetObjectCommand`, `RemoveObjectCommand`, etc.)
handles its own SigV4 correctly — don't touch encoding there.

**Rule for the custom client**: whenever you build a **request path**
that embeds an object key, do **not** encode the whole key with
`encodeURIComponent`. Use `encodeObjectKey` from `@shared/utils`
instead.

Why the path specifically: there are two encoding traps that compound.

1. **Signer setting**: `@smithy/signature-v4`'s `SignatureV4`
constructor defaults `uriEscapePath: true` — the AWS-standard
double-encoding scheme. S3 uses single-encoding, so our custom HTTP
client passes `uriEscapePath: false`. If you ever build a second
signer instance, set the same flag. With the default, the signer
re-percent-encodes any sequence in the path during canonicalization
(`%20` → `%2520`), and Tigris gateway — which uses S3 single-encoding
— diverges from the client canonical and returns
`403 SignatureDoesNotMatch` for every key with a special char.
2. **Pre-encoding**: even with `uriEscapePath: false`, the wire URL
must still be a valid URL — keys with `/`, spaces, `?`, etc. need
percent-encoding when serialized. Use `encodeObjectKey` (single
encode, per-segment, preserves `/`) so the on-wire path is valid
and matches what the signer signs.

**Other surfaces — not affected by the double-encode bug but worth
knowing**:

- **Query strings**: the HTTP client extracts query params via
`URL.searchParams.forEach`, which returns the **decoded** value
before handing it to the signer. The signer encodes once for the
canonical query string, the server canonicalizes the same way —
single encoding round-trip. Writing `?prefix=${encodeURIComponent(p)}`
is safe; so is writing the value raw and letting `URL` percent-encode
it.
- **Header values**: SigV4's canonical-header construction does not
URL-escape header values (only trims and collapses whitespace), so
there's no double-encode mismatch. However, S3's
`X-Amz-Copy-Source` spec **requires** the value be URL-encoded; the
server URL-decodes it before treating it as a key. Use
`encodeObjectKey` there too — it gives you the spec-correct encoding
for free (special chars escaped, `/` separators preserved).

Symptoms that point at this bug:

- `403 Forbidden` with body `<Code>SignatureDoesNotMatch</Code>` for
requests that include a nested object key (key contains `/`).
- Works fine in tests that mock the network or run with OAuth/session
tokens (those skip the SigV4 branch entirely — see
`shared/http-client.ts`), then fails the first time a real consumer
exercises the call with access keys.
- A previously-fine flat key like `file.txt` succeeds, but
`folder/file.txt` 403s.

Mitigation for new code:

```ts
import { encodeObjectKey } from '@shared/utils';

path: `/${bucket}/${encodeObjectKey(key)}?x-id=...`,
headers: {
[TigrisHeaders.COPY_SOURCE]: `${srcBucket}/${encodeObjectKey(src)}`,
}
```

When adding tests for any function that hits the custom HTTP client,
exercise at least these two key shapes against the real gateway with
access-key auth:

- A key with `/` (e.g. `folder/file.txt`) — catches the per-segment
encoding regression.
- A key with a character that requires percent-encoding (space, `?`,
`=`, `&`) — catches the `uriEscapePath` regression. Unit tests on
`encodeObjectKey` alone aren't enough; the bug only shows up when
the signer's canonicalization pass runs end-to-end.
6 changes: 3 additions & 3 deletions packages/storage/src/lib/object/copy.ts
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
import { TigrisHeaders } from '@shared/headers';
import { handleError } from '@shared/utils';
import { encodeObjectKey, handleError } from '@shared/utils';
import { config, missingConfigError } from '../config';
import { createStorageClient } from '../http-client';
import type { TigrisStorageConfig, TigrisStorageResponse } from '../types';
Expand Down Expand Up @@ -56,7 +56,7 @@ export async function copyOrMove(
}

const headers: Record<string, string> = {
[TigrisHeaders.COPY_SOURCE]: `${srcBucket}/${encodeURIComponent(src)}`,
[TigrisHeaders.COPY_SOURCE]: `${srcBucket}/${encodeObjectKey(src)}`,
};

if (rename) {
Expand All @@ -66,7 +66,7 @@ export async function copyOrMove(
try {
const response = await storageHttpClient.request({
method: 'PUT',
path: `/${destBucket}/${encodeURIComponent(dest)}?x-id=CopyObject`,
path: `/${destBucket}/${encodeObjectKey(dest)}?x-id=CopyObject`,
headers,
});

Expand Down
6 changes: 3 additions & 3 deletions packages/storage/src/lib/object/update.ts
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
import { PutObjectAclCommand } from '@aws-sdk/client-s3';
import { TigrisHeaders } from '@shared/headers';
import { handleError } from '@shared/utils';
import { encodeObjectKey, handleError } from '@shared/utils';
import { config, missingConfigError } from '../config';
import { createStorageClient } from '../http-client';
import { createTigrisClient } from '../tigris-client';
Expand Down Expand Up @@ -50,9 +50,9 @@ export async function updateObject(
try {
const response = await storageHttpClient.request({
method: 'PUT',
path: `/${bucket}/${encodeURIComponent(key)}?x-id=CopyObject`,
path: `/${bucket}/${encodeObjectKey(key)}?x-id=CopyObject`,
headers: {
[TigrisHeaders.COPY_SOURCE]: `${bucket}/${encodeURIComponent(path)}`,
[TigrisHeaders.COPY_SOURCE]: `${bucket}/${encodeObjectKey(path)}`,
[TigrisHeaders.RENAME]: 'true',
},
});
Expand Down
5 changes: 5 additions & 0 deletions shared/http-client.ts
Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,11 @@ async function generateSignatureHeaders(
region: 'auto',
service: 's3',
sha256: Sha256,
// S3 uses single-encoding for the canonical path; the default
// (`true`) is the AWS-standard double-encoding scheme and produces
// `SignatureDoesNotMatch` whenever a path segment contains
// characters that need percent-encoding (space, `?`, `=`, etc.).
uriEscapePath: false,
});

const query: Record<string, string> = {};
Expand Down
31 changes: 30 additions & 1 deletion shared/utils.test.ts
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
import { describe, expect, it } from 'vitest';
import { executeWithConcurrency } from './utils';
import { encodeObjectKey, executeWithConcurrency } from './utils';

describe('executeWithConcurrency', () => {
it('should execute all tasks and return results in order', async () => {
Expand Down Expand Up @@ -136,3 +136,32 @@ describe('executeWithConcurrency', () => {
expect(executionOrder).toEqual([1, 2, 3]);
});
});

describe('encodeObjectKey', () => {
it('encodes a flat key with no slashes', () => {
expect(encodeObjectKey('file.txt')).toBe('file.txt');
});

it('preserves slashes as path separators', () => {
// Regression: a plain `encodeURIComponent` turns `/` into `%2F`,
// which the SigV4 signer then double-encodes to `%252F` during
// canonical-request construction, producing SignatureDoesNotMatch.
expect(encodeObjectKey('folder/file.txt')).toBe('folder/file.txt');
expect(encodeObjectKey('a/b/c/d.txt')).toBe('a/b/c/d.txt');
});

it('encodes special characters within segments', () => {
expect(encodeObjectKey('folder/my file.txt')).toBe('folder/my%20file.txt');
expect(encodeObjectKey('folder/?weird=name.txt')).toBe(
'folder/%3Fweird%3Dname.txt'
);
});

it('preserves trailing slash (folder markers)', () => {
expect(encodeObjectKey('folder/')).toBe('folder/');
});

it('handles empty input', () => {
expect(encodeObjectKey('')).toBe('');
});
});
17 changes: 17 additions & 0 deletions shared/utils.ts
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,23 @@ export async function executeWithConcurrency<T>(
return results;
}

/**
* Encode an object key for use in an S3-style request path or
* `X-Amz-Copy-Source` header. Encodes each path segment with
* `encodeURIComponent` and rejoins with `/`, so the slash separators
* survive verbatim.
*
* Why this matters: when the request is signed with SigV4 (access-key
* auth path), the signer URI-escapes the path again during canonical
* request construction. If the key was already escaped with a plain
* `encodeURIComponent`, the `/` becomes `%2F` and then `%252F` —
* mismatch against the server's canonical (which decodes `%2F`
* back to `/` first). Result: `SignatureDoesNotMatch`.
*/
export function encodeObjectKey(key: string): string {
return key.split('/').map(encodeURIComponent).join('/');
}
Comment thread
designcode marked this conversation as resolved.
Comment thread
designcode marked this conversation as resolved.

export function toError(value: unknown): Error {
if (value instanceof Error) return value;
if (typeof value === 'string') return new Error(value);
Expand Down
Loading