diff --git a/.markdownlint.json b/.markdownlint.json index 192f4edc3..d9283c1ba 100644 --- a/.markdownlint.json +++ b/.markdownlint.json @@ -1,4 +1,5 @@ { + "fenced-code-language": false, "single-h1": false, "no-bare-urls": false, "no-emphasis-as-header": false, diff --git a/IPIP/0000-http-reframe-cbor.md b/IPIP/0000-http-reframe-cbor.md new file mode 100644 index 000000000..552034a8f --- /dev/null +++ b/IPIP/0000-http-reframe-cbor.md @@ -0,0 +1,85 @@ +# IPIP 0000: Add DAG-CBOR support to Reframe over HTTP + + + +- Start Date: 2022-09-29 +- Related Issues: + - https://github.com/ipld/edelweiss/issues/16#issuecomment-1074161577 + - https://github.com/ipfs/kubo/issues/8823 + +## Summary + + +This IPIP adds DAG-CBOR support to Reframe over HTTP. + +## Motivation + +We've been using Reframe in Kubo for a while and it is clear that Reframe +messages are not designed to be created or read by humans. + +The plaintext DAG-JSON representation of messages does not really bring +anything to the table (because both CIDs and Multiaddrs are in a format that +needs manual encoding/decoding anyway), and the utility is limited to debugging +and use in examples. + +We've also identified some HTTP caching and scaling issues due to all methods +sharing the same URL path and the way `Etag` header is generated, and how +it made streaming responses impossible. + +## Detailed design + +We already support DAG-JSON, with its own content type. +The change here is to add support for requests and responses sent as DAG-CBOR, +with own content type: `application/vnd.ipfs.rpc+dag-cbor`. + +We change the URL to include method name on the path. This allows deployments +to scale better: set different HTTP cache control policies, or route different +methods to different backend services. + +For details, see changes made to `reframe/REFRAME_HTTP_TRANSPORT.md`. 
+ +## Test fixtures + +TODO: add CIDs of sample DAG-CBOR messages after https://github.com/ipfs/go-delegated-routing implements it and has its own tests. + +## Design rationale + +The IPFS stack aims to support both DAG-CBOR and DAG-JSON. Users can store JSON as +CBOR and vice versa. Having consistent support for both in Reframe not only +aligns with user expectations, but also allows us to save some bytes +(bandwidth, response caching requirements) by using binary CBOR as the +production format. + +### User benefit + +Users will be able to choose between binary and human-readable representations, +just like they do in other parts of the IPFS/IPLD stack. + +DAG-JSON is the implicit default, improving ergonomics when debugging with `curl` +in the CLI or fetching a response via a regular web browser. + +### Compatibility + +The IPFS / IPLD stack already includes both DAG-CBOR and DAG-JSON libraries. +The `version` parameter of the HTTP wire protocol is bumped to `2`. + +Reframe endpoints that care about backward-compatibility with Kubo 0.16 +can keep support for requests sent with `version=1`. + +### Security + +N/A, we will use the same DAG-CBOR encoder/decoder as the rest of the stack. + +### Alternatives + +The alternative is to do nothing, and end up with: + +- inconsistent user experience +- wasted bandwidth and cache storage +- difficult deployment and scaling (all methods under the same endpoint) + +### Copyright + +Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). diff --git a/reframe/REFRAME_HTTP_TRANSPORT.md b/reframe/REFRAME_HTTP_TRANSPORT.md index c5a85ed78..dcad275ac 100644 --- a/reframe/REFRAME_HTTP_TRANSPORT.md +++ b/reframe/REFRAME_HTTP_TRANSPORT.md @@ -16,38 +16,81 @@ The Reframe over HTTP protocol is defining the transport and message serialization mechanisms for sending Reframe messages over HTTP `POST` and `GET`, and provides guidance for implementers around HTTP caching. 
-# Organization of this document - -- [HTTP Transport Design](#http-transport-design) - - [HTTP Caching Considerations](#http-caching-considerations) - - [POST vs GET](#post-vs-get) - - [Avoiding sending the same response messages twice](#avoiding-sending-the-same-response-messages-twice) - - [Client controls for time-based caching](#client-controls-for-time-based-caching) - - [Rate-limiting non-cachable POST requests](#rate-limiting-non-cachable-post-requests) +## Organization of this document + +- [HTTP Endpoint](#http-endpoint) + - [Content type](#content-type) + - [HTTP methods](#http-methods) + - [Other notes](#other-notes) +- [HTTP Caching Considerations](#http-caching-considerations) + - [POST vs GET](#post-vs-get) + - [Etag](#etag) + - [Last-Modified](#last-modified) + - [Cache-Control](#cache-control) + - [Rate-limiting non-cachable POST requests](#rate-limiting-non-cachable-post-requests) - [Implementations](#implementations) -# HTTP Transport Design +## HTTP Endpoint -All messages sent in HTTP body MUST be encoded as DAG-JSON and use explicit content type `application/vnd.ipfs.rpc+dag-json; version=1` +``` +https://rpc-service.example.net/reframe +``` + +URL of a Reframe endpoint must end with `/reframe` path. + +### Content type + +Requests SHOULD be sent with explicit `Accept` and `Content-Type` HTTP headers specifying the body format. 
+ +All messages sent in HTTP body MUST be encoded as either: + +- [DAG-CBOR](https://ipld.io/specs/codecs/dag-cbor/spec/), and use explicit content type `application/vnd.ipfs.rpc+dag-cbor; version=2` + - **This is a CBOR (binary) format for use in production.** + - CBOR request MUST include HTTP header: `Accept: application/vnd.ipfs.rpc+dag-cbor; version=2` + - CBOR request AND response MUST include header: `Content-Type: application/vnd.ipfs.rpc+dag-cbor; version=2` +- [DAG-JSON](https://ipld.io/specs/codecs/dag-json/spec/), and use explicit content type `application/vnd.ipfs.rpc+dag-json; version=2` + - **This is a human-readable plain text format for use in testing and debugging.** + - JSON request MUST include header: `Accept: application/vnd.ipfs.rpc+dag-json; version=2` + - JSON request AND response MUST include header: `Content-Type: application/vnd.ipfs.rpc+dag-json; version=2` + +Implementations SHOULD error when an explicit content type is missing, but MAY decide to implement some defaults instead. +The rules around implicit content type are as follows: + +- Requests without a matching `Content-Type` header MAY be interpreted as DAG-JSON. +- Requests without a matching `Accept` header MAY produce a DAG-JSON response. +- Responses without a matching `Content-Type` header MAY be interpreted as DAG-JSON. + +### HTTP methods Requests MUST be sent as either: -- `GET /reframe?q={percent-encoded-dag-json}` - - DAG-JSON is supported via a `?q` query parameter, and the value MUST be [percent-encoded](https://en.wikipedia.org/wiki/Percent-encoding) +- `GET /reframe/{method}/{request-as-mbase64url-dag-cbor}` + - Cachable HTTP `GET` requests with message passed as DAG-CBOR in HTTP path segment, encoded as URL-safe [`base64url` multibase](https://docs.ipfs.io/concepts/glossary/#base64url) string + - Cachable `method` name is placed on the URL path, allowing for different caching strategies per `method`, and custom routing/scaling per `method`, if needed. 
+ - DAG-CBOR in multibase `base64url` is used (even when request body is DAG-JSON) because JSON may include characters that are not safe to be used in URLs, and percent-encoding or base-encoding a big JSON query may take too much space. + - Suitable for sharing links, sending bigger messages, and when a query result MUST benefit from HTTP caching (see _HTTP Caching Considerations_ below). + - DAG-CBOR response is the implicit default, unless explicit `Accept` header is passed +- `GET /reframe/{method}?json={percent-encoded-request-as-dag-json}` + - DAG-JSON is supported via a `?json` query parameter, and the value MUST be [percent-encoded](https://en.wikipedia.org/wiki/Percent-encoding) - Suitable for sharing links, sending smaller messages, testing and debugging. -- `POST /reframe` - - Ephemeral HTTP `POST` request with message passed as DAG-JSON in HTTP request body + - DAG-JSON response is the implicit default, unless explicit `Accept` header is passed +- `POST /reframe/{method}` + - Ephemeral HTTP `POST` request with DAG-JSON or DAG-CBOR message passed in HTTP request body and a mandatory `Content-Type` header informing endpoint how to parse the body - Suitable for bigger messages, and when HTTP caching should be skipped for the most fresh results + - Response type is the same as `Content-Type` of the request, unless explicit `Accept` header is passed Servers MUST support `GET` for methods marked as cachable and MUST support `POST` for all methods (both cachable and not-cachable). This allows servers to rate-limit `POST` when cachable `GET` could be used instead, and enables clients to use `POST` as a fallback in case there is a technical problem with bigger Reframe messages not fitting in a `GET` URL. See "Caching Considerations" section. +### Other notes + If a server supports HTTP/1.1, then it MAY send chunked-encoded messages. Clients supporting HTTP/1.1 MUST accept chunked-encoded responses. 
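Reviewer note: the three request forms listed under "HTTP methods" can be sketched from the client side roughly as follows. This is a minimal illustration, not code from go-delegated-routing. The helper names, the `ExampleMethod` method name, and the `{"a":1}` payload are hypothetical placeholders, and the DAG-CBOR bytes are assumed to come from an encoder that is not shown here.

```python
import base64
from urllib.parse import quote

VERSION = 2  # wire protocol version required by this change


def content_type(fmt: str) -> str:
    """Return the media type for 'dag-cbor' or 'dag-json'."""
    if fmt not in ("dag-cbor", "dag-json"):
        raise ValueError(f"unsupported format: {fmt}")
    return f"application/vnd.ipfs.rpc+{fmt}; version={VERSION}"


def cbor_get_url(endpoint: str, method: str, cbor_bytes: bytes) -> str:
    """GET /reframe/{method}/{request-as-mbase64url-dag-cbor}."""
    # Multibase base64url is the prefix 'u' followed by unpadded URL-safe base64.
    b64 = base64.urlsafe_b64encode(cbor_bytes).rstrip(b"=").decode("ascii")
    return f"{endpoint}/{quote(method, safe='')}/u{b64}"


def json_get_url(endpoint: str, method: str, dag_json_text: str) -> str:
    """GET /reframe/{method}?json={percent-encoded-request-as-dag-json}."""
    return f"{endpoint}/{quote(method, safe='')}?json={quote(dag_json_text, safe='')}"


def post_headers(fmt: str) -> dict:
    """Headers for POST /reframe/{method}: explicit Content-Type and Accept."""
    media_type = content_type(fmt)
    return {"Content-Type": media_type, "Accept": media_type}


if __name__ == "__main__":
    endpoint = "https://rpc-service.example.net/reframe"
    # a1 61 61 01 is the CBOR encoding of {"a": 1}; a placeholder, not a real Reframe message.
    print(cbor_get_url(endpoint, "ExampleMethod", bytes.fromhex("a1616101")))
    print(json_get_url(endpoint, "ExampleMethod", '{"a":1}'))
    print(post_headers("dag-cbor"))
```

Keeping the method name as its own path segment is what would let a reverse proxy route or cache per method without decoding the message.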
Requests and Responses MUST occur over a single HTTP call instead of the server being allowed to dial back the client with a response at a later time. The response status code MUST be 200 if the RPC transaction succeeds, even when there's an error at the application layer, and a non-200 status code if the RPC transaction fails. -If a server chooses to respond to a single request message with a group of messages in the response it should do so as a set of `\n` delimited DAG-JSON messages (i.e. `{Response1}\n{Response2}...`). +If a server chooses to respond to a single request message with a group of DAG-JSON messages in the response it should do so as a set of `\n` delimited DAG-JSON messages (i.e. `{Response1}\n{Response2}...`). +DAG-CBOR responses require no special handling, as they are already self-delimiting due to the nature of the CBOR encoding. -Requests and responses MUST come with `version=1` as a _Required Parameter_ in the `Accept` and `Content-Type` HTTP headers. +Requests and responses MUST come with `version=2` as a _Required Parameter_ in the `Accept` and `Content-Type` HTTP headers. Note: This version header is what allows the transport to more easily evolve over time (e.g. if it was desired to change the transport to support other encodings than DAG-JSON, utilize headers differently, move the request data from the body, etc.). Not including the version number is may lead to incompatibility with future versions of the transport. @@ -65,23 +108,42 @@ Use of `GET` endpoint is not mandatory, but suggested if a Reframe deployment expects to handle the same message query multiple times, and want to leverage existing HTTP tooling to maximize HTTP cache hits. -### Avoiding sending the same response messages twice +### Etag + +For small responses. -Implementations MUST always return strong +Implementations MAY return [`Etag`](https://httpwg.org/specs/rfc7232.html#header.etag) HTTP header based -on digest of DAG-JSON response messages. 
This allows clients to send -inexpensive conditional requests with +on a digest of response messages ONLY when `Etag` generation does not require +buffering a bigger response in memory before sending it to the client. + +In other words, do not use `Etag` if it will block a big, streaming response. +Streaming responses should use `Last-Modified` instead. + +`Etag` allows clients to send inexpensive conditional requests with [`If-None-Match`](https://httpwg.org/specs/rfc7232.html#header.if-none-match) header, which will skip when the response message did not change. -### Client controls for time-based caching +### Last-Modified + +For streaming responses. -Implementations can also return (optional) +Implementations SHOULD return [`Last-Modified`](https://httpwg.org/specs/rfc7232.html#header.last-modified) -HTTP header, allowing clients to send conditional requests with +HTTP header with bigger, streaming responses. + +This allows clients to send conditional requests with [`If-Modified-Since`](https://httpwg.org/specs/rfc7232.html#header.if-modified-since) header to specify their acceptance for stale (cached) responses. +### Cache-Control + +Implementations MAY return custom `Cache-Control` per Reframe method, +when a specific cache window makes sense in the context of a specific method. + +It is also acceptable to leave it out and let reverse HTTP proxies / CDNs +set it. The value will depend on the use case and expected load. + ### Rate-limiting non-cachable POST requests HTTP endpoint can return status code @@ -99,6 +161,6 @@ Retry-After: 3600 too many POST requests: consider switching to cachable GET or try again later (see Retry-After header) ``` -# Implementations +## Implementations https://github.com/ipfs/go-delegated-routing
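
Reviewer note: on the client side, the caching headers described in the hunks above could be used roughly as sketched below. This is an assumption-laden illustration rather than part of the patch. The function names are hypothetical, header lookups assume exact-case keys in a plain mapping, and only the integer-seconds form of `Retry-After` (as in the `Retry-After: 3600` example) is handled.

```python
from typing import Mapping, Optional


def conditional_headers(etag: Optional[str] = None,
                        last_modified: Optional[str] = None) -> dict:
    """Build conditional request headers from validators saved from an earlier response."""
    headers = {}
    if etag:
        # Validator for small responses that carried an Etag.
        headers["If-None-Match"] = etag
    if last_modified:
        # Validator for streaming responses that carried Last-Modified.
        headers["If-Modified-Since"] = last_modified
    return headers


def retry_after_seconds(headers: Mapping[str, str]) -> Optional[int]:
    """Return the Retry-After delay in seconds, or None if absent or not an integer."""
    value = headers.get("Retry-After", "").strip()
    return int(value) if value.isdigit() else None


if __name__ == "__main__":
    print(conditional_headers(etag='"abc123"'))
    print(retry_after_seconds({"Retry-After": "3600"}))
```

A client that receives the rate-limit response for `POST` could wait for the returned number of seconds, or switch to the cachable `GET` form as the error message suggests.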