From 77637d39a343312c3de236cf6f7c0c7c478242b1 Mon Sep 17 00:00:00 2001 From: nsheff Date: Thu, 28 May 2026 08:17:21 -0400 Subject: [PATCH] Clarify start/end query parameter and Range header edge cases - Allow start or end query parameters to be used independently - Change out-of-bounds start from 400 to 416 (matches EBI) - Add explicit 416 for out-of-bounds end - Align Range header behavior with RFC 7233 (clip last-byte-pos, reject first-byte-pos > length with 416) Fixes #107 --- docs/decision_record.md | 22 ++++++++++++++++++++++ docs/sequences/README.md | 6 +++--- 2 files changed, 25 insertions(+), 3 deletions(-) diff --git a/docs/decision_record.md b/docs/decision_record.md index 1910071..4156521 100644 --- a/docs/decision_record.md +++ b/docs/decision_record.md @@ -8,6 +8,28 @@ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "S [TOC] +## 2026-05-28 Clarify start/end query parameter and Range header edge cases + +### Decision + +1. The `start` and `end` query parameters may be used independently (either one alone, or both together). +2. If only `start` is specified, the server returns the sub-sequence from `start` to the end of the sequence. +3. If only `end` is specified, the server returns the sub-sequence from position 0 to `end`. +4. If `start` or `end` exceeds the total sequence length, the server MUST respond with `Range Not Satisfiable` (416). Previously, the spec required `Bad Request` (400) for out-of-bounds `start`. +5. For Range header requests, behavior follows [RFC 7233](https://datatracker.ietf.org/doc/html/rfc7233#section-2.1): if `last-byte-pos` exceeds the sequence length, the server MUST clip it to the sequence length; if `first-byte-pos` exceeds the sequence length, the server MUST respond with `Range Not Satisfiable`. + +### Rationale + +The specification was ambiguous about whether `start` and `end` must both be provided or could be used independently. 
The compliance suite and reference implementation (EBI) already allowed either parameter to be omitted, so we codified this existing behavior. + +For query parameters, both out-of-bounds `start` and `end` now require `Range Not Satisfiable` (416), which matches the reference implementation (EBI) behavior. The previous spec required `Bad Request` (400) for out-of-bounds `start`. + +For Range headers, RFC 7233 specifies that an out-of-bounds `last-byte-pos` should be clipped to the representation length, while an out-of-bounds `first-byte-pos` makes the range unsatisfiable. The previous specification text ("out of bounds = Bad Request") was ambiguous and inconsistent with RFC 7233. We now explicitly align Range header behavior with the RFC. + +### Linked issues + +- + ## 2024-11-20 Level 2 return values should not return transient attributes ### Decision diff --git a/docs/sequences/README.md b/docs/sequences/README.md index 8bd8cce..9a178ca 100644 --- a/docs/sequences/README.md +++ b/docs/sequences/README.md @@ -177,14 +177,14 @@ Content-type: text/vnd.ga4gh.refget.v2.0.0+plain | Parameter | Data Type | Required | Description | |-----------|-------------------------|----------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| `start` | 32-bit unsigned integer | Optional | The start position of the range on the sequence, 0-based, inclusive. 
The server MUST respond with a `Bad Request` error if start is specified and is larger than the total sequence length. The server MUST respond with a `Range Not Satisfiable` error if start and end are specified and start is greater than end and the sequence is not a circular chromosome. Otherwise if the server does not support circular chromosomes it MUST respond with `Not Implemented` if the start is greater than the end. The server MUST respond with `Bad Request` if start and the Range header are both specified. | -| `end` | 32-bit unsigned integer | Optional | The end position of the range on the sequence, 0-based, exclusive. The server MUST respond with a `Range Not Satisfiable` error if start and end are specified and start is greater than end and the sequence is not a circular chromosome. Otherwise if the server does not support circular chromosomes it MUST respond with `Not Implemented` if the start is greater than the end. The server MUST respond with `Bad Request` if end and the Range header are both specified. | +| `start` | 32-bit unsigned integer | Optional | The start position of the range on the sequence, 0-based, inclusive. Either `start` or `end` may be specified independently; if only `start` is specified, the server returns the sub-sequence from `start` to the end of the sequence. The server MUST respond with a `Range Not Satisfiable` error if start is specified and is larger than the total sequence length. The server MUST respond with a `Range Not Satisfiable` error if start and end are specified and start is greater than end and the sequence is not a circular chromosome. Otherwise if the server does not support circular chromosomes it MUST respond with `Not Implemented` if the start is greater than the end. The server MUST respond with `Bad Request` if start and the Range header are both specified. | +| `end` | 32-bit unsigned integer | Optional | The end position of the range on the sequence, 0-based, exclusive. 
Either `start` or `end` may be specified independently; if only `end` is specified, the server returns the sub-sequence from position 0 to `end`. The server MUST respond with a `Range Not Satisfiable` error if `end` is greater than the total sequence length. The server MUST respond with a `Range Not Satisfiable` error if start and end are specified and start is greater than end and the sequence is not a circular chromosome. Otherwise if the server does not support circular chromosomes it MUST respond with `Not Implemented` if the start is greater than the end. The server MUST respond with `Bad Request` if end and the Range header are both specified. | #### Request parameters | Parameter | Data Type | Required | Description | |-----------|-------------------------|----------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| `Range` | string | Optional | Range header as specified in [RFC 7233](https://tools.ietf.org/html/rfc7233#section-3.1), however only a single byte range per GET request is supported by the specification. The byte range of the sequence to return, 0-based inclusive of start and end bytes specified. The server MUST respond with a `Bad Request` error if both a Range header and start or end query parameters are specified. The server MUST respond with a `Bad Request` error if one or more ranges are out of bounds of the sequence. 
| +| `Range` | string | Optional | Range header as specified in [RFC 7233](https://tools.ietf.org/html/rfc7233#section-3.1), however only a single byte range per GET request is supported by the specification. The byte range of the sequence to return, 0-based inclusive of start and end bytes specified. The server MUST respond with a `Bad Request` error if both a Range header and start or end query parameters are specified. Per RFC 7233, if the `last-byte-pos` exceeds the sequence length, the server MUST clip it to the sequence length; if the `first-byte-pos` exceeds the sequence length, the server MUST respond with `Range Not Satisfiable`. | | `Accept` | string | Optional | The formatting of the returned sequence, defaults to `text/vnd.ga4gh.refget.v2.0.0+plain` if not specified. A server MAY support other formatting of the sequence. The server SHOULD respond with a `Not Acceptable` error if the client requests a format not supported by the server. | #### Response
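
Reviewer note: the edge cases this patch codifies can be sketched in a few lines of Python. This is only an illustration and is not taken from the reference implementation or the compliance suite. The function names, the `is_circular` and `circular_supported` flags, and the `(error_status, interval)` return shape are all hypothetical. The order of the 416 and 501 checks for `start > end` follows the sentence order of the table text, and treating `first-byte-pos == length` as unsatisfiable follows RFC 7233 rather than the word "exceeds"; both are assumptions.

```python
from typing import Optional, Tuple

BAD_REQUEST = 400
RANGE_NOT_SATISFIABLE = 416
NOT_IMPLEMENTED = 501

# (error status or None, 0-based end-exclusive interval or None)
Result = Tuple[Optional[int], Optional[Tuple[int, int]]]


def resolve_query_range(
    length: int,
    start: Optional[int] = None,
    end: Optional[int] = None,
    has_range_header: bool = False,
    is_circular: bool = False,          # hypothetical flag
    circular_supported: bool = False,   # hypothetical flag
) -> Result:
    """Resolve the start/end query parameters against a sequence length."""
    # Query parameters and a Range header must not be combined.
    if has_range_header and (start is not None or end is not None):
        return BAD_REQUEST, None
    # Out-of-bounds start or end is 416 (start was 400 before this patch).
    if start is not None and start > length:
        return RANGE_NOT_SATISFIABLE, None
    if end is not None and end > length:
        return RANGE_NOT_SATISFIABLE, None
    # Either parameter may be omitted; fill in the missing bound.
    lo = 0 if start is None else start
    hi = length if end is None else end
    if start is not None and end is not None and start > end:
        if not is_circular:
            return RANGE_NOT_SATISFIABLE, None
        if not circular_supported:
            return NOT_IMPLEMENTED, None
        # Wrap-around retrieval is left to the caller in this sketch.
    return None, (lo, hi)


def resolve_range_header(length: int, first: int, last: int) -> Result:
    """Resolve 'bytes=first-last' (0-based, inclusive), assuming first <= last."""
    # Assumption: first-byte-pos at or past the end is unsatisfiable (RFC 7233).
    if first >= length:
        return RANGE_NOT_SATISFIABLE, None
    # An out-of-bounds last-byte-pos is clipped, not rejected.
    last = min(last, length - 1)
    return None, (first, last + 1)
```

For a sequence of length 10, `resolve_query_range(10, start=4)` yields the interval `(4, 10)`, `resolve_query_range(10, end=11)` yields a 416, and `resolve_range_header(10, 5, 99)` clips to `(5, 10)`.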