[docs] decision: browsing contexts are exposed as handle objects #17681
Open: AutomatedTester wants to merge 4 commits into trunk from adr-bidi-context-handles
+290 −0
Commits:
- 6b5f6f8 [docs] decision: browsing contexts are exposed as handle objects (AutomatedTester)
- 019945c [docs] 17681: fold user contexts into the browsing-context handle ADR (AutomatedTester)
- 539a98c [docs] 17681: renumber cross-refs to PR numbers; align expect_popup +… (AutomatedTester)
- 047f88f [docs] 17681: add Java sketch + event-scoping subsection (AutomatedTester)
290 changes: 290 additions & 0 deletions
docs/decisions/17681-browsing-contexts-exposed-as-handle-objects.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,290 @@ | ||
| # 17681. Browsing contexts are exposed as handle objects | ||
|
|
||
| - Status: Proposed | ||
| - Date: 2026-06-11 | ||
| - Discussion: https://github.com/SeleniumHQ/selenium/pull/17681 | ||
|
|
||
| ## Context | ||
|
|
||
| Working with more than one tab/window over BiDi is awkward today because there is no | ||
| object that represents a single browsing context. The binding exposes a flat module, where | |
| every operation is called on one shared instance and takes the context id explicitly: | ||
|
|
||
| ```python | ||
| ctx = driver.browsing_context.create(type=WindowTypes.TAB) # returns a bare string id | ||
| driver.browsing_context.navigate(context=ctx, url="https://...") | ||
| driver.browsing_context.capture_screenshot(context=ctx) | ||
| driver.browsing_context.close(ctx) | ||
| ``` | ||
|
|
||
| This has two costs that compound for parallel work: | ||
|
|
||
| 1. **The user threads the context id through every call by hand.** There is no handle to | ||
| curry, so multi-tab code is verbose and error-prone, and event handlers cannot naturally | ||
| mean "this tab". | ||
| 2. **There is no clean unit to hand to a worker.** Driving N tabs concurrently means N | ||
| workers each repeating the `context=` bookkeeping against one shared module object, with no | |
| per-tab identity, no encapsulation. | ||
|
|
||
| Parallelisation is the motivating question. Selenium's BiDi transport is synchronous (one | ||
| WebSocket per driver); concurrency, when wanted, comes from threads. But threads have | ||
| nothing tab-shaped to own. Making one context per worker safe and ergonomic requires (a) a | ||
| per-context object and (b) a transport that is correct under concurrent use; the latter is | |
| a per-binding internal (see Consequences) and not decided here. | ||
|
|
||
| Playwright is the reference: it exposes Browser → BrowserContext → Page, and **every | |
| operation lives on the object** (`page.goto()`, `page.screenshot()`), never | |
| `goto(context_id, url)`. That object identity is exactly what makes | |
| `asyncio.gather(page_a.goto(...), page_b.goto(...))` (or a thread per page) trivially | |
| safe, because there is no shared mutable state to coordinate. | |
|
|
||
| Isolation is the same question one level up. Playwright's `BrowserContext` is an isolated | ||
| partition (separate cookies/storage) that *owns* its pages; BiDi's equivalent is the **user | ||
| context** (`browser.createUserContext`), which the spec defines as a collection of top-level | ||
| contexts with its own storage/cookie/permission/proxy partition. Crucially, a browsing | ||
| context's user context is **fixed when it is created and cannot be reassigned**: the protocol | |
| has no "move to another user context" command, and child contexts inherit their parent's. So | ||
| the isolation unit cannot be bolted on beside `create()` after the fact; it and the per-context | ||
| handle are one object-model question, and are decided here together. Most users, though, only | ||
| ever want "an isolated tab, or not"; they should not have to learn an isolation object to get | |
| it. | ||
|
|
||
| ## Decision | ||
|
|
||
| Bindings expose a **per-browsing-context handle object** bound to a single context id. | ||
| Operations that target a context are available as methods on the handle, in addition to the | ||
| existing flat module API. | ||
|
|
||
| Normative requirements: | ||
|
|
||
| - `create(...)`, the entries of `get_tree(...)`, and | ||
| `expect_page()`/`expect_popup()` (see | ||
| [17671](17671-bidi-events-awaited-with-expect-context-managers.md)) return handle objects, | ||
| not bare id strings. A handle exposes the context id for protocol-level use. | ||
| - The handle carries the per-context operations: `navigate`, `reload`, `activate`, `close`, | ||
| `capture_screenshot`, `print`, `set_viewport`, `traverse_history`, `locate_nodes`, | ||
| `handle_user_prompt`, and per-context event registration / `expect_*` waiters scoped to | ||
| **this** context. | ||
| - The existing flat module API | ||
| (`driver.browsing_context.navigate(context=id, ...)`, etc.) **remains** and is the | ||
| compatibility surface; the handle delegates to it. This is additive. | ||
| - **Concurrency contract** (enabled by, but separate from, this decision): a single driver | ||
| may be driven from multiple threads, one context per thread. Bindings state this contract | ||
| explicitly and ensure their transport upholds it (per-binding internal work: lock the | |
| message/callback state, signal command completion without busy-waiting, bound event | ||
| dispatch). | ||
| - The cross-binding **name** of the handle is part of this decision (candidates: a | ||
| `Page`-like object, `Tab`, `BrowsingContextHandle`). One name, adapted to each language's | ||
| casing. | ||
|
|
||
| Code sketch, Python (reference target): | |
|
|
||
| ```python | ||
| tab = driver.browsing_context.create(type=WindowTypes.TAB) # -> handle, not a bare id | ||
| tab.navigate("https://example.com") | ||
| tab.capture_screenshot() | ||
| tab.add_event_handler("load", on_load) # scoped to THIS context | ||
| with tab.expect_navigation(url="**/dashboard"): | ||
|     tab.click_somehow()  # placeholder for whatever action triggers the navigation | |
| tab.close() | ||
|
|
||
| # parallelism becomes clean: one object per worker, ids hidden | |
| from concurrent.futures import ThreadPoolExecutor | |
| tabs = [driver.browsing_context.create(type=WindowTypes.TAB) for _ in range(4)] | |
| with ThreadPoolExecutor() as ex: | |
|     # list() consumes the results so worker exceptions are raised here; safe under the concurrency contract | |
|     list(ex.map(lambda t: t.navigate(url), tabs)) | |
| ``` | ||
|
|
||
| Code sketch, other bindings (idiomatic shape, same semantics): | |
|
|
||
| ```javascript | ||
| const tab = await driver.browsingContext().create({ type: 'tab' }); // -> handle | ||
| await tab.navigate('https://example.com'); | ||
| await Promise.all(tabs.map(t => t.navigate(url))); | ||
| ``` | ||
|
|
||
| ```java | ||
| // Java: same semantics, idiomatic shape | |
| BrowsingContext tab = driver.browsingContext().create(WindowType.TAB); // -> handle | ||
| tab.navigate("https://example.com", ReadinessState.COMPLETE); | ||
| ``` | ||
|
|
||
| ### User contexts (the isolation unit) | ||
|
|
||
| Because a context's user context is fixed at creation (see Context), the isolation unit is the | ||
| **factory** for the contexts in it, not an attach-after API. Two entry points cover the two | ||
| real needs: | ||
|
|
||
| - **The common case is a boolean on creation.** `create(..., isolated=True)` returns an | ||
| ordinary browsing-context handle whose context lives in a fresh user context. **Closing that | ||
| handle also removes the user context it created** (which, per spec, closes any child contexts | ||
| and discards that partition's storage; `removeUserContext` is irreversible). The user never | |
| touches an isolation object. This is the 80% path. | ||
| - **The explicit case is the factory.** `browser.create_user_context(...)` returns the user | ||
| context, and browsing contexts are created *from* it | ||
| (`user_context.create_browsing_context(...)`). Its lifetime is **caller-managed** | ||
| (`remove()`), because one user context may own several tabs. Use this when tabs must share an | ||
| isolated partition, or to set per-partition options. | ||
| - **A new user context inherits the session's options.** Whether created via `isolated=True` or | ||
| `create_user_context()`, an unset `acceptInsecureCerts` / `proxy` / `unhandledPromptBehavior` | ||
| **defaults to the value the session was started with** (from its `options`), not to the | ||
| browser default; explicit arguments override. An isolated tab therefore behaves like the | ||
| session the user configured. | ||
| - **The isolation types are binding-internal.** The user-context object and the handle types | ||
| are private/implementation structures β returned and usable, but not a prominent public class | ||
| to learn. Bindings keep the surface minimal (id access, `remove`, the factory method), since | ||
| the overwhelming majority of use is `isolated=True`. | ||
| - **The default user context** is reachable through the same model, so ordinary (non-isolated) | ||
| tabs are not a special case. | ||
|
|
||
| ```python | ||
| # 80%: isolation on/off, zero config, returns a normal tab handle | |
| tab = driver.browsing_context.create(type=WindowTypes.TAB, isolated=True) | ||
| tab.navigate("https://example.com") | ||
| tab.close() # also removes the user context it created (storage discarded) | ||
|
|
||
| # explicit: several tabs in one isolated partition, or per-partition options | |
| uc = driver.browser.create_user_context(proxy=...) # inherits session opts unless overridden | ||
| a = uc.create_browsing_context() | ||
| b = uc.create_browsing_context() # same isolated partition | ||
| uc.remove() | ||
| ``` | ||
|
|
||
| ### Events are scoped by subscription, not by the user context | ||
|
|
||
| A user context isolates storage, cookies, permissions, and proxy; it does **not** isolate event | |
| delivery. Which events a subscriber receives is decided by the subscription's scope, evaluated at | ||
| the moment the event fires (BiDi `session.subscribe`): | ||
|
|
||
| - **global** (no scope): events from every context in every user context; | |
| - **`contexts=[…]`**: only those browsing contexts and their descendant frames; | |
| - **`userContexts=[…]`**: every context in those user contexts, **including ones created later** | |
| (membership is checked when the event fires, not snapshotted at subscribe time). | |
|
|
||
| This applies uniformly to `log.*` and `network.*` as to `browsingContext.*`. Consequently an | ||
| `isolated=True` tab does **not** by itself yield isolated logs/network: a global `network`/`log` | ||
| subscription still sees its traffic, and vice versa. To confine log/network events to an isolated | ||
| partition the subscription must be scoped β per tab via the handle (`contexts=[tab]`), or per | ||
| partition via the user context (`userContexts=[uc]`, which also covers future tabs). Bindings | ||
| therefore expose log/network registration on **both** the per-context handle and the user-context | ||
| object, while the bare `network`/`log` module stays global. | ||
|
|
||
| ```java | ||
| // GLOBAL (default): every context, every user context | ||
| new Network(driver).onBeforeRequestSent(r -> | |
|     log("[global] " + r.getRequest().getUrl())); | |
|
|
||
| // PER-TAB: contexts = [tab.getId()], i.e. this context and its frames only | |
| BrowsingContext tab = driver.browsingContext().create(WindowType.TAB, /* isolated= */ true); | |
| tab.network().onResponseCompleted(r -> | |
|     log("[tab] " + r.getResponseData().getUrl())); | |
|
|
||
| // PER-USER-CONTEXT: userContexts = [profile.getId()], i.e. whole partition, incl. tabs opened later | |
| UserContext profile = driver.browser().createUserContext(); | |
| profile.network().onBeforeRequestSent(r -> | |
|     log("[partition] " + r.getRequest().getUrl())); | |
| profile.createBrowsingContext(WindowType.TAB); // created after subscribe, still delivered | |
| ``` | ||
|
|
||
| ## Considered options | ||
|
|
||
| - **Per-context handle object, flat API retained (chosen)**: gives multi-tab code an | |
| object per context, hides ids, makes one-context-per-worker parallelism clean, and is | |
| purely additive. Matches the model users know from Playwright. | |
| - **Keep only the flat `context=`-passing API**: no new surface, but leaves the | |
| id-threading verbosity and gives parallel workers no encapsulated unit. Rejected: it is | |
| the problem being solved. | |
| - **Adopt a full async/`Page` object model (asyncio-native, like Playwright)**: the most | |
| capable model, but a major architectural change to a synchronous binding. Rejected | |
| here as out of scope; it deserves its own RFC. A synchronous handle plus the concurrency | |
| contract covers the bulk of real parallel use. | |
| - **Introduce a universal GUID object registry (Playwright-style routing)**: unnecessary, because | |
| BiDi already keys everything by `context`/`navigation`/`realm` ids. Rejected in favour of | |
| routing events by the existing context id into the relevant handle. | |
| - **Isolation as a boolean on `create`, isolation object kept internal (chosen)**: the 80% who | |
| just want an isolated tab get `isolated=True` and never meet an isolation object; the few who | |
| need a shared partition or per-partition options use the explicit factory. Matches Playwright's | |
| split (`browser.new_page()` shortcut vs `new_context()`), but keeps boolean ergonomics. | |
| - **Expose user context only as a first-class public object (Playwright `BrowserContext` style), | |
| no shortcut**: rejected because it forces everyone who wants a single isolated tab to learn a two-step | |
| object model they otherwise never need. | |
| - **Put the per-partition knobs (proxy/certs/prompt) on `create(...)` alongside `isolated`**: | |
| rejected because it conflates per-partition options with per-tab creation. Those options belong on | |
| `create_user_context()`; `isolated=` stays zero-config and inherits the session's options. | |
|
|
||
| ## Consequences | ||
|
|
||
| - Multi-tab and parallel code becomes object-oriented and id-free; an instance per worker | ||
| removes the shared-state coordination that the flat API forces. | ||
| - A new handle type per binding, and `create`/`get_tree`/`expect_page`/`expect_popup` return | ||
| types change from bare ids to handles; bindings introduce this additively (the handle still surfaces | |
| the id; the flat API is unchanged) and document the new return shape. The same applies to | ||
| `create_user_context`/`get_user_contexts`, which now return user-context handles. Making the | ||
| handle a string-compatible id wrapper (equality/hash/serialization unchanged) keeps these | ||
| return-type changes non-breaking. | ||
| - **Prerequisite, not part of this record:** the transport must be safe and efficient under | ||
| concurrent use (no busy-wait, locked shared state, bounded event dispatch). That is a | ||
| per-binding internal change with its own tests; this decision only states the contract it | ||
| must satisfy. | ||
| - **User contexts are folded into this object model** (this decision absorbs what would have | ||
| been a separate record): `isolated=True` for the common case, the `create_user_context()` | ||
| factory for the explicit case, session-option inheritance, and internal/private isolation | ||
| types. Specific follow-on effects: | ||
|   - **Behaviour change to flag:** a user context created with an unset option now inherits the | |
|     *session's* option rather than the *browser* default, e.g. `create_user_context(proxy=None)` | |
|     yields the session's proxy. Bindings document this. | |
|   - **High-risk wire mapping (capability/wire-level; verify per binding):** translating the | |
|     session's classic capabilities into BiDi user-context parameters. `acceptInsecureCerts` is a | |
|     clean bool; `proxy` maps the W3C proxy capability to BiDi's proxy-configuration union; | |
|     `unhandledPromptBehavior` maps the classic string to a `UserPromptHandler`, with the classic | |
|     "… and notify" variants mapped to their base action (BiDi surfaces prompts via events | |
|     regardless). Capture from the `options` object at construction (otherwise discarded), with | |
|     the negotiated capabilities as the fallback for Remote attach. | |
|   - **Lifecycle:** closing an `isolated=True` handle removes the user context it created | |
|     (irreversible, discards storage); the explicit factory's lifetime is caller-managed because | |
|     it can own several tabs. | |
| - Per-context event handlers require the subscription layer to track scope per context | ||
| (today some bindings key subscriptions by event name only, so context scoping is honoured | ||
| only for the first subscriber); bindings fix this as part of adopting handle-scoped | |
| events. | ||
|
|
||
| ## Binding status | ||
|
|
||
| | Binding | Status | Notes / tracking link | | ||
| |------------|---------|----------------------------------------------------------------------| | ||
| | Java | pending | | | ||
| | Python | pending | flat module API only (`browsing_context.<op>(context=id)`); no handle object yet | | ||
| | Ruby | pending | | | ||
| | .NET | pending | | | ||
| | JavaScript | pending | | | ||
|
|
||
| ## Appendix | ||
|
|
||
| Relevant BiDi surface: `browsingContext.create` (`type: "tab" | "window"`, optional | ||
| `userContext`), `browsingContext.getTree`, and the per-context commands | ||
| (`navigate`, `reload`, `activate`, `close`, `captureScreenshot`, `print`, `setViewport`, | ||
| `traverseHistory`, `locateNodes`, `handleUserPrompt`), and the `browsingContext.contextCreated` | ||
| event that backs `expect_page`/`expect_popup` (see | ||
| [17671](17671-bidi-events-awaited-with-expect-context-managers.md)). Every browsing-context | ||
| event already carries a `context` id, which is what lets events route to the right handle. | ||
|
|
||
| Isolation unit (verified against the spec): `browser.createUserContext` | ||
| (params `acceptInsecureCerts`, `proxy`, `unhandledPromptBehavior`), `browser.getUserContexts`, | ||
| and `browser.removeUserContext` (which closes all the user context's tabs and permanently | ||
| deletes its storage; the `"default"` user context always exists and cannot be removed). | ||
| `browsingContext.Info` carries a `userContext` field, so `getTree` reports each context's | ||
| partition. A user context is a collection of top-level contexts with its own | ||
| storage/cookie/permission/proxy partition, fixed at creation and inherited by child contexts; | ||
| there is **no** command to move a context to a different user context. This is the protocol | ||
| fact that makes the user context the *factory* for its browsing contexts. | ||
|
|
||
| Event scoping (verified against the spec, §3.6): a subscription carries a set of *event names*, | |
| *top-level traversable ids*, and *user context ids*; `session.subscribe` with neither `contexts` | ||
| nor `userContexts` is a global subscription. At event time the remote end returns true if the | ||
| subscription is global, or if the firing navigable's associated user context is in the | ||
| subscription's user context ids, so a `userContexts` subscription covers contexts created later | |
| in that partition. This is why user contexts isolate storage but not event delivery. | ||
|
|
||
| No new wire protocol is required; this decision is about the binding-side object model | |
| (handles, the `isolated=` shortcut, the user-context factory, session-option inheritance) and | ||
| the concurrency contract around it. | ||
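
The handle described in the Decision and Consequences sections (a string-compatible id wrapper that delegates to the retained flat API) might look roughly like the sketch below. This is only an illustration under stated assumptions, not any binding's implementation: `FlatBrowsingContext` is an in-memory stand-in for the flat module, `ContextHandle` is a hypothetical name (the ADR leaves the handle name open), and subclassing `str` is just one possible way to keep equality, hashing, and serialization unchanged.

```python
import json
import threading
from concurrent.futures import ThreadPoolExecutor


class FlatBrowsingContext:
    """In-memory stand-in (assumption) for the flat module: every call takes the context id."""

    def __init__(self):
        self._lock = threading.Lock()  # shared state is locked, as the concurrency contract asks
        self._next = 0
        self.urls = {}  # context id -> last navigated URL

    def create(self):
        with self._lock:
            self._next += 1
            context_id = f"ctx-{self._next}"
            self.urls[context_id] = "about:blank"
        return ContextHandle(context_id, self)  # a handle, not a bare id

    def navigate(self, context, url):
        with self._lock:
            self.urls[str(context)] = url
        return url

    def close(self, context):
        with self._lock:
            self.urls.pop(str(context), None)


class ContextHandle(str):
    """Hypothetical handle: a str subclass, so it still compares, hashes, and serializes as the id."""

    def __new__(cls, context_id, module):
        handle = super().__new__(cls, context_id)
        handle._module = module
        return handle

    @property
    def id(self):
        return str(self)  # plain id string for protocol-level use

    def navigate(self, url):
        # Delegates to the flat API; the caller never threads the id by hand.
        return self._module.navigate(context=self, url=url)

    def close(self):
        self._module.close(self)


flat = FlatBrowsingContext()
tabs = [flat.create() for _ in range(4)]

# One handle per worker; list() consumes the results so worker exceptions surface here.
with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(lambda t: t.navigate(f"https://example.com/{t.id}"), tabs))

print(json.dumps(tabs))  # handles serialize like the bare ids they wrap
```

Because the handle is still a `str` in this sketch, code written against the flat API (`flat.navigate(context=tabs[0], url=...)`) keeps working when `create()` starts returning handles, which is the additive property the record relies on.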
`browsing_context` here is still verbose: a simple accessor for a couple of methods, even without internal state... Can be simplified: `tab`, `context` here is an example of alternate names. Honestly, for me as an end user it is not obvious what BiDi's `browsingContext` and `userContext` are. `Tab` or `Page` is cleaner.
For the most part, I don't think users should need to understand contexts, which is why I call it "isolation". In terms of browser context, we already have the `switchTo` APIs, which we can extend to understand isolation. I don't really want too much on driver, as we can head into the space of creating a god object, which is worse.
I think for the most part the `UserContext` doesn't need to be surfaced to the user. I would for the verbose case have this and for a more succinct API that allows people to do whatever they want