This proposal is an early design sketch by the ChromeOS team to describe the problem of document scanning on the web and solicit feedback on the proposed solution. It has not been approved to ship in Chrome.
This proposal describes a web-native API for document scanning. It aims to provide a standardized, secure, and user-friendly way for web applications to interact with scanning hardware, such as flatbed scanners and automatic document feeders (ADF).
- Discussion: Issue tracker on GitHub
- Introduction
- Terminology
- Goals
- Non-goals
- Use cases
- Potential Solution
- Detailed design discussion
- Considered alternatives
- Security and Privacy Considerations
- Stakeholder Feedback / Opposition
- References & Acknowledgements
Imagine a user trying to upload a signed medical release form to a patient portal, or a small business owner scanning a stack of invoices into an accounting web app. Today, these users must often leave the browser, find an OS-specific scanning application, save the file to their local disk, and then return to the web app to upload it. Alternatively, they might be forced to install proprietary browser extensions or "bridge" software that creates a maintenance burden and security risks.
The Web Scanning API solves this by providing a secure, web-native interface for document scanning. It abstracts underlying protocols—including TWAIN, eSCL (AirScan), SANE, and WSD—into a cohesive JavaScript API. This allows developers to build seamless imaging workflows that work across different hardware and operating systems without needing to understand the intricacies of low-level transport protocols.
By leveraging a user-mediated permission model (similar to WebUSB or WebBluetooth), the API ensures that hardware access is strictly controlled by the user, protecting privacy while enabling powerful new capabilities for the web.
To help developers without background in imaging hardware, we define the following terms:
- Platen (Flatbed): The glass surface where a document is placed manually. Best for single pages or fragile documents.
- ADF (Automatic Document Feeder): A component that automatically pulls a stack of paper into the scanner. Essential for high-volume batch scanning.
- Duplex Scanning: The ability to scan both sides of a piece of paper automatically.
- Resolution (DPI): Measured in Dots Per Inch. Higher DPI means more detail but larger file sizes and slower scan speeds.
- Color Mode: Options typically include
color(24-bit),grayscale(8-bit), andmonochrome(1-bit black and white). - eSCL (AirScan): A modern, driverless, network-based scanning protocol that uses HTTP and XML.
- TWAIN Direct: A modern, driverless successor to the TWAIN standard that uses JSON "ScanTickets" for job configuration.
- Paper Jam: A mechanical error where paper becomes stuck inside the ADF mechanism.
- Standardization: Provide a unified API for document scanning agnostic of the underlying platform.
- Hardware Abstraction: Hide the complexity of various scanning protocols (TWAIN, eSCL, etc.) behind a clean JavaScript interface.
- Security & Privacy: Ensure hardware access is strictly controlled by the user through a permission-gated device picker, preventing silent surveillance or fingerprinting.
- Efficiency: Use streaming to handle high-resolution or multi-page scans without exhausting system memory.
- Asynchronicity: Handle mechanical hardware latency using Promises, events, and asynchronous iterables.
- Low-level Driver Access: The API is not intended to provide raw access to hardware registers or proprietary driver settings that are not relevant to standard scanning tasks.
- Direct Network Discovery: Web pages will not be able to silently discover all scanners on a local network; this discovery is managed by the browser during the device selection process.
A user needs to scan a single document (e.g., a signed contract) from a flatbed scanner into a web-based document management system.
An enterprise user (e.g., in a medical or legal office) needs to scan a large stack of documents using an ADF, potentially with duplex (two-sided) scanning enabled, directly into a record-keeping system.
A user scans a high-resolution photo. To ensure the UI feels responsive, the application displays chunks of the image as they are received. For high-volume workflows, the application can start uploading the first page while the second page is still being mechanically processed.
The API is exposed via navigator.scanners and uses a pattern similar to WebUSB or WebHID, where a user-mediated chooser is the primary entry point for hardware access.
The entry point for the API, used to request access to scanners.
// Triggers a browser-native device picker.
const scanner = await navigator.scanners.requestScanner({
filters: [{ vendorId: '0x04a9' }] // Optional filtering
});Represents a specific scanning device authorized by the user. Identifiers are randomized and scoped to the origin.
const capabilities = await scanner.getCapabilities();
console.log(`Available resolutions: ${capabilities.resolutions}`);
const controller = new AbortController();
const job = await scanner.scan({
resolution: 300,
colorMode: 'color',
source: 'adf',
signal: controller.signal
});Represents an ongoing scan operation. It implements AsyncIterable to yield pages as they complete.
// The job provides streaming access to scanned pages.
for await (const page of job) {
console.log(`Received page ${page.index}`);
const blob = await page.getBlob();
// page also provides a ReadableStream for chunked data
const stream = page.data;
}
job.onpaperjam = () => {
console.error("The ADF is jammed. Please clear the paper path.");
};async function performSimpleScan() {
const scanner = await navigator.scanners.requestScanner();
const job = await scanner.scan({ resolution: 300 });
// For a simple scan, we just take the first result
const { value: firstPage } = await job[Symbol.asyncIterator]().next();
const blob = await firstPage.getBlob();
const img = document.createElement('img');
img.src = URL.createObjectURL(blob);
document.body.appendChild(img);
}async function performBatchScan() {
const scanner = await navigator.scanners.requestScanner();
const job = await scanner.scan({
source: 'adf',
duplex: true
});
for await (const page of job) {
await uploadPage(await page.getBlob());
console.log(`Uploaded page ${page.index}`);
}
console.log("All pages scanned and uploaded.");
}To prevent fingerprinting and unauthorized hardware access, the API mandates a browser-native device chooser. Web applications cannot enumerate all connected scanners without an explicit user gesture. The requestScanner() method triggers this UI, and the resulting Scanner object represents a specific user-granted permission.
Scanning a single A4 page at 600 DPI can produce over 100MB of raw data. Returning this as a single atomic Blob can lead to out-of-memory crashes, especially on mobile devices or low-end hardware. The ScanJob interface uses ReadableStream and AsyncIterable to allow developers to process data as it arrives, enabling progressive rendering and efficient uploads.
Scanning is a mechanical process. Scanners can be Busy, Offline, or in an Error state (e.g., Paper Jam, Cover Open). The ScanJob interface provides event handlers and state properties to allow applications to react to these conditions and guide the user through physical troubleshooting. For consistency, the API maps protocol-specific status codes (e.g., from eSCL's /ScannerStatus) to a standardized set of ScanError types.
To ensure a consistent developer experience across different hardware, the API defines a high-level set of scan options (e.g., colorMode, source) that browsers map to underlying protocol-specific settings (such as SANE options or eSCL XML elements). This abstraction prevents vendor-specific fragmentation while still allowing for the most common professional workflows.
Modern scanners often support advanced hardware-level processing. While the initial version of the API focuses on core imaging (PNG/JPEG output), future extensions may include:
- Automatic Blank Page Detection: Skipping empty pages in a batch.
- Auto-rotation and Auto-cropping: Correcting page orientation and detecting document boundaries.
- Archival and Security Formats: Support for PDF/A (archival fidelity) and cryptographic provenance via C2PA signatures. These features currently require significant new development in underlying platform layers (like
lorgnetteon ChromeOS) and are deferred to later versions of the specification.
Different scanning protocols use different units (e.g., eSCL uses 1/300ths of an inch). The Web Scanning API normalizes all spatial dimensions to millimeters (physical) or CSS pixels (logical), ensuring a consistent developer experience across all hardware.
While WebUSB/WebBluetooth provide low-level access, they require web developers to implement complex protocol drivers (e.g., TWAIN over USB) in JavaScript. This is extremely burdensome, prone to errors, and insecure as it exposes raw hardware buffers to the web page.
Legacy APIs like chrome.documentScan are restricted to specific browser environments and extensions.
- Secure Contexts: The API is strictly limited to HTTPS origins.
- Permissions Policy: Access is controlled by the
web-scanningpolicy, defaulting toself. - Randomized Identifiers: Scanner IDs are randomized and origin-bound.
Scanner.idonexample.comwill be different from the ID for the same device onother-site.com. - Transient Activation: Requests for hardware access (
requestScanner) or starting a scan (scan) require a user gesture (e.g., a click). - Visible Indicators: Browsers should show a persistent "Scanning..." indicator while the hardware is active, similar to camera/microphone indicators.
- Google (Chrome): Positive, driving the proposal.
- TWAIN Working Group: Potentially positive, as their TWAIN Direct standard is explicitly designed to enable scanners to communicate directly with web applications.
- Mopria Alliance (eSCL): Potentially positive, as the proposed API leverages their vendor-neutral driverless standard.