Avoid full ROOT file downloads for metadata queries (use JSROOT remote access or partial reads)

## Background

Currently, the server downloads the entire remote ROOT file to answer questions about tree, event, or collection metadata (e.g., entry counts and branch info) using `ROOTAnalyzer`. This happens whenever methods such as `analyzeFile` or `getEventStatistics` are called, even though such queries need only a small portion of the file.

This is inefficient, especially for large files or datasets with many files: only the file header, directory structure, and key TTree objects need to be read, and downloading full files is slow and resource-intensive.

## Proposal

### 1. Use JSROOT's Remote-File Support
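JSROOT can read ROOT files directly from HTTP/HTTPS URLs, issuing byte-range requests to retrieve only the blocks it needs, provided the server honors `Range` headers. Whether JSROOT accepts `root://` URLs natively is unconfirmed and should be checked; where it does not, the XRootD server's HTTP(S) endpoint, if one is enabled, can be used instead. Instead of downloading the file into a buffer and passing a blob to `openFile`, pass the file URL directly to JSROOT: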

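```js
import { openFile } from 'jsroot';

// Before: download the whole file, then hand JSROOT an in-memory blob.
// const fileData = await this.xrootdClient.readFile(remotePath);
// const blob = new Blob([new Uint8Array(fileData)]);
// const file = await openFile(blob);

// After: pass the URL so JSROOT issues the byte-range requests itself.
// (HTTP/HTTPS endpoint; a root:// URL only if JSROOT turns out to support it.)
const file = await openFile('https://xrootd-server.org/path/to/file.root');
```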

This results in JSROOT fetching only the bytes necessary for metadata queries, significantly reducing transfer time and load.
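As a rough sketch of what a metadata query could look like on top of this, the helper below reads one tree and returns its entry count and branch names. The function name `readTreeMetadata` and the tree name `Events` are hypothetical. `openFile` is passed in as a parameter, so the real JSROOT function or a test stub can be supplied. The sketch assumes the object returned by `readObject` exposes `fEntries` and `fBranches.arr`, as JSROOT's TTree objects do.

```js
// Hypothetical helper, not part of ROOTAnalyzer: reads tree metadata by URL.
// `openFile` is injected (e.g. the function exported by the 'jsroot' package).
async function readTreeMetadata(openFile, url, treeName) {
  const file = await openFile(url);              // opens the remote file by URL
  const tree = await file.readObject(treeName);  // reads only the requested key
  if (!tree) {
    throw new Error(`No object named "${treeName}" in ${url}`);
  }
  // Branch list lives in a TObjArray; fall back to an empty list if absent.
  const branches = (tree.fBranches?.arr ?? []).map((branch) => branch.fName);
  return { url, treeName, entries: Number(tree.fEntries), branches };
}

// Possible usage (URL and tree name are placeholders):
//   import { openFile } from 'jsroot';
//   const meta = await readTreeMetadata(openFile, 'https://xrootd-server.org/path/to/file.root', 'Events');
```

Injecting `openFile` keeps the helper testable without network access and leaves room to swap in a different reader later.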

### 2. (Alternative) Use xrdcp --range or other byte-range approaches
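If HTTP endpoints are unavailable, implement logic to read only the file header and the key, streamer, and TTree objects using partial reads over `root://`, reusing whatever byte-range support the XRootD client tooling provides (the `xrdcp --range` option assumed here should be verified against the installed XRootD version). This requires a more complex parser, but is possible.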
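To give a sense of the parsing involved, here is a minimal sketch that decodes the first few fields of a ROOT file header from a byte range that has already been fetched. It assumes the usual TFile layout (4-byte `root` magic, then big-endian `fVersion`, `fBEGIN`, and `fEND`, with 64-bit offsets when the version is 1000000 or higher), which should be checked against the ROOT TFile documentation. The function name `parseRootHeader` is hypothetical, and fetching the bytes over `root://` is left out.

```js
// Hypothetical sketch: decode the leading fields of a ROOT file header.
// `bytes` is a Uint8Array holding at least the first 20 bytes of the file.
function parseRootHeader(bytes) {
  if (bytes.length < 20) {
    throw new Error('Need at least 20 bytes to parse the header');
  }
  const magic = String.fromCharCode(bytes[0], bytes[1], bytes[2], bytes[3]);
  if (magic !== 'root') {
    throw new Error('Not a ROOT file (bad magic bytes)');
  }
  // DataView reads big-endian by default, matching ROOT's on-disk byte order.
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const version = view.getInt32(4);
  const begin = view.getInt32(8);       // offset of the first data record
  const large = version >= 1000000;     // assumed marker for 64-bit offsets
  const end = large ? Number(view.getBigInt64(12)) : view.getInt32(12);
  return { version, begin, end, large };
}
```

A full implementation would continue from here to the key list and streamer info, which is where most of the parser complexity lies.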

### 3. Cache ROOT File Analysis Results
To avoid repeat downloads for the same (unchanged) file, implement a cache keyed on file path and modification time.
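One possible shape for such a cache is sketched below: an in-memory map keyed on path, where an entry is reused only if the stored modification time matches. The class name `AnalysisCache` and the `analyze` callback are hypothetical. How the modification time is obtained (for example from a stat call on the remote file) is assumed to be handled by the caller.

```js
// Hypothetical in-memory cache: one entry per path, invalidated by mtime.
class AnalysisCache {
  constructor() {
    this.entries = new Map(); // path -> { mtimeMs, result }
  }

  // Returns the cached result if the file is unchanged, otherwise re-analyzes.
  async get(path, mtimeMs, analyze) {
    const hit = this.entries.get(path);
    if (hit && hit.mtimeMs === mtimeMs) {
      return hit.result;
    }
    const result = await analyze(path);
    this.entries.set(path, { mtimeMs, result }); // replaces any stale entry
    return result;
  }
}
```

Keying on the path alone and comparing the modification time on lookup means a changed file overwrites its old entry instead of leaving stale results behind.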

### 4. (Optional) Parallelize Dataset-Wide Operations
For dataset-wide stats, process files in parallel (up to a safe concurrency limit) to improve wall-clock performance.
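A bounded-concurrency loop for this could look roughly like the sketch below. The name `mapWithConcurrency` and the `worker` callback are hypothetical. The limit itself would need tuning against what the storage endpoint tolerates.

```js
// Hypothetical helper: run `worker` over `items` with at most `limit` in flight.
// Results are returned in the same order as the input items.
async function mapWithConcurrency(items, limit, worker) {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new Error('limit must be a positive integer');
  }
  const results = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  async function run() {
    while (next < items.length) {
      const i = next++;                       // claim an item synchronously
      results[i] = await worker(items[i], i);
    }
  }

  // Start up to `limit` runners that pull from the shared index.
  const runners = Array.from({ length: Math.min(limit, items.length) }, () => run());
  await Promise.all(runners);
  return results;
}
```

In this sketch a single failing file rejects the whole call. A dataset-wide operation that should tolerate bad files would catch errors inside `worker` and return them as part of the result.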

## References
- [JSROOT Partial Read Example](https://jsroot.gsi.de/latest/examples.htm?file=https://xrootd-public.cern.ch//store/test/root_v6.14.00/Run2012B_SingleMu.root&item=Events)
- [uproot](https://github.com/scikit-hep/uproot5) for reference implementation in Python

## Impact
- Large reduction in bandwidth and latency for all metadata queries
- Makes interactive metadata browsing with large datasets practical
- If combined with caching and (if needed) parallel fetches, the server would be much more scalable for production use

---

**Summary:** Instead of always downloading full ROOT files for metadata queries, support partial-IO approaches (via JSROOT with remote URLs or range reads), and cache results for repeated queries.

Avoid full ROOT file downloads for metadata queries (use JSROOT remote access or partial reads) #58
