Nested documents new UI options for advanced search #3382

@luis100

Feature Overview

Expose Solr nested documents through the RODA UI in two complementary ways:

  1. Advanced AIP Search — filter the AIP catalogue by nested child metadata (e.g. "show AIPs containing an email from X with subject Y")
  2. Virtual Catalogues — a new dropdown context that queries nested children directly as first-class results (e.g. "show all emails across all mailbox AIPs")

The email archive is the reference use case but the implementation must be fully generic — any metadata type that produces nested Solr documents with a content_type field benefits automatically through configuration alone.

Note

Nested document support in Solr is already wired end-to-end: SolrXMLLoader parses <field name="X"><doc>...</doc></field> blocks into child SolrInputDocument entries, and indexDescriptiveMetadataFields propagates them to the parent AIP document. The only missing pieces are the ingest XSLT crosswalk and the UI configuration to expose and search those children.
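
To make the nested shape concrete, here is a minimal sketch of what a crosswalk output with two child documents might look like, together with a small check that counts the nested <doc> elements. It uses only the JDK's DOM parser and is not RODA's SolrXMLLoader. The class name NestedDocShape, the sample values, and the assumption that several <doc> elements may share one <field> are illustrative; the field name emails is borrowed from the UUID example in this issue.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class NestedDocShape {
  // Hypothetical crosswalk output. Field names follow the data flow in this issue;
  // the values are made up for illustration.
  static final String SAMPLE =
      "<doc>"
    + "<field name=\"content_type\">emailarchive</field>"
    + "<field name=\"emails\">"
    + "<doc><field name=\"content_type\">email</field>"
    + "<field name=\"subject_txt\">Hello</field></doc>"
    + "<doc><field name=\"content_type\">email</field>"
    + "<field name=\"subject_txt\">Re: Hello</field></doc>"
    + "</field>"
    + "</doc>";

  /** Counts <doc> elements nested directly inside a <field> of the root <doc>. */
  static int countChildDocs(String xml) {
    Document dom;
    try {
      dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    } catch (Exception e) {
      throw new IllegalArgumentException("Not well-formed XML", e);
    }
    int count = 0;
    NodeList fields = dom.getDocumentElement().getChildNodes();
    for (int i = 0; i < fields.getLength(); i++) {
      Node field = fields.item(i);
      if (!(field instanceof Element) || !"field".equals(field.getNodeName())) {
        continue; // skip text nodes and anything that is not a <field>
      }
      NodeList inner = field.getChildNodes();
      for (int j = 0; j < inner.getLength(); j++) {
        Node n = inner.item(j);
        if (n instanceof Element && "doc".equals(n.getNodeName())) {
          count++;
        }
      }
    }
    return count;
  }
}
```

Calling NestedDocShape.countChildDocs(NestedDocShape.SAMPLE) counts the two email children under the emails field.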


Architecture

Data flow diagram
EmailArchive XML (descriptive metadata)
        │
        ▼ emailarchive.xslt (ingest crosswalk)
Solr parent doc  ←── content_type: emailarchive
  └─ child doc   ←── content_type: email, subject_txt, sender_s, sentDate_dt, ...
  └─ child doc
  └─ ...
        │
        ├── ParentWhichFilterParameter ──► /api/v2/aips/find → IndexResult<IndexedAIP> (parent mailbox AIPs)
        │                                    Used by: Advanced Search nested filter group
        │
        └── ChildOfFilterParameter    ──► /api/v2/aips/find → IndexResult<IndexedAIP> (child email docs)
                                           Used by: Virtual catalogue dropdown
                                           Results rendered by: ConfigurableAsyncTableCell (reads fields map)
                                           Row click: browse/{parentUUID}
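
The two branches of the diagram correspond to Solr's block join query parsers, {!parent which=...} and {!child of=...}. The sketch below builds both query strings by hand. It assumes, from their names only, that ParentWhichFilterParameter and ChildOfFilterParameter compile to these parsers; the class BlockJoinQueries and its methods are hypothetical and not part of RODA.

```java
final class BlockJoinQueries {
  /** Wraps a value in double quotes, escaping backslashes and embedded quotes. */
  static String quote(String value) {
    return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
  }

  /**
   * Returns parents (e.g. mailbox AIPs) that have at least one child matching childQuery.
   * parentFilter must match every parent document and no child document.
   */
  static String parentWhich(String parentFilter, String childQuery) {
    return "{!parent which=" + quote(parentFilter) + "}" + childQuery;
  }

  /**
   * Returns children (e.g. emails) of the parents matching parentQuery.
   * parentQuery must match parent documents only.
   */
  static String childOf(String parentFilter, String parentQuery) {
    return "{!child of=" + quote(parentFilter) + "}" + parentQuery;
  }
}
```

For the email use case, parentWhich("content_type:emailarchive", "+content_type:email +sender_s:alice") would back the Advanced Search nested filter group, and childOf("content_type:emailarchive", "content_type:emailarchive") would list every email across all mailbox AIPs for the virtual catalogue.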

Why children come back as IndexedAIP

The API endpoint /api/v2/aips/find always deserialises results into IndexedAIP. For child documents, mandatory AIP fields (permissions, ancestors, ghost, etc.) come from the parent or are empty. Child-specific dynamic Solr fields (subject_txt, sender_s, …) appear in indexedAip.getFields(). ConfigurableAsyncTableCell already reads from this map for any non-default column — no new Java result type is needed.
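
As an illustration of why the fields map is enough, the sketch below renders a row by looking up each configured column in a plain Map. The map stands in for the result of indexedAip.getFields(); the class FieldsMapColumns and the separator are hypothetical and only mimic the lookup, not ConfigurableAsyncTableCell itself.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

final class FieldsMapColumns {
  /**
   * Renders one row by looking up each configured column in the dynamic fields map.
   * Columns missing from the map render as empty cells.
   */
  static String renderRow(Map<String, Object> fields, List<String> columns) {
    return columns.stream()
        .map(column -> Objects.toString(fields.get(column), ""))
        .collect(Collectors.joining(" | "));
  }
}
```

A child email document with subject_txt and sender_s in its map can therefore fill a table whose columns are configured as those two field names, with no email-specific Java type.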

Child document UUID pattern

Child UUIDs follow the format {parentAIPUUID}/{fieldName}#{index} (e.g. 5f28e162-.../emails#42). The parent UUID is extracted by splitting on / and taking the first segment — used for the row-click navigation to browse/{parentUUID}.
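
The extraction described above can be sketched as follows. The class ChildUuid and its method names are hypothetical, and the UUID used in the usage note is a made-up placeholder.

```java
final class ChildUuid {
  /**
   * Returns the parent AIP UUID of a child document UUID of the form
   * {parentAIPUUID}/{fieldName}#{index}. A UUID without a slash is returned
   * unchanged, on the assumption that it already identifies a parent.
   */
  static String parentOf(String childUuid) {
    int slash = childUuid.indexOf('/');
    return slash < 0 ? childUuid : childUuid.substring(0, slash);
  }

  /** Builds the row-click history token browse/{parentUUID}. */
  static String browseToken(String childUuid) {
    return "browse/" + parentOf(childUuid);
  }
}
```

For example, ChildUuid.browseToken("00000000-0000-0000-0000-000000000000/emails#42") yields browse/00000000-0000-0000-0000-000000000000.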


Implementation Phases

| # | Issue | Title | Scope |
|---|-------|-------|-------|
| 0 | #3660 | EmailArchive Metadata Schema | XSD + XSLT crosswalk + roda-wui.properties registration. Zero Java. |
| 1 | #3661 | Advanced AIP Search: Nested Filter Groups | New nested_group field type in AdvancedSearchFieldsPanel; compiles to ParentWhichFilterParameter. |
| 2 | #3662 | Virtual Catalogue: Config, SearchWrapper & CatalogueSearch | ui.lists.{listId}.catalogue.* config; extend SearchWrapper.Components for multiple IndexedAIP lists; virtual list builders with ChildOfFilterParameter. |
| 3 | #3663 | History Token Wiring | Virtual catalogue selection reflected in URL via existing $prefilter scheme; fix handlePreFilterSearch for empty filter tokens. |
| 4 | #3664 | i18n and Integration Tests | All new labels localised; 4 integration test scenarios covering indexing, child query, parent query, and permission enforcement. |
| 5 | #3665 | Documentation — Nested Documents feature guide | Developer guide: how to add any nested metadata type (XSD + XSLT + config); virtual catalogue setup; advanced search nested groups; EmailArchive worked example. |

Dependency order

Phase 0 (XSD + XSLT)
  ├── Phase 1 (nested filter groups)    ← independent of Phase 2, can run in parallel
  └── Phase 2 (virtual catalogue)
       └── Phase 3 (history wiring)
            └── Phase 4 (i18n + tests)
                 └── Phase 5 (documentation)

Key Design Decisions

Important

No new Java result type. ConfigurableAsyncTableCell<IndexedAIP> reads object.getFields().get(c.getField()) for custom columns. Virtual catalogue columns map directly to dynamic Solr field names — no IndexedGenericDocument or VirtualCatalogueList class required.

Important

Config follows ui.lists.{listId}.* convention. Virtual catalogues are defined under the same ui.lists namespace as Search_AIPs, with new catalogue.* sub-keys for dropdown label, icon, base filter, and click action. No new JSON config files.
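
A possible shape for such a configuration, with a small reader that collects the catalogue.* sub-keys of one list, is sketched below. The four sub-key names (label, icon, filter, action) and their values are assumptions made for illustration; this issue only states that sub-keys exist for dropdown label, icon, base filter, and click action. The class CatalogueConfig is hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

final class CatalogueConfig {
  // Hypothetical roda-wui.properties fragment; key names after "catalogue." are assumed.
  static final String SAMPLE = String.join("\n",
      "ui.lists.Search_emails.catalogue.label = Emails",
      "ui.lists.Search_emails.catalogue.icon = fa fa-envelope",
      "ui.lists.Search_emails.catalogue.filter = content_type:email",
      "ui.lists.Search_emails.catalogue.action = browseParent");

  /** Returns the catalogue.* sub-keys of one list, with the common prefix stripped. */
  static Map<String, String> catalogueKeys(String propertiesText, String listId) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(propertiesText));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    String prefix = "ui.lists." + listId + ".catalogue.";
    Map<String, String> result = new TreeMap<>();
    for (String name : props.stringPropertyNames()) {
      if (name.startsWith(prefix)) {
        result.put(name.substring(prefix.length()), props.getProperty(name));
      }
    }
    return result;
  }
}
```

A list with no catalogue.* keys yields an empty map, which a caller could treat as "not a virtual catalogue".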

Important

History reuses $prefilter scheme. Virtual catalogue selection maps to #search/$prefilter/title/{Label}/@{listId}. The @Search_emails token is parsed by the existing handlePreFilterSearch() — it ends up in classFilters keyed by "Search_emails" and the constructor routes it to the virtual list builder. No changes to Search.RESOLVER.
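
The token layout can be illustrated with a small parser that pulls the list id out of a history fragment. This is only a sketch of the URL shape described above, not the logic of handlePreFilterSearch(); the class PrefilterToken and its method are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

final class PrefilterToken {
  /**
   * Extracts the list id from a fragment such as
   * #search/$prefilter/title/{Label}/@{listId}.
   * Returns empty when there is no $prefilter section or no @ token after it.
   */
  static Optional<String> listId(String fragment) {
    String path = fragment.startsWith("#") ? fragment.substring(1) : fragment;
    List<String> tokens = Arrays.asList(path.split("/"));
    int start = tokens.indexOf("$prefilter");
    if (start < 0) {
      return Optional.empty();
    }
    for (String token : tokens.subList(start + 1, tokens.size())) {
      // A bare "@" carries no list id, so it is ignored.
      if (token.length() > 1 && token.startsWith("@")) {
        return Optional.of(token.substring(1));
      }
    }
    return Optional.empty();
  }
}
```

With the fragment #search/$prefilter/title/Emails/@Search_emails, the method returns Search_emails, the key under which the virtual list builder would be looked up.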

Warning

SearchWrapper.Components clash. The inner class is keyed by Class<? extends IsIndexed>. Multiple virtual catalogues using IndexedAIP would overwrite each other. Phase 2 adds parallel string-keyed maps (virtualSearchPanels, virtualLists) keyed by listId.
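
The clash is ordinary map behaviour and can be reproduced in isolation. In the sketch below, IsIndexed and IndexedAIP are empty stand-ins for the RODA types, and the String values stand in for panels and lists; only the keying is of interest.

```java
import java.util.HashMap;
import java.util.Map;

final class ComponentsClash {
  // Stand-ins for the RODA types; they carry no behaviour here.
  interface IsIndexed {}
  static final class IndexedAIP implements IsIndexed {}

  /** Class-keyed map: the second registration replaces the first. */
  static Map<Class<? extends IsIndexed>, String> classKeyed() {
    Map<Class<? extends IsIndexed>, String> byClass = new HashMap<>();
    byClass.put(IndexedAIP.class, "Search_AIPs");
    byClass.put(IndexedAIP.class, "Search_emails"); // same key, so Search_AIPs is lost
    return byClass;
  }

  /** listId-keyed map: each virtual catalogue keeps its own entry. */
  static Map<String, String> listIdKeyed() {
    Map<String, String> virtualLists = new HashMap<>();
    virtualLists.put("Search_emails", "emails list");
    virtualLists.put("Search_attachments", "attachments list");
    return virtualLists;
  }
}
```

classKeyed() ends up with a single entry whose value is Search_emails, while listIdKeyed() keeps both catalogues, which is the motivation for the parallel string-keyed maps in Phase 2. Search_attachments is a made-up second catalogue.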


Affected Files Summary

All files touched across all phases

New files:

  • roda-core/roda-core/src/main/resources/config/schemas/emailarchive.xsd
  • roda-core/roda-core/src/main/resources/config/crosswalks/ingest/emailarchive.xslt
  • roda-core/roda-core-tests/.../NestedDocumentsTest.java

Modified files:

  • roda-core/roda-common-data/.../data/common/RodaConstants.java
  • roda-core/roda-common-data/.../data/v2/index/SearchField.java
  • roda-ui/roda-wui/src/main/resources/config/roda-wui.properties
  • roda-ui/roda-wui/src/main/resources/config/i18n/ServerMessages.properties (+ all locale files)
  • roda-ui/.../client/common/search/AdvancedSearchFieldsPanel.java
  • roda-ui/.../client/common/search/SearchWrapper.java
  • roda-ui/.../client/common/search/CatalogueSearch.java
