@kodjo-anipah kodjo-anipah commented Dec 3, 2025

Description

Extends the entity suggestion and filtering system to support:

  1. Custom identifier columns (related_identifier) — entity attributes can now specify which column to use when relating entries, instead of always assuming _id.
  2. Composite display values — filter dropdowns can display values composed from multiple fields using a template (e.g. {node_id} ({hostname})).
  3. Computed field filtering — a new provider-based system that enables filtering on runtime/computed values that don't exist in the database (e.g. input runtime status).

Motivation and Context

When using the entity table, we can define EntityAttributes which allow us to specify a related_collection and related_property (the value displayed in the filter options).

Problem 1: Wrong identifier column. Some entities don't reference related entities by MongoDB ObjectId. For example, inputs reference nodes via a node_id (UUID string), not the node's _id. When a user selected a node from the filter dropdown, the filter returned no results because it was matching against _id when it should match against node_id.

Problem 2: Non-human-readable display values. Sometimes there is no single database field that lets users easily identify an entity. For example, nodes are best identified by a combination of the short node_id and hostname (since hostnames alone may be shared across nodes).

Problem 3: Filtering on runtime state. Input runtime status (RUNNING, FAILED, etc.) is held in memory across cluster nodes, not stored in MongoDB. There was no way to filter inputs by their actual runtime state — only by desired_state, which may differ from reality.

Changes

Custom Identifier Columns

EntityAttribute gains three new optional fields:

  • related_identifier — the column in the related collection to match against (defaults to _id)
  • related_display_fields — list of fields to include in the composite display value
  • related_display_template — template string for formatting (e.g. "{node_id} ({hostname})")

These are plumbed through:

  • EntitySuggestionResource / EntitySuggestionService — the suggestion endpoint now accepts identifier, display_fields, display_template, and identifier_type query parameters.
  • MongoEntitySuggestionService — projects and returns the correct fields; builds display values using CompositeDisplayFormatter.
  • EntitySuggestion record — gains an optional targetId field so the frontend knows which value to use as the filter value (the identifier column value), distinct from the MongoDB _id.
  • EntityTitleServiceImpl / EntityIdentifier — the title resolution service now respects custom identifier fields and types when looking up entity titles, and uses composite display formatting when display fields/template are provided.
  • Frontend (SuggestionsListFilter, useFilterValueSuggestions, useFiltersWithTitle, PaginationTypes) — passes the new attribute metadata through to the suggestion API and title resolution, and uses targetId as the filter value when present.
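A minimal sketch of how a suggestion could carry both the MongoDB _id and the custom identifier value. The Suggestion record, toSuggestion helper, and filterValue method here are hypothetical simplifications for illustration, not the actual EntitySuggestion API, and a plain Map stands in for the database document.

```java
import java.util.Map;
import java.util.Optional;

final class IdentifierSketch {
    // Hypothetical, simplified suggestion shape; the real EntitySuggestion record may differ.
    record Suggestion(String id, String value, Optional<String> targetId) {
        // The value a filter should match on: targetId when present, otherwise the _id.
        String filterValue() {
            return targetId.orElse(id);
        }
    }

    // Builds a suggestion from a document, using `identifier` as the related_identifier column.
    static Suggestion toSuggestion(Map<String, Object> doc, String identifier, String displayField) {
        String id = String.valueOf(doc.get("_id"));
        String value = String.valueOf(doc.get(displayField));
        // Only populate targetId when a non-default identifier column is configured.
        Optional<String> targetId = "_id".equals(identifier)
                ? Optional.empty()
                : Optional.ofNullable(doc.get(identifier)).map(Object::toString);
        return new Suggestion(id, value, targetId);
    }
}
```

With identifier set to node_id, filterValue() yields the node's UUID string, so the filter matches the column that inputs actually reference.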

Composite Display Formatting

CompositeDisplayFormatter is a utility that builds a display string from a BSON Document:

  • With a template: replaces {field} placeholders with document values, cleans up empty parentheses from missing values.
  • Without a template: concatenates all field values with spaces.
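The two formatting modes above can be sketched roughly as follows. This is an illustration under assumptions, not the actual CompositeDisplayFormatter: the class name DisplayFormatterSketch is hypothetical, a plain Map replaces the BSON Document, and skipping null values in the no-template case is an assumption.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical stand-in for CompositeDisplayFormatter.
final class DisplayFormatterSketch {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    static String format(Map<String, Object> doc, List<String> fields, String template) {
        if (template == null || template.isBlank()) {
            // No template: join the field values with single spaces (nulls skipped, by assumption).
            return fields.stream()
                    .map(doc::get)
                    .filter(Objects::nonNull)
                    .map(Object::toString)
                    .collect(Collectors.joining(" "));
        }
        // Template: substitute each {field}; missing values become empty strings.
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            Object value = doc.get(m.group(1));
            m.appendReplacement(out, Matcher.quoteReplacement(value == null ? "" : value.toString()));
        }
        m.appendTail(out);
        // Drop parentheses left empty by missing values, then tidy whitespace.
        return out.toString().replaceAll("\\(\\s*\\)", "").replaceAll("\\s+", " ").trim();
    }
}
```

For a document with node_id abc123 and hostname graylog-node-1, the template "{node_id} ({hostname})" produces "abc123 (graylog-node-1)"; if hostname is missing, the empty parentheses are removed and the result is "abc123".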

Computed Field Filtering

A new provider-based system for filtering on values that don't exist in MongoDB:

  • ComputedFieldProvider (interface) — implementations return a Set<String> of entity IDs that match a given filter value. Registered via Guice Multibinder.
  • ComputedFieldRegistry (singleton) — maps field names to their providers. Injected into DbFilterExpressionParser.
  • DbFilterExpressionParser — now separates filter expressions into database fields vs. computed fields. Computed fields are resolved via their provider, and the resulting IDs are injected into the MongoDB query as { _id: { $in: [...] } }. Multiple values for the same computed field are OR'd; different computed fields are AND'd (intersection).
  • DbQueryCreator — accepts an optional authToken parameter (needed for cluster-wide queries) and passes it through to the filter parser.
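The OR/AND semantics described above can be illustrated with a small sketch. The Provider interface and resolve method are hypothetical simplifications (the real ComputedFieldProvider and DbFilterExpressionParser signatures are not shown in this description and may also take an auth token); the sketch only demonstrates how per-field unions are intersected into one ID set.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

final class ComputedFilterSketch {
    // Hypothetical simplification of ComputedFieldProvider: filter value -> matching entity IDs.
    interface Provider {
        Set<String> matchingIds(String filterValue);
    }

    // Returns an empty Optional when no computed field is filtered (no ID constraint needed);
    // otherwise the IDs that would go into an { _id: { $in: [...] } } clause.
    static Optional<Set<String>> resolve(Map<String, List<String>> filters,
                                         Map<String, Provider> providers) {
        Set<String> result = null;
        for (Map.Entry<String, List<String>> entry : filters.entrySet()) {
            Provider provider = providers.get(entry.getKey());
            if (provider == null) {
                continue; // Not a computed field; handled as a regular database filter.
            }
            // Multiple values for the same computed field are OR'd (union).
            Set<String> union = new HashSet<>();
            for (String value : entry.getValue()) {
                union.addAll(provider.matchingIds(value));
            }
            // Different computed fields are AND'd (intersection).
            if (result == null) {
                result = union;
            } else {
                result.retainAll(union);
            }
        }
        return Optional.ofNullable(result);
    }
}
```

Note that a present-but-empty set is meaningful: it means a computed filter matched nothing, so the query should return no rows rather than ignore the filter.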

Input Runtime Status (concrete example)

InputRuntimeStatusProvider implements ComputedFieldProvider for the runtime_status field:

  • Queries the local node's InputRegistry directly (no HTTP overhead).
  • Queries remote cluster nodes in parallel via RemoteInputStatesResource.
  • Caches results for 5 seconds to avoid repeated cluster-wide fetches.
  • Groups granular IOState.Type values into user-friendly categories: Running, Not Running, Setup Mode, Failed.
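The 5-second caching could look roughly like the sketch below. How InputRuntimeStatusProvider actually caches is not shown in this description, so TtlCacheSketch, its constructor parameters, and the injectable clock are all assumptions made for illustration.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical time-based cache: reloads the value once it is older than the TTL.
final class TtlCacheSketch<T> {
    private final Supplier<T> loader;   // e.g. the cluster-wide status fetch
    private final long ttlNanos;        // e.g. 5 seconds expressed in nanoseconds
    private final LongSupplier clock;   // injectable for tests; System::nanoTime in practice
    private T value;
    private long loadedAt;
    private boolean loaded;

    TtlCacheSketch(Supplier<T> loader, long ttlNanos, LongSupplier clock) {
        this.loader = loader;
        this.ttlNanos = ttlNanos;
        this.clock = clock;
    }

    // Synchronized so concurrent callers trigger at most one reload per expiry.
    synchronized T get() {
        long now = clock.getAsLong();
        if (!loaded || now - loadedAt >= ttlNanos) {
            value = loader.get();
            loadedAt = now;
            loaded = true;
        }
        return value;
    }
}
```

A caller would construct it with something like new TtlCacheSketch<>(fetchAllStatuses, TimeUnit.SECONDS.toNanos(5), System::nanoTime), where fetchAllStatuses is a hypothetical supplier that queries the local registry and the remote nodes.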

Supporting changes:

  • InputRegistry — refactored to use an internal InputStateCache with dual-indexed storage (byId and byState maps) for efficient lookups by input ID or by state. Subscribes to IOStateChangedEvent on the EventBus to keep state indexes updated in real time.
  • InputStatesResource — new GET /system/inputstates/local endpoint returning Map<String, String> (inputId → status) for cluster-wide aggregation.
  • RemoteInputStatesResource — Retrofit interface updated with getLocalStatuses().
  • InputsResource — the runtime_status attribute is defined with filterOptions (static set of status group labels) and wired to the ComputedFieldRegistry. The desired_state attribute is no longer filterable (replaced by runtime_status). The paginated list endpoint extracts the auth token from HttpHeaders and passes it through for computed field queries.
  • ServerBindings — binds ComputedFieldRegistry and registers InputRuntimeStatusProvider.
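The dual-indexed storage in InputStateCache can be pictured with the following sketch. DualIndexSketch and its method names are hypothetical, and plain strings stand in for input IDs and IOState.Type values; it only shows how a state change keeps the byId and byState maps consistent.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical dual index: look up a state by input ID, or input IDs by state.
final class DualIndexSketch {
    private final Map<String, String> byId = new ConcurrentHashMap<>();
    private final Map<String, Set<String>> byState = new ConcurrentHashMap<>();

    // Would be called from a state-change event handler; synchronized so both maps change together.
    synchronized void update(String inputId, String newState) {
        String oldState = byId.put(inputId, newState);
        if (oldState != null && !oldState.equals(newState)) {
            Set<String> ids = byState.get(oldState);
            if (ids != null) {
                ids.remove(inputId); // Remove the input from its previous state bucket.
            }
        }
        byState.computeIfAbsent(newState, k -> ConcurrentHashMap.newKeySet()).add(inputId);
    }

    synchronized void remove(String inputId) {
        String oldState = byId.remove(inputId);
        if (oldState != null) {
            Set<String> ids = byState.get(oldState);
            if (ids != null) {
                ids.remove(inputId);
            }
        }
    }

    String stateOf(String inputId) {
        return byId.get(inputId);
    }

    // Returns an immutable snapshot so callers cannot mutate the index.
    synchronized Set<String> idsIn(String state) {
        return Set.copyOf(byState.getOrDefault(state, Set.of()));
    }
}
```

The byState index is what makes a "give me all inputs in state X" query cheap, which is the lookup a runtime status filter needs.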

Todo:

The frontend still fetches input states separately to refresh the displayed state. We need a proper batch call that returns states only for the inputs currently visible.

Test Plan

  • Verify node filter on the Inputs page shows composite display values (e.g. abc123 (graylog-node-1))
  • Verify selecting a node filter correctly filters inputs by node_id
  • Verify the runtime status filter shows the four status groups (Running, Not Running, Setup Mode, Failed)
  • Verify filtering by runtime status returns correct inputs
  • Verify filter chips display resolved titles after page reload

        nodeStatuses.forEach((id, status) ->
                result.computeIfAbsent(id, k -> ConcurrentHashMap.newKeySet()).add(status));
    } catch (Exception e) {
        LOG.debug("Error fetching input states from node {}: {}", entry.getKey(), e.getMessage());

What should we do if we fail to fetch from a node?

@kodjo-anipah kodjo-anipah changed the title first draft of getting a related identifier Address Entity-table limitations Feb 11, 2026