
[XM Cyber][Entity Inventory] Add Entity Inventory data stream #19550

Open
muskan-agarwal26 wants to merge 8 commits into elastic:feature/xm_cyber-0.1.0 from muskan-agarwal26:datastream-entity_inventory

Conversation

@muskan-agarwal26

Contributor

Proposed commit message

The initial release includes the entity_inventory data stream and its associated dashboard and visualizations.

XM Cyber fields are mapped to their corresponding ECS fields where possible.

Test samples were derived from live data samples, which were subsequently sanitized.

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.
  • I have verified that any added dashboard complies with Kibana's Dashboard good practices

How to test this PR locally

To test the XM Cyber package:

  • Clone the integrations repo.
  • Install elastic-package locally.
  • Start the Elastic Stack using elastic-package.
  • Move to the integrations/packages/xm_cyber directory.
  • Run the following command to run the tests.

elastic-package test

2026/06/16 15:05:33  INFO New version is available - v0.124.0. Download from: https://github.com/elastic/elastic-package/releases/tag/v0.124.0
Run asset tests for the package
2026/06/16 15:05:33  INFO elastic-package v0.120.0 version-hash 97620231 (build time: 2026-02-18T21:47:16+05:30)
2026/06/16 15:05:33  INFO elastic-stack: 8.18.0
--- Test results for package: xm_cyber - START ---
╭──────────┬──────────────────┬───────────┬───────────────────────────────────────────────────────────────────┬────────┬──────────────╮
│ PACKAGE  │ DATA STREAM      │ TEST TYPE │ TEST NAME                                                         │ RESULT │ TIME ELAPSED │
├──────────┼──────────────────┼───────────┼───────────────────────────────────────────────────────────────────┼────────┼──────────────┤
│ xm_cyber │                  │ asset     │ dashboard xm_cyber-2170babe-e0a3-4289-a13b-fcb606f812a7 is loaded │ PASS   │      1.105µs │
│ xm_cyber │ entity_inventory │ asset     │ index_template logs-xm_cyber.entity_inventory is loaded           │ PASS   │        163ns │
│ xm_cyber │ entity_inventory │ asset     │ ingest_pipeline logs-xm_cyber.entity_inventory-0.1.0 is loaded    │ PASS   │         93ns │
╰──────────┴──────────────────┴───────────┴───────────────────────────────────────────────────────────────────┴────────┴──────────────╯
--- Test results for package: xm_cyber - END   ---
Done
Run pipeline tests for the package
2026/06/16 15:05:38  INFO elastic-package v0.120.0 version-hash 97620231 (build time: 2026-02-18T21:47:16+05:30)
2026/06/16 15:05:38  INFO elastic-stack: 8.18.0
--- Test results for package: xm_cyber - START ---
╭──────────┬──────────────────┬───────────┬──────────────────────────────────────────────────────┬────────┬──────────────╮
│ PACKAGE  │ DATA STREAM      │ TEST TYPE │ TEST NAME                                            │ RESULT │ TIME ELAPSED │
├──────────┼──────────────────┼───────────┼──────────────────────────────────────────────────────┼────────┼──────────────┤
│ xm_cyber │ entity_inventory │ pipeline  │ (ingest pipeline warnings test-entity-inventory.log) │ PASS   │ 383.237731ms │
│ xm_cyber │ entity_inventory │ pipeline  │ test-entity-inventory.log                            │ PASS   │ 211.352897ms │
╰──────────┴──────────────────┴───────────┴──────────────────────────────────────────────────────┴────────┴──────────────╯
--- Test results for package: xm_cyber - END   ---
Done
Run policy tests for the package
2026/06/16 15:05:39  INFO elastic-package v0.120.0 version-hash 97620231 (build time: 2026-02-18T21:47:16+05:30)
2026/06/16 15:05:39  INFO elastic-stack: 8.18.0
--- Test results for package: xm_cyber - START ---
No test results
--- Test results for package: xm_cyber - END   ---
Done
Run script tests for the package
PKG xm_cyber
[no test files]
--- Test results for package: xm_cyber - START ---
No test results
--- Test results for package: xm_cyber - END   ---
Done
Run static tests for the package
2026/06/16 15:05:40  INFO elastic-package v0.120.0 version-hash 97620231 (build time: 2026-02-18T21:47:16+05:30)
--- Test results for package: xm_cyber - START ---
╭──────────┬──────────────────┬───────────┬──────────────────────────┬────────┬──────────────╮
│ PACKAGE  │ DATA STREAM      │ TEST TYPE │ TEST NAME                │ RESULT │ TIME ELAPSED │
├──────────┼──────────────────┼───────────┼──────────────────────────┼────────┼──────────────┤
│ xm_cyber │ entity_inventory │ static    │ Verify sample_event.json │ PASS   │ 173.810707ms │
╰──────────┴──────────────────┴───────────┴──────────────────────────┴────────┴──────────────╯
--- Test results for package: xm_cyber - END   ---
Done
Run system tests for the package
2026/06/16 15:05:40  INFO elastic-package v0.120.0 version-hash 97620231 (build time: 2026-02-18T21:47:16+05:30)
2026/06/16 15:05:40  INFO elastic-stack: 8.18.0
2026/06/16 15:05:40  INFO Installing package...
2026/06/16 15:05:53  INFO Running test for data_stream "entity_inventory" with configuration 'default'
2026/06/16 15:06:02  INFO Setting up independent Elastic Agent...
2026/06/16 15:06:12  INFO Setting up service...
2026/06/16 15:06:33  INFO Validating test case...
2026/06/16 15:06:34  INFO Tearing down service...
2026/06/16 15:06:35  INFO Write container logs to file: /root/elastic/gitlab-integration/integrations/build/container-logs/xm_cyber-1781602595318074434.log
2026/06/16 15:06:38  INFO Tearing down agent...
2026/06/16 15:06:38  INFO Write container logs to file: /root/elastic/gitlab-integration/integrations/build/container-logs/elastic-agent-1781602598262733231.log
2026/06/16 15:06:48  INFO Uninstalling package...
--- Test results for package: xm_cyber - START ---
╭──────────┬──────────────────┬───────────┬───────────┬────────┬───────────────╮
│ PACKAGE  │ DATA STREAM      │ TEST TYPE │ TEST NAME │ RESULT │  TIME ELAPSED │
├──────────┼──────────────────┼───────────┼───────────┼────────┼───────────────┤
│ xm_cyber │ entity_inventory │ system    │ default   │ PASS   │ 40.991983944s │
╰──────────┴──────────────────┴───────────┴───────────┴────────┴───────────────╯
--- Test results for package: xm_cyber - END   ---
Done

Related issues

Screenshots

integrations_1 integrations_2

Implementation Details

Default Config Values:

  • interval: 24h
  • page_size: 1000

@muskan-agarwal26 muskan-agarwal26 requested a review from a team as a code owner June 16, 2026 09:37
@elastic-vault-github-plugin-prod


🚀 Benchmarks report

To see the full report comment with /test benchmark fullreport

).do_request().as(refResp,
refResp.Body.decode_json().as(ro,
{
"events": [{"message": "retry"}],

🟠 HIGH data_stream/entity_inventory/.../cel.yml.hbs:114

Retry-path event becomes phantom pipeline_error doc

On a 401/419 the CEL program emits {"events": [{"message": "retry"}], ...}. The ingest pipeline has no drop for a message of 'retry' — only for 'Refresh token successful' and 'Refresh token expired, forcing re-auth', which the CEL never produces. So rename_message_to_event_original moves retry into event.original, then json_event_original_into_xm_cyber_entity_inventory tries to parse the bare string retry as JSON, fails, and the pipeline-level on_failure block sets event.kind: pipeline_error and tags the doc preserve_original_event. Every token-refresh cycle therefore lands one phantom error document in logs-xm_cyber.entity_inventory-*.

Recommendation:

Either stop emitting an event on the retry path (mutate cursor only and rely on want_more) or, if a marker event is wanted, drop it in the pipeline. The CEL-only fix:

: (resp.StatusCode == 401 || resp.StatusCode == 419) ?
  post_request(
    base + "/api/refresh-token",
    "application/json",
    {"refreshToken": tok.refresh}.encode_json()
  ).with(
    {"Header": {"Content-Type": ["application/json"]}}
  ).do_request().as(refResp,
    refResp.Body.decode_json().as(ro,
      {
        "events": [],
        "cursor": (refResp.StatusCode == 200) ?
          {
            "access_token": ro.accessToken,
            "refresh_token": ro.refreshToken,
            "need_reauth": false,
          }
        :
          {
            "access_token": "",
            "refresh_token": "",
            "need_reauth": true,
          },
        "want_more": true,
      }
    )
  )

🤖 AI-Generated Review | Vera Review Bot

⚠️ Automated review — verify suggestions before applying.

Contributor

This review is only half correct. The incorrect half is the claim that emitting a cursor together with an empty events array will have any effect.

The correct fix is either to update the drop processor to match the retry message, or to add a Filebeat drop to the agent template and remove the ingest pipeline drop processor.
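
A minimal sketch of the second option, assuming the agent template (cel.yml.hbs) takes a standard Beats processors list. Where this sits relative to any user-supplied processors in the template is an assumption. It drops the marker event in Filebeat before it reaches the ingest pipeline:

```yaml
# Sketch only: belongs in the CEL input section of the agent template.
# Drops the token-refresh marker event emitted on the 401/419 path.
processors:
  - drop_event:
      when:
        equals:
          message: retry
```

With a drop like this on the agent side, the ingest pipeline drops for the refresh-token messages would no longer be needed.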

- drop:
    description: Drops CEL informational token-refresh success messages from the ingest stream.
    tag: drop_cel_refresh_token_success
    if: ctx.message instanceof String && ctx.message == 'Refresh token successful'

🟡 MEDIUM data_stream/entity_inventory/.../default.yml:17

Dead drop processors for refresh-token messages

Both drop_cel_refresh_token_success (message == 'Refresh token successful') and drop_cel_refresh_token_force_reauth (message == 'Refresh token expired, forcing re-auth') match strings the CEL program never produces. The CEL emits 'retry' on the token-refresh path; nothing in this package emits the two literal phrases the drops are guarding against. They are inert filters.

Recommendation:

Remove the stale drops, or replace them with one that matches what this CEL actually emits if the retry event is kept:

- drop:
    description: Drops CEL token-refresh retry signal events.
    tag: drop_cel_refresh_token_retry
    if: ctx.message instanceof String && ctx.message == 'retry'

🤖 AI-Generated Review | Vera Review Bot

⚠️ Automated review — verify suggestions before applying.

@@ -0,0 +1,3 @@
{"id":"11405078888731052442","accessKeyCreationDate":"Unknown","podIP":"","ec2PublicIpAddress":"","agentVersion":{"major":1,"minor":55,"patch":2},"agentVersionStr":"1.55.2","arch":"Amd64","cmId":"0000","connectionCounter":160,"customProperties":{"snifferStatus":"Active","snifferStatusChangeable":true,"domainWorkgroup":{"type":"workgroup","data":"workgroup"},"ouComputer":"workgroup","subnetInfo":"172.0.0.0/24","macAddresses":["00:50:56:3D:0A:93"],"ouUser":"workgroup","labels":[{"label":"spooler"},{"label":"device_without_edr"}],"snifferStatusConfiguration":"ForcedEnabled","custom_labels":[{"label":"testLabel"},{"label":" sn : azure Identity : sn "},{"label":"Azure Identity"},{"label":"2test_vmazure virtual machine_azure_test"},{"label":"1azureCRazure Container registry_test"}],"hardwareInfo":{"totalRamMb":"2047","cpuProcessorType":"Intel(R) Xeon(R) Gold 5318Y CPU @ 2.10GHz","cpuCoreCount":1,"cpuCount":1,"cpuManufacturer":"GenuineIntel","cpuSpeedMhz":2095,"systemManufacturer":"VMware, Inc.","systemModel":"VMware Virtual 
Platform"}},"customerId":"fda93183-19f4-447d-bd49-83633329ee37","disabled":false,"disabledChangedAt":"2025-12-01T05:31:01.103Z","disabledReason":"revivedByCmNodeMgr","firstSeen":"2024-08-07T12:18:26.093Z","hasUpdateAvailable":false,"installationId":"00000000-0000-0000-0000-000000000001","ipv4":[{"data":[192,168,1,203],"type":"Buffer"}],"ipv4Num":[2885681155],"ipv4Str":["192.0.2.0"],"ipv6":[{"data":[253,170,63,62,245,208,0,1,168,44,1,25,76,65,80,190],"type":"Buffer"}],"ipv6Str":["fe80::1c2c:5b3a:97df:13f1"],"lastConnectionTime":"2026-05-03T08:40:06.399Z","lastDisconnectionReason":"Keepalive","lastRebootTime":"2025-05-15T09:33:32.000Z","lastStatusChange":"2026-05-03T08:40:06.399Z","latestPossibleAgentVersion":{"major":1,"minor":55,"patch":2},"latestPossibleAgentVersionStr":"1.55.2","name":"172-0-0-3","nameUppercase":"172-0-0-3","notIncludedInAttacks":false,"os":{"version":{"build":0,"major":10,"minor":0,"patch":18363},"servicePack":{"build":0,"major":0,"minor":0,"patch":0},"distributionName":"","distributionVersion":"","name":"Windows 10 ver 1909"},"osType":"Windows","productType":"Workstation","remoteAddress":"199.203.99.104","securityFlags":["hasSession","hasCachedCredentials"],"status":"active","timeToReviveAt":"2026-05-08T00:00:00Z","type":"agent","typeDisplayName":"Device","hasMatchingSID":false,"lastUpdatedAt":"2026-05-03T09:06:44.280Z","securityFlagsForDisplay":[{"key":"examplekey"}],"southOwner":"south-owner-1","domainName":"workgroup","labels":[{"id":"testLabel","type":"custom"},{"id":" sn : azure Identity : sn ","type":"custom"},{"id":"Azure Identity","type":"custom"},{"id":"2test_vmazure virtual machine_azure_test","type":"custom"},{"id":"1azureCRazure Container registry_test","type":"custom"},{"id":"!@$TEST:2))","type":"custom"},{"id":" sn : Access Token : sn ","type":"custom"},{"id":"Email Service","type":"custom"},{"id":"shirel-device","type":"custom"},{"id":"felix 
test","type":"custom"}],"machineId":"ca722442-9a91-849d-0fdd-438e7a0701f1","agentType":"Service","category":"enterprise","xmLabels":[{"id":"Spooler server"},{"id":"Public IP"},{"id":"Device without EDR"}],"importedLabels":["SN Name : 172-0-0-3","SN Created : 2026-04-14 05:15:00"],"entityDetails":{"name":"172-0-0-3","id":"11405078888731052442","isAsset":true,"subType":"windows","subTypeDisplayName":"Device"},"accountId":"702947630755","arn":"arn:aws:ssm:us-east-2:702947630755:parameter/EC2Rescue/Passwords/i-0d056ac1b7c822c92","displayName":"/EC2Rescue/Passwords/i-0d056ac1b7c822c92","entityType":"agent","region":"us-east-2","ruleDisplayName":"702947630755 / /EC2Rescue/Passwords/i-0d056ac1b7c822c92","ssmParameterDataType":"text","ssmParameterDescription":"New local Administrator password for instance i-0d056ac1b7c822c92","ssmParameterKeyId":"alias/aws/ssm","ssmParameterLastModifiedDate":"2021-07-28T08:11:54.200Z","ssmParameterLastModifiedUser":"arn:aws:sts::702947630755:assumed-role/AmazonSSMRoleForInstancesQuickSetup/i-0d056ac1b7c822c92","ssmParameterName":"/EC2Rescue/Passwords/i-0d056ac1b7c822c92","ssmParameterTier":"Standard","ssmParameterType":"SecureString","ssmParameterVersion":1,"useType":"Storage","xmProviderAccount":"xm-test3","xmUpdateTime":"2026-05-05T21:05:15.079Z","accountName":"xm-test3","organizationId":"o-wvjziar78j","awsTags":[{"Key":"aws:cloudformation:stack-id","Value":"arn:aws:cloudformation:us-east-1:908522078858:stack/StackSet-crowdstrike-SensorManagement-9fb10f6b-9dc3-4c3c-a078-dcec6bde4487/3493fc10-2bf9-11f0-a92a-0affd5d0d7df"},{"Key":"aws:cloudformation:stack-name","Value":"StackSet-crowdstrike-SensorManagement-9fb10f6b-9dc3-4c3c-a078-dcec6bde4487"},{"Key":"aws:cloudformation:logical-id","Value":"CrowdStrikeSensorManagementFalconCredentialsSecret"}],"secretKmsKeyId":"alias/tenant-secret-kms-local","secretDescription":"Falcon API credentials used by the 1-Click sensor management orchestrator.","tagsStr":["aws:cloudformation:stack-id: 
arn:aws:cloudformation:us-east-1:908522078858:stack/StackSet-crowdstrike-SensorManagement-9fb10f6b-9dc3-4c3c-a078-dcec6bde4487/3493fc10-2bf9-11f0-a92a-0affd5d0d7df","aws:cloudformation:stack-name: StackSet-crowdstrike-SensorManagement-9fb10f6b-9dc3-4c3c-a078-dcec6bde4487","aws:cloudformation:logical-id: CrowdStrikeSensorManagementFalconCredentialsSecret"],"kmsKeyAliases":["alias/aws/secretsmanager","alias/example"],"kmsKeyCreationDate":"2024-12-05T15:14:24.368Z","kmsKeyDescription":"","kmsKeyManager":"CUSTOMER","kmsKeyOrigin":"AWS_KMS","kmsKeyState":"Enabled","kmsKeyUsage":"ENCRYPT_DECRYPT"}

🟡 MEDIUM data_stream/entity_inventory/.../test-entity-inventory.log:1

Real-looking customer data in pipeline test fixture

The first event contains values that are not synthetic per the anonymize-logs conventions: real-shaped AWS account IDs (702947630755, 908522078858), a real-looking public IPv4 (remoteAddress: 199.203.99.104), a real-looking principal email inside an SSM parameter assumed-role ARN (...zur@xmcyber.com), and a real-looking CloudFormation stack ID (StackSet-crowdstrike-SensorManagement-9fb10f6b-9dc3-4c3c-a078-dcec6bde4487/3493fc10-2bf9-11f0-a92a-0affd5d0d7df). Pipeline test fixtures are committed and visible in the public repo.

Recommendation:

Replace them with the placeholders that the anonymize-logs skill specifies (RFC 5737 IPs, example.com domains, synthetic UUIDs, and 123456789012-style AWS IDs):

{"id":"11405078888731052442","remoteAddress":"203.0.113.50","accountId":"123456789012","arn":"arn:aws:ssm:us-east-2:123456789012:parameter/EC2Rescue/Passwords/i-0d056ac1b7c822c92","ssmParameterLastModifiedUser":"arn:aws:sts::123456789012:assumed-role/AmazonSSMRoleForInstancesQuickSetup/i-0d056ac1b7c822c92","ruleDisplayName":"123456789012 / /EC2Rescue/Passwords/i-0d056ac1b7c822c92"}

🤖 AI-Generated Review | Vera Review Bot

⚠️ Automated review — verify suggestions before applying.

@andrewkroh andrewkroh added the documentation and Crest labels on Jun 16, 2026
}
`}}

# Page 2 — fetched via nextLink cursor=page2 (more specific match, must precede page 1)

Contributor

I'm not a fan of using rule order to restrict returned pages; it's brittle and not specified anywhere. We can use the query_params in this case AFAICS; if the cursor param is set to null, we can enforce an absent cursor. Then we can arbitrarily order the rules and so make them reader-friendly, in expectation order.
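
A rough sketch of what this could look like, assuming the system-test mock server config uses rules with path, methods, query_params, and responses keys, and that a null query parameter value matches only when the parameter is absent, as described in the comment above. The path and response bodies are hypothetical placeholders; only the cursor=page2 value comes from the existing config.

```yaml
rules:
  # Page 1: matches only when no cursor parameter is sent.
  - path: /api/example/entities   # hypothetical path
    methods: ['GET']
    query_params:
      cursor: null
    responses:
      - status_code: 200
        body: |-
          {"placeholder": "page 1 response with a nextLink to cursor=page2"}

  # Page 2: matches the follow-up request carrying the cursor.
  - path: /api/example/entities   # hypothetical path
    methods: ['GET']
    query_params:
      cursor: page2
    responses:
      - status_code: 200
        body: |-
          {"placeholder": "page 2 response without a nextLink"}
```

Because each rule now states its own cursor condition, the rules can be listed in the order the requests are expected.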


Comment on lines +2115 to +2120
  - append:
      tag: append_preserve_on_collector_error
      field: tags
      value: preserve_original_event
      allow_duplicates: false
      if: ctx.error?.message != null

Contributor

Also add a

  - set:
      tag: set_pipeline_error_to_event_kind
      field: event.kind
      value: pipeline_error
      if: ctx.error?.message != null

"by": "_count",
"direction": "desc"
},
"title": "Included in Attacks"

Contributor

This is opposite to line 33. Either it should be renamed here to reflect the actual semantics (small change), or preferably the name should remain the same and the field semantics and name should be either inverted, or an additional field holding the negation should be added.
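
One way the additional-field option might look, sketched as an ingest processor. The field names are assumptions: the API field notIncludedInAttacks is assumed to be stored as xm_cyber.entity_inventory.not_included_in_attacks, and included_in_attacks is a hypothetical new field that the "Included in Attacks" panel could then use.

```yaml
# Sketch only: field names are assumed, not taken from the package.
- script:
    lang: painless
    tag: script_set_included_in_attacks
    description: Store the negation of not_included_in_attacks in a separate field.
    if: ctx.xm_cyber?.entity_inventory?.not_included_in_attacks instanceof Boolean
    source: |-
      ctx.xm_cyber.entity_inventory.included_in_attacks =
        !ctx.xm_cyber.entity_inventory.not_included_in_attacks;
```

The new field would also need a boolean mapping in the data stream's field definitions.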

@muskan-agarwal26 muskan-agarwal26 requested a review from efd6 June 17, 2026 10:57

@chemamartinez chemamartinez left a comment

Contributor

It would be great to map as many entity fields as possible to the ECS entity fields (https://www.elastic.co/docs/reference/ecs/ecs-entity).

Based on sample logs from pipeline and system tests, the fields that should be mapped are:

  • {host|user}.entity.lifecycle.last_activity (depending on the entity type, device or user)
  • user.entity.attributes.mfa_enabled

Regarding relationship fields, I don't see any data that we could map, but I'd check whether relationship data can be retrieved from the API with any extra parameters or endpoints.

Also, if the data stream collects entity inventory, it is important to set event.kind = asset for further entity workflows. Its current value is state, which makes less sense in my opinion (https://www.elastic.co/docs/reference/ecs/ecs-allowed-values-event-kind).
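
For the event.kind change, a minimal sketch of the ingest processor, assuming the pipeline currently sets the value with a set processor. The tag name is hypothetical.

```yaml
# Sketch only: replaces whichever processor currently sets event.kind to state.
- set:
    tag: set_event_kind_to_asset
    field: event.kind
    value: asset
```

The expected pipeline test output and sample_event.json would need to be regenerated after a change like this.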

@elasticmachine


💚 Build Succeeded


6 participants