config_format: yaml: handle configuration metadata on yaml conf #11690

Open
cosmo0920 wants to merge 5 commits into master from cosmo0920-handle-configuration-metadata-on-yaml-conf

cosmo0920 commented Apr 9, 2026

Closes #11683.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • New Features

    • Added top-level metadata section in YAML configs, supporting nested metadata properties (e.g., usecase, config_version, annotations).
  • Bug Fixes / Validation

    • Stricter validation: mapping/sequence values allowed only in metadata and rejected in other top-level sections.
  • Tests

    • Added fixtures and tests covering metadata parsing, nested metadata contents, a probabilistic sampling processor, and rejection of nested maps in non-metadata sections.

Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>

coderabbitai bot commented Apr 9, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ad807d0e-edc6-4938-bf05-d05f71adb423

📥 Commits

Reviewing files that changed from the base of the PR and between aaef92c and e56d7b5.

📒 Files selected for processing (2)
  • src/config_format/flb_cf_yaml.c
  • tests/internal/config_format_yaml.c
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/internal/config_format_yaml.c

📝 Walkthrough

Adds a top-level metadata YAML section to the parser: new enums/states, top-level recognition and allocation of a metadata config section, variant-driven parsing for mapping/sequence values inside metadata, tighter rejection rules for nested maps in non-metadata unknown sections, and unit tests/fixtures exercising these behaviors.
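The acceptance rule from this walkthrough can be sketched in isolation. This is a simplified model under stated assumptions, not the code in flb_cf_yaml.c: the `demo_*` enum names and `nested_value_allowed()` are hypothetical, and the model covers only sections whose entries are plain key/value pairs (pipeline sections, which have their own nested grammar, are out of scope).

```c
#include <assert.h>

/* Hypothetical, simplified model of the rule. The real parser enforces it
 * through its state machine rather than through a single predicate. */
enum demo_section {
    DEMO_SECTION_ENV,
    DEMO_SECTION_METADATA,
    DEMO_SECTION_OTHER          /* unknown top-level section */
};

enum demo_value {
    DEMO_VALUE_SCALAR,
    DEMO_VALUE_MAPPING,
    DEMO_VALUE_SEQUENCE
};

/* Returns 1 when a value of this kind may appear in the section, else 0. */
int nested_value_allowed(enum demo_section section, enum demo_value value)
{
    switch (value) {
    case DEMO_VALUE_SCALAR:
        /* Scalars are accepted in every key/value section. */
        return 1;
    case DEMO_VALUE_MAPPING:
    case DEMO_VALUE_SEQUENCE:
        /* Nested values are accepted only under metadata. */
        return section == DEMO_SECTION_METADATA;
    }

    assert(!"unknown value kind");
    return 0;
}
```

Under this model a nested `annotations` map is accepted under metadata, while the same map under env or an unknown section is rejected.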

Changes

| Cohort / File(s) | Summary |
|---|---|
| Core Parser Logic: src/config_format/flb_cf_yaml.c | Introduce SECTION_METADATA / STATE_METADATA; recognize metadata at top-level and allocate cf_section; push variant-reading states for mapping/sequence values in metadata; insert parsed cfl_variant values into cf_section->properties; tighten STATE_SECTION_VAL to reject nested maps/sequences for env and non-metadata; use state_push_section(..., STATE_OTHER, SECTION_OTHER) for unknown top-level sections. |
| Tests & Harness: tests/internal/config_format_yaml.c | Add fixtures FLB_007/FLB_008; extend test_processors() expectations to include the new probabilistic processor; add test_metadata_section() to validate metadata parsing and properties; add test_other_nested_map_rejected() to assert rejection of nested maps in unknown top-level sections; register new tests. |
| YAML Fixtures: tests/internal/data/config_format/yaml/metadata.yaml, tests/internal/data/config_format/yaml/other_with_nested_map.yaml, tests/internal/data/config_format/yaml/processors.yaml | Add metadata.yaml with env, service, metadata (including nested annotations) and pipeline; add other_with_nested_map.yaml to trigger rejection of nested maps outside metadata; extend processors.yaml with a probabilistic sampling processor entry. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

Suggested reviewers

  • edsiper

Poem

🐰 I nibble YAML leaves so bright,
I hide metadata out of sight,
Maps and lists I gently store,
Variants hop into my core —
Hoppity parse, config feels light! 🥕📋

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 14.29% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: implementing metadata section handling in YAML configuration format. |
| Linked Issues check | ✅ Passed | The implementation fully addresses #11683 requirements: accepts and validates metadata as legal YAML, stores it separately from operational config, and ensures it has no runtime effect. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to implementing metadata section support as specified in #11683; no unrelated modifications detected. |


coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
tests/internal/config_format_yaml.c (1)

840-887: Please assert the config still loads with metadata present.

This verifies parse-time retention, but the feature contract is “accept and ignore operationally.” Adding a flb_config_load_config_format() success check here would catch any later regression where cf->others starts affecting runtime config loading.

As per coding guidelines, tests/**: "Add or update tests for behavior changes, especially protocol parsing and encoder/decoder paths" and "Validate both success and failure paths (invalid payloads, boundary sizes, null/missing fields) in tests".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/internal/config_format_yaml.c` around lines 840 - 887, The test parses
a YAML config into a flb_cf (test_metadata_section via flb_cf_yaml_create) but
doesn’t assert that runtime config loading still succeeds when metadata is
present; call flb_config_load_config_format(cf) after the existing
parse/metadata assertions and add a TEST_CHECK that it returns success (e.g., 0)
to ensure cf->others/metadata is ignored at load time; finally proceed to
flb_cf_destroy(cf) as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/config_format/flb_cf_yaml.c`:
- Line 1839: In the STATE_METADATA handling in flb_cf_yaml.c ensure top-level
metadata scalar nodes are routed through state_variant_parse_scalar() instead of
falling through to flb_cf_section_property_add(); update the STATE_METADATA case
(and the similar block around lines 1921-1937) to detect scalar tokens at the
top metadata level and call state_variant_parse_scalar() to create CFL_VARIANT_*
values, then add the resulting variant to the metadata section rather than
storing the raw string via flb_cf_section_property_add(), preserving type parity
with nested metadata parsing.
- Around line 2420-2423: When cfl_kvlist_insert(state->cf_section->properties,
state->key, variant) fails the newly created nested value stored in variant is
leaked; fix by destroying/freeing variant before returning (call the appropriate
cleanup function, e.g., cfl_variant_destroy(variant)) and then return
YAML_FAILURE. Update the error path in the block that checks the return of
cfl_kvlist_insert so it releases variant (and nulls it if your codebase
convention requires) to avoid the memory leak; refer to symbols
cfl_kvlist_insert, state->cf_section->properties, state->key, and variant to
locate the change.
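
The cleanup requested for the failed-insert path can be illustrated with a self-contained sketch. Everything here is a hypothetical stand-in: `demo_variant`, `demo_kvlist`, and the `demo_*` functions model only the ownership rule (the list owns a value after a successful insert, the caller owns it otherwise) and are not the CFL API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_KV_CAPACITY 2

/* Hypothetical stand-ins for a variant and a key/value list. */
struct demo_variant {
    char *text;
};

struct demo_kvlist {
    const char *keys[DEMO_KV_CAPACITY];
    struct demo_variant *values[DEMO_KV_CAPACITY];
    size_t count;
};

/* Number of variants created and not yet destroyed (for leak checking). */
int demo_live_variants = 0;

struct demo_variant *demo_variant_create(const char *text)
{
    size_t len = strlen(text) + 1;
    struct demo_variant *v = malloc(sizeof(*v));

    if (v == NULL) {
        return NULL;
    }
    v->text = malloc(len);
    if (v->text == NULL) {
        free(v);
        return NULL;
    }
    memcpy(v->text, text, len);
    demo_live_variants++;
    return v;
}

void demo_variant_destroy(struct demo_variant *v)
{
    free(v->text);
    free(v);
    demo_live_variants--;
}

/* Takes ownership of v on success; fails with -1 when the list is full. */
int demo_kvlist_insert(struct demo_kvlist *list, const char *key,
                       struct demo_variant *v)
{
    if (list->count >= DEMO_KV_CAPACITY) {
        return -1;
    }
    list->keys[list->count] = key;
    list->values[list->count] = v;
    list->count++;
    return 0;
}

/* Creates a value and stores it; on insert failure the value is released
 * before returning, which is the step the review comment asks for. */
int demo_store_value(struct demo_kvlist *list, const char *key,
                     const char *text)
{
    struct demo_variant *v;

    assert(list != NULL && key != NULL && text != NULL);

    v = demo_variant_create(text);
    if (v == NULL) {
        return -1;
    }
    if (demo_kvlist_insert(list, key, v) < 0) {
        demo_variant_destroy(v);    /* caller still owns v here */
        return -1;
    }
    return 0;
}
```

With a capacity of two, a third call to `demo_store_value()` fails and `demo_live_variants` stays at two, showing that the rejected value was released rather than leaked.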

---

Nitpick comments:
In `@tests/internal/config_format_yaml.c`:
- Around line 840-887: The test parses a YAML config into a flb_cf
(test_metadata_section via flb_cf_yaml_create) but doesn’t assert that runtime
config loading still succeeds when metadata is present; call
flb_config_load_config_format(cf) after the existing parse/metadata assertions
and add a TEST_CHECK that it returns success (e.g., 0) to ensure
cf->others/metadata is ignored at load time; finally proceed to
flb_cf_destroy(cf) as before.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b477964d-a275-4772-af84-5bc1479daa1c

📥 Commits

Reviewing files that changed from the base of the PR and between 40a6d4a and 89d9f20.

📒 Files selected for processing (4)
  • src/config_format/flb_cf_yaml.c
  • tests/internal/config_format_yaml.c
  • tests/internal/data/config_format/yaml/metadata.yaml
  • tests/internal/data/config_format/yaml/other_with_nested_map.yaml


chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 89d9f20510

return YAML_FAILURE;
}

if (cfl_kvlist_insert(state->cf_section->properties, state->key, variant) < 0) {

P2: Normalize metadata variant keys before insertion

When a metadata value is a map/array, this branch inserts it with raw state->key via cfl_kvlist_insert, while scalar metadata values still go through flb_cf_section_property_add (which applies the usual key normalization/translation). That makes key handling depend on value type: e.g., a camelCase key with a scalar is normalized, but the same key with a nested object is not, so lookups and downstream handling become inconsistent for mixed metadata shapes.
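
The inconsistency can be made concrete with a small sketch. It assumes, purely for illustration, that normalization means converting camelCase to snake_case; the comment above says only "the usual key normalization/translation", and `normalize_key()` and `same_key()` are hypothetical helpers, not Fluent Bit functions.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Writes a snake_case copy of key into out.
 * Returns 0 on success, -1 when out is too small. */
int normalize_key(const char *key, char *out, size_t out_size)
{
    size_t i;
    size_t j = 0;

    assert(key != NULL && out != NULL);

    for (i = 0; key[i] != '\0'; i++) {
        unsigned char c = (unsigned char) key[i];

        if (isupper(c)) {
            /* Insert '_' before an uppercase letter, except at the start. */
            if (i > 0) {
                if (j + 1 >= out_size) {
                    return -1;
                }
                out[j++] = '_';
            }
            c = (unsigned char) tolower(c);
        }
        if (j + 1 >= out_size) {
            return -1;
        }
        out[j++] = (char) c;
    }
    if (j >= out_size) {
        return -1;
    }
    out[j] = '\0';
    return 0;
}

/* Returns 1 when both keys normalize to the same string, else 0. */
int same_key(const char *a, const char *b)
{
    char na[64];
    char nb[64];

    if (normalize_key(a, na, sizeof(na)) != 0 ||
        normalize_key(b, nb, sizeof(nb)) != 0) {
        return 0;
    }
    return strcmp(na, nb) == 0;
}
```

If both the scalar path and the map/array path passed their key through one helper like this before insertion, `configVersion` would be stored under the same name regardless of the shape of its value.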

Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>

coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/internal/config_format_yaml.c`:
- Around line 472-480: The test currently dereferences second_processor and
sampling_type immediately after TEST_CHECK, which can crash on regression;
update the assertions in tests/internal/config_format_yaml.c to first verify
second_processor != NULL and second_processor->type == CFL_VARIANT_KVLIST (from
cfl_array_fetch_by_index) before accessing second_processor->data, and similarly
verify sampling_type != NULL and sampling_type->type == CFL_VARIANT_STRING (from
cfl_kvlist_fetch) before reading sampling_type->data.as_string; implement these
guards using explicit if checks around the existing checks (or convert
TEST_CHECK to conditional TEST_SKIP/return on failure) so failures produce clean
test assertions instead of dereference crashes.
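
The guard described above amounts to checking the pointer and the type tag before touching the payload. A minimal sketch follows, using a hypothetical `guard_variant` type instead of the real cfl_variant, so field and enum names are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tagged value; only the fields needed for the guard. */
enum guard_type {
    GUARD_STRING,
    GUARD_KVLIST
};

struct guard_variant {
    enum guard_type type;
    const char *as_string;      /* meaningful only when type == GUARD_STRING */
};

/* Returns the string payload, or NULL when v is missing or not a string.
 * Callers test the result instead of dereferencing v directly. */
const char *guard_string_or_null(const struct guard_variant *v)
{
    if (v == NULL || v->type != GUARD_STRING) {
        return NULL;
    }
    /* Invariant assumed for a well-formed string variant. */
    assert(v->as_string != NULL);
    return v->as_string;
}
```

A test built this way reports a failed check when the lookup returns NULL or a value of the wrong type, rather than crashing on the dereference.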
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2e234d49-63da-4ee4-90dc-8c49795acb02

📥 Commits

Reviewing files that changed from the base of the PR and between 89d9f20 and bc08185.

📒 Files selected for processing (2)
  • tests/internal/config_format_yaml.c
  • tests/internal/data/config_format/yaml/processors.yaml
✅ Files skipped from review due to trivial changes (1)
  • tests/internal/data/config_format/yaml/processors.yaml

Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>

Development

Successfully merging this pull request may close these issues.

YAML Config reserve attribute name for customer attributes
