
in_tail: harden db offset restore across restart #11691

Merged
edsiper merged 3 commits into master from integration-tail
Apr 9, 2026

Conversation

@edsiper
Member

@edsiper edsiper commented Apr 9, 2026

This branch hardens in_tail restart recovery and adds broad regression coverage. The main plugin change makes DB restore safer by validating persisted offsets before reuse, preventing resume into stale positions after copytruncate or inode reuse. It also persists the last resumable offset when a trailing partial line is buffered, so restarts complete that line exactly once. To support that metadata, the SQLite schema is extended with an additive, automatic migration.
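As a rough illustration of the validation idea (not the plugin's actual C implementation; the window size, hash choice, and function names below are assumptions), the resume check amounts to hashing the bytes that precede the persisted offset and comparing them against a stored marker before seeking:

```python
import hashlib
import os
import tempfile

MARKER_WINDOW = 1024  # hypothetical: how many bytes before the offset get hashed

def compute_marker(path, offset, window=MARKER_WINDOW):
    """Hash the bytes immediately preceding `offset` in the file."""
    start = max(0, offset - window)
    with open(path, "rb") as f:
        f.seek(start)
        data = f.read(offset - start)
    return hashlib.sha256(data).hexdigest(), len(data)

def resume_offset(path, stored_offset, stored_marker, stored_size):
    """Resume at the stored offset only when the marker still matches;
    otherwise fall back to 0 (file truncated or replaced)."""
    if os.path.getsize(path) < stored_offset:
        return 0  # file shrank: copytruncate or replacement
    marker, size = compute_marker(path, stored_offset)
    if size == stored_size and marker == stored_marker:
        return stored_offset
    return 0

# Demo: persist a marker, then replace the file contents in place.
path = os.path.join(tempfile.mkdtemp(), "app.log")
with open(path, "wb") as f:
    f.write(b"line one\nline two\n")          # 18 bytes
saved_marker, saved_size = compute_marker(path, 18)
same = resume_offset(path, 18, saved_marker, saved_size)
with open(path, "wb") as f:
    f.write(b"rotated, different bytes\n")    # larger file, new content
after_replace = resume_offset(path, 18, saved_marker, saved_size)
```

With matching bytes the stored offset is reused; after the in-place rewrite the marker mismatch forces a re-read from the start, which is the behavior the copytruncate and inode-reuse tests rely on.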

On the test side, the branch adds a comprehensive in_tail integration scenario covering discovery, startup-from-tail, rename rotation, repeated rotation, copytruncate, restart resume, partial-line restart recovery, DB migration, multiline, parser mode, long-line handling, rotate_wait, gzip, generic encoding, ignore_older, exclude_path, docker mode, and permission restoration. It also adds a separate multiline_interface integration scenario with a compiled probe that proves current API-level multiline bugs through strict xfail regression cases.


Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • New Features

    • Added an offset-marker mechanism to improve resume validation for database-backed tailing and reduce false resumes into replaced content.
  • Bug Fixes

    • Improved handling of offsets and stream positions for decompressed reads and fixed multiline parser group/overflow edge cases.
    • Enhanced ignore-older behavior to avoid re-ingesting aged-out files.
  • Documentation

    • Added comprehensive in_tail test scenario documentation.
  • Tests

    • Added extensive integration scenarios, parser configs, and end-to-end pytest coverage.
  • Chores

    • Added safe database schema migrations and stronger persistence checks at startup.

@edsiper edsiper requested a review from cosmo0920 as a code owner April 9, 2026 17:27
@edsiper edsiper changed the title from "in_tail:" to "in_tail: harden db offset restore across restart" Apr 9, 2026
@coderabbitai

coderabbitai bot commented Apr 9, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Adds DB-backed offset-marker persistence and validation to in_tail: stores a hash/size of bytes preceding persisted offsets, runs DB schema migrations, validates markers on resume to detect rewrites/truncations, adjusts file positioning/stream offset logic, tracks aged-out inodes, and adds extensive integration and multiline tests.

Changes

Cohort / File(s) Summary
Offset Marker DB & SQL
plugins/in_tail/tail_db.c, plugins/in_tail/tail_sql.h
Added db_apply_migration_if_needed() and two ALTER TABLE migrations; run migrations on DB open and abort on failure; select/insert/update offset_marker and offset_marker_size; use DB-derived offset for inserts/updates; expanded prepared-statement bindings.
File-level Marker Logic
plugins/in_tail/tail_file.c, plugins/in_tail/tail_file.h, plugins/in_tail/tail_file_internal.h
Added per-file fields db_offset_marker/db_offset_marker_size; new inline helper flb_tail_file_db_offset(); new APIs flb_tail_file_update_offset_marker() and flb_tail_file_offset_marker_matches(); integrate marker validation into set_file_position() and adjust stream_offset assignment for decompressed reads; initialize marker fields.
Aged-out inode tracking & scan changes
plugins/in_tail/tail_config.c, plugins/in_tail/tail_config.h, plugins/in_tail/tail_scan.c, plugins/in_tail/tail_scan.h, plugins/in_tail/tail_scan_glob.c, plugins/in_tail/tail_scan_win32.c
Added aged_out_file_inodes hash table to config; new register/unregister/fetch helpers and usage in glob/Win32 scanning to exclude files whose inode matches a stored aged-out inode; ensured proper allocation and cleanup.
Integration tests, scenarios & configs
tests/integration/scenarios/in_tail/..., tests/integration/scenarios/in_tail/tests/test_in_tail_001.py
Added README, ~18 YAML scenario configs covering rotation modes, parsers, encoding, db-backed resume, ignore/purge semantics, and a comprehensive pytest module with helpers exercising DB migrations, offset-marker behavior, rotations, and many restart cases.
Multiline regression tests
tests/internal/multiline.c
Added multiline-group test helpers, captured-logs accumulator, parser factory, and four regression tests exercising group flush/truncation and empty-context crash cases.
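The additive migration described in the "Offset Marker DB & SQL" row can be sketched in a few lines. The Python/sqlite3 version below is an illustrative stand-in, not the plugin's C code; the function and table names mirror the summary above but are assumptions:

```python
import sqlite3

def apply_migration_if_needed(conn, table, column, ddl):
    """Add a column only if it is missing, so the migration is safe to
    run on every DB open (a stand-in for db_apply_migration_if_needed)."""
    cols = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column not in cols:
        conn.execute(f"ALTER TABLE {table} ADD COLUMN {ddl}")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE in_tail_files (id INTEGER PRIMARY KEY, "
             "name TEXT, offset INTEGER)")

# Run twice: the second pass is a no-op, as it would be on a restarted agent.
for _ in range(2):
    apply_migration_if_needed(conn, "in_tail_files", "offset_marker",
                              "offset_marker INTEGER DEFAULT 0")
    apply_migration_if_needed(conn, "in_tail_files", "offset_marker_size",
                              "offset_marker_size INTEGER DEFAULT 0")

columns = [row[1] for row in conn.execute("PRAGMA table_info(in_tail_files)")]
```

Guarding each ALTER TABLE behind a column-existence check is what makes the migration "additive": old databases gain the new columns with defaults, and already-migrated databases are left untouched.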

Sequence Diagram(s)

sequenceDiagram
    actor Operator
    participant Tail as "Tail Plugin"
    participant FS as "File System"
    participant DB as "SQLite DB"

    Note over Tail,DB: Normal run — persist offset + marker
    Operator->>Tail: Start tailing file
    Tail->>FS: Read bytes
    Tail->>Tail: Compute DB offset and offset_marker
    Tail->>DB: INSERT/UPDATE offset, offset_marker, offset_marker_size
    DB-->>Tail: OK

    Note over Tail,DB: Resume after restart
    Operator->>Tail: Restart/resume
    Tail->>DB: SELECT offset, offset_marker, offset_marker_size
    DB-->>Tail: stored_offset + marker
    Tail->>FS: Read bytes preceding stored_offset
    Tail->>Tail: Recompute marker
    alt Marker matches
        Tail->>FS: Seek to stored_offset (resume)
        FS-->>Tail: Ready
    else Marker mismatch
        Tail->>Tail: Reset file offsets to 0
        Tail->>DB: UPDATE stored offset to 0
        Tail->>FS: Seek to file start (re-read)
        FS-->>Tail: Ready
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • #11460 — Modifies the same in_tail DB paths (e.g., flb_tail_db_open, flb_tail_db_file_set, flb_tail_db_file_offset) and may overlap on schema/DB logic.

Suggested labels

backport to v4.2.x

Suggested reviewers

  • cosmo0920
  • koleini

Poem

🐰 A rabbit's rhyme on offset marks:
I nibble bytes where quiet offsets lie,
I hash the trail, then wink an eager eye.
If markers wander, I hop back to start —
Reset, re-read, and guard each log’s heart. 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning: docstring coverage is 5.32%, below the required threshold of 80.00%. Resolution: write docstrings for the functions that are missing them.

✅ Passed checks (2 passed)

  • Description Check — ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: the title 'in_tail: harden db offset restore across restart' accurately describes the main change: hardening database offset restoration when the in_tail plugin restarts, which involves validating persisted offsets and extending the SQLite schema.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (2)
tests/integration/scenarios/in_tail/tests/test_in_tail_001.py (1)

617-640: Double-close of writer_old is handled safely but could be cleaner.

Line 631 calls writer_old.close() and the finally block at line 639 calls it again. While the close() method handles this safely (checks for None), consider removing the explicit close at line 631 since the finally block will handle cleanup.

♻️ Suggested simplification
         os.unlink(active_log)
-        writer_old.close()
 
         time.sleep(3)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration/scenarios/in_tail/tests/test_in_tail_001.py` around lines
617 - 640, The test test_in_tail_delete_and_recreate_same_path_is_reingested
calls writer_old.close() twice (an explicit close before sleep and again in the
finally block); remove the intermediate writer_old.close() and rely on the
finally block to close writer_old to keep cleanup centralized and avoid
redundant calls while preserving safety in PersistentWriter.close().
plugins/in_tail/tail_db.c (1)

224-228: Implicit signed-to-unsigned conversion for offset_marker_size.

sqlite3_column_int64() returns a signed sqlite3_int64, but offset_marker_size is size_t (unsigned). While the value should always be non-negative in practice (it's a byte count), consider adding a defensive check or cast to make the conversion explicit.

🔧 Suggested defensive cast
         /* offset_marker_size: column 5 */
-        *offset_marker_size = sqlite3_column_int64(ctx->stmt_get_file, 5);
+        *offset_marker_size = (size_t) sqlite3_column_int64(ctx->stmt_get_file, 5);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/in_tail/tail_db.c` around lines 224 - 228, The assignment to
*offset_marker_size uses sqlite3_column_int64(ctx->stmt_get_file, 5) (signed
sqlite3_int64) but offset_marker_size is a size_t; update the code in the
function that reads columns from ctx->stmt_get_file to explicitly handle
signed-to-unsigned conversion by reading the value into a sqlite3_int64 temp
(e.g. val = sqlite3_column_int64(...)), check if val is negative and handle that
case (e.g. treat as 0 or return an error), then assign (size_t)val to
*offset_marker_size to make the conversion explicit and defensive; keep the
current assignment for *offset_marker (column 4) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/in_tail/tail_sql.h`:
- Around line 38-39: The schema stores offset_marker and offset_marker_size as
INTEGER while the in-memory fields (uint64_t and size_t) and the code use
sqlite3_bind_int64()/sqlite3_column_int64(), which is unsafe for values >
INT64_MAX; change storage to an unsigned-safe representation (e.g., schema
columns "offset_marker" and "offset_marker_size" as BLOB or TEXT) and update the
binding/reading code to perform explicit safe conversions: when saving,
serialize the uint64_t/size_t to a fixed-width binary blob (use
sqlite3_bind_blob) or to a decimal string (use sqlite3_bind_text) and when
loading, parse/deserialize back into uint64_t/size_t with overflow checks;
alternatively, if you choose to keep INTEGER, enforce and validate values do not
exceed INT64_MAX on write/read and reject or clamp larger values to avoid UB.
Ensure changes touch the schema definition (offset_marker/offset_marker_size),
the code paths that call sqlite3_bind_int64()/sqlite3_column_int64(), and the
places that use the in-memory fields (uint64_t/size_t) so conversions and
overflow checks are consistent.

In `@tests/integration/scenarios/multiline_interface/assets/ml_interface_probe.c`:
- Around line 319-340: The test currently appends to stream id 1 without
creating a stream, conflating "no parser groups" with "invalid stream id"; in
run_empty_append() create a valid stream first (call flb_ml_stream_create(ml)
and check its return), then call flb_ml_append_text using the returned stream id
instead of the hardcoded 1, handle errors/teardown if stream creation fails, and
keep the rest of the function (time setup, append, logging, and teardown)
unchanged so the test isolates the empty-group append condition.

In
`@tests/integration/scenarios/multiline_interface/tests/test_multiline_interface_001.py`:
- Around line 41-49: The command list construction uses os.environ.get("CC",
"cc") as a single string which can include wrappers/flags (e.g., "ccache
clang"), causing ENOENT; change building of the command list so the CC value is
split into separate argv tokens before appending other args: obtain cc_value =
os.environ.get("CC", "cc"), split it (shlex.split or str.split) into tokens,
then extend the existing command list with those tokens followed by
str(PROBE_SOURCE), "-o", str(binary), *_load_probe_flags(), str(LIBFLUENT_BIT),
and the rpath flag so the executable and its flags are separate entries.
- Around line 18-31: The _load_probe_flags() routine currently ignores each
compile_commands.json entry's directory and treats CC as a single token; update
_load_probe_flags() to (1) use entry["directory"] as the working/base path when
resolving any relative -I flags produced from shlex.split(entry["command"]) so
that relative include paths are interpreted relative to entry["directory"]
(e.g., resolve "-I<path>" to an absolute or joined path when <path> is not
absolute), and (2) split the CC environment variable with
shlex.split(os.environ.get("CC", "cc")) so embedded options in CC are passed as
separate argv elements rather than a single program name; keep references to
COMPILE_COMMANDS and entries to locate the compile_commands.json parsing logic.

---

Nitpick comments:
In `@plugins/in_tail/tail_db.c`:
- Around line 224-228: The assignment to *offset_marker_size uses
sqlite3_column_int64(ctx->stmt_get_file, 5) (signed sqlite3_int64) but
offset_marker_size is a size_t; update the code in the function that reads
columns from ctx->stmt_get_file to explicitly handle signed-to-unsigned
conversion by reading the value into a sqlite3_int64 temp (e.g. val =
sqlite3_column_int64(...)), check if val is negative and handle that case (e.g.
treat as 0 or return an error), then assign (size_t)val to *offset_marker_size
to make the conversion explicit and defensive; keep the current assignment for
*offset_marker (column 4) unchanged.

In `@tests/integration/scenarios/in_tail/tests/test_in_tail_001.py`:
- Around line 617-640: The test
test_in_tail_delete_and_recreate_same_path_is_reingested calls
writer_old.close() twice (an explicit close before sleep and again in the
finally block); remove the intermediate writer_old.close() and rely on the
finally block to close writer_old to keep cleanup centralized and avoid
redundant calls while preserving safety in PersistentWriter.close().

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b7972091-0e84-429a-a332-027717ca2dd7

📥 Commits

Reviewing files that changed from the base of the PR and between 40a6d4a and 7d73ab0.

📒 Files selected for processing (27)
  • plugins/in_tail/tail_db.c
  • plugins/in_tail/tail_file.c
  • plugins/in_tail/tail_file.h
  • plugins/in_tail/tail_file_internal.h
  • plugins/in_tail/tail_sql.h
  • tests/integration/scenarios/in_tail/README.md
  • tests/integration/scenarios/in_tail/config/parsers_tail_it.conf
  • tests/integration/scenarios/in_tail/config/tail_docker_mode.yaml
  • tests/integration/scenarios/in_tail/config/tail_exclude_path.yaml
  • tests/integration/scenarios/in_tail/config/tail_generic_encoding.yaml
  • tests/integration/scenarios/in_tail/config/tail_gzip.yaml
  • tests/integration/scenarios/in_tail/config/tail_ignore_active_older.yaml
  • tests/integration/scenarios/in_tail/config/tail_ignore_older.yaml
  • tests/integration/scenarios/in_tail/config/tail_inotify.yaml
  • tests/integration/scenarios/in_tail/config/tail_inotify_new_files_from_tail.yaml
  • tests/integration/scenarios/in_tail/config/tail_inotify_read_from_tail.yaml
  • tests/integration/scenarios/in_tail/config/tail_multiline.yaml
  • tests/integration/scenarios/in_tail/config/tail_parser.yaml
  • tests/integration/scenarios/in_tail/config/tail_rotate_wait_short.yaml
  • tests/integration/scenarios/in_tail/config/tail_skip_long_lines.yaml
  • tests/integration/scenarios/in_tail/config/tail_stat.yaml
  • tests/integration/scenarios/in_tail/config/tail_stat_db_compare_filename.yaml
  • tests/integration/scenarios/in_tail/config/tail_truncate_long_lines.yaml
  • tests/integration/scenarios/in_tail/tests/test_in_tail_001.py
  • tests/integration/scenarios/multiline_interface/README.md
  • tests/integration/scenarios/multiline_interface/assets/ml_interface_probe.c
  • tests/integration/scenarios/multiline_interface/tests/test_multiline_interface_001.py

Comment on lines +38 to +39
" offset_marker INTEGER DEFAULT 0," \
" offset_marker_size INTEGER DEFAULT 0" \

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify how marker fields are bound/read and whether signed SQLite APIs are used.
rg -n -C4 'offset_marker|offset_marker_size|sqlite3_bind_int64|sqlite3_column_int64|sqlite3_bind_text|sqlite3_column_text' plugins/in_tail

Repository: fluent/fluent-bit

Length of output: 18213


Use unsigned-safe storage for offset_marker fields in database schema.

Lines 38–39 and 60–63 define offset_marker and offset_marker_size as SQLite INTEGER (signed 64-bit), while the in-memory fields in tail_file_internal.h (lines 119–120) are uint64_t and size_t. The code binds and reads these values using sqlite3_bind_int64() and sqlite3_column_int64(), which causes undefined behavior for values exceeding INT64_MAX. This can corrupt marker round-trips and trigger false marker mismatch during restart, resetting offsets unexpectedly.

Use BLOB or TEXT storage with explicit safe conversion, or document and enforce that markers never exceed INT64_MAX.
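As a hedged illustration of the TEXT-storage option (using Python's sqlite3 module rather than the C API, with placeholder table and column names), a uint64 value above INT64_MAX round-trips cleanly when serialized as a decimal string and parsed back with an explicit range check:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (offset_marker TEXT DEFAULT '0')")

marker = 2**64 - 1  # a value sqlite3_bind_int64() could not represent
conn.execute("INSERT INTO files (offset_marker) VALUES (?)", (str(marker),))

# On load, parse the string and validate the uint64 range explicitly.
raw = conn.execute("SELECT offset_marker FROM files").fetchone()[0]
restored = int(raw)
if not 0 <= restored < 2**64:
    raise ValueError("offset_marker out of uint64 range")
```

The same idea applies in C: serialize with sqlite3_bind_text() (or a fixed-width blob via sqlite3_bind_blob()) and deserialize with an overflow check, rather than funneling unsigned values through the signed int64 bind/column calls.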

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/in_tail/tail_sql.h` around lines 38 - 39, The schema stores
offset_marker and offset_marker_size as INTEGER while the in-memory fields
(uint64_t and size_t) and the code use
sqlite3_bind_int64()/sqlite3_column_int64(), which is unsafe for values >
INT64_MAX; change storage to an unsigned-safe representation (e.g., schema
columns "offset_marker" and "offset_marker_size" as BLOB or TEXT) and update the
binding/reading code to perform explicit safe conversions: when saving,
serialize the uint64_t/size_t to a fixed-width binary blob (use
sqlite3_bind_blob) or to a decimal string (use sqlite3_bind_text) and when
loading, parse/deserialize back into uint64_t/size_t with overflow checks;
alternatively, if you choose to keep INTEGER, enforce and validate values do not
exceed INT64_MAX on write/read and reject or clamp larger values to avoid UB.
Ensure changes touch the schema definition (offset_marker/offset_marker_size),
the code paths that call sqlite3_bind_int64()/sqlite3_column_int64(), and the
places that use the in-memory fields (uint64_t/size_t) so conversions and
overflow checks are consistent.

Comment on lines +319 to +340
static int run_empty_append(void)
{
int ret;
struct flb_config *config;
struct flb_ml *ml;
struct flb_time tm;

config = flb_config_init();
if (!config) {
return 2;
}

ml = flb_ml_create(config, "empty-append");
if (!ml) {
return teardown(config, NULL, 2);
}

flb_time_get(&tm);
ret = flb_ml_append_text(ml, 1, &tm, "hello", 5);
fprintf(stdout, "append_ret=%d\n", ret);

return teardown(config, ml, 0);

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find the probe file
find . -name "ml_interface_probe.c" -type f

Repository: fluent/fluent-bit

Length of output: 138


🏁 Script executed:

#!/bin/bash
# Read the entire probe file to understand context
if [ -f "tests/integration/scenarios/multiline_interface/assets/ml_interface_probe.c" ]; then
    wc -l "tests/integration/scenarios/multiline_interface/assets/ml_interface_probe.c"
fi

Repository: fluent/fluent-bit

Length of output: 140


🏁 Script executed:

#!/bin/bash
# Read the full file to examine run_empty_append and related functions
cat -n "tests/integration/scenarios/multiline_interface/assets/ml_interface_probe.c"

Repository: fluent/fluent-bit

Length of output: 13531


Isolate the empty-group append condition by creating a valid stream first.

The run_empty_append() function currently appends to stream 1 without calling flb_ml_stream_create(), unlike all other probe functions (run_single_group(), run_multi_group(), run_truncation()). This conflates two separate conditions: missing parser groups and an invalid stream id. The regression test should isolate the empty-group case with a valid stream to properly exercise the intended behavior.

🔧 Suggested fix
 static int run_empty_append(void)
 {
     int ret;
+    uint64_t stream_id;
     struct flb_config *config;
     struct flb_ml *ml;
     struct flb_time tm;
+    struct callback_state state = {0};
 
     config = flb_config_init();
     if (!config) {
         return 2;
     }
 
     ml = flb_ml_create(config, "empty-append");
     if (!ml) {
         return teardown(config, NULL, 2);
     }
 
+    ret = flb_ml_stream_create(ml, "empty-stream", -1,
+                               capture_flush, &state, &stream_id);
+    if (ret != 0) {
+        return teardown(config, ml, 2);
+    }
+
     flb_time_get(&tm);
-    ret = flb_ml_append_text(ml, 1, &tm, "hello", 5);
+    ret = flb_ml_append_text(ml, stream_id, &tm, "hello", 5);
     fprintf(stdout, "append_ret=%d\n", ret);
 
     return teardown(config, ml, 0);
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/integration/scenarios/multiline_interface/assets/ml_interface_probe.c`
around lines 319 - 340, The test currently appends to stream id 1 without
creating a stream, conflating "no parser groups" with "invalid stream id"; in
run_empty_append() create a valid stream first (call flb_ml_stream_create(ml)
and check its return), then call flb_ml_append_text using the returned stream id
instead of the hardcoded 1, handle errors/teardown if stream creation fails, and
keep the rest of the function (time setup, append, logging, and teardown)
unchanged so the test isolates the empty-group append condition.

Comment on lines +18 to +31
def _load_probe_flags():
with COMPILE_COMMANDS.open(encoding="utf-8") as handle:
entries = json.load(handle)

for entry in entries:
if entry["file"].endswith("tests/internal/multiline.c"):
command = shlex.split(entry["command"])
return [
argument
for argument in command
if argument.startswith("-I") or argument.startswith("-D")
]

raise RuntimeError("could not find tests/internal/multiline.c in compile_commands.json")

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, locate the test file
find . -path "*multiline_interface*" -name "test_multiline_interface_001.py" -type f

Repository: fluent/fluent-bit

Length of output: 148


🏁 Script executed:

# Get the directory structure to understand the layout
ls -la tests/integration/scenarios/multiline_interface/tests/ 2>/dev/null || echo "Path not found, searching..."
find . -type d -name "multiline_interface" 2>/dev/null

Repository: fluent/fluent-bit

Length of output: 302


🏁 Script executed:

# Search for the test file more broadly
fd -t f "test_multiline_interface" 2>/dev/null | head -20

Repository: fluent/fluent-bit

Length of output: 146


🏁 Script executed:

# Look for any Python test files in the tests directory
fd -t f "\.py$" "tests/" | grep -i multiline | head -20

Repository: fluent/fluent-bit

Length of output: 146


🏁 Script executed:

# Check repository structure
ls -la tests/ 2>/dev/null | head -30

Repository: fluent/fluent-bit

Length of output: 528


🏁 Script executed:

cat -n tests/integration/scenarios/multiline_interface/tests/test_multiline_interface_001.py

Repository: fluent/fluent-bit

Length of output: 4764


🏁 Script executed:

# Check if compile_commands.json exists and examine its structure
find . -name "compile_commands.json" -type f | head -5

Repository: fluent/fluent-bit

Length of output: 43


🏁 Script executed:

# Look for build directory or where compile_commands.json might be generated
find . -path "*/build/*" -name "compile_commands.json" 2>/dev/null | head -5

Repository: fluent/fluent-bit

Length of output: 43


🏁 Script executed:

# Check the CMakeLists.txt or build configuration to understand compile_commands.json generation
find . -name "CMakeLists.txt" | head -10

Repository: fluent/fluent-bit

Length of output: 373


🏁 Script executed:

# Check the CMakeLists.txt to see if it generates compile_commands.json
head -100 CMakeLists.txt | grep -i "compile_commands"

Repository: fluent/fluent-bit

Length of output: 98


🏁 Script executed:

# Look for any CMake configuration that sets up compile_commands.json
rg "compile_commands" --type cmake

Repository: fluent/fluent-bit

Length of output: 43


🏁 Script executed:

# Check if there's any documentation or examples of the generated compile_commands.json
find . -path "*/build/*" -type f 2>/dev/null | head -20

Repository: fluent/fluent-bit

Length of output: 1022


🏁 Script executed:

# Look at the actual probe source to understand what includes it needs
cat -n tests/integration/scenarios/multiline_interface/assets/ml_interface_probe.c | head -50

Repository: fluent/fluent-bit

Length of output: 1663


🏁 Script executed:

# Search for any existing compile_commands.json examples or test fixtures
rg "compile_commands" --type json | head -20

Repository: fluent/fluent-bit

Length of output: 43


🏁 Script executed:

# Check if the test directory has any compile_commands.json or related files
ls -la tests/integration/scenarios/multiline_interface/

Repository: fluent/fluent-bit

Length of output: 346


🏁 Script executed:

# Look for how COMPILE_COMMANDS is used elsewhere in the codebase
rg "COMPILE_COMMANDS|compile_commands\.json" | grep -v ".git"

Repository: fluent/fluent-bit

Length of output: 1424


🏁 Script executed:

# Check what the probe source file actually includes
cat tests/integration/scenarios/multiline_interface/assets/ml_interface_probe.c | head -20

Repository: fluent/fluent-bit

Length of output: 550


🏁 Script executed:

# Search for any information about what includes are needed
rg "fluent-bit" tests/integration/scenarios/multiline_interface/assets/ml_interface_probe.c

Repository: fluent/fluent-bit

Length of output: 342


🏁 Script executed:

# Check the root directory structure to understand include paths
ls -la include/ | head -20

Repository: fluent/fluent-bit

Length of output: 362


🏁 Script executed:

# Look for fluent-bit specific headers
find . -path "*/include/*" -name "flb_config.h" -o -name "flb_ml.h" 2>/dev/null | head -10

Repository: fluent/fluent-bit

Length of output: 134


🏁 Script executed:

# Check if there's documentation on compile_commands.json structure and semantics
rg -A 5 "directory.*command" tests/ 2>/dev/null | head -30

Repository: fluent/fluent-bit

Length of output: 448


🏁 Script executed:

# Understand the typical output of cmake's compile_commands.json for this project
# Look for examples in CI/build documentation
find . -name "*.md" -o -name "*.txt" | xargs grep -l "compile_commands" 2>/dev/null | head -5

Repository: fluent/fluent-bit

Length of output: 43


🏁 Script executed:

# Look for any test setup or fixture that might show compile_commands.json structure
rg "directory.*file.*command" --type json

Repository: fluent/fluent-bit

Length of output: 43


🏁 Script executed:

# Check the CMakeLists.txt to see what flags are typically set
cat CMakeLists.txt | grep -A 20 "COMPILE_COMMANDS\|include_directories\|add_compile_options" | head -50

Repository: fluent/fluent-bit

Length of output: 2073


🏁 Script executed:

# Look at what entry the code is searching for
rg "tests/internal/multiline.c" -B 5 -A 5

Repository: fluent/fluent-bit

Length of output: 2369


🏁 Script executed:

# Check if there are any documentation or examples of the build process
ls -la | grep -i readme
cat README.md | head -100

Repository: fluent/fluent-bit

Length of output: 4328


🏁 Script executed:

# Search for how CC environment variable is typically used
rg "os.environ.*CC" -B 2 -A 2

Repository: fluent/fluent-bit

Length of output: 1559


🏁 Script executed:

# Look for any Python documentation or notes about subprocess and cwd
rg "subprocess.run" -B 2 -A 5 tests/ | head -50

Repository: fluent/fluent-bit

Length of output: 4224


🏁 Script executed:

# Verify the exact structure of compile_commands.json entries
# Let's look for the file that generates this or an example
find . -name "*.json" -path "*/build/*" 2>/dev/null | grep -i compile | head -5

Repository: fluent/fluent-bit

Length of output: 43


🏁 Script executed:

# Check if there's any test or example that shows the structure
rg "directory.*command.*file" -A 2 -B 2 | head -40

Repository: fluent/fluent-bit

Length of output: 1643


🏁 Script executed:

# Look for any documentation about compile_commands.json structure
rg '"directory"' -A 3 -B 1 | head -50

Repository: fluent/fluent-bit

Length of output: 5132


🏁 Script executed:

# Verify by examining how the entry would look - check CMakeLists.txt for what would generate this
cat CMakeLists.txt | grep -i "multiline.c" -B 5 -A 5

Repository: fluent/fluent-bit

Length of output: 43


🏁 Script executed:

# Search for the actual multiline.c file being compiled
find . -name "multiline.c" -type f 2>/dev/null

Repository: fluent/fluent-bit

Length of output: 89


🏁 Script executed:

# Let's check the tests/internal directory
ls -la tests/internal/ | head -20

Repository: fluent/fluent-bit

Length of output: 1355


🏁 Script executed:

# Verify that relative paths could appear in compile_commands.json by checking how CMake handles them
# Look for any CMakeLists.txt that shows how include_directories are set
rg "include_directories|target_include_directories" tests/internal/ 2>/dev/null | head -10

Repository: fluent/fluent-bit

Length of output: 213


🏁 Script executed:

# Check how the multiline.c file is compiled in CMakeLists.txt
cat tests/internal/CMakeLists.txt | grep -A 10 -B 5 "multiline.c" 2>/dev/null

Repository: fluent/fluent-bit

Length of output: 296


🏁 Script executed:

# Let's understand the structure of compile_commands.json by checking Python subprocess.run documentation implications
# Check if there are any actual failures or issues in tests by looking at CI logs or test output
find . -name "*.log" -o -name "*test*output*" 2>/dev/null | grep -i multiline | head -5

Repository: fluent/fluent-bit

Length of output: 223


🏁 Script executed:

# Verify the subprocess.run behavior with CC environment variable
# Let's check if there are any existing tests that show the correct pattern for handling CC
rg "shlex.split.*environ\|subprocess.*CC" -A 3 -B 3 | head -40

Repository: fluent/fluent-bit

Length of output: 43


🏁 Script executed:

# Final verification: check if the probe actually compiles successfully with current code
# by looking for any build documentation or examples
cat tests/integration/scenarios/multiline_interface/README.md 2>/dev/null

Repository: fluent/fluent-bit

Length of output: 1574


🏁 Script executed:

# One final check: verify the subprocess argv behavior with spaces in CC by looking at Python documentation
# or any examples showing proper handling
rg "shlex.split.*CC|CC.*split" tests/ 2>/dev/null

Repository: fluent/fluent-bit

Length of output: 43


🏁 Script executed:

# Check if there are integration tests that might exercise CC with spaces
rg "os.environ.*CC" -B 5 -A 5 | grep -i "test\|example"

Repository: fluent/fluent-bit

Length of output: 2080


Use the compile database working directory and properly split the CC environment variable.

The _load_probe_flags() function discards the directory field from each compile_commands.json entry. CMake-generated entries are valid only relative to their directory value. When relative -I... paths appear (which is valid per the compile_commands.json schema), reinterpreting them from the test's cwd instead of the build directory causes probe compilation to fail or use incorrect paths.

Additionally, os.environ.get("CC", "cc") is passed as a single command argument. If CC contains embedded options (e.g., CC="gcc -Wall"), subprocess.run receives an invalid program name instead of separate argv elements.

🔧 Suggested fix
 def _load_probe_flags():
     with COMPILE_COMMANDS.open(encoding="utf-8") as handle:
         entries = json.load(handle)

     for entry in entries:
         if entry["file"].endswith("tests/internal/multiline.c"):
             command = shlex.split(entry["command"])
-            return [
-                argument
-                for argument in command
-                if argument.startswith("-I") or argument.startswith("-D")
-            ]
+            return entry["directory"], [
+                argument
+                for argument in command
+                if argument.startswith("-I") or argument.startswith("-D")
+            ]

     raise RuntimeError("could not find tests/internal/multiline.c in compile_commands.json")


 @pytest.fixture(scope="session")
 def multiline_probe(tmp_path_factory):
     assert COMPILE_COMMANDS.exists(), f"missing {COMPILE_COMMANDS}"
     assert LIBFLUENT_BIT.exists(), f"missing {LIBFLUENT_BIT}"

     build_dir = tmp_path_factory.mktemp("multiline-probe")
     binary = build_dir / "ml_interface_probe"
+    compile_cwd, probe_flags = _load_probe_flags()
     command = [
-        os.environ.get("CC", "cc"),
+        *shlex.split(os.environ.get("CC", "cc")),
         str(PROBE_SOURCE),
         "-o",
         str(binary),
-        *_load_probe_flags(),
+        *probe_flags,
         str(LIBFLUENT_BIT),
         f"-Wl,-rpath,{LIBFLUENT_BIT.parent}",
     ]

-    subprocess.run(command, check=True, capture_output=True, text=True)
+    subprocess.run(command, check=True, capture_output=True, text=True, cwd=compile_cwd)
     return binary
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@tests/integration/scenarios/multiline_interface/tests/test_multiline_interface_001.py`
around lines 18 - 31, The _load_probe_flags() routine currently ignores each
compile_commands.json entry's directory and treats CC as a single token; update
_load_probe_flags() to (1) use entry["directory"] as the working/base path when
resolving any relative -I flags produced from shlex.split(entry["command"]) so
that relative include paths are interpreted relative to entry["directory"]
(e.g., resolve "-I<path>" to an absolute or joined path when <path> is not
absolute), and (2) split the CC environment variable with
shlex.split(os.environ.get("CC", "cc")) so embedded options in CC are passed as
separate argv elements rather than a single program name; keep references to
COMPILE_COMMANDS and entries to locate the compile_commands.json parsing logic.
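To make the directory-field concern concrete, here is a minimal, self-contained sketch of resolving relative -I flags against each entry's directory, as the compile_commands.json schema requires. The entry below is hypothetical (paths and defines invented for illustration), not the repository's actual compile database:

```python
import shlex
from pathlib import Path

# Hypothetical compile_commands.json entry; real entries come from CMake.
entry = {
    "directory": "/build/tests/internal",
    "command": "cc -I../../include -Iinclude -DFLB_HAVE_X -c multiline.c",
    "file": "/src/tests/internal/multiline.c",
}

def probe_flags(entry):
    """Return include/define flags, resolving relative -I paths
    against the entry's directory instead of the caller's cwd."""
    base = Path(entry["directory"])
    flags = []
    for arg in shlex.split(entry["command"]):
        if arg.startswith("-I"):
            path = Path(arg[2:])
            if not path.is_absolute():
                path = base / path
            flags.append(f"-I{path}")
        elif arg.startswith("-D"):
            flags.append(arg)
    return flags

print(probe_flags(entry))
```

Either resolving the paths up front like this, or passing `cwd=entry["directory"]` to `subprocess.run` as the suggested fix does, keeps relative include paths meaningful.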

Comment on lines +41 to +49
command = [
os.environ.get("CC", "cc"),
str(PROBE_SOURCE),
"-o",
str(binary),
*_load_probe_flags(),
str(LIBFLUENT_BIT),
f"-Wl,-rpath,{LIBFLUENT_BIT.parent}",
]

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

fd -t f 'test_multiline_interface_001.py'

Repository: fluent/fluent-bit

Length of output: 146


🏁 Script executed:

cat -n tests/integration/scenarios/multiline_interface/tests/test_multiline_interface_001.py | head -60

Repository: fluent/fluent-bit

Length of output: 2298


🏁 Script executed:

# Check if shlex is imported in the file
rg 'import shlex' tests/integration/scenarios/multiline_interface/tests/test_multiline_interface_001.py

Repository: fluent/fluent-bit

Length of output: 73


🏁 Script executed:

# Check how CC is used elsewhere in similar test files
rg -A 3 -B 3 'os.environ.get.*CC' tests/ -t py

Repository: fluent/fluent-bit

Length of output: 881


🏁 Script executed:

# Check if shlex is used elsewhere in the codebase for similar purposes
rg 'shlex.split.*CC' . -t py

Repository: fluent/fluent-bit

Length of output: 43


Split $CC before putting it in the argv.

If CC is set to a wrapper or carries flags (ccache clang, zig cc, etc.), the whole string becomes argv[0] — one literal executable name — and the probe compile fails with ENOENT.

🔧 Suggested fix
-    command = [
-        os.environ.get("CC", "cc"),
+    command = [
+        *shlex.split(os.environ.get("CC", "cc")),
         str(PROBE_SOURCE),
         "-o",
         str(binary),
         *_load_probe_flags(),
         str(LIBFLUENT_BIT),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@tests/integration/scenarios/multiline_interface/tests/test_multiline_interface_001.py`
around lines 41 - 49, The command list construction uses os.environ.get("CC",
"cc") as a single string which can include wrappers/flags (e.g., "ccache
clang"), causing ENOENT; change building of the command list so the CC value is
split into separate argv tokens before appending other args: obtain cc_value =
os.environ.get("CC", "cc"), split it (shlex.split or str.split) into tokens,
then extend the existing command list with those tokens followed by
str(PROBE_SOURCE), "-o", str(binary), *_load_probe_flags(), str(LIBFLUENT_BIT),
and the rpath flag so the executable and its flags are separate entries.
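The argv-splitting point can be demonstrated on its own; the compiler string below is only an example value for CC, and the source/output names are placeholders:

```python
import shlex

cc = "ccache clang -Qunused-arguments"  # example wrapper-style CC value

# Passing the raw string as argv[0] asks the OS to exec a program
# literally named "ccache clang -Qunused-arguments" -> ENOENT.
broken_argv = [cc, "probe.c", "-o", "probe"]

# shlex.split() yields separate tokens, so the wrapper and its
# flags become distinct argv entries.
fixed_argv = [*shlex.split(cc), "probe.c", "-o", "probe"]

print(fixed_argv[:3])  # → ['ccache', 'clang', '-Qunused-arguments']
```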


@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
tests/internal/multiline.c (1)

1894-1940: Consider using a more explicit skip indicator for Windows.

The crash tests are well-designed for Unix platforms. However, the Windows stubs only emit a TEST_MSG and return normally, which means the test passes without actually verifying anything. While this is a common pattern, some test frameworks support explicit skip indicators that make test reports clearer.

If Fluent Bit's test framework supports it, consider a pattern that explicitly marks the test as skipped rather than passing silently.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/internal/multiline.c` around lines 1894 - 1940, Replace the Windows
stub behavior that just prints TEST_MSG in
test_known_bug_empty_context_flush_crashes and
test_known_bug_empty_context_append_crashes with the test framework's explicit
skip mechanism (e.g., use the TEST_SKIP or equivalent skip macro/function
provided by the Fluent Bit test harness) so the tests are reported as skipped on
Windows rather than passing silently; update both functions to call the skip
macro with a clear reason string instead of TEST_MSG.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@tests/internal/multiline.c`:
- Around line 1894-1940: Replace the Windows stub behavior that just prints
TEST_MSG in test_known_bug_empty_context_flush_crashes and
test_known_bug_empty_context_append_crashes with the test framework's explicit
skip mechanism (e.g., use the TEST_SKIP or equivalent skip macro/function
provided by the Fluent Bit test harness) so the tests are reported as skipped on
Windows rather than passing silently; update both functions to call the skip
macro with a clear reason string instead of TEST_MSG.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 19fddcef-a8b3-4bdc-80fd-0f812d9236fe

📥 Commits

Reviewing files that changed from the base of the PR and between 7d73ab0 and 9f51950.

📒 Files selected for processing (1)
  • tests/internal/multiline.c


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/integration/scenarios/in_tail/tests/test_in_tail_001.py`:
- Around line 966-984: The test
test_in_tail_discovers_file_after_permissions_are_restored should be guarded
when running as root: detect root by checking os.geteuid()==0 (or
os.getenv("SUDO_UID")/os.getuid() if necessary) and call pytest.skip or
pytest.xfail early in the test to avoid the permission-based race on root-run
CI; add the os and pytest imports if missing and place the guard before
writing/chmoding delayed_file or starting Service (refer to the
test_in_tail_discovers_file_after_permissions_are_restored function and
Service("tail_stat.yaml", ...) usage).
- Around line 891-905: The test
test_in_tail_ignore_active_older_files_stops_following_aged_file currently never
appends after the aging window so it can pass even if the plugin still follows
aged files; modify the test to append another line after the sleep to validate
the plugin stopped following: after the existing time.sleep(4) and
service.assert_no_new_records_for(1, quiet_period=4) call, call
write_and_sync(log_file, "second-line\n"), then wait briefly (e.g.,
service.wait_for_records with timeout or small sleep) and assert that the
records still only contain "first-line" (e.g., assert_log_set(records,
["first-line"]) or assert len(service.wait_for_records(...)) == 1) using the
same service, log_file, write_and_sync, service.wait_for_records and
service.assert_no_new_records_for identifiers.
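The root guard suggested in the first comment above can be sketched in a few lines (helper name is illustrative; the real test body would call pytest.skip):

```python
import os

def should_skip_permission_test():
    """chmod 0o000 cannot hide a file from root, so the
    permission-restoration scenario is meaningless under uid 0."""
    return hasattr(os, "geteuid") and os.geteuid() == 0

# Inside the pytest test this would become:
#   if should_skip_permission_test():
#       pytest.skip("running as root; chmod-based denial has no effect")
print(should_skip_permission_test())
```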

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 82d3ca83-7172-455d-9fa1-f761f150f08f

📥 Commits

Reviewing files that changed from the base of the PR and between 9f51950 and 5614f61.

📒 Files selected for processing (20)
  • tests/integration/scenarios/in_tail/README.md
  • tests/integration/scenarios/in_tail/config/parsers_tail_it.conf
  • tests/integration/scenarios/in_tail/config/tail_docker_mode.yaml
  • tests/integration/scenarios/in_tail/config/tail_exclude_path.yaml
  • tests/integration/scenarios/in_tail/config/tail_generic_encoding.yaml
  • tests/integration/scenarios/in_tail/config/tail_gzip.yaml
  • tests/integration/scenarios/in_tail/config/tail_ignore_active_older.yaml
  • tests/integration/scenarios/in_tail/config/tail_ignore_older.yaml
  • tests/integration/scenarios/in_tail/config/tail_inotify.yaml
  • tests/integration/scenarios/in_tail/config/tail_inotify_new_files_from_tail.yaml
  • tests/integration/scenarios/in_tail/config/tail_inotify_read_from_tail.yaml
  • tests/integration/scenarios/in_tail/config/tail_multiline.yaml
  • tests/integration/scenarios/in_tail/config/tail_parser.yaml
  • tests/integration/scenarios/in_tail/config/tail_rotate_wait_short.yaml
  • tests/integration/scenarios/in_tail/config/tail_skip_long_lines.yaml
  • tests/integration/scenarios/in_tail/config/tail_stat.yaml
  • tests/integration/scenarios/in_tail/config/tail_stat_db_compare_filename.yaml
  • tests/integration/scenarios/in_tail/config/tail_truncate_long_lines.yaml
  • tests/integration/scenarios/in_tail/tests/test_in_tail_001.py
  • tests/internal/multiline.c
✅ Files skipped from review due to trivial changes (18)
  • tests/integration/scenarios/in_tail/config/tail_multiline.yaml
  • tests/integration/scenarios/in_tail/config/tail_generic_encoding.yaml
  • tests/integration/scenarios/in_tail/config/tail_docker_mode.yaml
  • tests/integration/scenarios/in_tail/config/tail_rotate_wait_short.yaml
  • tests/integration/scenarios/in_tail/config/tail_parser.yaml
  • tests/integration/scenarios/in_tail/config/tail_stat_db_compare_filename.yaml
  • tests/integration/scenarios/in_tail/config/tail_gzip.yaml
  • tests/integration/scenarios/in_tail/config/tail_skip_long_lines.yaml
  • tests/integration/scenarios/in_tail/config/tail_exclude_path.yaml
  • tests/integration/scenarios/in_tail/config/tail_inotify.yaml
  • tests/integration/scenarios/in_tail/config/tail_ignore_older.yaml
  • tests/integration/scenarios/in_tail/config/tail_inotify_read_from_tail.yaml
  • tests/integration/scenarios/in_tail/config/tail_truncate_long_lines.yaml
  • tests/integration/scenarios/in_tail/config/parsers_tail_it.conf
  • tests/integration/scenarios/in_tail/config/tail_inotify_new_files_from_tail.yaml
  • tests/integration/scenarios/in_tail/config/tail_stat.yaml
  • tests/integration/scenarios/in_tail/config/tail_ignore_active_older.yaml
  • tests/integration/scenarios/in_tail/README.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/internal/multiline.c


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/internal/multiline.c`:
- Around line 498-503: The test must not call flb_time_pop_from_msgpack when
msgpack_unpack_next failed; after calling msgpack_unpacked_init(&result) and ret
= msgpack_unpack_next(&result, ...), check if ret != MSGPACK_UNPACK_SUCCESS and
return or skip the remainder of the test (and destroy/uninit result if needed)
instead of continuing; update the block around msgpack_unpack_next / TEST_CHECK
to bail out early on failure before calling flb_time_pop_from_msgpack(&tm,
&result, &map).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 927968ef-e175-40f2-979e-7b298b1a7bf3

📥 Commits

Reviewing files that changed from the base of the PR and between 5614f61 and 7651631.

📒 Files selected for processing (1)
  • tests/internal/multiline.c

Comment on lines +498 to +503
msgpack_unpacked_init(&result);
ret = msgpack_unpack_next(&result, buf_data, buf_size, &off);
TEST_CHECK(ret == MSGPACK_UNPACK_SUCCESS);

flb_time_pop_from_msgpack(&tm, &result, &map);
TEST_CHECK(flb_time_to_nanosec(&tm) != 0L);


⚠️ Potential issue | 🟡 Minor

Missing early return on unpack failure could access uninitialized data.

If msgpack_unpack_next fails, result.data is uninitialized but flb_time_pop_from_msgpack will still be called. While TEST_CHECK logs the failure, the test continues with potentially invalid data.

🛡️ Proposed fix
     msgpack_unpacked_init(&result);
     ret = msgpack_unpack_next(&result, buf_data, buf_size, &off);
     TEST_CHECK(ret == MSGPACK_UNPACK_SUCCESS);
+    if (ret != MSGPACK_UNPACK_SUCCESS) {
+        msgpack_unpacked_destroy(&result);
+        return 0;
+    }
 
     flb_time_pop_from_msgpack(&tm, &result, &map);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
msgpack_unpacked_init(&result);
ret = msgpack_unpack_next(&result, buf_data, buf_size, &off);
TEST_CHECK(ret == MSGPACK_UNPACK_SUCCESS);
flb_time_pop_from_msgpack(&tm, &result, &map);
TEST_CHECK(flb_time_to_nanosec(&tm) != 0L);
msgpack_unpacked_init(&result);
ret = msgpack_unpack_next(&result, buf_data, buf_size, &off);
TEST_CHECK(ret == MSGPACK_UNPACK_SUCCESS);
if (ret != MSGPACK_UNPACK_SUCCESS) {
msgpack_unpacked_destroy(&result);
return 0;
}
flb_time_pop_from_msgpack(&tm, &result, &map);
TEST_CHECK(flb_time_to_nanosec(&tm) != 0L);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/internal/multiline.c` around lines 498 - 503, The test must not call
flb_time_pop_from_msgpack when msgpack_unpack_next failed; after calling
msgpack_unpacked_init(&result) and ret = msgpack_unpack_next(&result, ...),
check if ret != MSGPACK_UNPACK_SUCCESS and return or skip the remainder of the
test (and destroy/uninit result if needed) instead of continuing; update the
block around msgpack_unpack_next / TEST_CHECK to bail out early on failure
before calling flb_time_pop_from_msgpack(&tm, &result, &map).


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
tests/internal/multiline.c (1)

498-503: ⚠️ Potential issue | 🔴 Critical

Guard the unpack failure before touching result.

TEST_CHECK() does not stop execution here, so line 502 can still call flb_time_pop_from_msgpack() on an invalid msgpack_unpacked. That turns a failed assertion into undefined behavior inside the helper.

Proposed fix
     msgpack_unpacked_init(&result);
     ret = msgpack_unpack_next(&result, buf_data, buf_size, &off);
     TEST_CHECK(ret == MSGPACK_UNPACK_SUCCESS);
+    if (ret != MSGPACK_UNPACK_SUCCESS) {
+        msgpack_unpacked_destroy(&result);
+        return 0;
+    }
 
     flb_time_pop_from_msgpack(&tm, &result, &map);

Based on learnings TEST_CHECK in tests/internal/ logs failures and continues execution, so this path still needs an explicit guard before using result.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/internal/multiline.c` around lines 498 - 503, The test uses
msgpack_unpacked (initialized with msgpack_unpacked_init) and calls
msgpack_unpack_next into result, but TEST_CHECK does not abort on failure so
flb_time_pop_from_msgpack(&tm, &result, &map) may operate on an invalid
msgpack_unpacked; add an explicit guard after msgpack_unpack_next (check ret ==
MSGPACK_UNPACK_SUCCESS) and if it fails skip or return from the test (or call
the test framework's fatal equivalent) before calling flb_time_pop_from_msgpack,
ensuring you do not touch result when unpack failed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/in_tail/tail_db.c`:
- Around line 261-275: flb_tail_file_update_offset_marker() can fail but its
return value is ignored here (and at the other caller around the 532-539 block);
check its return and abort or handle the insert when it fails: call
flb_tail_file_update_offset_marker(file) and if it returns < 0, do not proceed
to bind/execute ctx->stmt_insert_file — instead log the error and skip or mark
the DB row write as failed so stale/empty file->db_offset_marker and
file->db_offset_marker_size are not persisted; ensure both caller sites (this
insert using sqlite3_bind_* on ctx->stmt_insert_file and the other insert block)
perform the same check and error handling.

In `@tests/internal/multiline.c`:
- Around line 1790-1842: The test function
test_known_bug_multi_group_flush_only_first_group (and the other known_bug_*
tests referenced) assert currently-broken behavior for the
"known-bug-multi-group" multiline engine; change these to xfail/skip-style
regression tests instead of making the broken behavior the expected contract:
mark test_known_bug_multi_group_flush_only_first_group (and parallel known_bug_*
cases) as expected-to-fail or skip in the test harness (or wrap them with the
test framework's xfail/skip facility) and add a short TODO referencing the
intended correct behavior so CI won't treat these as passing regressions; ensure
you update any test discovery metadata for the parser/group names
("known-bug-multi-group", "bug_group_a", "bug_group_b") so they remain present
for regression coverage but no longer make the suite green when the engine is
incorrect.

---

Duplicate comments:
In `@tests/internal/multiline.c`:
- Around line 498-503: The test uses msgpack_unpacked (initialized with
msgpack_unpacked_init) and calls msgpack_unpack_next into result, but TEST_CHECK
does not abort on failure so flb_time_pop_from_msgpack(&tm, &result, &map) may
operate on an invalid msgpack_unpacked; add an explicit guard after
msgpack_unpack_next (check ret == MSGPACK_UNPACK_SUCCESS) and if it fails skip
or return from the test (or call the test framework's fatal equivalent) before
calling flb_time_pop_from_msgpack, ensuring you do not touch result when unpack
failed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: edea8872-7856-41d8-ab4a-a88caeaf95b6

📥 Commits

Reviewing files that changed from the base of the PR and between 7651631 and 79da1f6.

📒 Files selected for processing (31)
  • plugins/in_tail/tail_config.c
  • plugins/in_tail/tail_config.h
  • plugins/in_tail/tail_db.c
  • plugins/in_tail/tail_file.c
  • plugins/in_tail/tail_file.h
  • plugins/in_tail/tail_file_internal.h
  • plugins/in_tail/tail_scan.c
  • plugins/in_tail/tail_scan.h
  • plugins/in_tail/tail_scan_glob.c
  • plugins/in_tail/tail_scan_win32.c
  • plugins/in_tail/tail_sql.h
  • tests/integration/scenarios/in_tail/README.md
  • tests/integration/scenarios/in_tail/config/parsers_tail_it.conf
  • tests/integration/scenarios/in_tail/config/tail_docker_mode.yaml
  • tests/integration/scenarios/in_tail/config/tail_exclude_path.yaml
  • tests/integration/scenarios/in_tail/config/tail_generic_encoding.yaml
  • tests/integration/scenarios/in_tail/config/tail_gzip.yaml
  • tests/integration/scenarios/in_tail/config/tail_ignore_active_older.yaml
  • tests/integration/scenarios/in_tail/config/tail_ignore_older.yaml
  • tests/integration/scenarios/in_tail/config/tail_inotify.yaml
  • tests/integration/scenarios/in_tail/config/tail_inotify_new_files_from_tail.yaml
  • tests/integration/scenarios/in_tail/config/tail_inotify_read_from_tail.yaml
  • tests/integration/scenarios/in_tail/config/tail_multiline.yaml
  • tests/integration/scenarios/in_tail/config/tail_parser.yaml
  • tests/integration/scenarios/in_tail/config/tail_rotate_wait_short.yaml
  • tests/integration/scenarios/in_tail/config/tail_skip_long_lines.yaml
  • tests/integration/scenarios/in_tail/config/tail_stat.yaml
  • tests/integration/scenarios/in_tail/config/tail_stat_db_compare_filename.yaml
  • tests/integration/scenarios/in_tail/config/tail_truncate_long_lines.yaml
  • tests/integration/scenarios/in_tail/tests/test_in_tail_001.py
  • tests/internal/multiline.c
✅ Files skipped from review due to trivial changes (18)
  • tests/integration/scenarios/in_tail/config/tail_generic_encoding.yaml
  • tests/integration/scenarios/in_tail/config/tail_gzip.yaml
  • tests/integration/scenarios/in_tail/config/tail_multiline.yaml
  • tests/integration/scenarios/in_tail/config/tail_stat_db_compare_filename.yaml
  • tests/integration/scenarios/in_tail/config/tail_skip_long_lines.yaml
  • tests/integration/scenarios/in_tail/config/tail_exclude_path.yaml
  • tests/integration/scenarios/in_tail/config/tail_parser.yaml
  • tests/integration/scenarios/in_tail/config/tail_inotify_read_from_tail.yaml
  • tests/integration/scenarios/in_tail/config/tail_inotify_new_files_from_tail.yaml
  • tests/integration/scenarios/in_tail/README.md
  • tests/integration/scenarios/in_tail/config/tail_rotate_wait_short.yaml
  • tests/integration/scenarios/in_tail/config/tail_inotify.yaml
  • tests/integration/scenarios/in_tail/config/parsers_tail_it.conf
  • tests/integration/scenarios/in_tail/config/tail_stat.yaml
  • tests/integration/scenarios/in_tail/config/tail_ignore_older.yaml
  • tests/integration/scenarios/in_tail/config/tail_ignore_active_older.yaml
  • tests/integration/scenarios/in_tail/config/tail_docker_mode.yaml
  • tests/integration/scenarios/in_tail/tests/test_in_tail_001.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • plugins/in_tail/tail_file_internal.h
  • tests/integration/scenarios/in_tail/config/tail_truncate_long_lines.yaml
  • plugins/in_tail/tail_sql.h
  • plugins/in_tail/tail_file.c

edsiper added 3 commits April 9, 2026 12:42
Persist and validate DB restore state more defensively so tail does not
resume from stale positions after restart, and keep aged-out active files
from being rediscovered on the next scan.

This change adds a small offset marker derived from the bytes before the
persisted DB offset and checks that marker before trusting the saved
position on restart. If the marker no longer matches, tail resets to a
safe position instead of seeking into replacement content after
copytruncate or inode reuse.
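The marker validation described above can be modeled in a few lines of Python (a hypothetical sketch of the idea, not the plugin's C implementation; the window size and use of a hash are assumptions):

```python
import hashlib

MARKER_WINDOW = 256  # assumed size of the window before the offset

def marker_for(path, offset):
    """Hash the bytes immediately before the persisted offset."""
    start = max(0, offset - MARKER_WINDOW)
    with open(path, "rb") as f:
        f.seek(start)
        return hashlib.sha256(f.read(offset - start)).hexdigest()

def safe_resume_offset(path, saved_offset, saved_marker):
    """Reuse the saved offset only when the marker still matches;
    after copytruncate or inode reuse it will not, so reset to 0."""
    try:
        if marker_for(path, saved_offset) == saved_marker:
            return saved_offset
    except OSError:
        pass
    return 0  # marker mismatch: content before the offset changed
```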

The restore path also persists the last resumable offset when a trailing
partial line is buffered, so a restart completes that line exactly once
instead of resuming after the fragment.

Because the new restart metadata must survive restarts, the SQLite schema
is extended and upgraded automatically in place. The migration is
additive so existing databases are upgraded without manual intervention.
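An additive in-place migration of this shape can be modeled with sqlite3 (the table and column names below are placeholders for illustration, not necessarily the plugin's actual schema):

```python
import sqlite3

NEW_COLUMNS = (
    ("offset_marker", "BLOB"),
    ("offset_marker_size", "INTEGER DEFAULT 0"),
)

def migrate(conn):
    """Add any missing columns in place; running twice is a no-op,
    so old and new databases converge on the same schema."""
    existing = {row[1] for row in conn.execute("PRAGMA table_info(in_tail_files)")}
    for name, decl in NEW_COLUMNS:
        if name not in existing:
            conn.execute(f"ALTER TABLE in_tail_files ADD COLUMN {name} {decl}")
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE in_tail_files "
             "(id INTEGER PRIMARY KEY, name TEXT, offset INTEGER)")
migrate(conn)
migrate(conn)  # idempotent on an already-upgraded database
cols = [row[1] for row in conn.execute("PRAGMA table_info(in_tail_files)")]
print(cols)
```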

The active-file purge path now records aged-out path and inode state so a
later rescan keeps ignoring that same file while still allowing a new
inode at the same path to be tailed again.

Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Add a comprehensive Python integration scenario for in_tail and keep the
scenario documentation with it.

The new coverage exercises startup positioning, discovery from head and
tail, rename rotation, repeated rotation, copytruncate, restart resume,
partial-line restart recovery, automatic DB schema upgrade, multiline and
parser modes, long-line handling, rotate_wait, gzip, generic encoding,
ignore_older, ignore_active_older_files, exclude_path, docker mode, and
delayed readability restoration.

The scenario also guards the permission-restoration case when running as
root and verifies that ignore_active_older_files stops ingesting new
writes after the file has aged out.

Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Add internal multiline regression coverage for the known API-level defects that are not a good fit for the Python integration suite.

The new tests cover the first-group-only flush behavior, truncation dropping the overflow line, and the empty context crash paths using the existing tests/internal multiline harness.

Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
@edsiper edsiper force-pushed the integration-tail branch from 79da1f6 to bffd67d Compare April 9, 2026 18:45
@edsiper edsiper merged commit a42e322 into master Apr 9, 2026
74 of 76 checks passed
@edsiper edsiper deleted the integration-tail branch April 9, 2026 21:52