5.4.24 by mscasso-scanoss · Pull Request #111 · scanoss/engine

mscasso-scanoss · 2026-04-09T12:21:35Z

Summary by CodeRabbit

New Features
- Added "Enable full license report" reporting option.
- Compound license expressions now emit multiple SPDX URLs plus a primary URL for clearer attribution.
Bug Fixes
- Reduced duplicate and excessive license entries in compact reports; improved license ordering and selection logic.
Documentation
- Updated built-in help text to document the new reporting flag.

Version: 5.4.24

coderabbitai · 2026-04-09T12:24:19Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

📝 Walkthrough

Bumps SCANOSS_VERSION to 5.4.24 and adds ENABLE_LICENSE_FULL_REPORT flag; CLI/help expose the flag. License parsing and reporting logic updated (LicenseRef detection, compound-license URLs, output capping/early-exit when full report is disabled). Match selection ordering adjusted.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Version & Flags: `inc/scanoss.h` | Bumped `SCANOSS_VERSION` to `"5.4.24"` and added `ENABLE_LICENSE_FULL_REPORT = 32768`. |
| CLI / Help: `src/main.c`, `src/help.c` | `-F/--flags` parsing now detects `ENABLE_LICENSE_FULL_REPORT` and sets `full_license_report`; help text documents the new flag (default: disabled). |
| License handling: `src/license.c` | Changed LicenseRef detection to substring match; tokenize compound expressions (`AND`, `OR`, `WITH`) to emit a `"urls"` map (and set `"url"` to first token); added truncation/early-exit when `full_license_report` is false; replaced special-case ID ordering with priority table. |
| Match selection: `src/match.c` | Reordered candidate-selection checks so identification-based skip happens before `path_is_third_party()` preference comparison. |
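
As a rough illustration of the flag check described in the table, the sketch below tests a single bit in a numeric flags argument. Only the constant name and its value (32768) come from the summary; the helper `full_report_enabled` is hypothetical, and the actual parsing in `src/main.c` may look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Value taken from the change summary; everything else here is illustrative. */
#define ENABLE_LICENSE_FULL_REPORT 32768

/* Hypothetical helper: parse the numeric argument of -F/--flags and
 * report whether the full-license-report bit is set. */
static bool full_report_enabled(const char *flags_arg)
{
	assert(flags_arg != NULL);
	long flags = strtol(flags_arg, NULL, 10);
	return (flags & ENABLE_LICENSE_FULL_REPORT) != 0;
}
```

Because the flag is a single bit, it can be combined with other flag values by addition or bitwise OR without changing how those are interpreted.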

Sequence Diagram(s)

sequenceDiagram
    participant CLI as CLI / Main
    participant Flags as Engine Flags
    participant License as License Processor
    participant Reporter as Report Generator

    CLI->>Flags: parse -F / --flags
    Flags-->>CLI: flags bitmask (includes ENABLE_LICENSE_FULL_REPORT?)
    CLI->>License: start scan (full_license_report set/cleared)
    License->>License: detect "LicenseRef" via substring
    License->>License: tokenize compound expressions -> build "urls" map
    License-->>Reporter: emit license entries (apply caps if full_report=false)
    Reporter-->>CLI: write report/output

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

5.4.20 #103: Overlaps on license handling changes and CLI flag behavior (strong code-level overlap with src/license.c and flags).
5.4.23 #110: Also bumps SCANOSS_VERSION in inc/scanoss.h; related to the version macro change.

Suggested reviewers

scanoss-qg

Poem

🐇 A flag hops in, the version climbs high,
I sniff out LicenseRef where it hides nearby,
I split the strings and map each URL,
Short reports hush when the brief bell will knell,
The rabbit twitches — the engine sighs.

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title '5.4.24' only indicates a version number and does not meaningfully describe the actual changes in the pull request. | Provide a descriptive title that summarizes the main changes, such as 'Add full license report feature' or 'Version 5.4.24: Add license report flag and improve license sorting'. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

coderabbitai

🧹 Nitpick comments (1)

src/license.c (1)

316-358: Consider edge cases in license expression parsing.

The AND/OR parsing logic assumes well-formed SPDX license expressions. If the input contains malformed expressions (e.g., trailing operators, consecutive spaces, or parentheses from complex expressions like (MIT OR Apache-2.0)), the parsing may produce unexpected results.

For example, "(MIT OR Apache-2.0)" would emit "(MIT" and "Apache-2.0)" with parentheses included.

💡 Optional: Strip parentheses from tokens

 		while (token)
 		{
 			/* Skip AND/OR operators */
 			if (strcmp(token, "AND") == 0 || strcmp(token, "OR") == 0)
 			{
 				token = strtok_r(NULL, " ", &saveptr);
 				continue;
 			}
+			/* Strip leading/trailing parentheses */
+			while (*token == '(') token++;
+			char *end = token + strlen(token) - 1;
+			while (end > token && *end == ')') *end-- = '\0';
+			if (!*token) {
+				token = strtok_r(NULL, " ", &saveptr);
+				continue;
+			}
 			if (!first_entry)
 				len += sprintf(buffer + len, ",");

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/license.c` around lines 316 - 358, The parsing loop in src/license.c that
tokenizes lic_copy with strtok_r(…, " ") (variables: lic_copy, token, saveptr,
first_license, first_entry, buffer) doesn't handle parentheses, extra spaces or
malformed expressions; update the tokenizer to split on spaces and parentheses
(e.g., use delimiters " ()" or post-process each token to trim leading/trailing
'(' and ')'), skip empty tokens and skip stray operators (AND/OR) or trailing
operators, and ensure first_license is set only from a valid license token (fall
back to a cleaned single-license value if none found) so URLs emit clean SPDX
IDs without surrounding parentheses or empty entries.
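
The delimiter-based option mentioned in the prompt above can be sketched on its own. This is a minimal example, not the code in `src/license.c`: the function name `split_license_ids` and the `MAX_TOKENS` limit are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>

#define MAX_TOKENS 16

/* Split an SPDX-style expression into identifier tokens, treating spaces and
 * parentheses as delimiters and dropping the AND/OR/WITH operators.
 * The input string is modified in place (strtok_r writes NUL bytes into it).
 * Returns the number of tokens stored in out[]. */
static int split_license_ids(char *expr, char *out[], int max_out)
{
	assert(expr != NULL && out != NULL);
	int n = 0;
	char *saveptr = NULL;
	for (char *tok = strtok_r(expr, " ()", &saveptr);
	     tok != NULL && n < max_out;
	     tok = strtok_r(NULL, " ()", &saveptr))
	{
		if (!strcmp(tok, "AND") || !strcmp(tok, "OR") || !strcmp(tok, "WITH"))
			continue; /* operators are not license identifiers */
		out[n++] = tok;
	}
	return n;
}
```

Since `strtok_r` never returns empty tokens, runs of spaces and trailing operators need no extra handling in this form.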

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cb657bb6-bd09-4541-8e6b-21c71150584f

📥 Commits

Reviewing files that changed from the base of the PR and between c99222b and b0276d9.

📒 Files selected for processing (5)

inc/scanoss.h
src/help.c
src/license.c
src/main.c
src/match.c

scanoss-qg

LGTM

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/license.c`:
- Around line 587-588: The early-exit condition inside the license iteration
incorrectly requires i > 0 so a component license at index 0 won't trigger the
break; remove the unnecessary i > 0 check in the loop that evaluates "if (i > 0
&& component_license && !full_license_report) break;" and change it to simply
check "if (component_license && !full_license_report) break;" so that when
component_license is present (even at i==0) the loop exits immediately and only
the component license is reported.
- Around line 326-352: The code can emit an invalid URL when first_license stays
empty; update the block that appends the URL (referencing variables
first_license, first_entry, lic_copy, buffer, len) to only append the ,"url":
"https://spdx.org/licenses/%s.html" fragment when first_license contains a
non-empty string (e.g., check first_license[0] != '\0'), otherwise skip adding
the URL (or add a safe fallback like omitting the url field or using a
placeholder). Ensure this check comes after the token parsing loop and before
the sprintf that writes the URL to buffer.
- Around line 561-574: The off-by-one stems from incrementing file_header_filter
and scancode_file_filter before checking; currently the code uses ">=3" which
skips the 3rd license. Update the conditions in the blocks referencing
licenses_by_type.licenses[i].id (the file_header_filter and scancode_file_filter
checks that also use full_license_report) to use ">3" instead of ">=3" so the
first three licenses are processed and only the 4th+ are skipped when
full_license_report is false.
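
For the second inline comment above (the empty `first_license` case), a guard along these lines would skip the `url` field when no license ID was found. This is a sketch under assumptions: `append_primary_url` is a hypothetical helper, and it uses a bounded `snprintf` rather than the `sprintf` calls quoted from the PR.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append the primary "url" field only when a license ID was actually found.
 * Returns the new length of the string in buf; buf is left unchanged if
 * first_license is empty or the fragment would not fit. */
static size_t append_primary_url(char *buf, size_t cap, size_t len, const char *first_license)
{
	assert(buf != NULL && first_license != NULL && len < cap);
	if (first_license[0] == '\0')
		return len; /* nothing to link to: omit the field */

	int n = snprintf(buf + len, cap - len,
	                 ",\"url\": \"https://spdx.org/licenses/%s.html\"", first_license);
	if (n < 0 || (size_t)n >= cap - len)
	{
		buf[len] = '\0'; /* undo a truncated write */
		return len;
	}
	return len + (size_t)n;
}
```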

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9dc51fdc-e2da-43a7-9442-b6422e954587

📥 Commits

Reviewing files that changed from the base of the PR and between 4c1fab5 and 276f76a.

📒 Files selected for processing (1)

src/license.c

coderabbitai · 2026-04-09T16:45:40Z

src/license.c

+		//file header license and scancode_file liceses are limited to a maximum of 3.
+		if (licenses_by_type.licenses[i].id == 2)
+		{
+			file_header_filter++;
+			if (file_header_filter >=3 && !full_license_report)
+				continue;
+		}
+
+		if (licenses_by_type.licenses[i].id == 4)
+		{
+			scancode_file_filter++;
+			if (scancode_file_filter >=3 && !full_license_report)
+				continue;
+		}


⚠️ Potential issue | 🟠 Major

Off-by-one: allows 2 licenses instead of 3.

The comment states "limited to a maximum of 3", but the current logic skips when filter >= 3, which triggers on the 3rd item (filter values: 1→process, 2→process, 3→skip). This allows only 2.
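
The difference between the two comparisons can be checked in isolation. The sketch below is illustrative only: `emitted_under_cap` is a hypothetical function that mimics the increment-then-check pattern from the snippet above and counts how many entries get through.

```c
#include <assert.h>
#include <stdbool.h>

/* Count how many of `total` same-type entries pass a cap check that
 * increments the counter first and tests it afterwards.
 * fixed_compare == false uses the original ">= 3" test,
 * fixed_compare == true uses the proposed "> 3" test. */
static int emitted_under_cap(int total, bool fixed_compare, bool full_report)
{
	assert(total >= 0);
	int filter = 0, emitted = 0;
	for (int i = 0; i < total; i++)
	{
		filter++; /* incremented before the check, as in the PR */
		bool over = fixed_compare ? (filter > 3) : (filter >= 3);
		if (over && !full_report)
			continue;
		emitted++;
	}
	return emitted;
}
```

With five candidate entries and the full report disabled, the `>= 3` form lets two through and the `> 3` form lets three through.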

🐛 Proposed fix

 		if (licenses_by_type.licenses[i].id == 2)
 		{
 			file_header_filter++;
-			if (file_header_filter >=3 && !full_license_report)
+			if (file_header_filter > 3 && !full_license_report)
 				continue;
 		}
 		if (licenses_by_type.licenses[i].id == 4)
 		{
 			scancode_file_filter++;
-			if (scancode_file_filter >=3 && !full_license_report)
+			if (scancode_file_filter > 3 && !full_license_report)
 				continue;
 		}

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/license.c`:
- Around line 294-295: The current check in src/license.c using
license_source_id and strstr(license, "LicenseRef") drops entire Scancode
records when any LicenseRef token appears; change it so only pure LicenseRef
entries are skipped and compound expressions keep their SPDX parts: in the block
that uses license_source_id, replace the broad strstr test with a stricter test
that returns buffer only when the whole license string is a single LicenseRef-*
token (e.g., matches start with "LicenseRef" and contains no other SPDX
operators or tokens), and for compound expressions (strings containing SPDX
operators like AND/OR/() or multiple tokens) strip/ignore only the LicenseRef-*
tokens when expanding/normalizing the expression instead of returning early;
update the logic around the current strncmp/strstr usage and any normalization
code that consumes license so it removes LicenseRef-* tokens before further
parsing.
- Around line 561-587: The loop updates file_header_filter,
scancode_file_filter, scanoss_license, and component_license regardless of
whether license_to_json actually emitted output, so enforce caps only on emitted
licenses by making emission observable and using that to drive continues and
counters: modify license_to_json (or add a helper like license_would_emit) to
return/indicate whether it will emit (a bool "emitted") without side effects,
then call that before applying the cap checks (or call license_to_json and use
its emitted return) and only increment file_header_filter/scancode_file_filter
or set scanoss_license/component_license when emitted is true; update the logic
around the id checks (licenses_by_type.licenses[i].id) and the early continue
conditions to reference emitted instead of assuming every iteration produced
output, using the existing variables file_header_filter, scancode_file_filter,
scanoss_license, component_license, full_license_report and the license_to_json
symbol to locate where to change behavior.
- Around line 333-346: The code currently treats the token after a "WITH"
operator as a license and generates a licenses URL; change the logic so that
when you encounter "WITH" (in the block referencing token, saveptr) you set a
flag (e.g., next_is_exception) and advance to the next token, then when emitting
the JSON entry (the code using token, buffer, len, first_entry, first_license)
use that flag to choose "https://spdx.org/exceptions/%s.html" instead of the
licenses path and reset the flag; ensure first_license is only set for actual
licenses (not exceptions) and clear the flag after use.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 043b8f7a-902d-45bc-a6ba-d473ec854e99

📥 Commits

Reviewing files that changed from the base of the PR and between 276f76a and 6cf3fe3.

📒 Files selected for processing (1)

src/license.c

coderabbitai · 2026-04-09T16:53:39Z

src/license.c

+	if (!strncmp(license_source_id, "scancode", 8) && strstr(license, "LicenseRef"))
 		return buffer;


⚠️ Potential issue | 🟠 Major

Don't drop mixed SPDX expressions just because they contain LicenseRef.

This now skips the whole Scancode record for values like MIT AND LicenseRef-scancode-foo, so the valid SPDX part is lost too. Restrict the skip to pure LicenseRef-* entries, or ignore only the LicenseRef-* token when expanding compound expressions.
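
One way to restrict the skip to pure `LicenseRef-*` entries is sketched below. The helper `is_pure_licenseref` is hypothetical, and it assumes that any space or parenthesis in the string marks a compound expression.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True when the whole string is a single LicenseRef-* token: it starts with
 * "LicenseRef" and contains no spaces or parentheses that would indicate a
 * compound expression. */
static bool is_pure_licenseref(const char *license)
{
	assert(license != NULL);
	if (strncmp(license, "LicenseRef", 10) != 0)
		return false;
	return strpbrk(license, " ()") == NULL;
}
```

With a check like this, `MIT AND LicenseRef-scancode-foo` would fall through to the compound-expression path, where the `LicenseRef-*` token could be dropped individually.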

coderabbitai · 2026-04-09T16:53:39Z

src/license.c

+			/* Skip AND/OR/WITH operators */
+			if (strcmp(token, "AND") == 0 || strcmp(token, "OR") == 0 || strcmp(token, "WITH") == 0)
+			{
+				token = strtok_r(NULL, " ()", &saveptr);
+				continue;
+			}
+			if (!first_entry)
+				len += sprintf(buffer + len, ",");
+			else
+			{
+				strncpy(first_license, token, MAX_FIELD_LN - 1);
+				first_entry = false;
+			}
+			len += sprintf(buffer + len, "\"%s\": \"https://spdx.org/licenses/%s.html\"", token, token);


⚠️ Potential issue | 🟠 Major

WITH exceptions need SPDX /exceptions/ URLs.

In expressions like GPL-2.0-only WITH Classpath-exception-2.0, the token after WITH is an SPDX exception, not a license. This code currently emits spdx.org/licenses/... for it, which produces an invalid URL.

💡 Suggested fix

-		char *token = strtok_r(lic_copy, " ()", &saveptr);
+		char *token = strtok_r(lic_copy, " ()", &saveptr);
+		const char *spdx_kind = "licenses";
 		while (token)
 		{
 			/* Skip AND/OR/WITH operators */
-			if (strcmp(token, "AND") == 0 || strcmp(token, "OR") == 0 || strcmp(token, "WITH") == 0)
+			if (strcmp(token, "AND") == 0 || strcmp(token, "OR") == 0 || strcmp(token, "WITH") == 0)
 			{
+				spdx_kind = (strcmp(token, "WITH") == 0) ? "exceptions" : "licenses";
 				token = strtok_r(NULL, " ()", &saveptr);
 				continue;
 			}
 			if (!first_entry)
 				len += sprintf(buffer + len, ",");
@@
-			len += sprintf(buffer + len, "\"%s\": \"https://spdx.org/licenses/%s.html\"", token, token);
+			len += sprintf(buffer + len, "\"%s\": \"https://spdx.org/%s/%s.html\"", token, spdx_kind, token);
+			spdx_kind = "licenses";
 			token = strtok_r(NULL, " ()", &saveptr);
 		}

coderabbitai · 2026-04-09T16:53:39Z

src/license.c

+		//file header license and scancode_file liceses are limited to a maximum of 3.
+		if (licenses_by_type.licenses[i].id == 2)
+		{
+			file_header_filter++;
+			if (file_header_filter >=3 && !full_license_report)
+				continue;
+		}
+
+		if (licenses_by_type.licenses[i].id == 4)
+		{
+			scancode_file_filter++;
+			if (scancode_file_filter >=3 && !full_license_report)
+				continue;
+		}
+
+		if (licenses_by_type.licenses[i].id == 5 && scanoss_license && !full_license_report)
+			continue;
+
 		buffer = license_to_json(crclist, buffer, licenses_by_type.licenses[i].text, licenses_by_type.licenses[i].id, &first);
+		//just report component license if available
+		if (licenses_by_type.licenses[i].id == 0 && !first)
+			component_license = true;
+
+		else if (licenses_by_type.licenses[i].id > 0 && !first)
+			scanoss_license = true;
+
+		if (i > 0 && component_license && !full_license_report)


⚠️ Potential issue | 🟠 Major

Count only emitted licenses when enforcing caps and break conditions.

file_header_filter, scancode_file_filter, scanoss_license, and component_license are updated per iteration, but license_to_json() can no-op for duplicates, short strings, or skipped Scancode LicenseRef values. That means non-emitted entries still consume the quota and can suppress later valid licenses.
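
The idea of charging the quota only for entries that are really written can be sketched as follows. Everything here is hypothetical: `would_emit` stands in for whatever test `license_to_json()` applies internally (only empty strings and immediate duplicates are modelled), and `CAP` mirrors the limit of 3 mentioned in the code comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CAP 3

/* Hypothetical stand-in for the emit decision: true only when the entry would
 * really be written (non-empty text that differs from the previous one). */
static bool would_emit(const char *text, const char *prev)
{
	return text[0] != '\0' && (prev == NULL || strcmp(text, prev) != 0);
}

/* Apply a cap of CAP entries, counting only entries that are actually emitted. */
static int count_emitted(const char *texts[], int count, bool full_report)
{
	assert(texts != NULL);
	int emitted = 0;
	const char *prev = NULL;
	for (int i = 0; i < count; i++)
	{
		if (!would_emit(texts[i], prev))
			continue; /* no-op entries do not consume quota */
		if (emitted >= CAP && !full_report)
			break; /* cap reached on real output */
		prev = texts[i];
		emitted++;
	}
	return emitted;
}
```

Here the counter is tested before it is incremented, so a cap of 3 allows exactly three emitted entries.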

coderabbitai

♻️ Duplicate comments (3)

src/license.c (3)

567-570: ⚠️ Potential issue | 🟠 Major

Check the id boundary before emitting, and only advance last_id on actual output.

As written, the first record from the next id group is already appended before the break runs. Also, a skipped/non-emitted record can still change last_id and prematurely terminate the loop.

💡 Suggested fix

 	for (int i = 0; i < licenses_by_type.count; i++)
 	{
-		buffer = license_to_json(crclist, buffer, licenses_by_type.licenses[i].text, licenses_by_type.licenses[i].id, &first);
-		if (last_id >= 0 && last_id != licenses_by_type.licenses[i].id && !first && !full_license_report)
+		if (last_id >= 0 && last_id != licenses_by_type.licenses[i].id && !first && !full_license_report)
 			break;
-		last_id = licenses_by_type.licenses[i].id;
+
+		char *next = license_to_json(crclist, buffer, licenses_by_type.licenses[i].text, licenses_by_type.licenses[i].id, &first);
+		if (next != buffer)
+			last_id = licenses_by_type.licenses[i].id;
+		buffer = next;
 	}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/license.c` around lines 567 - 570, The loop currently calls
license_to_json(...) before checking the id boundary and always updates last_id,
which lets a non-emitted record change last_id and can append the first record
of the next id group before breaking; fix by testing the boundary (compare
licenses_by_type.licenses[i].id against last_id and respect full_license_report)
before calling license_to_json so you never append from the next id group, and
only assign last_id = licenses_by_type.licenses[i].id after license_to_json
actually emitted output (i.e., after observing the first flag/return indicating
data was written).

342-355: ⚠️ Potential issue | 🟠 Major

Use SPDX /exceptions/ URLs after WITH.

The token after WITH is an SPDX exception, so emitting it under /licenses/ produces an invalid URL. Track WITH and switch the next token to https://spdx.org/exceptions/...; only actual licenses should go under /licenses/.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/license.c` around lines 342 - 355, The code currently treats "WITH" like
a skip and misclassifies the following token as a license (producing /licenses/
URLs); change this by tracking when you see the "WITH" operator (e.g., bool
saw_with = false), and when token == "WITH" set saw_with = true and advance to
the next token via strtok_r; for that next token use the exceptions path
("https://spdx.org/exceptions/%s.html") instead of the licenses path when
building the buffer (then reset saw_with = false); keep existing handling of
"AND"/"OR" as skips and preserve first_entry/first_license logic when adding
entries to buffer.

303-304: ⚠️ Potential issue | 🟠 Major

Don't drop mixed SPDX expressions just because they contain LicenseRef.

This still skips the whole Scancode record for values like MIT AND LicenseRef-scancode-foo, so the valid SPDX part is lost too. Restrict the fast-path skip to pure LicenseRef-* entries, or strip only the LicenseRef-* token when expanding compound expressions.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6ce1e145-0e5a-4b1a-b220-879e183108bd

📥 Commits

Reviewing files that changed from the base of the PR and between 6cf3fe3 and 692cae7.

📒 Files selected for processing (1)

src/license.c

mscasso-scanoss added 2 commits April 8, 2026 15:33

change license report behavior acording to SP-4247

dd9c505

limit file source licenses to 3

b0276d9

mscasso-scanoss requested a review from scanoss-qg April 9, 2026 12:21

mscasso-scanoss self-assigned this Apr 9, 2026

coderabbitai bot reviewed Apr 9, 2026

add WITH case for license split

4c1fab5

scanoss-qg reviewed Apr 9, 2026

coderabbitai bot reviewed Apr 9, 2026

use scancode component liceses as fallback

6cf3fe3

mscasso-scanoss force-pushed the 5.4.24 branch from 276f76a to 6cf3fe3 Compare April 9, 2026 16:48

coderabbitai bot reviewed Apr 9, 2026

adjust license logic to license service

692cae7

coderabbitai bot reviewed Apr 10, 2026

2 participants
