-
Notifications
You must be signed in to change notification settings - Fork 23
Chore/scan tunning parameters #174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
0e16a3e to
1bd454d
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
1bd454d to
c12833c
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
c12833c to
92ed9a1
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
92ed9a1 to
ebe0ef3
Compare
📝 WalkthroughWalkthroughAdded snippet-matching tuning: new CLI flags, settings schema and accessors, a ScanSettingsBuilder to merge CLI/file settings (priority: CLI > file_snippet > root), propagated a Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant User as CLI User
participant CLI as cli.py
participant File as scanoss.json
participant Settings as ScanossSettings
participant Builder as ScanSettingsBuilder
participant Scanner as Scanner
participant API as ScanossApi
participant HTTP as HTTP Server
User->>CLI: run scan with flags (e.g. --min-snippet-hits, --ranking)
CLI->>File: load scanoss.json
File-->>Settings: build ScanossSettings (root + file_snippet)
CLI->>Builder: provide CLI args + ScanossSettings
Note right of Builder: Merge precedence: CLI > file_snippet > root
Builder-->>Scanner: return merged scan_settings object
CLI->>Scanner: instantiate Scanner(scanoss_settings, CLI args)
Scanner->>API: instantiate ScanossApi(... tuning params ...)
API->>API: _build_scan_settings_header() → base64(JSON of non-default tunings)
API->>HTTP: scan() with header "scanoss-settings: <base64-payload>"
HTTP-->>API: response
API-->>Scanner: scan result
Scanner-->>CLI: final output
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/scanoss/scanossapi.py (1)
58-80: Scan settings header logic is sound; update docs to match behaviourFunctionally this looks good:
- Tuning fields are stored on the instance and
_build_scan_settings_header()emits a base64-encoded JSON blob only when values are “meaningful” (!= 0for snippet counts,!= 'unset'for booleans,!= -1for ranking threshold).scan()attaches the header asscanoss-settings, so the extra metadata rides with each request.A few minor doc cleanups would help:
__init__docstring still says the new params default toNone, buthonour_file_extsandrankingactually default to'unset'._build_scan_settings_headerdocstring referencesx-scanoss-scan-settings; the actual header key isscanoss-settings.- Consider clarifying in the docstring that
0and-1are treated as “defer to server configuration” and intentionally omitted from the header.No changes to runtime logic are needed.
Also applies to: 90-95, 173-180, 293-326
src/scanoss/scanner.py (2)
147-171: Skip-headers merge logic is correct; fix_merge_cli_with_settingsdocstringThe new logic to merge
skip_headersandskip_headers_limit:settings_skip_headers = file_snippet_settings.get('skip_headers') settings_skip_headers_limit = file_snippet_settings.get('skip_headers_limit') skip_headers = Scanner._merge_cli_with_settings(skip_headers, settings_skip_headers) skip_headers_limit = Scanner._merge_cli_with_settings(skip_headers_limit, settings_skip_headers_limit)combined with:
if settings_value is not None: return settings_value return cli_valuecorrectly gives precedence to values from
scanoss.jsonover CLI defaults, which is sensible.However, the
_merge_cli_with_settingsdocstring claims “Merged value with CLI taking priority over settings”, which is the opposite of what the function actually does (and what the code above relies on). I’d update the “Returns:” description to clearly state that settings take priority over CLI values.Also applies to: 248-260
173-238:post_processorcreation condition uses the wrong variableHere:
scan_settings = (ScanSettingsBuilder(scanoss_settings) .with_proxy(proxy) .with_url(url) .with_ignore_cert_errors(ignore_cert_errors) .with_min_snippet_hits(min_snippet_hits) .with_min_snippet_lines(min_snippet_lines) .with_honour_file_exts(honour_file_exts) .with_ranking(ranking) .with_ranking_threshold(ranking_threshold) ) ... self.post_processor = ( ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) if scan_settings else None )
scan_settingsis always aScanSettingsBuilderinstance, so this condition is effectively always true, even whenscanoss_settingsisNone. That’s a behavioural change from the usual “only post-process when we actually have settings” pattern and risks constructingScanPostProcessorwith aNonesettings object.If
ScanPostProcessorassumes a non-Nonesettings object, this could lead to unexpected errors; even if it handlesNone, the condition is misleading.Suggest switching the condition to check
scanoss_settingsinstead of the builder:Proposed fix
- self.post_processor = ( - ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) if scan_settings else None - ) + self.post_processor = ( + ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) + if scanoss_settings + else None + )
🧹 Nitpick comments (4)
tests/test_scan_settings_builder.py (1)
33-359: Good coverage of ScanSettingsBuilder behaviourThe tests exercise initialization, helper methods, all
with_*methods, priority rules, and chaining in a realistic way usingtests/data/scanoss.json. This gives solid confidence in the builder logic.If you want to harden things further, consider adding a couple of tests for sentinel values (
min_snippet_hits == 0,ranking_threshold == -1) to reflect the semantics used when building the API header (i.e., “defer to server config” cases).src/scanoss/cli.py (1)
190-222: Snippet-tuning CLI options wired correctly; consider tightening validationThe new
--min-snippet-hits,--min-snippet-lines,--ranking,--ranking-threshold, and--honour-file-extsoptions are consistent with howScannerandScanSettingsBuilderconsume them, including the “0 / -1 mean defer to server config” semantics.Two optional polish items:
--ranking-threshold: you document-1..99but don’t enforce it; a customtypeor validator would catch out-of-range values early.--honour-file-exts: help text is clear here, but its JSON-schema description currently says “Ignores file extensions”; aligning those descriptions would avoid confusion between positive/negative wording.docs/source/_static/scanoss-settings-schema.json (1)
142-260: Schema additions align with code; tweakhonour_file_extsdescriptionThe new
settings.proxy,settings.http_config,settings.file_snippet,settings.hpfm, andsettings.containerblocks line up well with the new getters and builder logic in code (same field names and types, includingranking_*,min_snippet_*,honour_file_exts, skip header controls, etc.).One small documentation inconsistency:
file_snippet.properties.honour_file_exts.descriptionsays “Ignores file extensions…”, but the field name and CLI option (--honour-file-exts) are phrased positively (“honour file extensions”).To avoid confusion, it would be clearer if this description matched the positive semantics (or explicitly stated whether
truemeans “honour” vs “ignore”).src/scanoss/scanner.py (1)
342-346: Dependency auto-enable via settings: confirm intended behaviour
is_dependency_scan()now returnsTrueeither when the dependency bit is set inscan_optionsor whensettings.file_snippet.dependency_analysisis true:if self.scan_options & ScanType.SCAN_DEPENDENCIES.value: return True file_snippet_settings = self.scanoss_settings.get_file_snippet_settings() if self.scanoss_settings else {} return file_snippet_settings.get('dependency_analysis', False)This means a settings file can implicitly turn on dependency scanning even if the user didn’t pass
--dependencies/--dependencies-only. That’s reasonable, but it is a change in behaviour.If this is intentional, it might be worth documenting somewhere that
dependency_analysisinscanoss.jsonacts as a default “enable deps scan” toggle, overridden only by explicitly disabling dependency bits inscan_options.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
CHANGELOG.md(2 hunks)CLIENT_HELP.md(1 hunks)docs/source/_static/scanoss-settings-schema.json(1 hunks)scanoss.json(1 hunks)src/scanoss/__init__.py(1 hunks)src/scanoss/cli.py(5 hunks)src/scanoss/scan_settings_builder.py(1 hunks)src/scanoss/scanner.py(9 hunks)src/scanoss/scanoss_settings.py(1 hunks)src/scanoss/scanossapi.py(6 hunks)tests/data/scanoss.json(1 hunks)tests/test_scan_settings_builder.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
tests/test_scan_settings_builder.py (3)
src/scanoss/scan_settings_builder.py (12)
ScanSettingsBuilder(31-263)_str_to_bool(220-226)_merge_with_priority(203-209)_merge_cli_with_settings(212-216)with_proxy(69-83)with_url(85-99)with_ignore_cert_errors(101-119)with_min_snippet_hits(121-134)with_min_snippet_lines(136-149)with_honour_file_exts(151-167)with_ranking(169-184)with_ranking_threshold(186-199)src/scanoss/scanoss_settings.py (1)
ScanossSettings(76-433)src/scanoss/scanner.py (1)
_merge_cli_with_settings(249-260)
src/scanoss/cli.py (1)
src/scanoss/scanoss_settings.py (4)
ScanossSettings(76-433)load_json_file(103-138)set_file_type(140-152)set_scan_type(154-161)
src/scanoss/scan_settings_builder.py (2)
src/scanoss/scanoss_settings.py (6)
ScanossSettings(76-433)get_file_snippet_settings(339-345)get_file_snippet_proxy(419-425)get_proxy(403-409)get_http_config(411-417)get_file_snippet_http_config(427-433)src/scanoss/scanner.py (1)
_merge_cli_with_settings(249-260)
🪛 markdownlint-cli2 (0.18.1)
CHANGELOG.md
776-776: Link and image reference definitions should be needed
Duplicate link or image reference definition: "1.42.0"
(MD053, link-image-reference-definitions)
🔇 Additional comments (7)
src/scanoss/__init__.py (1)
25-25: LGTM!Version bump to 1.43.0 appropriately reflects the new scan tuning parameters and
ScanSettingsBuilderadditions.scanoss.json (1)
18-21: LGTM!Adding
scanoss-winnowing.pyto the BOM includes is appropriate since it's a declared dependency used for fast winnowing functionality.CHANGELOG.md (1)
11-20: LGTM!The changelog entry for version 1.43.0 clearly documents the new scan engine tuning parameters and
ScanSettingsBuilderfunctionality.CLIENT_HELP.md (1)
262-304: LGTM!Comprehensive documentation for the new snippet tuning options. The examples are clear, and the note about using
=syntax for negative values (line 292) is a helpful detail for users.tests/data/scanoss.json (1)
28-51: LGTM!The test data structure effectively exercises the nested configuration hierarchy with distinct values at root and
file_snippetlevels (e.g., different proxy hosts and http_config settings), which enables proper testing of theScanSettingsBuildermerge priority logic.src/scanoss/cli.py (1)
1410-1437: scan / wfp / folder- settings wiring looks consistent*The refactor to use
scanoss_settings(instead of the oldscan_settings) inwfp(),scan(),folder_hashing_scan(), andfolder_hash()is coherent:
- Settings files are loaded once into
ScanossSettings, with legacy vs new file types handled inscan().- The same
scanoss_settingsinstance is threaded intoScanner,ScannerHFH, andFolderHasher.- New snippet-tuning CLI options are only passed to
Scannerfor thescancommand, which is appropriate given they affect server-side scanning, not fingerprint-only flows.No functional issues stand out in these call sites.
Also applies to: 1518-1532, 1597-1641, 2716-2831
src/scanoss/scanoss_settings.py (1)
339-433: New settings accessors are straightforward and consistentThe added getters for
file_snippet, proxy, andhttp_config(includingmin_snippet_*, ranking,honour_file_exts, skip headers, etc.) correctly mirror the underlying data structure and provide clear defaults (None/0/Falseas documented). They integrate cleanly withScanSettingsBuilderandScanner.
ebe0ef3 to
59b57ba
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (1)
CHANGELOG.md (1)
775-776: Fix duplicate link reference definition.The duplicate link reference issue flagged in the previous review remains unresolved. Line 776 uses
[1.42.0]as the link label, but it should be[1.43.0]to match the version and prevent Markdown rendering issues.🔎 Apply this diff to fix the link label:
[1.42.0]: https://github.com/scanoss/scanoss.py/compare/v1.41.1...v1.42.0 -[1.42.0]: https://github.com/scanoss/scanoss.py/compare/v1.42.0...v1.43.0 +[1.43.0]: https://github.com/scanoss/scanoss.py/compare/v1.42.0...v1.43.0
🧹 Nitpick comments (2)
src/scanoss/cli.py (1)
190-222: Fix inconsistent spacing in choices lists.Lines 206 and 219 have inconsistent spacing in the
choicesparameter:
- Line 206:
choices=['unset' ,'true', 'false'](space before comma)- Line 219:
choices=['unset','true', 'false'](no space after first comma)🔎 Apply this diff for consistent formatting:
p_scan.add_argument( '--ranking', type=str, - choices=['unset' ,'true', 'false'], + choices=['unset', 'true', 'false'], default='unset', help='Enable or disable ranking (optional - default: server configuration)', ) p_scan.add_argument( '--honour-file-exts', type=str, - choices=['unset','true', 'false'], + choices=['unset', 'true', 'false'], default='unset', help='Honour file extensions during scanning. When not set, defers to server configuration (optional)', )src/scanoss/scanossapi.py (1)
75-79: Fix spacing before comma in parameter default.Line 77 has a space before the comma:
honour_file_exts = 'unset' ,🔎 Apply this diff:
- honour_file_exts = 'unset' , + honour_file_exts='unset',
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
CHANGELOG.md(2 hunks)CLIENT_HELP.md(1 hunks)scanoss.json(1 hunks)src/scanoss/__init__.py(1 hunks)src/scanoss/cli.py(5 hunks)src/scanoss/scan_settings_builder.py(1 hunks)src/scanoss/scanner.py(9 hunks)src/scanoss/scanoss_settings.py(1 hunks)src/scanoss/scanossapi.py(6 hunks)tests/data/scanoss.json(1 hunks)tests/test_scan_settings_builder.py(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- CLIENT_HELP.md
🚧 Files skipped from review as they are similar to previous changes (3)
- src/scanoss/scan_settings_builder.py
- tests/data/scanoss.json
- tests/test_scan_settings_builder.py
🧰 Additional context used
🧬 Code graph analysis (1)
src/scanoss/cli.py (2)
src/scanoss/scanoss_settings.py (4)
ScanossSettings(76-433)load_json_file(103-138)set_file_type(140-152)set_scan_type(154-161)src/scanoss/inspection/utils/file_utils.py (1)
load_json_file(29-44)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (4)
src/scanoss/__init__.py (1)
25-25: LGTM!Version bump is consistent with the new features introduced in this PR.
scanoss.json (1)
18-21: LGTM!Adding the scanoss-winnowing.py library to the BOM is appropriate for dependency tracking.
src/scanoss/scanoss_settings.py (1)
339-433: LGTM!The new getter methods provide clean access to file_snippet settings with consistent patterns, appropriate defaults, and clear documentation.
src/scanoss/scanner.py (1)
173-182: TheScanSettingsBuilderimplementation does not match the stated priority order. The builder implements file_snippet > root settings > CLI, where CLI arguments are the lowest priority (used as fallback). This is the opposite of the PR objectives which require CLI > file_snippet > root settings. The priority order logic must be reversed in the merge methods to ensure CLI arguments take precedence when provided.Likely an incorrect or invalid review comment.
59b57ba to
0fb75e9
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/scanoss/scanner.py (1)
147-157: Priority order contradicts PR objectives.Lines 153-155 use
_merge_cli_with_settingswhich implements "settings > cli" priority (as seen in lines 258-260). However, the PR objectives explicitly state: "priority: CLI > file_snippet > root settings".This means CLI arguments should override settings file values, but the current implementation does the opposite for
skip_headersandskip_headers_limit.🔎 Apply this diff to fix the priority order:
@staticmethod def _merge_cli_with_settings(cli_value, settings_value): - """Merge CLI value with settings value (two-level priority: settings > cli). + """Merge CLI value with settings value (two-level priority: cli > settings). Args: cli_value: Value from CLI argument settings_value: Value from scanoss.json file_snippet settings Returns: - Merged value with CLI taking priority over settings + Merged value with CLI taking priority over settings file """ + if cli_value is not None: + return cli_value if settings_value is not None: return settings_value - return cli_value + return None
🧹 Nitpick comments (3)
src/scanoss/scanossapi.py (1)
75-79: Consider type enforcement for boolean parameters.The
honour_file_extsandrankingparameters useUnion[bool, str, None]to support both boolean values and the 'unset' sentinel string. While theScanSettingsBuildercorrectly converts string inputs to booleans before passing them toScanossApi, direct instantiation ofScanossApicould lead to strings like 'true'/'false' being serialized as JSON strings rather than booleans.Consider adding runtime validation in
_build_scan_settings_header()to ensure these values are booleans when included:🔎 Suggested fix
# honour_file_exts: None = unset, don't send if self.honour_file_exts is not None and self.honour_file_exts != 'unset': - settings['honour_file_exts'] = self.honour_file_exts + settings['honour_file_exts'] = bool(self.honour_file_exts) if not isinstance(self.honour_file_exts, bool) else self.honour_file_exts # ranking: None = unset, don't send if self.ranking is not None and self.ranking != 'unset': - settings['ranking_enabled'] = self.ranking + settings['ranking_enabled'] = bool(self.ranking) if not isinstance(self.ranking, bool) else self.rankingtests/test_scan_settings_builder.py (1)
36-38: Consider usingsetUpClassfor shared fixtures.The
scan_settingsfixture is created at class definition time. While this works, usingsetUpClassis the conventional approach for shared test fixtures in unittest:🔎 Suggested refactor
class TestScanSettingsBuilder(unittest.TestCase): """Tests for the ScanSettingsBuilder class.""" - script_dir = os.path.dirname(os.path.abspath(__file__)) - scan_settings_path = Path(script_dir, 'data', 'scanoss.json').resolve() - scan_settings = ScanossSettings(filepath=scan_settings_path) + @classmethod + def setUpClass(cls): + script_dir = os.path.dirname(os.path.abspath(__file__)) + cls.scan_settings_path = Path(script_dir, 'data', 'scanoss.json').resolve() + cls.scan_settings = ScanossSettings(filepath=str(cls.scan_settings_path))src/scanoss/cli.py (1)
190-222: Fix inconsistent spacing in choices lists.Lines 206 and 219 have inconsistent spacing in the
choicesparameter:
- Line 206:
choices=['unset' ,'true', 'false']has a space before the comma after 'unset'- Line 219:
choices=['unset','true', 'false']is missing a space after 'unset' comma🔎 Apply this diff to fix the spacing:
p_scan.add_argument( '--ranking', type=str, - choices=['unset' ,'true', 'false'], + choices=['unset', 'true', 'false'], default='unset', help='Enable or disable ranking (optional - default: server configuration)', ) p_scan.add_argument( '--ranking-threshold', type=int, default=None, help='Ranking threshold value. Valid range: -1 to 99. A value of -1 defers to server configuration (optional)', ) p_scan.add_argument( '--honour-file-exts', type=str, - choices=['unset','true', 'false'], + choices=['unset', 'true', 'false'], default='unset', help='Honour file extensions during scanning. When not set, defers to server configuration (optional)', )
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
CHANGELOG.md(2 hunks)CLIENT_HELP.md(1 hunks)scanoss.json(1 hunks)src/scanoss/__init__.py(1 hunks)src/scanoss/cli.py(5 hunks)src/scanoss/scan_settings_builder.py(1 hunks)src/scanoss/scanner.py(9 hunks)src/scanoss/scanoss_settings.py(1 hunks)src/scanoss/scanossapi.py(6 hunks)tests/data/scanoss.json(1 hunks)tests/test_scan_settings_builder.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- scanoss.json
- CHANGELOG.md
- src/scanoss/scan_settings_builder.py
- src/scanoss/init.py
🧰 Additional context used
🧬 Code graph analysis (3)
src/scanoss/scanossapi.py (2)
src/scanoss/scanossbase.py (1)
print_debug(58-63)src/scanoss/spdxlite.py (1)
print_debug(61-66)
tests/test_scan_settings_builder.py (3)
src/scanoss/scan_settings_builder.py (12)
ScanSettingsBuilder(31-266)_str_to_bool(223-229)_merge_with_priority(203-209)_merge_cli_with_settings(212-219)with_proxy(69-83)with_url(85-99)with_ignore_cert_errors(101-119)with_min_snippet_hits(121-134)with_min_snippet_lines(136-149)with_honour_file_exts(151-167)with_ranking(169-184)with_ranking_threshold(186-199)src/scanoss/scanoss_settings.py (1)
ScanossSettings(76-433)src/scanoss/scanner.py (1)
_merge_cli_with_settings(249-260)
src/scanoss/scanner.py (2)
src/scanoss/scan_settings_builder.py (10)
ScanSettingsBuilder(31-266)_merge_cli_with_settings(212-219)with_proxy(69-83)with_url(85-99)with_ignore_cert_errors(101-119)with_min_snippet_hits(121-134)with_min_snippet_lines(136-149)with_honour_file_exts(151-167)with_ranking(169-184)with_ranking_threshold(186-199)src/scanoss/scanoss_settings.py (1)
get_file_snippet_settings(339-345)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (14)
tests/data/scanoss.json (1)
29-51: LGTM!The test data file is well-structured with a clear hierarchy between root settings and file_snippet settings, enabling comprehensive testing of the priority-based configuration merging logic.
src/scanoss/scanoss_settings.py (1)
339-433: LGTM!The new accessor methods for file_snippet settings are well-implemented with:
- Consistent patterns across all getters
- Proper Optional return types
- Clear docstrings documenting expected return values
- Clean separation between root-level and file_snippet-level configuration access
src/scanoss/scanossapi.py (2)
176-179: LGTM!The integration of the
scanoss-settingsheader into the scan method is clean and correct. The header is only added when meaningful settings are configured.
293-326: LGTM!The
_build_scan_settings_headermethod correctly:
- Excludes sentinel values (0, 'unset', -1, None) from the serialized payload
- Uses base64 encoding for safe header transmission
- Logs the settings for debugging before encoding
- Returns None when no settings need to be sent
CLIENT_HELP.md (1)
262-304: LGTM!The documentation for the new snippet tuning options is comprehensive and well-structured:
- Clear explanations of each flag's purpose
- Practical usage examples
- Helpful note about
=syntax for negative threshold values- Example showing how to combine multiple options
tests/test_scan_settings_builder.py (2)
44-117: LGTM!Excellent test coverage for:
- Initialization with None and with settings object
- Static helper methods (
_str_to_bool,_merge_with_priority,_merge_cli_with_settings)- Edge cases like all-None inputs and case-insensitive boolean parsing
119-358: LGTM!Comprehensive test coverage for all
with_*methods including:
- CLI-only scenarios (no settings file)
- Settings file values taking priority over CLI
- Method chaining verification
- All tuning parameters: proxy, url, ignore_cert_errors, min_snippet_hits, min_snippet_lines, honour_file_exts, ranking, ranking_threshold
src/scanoss/cli.py (3)
1410-1418: LGTM!Settings loading is correctly refactored to use
scanoss_settingsinstead ofscan_settings, maintaining consistency with the rest of the codebase.
1518-1535: LGTM!Settings loading logic correctly handles different scenarios (identify, ignore, and standard settings) while maintaining backward compatibility.
1631-1641: LGTM!New snippet tuning parameters are correctly wired from CLI arguments to the Scanner constructor, enabling configuration of snippet matching behavior.
src/scanoss/scanner.py (4)
110-117: LGTM!Constructor signature correctly updated to accept the new snippet tuning parameters and renamed
scanoss_settingsfor clarity. The new parameters enable fine-grained control over snippet matching behavior.
186-206: LGTM!The
ScanossApiis correctly initialized with merged settings fromScanSettingsBuilder, properly wiring through the new snippet tuning parameters.
337-345: LGTM!The
is_dependency_scanmethod correctly falls back to checkingdependency_analysisin thefile_snippetsettings when the explicit scan option is not set, providing flexible configuration.
173-184: Correct the stated priority order in review comment.The review comment incorrectly states the PR objectives. The ScanSettingsBuilder class documentation specifies priority as: file_snippet section > settings section > CLI arguments, and the
_merge_with_priorityimplementation correctly follows this priority order. The code is working as designed; the review comment's stated objectives are reversed.Likely an incorrect or invalid review comment.
0fb75e9 to
c62cf42
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
src/scanoss/cli.py (1)
1511-1516: Fix typo in conflicting options error message.The error message currently says
--skip-file-settings, but the actual flag is--skip-settings-file. Updating the string will avoid confusion for users trying to fix their invocation.Suggested diff
- if args.settings and args.skip_settings_file: - print_stderr('ERROR: Cannot specify both --settings and --skip-file-settings options.') + if args.settings and args.skip_settings_file: + print_stderr('ERROR: Cannot specify both --settings and --skip-settings-file options.')src/scanoss/scanpostprocessor.py (2)
112-121: Resolve PLW2901 by avoiding reassignment of loop variable in _load_component_info.
resultis used as the loop variable and then reassigned, which triggersPLW2901and breaks lint. Use a separate local name for the normalized result.Suggested diff
- for _, result in self.results.items(): - result = result[0] if isinstance(result, list) else result - purls = result.get('purl', []) - for purl in purls: - self.component_info_map[purl] = result + for _, result in self.results.items(): + normalized_result = result[0] if isinstance(result, list) else result + purls = normalized_result.get('purl', []) + for purl in purls: + self.component_info_map[purl] = normalized_result
159-164: Resolve PLW2901 in _replace_purls by not overwritingresult.Same pattern here: the loop variable
resultis reassigned, triggering the lint error. Use a separate variable to hold the normalized result before passing it downstream.Suggested diff
- for result_path, result in self.results.items(): - result = result[0] if isinstance(result, list) else result - should_replace, to_replace_with_purl = self._should_replace_result(result_path, result, to_replace_entries) - if should_replace: - self.results[result_path] = [self._update_replaced_result(result, to_replace_with_purl)] + for result_path, result in self.results.items(): + normalized_result = result[0] if isinstance(result, list) else result + should_replace, to_replace_with_purl = self._should_replace_result( + result_path, + normalized_result, + to_replace_entries, + ) + if should_replace: + self.results[result_path] = [self._update_replaced_result(normalized_result, to_replace_with_purl)]src/scanoss/scanner.py (1)
174-238: Fix post-processor initialisation to avoid None scanoss_settings crash.
ScanSettingsBuilder(scanoss_settings)is always truthy, soscan_settingsis truthy even whenscanoss_settingsisNone. As a result:
self.post_processoris created withscanoss_settings=None.- Later,
post_process()callsself.scanoss_settings.is_legacy(), which will raiseAttributeErrorwhen no settings file is used (e.g.--skip-settings-file).You should gate the post-processor on
scanoss_settingsitself, not on the merged builder object.Suggested diff
- scan_settings = (ScanSettingsBuilder(scanoss_settings) + scan_settings = ( + ScanSettingsBuilder(scanoss_settings) .with_proxy(proxy) .with_url(url) .with_ignore_cert_errors(ignore_cert_errors) .with_min_snippet_hits(min_snippet_hits) .with_min_snippet_lines(min_snippet_lines) .with_honour_file_exts(honour_file_exts) .with_ranking(ranking) .with_ranking_threshold(ranking_threshold)) @@ - self.post_processor = ( - ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) if scan_settings else None - ) + self.post_processor = ( + ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) + if scanoss_settings + else None + )
♻️ Duplicate comments (3)
src/scanoss/scan_settings_builder.py (1)
101-120: Docstring priority description doesn’t match implementation.
with_ignore_cert_errors’s docstring says “CLI True > file_snippet > settings > False”, but the code uses_merge_with_priority(cli, file_snippet, root)which choosesfile_snippetfirst, then root, then CLI. This matches the class-level description (“file_snippet > settings > CLI”), not the method docstring. Please update the docstring to reflect the actual precedence.src/scanoss/scanner.py (2)
147-171: Skip-headers merge logic is fine; doc for helper should emphasise “settings > CLI”.The
_merge_cli_with_settingshelper is used here so thatfile_snippetsettings override CLI values forskip_headersandskip_headers_limit, with CLI acting as a fallback when the settings file doesn’t specify them. That behaviour is consistent and useful.However, the helper’s docstring later in the file still contains contradictory text (it both says “settings > cli” and “CLI taking priority over settings”). Consider cleaning that up to state clearly that settings take precedence over CLI for callers like this.
248-260: Clarify _merge_cli_with_settings docstring to match behaviour.
_merge_cli_with_settingsreturnssettings_valuewhen it is notNone, otherwise it falls back tocli_value, so the effective priority is settings > CLI. The docstring currently mixes both “settings > cli” and “CLI taking priority over settings”, which is contradictory and can mislead future maintainers. It would be better to describe it unambiguously as “settings override CLI; CLI is only used as a fallback when settings are unset.”
🧹 Nitpick comments (3)
src/scanoss/scanossapi.py (2)
58-95: Align constructor docstrings with actual default values.The
__init__docstrings still describehonour_file_exts,ranking, andranking_thresholdas defaulting toNone, but the function signature uses'unset'(for the booleans) andNonefor the threshold. It would be clearer to document the actual sentinel defaults and what they mean (e.g.'unset'→ “do not send; defer to server config”,Nonevs-1forranking_threshold).
293-326: Header construction is sound; consider updating the docstring header name.The
_build_scan_settings_headerlogic around sentinels (0,'unset',-1) and base64-encoded JSON looks correct and matches the CLI help. The docstring still refers to anx-scanoss-scan-settingsheader, whilescan()actually sendsscanoss-settings; updating the wording would avoid confusion.src/scanoss/scan_settings_builder.py (1)
63-67: Tighten type hints for honour_file_exts and ranking.
self.honour_file_extsandself.rankingare annotated asOptional[any], which refers to the builtin function rather than a meaningful type. Given the usage, something likeOptional[bool | str](to coverTrue/Falseand'unset') would be clearer for type-checkers and readers.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
CHANGELOG.md(2 hunks)CLIENT_HELP.md(1 hunks)scanoss.json(1 hunks)src/scanoss/__init__.py(1 hunks)src/scanoss/cli.py(5 hunks)src/scanoss/scan_settings_builder.py(1 hunks)src/scanoss/scanner.py(9 hunks)src/scanoss/scanoss_settings.py(1 hunks)src/scanoss/scanossapi.py(6 hunks)src/scanoss/scanpostprocessor.py(4 hunks)tests/data/scanoss.json(1 hunks)tests/test_scan_settings_builder.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- CHANGELOG.md
- scanoss.json
- CLIENT_HELP.md
- tests/test_scan_settings_builder.py
- src/scanoss/init.py
🧰 Additional context used
🧬 Code graph analysis (4)
src/scanoss/scan_settings_builder.py (2)
src/scanoss/scanoss_settings.py (5)
get_file_snippet_settings(339-345)get_file_snippet_proxy(419-425)get_proxy(403-409)get_http_config(411-417)get_file_snippet_http_config(427-433)src/scanoss/scanner.py (1)
_merge_cli_with_settings(249-260)
src/scanoss/scanpostprocessor.py (1)
src/scanoss/scanoss_settings.py (3)
is_legacy(315-317)get_bom_remove(210-218)get_bom_replace(220-228)
src/scanoss/cli.py (1)
src/scanoss/scanoss_settings.py (4)
ScanossSettings(76-433)load_json_file(103-138)set_file_type(140-152)set_scan_type(154-161)
src/scanoss/scanner.py (4)
src/scanoss/scan_settings_builder.py (10)
ScanSettingsBuilder(31-266)_merge_cli_with_settings(212-219)with_proxy(69-83)with_url(85-99)with_ignore_cert_errors(101-119)with_min_snippet_hits(121-134)with_min_snippet_lines(136-149)with_honour_file_exts(151-167)with_ranking(169-184)with_ranking_threshold(186-199)src/scanoss/scanoss_settings.py (1)
get_file_snippet_settings(339-345)src/scanoss/scanossapi.py (1)
ScanossApi(52-341)src/scanoss/scanpostprocessor.py (1)
ScanPostProcessor(76-289)
🪛 GitHub Actions: Lint
src/scanoss/scanpostprocessor.py
[error] 117-119: PLW2901: 'for' loop variable 'result' overwritten by assignment target. CI: make lint failed (exit code non-zero).
[error] 159-161: PLW2901: 'for' loop variable 'result' overwritten by assignment target. CI: make lint failed (exit code non-zero).
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (4)
tests/data/scanoss.json (1)
28-50: Test settings structure matches code expectations.The new
settings→file_snippetstructure (proxy/http_config plus snippet and ranking fields) aligns with the new getters inScanossSettingsand the merge logic inScanSettingsBuilder. Looks good for exercising the new path.src/scanoss/cli.py (1)
190-222: Snippet tuning flags are well-wired.The new
--min-snippet-hits,--min-snippet-lines,--ranking,--ranking-threshold, and--honour-file-extsoptions are plumbed cleanly through toScannerand ultimately intoScanSettingsBuilder/ScanossApi. Semantics (0 / -1 / 'unset' as “defer to server config”) match the help text and header-building logic.src/scanoss/scanoss_settings.py (1)
339-433: New settings accessors are consistent with schema and callers.The added getters for
file_snippetsettings, skip-headers values, and root/file_snippet proxy/http_config cleanly wrap the underlying JSON structure and match the keys used inScanSettingsBuilderand the testscanoss.json. No issues spotted here.src/scanoss/scanner.py (1)
342-346: dependency_analysis fallback is a nice extension.Using
file_snippet_settings.get('dependency_analysis', False)as a fallback whenSCAN_DEPENDENCIESis not set lets the settings file enable dependency scanning without extra CLI flags. This is a reasonable extension of behaviour and fits the overall configuration model.
c62cf42 to
9fb4a2b
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (2)
src/scanoss/scanner.py (2)
236-238: Condition checks wrong variable - still unfixed from past review.The condition checks
scan_settings(the builder object, which is always truthy after construction) but passesscanoss_settingstoScanPostProcessor. This means the post-processor will be created even whenscanoss_settingsisNone.🔎 Apply this diff to fix the condition:
self.post_processor = ( - ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) if scan_settings else None + ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) if scanoss_settings else None )
248-260: Priority order still contradicts PR objectives and has internal inconsistency.The PR objectives state: "priority: CLI > file_snippet > root settings", meaning CLI should override settings file values. However:
- The implementation (lines 258-260) gives
settings_valuepriority overcli_value- The docstring is internally inconsistent: line 250 says "settings > cli" but line 256 says "CLI taking priority"
This affects
skip_headersandskip_headers_limitoptions at lines 153-155.🔎 Apply this diff to fix the priority order:
@staticmethod def _merge_cli_with_settings(cli_value, settings_value): - """Merge CLI value with settings value (two-level priority: settings > cli). + """Merge CLI value with settings value (two-level priority: cli > settings). Args: cli_value: Value from CLI argument settings_value: Value from scanoss.json file_snippet settings Returns: - Merged value with CLI taking priority over settings + Merged value with CLI taking priority over settings file """ - if settings_value is not None: - return settings_value - return cli_value + if cli_value is not None: + return cli_value + if settings_value is not None: + return settings_value + return None
🧹 Nitpick comments (1)
src/scanoss/cli.py (1)
190-222: Minor formatting inconsistencies in choices lists.Lines 206 and 219 have inconsistent spacing in the
choicesarrays:
- Line 206:
['unset' ,'true', 'false']- space before comma- Line 219:
['unset','true', 'false']- missing space after 'unset'While this doesn't affect functionality, it's inconsistent with Python style conventions.
🔎 Apply this diff to fix formatting:
p_scan.add_argument( '--ranking', type=str, - choices=['unset' ,'true', 'false'], + choices=['unset', 'true', 'false'], default='unset', help='Enable or disable ranking (optional - default: server configuration)', ) p_scan.add_argument( '--ranking-threshold', type=int, default=None, help='Ranking threshold value. Valid range: -1 to 99. A value of -1 defers to server configuration (optional)', ) p_scan.add_argument( '--honour-file-exts', type=str, - choices=['unset','true', 'false'], + choices=['unset', 'true', 'false'], default='unset', help='Honour file extensions during scanning. When not set, defers to server configuration (optional)', )
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
CHANGELOG.md(2 hunks)CLIENT_HELP.md(1 hunks)scanoss.json(1 hunks)src/scanoss/__init__.py(1 hunks)src/scanoss/cli.py(5 hunks)src/scanoss/scan_settings_builder.py(1 hunks)src/scanoss/scanner.py(9 hunks)src/scanoss/scanoss_settings.py(1 hunks)src/scanoss/scanossapi.py(6 hunks)src/scanoss/scanpostprocessor.py(5 hunks)tests/data/scanoss.json(1 hunks)tests/test_scan_settings_builder.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (7)
- src/scanoss/scan_settings_builder.py
- tests/data/scanoss.json
- CHANGELOG.md
- CLIENT_HELP.md
- src/scanoss/scanoss_settings.py
- src/scanoss/scanpostprocessor.py
- tests/test_scan_settings_builder.py
🧰 Additional context used
🧬 Code graph analysis (2)
src/scanoss/scanossapi.py (2)
src/scanoss/scanossbase.py (1)
print_debug(58-63)src/scanoss/spdxlite.py (1)
print_debug(61-66)
src/scanoss/cli.py (1)
src/scanoss/scanoss_settings.py (3)
ScanossSettings(76-433)set_file_type(140-152)set_scan_type(154-161)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (11)
src/scanoss/__init__.py (1)
25-25: LGTM!Version bump to 1.43.0 aligns with the new scan engine tuning parameters feature.
scanoss.json (1)
18-21: LGTM!Adding the
scanoss-winnowing.pypackage tobom.includealigns with the optional fast winnowing dependency used in the codebase.src/scanoss/cli.py (3)
1410-1436: LGTM!The
wfpfunction correctly usesscanoss_settingsconsistently throughout, and properly loads and passes settings to the Scanner.
1631-1641: LGTM!The new tuning parameters (
min_snippet_hits,min_snippet_lines,ranking,ranking_threshold,honour_file_exts) are correctly wired from CLI arguments to the Scanner constructor.
2821-2830: LGTM!The
get_scanoss_settings_from_argshelper function provides a clean, reusable pattern for loading scan settings from CLI arguments, maintaining consistency with other command flows.src/scanoss/scanossapi.py (2)
293-326: LGTM!The
_build_scan_settings_headermethod correctly builds a base64-encoded JSON payload for scan tuning settings. The logic properly excludes default/unset values from the payload.Note: The double space issue flagged in a past review appears to be resolved in this version.
176-179: LGTM!The scan settings header is correctly integrated into the request flow, only adding the
scanoss-settingsheader when tuning parameters are configured.src/scanoss/scanner.py (4)
147-156: LGTM!The settings merging logic for
skip_headersandskip_headers_limitcorrectly retrieves values fromfile_snippet_settingsand merges them with CLI values. The debug output helps trace the final merged values.
172-184: LGTM!Clean use of the builder pattern to construct merged scan settings, properly chaining all tuning parameters from CLI and settings file sources.
344-345: LGTM!The fallback to
file_snippet_settings.get('dependency_analysis', False)correctly extends dependency scan detection to honor settings file configuration when not explicitly set via scan options.
409-421: LGTM!The
FileFiltersinstantiation correctly passesself.scanoss_settingsfor consistent settings propagation throughout the scanning workflow.
9fb4a2b to
c402c34
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
c402c34 to
c36da93
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
♻️ Duplicate comments (3)
src/scanoss/scanossapi.py (1)
295-332: Align header docstring/name and normalize boolean-like fields before JSON.
- The docstring says this builds an
x-scanoss-scan-settingsheader, butscan()actually sendsscanoss-settings. Either rename the header or update the docstring/comment so they agree.honour_file_extsandranking_enabledare written to JSON without normalization; through the normal Scanner + ScanSettingsBuilder path they are booleans, but direct callers can still pass string values ("true","false","unset"), which would be serialized as JSON strings.To make the API more robust and type-consistent, consider coercing these two fields to real booleans in
_build_scan_settings_header(similar to_str_to_boolin the builder) before dumping to JSON, while still respecting the'unset'sentinel.src/scanoss/scanner.py (2)
251-263: Docstring for_merge_cli_with_settingscontradicts implementation.The helper currently returns
settings_valuewhen it is notNone, otherwisecli_value, i.e. settings take priority over CLI. The docstring summary and “Returns” text still say “CLI taking priority over settings”, which is the opposite.Please update the docstring to reflect the actual behavior (settings > CLI), or, if the intent really is CLI > settings, adjust the implementation accordingly.
239-241: FixScanPostProcessorcreation condition to avoid crashes without settings.
self.post_processoris created whenscan_settings(the builder) is truthy, but you passscanoss_settingsintoScanPostProcessor. Whenscanoss_settingsisNone(no settings file), this still instantiatesScanPostProcessor(None), and laterpost_process()will fail onself.scanoss_settings.is_legacy().Condition should reflect the object actually required:
Proposed fix
- self.post_processor = ( - ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) if scan_settings else None - ) + self.post_processor = ( + ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) + if scanoss_settings + else None + )
🧹 Nitpick comments (3)
src/scanoss/scan_settings_builder.py (1)
63-68: Tighten types forhonour_file_extsandrankingattributes.
self.honour_file_extsandself.rankingare annotated asOptional[any], but in practice they areOptional[bool | str](bool from settings,'true'/'false'/'unset'from CLI). Narrowing these toOptional[bool | str](orOptional[Any]if you prefer) will make static analysis clearer and better document the intended value space.src/scanoss/cli.py (1)
218-222: Optionally enforce--ranking-thresholdrange at CLI.Help text documents a valid range of -1 to 99, but argparse doesn’t enforce it. Consider adding a simple range check (or
typewrapper) so invalid values are rejected early instead of relying on server-side validation.src/scanoss/scanpostprocessor.py (1)
178-205: Remove duplicateurl_statspop for clarity.
result.pop('url_stats', None)is called twice; the second call is redundant. It’s harmless at runtime but can be dropped to avoid confusion.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
CHANGELOG.mdCLIENT_HELP.mddocs/source/_static/scanoss-settings-schema.jsonscanoss.jsonsrc/scanoss/__init__.pysrc/scanoss/cli.pysrc/scanoss/data/scanoss-settings-schema.jsonsrc/scanoss/scan_settings_builder.pysrc/scanoss/scanner.pysrc/scanoss/scanoss_settings.pysrc/scanoss/scanossapi.pysrc/scanoss/scanpostprocessor.pytests/data/scanoss.jsontests/test_scan_settings_builder.py
🚧 Files skipped from review as they are similar to previous changes (5)
- CLIENT_HELP.md
- CHANGELOG.md
- scanoss.json
- docs/source/_static/scanoss-settings-schema.json
- tests/test_scan_settings_builder.py
🧰 Additional context used
🧬 Code graph analysis (3)
src/scanoss/scan_settings_builder.py (2)
src/scanoss/scanoss_settings.py (5)
get_file_snippet_settings(339-345)get_file_snippet_proxy(427-433)get_proxy(411-417)get_http_config(419-425)get_file_snippet_http_config(435-441)src/scanoss/scanner.py (1)
_merge_cli_with_settings(252-263)
src/scanoss/scanner.py (2)
src/scanoss/scan_settings_builder.py (9)
ScanSettingsBuilder(31-282)_merge_cli_with_settings(228-235)with_ignore_cert_errors(102-120)with_min_snippet_hits(122-135)with_min_snippet_lines(137-150)with_snippet_range_tolerance(152-165)with_honour_file_exts(167-183)with_ranking(185-200)with_ranking_threshold(202-215)src/scanoss/scanoss_settings.py (1)
get_file_snippet_settings(339-345)
src/scanoss/scanpostprocessor.py (1)
src/scanoss/scanoss_settings.py (3)
is_legacy(315-317)get_bom_remove(210-218)get_bom_replace(220-228)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (3)
src/scanoss/__init__.py (1)
25-25: LGTM! Version bump is appropriate.The minor version increment from 1.42.0 to 1.43.0 correctly reflects the addition of new features (snippet tuning parameters, ScanSettingsBuilder, extended CLI options) without breaking backward compatibility.
tests/data/scanoss.json (1)
28-50: Test settings fixture matches new schema and builder usage.The new
settingsblock (proxy, http_config, file_snippet) is consistent with the schema and ScanSettingsBuilder expectations and should provide good coverage for merged settings behavior.src/scanoss/scanoss_settings.py (1)
339-441: New file_snippet and proxy/http_config accessors look consistent and safe.The added getters (
get_file_snippet_settings, snippet tuning accessors, skip_headers helpers, and proxy/http_config accessors) cleanly mirror the schema structure and are side-effect free, which fits well with the builder/Scanner usage.
c36da93 to
a3ffb5a
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (5)
src/scanoss/data/scanoss-settings-schema.json (1)
167-245: Correcthonour_file_extsdescription to match behavior.The schema currently documents
honour_file_extsas “Ignores file extensions…”, which contradicts both the field name and the CLI/settings behavior (“True = honour, False = ignore, None/unset = defer to server configuration”). Please update the description text accordingly so schema and runtime semantics stay aligned.src/scanoss/scanner.py (4)
148-157: Guard file_snippet settings access for legacy settings files.The code unconditionally calls
get_file_snippet_settings()whenscanoss_settingsis not None, without checking if the settings file is legacy format. For legacy-style settings wheresettings.file_snippetmay not exist, this risks attribute or key errors.Consider adding a guard:
file_snippet_settings = ( scanoss_settings.get_file_snippet_settings() if scanoss_settings and not scanoss_settings.is_legacy() else {} )
240-240: Fix variable name in condition check.Line 240 passes
scanoss_settingstoScanPostProcessorbut the condition checksif scan_settings(the builder result). The condition should verify the parameter being passed:- ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) if scan_settings else None + ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) if scanoss_settings else None
251-263: Priority order contradicts PR objectives.The
_merge_cli_with_settingsmethod gives priority tosettings_valueovercli_value(lines 261-263), and the docstring on line 253 states "settings > cli". However, the PR objectives explicitly require: "priority: CLI > file_snippet > root settings".CLI arguments should override settings file values, but the current implementation does the opposite:
@staticmethod def _merge_cli_with_settings(cli_value, settings_value): - """Merge CLI value with settings value (two-level priority: settings > cli). + """Merge CLI value with settings value (two-level priority: cli > settings). Args: cli_value: Value from CLI argument settings_value: Value from scanoss.json file_snippet settings Returns: - Merged value with CLI taking priority over settings + Merged value with CLI taking priority over settings file """ + if cli_value is not None: + return cli_value if settings_value is not None: return settings_value - return cli_value + return None
347-348: Guard file_snippet settings access for legacy settings files.Similar to lines 148-157, this code unconditionally calls
get_file_snippet_settings()without checkingis_legacy(). For legacy settings files, this could cause errors:- file_snippet_settings = self.scanoss_settings.get_file_snippet_settings() if self.scanoss_settings else {} - return file_snippet_settings.get('dependency_analysis', False) + if not self.scanoss_settings or self.scanoss_settings.is_legacy(): + return False + file_snippet_settings = self.scanoss_settings.get_file_snippet_settings() + return file_snippet_settings.get('dependency_analysis', False)
🧹 Nitpick comments (3)
CLIENT_HELP.md (1)
262-315: Snippet tuning CLI docs look consistent; consider adding afalseexample for--honour-file-exts.The new section correctly matches the implemented flags and semantics (0/‑1 sentinels, aliases like
-srt). As a polish, you might add a short example showing--honour-file-exts falseto mirror therankingtrue/false examples.src/scanoss/cli.py (1)
1417-1444: Settings loading and propagation to Scanner/ScannerHFH are coherent; consider centralising the logic.The new
ScanossSettingsloading inwfp,scan, and viaget_scanoss_settings_from_argsforfolder-scan/folder-hashis consistent with the schema and preserves legacy vs new file handling. Passing the samescanoss_settingsobject intoScanner/ScannerHFHalongside the CLI tuning parameters gives the builder a single source of truth.As a future cleanup, you could reuse
get_scanoss_settings_from_args()(or a shared lower-level helper) inwfp/scanto avoid three slightly different settings-loading paths drifting over time.Also applies to: 1525-1539, 1604-1647, 2717-2783, 2829-2838
src/scanoss/scanpostprocessor.py (1)
81-101: Post‑processing wiring toScanossSettingsis solid; clarify/guard the non‑None invariant.The rename to
scanoss_settingsand use ofis_legacy(),get_bom_remove(), andget_bom_replace()makes the intent much clearer, and the new handling of list vs dict results in_load_component_info/_replace_purlsis a nice robustness improvement.This relies on
scanoss_settingsalways being a validScanossSettingsinstance; if a caller ever passesNone(for example, in a--skip-settings-filepath),post_process()would raise onself.scanoss_settings.is_legacy(). Either document thatscanoss_settingsis required and ensure all call sites honour that, or add a small guard (e.g. early‑return whenscanoss_settingsis falsy) to make the class safer to use.Also applies to: 112-121, 129-156, 159-164
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
CHANGELOG.mdCLIENT_HELP.mddocs/source/_static/scanoss-settings-schema.jsonscanoss.jsonsrc/scanoss/__init__.pysrc/scanoss/cli.pysrc/scanoss/data/scanoss-settings-schema.jsonsrc/scanoss/scan_settings_builder.pysrc/scanoss/scanner.pysrc/scanoss/scanoss_settings.pysrc/scanoss/scanossapi.pysrc/scanoss/scanpostprocessor.pytests/data/scanoss.jsontests/test_scan_settings_builder.py
🚧 Files skipped from review as they are similar to previous changes (4)
- src/scanoss/scan_settings_builder.py
- src/scanoss/init.py
- scanoss.json
- tests/test_scan_settings_builder.py
🧰 Additional context used
🧬 Code graph analysis (4)
src/scanoss/scanner.py (2)
src/scanoss/scan_settings_builder.py (11)
ScanSettingsBuilder(31-282)_merge_cli_with_settings(228-235)with_proxy(70-84)with_url(86-100)with_ignore_cert_errors(102-120)with_min_snippet_hits(122-135)with_min_snippet_lines(137-150)with_snippet_range_tolerance(152-165)with_honour_file_exts(167-183)with_ranking(185-200)with_ranking_threshold(202-215)src/scanoss/scanoss_settings.py (2)
get_file_snippet_settings(339-345)get_sbom(230-241)
src/scanoss/scanossapi.py (2)
src/scanoss/scanossbase.py (1)
print_debug(58-63)src/scanoss/spdxlite.py (1)
print_debug(61-66)
src/scanoss/cli.py (1)
src/scanoss/scanoss_settings.py (3)
ScanossSettings(76-441)load_json_file(103-138)set_file_type(140-152)
src/scanoss/scanpostprocessor.py (1)
src/scanoss/scanoss_settings.py (3)
is_legacy(315-317)get_bom_remove(210-218)get_bom_replace(220-228)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (10)
CHANGELOG.md (1)
11-22: Changelog entry and compare links are consistent.The 1.43.0 notes accurately describe the new tuning flags and settings, and the [1.42.0]/[1.43.0] compare links now point to the correct version ranges.
Also applies to: 776-777
tests/data/scanoss.json (1)
28-50: Test settings fixture matches the new schema and tuning semantics.The added
settingsblock (proxy, http_config, file_snippet) is structurally consistent with the schema and provides sensible non-default values for exercising merging and header construction.src/scanoss/cli.py (1)
190-229: Snippet tuning CLI flags are well-scoped and consistent with behavior.The new
scanoptions (--min-snippet-hits,--min-snippet-lines,--snippet-range-tolerance/-srt,--ranking,--ranking-threshold,--honour-file-exts) line up with the documented semantics (0/‑1 as “defer to server”, tri‑state ranking and honour flags) and only affect the scan subcommand, which is appropriate.src/scanoss/scanossapi.py (1)
58-113: Scan settings header construction looks correct; fix the header name in the docstring.The constructor and
_build_scan_settings_headercorrectly capture the tuning parameters, drop sentinel values (0/-1/'unset'), and emit a base64‑encoded JSON blob thatscan()attaches as thescanoss-settingsheader.The
_build_scan_settings_headerdocstring still refers tox-scanoss-scan-settings; updating it toscanoss-settingswill keep docs aligned with the actual header being sent.[suggest_minor_issue]
Also applies to: 176-183, 296-333
src/scanoss/scanner.py (5)
43-43: LGTM! Clean parameter additions for snippet tuning.The new imports and constructor parameters correctly support the snippet-matching tuning feature as described in the PR objectives.
Also applies to: 110-110, 113-118
174-186: Builder pattern usage looks good.The ScanSettingsBuilder integration follows good practice with method chaining and debug logging of the merged configuration.
193-208: API initialization correctly uses merged settings.The ScanossApi is properly initialized with all the merged settings from the builder, passing the tuning parameters as intended.
245-249: SBOM retrieval correctly updated.The SBOM is now properly read from
self.scanoss_settingsusing the newget_sbom()accessor.
416-416: Parameter threading looks correct.The
scanoss_settingsparameter is properly passed toFileFiltersin all three locations.Also applies to: 687-687, 1108-1108
src/scanoss/scanoss_settings.py (1)
339-441: LGTM! Clean accessor implementation.All new accessor methods follow a consistent pattern with:
- Clear docstrings explaining purpose and return values
- Proper type hints (Optional[int], Optional[bool], Optional[dict])
- Sensible defaults where appropriate (e.g.,
get_skip_headers_limitreturns 0,get_skip_headersreturns False)- Correct navigation of the data structure via
get_file_snippet_settings()The separation between root-level (
get_proxy,get_http_config) and file_snippet-level accessors (get_file_snippet_proxy,get_file_snippet_http_config) properly supports the priority hierarchy described in the PR objectives.
What's Changed
Added
--min-snippet-hits- Minimum snippet hits required (0 defers to server config)--min-snippet-lines- Minimum snippet lines required (0 defers to server config)--ranking- Enable/disable result ranking (unset/true/false)--ranking-threshold- Ranking threshold value (-1 to 99, -1 defers to server config)--honour-file-exts- Honour file extensions during matching (unset/true/false)file_snippetsection to scanoss.json settings schema for configuring tuning parametersScanSettingsBuilderclass for merging CLI and settings file configurations with priority: CLI > file_snippet > root settingsSummary by CodeRabbit
New Features
Configuration
Documentation
Tests
Other
✏️ Tip: You can customize this high-level summary in your review settings.