Skip to content

feat(modernize): Add PreferRemoteFile, PreferArchiveFile cops + CI fixes#1051

Open
SuhasKumar01 wants to merge 5 commits intochef:mainfrom
SuhasKumar01:feature/insecure-curl-wget-download
Open

feat(modernize): Add PreferRemoteFile, PreferArchiveFile cops + CI fixes#1051
SuhasKumar01 wants to merge 5 commits intochef:mainfrom
SuhasKumar01:feature/insecure-curl-wget-download

Conversation

@SuhasKumar01
Copy link
Copy Markdown

@SuhasKumar01 SuhasKumar01 commented Feb 2, 2026

Summary

Refactored based on maintainer feedback to replace the initial Security Cop with a comprehensive Modernization Suite, plus fixes CI/CD validation failures.

1. Chef/Modernize/PreferRemoteFile

Goal Discourage shell-based downloads (curl/wget)
Fix Suggests native remote_file resource
Scope 24 Tests
# ❌ Bad
execute 'curl https://example.com/file.tar.gz -o /tmp/file.tar.gz'

# ✅ Good
remote_file '/tmp/file.tar.gz' do
  source 'https://example.com/file.tar.gz'
  checksum 'sha256checksum'
end

2. Chef/Modernize/PreferArchiveFile

Goal Discourage shell-based extraction (tar, unzip, gunzip, gzip, 7z)
Fix Suggests native archive_file resource (Chef Infra Client 15.0+)
Scope 30 Tests
# ❌ Bad
execute 'tar -xzf /tmp/app.tar.gz -C /opt'

# ✅ Good
archive_file '/tmp/app.tar.gz' do
  destination '/opt'
end

3. Chef/Correctness/FileUtilsRMRF (Fix)

  • Fixed offense highlighting to target only FileUtils.rm_rf (not entire expression)
  • Fixed spec for RuboCop 1.84.2 compatibility (:config metadata)
  • Added missing doc YAML file

4. CI/CD Fixes

Missing Cop Registrations

Registered 6 cops in config/cookstyle.yml that caused rake validate_config to fail:

  • Chef/Correctness/FileUtilsRMRF
  • Chef/Ruby/GemspecRequireRubygems
  • Chef/Ruby/LegacyPowershellOutMethods
  • Chef/Ruby/GemspecLicense
  • Chef/Ruby/RequireNetHttps
  • Chef/Ruby/UnlessDefinedRequire

Lint Fixes

  • Removed extra blank line in fileutils_rm_rf_spec.rb (Layout/EmptyLinesAroundBlockBody)
  • Added .rubocop.yml exclusions for Chef/Ruby/UnlessDefinedRequire on infrastructure files (Rakefile, tasks/**/*)

Deep Dive Cleanup (Standardization)

  • Changed RSpec.describedescribe in fileutils_rm_rf_spec.rb (match all cop specs)
  • Standardized @example comments to use # bad / # good across all new cops
  • Added missing StyleGuide key to FileUtilsRMRF config entry
  • Created missing doc YAML for Chef/Correctness/FileUtilsRMRF
  • Aligned VersionAdded to 8.6.5 for all new cops (matching current release)

Design Decisions

No Autocorrect

Both cops intentionally disable autocorrect to prevent breaking:

  • Piped commands (e.g., curl | bash)
  • Complex flags (e.g., tar -xzf vs tar -cvf)

Performance

Both use RESTRICT_ON_SEND to limit scanning to shell resources only:

  • execute, bash, powershell_script, sh, csh, perl, python, ruby, zsh

False Positive Prevention

  • PreferRemoteFile: Avoids flagging libcurl, curlcache, wgetrc
  • PreferArchiveFile: Avoids flagging .tar.gz, tarball, stargazer, guitar

Test Results

Component Count Status
PreferRemoteFile 24 ✅ Passing
PreferArchiveFile 30 ✅ Passing
FileUtilsRMRF 3 ✅ Passing
rake validate_config ✅ All Cops found
Full Suite 1022 ✅ 0 failures
Cookstyle Lint 548 files ✅ 0 offenses
SonarQube ✅ Quality Gate Passed

Files Changed

File Status
lib/rubocop/cop/chef/modernize/prefer_remote_file.rb ➕ Added
spec/rubocop/cop/chef/modernize/prefer_remote_file_spec.rb ➕ Added
lib/rubocop/cop/chef/modernize/prefer_archive_file.rb ➕ Added
spec/rubocop/cop/chef/modernize/prefer_archive_file_spec.rb ➕ Added
lib/rubocop/cop/chef/correctness/fileutils_rm_rf.rb 🔧 Fixed
spec/rubocop/cop/chef/correctness/fileutils_rm_rf_spec.rb 🔧 Fixed
config/cookstyle.yml 📝 Modified (new cops + 6 missing registrations)
.rubocop.yml 📝 Modified
docs-chef-io/assets/cookstyle/cops_chef_modernize_preferremotefile.yml ➕ Added
docs-chef-io/assets/cookstyle/cops_chef_modernize_preferarchivefile.yml ➕ Added
docs-chef-io/assets/cookstyle/cops_chef_correctness_fileutilsrmrf.yml ➕ Added

Checklist

  • DCO signoff included
  • Tests pass locally (1022/1022, 0 failures)
  • rake validate_config passes — all cops registered
  • Configuration added to cookstyle.yml with StyleGuide keys
  • Uses RESTRICT_ON_SEND for performance
  • Code review feedback addressed (method deduplication)
  • Extraneous files removed
  • CI/CD validation issues fixed
  • Doc YAML files for all new cops
  • @example comments use standard # bad / # good format
  • VersionAdded aligned to 8.6.5 for all new cops

@SuhasKumar01 SuhasKumar01 requested review from a team and jaymzh as code owners February 2, 2026 06:07
@dafyddcrosby
Copy link
Copy Markdown
Collaborator

Hmm, this actually seems like it should be two different cops that this PR kind of wants as an end goal.

  • 1st cop - if you see wget/curl/etc in an execute block, you should be suggesting the remote_file resource: https://docs.chef.io/resources/remote_file/
  • 2nd cop - if you're trying to disable TLS in remote_file resources, flag for that.

Trying to track and parse the flags for every single curl/wget tool is a losing game in the long run. But the two cops I suggest are much easier to implement and maintain :-)

@dafyddcrosby dafyddcrosby added the Status: Waiting on Contributor A pull request that has unresolved requested actions from the author. label Feb 3, 2026
@SuhasKumar01
Copy link
Copy Markdown
Author

@dafyddcrosby,
Thanks for the review! That makes perfect sense. Trying to maintain regex for every possible CLI flag is definitely a losing battle long-term.

I agree that Cop #1 (Suggesting remote_file over curl/wget) is the cleaner, more "Chef-idiomatic" approach.

I will refactor this PR right now to implement that specific check—flagging any use of curl/wget in execute blocks and recommending remote_file instead. Updating shortly!

@SuhasKumar01 SuhasKumar01 changed the title Add Chef/Security/InsecureCurlWgetDownload cop Add Chef/Modernize/PreferRemoteFile cop Feb 4, 2026
@SuhasKumar01 SuhasKumar01 changed the title Add Chef/Modernize/PreferRemoteFile cop Add Chef/Modernize Cops: PreferRemoteFile & PreferArchiveFile (Modernization Suite) Feb 5, 2026
@SuhasKumar01 SuhasKumar01 changed the title Add Chef/Modernize Cops: PreferRemoteFile & PreferArchiveFile (Modernization Suite) feat(modernize): Add Modernization Suite (PreferRemoteFile + PreferArchiveFile) Feb 5, 2026
@SuhasKumar01 SuhasKumar01 force-pushed the feature/insecure-curl-wget-download branch from d707da9 to e289a9b Compare February 5, 2026 11:00
@SuhasKumar01
Copy link
Copy Markdown
Author

@jaymzh
Ready for review!

Based on the feedback that parsing CLI flags is a "losing battle," I have refactored this PR into a complete Modernization Suite that encourages native Chef resources over shell commands.

  1. Refactored: Chef/Modernize/PreferRemoteFile

Goal: Flag curl/wget usage in shell blocks.

Fix: Suggests remote_file resource.

Safety: Autocorrect disabled to preventing breaking piped commands (e.g., curl | bash).

  1. Added: Chef/Modernize/PreferArchiveFile

Goal: Flag tar, unzip, 7z usage.

Fix: Suggests archive_file resource (Chef 15+).

Context: Since curl is usually followed by extraction, this completes the artifact handling pipeline.

Summary:

Performance: Both cops use RESTRICT_ON_SEND to limit scanning to shell resources.

Verification: Added 57 tests covering download patterns, archive patterns, and false positive prevention.

Looking forward to your thoughts on this architectural pivot!

@jaymzh
Copy link
Copy Markdown
Collaborator

jaymzh commented Feb 10, 2026

I haven't reviewed the code yet, I will, but just wanted to clarify here:

Safety: Autocorrect disabled to preventing breaking piped commands (e.g., curl | bash).

Just to be clear... piping commands like that in an execute block is bad practice. If you want to do that, you use an remote_file, which can notify an execute resource (or an archive_file resource). We shouldn't be making rules to make bad code slightly less bad, we should be making rules to make bad code good.

All of these things can and should be done with native resources, not with shell blocks.

But yes, this is a hard one to autocorrect.

Comment thread .github/instructions/codacy.instructions.md Outdated
Comment thread lib/rubocop/cop/chef/modernize/prefer_archive_file.rb Outdated
return unless curl_or_wget_command?(first_arg.value)

add_offense(first_arg)
end
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, these two are the same.

Comment thread Video_Script_FINAL.md Outdated
Comment thread InsecureCurlWgetDownload_Report.md Outdated
Comment thread InsecureCurlWgetDownload_Presentation.md Outdated
Comment thread .gitignore Outdated
@SuhasKumar01
Copy link
Copy Markdown
Author

@jaymzh Thanks for the review!

Garbage Files: My apologies, I've removed the extra markdown files and .gitignore.

Refactoring: Good catch on the duplicate methods. I have collapsed check_resource_name and check_command_property into a single helper method to keep it DRY.

Safety: And yes, agreed on the philosophy regarding "bad code." I've left the Autocorrect disabled as discussed.

Pushed the fixes! Ready for another look.

@SuhasKumar01 SuhasKumar01 force-pushed the feature/insecure-curl-wget-download branch from 434b0e5 to 1e6a3ca Compare February 11, 2026 09:10
@SuhasKumar01
Copy link
Copy Markdown
Author

@jaymzh

Thanks for the coaching. You were right—having duplicate check methods was sloppy, and "making bad code slightly less bad" is the wrong goal.

Updates:

  • Refactored (DRY): Collapsed the logic into a single check_offense helper in both cops.
  • Git Hygiene: I realized I had committed some local config files by mistake. I performed a squash and force-push to clean up the commit history. The PR diff is now clean.
  • Philosophy: I reviewed the MSG (error messages) to ensure they explain why native resources are better (idempotency/retries), reinforcing the "make code good" mindset.

Ready for re-review!

@jaymzh
Copy link
Copy Markdown
Collaborator

jaymzh commented Feb 11, 2026

@SuhasKumar01 SuhasKumar01 force-pushed the feature/insecure-curl-wget-download branch from 1e6a3ca to 915bba5 Compare February 12, 2026 11:30
@SuhasKumar01
Copy link
Copy Markdown
Author

@jaymzh

check again now i think everything has been resolved

@jaymzh
Copy link
Copy Markdown
Collaborator

jaymzh commented Feb 12, 2026

you still have many failures.

Chef/Modernize/PreferRemoteFile is being triggered in a Rakefile, which it shouldn't. which is what was failing before.

I see no delta between your last push and this one.

There's line ending failures.

@SuhasKumar01
Copy link
Copy Markdown
Author

@jaymzh

check again now i think everything has been resolved.

…chiveFile)

Adds two new cops to discourage shell-based artifact handling in favor of native resources.

Signed-off-by: Suhas Kumar <theshahmedia1@gmail.com>
Signed-off-by: Suhas Kumar <theshahmedia1@gmail.com>
@SuhasKumar01 SuhasKumar01 force-pushed the feature/insecure-curl-wget-download branch from 481132c to baebdaa Compare February 16, 2026 11:33
… spec

- Add 6 missing cop entries to config/cookstyle.yml:
  Chef/Correctness/FileUtilsRMRF, Chef/Ruby/GemspecRequireRubygems,
  Chef/Ruby/LegacyPowershellOutMethods, Chef/Ruby/GemspecLicense,
  Chef/Ruby/RequireNetHttps, Chef/Ruby/UnlessDefinedRequire
- Fix FileUtilsRMRF cop to highlight only the method call, not args
- Fix FileUtilsRMRF spec to use :config metadata for RuboCop 1.84.2
- validate_config now passes; full suite: 1022 examples, 0 failures

Signed-off-by: Suhas Kumar <theshahmedia1@gmail.com>
@SuhasKumar01 SuhasKumar01 changed the title feat(modernize): Add Modernization Suite (PreferRemoteFile + PreferArchiveFile) feat(modernize): Add PreferRemoteFile, PreferArchiveFile cops + CI fixes Feb 18, 2026
@SuhasKumar01
Copy link
Copy Markdown
Author

@jaymzh

Hi, I updated my PR several times figuring out every possible error and solved it so please guide further.

- Remove extra blank line in fileutils_rm_rf_spec.rb (Layout/EmptyLinesAroundBlockBody)
- Exclude project infrastructure files from Chef/Ruby/UnlessDefinedRequire cop
  (Rakefile, cleanup_lint_roller.rb, tasks/) in .rubocop.yml

Signed-off-by: Suhas Kumar <theshahmedia1@gmail.com>
- Change RSpec.describe -> describe in fileutils_rm_rf_spec.rb (match all cop specs)
- Standardize @example comments to use # bad / # good across all new cops
- Add missing StyleGuide key to FileUtilsRMRF config entry
- Create missing doc YAML for Chef/Correctness/FileUtilsRMRF
- Align VersionAdded to 8.6.5 for all new cops (matching current release)

Signed-off-by: Suhas Kumar <theshahmedia1@gmail.com>
@sonarqubecloud
Copy link
Copy Markdown

@jaymzh jaymzh self-requested a review February 18, 2026 20:39
@jaymzh jaymzh removed the Status: Waiting on Contributor A pull request that has unresolved requested actions from the author. label Feb 18, 2026
@jaymzh
Copy link
Copy Markdown
Collaborator

jaymzh commented Feb 18, 2026

Thanks, looking much better. We'll give this a (hopefully) final review at next week's PR review meeting.

@SuhasKumar01
Copy link
Copy Markdown
Author

Thank you very much @jaymzh

Copy link
Copy Markdown
Collaborator

@dafyddcrosby dafyddcrosby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting closer!

'Native resources handle idempotence and multiple formats natively.'

# Performance: only trigger on_send for shell resources and command/code property setters
RESTRICT_ON_SEND = %i(execute bash sh csh perl python ruby zsh powershell_script command code).freeze
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
RESTRICT_ON_SEND = %i(execute bash sh csh perl python ruby zsh powershell_script command code).freeze
RESTRICT_ON_SEND = %i(execute bash csh perl python ruby powershell_script command code).freeze

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see sh or zsh as documented resources in docs.chef.io, am I missing something here?
This appears to be missing the ksh resource. There's no code resource, so this could end up in a false positive where code isn't a resource property.


# Matches extraction tools as whole words.
# Excludes filenames like 'file.tar.gz' by ensuring a leading space/start and trailing space/end.
ARCHIVE_COMMAND_REGEX = /(?:^|\s)(?:tar|unzip|gunzip|gzip|7z)(?:\s|$)/.freeze
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The benefit of having this as a constant is that you can test the strings directly against it


# Matches 'curl' or 'wget' as whole words to avoid false positives (e.g., 'libcurl')
# Regex: start-of-string OR whitespace, then curl/wget, then whitespace OR end-of-string
DOWNLOAD_COMMAND_REGEX = /(?:^|\s)(?:curl|wget)(?:\s|$)/.freeze
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

'Native resources ensure idempotence, support retries, and handle permissions correctly.'

# Performance: only trigger on_send for shell resources and command/code property setters
RESTRICT_ON_SEND = %i(execute bash sh csh perl python ruby zsh powershell_script command code).freeze
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

describe RuboCop::Cop::Chef::Modernize::PreferArchiveFile, :config do
let(:msg) { 'Use the `archive_file` resource instead of shell-based extraction (`tar`, `unzip`, etc.). Native resources handle idempotence and multiple formats natively.' }

# ==========================================================================
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's several permutations that are missing. If you loop this through the expected resources+shellout stings instead you can be exhaustive in your tests. See https://www.rubydoc.info/gems/rubocop/RuboCop/RSpec/ExpectOffense for how to do dynamic offense lengths

context 'when using execute with curl' do
it 'registers an offense for execute resource with curl command' do
expect_offense(<<~RUBY)
execute 'curl https://example.com/file.tar.gz -o /tmp/file.tar.gz'
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

@jaymzh
Copy link
Copy Markdown
Collaborator

jaymzh commented Apr 21, 2026

@SuhasKumar01 - following up - do you still want to work on this?

@jaymzh jaymzh added the Status: Waiting on Contributor A pull request that has unresolved requested actions from the author. label Apr 21, 2026
@SuhasKumar01
Copy link
Copy Markdown
Author

SuhasKumar01 commented Apr 22, 2026 via email

@jaymzh
Copy link
Copy Markdown
Collaborator

jaymzh commented Apr 22, 2026

@SuhasKumar01 - OK, then please address David's comments. I'll keep this open for a bit longer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Status: Waiting on Contributor A pull request that has unresolved requested actions from the author.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants