mcurl: support passing curl options with various new features and fixes #10
Open
tomlau10 wants to merge 12 commits into WanghongLin:master from
Conversation
mcurl is truly a great tool! 👍

In one of my project workflows, I needed to download from a website protected by Digest Auth, on a VM running an old Linux. However, most multi-threaded download tools such as `aria2` don't support it, and `wget2` couldn't be installed on that VM, which left only `wget` and `curl` as options: https://curl.se/docs/comparison-table.html

So I went looking for a way to turn wget / curl into a multi-threaded downloader, and eventually found mcurl! However, the current version of mcurl has no way to pass custom curl options such as `--digest -u "${USER}:${PASSWD}"`. I initially just edited the script for my own use, but that approach isn't general.

I then tried making more changes to the script, and fixed some bugs along the way while testing. The changes kept growing, so I figured I might as well open a PR 😃

There are quite a few modifications; I've tried to split the commits as carefully and clearly as possible.
New Features

- All arguments after the url are treated as curl options, e.g. `./mcurl.sh -s4 https://some.url -L --digest -u "${USER}:${PASSWD}"`; the trailing `-L --digest -u "${USER}:${PASSWD}"` is passed to every underlying curl call (sketched below).
- Clean up when the script is interrupted with ctrl+c: `kill -- -$$` kills the background curl processes together with the script.
- Prompt via `rm -i` before removing existing files; the operation is canceled if the reply isn't `y`. A new `-f|--force` option switches to `rm -f` so there is no prompt.
- Progress is shown as a percentage computed from `downloaded file size * 100 / total size`.
- git bash has no `pgrep`, so on an AI suggestion I switched to the `jobs` builtin, which should be portable; `pkill "^curl"` was used for testing.
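For illustration, here is a minimal sketch of the pass-through and cleanup behaviour described above. It is not the actual mcurl.sh code; the `url` and `curl_opts` names are hypothetical, and the single ranged curl call merely stands in for the per-slice downloads.

```bash
#!/usr/bin/env bash
# Sketch only: after the script's own flags (-s<n>, -f|--force, ...) have been
# consumed, the first remaining argument is the url and everything after it
# is collected verbatim and appended to every underlying curl call.
url="$1"; shift
curl_opts=("$@")                       # e.g. -L --digest -u "user:pass"

# ctrl+c cleanup: kill the whole process group so the background curl
# processes go down together with the script
trap 'kill -- -$$' INT

# each slice download then re-uses the extra options, roughly like:
curl --silent --range "0-1048575" "${curl_opts[@]}" -o part.0 "$url" &
wait
```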
Optimizations

- The first slice can simply be `mv`-ed into place, saving one `cat` (see the sketch below).
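A rough sketch of that merge step, with hypothetical `output_file` and `num_slices` names rather than the script's actual variables:

```bash
# The first slice becomes the output file via a rename instead of a copy,
# saving one full read/write pass; the remaining slices are then appended.
mv part.0 "$output_file"
for ((i = 1; i < num_slices; i++)); do
    cat "part.$i" >> "$output_file" && rm -f "part.$i"
done
```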
Fixes

- Calculating size with `du` and a 1024 blocksize was inaccurate; switched to `wc -c`, which also seems more portable 🤔. Checked with `strace` that `wc -c {files}` uses `stat()` under the hood instead of actually reading the files (coreutils does `fstat()` then `lseek()`: https://github.com/coreutils/coreutils/blob/3a5c9c5537227eafc38c5657024584cdad63112a/src/wc.c#L340C1-L369C34). A sketch follows this list.
- The `running jobs == 0` check could fail: at that moment the count really is 0, but some jobs have not been spawned yet; the `running jobs == 0` condition in the main loop was fixed accordingly.
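A hedged sketch of the bookkeeping described above; the `part.*` names and the `total_size` variable are illustrative (e.g. `total_size` would come from the server-reported content length), not the script's actual variables:

```bash
# wc -c reports exact byte counts (fstat() + lseek(), no full read of the
# files), unlike du with a 1024 blocksize, so the percentage stays accurate.
downloaded=$(wc -c part.* 2>/dev/null | awk 'END { print $1 + 0 }')
percent=$(( downloaded * 100 / total_size ))
echo "progress: ${percent}%"

# counting running background jobs with the bash builtin instead of pgrep,
# which git bash does not ship
running=$(jobs -rp | wc -l)
```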
Tested Environments

- GNU bash, version 5.1.16(1)-release (x86_64-pc-linux-gnu)
- GNU bash, version 5.2.37(1)-release (x86_64-pc-msys)
- GNU bash, version 5.2.21(1)-release (x86_64-pc-cygwin)
- GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin20)
- GNU bash, version 4.2.46(1)-release (x86_64-redhat-linux-gnu)

Also, this site has test files of various sizes if more testing is needed: https://www.thinkbroadband.com/download (example invocations below).
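For reference, a couple of example invocations tying this together; the test-file path below is a placeholder for whatever size you pick from that page, and the credentials are placeholders too:

```bash
# plain multi-slice download of a public test file (placeholder file name)
./mcurl.sh -s4 https://download.thinkbroadband.com/100MB.zip

# the same against a Digest-Auth-protected url, forwarding extra curl options
./mcurl.sh -s4 https://some.url -L --digest -u "${USER}:${PASSWD}"
```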