Note
Almost every release contains bugfixes, but these are not usually included in the changelog. If a release contains only bugfixes, it is marked as a 'bugfix release'. Otherwise, the changelog entries highlight only new or changed functionality.
- The entire scripting deobfuscation system was rewritten from scratch. The old deobfuscation units (`deob-ps1-*`, `deob-vba-*`, `deob-js-*`) have been replaced by full parsers and emulators for PowerShell, VBA, and JavaScript. The new units are `ps1`, `vba`, `js`, and the universal `defu`, which auto-detects the language.
- The new `cmdarg` unit extracts and unescapes arguments passed to `wmic`, `cmd`, `start`, or `powershell` commands.
- The `xtnsis` unit now includes a script decompiler that outputs human-readable NSIS source.
- The `swf` unit was added for extracting resources from Flash SWF files.
- The `base` unit is now purely block-based (RFC 4648 style). Big-integer encoding has been split into a new `bigint` unit.
- The `escvb` unit and the corresponding `vbastr` pattern for `carve` now support concatenations of quoted string literals and VB constants.
- Several compression libraries (`aplib`, `blz`, `lzjb`, `lzw`, `xpress`, `mscf`) were cythonized for improved performance.
- Improves `xtrar` performance by cythonizing parts of the code.
- Fixes a regression where `xor file` would no longer read the contents of a file named `file` from disk.
- Improved help text descriptions for `carve` and `xtp`, exposing named patterns with descriptions.
- The modular dependency system was simplified; there are now simply the options `default`, `extended`, and `all` on top of base refinery.
- The `unq` unit for restoring AV-quarantined files was added. It currently has near zero test coverage and is based entirely on unquarantining lore.
This release contains the new `-g` switch in `binref` for AI agents.
This is a test release for enabling trusted publishing with PyPI.
- The `xtsf` unit was extended with support for SetupFactory 10.
- The `hostname` regex pattern was renamed to `host`.
- The `py7zr` dependency was eliminated and refinery's 7zip unpacker now supports all archives.
- The `png` unit was added for extracting PNG data chunks.
- The `xlmdeobf` backend code was inlined as a third-party module that ships with refinery, while swapping out its recursive dependencies for what we already have. This further trims refinery's dependency tree.
- A `--brief` (`-b`) mode was added to the `binref` command. This is primarily intended as a reference document to get an overview of all units.
- This command together with a newly authored `SKILL.md` file are the first attempts to teach language-model-based AI agents to use binary refinery.
- Compatibility with PowerShell was increased: Previously, complex framing syntax at the end of a unit had to be quoted in PowerShell to work; this is no longer the case.
- Adds the `xtsf` unit for extracting SetupFactory binaries.
- Adds the `cbor` unit for parsing the CBOR format.
- Adds the `asn1` unit for generic ASN.1 structure parsing.
- The `argon2i` unit was renamed to `argon2` and supports all variants of Argon2 now.
- Adds the `xtrar` unit for extracting RAR archives.
- Adds the `xtdmg` unit for extracting DMG volumes.
- Adds `carve-elf`, `carve-pdf`,
- Adds the `a3xs` unit for extracting only the script part of an AutoIt3 binary.
- Adds the `sm3`, `aria`, `twofish`, `scrypt`, and `simon` cryptographic units.
- Adds the `plist`, `puny`, and `qp` units for various common encodings.
- Adds the `dsstore` unit for parsing macOS `.DS_Store` files.
- The `mscf` unit now also supports LZMS.
- Adds the `dncode` unit for extracting the MSIL code from functions in a .NET assembly.
- The `xtxs` unit was renamed to `xtmdb` and is now supported via `xt`.
- The `lnk` unit now supports URL shortcuts much better.
- The `xtvba` unit was renamed to `vbamc` to better align with `vbapc`.
A large number of external dependencies were eliminated from the code base, replacing them with internal parsers. This includes:
- `asn1crypto`
- `LnkParse3`
- `phpdeserialize`
- `access-parser`
- `libzbar`
- `pycdlib`
- `olefile`
- `oletools`
- `extract-msg`

In most cases, this improves the support of those formats, with one exception:
The `qr` unit can no longer parse arbitrary barcodes but is restricted to QR codes.
- Switches `-T` in the units `alu` and `pemeta` were removed to add a new global option named `--try`: This option allows forwarding the input when a unit fails to process it.
- Adds a proper parser and emulator for Batch files which is exposed via the `bat` unit. The unit now has some switches to control how emulated commands are synthesized.
- A number of CPU-intensive tasks have been extended with a parallel Cython implementation, specifically PKZIP crypto, AutoIt decryption & decompression, and the 7zip ported code for LZX and deflate.
- Execution performance has been improved by several hundred milliseconds by delaying imports and entirely avoiding unnecessary imports in some cases.
- The `idbmeta` unit was added to read metadata from older IDA databases.
- The unit `carve-tar` for carving TAR files was added.
- The unit `carve-png` for carving PNG files was added.
- The `docmeta` unit now exposes more of the embedded metadata of Word documents.
- The `xj0` unit no longer has size restrictions on extracted strings.
- Adds the `jpeg` unit for parsing out JPG data streams and some metadata.
- The `edit` unit was added which allows overwriting any part of the input data with a binary string.
- The `xtzip` unit now supports Deflate64.
- Adds the `xtrpa` unit for extracting RenPy archives, because Karsten did it and why not.
- The `xtinno` unit (and thereby `xt`) now supports Inno Setup up to version 6.6.1.
- The `innopwd` unit has improved logging and can intercept passwords in more cases.
- The `xtdmp` unit was added for extracting files from Minidumps.
- Refinery was extended by a custom ZIP archive parser which can detect data caves and should support all archives that were previously supported, and probably more.
- The `hlg`, `hls`, and `hlb` units were added for easier access to different style flavors of syntax highlighting.
- The `bruteforce` unit now accepts integer intervals as argument rather than Python slices. This means that `1:4` includes `4`.
- Various units for FNV hashing were added, i.e. `fnv0`, `fnv1`, `fnv1a`, and shortcuts for various bit sizes.
- This release also includes significant performance improvements to the .NET parser.
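
For readers unfamiliar with the FNV family mentioned above, the following is a minimal, self-contained sketch of FNV-1 and FNV-1a using the published 32-bit and 64-bit parameters. The function names and the `bits` parameter are illustrative only and are not taken from refinery's implementation.

```python
# Published FNV offset bases and primes, keyed by output width in bits.
FNV_PARAMS = {
    32: (0x811C9DC5, 0x01000193),
    64: (0xCBF29CE484222325, 0x100000001B3),
}


def fnv1(data: bytes, bits: int = 32) -> int:
    """FNV-1: multiply by the prime first, then XOR in the next byte."""
    basis, prime = FNV_PARAMS[bits]
    mask = (1 << bits) - 1
    h = basis
    for byte in data:
        h = (h * prime) & mask
        h ^= byte
    return h


def fnv1a(data: bytes, bits: int = 32) -> int:
    """FNV-1a: XOR in the next byte first, then multiply by the prime."""
    basis, prime = FNV_PARAMS[bits]
    mask = (1 << bits) - 1
    h = basis
    for byte in data:
        h ^= byte
        h = (h * prime) & mask
    return h


print(hex(fnv1(b"a")), hex(fnv1a(b"a")))
```

The two variants differ only in the order of the XOR and the multiplication. FNV-0 uses the same loop as FNV-1 with an offset basis of zero.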
- The `hl` unit was added for highlighting source code on the terminal.
- The `xtmsi` unit now supports Advanced Installer binaries which include the MSI with obfuscated header.
- The .NET parsing library was completely refactored, so there will certainly be bugs.
- The `sqlite` unit was added, thanks to @oxitocin.
- The `codebook` unit was added.
- Several `carve` unit formats were renamed to shorter versions.
- The `d2p` unit was added.
- The `carve` unit now supports the `strarray` pattern for arrays of strings, but no decoder for it is implemented yet because it is unclear what that should be.
- The `vstack` unit has changed to allow for slice-style address arguments, more easily allowing you to specify a stop address.
- The unicorn and icicle-backed emulators in refinery (specifically `vstack`) now also have rudimentary support for API hooking and logging.
- The `binref` command now prints the current binary-refinery version.
- The `xtmsi` unit now exposes several metadata streams that were discarded before, one important example being the digital signature.
- Changes have been made to how the `xtxml` unit constructs paths into the document structure; in cases where tag names are unique, these will now be prioritized.
- The backend for the `bat` unit was swapped for a new and rudimentary emulator; this will be improved upon in future releases.
- The `decompress` unit was extended with a new heuristic tier: If any decompression produces output that is recognized as matching a known format, the unit will pick the best result from among all results with this property.
- Type hints across the refinery code base have undergone massive refactoring and should be better compatible with modern type checkers now.
- The `xtchm` unit was added for extracting CHM (Windows Help) files.
- The change to `--join-path` from v0.7.4 was reverted; this switch for path extraction units no longer alters the joined path based on files on disk. This makes its behavior completely deterministic again. Deconflicting paths is now done by `dump`: The unit implements the same logic in trying to find compatible paths on disk that do not conflict with existing files.
- The `pdf` unit has been extended with support for extracting images based on the `mupdf` library.
- The `pdfcrypt` unit was added for removing & adding passwords from & to PDF documents.
- There is now a `qr` unit for decoding barcodes (especially QR codes). It requires the ZBar shared library to work, but that really seems to be the only way it can be done.
- The `z85` unit for yet another Base85 encoding scheme was added.
- The `flz` unit implementing FastLZ was added and is now included in the universal decompressor.
- The `xtzip` unit now allows extracting archives partially, i.e. a single password-protected file does not completely fail the extraction process. When decryption fails, it is also possible to extract the raw encrypted chunk using the `--lenient` option.
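
As background for the Z85 entry above, here is a small sketch of the encoding as described in the ZeroMQ Z85 specification: every 4 input bytes are read as a big-endian 32-bit integer and written as 5 base-85 digits. The helper names are hypothetical and this is not refinery's code.

```python
import struct

# The 85-character alphabet from the ZeroMQ Z85 specification.
Z85_ALPHABET = (
    "0123456789abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ.-:+=^!/*?&<>()[]{}@%$#"
)


def z85_encode(data: bytes) -> str:
    """Encode bytes whose length is a multiple of 4."""
    if len(data) % 4:
        raise ValueError("Z85 input length must be a multiple of 4")
    out = []
    for (value,) in struct.iter_unpack(">I", data):
        digits = []
        for _ in range(5):
            value, digit = divmod(value, 85)
            digits.append(Z85_ALPHABET[digit])
        out.extend(reversed(digits))  # most significant digit first
    return "".join(out)


def z85_decode(text: str) -> bytes:
    """Decode a string whose length is a multiple of 5."""
    if len(text) % 5:
        raise ValueError("Z85 text length must be a multiple of 5")
    out = bytearray()
    for k in range(0, len(text), 5):
        value = 0
        for char in text[k:k + 5]:
            value = value * 85 + Z85_ALPHABET.index(char)
        out += struct.pack(">I", value)
    return bytes(out)


print(z85_encode(bytes.fromhex("864FD26FB559F75B")))
```

The byte string used here is the test vector from the specification; it encodes to `HelloWorld`.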
- Adds the `pbuf` unit to decode ProtoBuf messages heuristically.
- The `djb2` unit was added for computing the DJB2 hash.
- The `mscf` decompression algorithms were added to the universal decompressor.
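
The DJB2 hash referenced above is commonly defined as `h = h * 33 + byte`, starting from 5381. The sketch below follows that common definition and truncates to 32 bits, which is an assumption; it does not claim to match the exact variant or output width used by the `djb2` unit.

```python
def djb2(data: bytes, bits: int = 32) -> int:
    """Additive DJB2 variant; the 32-bit truncation is an assumption."""
    mask = (1 << bits) - 1
    h = 5381
    for byte in data:
        h = (h * 33 + byte) & mask
    return h


print(djb2(b"a"))
```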
- The `xtinno` unit now supports Inno Setup up to version 6.4.3.
- The `xt` unit now also extracts AutoIt3 samples.
- API tracing via SpeakEasy in `vstack` can now be switched on and off with a separate switch.
- The `cfmt` unit was renamed to `pf`, short for "Print Format".
- The `carve-der` unit was added.
- The `argon2id` key derivation unit was added.
- The `pecdb` and `pefix` units were added.
- The `pkw` unit for decompressing PKWare was added.
- In the `struct` unit, it is now possible to peek a struct entry by specifying an alignment value of zero.
- Adds the `--tag` and `--aad` flags to cipher units that support message authentication.
- Thanks to @larsborn, the units `dnasm` and `dnopc` were added for disassembling MSIL.
- The `iffc` unit was added for filtering chunks in a frame by size constraints.
- The `xtxs` unit was added for extracting data from Microsoft Access databases.
- The `pym` unit for unmarshaling Python data has been improved with a cross-version parser.
- The `carve-json` unit now defaults to carving only dictionary values.
- The `map` unit was extended with the "default" parameter.
- The `pop` unit now supports using a single meta variable as the input source.
- The `imgdb` and `imgtp` units for image processing were added, and the transposition option was consequently removed from `stego`.
- The `b2f` (back to front) unit was added, a shortcut for `pick ::-1`.
- The `HKDF` key derivation unit was renamed to `hkdf`, in line with all other units now being lowercase.
This version is functionally equivalent to the previous one, but refinery starts using the LIEF parser with this version. Switching from other executable parsers to LIEF was the only change from the last version to this one, see #84.
This is a bit of a botched release, don't use it; use 0.8.17 instead.
- The `tea` and `xtea` units now offer the option to specify the number of rounds.
- The `couple` unit made stdout/stderr merging optional and discards stderr by default.
- The `couple` unit received the `--noinput` option to toggle this mode explicitly.
- The standard path formatting for `xtxml` and `xthtml` was changed to make filtering for all elements of a certain tag easier.
- The `rtfc` unit was added.
- A regular expression pattern named `date` was added and is now available in `xtp`.
- The output of the `lnk` unit was limited to essential information by default.
- The `csb` and `csd` shortcuts for `carve` can now use the `--stripspace` argument.
- Grouping into blocks in `hexload` and `peek` will now add separators in the ASCII preview.
- The `innopwd` unit was added. It can emulate an Inno Setup installer in order to extract passwords that are encoded within the IFPS script.
- The unit `ps1str` has been renamed to `escps` to match its partner unit `esc`.
- The unit `escvb` was added to escape and unescape VB strings.
- An input forward format character was added to `rex` to support this common use case better.
- The `dnfields` unit was reworked to extract prettier paths based on the method, type, and namespace.
- Adds the `xtsim` unit to extract Smart Install Maker archives.
- Adds the `lzx` unit for LZX decompression.
- Adds LZX support to the `xtcab` unit.
- The `xtcab` unit now also supports multi-disk cabinet archives.
- The `lzma` unit now supports a lot more (especially custom) LZMA formats.
- The `xtinno` unit underwent further improvements and can now extract embedded images, as well as decompression and decryption libraries.
- The `vstack` unit works again with `unicorn` version `2.0.1.post1`.
- The `xtinno` unit was added for extracting files from InnoSetup installers.
- Relatedly, the `ifps` and `ifpsstr` units now accept a string encoding argument.
- The `jvdasm` unit now has colored output.
- Thanks to @s3ven6, the `speck` cipher and `maru` hash units were added.
- The `dnarrays` unit was added for extracting hard-coded arrays from compiled .NET code.
- The `iff` unit was extended with an `-ne` switch.
- The `xjl` unit can now also collect the contents of one frame into a JSON list.
- The `binref` command was changed to use conjunctive search logic by default.
- The `copy:` and `cut:` multibin handlers now accept arguments of the form `[offset]:[length]:[step]` instead of `[start]:[end]:[step]`.
- The `vstack` unit now supports 3 emulator engines: `unicorn`, `speakeasy`, `icicle`. This is somewhat experimental and `unicorn` remains the default.
- As part of the changes to `vstack`, the `vmemref` unit was changed to use `smda` rather than `angr`.
- Adds the `morse` unit for Morse code encoding and decoding.
- The `u16` unit is no longer limited to the little-endian variant of UTF-16.
- The `snip` unit was given a new argument `--stream` which allows each offset to be relative to the end of the previously extracted data.
- Path extraction units will now match paths case-insensitively when this does not cause ambiguity.
- The `xtmsi` unit now extracts all MSI tables as CSV on top of the JSON blob.
- The `xtnsis` unit now extracts `setup.bin` alongside `setup.nsis`, the former containing a full binary copy of the extracted header.
- The `dedup` unit now has an optional argument which can specify a meta variable to deduplicate by.
- Adds the `httprequest` unit for parsing HTTP requests.
- Adds the `b62` unit (thanks to @lukaskuzmiak).
- The `uuenc` unit was updated to remove reliance on the now deprecated `uu` module.
- Adds support for aPLib compressed data with headers.
- Adds the `brotli` decompression unit.
- The `pym` unit was added which provides an interface to Python's marshal serialization.
- The units `xsalsa`, `xchacha`, and `chacha20poly1305` were added. The latter only performs the decryption part of the scheme.
- Refinery pipelines used in Python code will now preserve the scope of a `Chunk` object when one is provided as input.
- The argument handlers `prng` and `rng` were added for random number generation.
- The `b65536` unit was added (thanks to @alphillips-lab).
- The `--join` option of all path extraction units has been improved for producing paths that can always be used for dumping data to disk. This includes units to unpack archives, resources, or other embedded data that can be referenced by a name.
- The `ef` unit has a new option that specifies whether to follow (directory) symlinks / junctions or not.
- The key scaling method for `autoxor` was adjusted to produce fewer false positives when scanning for larger keys.
- Thanks to @alphillips-lab, the `a3x` unit is now capable of decrypting EA05 formatted scripts; previously, only EA06 was supported.
- The `loop` unit was enhanced with more options to abort execution based on regular expression patterns. It now also offers better control over terminating the execution when an error occurs.
- Conditional units (`iff`/`iffp`/`iffx`/`iffs`) were reworked to have less magic behavior. The `-R` switch now controls boolean negation and a separate switch controls whether chunks are hidden instead of being discarded. The `-s` switch was also removed from conditional units.
- The `cull` unit was removed from refinery.
- The units `p1`, `p2`, and `p3` were added, which are shortcuts for picking the first 1, 2, or 3 chunks from a frame, respectively.
- Regular expression arguments now have a new handler `f:`, which initializes the regular expression entirely from one of the formats used in `carve`.
- The global `--iff` option was added to units; this allows you to apply the unit only to formats that it knows it can handle.
- When using refinery in code, it is now possible to pipe a `Chunk` object directly to a pipeline.
- The `csb` and `csd` shortcuts were added for common applications of `carve`.
- The `loop` unit was added; it allows repeated application of a multibin suffix to the input data.
- To match the `loop` unit, the `reduce` unit now also works with a multibin suffix rather than with a pipeline string.
- The `vstack` unit now attempts to detect stack cookies and ignores them by default.
- Adds a deobfuscator for the `kramer` obfuscator.
- The `xtmsi` unit now automatically extracts embedded CAB files and infers the file names of these subfiles from the MSI manifest.
- Raises minimum Python requirement to 3.8.
- Removes automatic escapes from `cfmt`; this now has to be done explicitly.
- The `rsa` unit can now output keys in Microsoft BLOB format.
- Adds the `urn` unit.
- Adds several multibin handlers to modify file system paths (`pp`/`pb`/`pn`/`px`).
- Adds the hash units `sha3-224`, `sha3-256`, `sha3-384`, `sha3-512`, and `keccak256`.
- Adds the `b92` unit for Base92 encoding and decoding.
- Improves the performance of AutoIt3 unpacking in `a3x`.
- Adds the `SymHash` field to the `machometa` unit.
- Adds the `xtmacho` unit which can unpack MachO fat binaries.
- Adds the `nrv2b`, `nrv2d`, and `nrv2e` decompression units.
- Adds the `fernet` unit to decrypt messages in Fernet format.
- The `chop` unit now has a second argument that specifies the step size. Also, the `--into` argument has been removed because this can be done more succinctly using the `size` meta variable and long division.
- The `alu` unit has been extended with a new helper function called `M`; it can be used to mask a value down to a certain number of bits.
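
Masking a value down to a fixed number of bits, as the `M` helper above is described to do, can be sketched in plain Python as follows. The names `mask_bits` and `rotate_left` are hypothetical and the call signature of `M` inside `alu` is not documented here, so treat this purely as an illustration of the idea.

```python
def mask_bits(value: int, bits: int) -> int:
    """Keep only the lowest `bits` bits of `value`."""
    return value & ((1 << bits) - 1)


def rotate_left(value: int, shift: int, bits: int = 8) -> int:
    """Rotate within a fixed width; masking emulates fixed-size registers."""
    shift %= bits
    value = mask_bits(value, bits)
    return mask_bits((value << shift) | (value >> (bits - shift)), bits)


print(hex(mask_bits(0x1FF, 8)), hex(rotate_left(0x81, 1)))
```

Python integers are unbounded, so an explicit mask like this is what keeps arithmetic on bytes, words, or dwords from growing past the intended width.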
- The `struct` unit was extended with an additional format string character, `g`, for reading GUID values.
- The `reduce` signature was changed; it is no longer possible to specify an initialization value, instead the first chunk in the frame is always used. Additionally, there is now an option to consume only a limited number of chunks.
- The `queue` unit has been removed in favor of two units `qf` (queue front) and `qb` (queue back) to queue chunks into the current frame.
- The key derivation units `DESDerive`, `CryptDeriveKey`, and `PasswordDeriveBytes` have been renamed to `deskd`, `mscdk`, and `mspdb`, respectively, in order to match the common refinery unit naming convention of using indecipherable and consonant-heavy abbreviations.
- When passing integer arguments to the units `xor`, `add`, and `sub`, the block size is now automatically adjusted to the smallest size that will contain the given argument.
- Thanks to @EricFaehrmann, `xtzip` (and `xt`) now support doubly-loaded ZIP archives.
- Fixes bugs that caused errors in Python 3.12 environments.
- The paths extracted by `xthtml`, `xtxml`, and `xtjson` now avoid the use of parentheses to work better on Bash.
- Adds the `sosemanuk` cipher unit.
- Improves the capabilities of the `vbastr` unit.
- The `peek` unit in `--decode` mode now truncates long lines by default. Specifying the option twice has the same effect as the previous default, which is to wrap lines.
- The `stego` unit has been modified to generate a single output by default and provides a switch to generate individual rows or columns.
- Adds the `xtzpaq` unit to unpack ZPAQ archives.
- Includes the preliminary fix for the PowerShell problem. PowerShell versions 7.4 and beyond support native to native pipelines.
- The `b85` unit is now resilient against white space.
- The `vsect` unit can now extract "synthesized" sections. This also affects `vsnip`; it can now also extract data from, e.g., the header of an executable based on virtual addresses.
- The possible extras for the `binary-refinery` Python package have been expanded and the default install has been slimmed even further to avoid having to install too many dependencies for just the core utilities.
- Adds the `opc` unit and removes the Angr option from `asm`.
- The path formatting feature has been isolated in the `xthtml` and `xtxml` units.
- The `vstack` unit no longer extracts byte patches that consist exclusively of zero bytes because these were common false positives.
- The `vstack` unit has received further improvements. CPU register initialization now works via meta variables instead of shell environment variables, more options have been added, and there are new heuristics: Values written to the stack that represent addresses into any mapped segment are now ignored by default.
- This release adds the "shell like" interface; by importing units from `refinery.shell`, they can be instantiated in Python by using string arguments that are interpreted as if the corresponding unit was being assembled from a shell command line.
- The `lzw` decompression unit was added.
- The `xtmagtape` unit was added to extract files from SIMH tape files. But why, you ask? It may forever remain a mystery.
- The `hc256` cipher unit was added.
- The `--more` option was added to the `struct` unit to give access to unparsed rest data.
- The `--length` option was added to the `snip` unit as a quality-of-life feature.
- The `btoi` handler can now receive a second argument that allows reading interlaced integers from a byte stream.
- Thanks to @alphillips-lab, the `dnsfx` unit was added for extracting .NET file bundles.
- The `pestrip` unit was renamed to `pedebloat` and `petrim` was renamed back to `pestrip`.
- The `trim` unit can now remove padding and also perform case-insensitive trimming.
- The `ngrams` and `bruteforce` units were added for simple brute forcing tasks.
- The `vstack` unit can now execute shellcode blobs. It also gained the ability to skip calls entirely, and registers can now be initialized by using shell environment variables.
- The `salsa` and `chacha` units can now be initialized with a 64-byte "key" which represents the entire initial state matrix.
- The `perc` unit now extracts resource languages.
- The `yara:` handler for regular expression arguments now has the even shorter shortcut `Y:` because I use it so much.
- A bug was fixed in the `url` unit which caused incorrect decoding when using the `--plus` switch.
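
To illustrate what a 64-byte "initial state matrix" looks like, the sketch below assembles a ChaCha20 state in the RFC 8439 layout: 4 constant words, 8 key words, 1 counter word, and 3 nonce words, all little-endian. This layout is an assumption chosen for illustration; the original ChaCha and Salsa20 arrange counter, nonce, and constants differently, and how the refinery units interpret a 64-byte key is not specified here.

```python
import struct

CHACHA_CONSTANTS = b"expand 32-byte k"  # the standard 16-byte constant


def chacha_initial_state(key: bytes, nonce: bytes, counter: int = 0) -> bytes:
    """Assemble the 64-byte initial state in the RFC 8439 layout."""
    if len(key) != 32:
        raise ValueError("expected a 32-byte key")
    if len(nonce) != 12:
        raise ValueError("expected a 12-byte nonce")
    return CHACHA_CONSTANTS + key + struct.pack("<I", counter) + nonce


def state_words(state: bytes) -> tuple:
    """View the 64-byte state as the 4x4 matrix of 32-bit words."""
    return struct.unpack("<16I", state)


state = chacha_initial_state(bytes(range(32)), bytes(12), counter=1)
print(len(state), [hex(word) for word in state_words(state)[:4]])
```

Supplying the whole matrix at once is useful when a sample tampers with the constants or places the counter and nonce in non-standard positions.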
- Adds the `sm4` cipher unit.
- Adds the `blabla` cipher unit.
- The `machometa` unit was added thanks to @cxiao.
- The `pestrip` unit was extended with more features, and the unit `petrim` was introduced as a unit to simply remove overlays.
- The `xtnuitka` unit was added to extract Nuitka archives.
- Adds the `pyc` unit to decompile Python bytecode directly.
- Adds more options to the still quite experimental `vstack` unit.
- The coloring in `peek` on Windows is now applied even if `peek` is not the last unit in the pipeline. This previously caused a bug, but in recent versions the bug was not reproducible.
- The `bitsnip` unit was added.
- PowerShell deobfuscation was augmented by two units to decode base64.
- The `pestrip` unit has received some improvements and bugfixes; it should work more reliably now against bloated sections and resources.
- All stream cipher units have been given the `--discard` option which allows you to discard an arbitrary number of initial bytes from the keystream.
- Call tracing has been removed from `vstack`; it never really worked in practice and would require a lot more effort to do properly.
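
Discarding initial keystream bytes, as the `--discard` option above allows, is best known from the RC4-drop[n] construction. The following is a minimal RC4 sketch with a `discard` parameter; the function names are illustrative and this is not the code used by refinery's cipher units.

```python
def rc4_keystream(key: bytes, length: int, discard: int = 0) -> bytes:
    """Generate `length` keystream bytes after skipping `discard` bytes."""
    # Key scheduling
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) & 0xFF
        state[i], state[j] = state[j], state[i]
    # Keystream generation
    i = j = 0
    out = bytearray()
    for n in range(discard + length):
        i = (i + 1) & 0xFF
        j = (j + state[i]) & 0xFF
        state[i], state[j] = state[j], state[i]
        if n >= discard:
            out.append(state[(state[i] + state[j]) & 0xFF])
    return bytes(out)


def rc4(key: bytes, data: bytes, discard: int = 0) -> bytes:
    """XOR data with the (optionally shifted) RC4 keystream."""
    stream = rc4_keystream(key, len(data), discard)
    return bytes(a ^ b for a, b in zip(data, stream))


print(rc4(b"Key", b"Plaintext").hex())
```

With `discard=0` this reproduces the widely published test vector for the key `Key` and the message `Plaintext`; a non-zero value simply shifts the starting point within the same keystream.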
- The `pack` unit can now also pack lists of floating-point numbers.
- The unit `chaskey` was added to support this cipher; it is used by the Donut framework.
- A minor bug was fixed in `pemeta` that prevented some signatures from being parsed correctly.
- Archive extraction utilities now escalate fuzziness in 3 stages rather than just 2.
- Slightly improves the script extraction and formatting in `xtmsi`.
- The `xtdoc` unit now demangles file names in MSI archives correctly.
- The `xtmsi` unit was added for extracting MSI files and also stream metadata in a synthesized JSON document.
- The `csv` unit now has a reverse operation to convert simple JSON documents back to CSV format.
- Thanks to @larsborn, the `tnetmtm` unit was added for parsing MITMProxy traffic capture files.
- The `xtnode` unit was added for extracting the contents of Node.js executables created with `pkg` or `nexe`.
- The `xtzip` unit now supports AES-encrypted archives via the `pyzipper` module.
- The AutoIt decompiler unit `a3x` was added.
- Path extractor units have been reworked to be more consistent about when they do and do not use fuzzy matching on paths. Switches have been added to control this behavior.
- The `tea` and `xtea` units now have a `--swap` switch which switches them from little-endian to big-endian mode.
- The `xxtea` unit was re-worked to support being used as a proper block cipher. This is enabled by specifying the block size using the `--block-size` argument. By default, `xxtea` will continue to operate on the input as a single block: This is how XXTEA is often used in malicious samples.
- The `rc5` and `rc6` units have been updated to support the `--segment-size` option for CFB mode.
- The `pestrip` and `peoverlay` default settings are the same again.
- The `pestrip` unit has been extended with the capability to strip bloated resources and sections.
- The `xtone` unit was added to extract embedded files from OneNote documents.
- The color legend of the `iemap` unit is now optional and can be enabled with a switch.
- Bugfix to account for changes in macOS libmagic which led to `exe` and `dll` extensions not being identified correctly.
- Importing refinery no longer changes the names of log levels globally.
- The `xthtml` unit can now extract attributes of HTML tags.
- The `rijndael` cipher unit was added.
- The `lzg` unit was added.
- The `lzf` unit received several bugfixes and now supports the chunked format produced by the command-line `lzf` tool.
- The `ntlm` hash unit was added (thanks to @m0rv4i for the contribution).
- The `vmemref` and `vstack` units were added; both are still experimental and not thoroughly tested.
- The `min` and `max` units were added to simplify the pattern `sorted [| pick 0 ]` to a single unit.
This release changes the way in which meta variables are handled; they now have a scope:
- By default, variables cease to exist when the frame ends in which they were defined.
- Variables remain visible in child frames.
- When a variable is re-defined in a child frame, this definition shadows the previous one: When the child frame ends, the variable is restored to the value it had in the parent frame.
- Some units like `pop` can propagate variables to the parent scope as well.
- The units `mvg` and `mvc` were introduced to manage scoping of variables; the unit `wm` was removed.
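
The scoping rules listed above behave much like nested dictionaries. The sketch below models them with `collections.ChainMap`; it only illustrates the semantics and is not how refinery stores meta variables. The variable names `key` and `tmp` are made up for the example.

```python
from collections import ChainMap

# Variables defined in the parent frame.
parent = ChainMap({"key": b"outer"})

# Opening a nested frame adds a new, empty scope on top.
child = parent.new_child()
visible_in_child = child["key"]    # parent variables remain visible

child["key"] = b"inner"            # re-definition shadows the parent value
child["tmp"] = 1                   # defined only within the child frame
shadowed = child["key"]

# Closing the nested frame drops the top scope again.
restored = child.parents

print(visible_in_child, shadowed, restored["key"], "tmp" in restored)
```

When the child scope is dropped, `key` reverts to the parent's value and `tmp` ceases to exist, which mirrors the first three rules in the list.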
Changes unrelated to meta variable redesign:
- The unit `vaddr` was added to convert integer meta variables from virtual address to file offset and vice versa.
- The `pkcs7sig` unit was added.
- The `pemeta` unit now also displays the module name stored in the export directory.
- The `dedup` unit now uses MD5 instead of Python's built-in hash function because of the high risk of collisions.
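
Deduplicating chunks by a cryptographic digest rather than by Python's built-in `hash` can be sketched as follows. This is an illustration of the idea using `hashlib.md5`, under the assumption that chunks are byte strings; it is not the `dedup` unit's implementation.

```python
import hashlib


def dedup(chunks):
    """Yield each distinct chunk once, preserving first-seen order."""
    seen = set()
    for chunk in chunks:
        digest = hashlib.md5(chunk).digest()  # 16 bytes stored per chunk
        if digest not in seen:
            seen.add(digest)
            yield chunk


print(list(dedup([b"alpha", b"beta", b"alpha", b"gamma", b"beta"])))
```

Storing a fixed-size digest keeps memory use bounded for large chunks, and an accidental collision is far less likely with a 128-bit digest than with the 64-bit value returned by `hash`.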
- Block cipher units backed by pycryptodome (i.e. `aes`, `des`, `des3`) now support additional arguments for some of the less commonly used block cipher modes.
- The `rsa` and `rsakey` units now also support a simple key format of the form `[modulus]:[exponent]` where both `modulus` and `exponent` are hex-encoded numbers in big-endian representation for a textbook RSA round.
- The `perc` unit now has the `--pretty` option to fix bitmap and icon resources by adding the necessary headers (which are missing from the raw resource data).
- The `pcap` and `pcap-http` units now sort streams by the occurrence of the first packet.
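
A "textbook RSA round" with a key given as `[modulus]:[exponent]` amounts to a single modular exponentiation without padding. The sketch below parses such a key string and applies it; the helper names are hypothetical, and interpreting the input data as a big-endian integer is an assumption made for this example.

```python
def parse_simple_key(text: str) -> tuple:
    """Split 'modulus:exponent', both hex-encoded, into two integers."""
    modulus, _, exponent = text.partition(":")
    return int(modulus, 16), int(exponent, 16)


def textbook_rsa(data: bytes, key: str) -> bytes:
    """Compute data ** exponent mod modulus without any padding scheme."""
    modulus, exponent = parse_simple_key(key)
    message = int.from_bytes(data, "big")
    if message >= modulus:
        raise ValueError("message must be smaller than the modulus")
    size = (modulus.bit_length() + 7) // 8
    return pow(message, exponent, modulus).to_bytes(size, "big")


# Classic toy parameters: n = 3233 (0xCA1), e = 17 (0x11), d = 2753 (0xAC1).
ciphertext = textbook_rsa(b"A", "CA1:11")
print(ciphertext.hex(), textbook_rsa(ciphertext, "CA1:AC1"))
```

The toy key is far too small for real use; it only demonstrates that applying the public and then the private exponent returns the original value.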
- By default, the `ef` unit no longer uses glob patterns on POSIX systems. The behavior can be explicitly adjusted using new command-line flags.
- Adds the `queue` unit.
- The names of urlencode patterns for `carve` were shortened.
- Adds the `xtnsis` unit to the units used in `xt`.
- The `pemeta` unit has improved RICH header data and displays RICH header counts.
- Adds the `lzf` unit for LZF compression and decompression.
- Adds the `qlz` unit for QuickLZ decompression.
- Adds the `carve-lnk` unit to carve Windows Shortcut files.
- Adds the `carve-rtf` unit to carve RTF documents.
- Adds the `subfiles` unit which unifies all structured file format carvers.
- The `b64` unit now automatically detects and switches to the urlsafe encoding variant.
- Adds the `xtnsis` unit which can extract files from NSIS archives and provide a rudimentary disassembly of the setup script.
- Adds the `ifps` and `ifpsstr` units to disassemble and extract strings from compiled Pascal script files.
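
Automatic detection of the urlsafe Base64 variant, as mentioned for `b64` above, can be approximated by checking for the two characters that differ between the alphabets. The following sketch uses only the standard library; the detection rule and the padding repair are assumptions for illustration and may differ from what the unit actually does.

```python
import base64

URLSAFE_TO_STANDARD = bytes.maketrans(b"-_", b"+/")


def b64decode_auto(data: bytes) -> bytes:
    """Decode standard or urlsafe Base64, tolerating missing padding."""
    data = bytes(data).strip()
    if b"-" in data or b"_" in data:
        # Only the urlsafe alphabet contains these two characters.
        data = data.translate(URLSAFE_TO_STANDARD)
    data += b"=" * (-len(data) % 4)  # restore stripped padding
    return base64.b64decode(data)


print(b64decode_auto(b"-_8="), b64decode_auto(b"+/8="))
```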
- Adds the `vbapc` and `vbastr` units which can extract (decompiled) VBA p-code and VBA strings from (potentially stomped) Word documents.
- Adds the somewhat experimental `xkey` and `autoxor` units that can (sometimes) automatically decrypt XOR-encrypted files using frequency analysis. These units are still work in progress, though.
- Adds the `mscf` unit which implements part of the Microsoft Compression API formats, with LZMS currently missing.
- Adds the `b58` unit which does base58 encoding (used to encode Bitcoin addresses, for example). Simultaneously, the `base` unit was adjusted to no longer strip leading zero bytes unless explicitly instructed to do so.
- Adds variable conversions to `pop`: It is now possible to prefix a variable with a sequence of multibin handlers to convert input data before storing it in the variable.
- When executing the `put` unit without a second argument, it now stores the contents of the current chunk in the specified variable.
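
The remark about leading zero bytes matters because big-integer encodings such as Base58 lose them unless they are handled separately. The sketch below shows Base58 encoding with the Bitcoin alphabet, where each leading zero byte becomes a leading `1`. The `keep_leading_zeros` parameter is a hypothetical name used only for this illustration.

```python
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes, keep_leading_zeros: bool = True) -> str:
    """Encode bytes as a base-58 big integer using the Bitcoin alphabet."""
    number = int.from_bytes(data, "big")
    digits = []
    while number:
        number, remainder = divmod(number, 58)
        digits.append(B58_ALPHABET[remainder])
    prefix = ""
    if keep_leading_zeros:
        # Leading zero bytes vanish in the integer, so encode them explicitly.
        zero_count = len(data) - len(data.lstrip(b"\x00"))
        prefix = B58_ALPHABET[0] * zero_count
    return prefix + "".join(reversed(digits))


print(b58encode(b"\x00\x00a"), b58encode(b"\x00\x00a", keep_leading_zeros=False))
```

Without the explicit prefix, `b"\x00\x00a"` and `b"a"` would encode to the same string and the zeros could not be recovered on decoding.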
Fixes a critical bug in the meta variable propagation logic.
- Adds the `jcalg` unit.
- Adds the `byteswap` unit.
- Adds the `lzip` unit.
- Reworks the `serpent` unit to work with real-world examples and adds a `--swap` option to change the block byte order to become compatible with other implementations.
- Changes the `peek` design and fixes problems with colored output on Windows.
- The (still somewhat experimental) `xt` unit was added which attempts to extract data from known archive formats.
- The `xtasar` unit was added which can extract data from ASAR files.
- The `lnk` unit was added which is a thin wrapper around the LnkParse3 library which extracts metadata from Windows shortcut files.
- The `urlfix` unit was added which can strip URL indicators of fragments and query strings.
- The `iff` unit has gained several new features.
- The `xjl` unit was added; it converts JSON lists to a sequence of JSON chunks.
- The `xvar:` handler was renamed to `eat:` (it is similar to `var:`, except that it removes the variable after use).
- The `xlxtr` unit now supports XLSB format by virtue of the pyxlsb2 library.
- The `base32` unit was made more robust against invalid paddings.
- The `peek` unit design was changed yet again and colorization was added to the hexdump preview. It can be disabled through the `-g` switch.
- Unit execution time has been improved significantly.
- The `rc5`, `rc6`, and `xxtea` cipher units were added.
- Adds the option to completely disable the PowerShell band-aid introduced in 0.4.27 to allow using the `Use-RawPipeline` module.
- Adds several VBA/VBS deobfuscation units and a `deob-vba` unit that applies all of them, similar to `deob-ps1`.
- Adds the `camellia` cipher unit.
- Adds the new `struct` unit format character `w` for decoded wide strings.
- The `dnfields` unit was extended and now also extracts string fields which are assigned a unique value.
- Implements a better PowerShell band-aid and displays a warning message.
Adds various convenience output options in the Python REPL and adds documentation for those.
- Adds the
szdddecompression unit. - Adds the
lzjbdecompression unit. - Adds an option to the
iffunit to check for the existence of a certain meta variable. - The
xtpyiunit now uses bothuncompyle6anddecompyle3, even though they currently appear to have feature parity at best - there is some hope that one of them will support Python 3.9 in the future. - Adds the
groupbyunit. - Adds the
isaaccipher unit. - Adds the
batunit for deobfuscating batch scripts.
- Adds the ripemd160 and ripemd128 units.
- Adds the xtw unit for extracting cryptocurrency wallet addresses.
- Adds the iemap unit to display a colored entropy heatmap.
- Introduces new syntax to the struct unit for handling byte alignment.
- The rsakey unit supports a new option to output the public key portion of a private key.
- The pemeta unit now computes the size of the PE file based on header information.
- Several switches for comparison operators were added to the iff unit.

- Thanks to @baderj, the unit xlmdeobf was added which wraps the extremely useful XLMMacroDeobfuscator tool for extracting and deobfuscating Excel V4 macros.
- Adds the carve-7z unit for carving 7zip archives from blobs.

- Renames the blockop unit to alu.
- Removes the shortcut unit carveb64z.
- Renames a number of command-line switches for carve, xtp, and other pattern extraction units.
- Adds a default argument to resub that makes it strip whitespace from the input by default.

Improves performance by replacing an import of pkg_resources with equivalent functionality from importlib. On a test machine, this removes between 250 and 500 milliseconds from the execution time of any single unit.
Changes the format for the binary formatter used in struct, rex, resub, and cfmt. It now uses a reverse multibin handler instead of parsing the modifier like a command-line pipeline.
- Adds the lzo unit.

- The winreg unit is now able to extract data from Windows registry editor exports (i.e. .reg files).
- The key derivation units pbkdf2 and pbkdf1 use a more forgiving decoder to better cover the Rfc2898DeriveBytes class, which offers a call signature that receives an arbitrary byte string as password.
- The string regular expression pattern now excludes literal line breaks within the string.

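To illustrate why a key derivation has to accept arbitrary byte strings as the password, the following sketch uses `hashlib.pbkdf2_hmac` from the Python standard library rather than the pbkdf2 unit itself. It assumes HMAC-SHA1, which is the default of Rfc2898DeriveBytes; the salt, iteration counts, and raw password bytes are made-up example values.

```python
import hashlib

# Text password, parameters taken from the first RFC 6070 test vector.
key = hashlib.pbkdf2_hmac("sha1", b"password", b"salt", 1, dklen=20)
print(key.hex())

# The password does not have to be valid text: any byte string works.
raw_password = bytes([0xFF, 0xFE, 0x00, 0x41])
raw_key = hashlib.pbkdf2_hmac("sha1", raw_password, b"salt", 1000, dklen=16)
print(raw_key.hex())
```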
- Base64 regular expression patterns were improved to account for correct character counts.
- The dexstr unit was added.
- The index meta variable is now automatically populated within frames.
- The n40 string decryption unit was added.
- The xtpyi unit now extracts Python disassembly when decompilation fails.
- The lzma unit now correctly decompresses output produced by PyLZMA.

- The doctxt unit was added; courtesy of @baderj.

- Adds the serpent unit.

- Adds the xtpdf unit for extracting embedded objects from PDF documents.
- The accu: handler now supports pre-configured finite state machines for well-known rand() implementations.

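For readers unfamiliar with such rand() state machines, here is a sketch of one well-known example, the linear congruential generator used by the Microsoft C runtime. It is written as a plain Python generator and does not show the syntax of the accu: handler; the function name `msvc_rand` is hypothetical.

```python
from itertools import islice


def msvc_rand(seed: int):
    """Yield the values that successive rand() calls return after srand(seed)."""
    state = seed & 0xFFFFFFFF
    while True:
        # Advance the 32-bit state, then expose bits 16 to 30 as the output.
        state = (state * 214013 + 2531011) & 0xFFFFFFFF
        yield (state >> 16) & 0x7FFF


print(list(islice(msvc_rand(1), 2)))  # [41, 18467]
```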
- The officecrypt unit now supports the Excel default password VelvetSweatshop.
- The ci property has been removed from the output of peek --meta.
- The following units were added: xj0, evtx
- The hexdmp unit was renamed once more to hexload, and its pattern matching was improved.
- The asm unit was completely redesigned using an Angr-based fallback to produce better disassembly.
- The pcap-http unit now extracts the URL from which the data was downloaded.
- The rep unit received some performance improvements.
- The refinery dependencies were cleaned up considerably.

- Blockwise operations no longer require numpy to be reasonably fast, thanks to a new dynamic inlining step.

- Adds the cswap unit.
- The index counter of blockop now starts at zero.
- An option was added to the swap unit to swap the contents of two meta variables. This can also be used to rename a meta variable.
- An option was added to xtpyi to unpack, but not decompile the contents of a PYZ.
- Adds the --bare option to esc and uses it in peek.
- Adds the --meta option to ef. The ef unit now also descends into dot-directories and lists dot-files.
- The __init__.pkl file containing the unit lookup cache was moved into the distribution.

- Adds the xtvba unit to extract Office document macros.
- Adds the pcap unit to extract TCP streams from packet capture files.
- Adds the xthtml unit to extract components of HTML documents.
- The htm unit has been renamed to htmlesc.
- The default sort order of sorted has been changed to descending.
- The pemeta and pkcs7 units now also extract certificate thumbprints.

- Fixes an issue with applying ppjscript to obfuscated JavaScript files.
- Adds Murmur Hash units.

- Adds the xtpyi unit to extract PyInstaller-packed archives.
- Logging now uses the Python logging module.

- Significantly improves unit loading time which had regressed due to the changes in 0.4.0.
This release removes the setup-venv helper scripts and instead uses a slightly less ugly hack to resolve dependencies before running the refinery setup by declaring every dependency a build dependency in pyproject.toml. Any kind of installation should work seamlessly through pip.
Updates build system.
- Fixes critical bug in deployment.
- Adds the msgpack unit.
- Adds the cull unit and changes the behaviour of conditional units to make filtered chunks invisible instead of removing them. Conditional units have been renamed to iff, iffs, iffx, and iffp.

- Adds the xfcc unit, which replaces the intersection unit.
- The cm unit can now be used to remove meta variables.
- JSON dumps no longer use hex encoding for big integers as JSON has no size limit on integer expressions.

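The point about big integers can be checked with the Python standard library alone. This small sketch is unrelated to refinery's own JSON code and only shows that an integer far beyond 64 bits survives a JSON round trip as a plain decimal number:

```python
import json

big = 2 ** 200  # far larger than any fixed-width machine integer
text = json.dumps({"value": big})

# The integer is written in decimal and parsed back without loss.
restored = json.loads(text)["value"]
print(text)
print(restored == big)
```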
- The struct unit was significantly redesigned and the lprefix unit removed because it can now be trivially implemented with struct.
- The ifexpr unit has been renamed to iff and the iffp unit was added.
- The field names in dnfields have been altered to more closely resemble file names.
- Adds a list of default passwords to archive units.

- Renames the fread unit to ef.
- Metadata / Format string expression parsing is now more flexible.

- Adds the intersection unit.

- Adds the xtjson and xtxml units for extracting data from JSON and XML files.
- Slight redesigns of lprefix, peek, xtmail, and cfmt.
- Refinery now has (very weak) support for PowerShell.

- Adds the --tabular option to ppjson to produce a flattened JSON output.
- Changes to the in-code pipe syntax:
  - data | unit | unit is an iterable over output chunks
  - data | unit | unit | callable invokes callable with a bytearray containing all concatenated chunks
  - connected pipelines (data | unit | ... | unit) can be passed to str and bytes

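The pipe semantics listed above can be mimicked with Python's operator overloading. The following is a toy sketch under stated assumptions, not refinery's implementation: the `Stage` class and the `split` and `upper` stages are hypothetical stand-ins for real units, and only the `bytes` conversion is modelled.

```python
class Stage:
    """Hypothetical stand-in for a unit: maps one input chunk to output chunks."""

    def __init__(self, func, source=None):
        self.func = func
        self.source = source

    def __ror__(self, data):
        # data | stage: bind raw input bytes to this stage.
        return Stage(self.func, [bytes(data)])

    def __or__(self, other):
        if isinstance(other, Stage):
            # pipeline | stage: feed our output chunks into the next stage.
            return Stage(other.func, self)
        if callable(other):
            # pipeline | callable: hand over all chunks, concatenated.
            return other(bytearray(b"".join(self)))
        return NotImplemented

    def __iter__(self):
        for chunk in self.source:
            yield from self.func(chunk)

    def __bytes__(self):
        return b"".join(self)


split = Stage(lambda chunk: chunk.split(b","))
upper = Stage(lambda chunk: [chunk.upper()])

pipeline = b"foo,bar" | split | upper
print(list(pipeline))                    # iterating yields the output chunks
print(bytes(pipeline))                   # the pipeline converts to bytes
print(b"foo,bar" | split | upper | len)  # a callable receives one bytearray
```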
- Path extraction units (like fread, xtzip) offer better control over the path variable.
- Variable merging was added to the pop unit.
- The cm unit only populates size and index by default, never performing a full scan unless explicitly requested.

- Meta variables are now allowed in struct formats, and struct assumes no alignment by default.
- The pemeta unit now has support for RICH header data.
- The rsakey unit was added.
- The pop unit was extended by an option to discard chunks.
- Several new archive extractors are now available: xt7z, xtace, xtiso, and xtcpio.
- The xlxtr unit was refactored and generates more metadata.
- The sorted unit can sort by metadata variables now.
- The swap unit can now swap with an empty variable, which will empty the chunk body.

- The trivia unit was renamed to cm for "common meta".
- The pemeta unit can now display PE header information, .NET header flags, and supports a table view instead of the JSON output.
- Python expressions all across multibin arguments no longer restrict the operators that can be used.
- The domain regular expression was updated with new TLDs and the artificial TLDs .coin and .bazar.
- The terminate unit was added.
- The struct unit was added.

- Adds the ifexpr and ifstr units for filtering framed data.
- The pemeta unit now also extracts the EntryPointToken field from the .NET header.

- The hexview unit was removed; the hexdmp unit was created instead. By default, this unit converts hexdumps back to binary; the previous functionality of hexview is now available as the reverse operation of hexdmp.
- Adds the dnblob unit.
- The drp unit underwent major refactoring with the goal to improve both speed and quality of results. Two options were added to help control these new settings.

- Adds the xtrtf unit to extract embedded objects from RTF documents.
- Adds the officecrypt unit to decode password-protected Office documents.
- Improves PKCS7 parsing and fixes some cases where pemeta did not display the details of the digital signature.
- Adds brieflz support to the universal decompress unit.

- Unification of (nearly) all multibin handlers. Only the yara: and escape: handlers remain specific to regular expression type arguments.
- Adds the multibin handlers accu, reduce, cycle, and take.
- Alters the le and be handlers to support both conversion from integer to byte string and vice versa.
- Renames the unpack handler to btoi and adds the itob handler which performs the inverse operation.
- Command line switches for the lprefix unit changed.
- Adds the global --lenient option which is now required to admit partial results as output.

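The two directions of conversion between integers and byte strings mentioned for the le and be handlers correspond to the following standard Python operations. This is only an illustration of little-endian and big-endian encoding; the fixed width of 4 bytes is an assumption made for the example and says nothing about how the handlers choose a width.

```python
value = 0xC0FFEE

# Integer to byte string, in both byte orders.
le = value.to_bytes(4, "little")
be = value.to_bytes(4, "big")
print(le.hex())  # eeffc000
print(be.hex())  # 00c0ffee

# Byte string back to integer.
print(int.from_bytes(le, "little") == value)
print(int.from_bytes(be, "big") == value)
```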
- Adds the blz unit for BriefLZ compression and decompression.

- Adds the xtdoc unit which can extract more files from Office documents than xtzip.
- Adds the trivia unit which can be used to attach certain meta variables. Moving forward, this will be the preferred way to access simple invariants of a binary chunk. For now, it can attach the integer variables size and index, containing the size of the data in bytes and the chunk index within the current frame, respectively. The eval: handler for numeric multibin values no longer accepts the special variable N to represent the chunk size as this functionality can be recovered by preprocessing each chunk with trivia and using the variable size instead of N.
- The carve-pe unit is now a path extractor unit (TL;DR: more command line options).

- Changes the interface for the frame squeeze mechanic.

- Adds option to pefile to compute carve size based on virtual section sizes & offsets.

- Using hex escape sequences in the replacement string for resub now works as expected.
- The yara: modifier for regular expression based units now accepts lowercase hex characters.
- The imphash unit's performance was improved slightly.
- Additional options for the pecarve unit.
- Adds the ppjscript unit (wrapper around jsbeautifier).
- The vsnip unit can now extract more than one memory region.
- Adds a count restriction to the resplit and resub units.

- The interface for cipher units has been changed; the encryption mode is no longer a mandatory argument. Better handling of various cipher block chaining modes has been implemented.

- Conservative option added to peoverlay and pestrip.

- The salsa and chacha cipher units now have pure Python implementations that allow you to specify the number of rounds. The PyCryptodome interfaces still exist, now as units salsa20 and chacha20.
- The HMAC unit was added to support simple HMAC based key derivation.
- The dump unit stream mode has been adjusted so that it is possible to write consecutive data to a file inside a nested frame.

- The cfmt unit has been reworked to support more common modern Python format string syntax.
- The output of crc32 and adler32 checksum hashes has been altered to use the correct byte order.
- The rabbit unit was added which implements the RABBIT stream cipher.

- The mpush, mpop, and mput units have been renamed to simply push, pop, and put.
- The autoxor unit has been transformed into the drp unit; the behavior of autoxor can be achieved using xor drp:copy:all.
- Data types of .NET fields are better detected by dnfields now, but a proper parser for type signatures is still missing.

- The gz unit was deprecated because the zl unit covers its use case (and does a better job at it).
- The lprefix unit for parsing length-prefixed data was added.
- Parsing of managed .NET string resources via the dnmr unit was fixed; these would previously be returned unparsed.
- The binpng unit has been improved and renamed to stego, a more flexible unit to extract data from images.

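For context on what parsing length-prefixed data involves, here is a minimal sketch based on the standard `struct` module. It is not the lprefix unit; the function name `read_length_prefixed` and the default prefix format (a 4-byte little-endian unsigned integer) are assumptions chosen for the example.

```python
import struct


def read_length_prefixed(data: bytes, fmt: str = "<I"):
    """Yield each record from a buffer of (length prefix, payload) pairs."""
    prefix_size = struct.calcsize(fmt)
    offset = 0
    while offset + prefix_size <= len(data):
        (length,) = struct.unpack_from(fmt, data, offset)
        offset += prefix_size
        yield data[offset:offset + length]
        offset += length


blob = struct.pack("<I", 3) + b"abc" + struct.pack("<I", 2) + b"de"
print(list(read_length_prefixed(blob)))  # [b'abc', b'de']
```

A truncated final record is returned as-is in this sketch; a stricter parser would raise an error instead.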
- The peslice, elfslice, and pesect units have been removed.
- In their place, the cross-format units vsnip and vsect can now be used to extract data from virtual addresses and sections of PE, ELF, and MachO files.

- Adds md2 and md4 hashing algorithms.
- The CryptDeriveKey unit now also mirrors the API call for SHA2 based hashing algorithms.
- Message type attachments in Outlook email formats are now supported by xtmail.

- The interface of the memory slicing units peslice and elfslice has changed.
- Python expression parser and numeric arguments have been refactored.

- Removes the --install-option capability introduced in 0.3.5, see pip/#8748 for more information.
- The xttar unit was added.
- The lzma unit can now return partial results for buffers with junk bytes at the end.

- The ifrex unit was added.
- The jvstr unit was added.
- A source distribution manifest was added to fix errors that occurred during source installs.

- Using pip install --install-option=library binary-refinery or a REFINERY_PREFIX environment variable with value ! will now install the binary refinery without any command line scripts, only as a library.

- It is now possible to use local refinery units (i.e. a Python script in the current directory which contains a refinery unit that is not abstract) for multibin prefixes and in any other situation where units are dynamically loaded.

- The pesect unit was added.
- The resub and resplit units no longer offer options that have no bearing on their behavior.
- The lz4 unit was added with a pure Python implementation of LZ4 decompression.
- The jvdasm unit for disassembling Java class files was added.

- The autoxor unit was added.
- The cfmt unit was added.
- The License of Binary Refinery was changed to 3-Clause BSD.

- The netbios unit was added.
- The stretch unit was added.
- The hc128 cipher unit was added.
- The unit dnrc was split into dnrc for extracting .NET resources and dnmr for unpacking managed .NET resources.
- Several units that extract items from container formats have received a unified interface. So far, this interface applies to xtmail, xtzip, winreg, dnfields, dnrc, and dnmr.
- When using named match groups for the rex unit, these matches are now forwarded as metadata within frames.
- The xtzip unit was given an optional archive password parameter.
- The xtmail unit can now extract headers in text and json format.

- Test coverage was increased.

- The recode unit can now autodetect input encoding.
- Several bugfixes were performed on the vbe unit.
- More bandaids were added to PowerShell deobfuscation.

- The pestrip and peoverlay units were added.
- Interface retrofitting was completed.

- Fixes a tiny bug in the PyPI display of the readme file, and completes changelog from previous version.

- The rsa unit was improved and can handle the Microsoft blob format now.
- PowerShell deobfuscation was improved, but that doesn't change the fact that this would be much better with a proper parser.

- The b32 unit for base32 encoding and decoding was added.
- Preliminary support for meta variables has been added with the mpush, mpop, and mput units. This feature is experimental and not well documented yet.
- The --squeeze/-Z option was added to all units that produce multiple outputs: It disables the default separation of these outputs by line breaks.
- Pattern extraction units such as rex will now preserve the order of extracted strings, even when the --longest option is used.
- The suggested PATH environment variable modification from the Linux installer script was corrected; the previous variant would make the refinery virtual environment take precedence over the global python executables.

- The dump unit has been refactored to make it easier to use; formatting of file names is done automatically now unless the flag -p or --plain is specified to prevent string formatting.
- The snip unit can now remove bytes from the input.
- The dnfields unit was added.
- The ppjson unit can now minify json by specifying 0 as the desired indentation width.
- The dsjava unit was improved, although it remains a work in progress.
- The fread unit received a linewise mode.

- After some incomplete attempts to improve backwards compatibility, the package now simply requires Python 3.7.

- Units can now be written with a Python __init__ constructor and deduce the command line interface from this constructor. A decorator class was added to help enrich the parameter list of the constructor with information on how to translate these into command line parameters. The goal is to eventually retrofit all units to follow this standard.
- The pemeta unit has more features now.
- The couple unit was added; it is an adapter to turn any stdin/stdout based command line tool into a refinery unit.
- The carve-xml unit was added.
- The dnstr unit was added.

- All hashing prefixes for multibin expressions have been implemented as separate units, i.e. sha256 and md5 are now units that output the corresponding hash of the input data.
- The xtmail unit was added which can extract the body and attachments of email documents, both Outlook and MIME formats.
- The framed format was extended with rudimentary support for metadata in framed chunks. This is currently used by the xtzip and xtmail units to attach a name property to emitted chunks which contains the file name information from the parsed data. The dump unit now has a --meta option to read this name property and use it as the file name for dumping. The --meta option defaults to using the SHA256 hash of the data as the file name if no corresponding metadata is present.
- The pemeta unit was added.
- The carve-json unit was added.
- The peslice and elfslice units were given a unified interface.
- The b85 unit for base 85 encoding and decoding was added.

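The fallback described for the --meta option of dump (use the name from the metadata if present, otherwise the SHA256 hash of the data) can be sketched in a few lines. The helper `dump_name` and the plain dictionary standing in for chunk metadata are hypothetical and not part of refinery.

```python
import hashlib


def dump_name(data: bytes, meta: dict) -> str:
    """Pick an output file name: the 'name' property, else the SHA256 of the data."""
    name = meta.get("name")
    if name:
        return name
    return hashlib.sha256(data).hexdigest()


print(dump_name(b"payload", {"name": "invoice.doc"}))
print(dump_name(b"", {}))
```

The second call prints the SHA256 digest of the empty input because no name is attached.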
- Fixes a bug in the .NET header parser where the tables were sometimes parsed in the wrong order.

- The xtzip unit has been added, which can extract data from zip archives.
- The carve-zip unit has been added. It can carve ZIP files from buffers, similar to carve-pe for PE files.
- The rsa unit has finally been added.
- The rncrypt unit has been added.
- The dncfx unit has been added; it extracts the strings from ConfuserEx obfuscated .NET binaries.
- Adds support for TrendMicro Clicktime URL guards in the urlguards unit.

- Several tests were added; testing now uses malshare to test units against real world samples. To properly execute tests, the environment variable MALSHARE_API needs to contain a valid malshare API key.
- A numpy import that always occurred during any unit load was moved into the peek unit code to reduce import time of other units.
- Issues with wheel installation on Windows were fixed.
- It is now possible to instantiate units in code with arguments of type bytes and have it work as expected, i.e. xor(B's3cr3t') will construct a xor unit that decrypts using the byte string key s3cr3t.
- The rex unit can now apply an arbitrary number of transformations to each match and return the results as separate outputs.
- The urlguards unit now supports ProofPoint V3 guarded URLs.
- Thanks to the recent fix of #29 in javaobj, the dsjava (deserialize Java serialized data) unit should now work. However, since there are currently no tests, bugs should be expected.

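To make the xor example above concrete, this is what decrypting with a repeating byte string key amounts to. The sketch is plain Python and does not use the xor unit; the function name `xor_bytes` is hypothetical.

```python
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every input byte with the key, repeating the key as often as needed."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


secret = xor_bytes(b"binary refinery", b"s3cr3t")
# Applying the same key a second time restores the original data.
print(xor_bytes(secret, b"s3cr3t"))
```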
- Processing of data in frames is no longer interrupted by errors in one unit.

- The global --lenient (or -L) flag has been added: It allows refinery units to return partial results. This behavior is disabled by default because it usually means that an error occurred during processing.
- The virtual environment setup script has received bug fixes for problems with absolute paths.
- This changelog was added.

- The unit jsonfmt has been renamed to ppjson (for pretty-print json).
- The unit ppxml (pretty-print xml) was added.
- The unit carve-pe (carve PE files) was added.
- The unit winreg (read Windows registry hives) was added, also adding a dependency on the python-registry package (also on GitHub).
- .NET managed resource extraction was improved, although it is still not perfect.

- The unit sorted now only sorts the chunks of the input stream that are in scope.
- The unit dedup can no longer sort the input stream because sorted can do this.
- PowerShell deobfuscation and its test coverage were improved.

- Cryptographic units have been refactored; the salsa and chacha units now take a --nonce parameter rather than an --iv parameter, as they should.