Stringy


A smarter alternative to strings that uses binary format knowledge and semantic classification to extract the strings that actually matter from ELF, PE, and Mach-O executables.

The standard strings command dumps every printable byte sequence it finds, including padding, table data, and interleaved garbage. Stringy is section-aware and encoding-aware, and it classifies what it extracts: it knows where strings live in a binary, what they represent, and which ones you are likely to care about.
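To make the contrast concrete, here is a minimal Python sketch of the baseline behavior: a scan that reports every run of printable ASCII bytes with no knowledge of sections or meaning. It is a simplified illustration, not the code of strings(1) or of Stringy; `printable_runs` and the sample `blob` are hypothetical names made up for this example.

```python
import re


def printable_runs(data: bytes, min_len: int = 4) -> list[tuple[int, str]]:
    """Return (offset, text) for each run of printable ASCII at least min_len bytes long."""
    # 0x20-0x7e is the printable ASCII range; everything else ends a run.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [(m.start(), m.group().decode("ascii")) for m in pattern.finditer(data)]


# Hypothetical input: one useful URL, filler bytes, and a format string.
blob = b"\x00\x01https://api.example.com/v1/\x00\xff\xfeAAAA\x00ab\x00%s at %d\x00"

for offset, text in printable_runs(blob):
    print(offset, text)
```

The scan gives the filler run `AAAA` the same standing as the URL, which is the noise that ranking and classification are meant to push down.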

Quick Start

Installation

Pre-built binaries are available on the Releases page for Linux, macOS, and Windows.

From source:

git clone https://github.com/EvilBit-Labs/Stringy
cd Stringy
cargo build --release
./target/release/stringy --help

Basic Usage

# Ranked output with semantic tags
stringy target_binary

# Filter by semantic tags
stringy --only-tags url target_binary
stringy --only-tags url --only-tags filepath target_binary

# Exclude noisy tags
stringy --no-tags format_string target_binary

# Control extraction
stringy --min-len 8 target_binary
stringy --enc ascii target_binary
stringy --top 50 target_binary

# Output formats
stringy --json target_binary
stringy --yara target_binary
stringy --json target_binary | jq 'select(any(.tags[]; . == "Url"))'

# Raw extraction (no classification or ranking)
stringy --raw target_binary

# Debug and summary modes
stringy --debug target_binary
stringy --summary target_binary

# Read from stdin
cat target_binary | stringy -

Example Output

TTY table:

String                                   | Tags       | Score | Section
-----------------------------------------|------------|-------|--------
https://api.example.com/v1/              | url        |    95 | .rdata
{12345678-1234-1234-1234-123456789abc}   | guid       |    87 | .rdata
/usr/local/bin/stringy                   | filepath   |    82 | __cstring
Error: %s at line %d                     | fmt        |    78 | .rdata

JSON (one JSONL record, pretty-printed here for readability):

{
  "text": "https://api.example.com/v1/",
  "offset": 4096,
  "rva": 4096,
  "section": ".rdata",
  "encoding": "utf-8",
  "length": 28,
  "tags": [
    "Url"
  ],
  "score": 95,
  "display_score": 95,
  "source": "SectionData",
  "confidence": 0.98
}
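Records of this shape are easy to post-process without jq. The sketch below assumes one JSON object per line and relies only on the `tags` and `score` fields shown in the example record; `parse_jsonl`, `with_tag`, and the sample lines are hypothetical and not part of Stringy.

```python
import json


def parse_jsonl(lines):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]


def with_tag(records, tag, min_score=0):
    """Keep records that carry `tag` and score at least `min_score`, highest score first."""
    hits = [
        r for r in records
        if tag in r.get("tags", []) and r.get("score", 0) >= min_score
    ]
    return sorted(hits, key=lambda r: r.get("score", 0), reverse=True)


# Made-up sample lines using the field names from the record above.
sample = [
    '{"text": "http://localhost/", "section": ".rdata", "tags": ["Url"], "score": 40}',
    '{"text": "https://api.example.com/v1/", "section": ".rdata", "tags": ["Url"], "score": 95}',
    '{"text": "AAAAAAAA", "section": ".data", "tags": [], "score": 5}',
]

for record in with_tag(parse_jsonl(sample), "Url", min_score=50):
    print(record["score"], record["text"])
```

To use it on real output, replace `sample` with `sys.stdin` and pipe `stringy --json target_binary` into the script.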

Features

  • Format-aware parsing: ELF, PE, and Mach-O via goblin, with section-level weight prioritization
  • Encoding support: ASCII, UTF-8, UTF-16LE/BE with confidence scoring
  • Semantic classification: URLs, domains, IPv4/IPv6, file paths, registry keys, GUIDs, user agents, format strings, Base64, crypto constants
  • Symbol demangling: Recovers readable names from mangled C++, Rust, and other symbols
  • PE resources: VERSIONINFO, STRINGTABLE, and MANIFEST extraction
  • Import/export analysis: Symbol extraction from all supported formats
  • Ranking: Section-aware scoring with band-mapped 0-100 normalization
  • Deduplication: Canonical string grouping with configurable similarity threshold
  • Output formats: TTY table, plain text, JSONL, YARA rules
  • Pipeline architecture: Configurable orchestrator with filtering, encoding selection, and top-N support
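As a rough illustration of the semantic classification idea in the list above, the following Python sketch tags a string by matching it against a few regular expressions. The patterns are deliberately simple and are assumptions made for this example, not Stringy's actual rules; the tag names `url`, `guid`, and `format_string` are borrowed from the usage examples, while `ipv4`, `PATTERNS`, and `classify` are hypothetical.

```python
import re

# Illustrative patterns only; a production classifier needs far stricter rules.
PATTERNS = {
    "url": re.compile(r"^https?://[^\s/]+(?:/\S*)?$"),
    "guid": re.compile(r"^\{?[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\}?$"),
    "ipv4": re.compile(
        r"^(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)$"
    ),
    # printf-style conversion anywhere in the string, e.g. %s, %d, %08x, %.2f
    "format_string": re.compile(r"%[-+ #0]*\d*(?:\.\d+)?(?:hh|h|ll|l|z)?[diouxXsfcp]"),
}


def classify(text: str) -> list[str]:
    """Return every tag whose pattern matches the given string."""
    return [tag for tag, pattern in PATTERNS.items() if pattern.search(text)]


for candidate in (
    "https://api.example.com/v1/",
    "{12345678-1234-1234-1234-123456789abc}",
    "Error: %s at line %d",
    "192.168.1.10",
    "hello world",
):
    print(candidate, classify(candidate))
```

A string can match several patterns, so `classify` returns a list, which mirrors the `tags` array in the JSON output.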

Security

Verifying Releases

All release artifacts are signed via Sigstore using GitHub Attestations:

gh attestation verify <artifact> --repo EvilBit-Labs/Stringy

Documentation

Full documentation is available at evilbitlabs.io/stringy.

Quick links: Installation | Quick Start | CLI Reference | Architecture | Troubleshooting

Contributing

See CONTRIBUTING.md for development setup, coding guidelines, and submission process.

License

Licensed under the Apache License, Version 2.0.

Acknowledgments

  • Inspired by strings(1) and the need for better binary analysis tools
  • Built with goblin, bstr, regex, and rustc-demangle
  • My coworkers, for their excellent input on the original name selection