34 changes: 14 additions & 20 deletions CHANGELOG.md
@@ -8,38 +8,32 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).

 ### Added

-- **VERS parser**: First JavaScript implementation of the VERS (VErsion Range Specifier) companion spec to PURL. Supports parsing, serialization, and containment checking for semver-based schemes (npm, cargo, golang, gem, hex, pub, cran, swift)
-- **URL-to-PURL conversion**: `UrlConverter.fromUrl()` converts registry URLs to PackageURLs across 27 hostnames and 17 purl types (npm, pypi, maven, cargo, nuget, github, gitlab, bitbucket, docker, hex, pub, cocoapods, hackage, conda, cpan, luarocks, huggingface, swift, cran, vscode)
-- **`toSpec()` method**: Returns the package identity without the `pkg:type/` prefix (the npm "spec" equivalent)
+- **VERS parser**: First JavaScript implementation of the VERS (VErsion Range Specifier) companion spec to PURL
+- **URL-to-PURL conversion**: `UrlConverter.fromUrl()` converts registry URLs to PackageURLs
+- **`toSpec()` method**: Returns the package identity without the `pkg:type/` prefix
 - **`isValid()` static method**: Quick validation without throwing
 - **`fromUrl()` static method**: Convenience wrapper for `UrlConverter.fromUrl()`
 - **Immutable copy methods**: `withVersion()`, `withNamespace()`, `withQualifier()`, `withQualifiers()`, `withSubpath()` return new instances
-- **PurlBuilder factories**: Added 18 new type factories (bitbucket, cocoapods, conan, conda, cran, deb, docker, github, gitlab, hackage, hex, huggingface, luarocks, oci, pub, rpm, swift, vscode-extension)
-- **Injection character detection**: `containsInjectionCharacters()` utility for shell metacharacter detection
+- **PurlBuilder factories**: Added type factories for common ecosystems
+- **Input validation utilities**: Character detection for dangerous input
 - **`vers` qualifier**: Added 6th standard qualifier per purl spec
 - **`./exists` entry point**: Registry existence checks available via `@socketregistry/packageurl-js/exists`

 ### Changed

-- **Bundle size reduced 95%**: Core bundle is 178 KB (was 3.3 MB). Exists functions moved to separate entry point to avoid bundling HTTP dependencies
-- **Primordials module**: All 43 built-in references captured at module load time via `uncurryThis` pattern (mirrors Node.js internals). Zero raw prototype method calls remain
-- **Frozen constants**: Module-level Maps, Sets, regex patterns, and arrays are frozen
-- **Null prototype objects**: All user-facing object literals use `__proto__: null`
-- **Flyweight cache**: `fromString()` caches up to 1024 instances; `toString()` memoized
+- **Bundle size reduced 95%**: Exists functions moved to separate entry point to avoid bundling HTTP dependencies
+- **Hardened against prototype pollution**: Built-in references captured at module load time
+- **Frozen constants**: Module-level data structures are immutable
+- **Null prototype objects**: All user-facing object literals use null prototypes
+- **Performance**: Instance caching for `fromString()`; `toString()` memoized
 - **Version lowercasing**: Added for oci, pypi, and vscode-extension per upstream spec

 ### Fixed

-- **ReDoS prevention**: Consecutive `.*` groups collapsed in wildcard regex
-- **Null byte rejection**: All string components reject `\x00` to prevent truncation in C-based consumers
-- **VERS resource limits**: 1000 constraint maximum, MAX_SAFE_INTEGER validation
-- **vscode-extension validation**: Rejects illegal characters in namespace, name, version, and platform qualifier
-
-### Security
-
-- Prototype pollution resilience via primordials (captured String, Array, RegExp, Object, Reflect methods)
-- Global tampering protection verified (replacing `global.URL` after import has no effect)
-- Inline regex patterns hoisted to frozen module-scope constants
+- **ReDoS prevention**: Fixed potential denial-of-service in pattern matching
+- **Input validation**: Reject dangerous characters in string components
+- **VERS resource limits**: Constraint and value bounds enforced
+- **vscode-extension validation**: Improved input validation

 ## [1.3.5](https://github.com/SocketDev/socket-packageurl-js/releases/tag/v1.3.5) - 2025-11-02

6 changes: 6 additions & 0 deletions src/compare.ts
@@ -52,7 +52,13 @@ const WILDCARD_CACHE_MAX = 1024
  * Supports * (match any chars), ? (match single char), ** (match anything including empty).
  * Designed for version strings and package names, not file paths.
  */
+const MAX_PATTERN_LENGTH = 4096
+
 function matchWildcard(pattern: string, value: string): boolean {
+  // Reject excessively long patterns to prevent regex compilation DoS
+  if (pattern.length > MAX_PATTERN_LENGTH) {
+    return false
+  }
   let regex = wildcardRegexCache.get(pattern)
   if (regex === undefined) {
     // Convert glob pattern to regex
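Reviewer sketch for the hunk above: a minimal, self-contained illustration of a length cap applied before regex compilation, combined with collapsing runs of `*` into a single `.*`. The names `sketchGlobToRegExp`, `sketchMatchWildcard`, and `SKETCH_MAX_PATTERN_LENGTH` are hypothetical. The real glob conversion in `src/compare.ts` is not visible in this hunk and may differ, for example in how it treats `**` or caches compiled patterns.

```typescript
// Hypothetical limit, mirroring MAX_PATTERN_LENGTH from the diff.
const SKETCH_MAX_PATTERN_LENGTH = 4096

// Hypothetical helper: convert a glob into an anchored RegExp.
function sketchGlobToRegExp(pattern: string): RegExp {
  let source = ''
  let lastWasStar = false
  for (let i = 0; i < pattern.length; i += 1) {
    const ch = pattern.charAt(i)
    if (ch === '*') {
      // Collapse a run of '*' into one '.*' so the regex never contains
      // adjacent unbounded groups, which can backtrack badly.
      if (!lastWasStar) {
        source += '.*'
      }
      lastWasStar = true
      continue
    }
    lastWasStar = false
    if (ch === '?') {
      // '?' matches exactly one character.
      source += '.'
    } else {
      // Escape regex metacharacters so they match literally.
      source += ch.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
    }
  }
  return new RegExp(`^${source}$`)
}

function sketchMatchWildcard(pattern: string, value: string): boolean {
  // Reject oversized patterns before any regex is compiled.
  if (pattern.length > SKETCH_MAX_PATTERN_LENGTH) {
    return false
  }
  return sketchGlobToRegExp(pattern).test(value)
}
```

Returning `false` for an oversized pattern treats it as a non-match rather than an error, which matches the behavior of the guard in the hunk.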
9 changes: 8 additions & 1 deletion src/normalize.ts
@@ -95,7 +95,14 @@ function normalizeQualifiers(
   let qualifiers: Record<string, string> | undefined
   // Use for-of to work with entries iterators
   for (const { 0: key, 1: value } of qualifiersToEntries(rawQualifiers)) {
-    const strValue = typeof value === 'string' ? value : String(value)
+    // Only coerce primitive types — reject objects/functions that could
+    // execute arbitrary code via toString() during coercion.
+    const strValue =
+      typeof value === 'string'
+        ? value
+        : typeof value === 'number' || typeof value === 'boolean'
+          ? `${value}`
+          : ''
     const trimmed = StringPrototypeTrim(strValue)
     // A key=value pair with an empty value is the same as no key/value
     // at all for this key
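Reviewer sketch for the hunk above: a small demonstration of why `String(value)` is risky for untrusted input, and how restricting coercion to primitives avoids it. `sketchCoerceQualifierValue` is a hypothetical stand-in that copies the conditional from the diff; it is not the library's exported API.

```typescript
// Hypothetical helper mirroring the primitive-only coercion in the diff.
function sketchCoerceQualifierValue(value: unknown): string {
  if (typeof value === 'string') {
    return value
  }
  if (typeof value === 'number' || typeof value === 'boolean') {
    return `${value}`
  }
  // Objects, functions, symbols, null, and undefined become empty strings,
  // so no user-supplied toString() is ever invoked.
  return ''
}

let sketchSideEffects = 0
const sketchHostileValue = {
  toString(): string {
    // Any code here runs whenever the object is coerced to a string.
    sketchSideEffects += 1
    return 'attacker-controlled'
  },
}

// String() invokes the object's toString(): sketchSideEffects becomes 1.
const sketchUnsafeResult = String(sketchHostileValue)
// The guarded helper never calls it: sketchSideEffects stays at 1.
const sketchSafeResult = sketchCoerceQualifierValue(sketchHostileValue)
```

Because an empty value is treated the same as a missing key later in `normalizeQualifiers`, mapping non-primitives to `''` effectively drops them.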
8 changes: 7 additions & 1 deletion src/package-url.ts
@@ -508,7 +508,13 @@ class PackageURL {
       }
     }
     const purl = new PackageURL(...PackageURL.parseString(purlStr))
-    // Cache the result for future lookups
+    // Eagerly populate the toString cache before freezing
+    purl.toString()
+    // Deep freeze the instance and its nested qualifiers object to prevent
+    // cache poisoning via mutation of shared cached instances.
+    recursiveFreeze(purl)
+    // Cache the frozen result for future lookups — freezing prevents
+    // cache poisoning via property mutation on shared instances.
     if (typeof purlStr === 'string') {
       if (flyweightCache.size >= FLYWEIGHT_CACHE_MAX) {
         // Evict oldest entry (first key in Map iteration order)
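Reviewer sketch for the hunk above: a minimal bounded cache that deep-freezes values before sharing them, so one caller cannot mutate an instance that later callers receive. `SketchPurl`, `sketchDeepFreeze`, `sketchFromString`, and the cache size of 2 are hypothetical; `sketchDeepFreeze` only approximates whatever `recursiveFreeze` does in the library.

```typescript
interface SketchPurl {
  readonly type: string
  readonly name: string
  readonly qualifiers: Readonly<Record<string, string>>
}

// Hypothetical small limit so eviction is easy to observe.
const SKETCH_CACHE_MAX = 2
const sketchCache = new Map<string, SketchPurl>()

// Hypothetical stand-in for recursiveFreeze: freezes an object and every
// nested object reachable through its own enumerable keys.
function sketchDeepFreeze(obj: object): void {
  Object.freeze(obj)
  for (const key of Object.keys(obj)) {
    const value = (obj as Record<string, unknown>)[key]
    if (typeof value === 'object' && value !== null && !Object.isFrozen(value)) {
      sketchDeepFreeze(value)
    }
  }
}

function sketchFromString(key: string, build: () => SketchPurl): SketchPurl {
  const cached = sketchCache.get(key)
  if (cached !== undefined) {
    return cached
  }
  const purl = build()
  // Freeze before caching: every later caller shares this exact object.
  sketchDeepFreeze(purl)
  if (sketchCache.size >= SKETCH_CACHE_MAX) {
    // A Map iterates in insertion order, so the first key is the oldest.
    const oldest = sketchCache.keys().next().value
    if (oldest !== undefined) {
      sketchCache.delete(oldest)
    }
  }
  sketchCache.set(key, purl)
  return purl
}
```

Eviction here is first-in, first-out rather than least-recently-used, which is what "first key in Map iteration order" in the hunk suggests.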
3 changes: 2 additions & 1 deletion src/purl-types/conda.ts
@@ -116,8 +116,9 @@ export async function condaExists(

   const fetchResult = async (): Promise<ExistsResult> => {
     try {
+      const encodedChannel = encodeComponent(channelName)
       const encodedName = encodeComponent(name)
-      const url = `https://api.anaconda.org/package/${channelName}/${encodedName}`
+      const url = `https://api.anaconda.org/package/${encodedChannel}/${encodedName}`

       const data = await httpJson<{
         latest_version?: string
5 changes: 4 additions & 1 deletion src/purl-types/docker.ts
@@ -107,7 +107,10 @@ export async function dockerExists(

   const fetchResult = async (): Promise<ExistsResult> => {
     try {
-      const encodedRepo = encodeComponent(repo)
+      // Encode each path segment separately to preserve the / delimiter
+      const encodedRepo = namespace
+        ? `${encodeComponent(namespace)}/${encodeComponent(name)}`
+        : encodeComponent(name)
       const url = `https://hub.docker.com/v2/repositories/${encodedRepo}`

       const data = await httpJson<{
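Reviewer sketch for the conda and docker hunks above: encoding a whole `namespace/name` string escapes the `/` delimiter, while encoding each segment separately keeps the path shape and still neutralizes URL metacharacters inside a segment. This assumes the library's `encodeComponent` behaves roughly like the built-in `encodeURIComponent`, which the diff does not confirm; `sketchRepoPath` is a hypothetical helper.

```typescript
// Hypothetical helper; encodeURIComponent stands in for encodeComponent.
function sketchRepoPath(
  repoNamespace: string | undefined,
  repoName: string,
): string {
  return repoNamespace
    ? `${encodeURIComponent(repoNamespace)}/${encodeURIComponent(repoName)}`
    : encodeURIComponent(repoName)
}

// Encoding the joined string escapes the delimiter: 'library%2Fnginx'
const sketchWholeEncoded = encodeURIComponent('library/nginx')

// Encoding per segment preserves it: 'library/nginx'
const sketchSegmentEncoded = sketchRepoPath('library', 'nginx')

// Metacharacters inside a segment are still escaped: 'acme/a%2Fb%3Fx%3D1'
const sketchHostileEncoded = sketchRepoPath('acme', 'a/b?x=1')
```

Note that `encodeURIComponent` leaves `.` untouched, so a segment such as `..` would need separate validation; per-segment encoding alone does not address it.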
13 changes: 8 additions & 5 deletions src/purl-types/golang.ts
@@ -33,6 +33,7 @@ import { httpJson } from '@socketsecurity/lib/http-request'
 import { PurlError } from '../error.js'
 import {
   ArrayPrototypeJoin,
+  encodeComponent,
   StringPrototypeCharCodeAt,
   StringPrototypeIncludes,
   StringPrototypeReplace,
@@ -108,10 +109,12 @@ export async function golangExists(
       // Go proxy uses case-encoded paths where uppercase letters are !lowercase
       const parts = StringPrototypeSplit(modulePath, '/' as any)
       for (let i = 0; i < parts.length; i++) {
-        parts[i] = StringPrototypeReplace(
-          parts[i]!,
-          /[A-Z]/g,
-          letter => `!${StringPrototypeToLowerCase(letter)}`,
+        parts[i] = encodeComponent(
+          StringPrototypeReplace(
+            parts[i]!,
+            /[A-Z]/g,
+            letter => `!${StringPrototypeToLowerCase(letter)}`,
+          ),
         )
       }
       const encodedPath = ArrayPrototypeJoin(parts, '/')
@@ -126,7 +129,7 @@ export async function golangExists(
       const latestVersion = data.Version

       if (version) {
-        const versionUrl = `https://proxy.golang.org/${encodedPath}/@v/${version}.info`
+        const versionUrl = `https://proxy.golang.org/${encodedPath}/@v/${encodeComponent(version)}.info`
         try {
           await httpJson(versionUrl)
         } catch {
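Reviewer sketch for the golang hunks above: case-encode each module path segment (uppercase letter becomes `!` plus its lowercase form), then percent-encode the segment and the version before building the proxy URL. `sketchEscapeModulePath` and `sketchVersionInfoUrl` are hypothetical helpers, and `encodeURIComponent` is assumed to approximate the library's `encodeComponent`. It leaves `!` unescaped, so the case-encoding survives.

```typescript
// Hypothetical helper: case-encode, then percent-encode, each segment.
function sketchEscapeModulePath(modulePath: string): string {
  return modulePath
    .split('/')
    .map(segment =>
      encodeURIComponent(
        // 'Azure' becomes '!azure', as in the comment from the diff.
        segment.replace(/[A-Z]/g, letter => `!${letter.toLowerCase()}`),
      ),
    )
    .join('/')
}

// Hypothetical helper: build the version lookup URL used in the hunk.
function sketchVersionInfoUrl(modulePath: string, version: string): string {
  const encodedPath = sketchEscapeModulePath(modulePath)
  // Encoding the version keeps '/', '?', and '#' from changing the path.
  return `https://proxy.golang.org/${encodedPath}/@v/${encodeURIComponent(version)}.info`
}

// 'github.com/!azure/azure-sdk-for-go'
const sketchEscapedPath = sketchEscapeModulePath(
  'github.com/Azure/azure-sdk-for-go',
)
```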
8 changes: 8 additions & 0 deletions src/purl-types/npm.ts
@@ -20,6 +20,7 @@ import {
   StringPrototypeTrim,
 } from '../primordials.js'
 import { isBlank, lowerName, lowerNamespace } from '../strings.js'
+import { validateNoInjectionByType } from '../validate.js'

 import type { TtlCache } from '@socketsecurity/lib/cache-with-ttl'

@@ -434,6 +435,13 @@ export function parseNpmSpecifier(specifier: unknown): NpmPackageComponents {
  */
 export function validate(purl: PurlObject, throws: boolean): boolean {
   const { name, namespace } = purl
+  // Validate name and namespace for injection characters
+  if (!validateNoInjectionByType('npm', 'name', name, throws)) {
+    return false
+  }
+  if (!validateNoInjectionByType('npm', 'namespace', namespace, throws)) {
+    return false
+  }
   const hasNs = namespace && namespace.length > 0
   const id = getNpmId(purl)
   const code0 = StringPrototypeCharCodeAt(id, 0)
175 changes: 154 additions & 21 deletions src/strings.ts
@@ -203,55 +203,187 @@ function replaceUnderscoresWithDashes(str: string): string {
  * space (0x20), DEL (0x7f)
  */
 function isInjectionCharCode(code: number): boolean {
-  // C0 control characters (0x00-0x1f) — includes NUL, tab, newline, CR,
-  // ESC (0x1b, terminal escape sequences), and all other control chars.
-  // Also catches vertical tab (0x0b), form feed (0x0c), and bell (0x07)
-  // which can be used for log injection and terminal manipulation.
+  // C0 control characters (0x00-0x1f)
   if (code <= 0x1f) {
     return true
   }
   // biome-ignore format: newlines
   if (
-    // space — argument splitting in shell contexts
+    // space
     code === 0x20 ||
-    // " — breaks double-quoted shell/SQL/URL contexts
+    // !
+    code === 0x21 ||
+    // "
     code === 0x22 ||
-    // # — URL fragment injection, shell comments
+    // #
     code === 0x23 ||
-    // $ — shell variable expansion, command substitution $()
+    // $
     code === 0x24 ||
-    // & — shell background execution, URL parameter delimiter
+    // %
+    code === 0x25 ||
+    // &
     code === 0x26 ||
-    // ' — breaks single-quoted shell/SQL contexts
+    // '
     code === 0x27 ||
-    // ( — shell subshell, command grouping
+    // (
     code === 0x28 ||
-    // ) — shell subshell, command grouping
+    // )
     code === 0x29 ||
-    // ; — shell command separator
+    // *
+    code === 0x2a ||
+    // ;
     code === 0x3b ||
-    // < — shell input redirection, XML/HTML injection
+    // <
     code === 0x3c ||
-    // > — shell output redirection, XML/HTML injection
+    // =
+    code === 0x3d ||
+    // >
     code === 0x3e ||
-    // \ — shell escape character, path traversal on Windows
+    // ?
+    code === 0x3f ||
+    // [
+    code === 0x5b ||
+    // \
     code === 0x5c ||
-    // ` — shell command substitution (legacy backtick form)
+    // ]
+    code === 0x5d ||
+    // `
     code === 0x60 ||
-    // { — shell brace expansion
+    // {
     code === 0x7b ||
-    // | — shell pipe
+    // |
     code === 0x7c ||
-    // } — shell brace expansion
+    // }
     code === 0x7d ||
-    // DEL (0x7f) — control character, terminal manipulation
+    // ~
+    code === 0x7e ||
+    // DEL
     code === 0x7f
   ) {
     return true
   }
+  // C1 control characters (0x80-0x9f)
+  if (code >= 0x80 && code <= 0x9f) {
+    return true
+  }
+  // Unicode dangerous characters
+  // biome-ignore format: newlines
+  if (
+    // Zero-width space
+    code === 0x200b ||
+    // Zero-width non-joiner
+    code === 0x200c ||
+    // Zero-width joiner
+    code === 0x200d ||
+    // Left-to-right mark
+    code === 0x200e ||
+    // Right-to-left mark
+    code === 0x200f ||
+    // Left-to-right embedding
+    code === 0x202a ||
+    // Right-to-left embedding
+    code === 0x202b ||
+    // Pop directional formatting
+    code === 0x202c ||
+    // Left-to-right override
+    code === 0x202d ||
+    // Right-to-left override
+    code === 0x202e ||
+    // Word joiner
+    code === 0x2060 ||
+    // BOM / zero-width no-break space
+    code === 0xfeff ||
+    // Object replacement character
+    code === 0xfffc ||
+    // Replacement character
+    code === 0xfffd
+  ) {
+    return true
+  }
   return false
 }

 /**
+ * Test whether a character code enables command execution.
+ *
+ * A narrower scanner than isInjectionCharCode, targeting characters that
+ * enable shell command execution and code injection. Allows characters
+ * that are legitimate in version strings and URL-based qualifier values
+ * (like !, +, ?, &, =, %, :, /, #, space) while still blocking the
+ * most dangerous execution vectors.
+ *
+ * Used for version, subpath, and qualifier value validation where the
+ * full injection scanner would cause false positives.
+ */
+function isCommandInjectionCharCode(code: number): boolean {
+  // C0 control characters except tab (0x09) — tab is used in some
+  // version metadata but other controls are never legitimate
+  if (code <= 0x1f && code !== 0x09) {
+    return true
+  }
+  // biome-ignore format: newlines
+  if (
+    // $ — command substitution $()
+    code === 0x24 ||
+    // ; — command separator
+    code === 0x3b ||
+    // < — input redirection
+    code === 0x3c ||
+    // > — output redirection
+    code === 0x3e ||
+    // \ — escape character
+    code === 0x5c ||
+    // ` — command substitution (backtick form)
+    code === 0x60 ||
+    // | — pipe
+    code === 0x7c ||
+    // DEL
+    code === 0x7f
+  ) {
+    return true
+  }
+  // C1 control characters
+  if (code >= 0x80 && code <= 0x9f) {
+    return true
+  }
+  // Unicode dangerous characters (same set as isInjectionCharCode)
+  // biome-ignore format: newlines
+  if (
+    code === 0x200b ||
+    code === 0x200c ||
+    code === 0x200d ||
+    code === 0x200e ||
+    code === 0x200f ||
+    code === 0x202a ||
+    code === 0x202b ||
+    code === 0x202c ||
+    code === 0x202d ||
+    code === 0x202e ||
+    code === 0x2060 ||
+    code === 0xfeff ||
+    code === 0xfffc ||
+    code === 0xfffd
+  ) {
+    return true
+  }
+  return false
+}
+
+/**
+ * Find the first command injection character in a string.
+ * Like findInjectionCharCode but uses the narrower command injection set.
+ * Returns the character code found, or -1.
+ */
+function findCommandInjectionCharCode(str: string): number {
+  for (let i = 0, { length } = str; i < length; i += 1) {
+    const code = StringPrototypeCharCodeAt(str, i)
+    if (isCommandInjectionCharCode(code)) {
+      return code
+    }
+  }
+  return -1
+}
+
+/**
  * Find the first injection character in a string.
  * Returns the character code of the first dangerous character found, or -1.
@@ -310,6 +442,7 @@ function trimLeadingSlashes(str: string): string {

 export {
   containsInjectionCharacters,
+  findCommandInjectionCharCode,
   findInjectionCharCode,
   formatInjectionChar,
   isBlank,
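Reviewer sketch for the `src/strings.ts` hunk above: a condensed comparison of the broad and narrow scanners, showing why a URL-valued qualifier trips the broad set but passes the narrow one. The ASCII codes are copied from the diff. The C1 and Unicode checks are omitted for brevity, and all `sketch*` names are hypothetical.

```typescript
// ASCII codes flagged by isInjectionCharCode in the diff (beyond C0).
const SKETCH_BROAD_CODES = new Set<number>([
  0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27, 0x28, 0x29, 0x2a, 0x3b,
  0x3c, 0x3d, 0x3e, 0x3f, 0x5b, 0x5c, 0x5d, 0x60, 0x7b, 0x7c, 0x7d, 0x7e,
  0x7f,
])

// ASCII codes flagged by isCommandInjectionCharCode in the diff.
const SKETCH_NARROW_CODES = new Set<number>([
  0x24, 0x3b, 0x3c, 0x3e, 0x5c, 0x60, 0x7c, 0x7f,
])

function sketchIsBroad(code: number): boolean {
  return code <= 0x1f || SKETCH_BROAD_CODES.has(code)
}

function sketchIsNarrow(code: number): boolean {
  // Tab (0x09) is allowed by the narrow scanner.
  return (code <= 0x1f && code !== 0x09) || SKETCH_NARROW_CODES.has(code)
}

// Returns the first flagged character code, or -1 if none is found.
function sketchFindFirst(
  str: string,
  isFlagged: (code: number) => boolean,
): number {
  for (let i = 0; i < str.length; i += 1) {
    const code = str.charCodeAt(i)
    if (isFlagged(code)) {
      return code
    }
  }
  return -1
}

const sketchUrlValue = 'https://example.com/repo?ref=main'
// Broad scanner flags '?' (0x3f); narrow scanner accepts the URL (-1).
const sketchBroadHit = sketchFindFirst(sketchUrlValue, sketchIsBroad)
const sketchNarrowHit = sketchFindFirst(sketchUrlValue, sketchIsNarrow)
// Both scanners flag ';' (0x3b) in a shell-style payload.
const sketchPayloadHit = sketchFindFirst('1.0.0; rm -rf /', sketchIsNarrow)
```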