40 changes: 38 additions & 2 deletions internal/db/mysql.go
@@ -151,6 +151,9 @@ func (m *mysqlDatabase) EnsureSchema(ctx context.Context) error {
-- Trace
trace JSON,

+ -- CPE (Common Platform Enumeration)
+ cpe JSON,
+
INDEX idx_timestamp (timestamp),
INDEX idx_url (url(255)),
INDEX idx_host (host),
@@ -163,9 +166,38 @@ func (m *mysqlDatabase) EnsureSchema(ctx context.Context) error {
return fmt.Errorf("failed to create schema: %w", err)
}

+ // Back-compat for databases whose schema was created before CPE support.
+ // New installs already get this column via the CREATE TABLE above; this
+ // path only matters for in-place upgrades.
+ // TODO: replace these ad-hoc ensureColumn calls with a proper migration
+ // framework (e.g. golang-migrate / goose) once more schema changes accumulate.
+ if err := m.ensureColumn(ctx, "cpe", "JSON"); err != nil {
+ return fmt.Errorf("failed to ensure cpe column: %w", err)
+ }
Comment on lines +169 to +176

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

cd internal/db && head -200 mysql.go | tail -50

Repository: projectdiscovery/httpx

Length of output: 1521


🏁 Script executed:

cat -n internal/db/mysql.go | sed -n '169,200p'

Repository: projectdiscovery/httpx

Length of output: 1329


🏁 Script executed:

# Check the import structure to understand current state
cat -n internal/db/mysql.go | head -20

Repository: projectdiscovery/httpx

Length of output: 526


🏁 Script executed:

# Verify MySQL error code 1060
web_search

Repository: projectdiscovery/httpx

Length of output: 117


🌐 Web query:

MySQL error code 1060 duplicate column

💡 Result:

MySQL error code 1060 (ER_DUP_FIELDNAME, SQLSTATE 42S21) occurs with the message "Duplicate column name '%s'".[1][2][3] It is triggered when attempting to define a column with a name that already exists in the table or result set, enforcing unique column names within tables and query results.[4][5][1]

Common scenarios:
- CREATE TABLE with duplicate column definitions, e.g., two 'email' columns.[4]
- ALTER TABLE ADD COLUMN where the column already exists.[6]
- SELECT * from joined tables with matching column names (e.g., both have 'id').[7]
- CREATE VIEW without unique column aliases.[8]
- Subqueries or INSERT...SELECT without column aliases.[9][10]

Fixes:
1. Check the error message for the specific duplicate column name '%s'.
2. For CREATE/ALTER: Remove or rename the duplicate (e.g., ADD COLUMN new_email).[4][6]
3. For SELECT/JOIN/VIEW: Use aliases, e.g., SELECT t1.id AS t1_id, t2.id AS t2_id.[7][8] Example: SELECT t.id AS tips_id FROM tips t LEFT JOIN tip_usage ON tip_usage.tip_id = t.id GROUP BY t.id;[7]
4. Verify table schema with SHOW CREATE TABLE or DESCRIBE table to confirm existing columns.[6]

This behavior is consistent across MySQL versions including 8.0+ and 9.x, and also in MariaDB.[3][1][2]


🏁 Script executed:

# Check if errors package is already imported
rg "^import|\"errors\"" internal/db/mysql.go

Repository: projectdiscovery/httpx

Length of output: 75


🏁 Script executed:

# Verify go-sql-driver/mysql exports MySQLError type
# Search for usage patterns in the codebase
rg "MySQLError|mysql\.MySQLError" --type go

Repository: projectdiscovery/httpx

Length of output: 48


🏁 Script executed:

# Check the actual go-sql-driver/mysql module to understand the error type
web_search

Repository: projectdiscovery/httpx

Length of output: 117


🌐 Web query:

go-sql-driver mysql MySQLError type documentation

💡 Result:

In github.com/go-sql-driver/mysql, MySQL server errors are represented by the struct type MySQLError (its methods use a pointer receiver):
- Type: `type MySQLError struct { Number uint16; SQLState [5]byte; Message string }` [2]
- String formatting: `func (me *MySQLError) Error() string { ... }` [2]
- Error matching: `func (me *MySQLError) Is(err error) bool { if merr, ok := err.(*MySQLError); ok { return merr.Number == me.Number }; return false }` [2]

Typical usage:
1. Import the driver and run queries via database/sql. [3]
2. Extract *mysql.MySQLError from returned errors with errors.As when you need the MySQL error number and SQLState. Because MySQLError defines Is, errors.Is also matches on the error number. [2][4]

Example (extract typed MySQL error): `var mysqlErr *mysql.MySQLError; if errors.As(err, &mysqlErr) { log.Printf("MySQL Error %d (%s): %s", mysqlErr.Number, string(mysqlErr.SQLState[:]), mysqlErr.Message) }` This works because MySQLError carries Number/SQLState/Message. [2]

Notes on exported sentinel errors:
- The driver also defines various exported error variables (e.g., ErrInvalidConn, ErrMalformPkt, ErrNoTLS, etc.) for non-server-returned error scenarios, and these can be checked directly. [2]

If you want the exact code definition, see errors.go in the driver repository. [2]
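To make the matching behaviour described above concrete, here is a minimal sketch that runs without a database driver. `serverError`, `newDupColumnError`, and `isDuplicateColumn` are hypothetical names introduced only for illustration; `serverError` merely mirrors the field layout and number-based `Is` method reported for the driver's `MySQLError` and is not the driver's type.

```go
package main

import (
	"errors"
	"fmt"
)

// errDupFieldName is the MySQL server error number for ER_DUP_FIELDNAME.
const errDupFieldName = 1060

// serverError is a hypothetical stand-in that mirrors the shape reported for
// the driver's MySQLError (Number, SQLState, Message). It is NOT the real type.
type serverError struct {
	Number   uint16
	SQLState [5]byte
	Message  string
}

func (e *serverError) Error() string {
	return fmt.Sprintf("Error %d (%s): %s", e.Number, string(e.SQLState[:]), e.Message)
}

// Is matches on the error number only, so errors.Is can compare against a
// template value that carries nothing but the number.
func (e *serverError) Is(target error) bool {
	if t, ok := target.(*serverError); ok {
		return t.Number == e.Number
	}
	return false
}

// newDupColumnError builds a duplicate-column error and wraps it with %w,
// the same way EnsureSchema wraps errors from ensureColumn.
func newDupColumnError(column string) error {
	inner := &serverError{
		Number:   errDupFieldName,
		SQLState: [5]byte{'4', '2', 'S', '2', '1'},
		Message:  fmt.Sprintf("Duplicate column name '%s'", column),
	}
	return fmt.Errorf("failed to ensure %s column: %w", column, inner)
}

// isDuplicateColumn reports whether err wraps a server error numbered 1060.
func isDuplicateColumn(err error) bool {
	var se *serverError
	return errors.As(err, &se) && se.Number == errDupFieldName
}

func main() {
	err := newDupColumnError("cpe")
	fmt.Println(isDuplicateColumn(err))
	fmt.Println(errors.Is(err, &serverError{Number: errDupFieldName}))
	fmt.Println(isDuplicateColumn(errors.New("connection refused")))
}
```

Both `errors.As` and `errors.Is` see through the `%w` wrapping, which is why the check still works after `EnsureSchema` adds its own context to the error.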


Make the column migration safe for concurrent upgrades.

The COUNT(*) + ALTER TABLE sequence at lines 181-199 is racy. If two instances start against the same pre-existing table, both can observe the column missing; one ADD COLUMN succeeds and the other fails with MySQL error 1060 (duplicate column), so the migration is not idempotent during rolling deploys.

💡 Suggested fix
 import (
 	"context"
 	"database/sql"
 	"encoding/json"
+	"errors"
 	"fmt"
 	"strings"
 
-	_ "github.com/go-sql-driver/mysql"
+	mysql "github.com/go-sql-driver/mysql"
 	"github.com/projectdiscovery/httpx/runner"
 )
@@
 func (m *mysqlDatabase) ensureColumn(ctx context.Context, column, definition string) error {
 	var count int
 	err := m.db.QueryRowContext(ctx,
 		`SELECT COUNT(*) FROM information_schema.columns
 		 WHERE table_schema = DATABASE() AND table_name = ? AND column_name = ?`,
 		m.cfg.TableName, column,
 	).Scan(&count)
 	if err != nil {
 		return err
 	}
 	if count > 0 {
 		return nil
 	}
 	_, err = m.db.ExecContext(ctx,
 		fmt.Sprintf("ALTER TABLE %s ADD COLUMN %s %s",
 			quoteIdentifier(m.cfg.TableName), quoteIdentifier(column), definition),
 	)
+	if err != nil {
+		var mysqlErr *mysql.MySQLError
+		if errors.As(err, &mysqlErr) && mysqlErr.Number == 1060 {
+			return nil
+		}
+	}
 	return err
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/db/mysql.go` around lines 169 - 176, The current ad-hoc migration in
m.ensureColumn uses a COUNT(*) then ALTER TABLE which races during concurrent
upgrades; update ensureColumn to make the schema change idempotent by performing
the ALTER TABLE ADD COLUMN using "ADD COLUMN IF NOT EXISTS" when supported
(MariaDB accepts this clause; MySQL 8.x does not) or, if not, by executing the
ALTER TABLE and treating MySQL error 1060 (duplicate
column) as a non-fatal success. Locate the ensureColumn implementation and
change the logic that does COUNT(*) + ALTER to instead attempt the ALTER
directly and ignore/handle the duplicate-column error (or prefer ADD COLUMN IF
NOT EXISTS) so concurrent callers do not return an error.
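The interleaving behind this finding can be replayed deterministically without a database. The sketch below is illustrative only: `fakeTable`, `addIfMissing`, `simulateRace`, and the sentinel `errDuplicateColumn` are hypothetical stand-ins for the table, the ALTER TABLE step, and MySQL error 1060, not code from this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// errDuplicateColumn stands in for MySQL error 1060 in this sketch.
var errDuplicateColumn = errors.New("duplicate column name (error 1060)")

// fakeTable is a hypothetical in-memory stand-in for the table's column set.
type fakeTable struct {
	columns map[string]bool
}

// hasColumn plays the role of the information_schema COUNT(*) query.
func (t *fakeTable) hasColumn(name string) bool { return t.columns[name] }

// addColumn plays the role of ALTER TABLE ... ADD COLUMN and fails when the
// column already exists, as the server does.
func (t *fakeTable) addColumn(name string) error {
	if t.columns[name] {
		return errDuplicateColumn
	}
	t.columns[name] = true
	return nil
}

// addIfMissing performs the ALTER step. With tolerateDuplicate set, a
// duplicate-column failure counts as success, because it means another
// caller added the column after our existence check.
func addIfMissing(t *fakeTable, name string, tolerateDuplicate bool) error {
	err := t.addColumn(name)
	if tolerateDuplicate && errors.Is(err, errDuplicateColumn) {
		return nil
	}
	return err
}

// simulateRace replays the interleaving from the review: callers A and B both
// observe the column as missing before either one adds it.
func simulateRace(tolerateDuplicate bool) (errA, errB error) {
	t := &fakeTable{columns: map[string]bool{}}
	missingA := !t.hasColumn("cpe")
	missingB := !t.hasColumn("cpe")
	if missingA {
		errA = addIfMissing(t, "cpe", tolerateDuplicate)
	}
	if missingB {
		errB = addIfMissing(t, "cpe", tolerateDuplicate)
	}
	return errA, errB
}

func main() {
	a, b := simulateRace(false)
	fmt.Println("strict:  ", a, b)
	a, b = simulateRace(true)
	fmt.Println("tolerant:", a, b)
}
```

In the strict variant the second caller fails even though the schema ends up correct; in the tolerant variant both callers succeed, which is the idempotent behaviour a rolling deploy needs.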


return nil
}

+ func (m *mysqlDatabase) ensureColumn(ctx context.Context, column, definition string) error {
+ var count int
+ err := m.db.QueryRowContext(ctx,
+ `SELECT COUNT(*) FROM information_schema.columns
+ WHERE table_schema = DATABASE() AND table_name = ? AND column_name = ?`,
+ m.cfg.TableName, column,
+ ).Scan(&count)
+ if err != nil {
+ return err
+ }
+ if count > 0 {
+ return nil
+ }
+ _, err = m.db.ExecContext(ctx,
+ fmt.Sprintf("ALTER TABLE %s ADD COLUMN %s %s",
+ quoteIdentifier(m.cfg.TableName), quoteIdentifier(column), definition),
+ )
+ return err
+ }

func (m *mysqlDatabase) InsertBatch(ctx context.Context, results []runner.Result) error {
if len(results) == 0 {
return nil
@@ -193,7 +225,8 @@ func (m *mysqlDatabase) InsertBatch(ctx context.Context, results []runner.Result
words, `+"`lines`"+`, header, extracts, extract_regex,
chain, chain_status_codes,
headless_body, screenshot_bytes, screenshot_path, screenshot_path_rel, stored_response_path,
- knowledgebase, link_request, trace
+ knowledgebase, link_request, trace,
+ cpe
) VALUES (
?, ?, ?, ?, ?, ?, ?, ?, ?,
?, ?, ?, ?, ?, ?,
@@ -205,7 +238,8 @@ func (m *mysqlDatabase) InsertBatch(ctx context.Context, results []runner.Result
?, ?, ?, ?, ?,
?, ?,
?, ?, ?, ?, ?,
- ?, ?, ?
+ ?, ?, ?,
+ ?
)`, tableName)

stmt, err := tx.PrepareContext(ctx, query)
@@ -236,6 +270,7 @@ func (m *mysqlDatabase) InsertBatch(ctx context.Context, results []runner.Result
kbJSON, _ := json.Marshal(r.KnowledgeBase)
linkReqJSON, _ := json.Marshal(r.LinkRequest)
traceJSON, _ := json.Marshal(r.Trace)
+ cpeJSON, _ := json.Marshal(r.CPE)

_, err = stmt.ExecContext(ctx,
r.Timestamp, r.URL, r.Input, r.Host, r.Port, r.Scheme, r.Path, r.Method, r.FinalURL,
@@ -249,6 +284,7 @@ func (m *mysqlDatabase) InsertBatch(ctx context.Context, results []runner.Result
chainJSON, chainStatusJSON,
r.HeadlessBody, r.ScreenshotBytes, r.ScreenshotPath, r.ScreenshotPathRel, r.StoredResponsePath,
kbJSON, linkReqJSON, traceJSON,
+ cpeJSON,
)
if err != nil {
return fmt.Errorf("failed to insert result: %w", err)
22 changes: 19 additions & 3 deletions internal/db/postgres.go
@@ -150,15 +150,27 @@ func (p *postgresDatabase) EnsureSchema(ctx context.Context) error {
link_request JSONB,

-- Trace
- trace JSONB
+ trace JSONB,
+
+ -- CPE (Common Platform Enumeration)
+ cpe JSONB
);

+ -- Back-compat for databases whose schema was created before CPE support.
+ -- New installs already get this column via the CREATE TABLE above; this
+ -- statement only matters for in-place upgrades.
+ -- TODO: replace these ad-hoc ALTER TABLE statements with a proper
+ -- migration framework (e.g. golang-migrate / goose) once more schema
+ -- changes accumulate.
+ ALTER TABLE %s ADD COLUMN IF NOT EXISTS cpe JSONB;
+
CREATE INDEX IF NOT EXISTS %s ON %s(timestamp DESC);
CREATE INDEX IF NOT EXISTS %s ON %s(url);
CREATE INDEX IF NOT EXISTS %s ON %s(host);
CREATE INDEX IF NOT EXISTS %s ON %s(status_code);
CREATE INDEX IF NOT EXISTS %s ON %s USING GIN(tech);
`,
+ tableName,
tableName,
idxTimestamp, tableName,
idxURL, tableName,
@@ -201,7 +213,8 @@ func (p *postgresDatabase) InsertBatch(ctx context.Context, results []runner.Res
words, lines, header, extracts, extract_regex,
chain, chain_status_codes,
headless_body, screenshot_bytes, screenshot_path, screenshot_path_rel, stored_response_path,
- knowledgebase, link_request, trace
+ knowledgebase, link_request, trace,
+ cpe
) VALUES (
$1, $2, $3, $4, $5, $6, $7, $8, $9,
$10, $11, $12, $13, $14, $15,
@@ -213,7 +226,8 @@ func (p *postgresDatabase) InsertBatch(ctx context.Context, results []runner.Res
$48, $49, $50, $51, $52,
$53, $54,
$55, $56, $57, $58, $59,
- $60, $61, $62
+ $60, $61, $62,
+ $63
)`, tableName)

stmt, err := tx.PrepareContext(ctx, query)
@@ -235,6 +249,7 @@ func (p *postgresDatabase) InsertBatch(ctx context.Context, results []runner.Res
kbJSON, _ := json.Marshal(r.KnowledgeBase)
linkReqJSON, _ := json.Marshal(r.LinkRequest)
traceJSON, _ := json.Marshal(r.Trace)
+ cpeJSON, _ := json.Marshal(r.CPE)

_, err = stmt.ExecContext(ctx,
r.Timestamp, r.URL, r.Input, r.Host, r.Port, r.Scheme, r.Path, r.Method, r.FinalURL,
@@ -248,6 +263,7 @@ func (p *postgresDatabase) InsertBatch(ctx context.Context, results []runner.Res
chainJSON, pq.Array(r.ChainStatusCodes),
r.HeadlessBody, r.ScreenshotBytes, r.ScreenshotPath, r.ScreenshotPathRel, r.StoredResponsePath,
kbJSON, linkReqJSON, traceJSON,
+ cpeJSON,
)
if err != nil {
return fmt.Errorf("failed to insert result: %w", err)
9 changes: 4 additions & 5 deletions runner/options.go
@@ -801,11 +801,10 @@ func (options *Options) ValidateOptions() error {
var resolvers []string
for _, resolver := range options.Resolvers {
if fileutil.FileExists(resolver) {
- chFile, err := fileutil.ReadFile(resolver)
- if err != nil {
- return errors.Wrapf(err, "Couldn't process resolver file \"%s\"", resolver)
- }
- for line := range chFile {
+ for line, err := range fileutil.Lines(resolver) {
+ if err != nil {
+ return errors.Wrapf(err, "Couldn't process resolver file \"%s\"", resolver)
+ }
line = strings.TrimSpace(line)
if line != "" && strings.Contains(line, ",") {
for item := range strings.SplitSeq(line, ",") {
27 changes: 12 additions & 15 deletions runner/runner.go
@@ -743,11 +743,10 @@ func (r *Runner) streamInput() (chan string, error) {
return
}
} else {
- fchan, err := fileutil.ReadFile(r.options.InputFile)
- if err != nil {
- return
- }
- for item := range fchan {
+ for item, err := range fileutil.Lines(r.options.InputFile) {
+ if err != nil {
+ return
+ }
Comment on lines +746 to +749

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not swallow stream read errors silently.

At lines 748, 765, and 778, a read error causes an immediate return with no log message or error signal. That can silently truncate a scan and is very hard to diagnose.

Proposed fix
@@
                for item, err := range fileutil.Lines(r.options.InputFile) {
                    if err != nil {
+                       gologger.Error().Msgf("Could not read input file '%s': %s\n", r.options.InputFile, err)
                        return
                    }
@@
                for item, err := range fileutil.Lines(file) {
                    if err != nil {
+                       gologger.Error().Msgf("Could not read input file '%s': %s\n", file, err)
                        return
                    }
@@
            for item, err := range fileutil.LinesReader(os.Stdin) {
                if err != nil {
+                   gologger.Error().Msgf("Could not read input from stdin: %s\n", err)
                    return
                }

Also applies to: 763-766, 776-779

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@runner/runner.go` around lines 746 - 749, In the loops iterating "for item,
err := range fileutil.Lines(r.options.InputFile)" do not silently return on err;
instead surface the error (log it via the runner's logger or return it up the
call chain). Replace the bare "if err != nil { return }" with code that logs a
descriptive message including r.options.InputFile and the error (or wraps and
returns the error) so read errors at the fileutil.Lines call are visible and
fail the scan; apply the same change to the other two occurrences handling
errors from fileutil.Lines and fileutil.LinesReader.
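As a rough illustration of the "return it up the call chain" option, the sketch below streams lines from a reader and hands the read error back to the caller instead of dropping it. It deliberately uses only the standard library: `streamLines`, `collect`, `failingReader`, and `demoInput` are hypothetical helpers, and a `bufio.Scanner` stands in for `fileutil.Lines` / `fileutil.LinesReader`, whose internals are not shown in this diff.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"strings"
)

// errDiskFailure is a made-up error used to imitate an input that breaks midway.
var errDiskFailure = errors.New("simulated disk failure")

// failingReader is a hypothetical reader that serves its data and then
// reports err instead of a clean io.EOF.
type failingReader struct {
	data io.Reader
	err  error
}

func (f *failingReader) Read(p []byte) (int, error) {
	n, err := f.data.Read(p)
	if err == io.EOF {
		return n, f.err
	}
	return n, err
}

// streamLines passes each line to send and returns any read error to the
// caller rather than swallowing it. send returns false to stop early.
func streamLines(r io.Reader, send func(string) bool) error {
	scanner := bufio.NewScanner(r)
	for scanner.Scan() {
		if !send(scanner.Text()) {
			return nil // the consumer asked to stop; not an error
		}
	}
	if err := scanner.Err(); err != nil {
		return fmt.Errorf("could not read input: %w", err)
	}
	return nil
}

// collect gathers all lines and returns them together with the read error.
func collect(r io.Reader) ([]string, error) {
	var lines []string
	err := streamLines(r, func(line string) bool {
		lines = append(lines, line)
		return true
	})
	return lines, err
}

// demoInput returns two complete lines followed by a read failure.
func demoInput() io.Reader {
	return &failingReader{
		data: strings.NewReader("a.example\nb.example\n"),
		err:  errDiskFailure,
	}
}

func main() {
	lines, err := collect(demoInput())
	fmt.Println("lines read:", len(lines))
	if err != nil {
		// The caller can now log the failure or abort the scan.
		fmt.Println("error:", err)
	}
}
```

With this shape the caller can tell a complete input from a truncated one, which the bare `return` in the loops above cannot express.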

if r.options.SkipDedupe || r.testAndSet(item) {
if !trySend(item) {
return
@@ -761,11 +760,10 @@ func (r *Runner) streamInput() (chan string, error) {
gologger.Fatal().Msgf("No input provided: %s", err)
}
for _, file := range files {
- fchan, err := fileutil.ReadFile(file)
- if err != nil {
- return
- }
- for item := range fchan {
+ for item, err := range fileutil.Lines(file) {
+ if err != nil {
+ return
+ }
if r.options.SkipDedupe || r.testAndSet(item) {
if !trySend(item) {
return
@@ -775,11 +773,10 @@ func (r *Runner) streamInput() (chan string, error) {
}
}
if fileutil.HasStdin() {
- fchan, err := fileutil.ReadFileWithReader(os.Stdin)
- if err != nil {
- return
- }
- for item := range fchan {
+ for item, err := range fileutil.LinesReader(os.Stdin) {
+ if err != nil {
+ return
+ }
if r.options.SkipDedupe || r.testAndSet(item) {
if !trySend(item) {
return