Skip to content

CMR-11037: Fix middleware for body-copy to work properly with form data#2366

Merged
zimzoom merged 1 commit intomasterfrom
CMR-11037
Jan 7, 2026
Merged

CMR-11037: Fix middleware for body-copy to work properly with form data#2366
zimzoom merged 1 commit intomasterfrom
CMR-11037

Conversation

@zimzoom
Copy link
Copy Markdown
Contributor

@zimzoom zimzoom commented Dec 23, 2025

Overview

What is the objective?

Original reported bug noted that collection facets search was returning an inaccurate field; further investigation determined the root cause was that this middleware function for attaching "body-copy" when processing form encoded data was not working correctly, and actually just returning an empty string. This middleware has been fixed in this PR to correctly parse the body copy in the case of form encoded requests while still maintaining the original slurp functionality otherwise.

What are the changes?

This middleware has been fixed in this PR to correctly parse the body copy in the case of form encoded requests while still maintaining the original slurp functionality otherwise.

What areas of the application does this impact?

search-app

Required Checklist

  • [ x] New and existing unit and int tests pass locally and remotely
  • [ x] clj-kondo has been run locally and all errors in changed files are corrected
  • [ x] I have commented my code, particularly in hard-to-understand areas
  • [ x] I have made changes to the documentation (if necessary)
  • [ x] My changes generate no new warnings

Additional Checklist

  • [ x] I have removed unnecessary/dead code and imports in files I have changed
  • [ x] I have cleaned up integration tests by doing one or more of the following:
    • migrated any are2 tests to are3 in files I have changed
    • de-duped, consolidated, removed dead int tests
    • transformed applicable int tests into unit tests
    • reduced number of system state resets by updating fixtures. Ex) (use-fixtures :each (ingest/reset-fixture {})) to be :once instead of :each

Summary by CodeRabbit

  • New Features

    • Added multipart/form-data parsing so multipart POSTs are parsed and reconstructed for reliable processing.
  • Bug Fixes

    • Improved request-body preservation so POST form bodies can be read multiple times (body now preserved and replayable).

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Dec 23, 2025

📝 Walkthrough

Walkthrough

Adds multipart/form-data parsing to search routes. Introduces parse-multipart-body to reconstruct multipart params as a URL-encoded byte stream. copy-of-body-handler now detects multipart requests, uses the parser, and attaches both :body-copy and a reconstituted :body (ByteArrayInputStream).

Changes

Cohort / File(s) Summary
Multipart Request Body Handling
search-app/src/cmr/search/routes.clj
Added parse-multipart-body to convert multipart form-data parts into a query-string-like byte sequence. Updated copy-of-body-handler to detect multipart/form-data, use the new parser instead of slurping, and populate both :body-copy and :body (ByteArrayInputStream). Imported ring.middleware.multipart-params.

Sequence Diagram

sequenceDiagram
    autonumber
    participant Client as HTTP Client
    participant Handler as copy-of-body-handler
    participant Detector as Content-Type Check
    participant MultipartParser as parse-multipart-body
    participant Stream as Reconstituted Body (ByteArrayInputStream)

    Client->>Handler: POST request (maybe multipart/form-data)
    Handler->>Detector: Inspect Content-Type header
    alt multipart/form-data
        Handler->>MultipartParser: Pass request with params
        MultipartParser->>MultipartParser: Convert parts -> query-string bytes
        MultipartParser->>Stream: Return ByteArrayInputStream
        Handler->>Handler: Attach `:body-copy` and `:body` (Stream)
    else not multipart
        Handler->>Handler: Slurp raw body -> attach `:body-copy` and `:body`
    end
    Handler->>Client: Continue processing with preserved body
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I nibbled parts and stitched them tight,
Turned scattered fields into one soft byte,
A stream to read again by candlelight,
Hooray — no lost body in the night!

Pre-merge checks

✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly identifies the issue (CMR-11037) and describes the main fix: middleware for body-copy now works with form data.
Description check ✅ Passed The description follows the template with all required sections completed, including objectives, changes, impacted areas, and both checklists marked complete.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 444e43e and 177f693.

📒 Files selected for processing (1)
  • search-app/src/cmr/search/routes.clj
🚧 Files skipped from review as they are similar to previous changes (1)
  • search-app/src/cmr/search/routes.clj

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8ce2b5d and e4df4ac.

📒 Files selected for processing (1)
  • search-app/src/cmr/search/routes.clj
🔇 Additional comments (2)
search-app/src/cmr/search/routes.clj (2)

27-27: LGTM!

The import addition is necessary for multipart form-data parsing and follows standard conventions.


70-73: Implementation looks correct, pending fixes to parse-multipart-body.

The content-type detection using string/starts-with? properly handles multipart content-types with boundary parameters (e.g., multipart/form-data; boundary=...). The approach of parsing the multipart body and reconstructing it as a ByteArrayInputStream enables multiple reads, which addresses the original bug.

However, ensure the issues flagged in parse-multipart-body (URL encoding and file upload handling) are addressed, as this handler depends on that function's correctness.

Comment thread search-app/src/cmr/search/routes.clj
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Dec 31, 2025

Codecov Report

❌ Patch coverage is 12.50000% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 29.14%. Comparing base (42d0ee6) to head (444e43e).

Files with missing lines Patch % Lines
search-app/src/cmr/search/routes.clj 12.50% 7 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (42d0ee6) and HEAD (444e43e). Click for more details.

HEAD has 2 uploads less than BASE
Flag BASE (42d0ee6) HEAD (444e43e)
6 4
Additional details and impacted files
@@             Coverage Diff             @@
##           master    #2366       +/-   ##
===========================================
- Coverage   58.02%   29.14%   -28.89%     
===========================================
  Files        1063     1003       -60     
  Lines       72750    69333     -3417     
  Branches     2110     1196      -914     
===========================================
- Hits        42212    20204    -22008     
- Misses      28569    47994    +19425     
+ Partials     1969     1135      -834     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
search-app/src/cmr/search/routes.clj (1)

77-77: Major: Specify UTF-8 encoding explicitly when converting to bytes.

Using .getBytes without specifying a charset uses the platform's default encoding, which can vary across environments and cause issues with non-ASCII characters (e.g., international characters, emojis).

🔎 Proposed fix specifying UTF-8 encoding
       (handler (assoc
                 request
                 :body-copy body
-                :body (java.io.ByteArrayInputStream. (.getBytes body)))))))
+                :body (java.io.ByteArrayInputStream. (.getBytes body "UTF-8")))))))

This ensures consistent UTF-8 encoding across all environments and properly handles international characters.

♻️ Duplicate comments (1)
search-app/src/cmr/search/routes.clj (1)

59-62: Critical: URL encoding is still missing, plus type and nil safety concerns.

This function has several issues that will cause runtime errors or malformed data:

  1. URL encoding: Parameter names and values containing special characters (&, =, %, spaces, etc.) will create malformed query strings that cannot be parsed correctly downstream. This was flagged in a previous review but remains unaddressed.

  2. Type safety: (name k) assumes keys are keywords, but :multipart-params can contain string keys. This will throw an exception for string keys.

  3. Nil and non-string values: Values could be nil, File objects (for file uploads), or other non-string types. These need explicit handling before concatenation.

🔎 Proposed fix addressing all three issues
+(ns cmr.search.routes
+  ...
+  (:require
+   ...
+   [ring.util.codec :as codec]
+   ...))
+
 (defn parse-multipart-body [request]
   (let [parsed (multipart/multipart-params-request request)]
     (string/join "&" (for [[k v] (:multipart-params parsed)]
-                       (str (name k) "=" v)))))
+                       (let [key-str (if (keyword? k) (name k) (str k))
+                             val-str (if (nil? v) "" (str v))]
+                         (str (codec/url-encode key-str) "=" (codec/url-encode val-str)))))))

This fix:

  • Imports ring.util.codec for URL encoding
  • Handles both keyword and string keys safely
  • Converts nil values to empty strings
  • Converts all values to strings before encoding
  • URL-encodes both keys and values to prevent malformed query strings
🧹 Nitpick comments (1)
search-app/src/cmr/search/routes.clj (1)

70-77: Consider adding error handling for parsing failures.

The current implementation doesn't handle potential exceptions from parse-multipart-body or slurp. While the outer exception-handler middleware (line 132) will catch these, adding explicit error handling here could:

  1. Provide more specific error messages for debugging
  2. Allow graceful fallbacks if multipart parsing fails
  3. Make the middleware more resilient to malformed requests
Example error handling approach
(defn copy-of-body-handler
  [handler]
  (fn [request]
    (let [content-type (get-in request [:headers "content-type"])
          body (try
                 (if (and content-type 
                         (string? content-type)
                         (string/starts-with? content-type "multipart/form-data"))
                   (parse-multipart-body request)
                   (slurp (:body request)))
                 (catch Exception e
                   (throw (ex-info "Failed to parse request body"
                                   {:type :body-parsing-error
                                    :content-type content-type}
                                   e))))]
      (handler (assoc
                request
                :body-copy body
                :body (java.io.ByteArrayInputStream. (.getBytes body "UTF-8")))))))

This provides clearer error context while still propagating the exception to be handled by the middleware chain.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 96cda8a and 444e43e.

📒 Files selected for processing (1)
  • search-app/src/cmr/search/routes.clj
🔇 Additional comments (2)
search-app/src/cmr/search/routes.clj (2)

27-27: LGTM!

The import of ring.middleware.multipart-params is appropriate for the multipart form-data parsing functionality added in this PR.


70-73: Critical: Nil content-type will cause NullPointerException.

Line 71 will throw a NullPointerException if the content-type header is nil because string/starts-with? does not handle nil strings.

🔎 Proposed fix to safely check content-type
     (let [content-type (get-in request [:headers "content-type"])
-          body (if (and content-type (string/starts-with? content-type "multipart/form-data"))
+          body (if (and content-type 
+                       (string? content-type)
+                       (string/starts-with? content-type "multipart/form-data"))
                  (parse-multipart-body request)
                  (slurp (:body request)))]

This ensures content-type is not only non-nil but also a string before calling string/starts-with?.

Likely an incorrect or invalid review comment.

Comment on lines +59 to +62
(defn parse-multipart-body [request]
(let [parsed (multipart/multipart-params-request request)]
(string/join "&" (for [[k v] (:multipart-params parsed)]
(str (name k) "=" v)))))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify file upload patterns and body-copy usage in the search app

echo "=== Check for file upload handling in search routes ==="
rg -n -C5 'file|upload|:tempfile|:filename' search-app/src/cmr/search/ --type=clojure

echo -e "\n=== Find all usages of :body-copy to understand expected format ==="
rg -n -C5 ':body-copy' search-app/ --type=clojure

echo -e "\n=== Check if shapefile middleware expects multipart or query string ==="
rg -n -C5 'body-copy.*shapefile|shapefile.*body-copy' search-app/src/cmr/search/middleware/ --type=clojure

Repository: nasa/Common-Metadata-Repository

Length of output: 50388


🏁 Script executed:

#!/bin/bash

# Find where parse-multipart-body is defined and used
echo "=== Find parse-multipart-body usages ==="
rg -n "parse-multipart-body" search-app/src/cmr/search/ --type=clojure

# Search specifically for :body-copy assignments and usage
echo -e "\n=== Find :body-copy usage patterns ==="
rg -n ":body-copy" search-app/src/cmr/search/ --type=clojure -C3 | head -100

# Check copy-of-body-handler middleware
echo -e "\n=== Find copy-of-body-handler ==="
rg -n "copy-of-body-handler\|body-copy" search-app/src/cmr/search/ --type=clojure -B2 -A5 | head -80

Repository: nasa/Common-Metadata-Repository

Length of output: 4255


🏁 Script executed:

#!/bin/bash

# Get the complete copy-of-body-handler function
echo "=== Complete copy-of-body-handler function ==="
rg -A20 "defn copy-of-body-handler" search-app/src/cmr/search/routes.clj

# Check if shapefile uploads use multipart/form-data
echo -e "\n=== Shapefile middleware inspection ==="
rg -n "shapefile-upload" search-app/src/cmr/search/middleware/ -A10 -B5 --type=clojure | head -60

# Find the shapefile middleware file
echo -e "\n=== Find shapefile middleware ==="
fd "shapefile" search-app/src/cmr/search/middleware/

Repository: nasa/Common-Metadata-Repository

Length of output: 2770


🏁 Script executed:

#!/bin/bash

# Check what the parse-multipart-body function actually does with multipart params
echo "=== Check parse-multipart-body full implementation ==="
rg -B2 -A8 "defn parse-multipart-body" search-app/src/cmr/search/routes.clj

# Check middleware order and where shapefile param flows
echo -e "\n=== Understand middleware chain order in routes.clj ==="
sed -n '130,145p' search-app/src/cmr/search/routes.clj

# Check if body-copy is actually used in request parameter processing
echo -e "\n=== Check if body or body-copy is used in find-concepts ==="
rg -n "find-concepts" search-app/src/cmr/search/api/concepts_search.clj -A30 | head -50

# Check what parameters are expected in the search API
echo -e "\n=== Check parameter usage in search ==="
rg -n "defn find-concepts" search-app/src/cmr/search/api/concepts_search.clj -A5

Repository: nasa/Common-Metadata-Repository

Length of output: 4254


🏁 Script executed:

#!/bin/bash

# Let's check what multipart/multipart-params-request actually returns
echo "=== Find multipart-params-request usage ==="
rg -n "multipart-params-request" search-app/src/cmr/search/ -B2 -A5 --type=clojure

# Check if body or body-copy is used anywhere for shapefile/file processing
echo -e "\n=== Check how body-copy is consumed in the API ==="
rg -n "body" search-app/src/cmr/search/api/concepts_search.clj | grep -E "(body-copy|body)" | head -30

# Look at find-concepts function to see if body is used as shapefile parameter
echo -e "\n=== Check find-concepts implementation ==="
rg -B5 -A20 "defn- find-concepts" search-app/src/cmr/search/api/concepts_search.clj | head -50

Repository: nasa/Common-Metadata-Repository

Length of output: 4808


🏁 Script executed:

#!/bin/bash

# Check the shapefile parameter conversion logic
echo "=== Check how shapefile parameter is converted ==="
rg -B5 -A15 "defmethod p/parameter->condition :shapefile" search-app/src/cmr/search/services/parameters/converters/shapefile.clj

# Check what the shapefile-upload middleware actually extracts
echo -e "\n=== Check shapefile-upload middleware ==="
cat search-app/src/cmr/search/middleware/shapefile.clj | head -100

# Check if shapefile param goes through parse-multipart-body or is handled separately
echo -e "\n=== Check wrap-multipart-params usage ==="
rg -n "wrap-multipart-params" search-app/ --type=clojure -B3 -A3

Repository: nasa/Common-Metadata-Repository

Length of output: 4732


🏁 Script executed:

#!/bin/bash

# Check if shapefile param actually flows through find-concepts with body parameter
echo "=== Check how body-copy is used in find-concepts ==="
rg -B2 -A10 "defn- find-concepts\b" search-app/src/cmr/search/api/concepts_search.clj

# Check what params structure looks like and if shapefile is passed in params or via body
echo -e "\n=== Check core-api/process-params to understand param processing ==="
rg -n "defn process-params" search-app/src/cmr/search/api/ --type=clojure -A5

# Confirm: does shapefile come as a query param or multipart param?
echo -e "\n=== Check shapefile parameter mapping ==="
rg -n "shapefile.*:" search-app/src/cmr/search/services/parameters/conversion.clj | head -20

Repository: nasa/Common-Metadata-Repository

Length of output: 4129


🏁 Script executed:

#!/bin/bash

# Check if body or query-string (from body-copy) is actually used for shapefile parameter conversion
echo "=== Check how query-string is used in query service ==="
rg -n "query-string" search-app/src/cmr/search/services/query_service.clj -B2 -A2 | head -40

# Check if shapefile comes from params or from body
echo -e "\n=== Check shapefile parameter flow in parameter conversion ==="
rg -n "shapefile" search-app/src/cmr/search/services/parameters/conversion.clj -B3 -A3 | head -60

# Confirm: is :body-copy actually used for anything related to parameter parsing?
echo -e "\n=== Check where body-copy is used in query execution ==="
rg -n "body-copy\|query-string" search-app/src/cmr/search/services/query_service.clj -C2 | head -50

Repository: nasa/Common-Metadata-Repository

Length of output: 2649


🏁 Script executed:

#!/bin/bash

# Check where params are populated - from which source?
echo "=== Check params/wrap-params ==="
rg -n "wrap-params\|defn wrap-params" search-app/src/cmr/search/ --type=clojure -B2 -A10 | head -50

# Check if body-copy is actually consumed anywhere - search exhaustively
echo -e "\n=== Search for all body-copy usages ==="
rg -n "body-copy" search-app/src/cmr/search/ --type=clojure

# Check if params are sourced from body or from URL params
echo -e "\n=== Check if params are parsed from body-copy or URL ==="
rg -n "params.*wrap-params\|wrap-params.*params" search-app/src/cmr/search/ --type=clojure -B3 -A3 | head -40

Repository: nasa/Common-Metadata-Repository

Length of output: 912


🏁 Script executed:

#!/bin/bash

# Check the middleware execution order more carefully
echo "=== Middleware order in routes.clj ==="
sed -n '125,145p' search-app/src/cmr/search/routes.clj

# Verify: does shapefile-upload populate params, and when does copy-of-body-handler run?
echo -e "\n=== Check what shapefile-upload middleware does ==="
cat search-app/src/cmr/search/middleware/shapefile.clj

# Check if there are any other file upload params besides shapefile
echo -e "\n=== Check if there are other multipart/file params ==="
rg -n "multipart\|:tempfile\|file.*upload" search-app/src/cmr/search/services/parameters/ --type=clojure | grep -v shapefile | head -20

Repository: nasa/Common-Metadata-Repository

Length of output: 3304


🏁 Script executed:

#!/bin/bash

# Let me check what actually happens to multipart-params after wrap-multipart-params
echo "=== Check Ring's wrap-multipart-params documentation behavior ==="
rg -n "multipart-params\|:multipart-params" search-app/ --type=clojure | head -30

# Check if shapefile comes via params or via explicit multipart handling
echo -e "\n=== Trace shapefile parameter flow ==="
rg -n "shapefile" search-app/src/cmr/search/api/concepts_search.clj --type=clojure -B3 -A3

# Most importantly: verify if body-copy is ever used for anything that matters
echo -e "\n=== Check all functions that receive body parameter ==="
rg -n "defn.*body\]" search-app/src/cmr/search/api/concepts_search.clj --type=clojure -A5 | head -60

Repository: nasa/Common-Metadata-Repository

Length of output: 237


🏁 Script executed:

#!/bin/bash

# Let's trace how shapefile actually gets to the handler
echo "=== Check if shapefile is a query param or multipart field ==="
rg -n "shapefile" search-app/src/cmr/search/api/concepts_search.clj --type=clojure -C5

# Check what params/wrap-params does - does it consume multipart-params?
echo -e "\n=== Check wrap-params behavior ==="
rg -n "defn wrap-params\|require.*wrap-params" search-app/src/cmr/search/ --type=clojure -B2 -A2

# Check if shapefile comes from query string or body
echo -e "\n=== Check Ring wrap-params documentation usage ==="
rg -n "require.*params\|ring.middleware.params" search-app/src/cmr/search/routes.clj -B2 -A2

# Look at complete routes file to understand param flow
echo -e "\n=== Check complete search routes configuration ==="
head -80 search-app/src/cmr/search/routes.clj

Repository: nasa/Common-Metadata-Repository

Length of output: 4080


🏁 Script executed:

#!/bin/bash

# Verify how wrap-multipart-params populates :multipart-params and how params/wrap-params handles it
echo "=== Check if Ring's wrap-params merges multipart-params into params ==="
rg -n "wrap-params\|multipart" search-app/src/cmr/search/routes.clj

# Confirm: does the shapefile File object make it to the converter?
echo -e "\n=== Check shapefile converter input (shapefile-info) ==="
rg -B5 -A15 "esri-shapefile->condition-vec\|geojson->conditions-vec" search-app/src/cmr/search/services/parameters/converters/shapefile.clj | head -80

# Final confirmation: what is shapefile-info and where does it come from?
echo -e "\n=== Check parameter->condition flow for shapefile ==="
rg -B10 "parameter->condition :shapefile" search-app/src/cmr/search/services/parameters/converters/shapefile.clj

Repository: nasa/Common-Metadata-Repository

Length of output: 640


🏁 Script executed:

#!/bin/bash

# Let's trace exactly how shapefile gets from multipart to the converter
echo "=== Check if shapefile File object is in params after wrap-multipart-params ==="
# Find where parameter->condition is called with shapefile
rg -n "parameter->condition" search-app/src/cmr/search/services/ --type=clojure -B3 -A3 | head -40

# Check how Ring's wrap-multipart-params works - does it merge into :params?
echo -e "\n=== Verify Ring params middleware behavior ==="
rg -n "require.*ring.middleware.params\|require.*params" search-app/src/cmr/search/routes.clj

# Final check: confirm wrap-multipart-params puts files into :multipart-params
echo -e "\n=== Confirm multipart-params location after wrap-multipart-params ==="
cat search-app/src/cmr/search/middleware/shapefile.clj | grep -A5 "wrap-multipart-params"

Repository: nasa/Common-Metadata-Repository

Length of output: 5529


🏁 Script executed:

#!/bin/bash

# Check what happens to other multipart params - are there any that are not files?
echo "=== Check if there are other multipart param types besides files ==="
rg -n "multipart\|:tempfile\|File\|multipart-params" search-app/src/cmr/search/services/parameters/ --type=clojure | grep -v shapefile

# Check Ring's wrap-params behavior - does it merge multipart-params into params?
echo -e "\n=== Verify that wrap-params merges multipart-params into params ==="
# Ring's wrap-params is standard, but let's confirm by checking how the params are used
rg -n ":params\|multipart-params" search-app/src/cmr/search/api/concepts_search.clj | head -20

# Check if body-copy string representation breaks anything
echo -e "\n=== Check if query-string (from body-copy) is used in parameter parsing ==="
rg -n "query-string" search-app/src/cmr/search/services/parameters/ --type=clojure -B2 -A2

Repository: nasa/Common-Metadata-Repository

Length of output: 289


🏁 Script executed:

#!/bin/bash

# Let's check where params destructuring happens in the handler
echo "=== Check handler params destructuring ==="
rg -n "{params :params" search-app/src/cmr/search/api/concepts_search.clj -B2 -A10 | head -50

# Check what Ring's wrap-params does - this should merge URL params and multipart-params
echo -e "\n=== Check which params middleware is used ==="
grep -n "wrap-params\|wrap-keyword-params\|wrap-nested-params" search-app/src/cmr/search/routes.clj

# Let me check Ring's standard behavior by looking at CMR's usage of wrap-params
echo -e "\n=== Understand params flow: which order and which merge ==="
sed -n '130,145p' search-app/src/cmr/search/routes.clj

Repository: nasa/Common-Metadata-Repository

Length of output: 1145


The conversion from multipart form data to a query string format is fragile and creates unnecessary data loss, though it doesn't currently break file uploads.

The original concern is partially valid. The (str v) conversion in parse-multipart-body does create a lossy string representation (e.g., File objects become "java.io.File@abc123" references). However, the code functions for shapefiles because:

  1. Ring's wrap-multipart-params middleware preserves actual File objects in :multipart-params
  2. Ring's subsequent wrap-params middleware merges :multipart-params into :params
  3. The shapefile converter accesses files via (:shapefile params), retrieving the actual File object, not the string from :body-copy

The real issue is that parse-multipart-body creates a string representation that discards complex data types, and :body-copy receives this degraded information. While this doesn't affect file uploads (which work independently via :params), the design is fragile and could fail if:

  • Other multipart parameters containing complex data are added
  • The middleware ordering changes
  • Code attempts to use :body-copy for parameter extraction instead of :params

The string representation in :body-copy is only appropriate for JSON/AQL query bodies, not for multipart parameter preservation.

🤖 Prompt for AI Agents
search-app/src/cmr/search/routes.clj around lines 59-62: parse-multipart-body
currently joins all multipart values with str which loses complex types (Files,
streams, maps) and yields a fragile :body-copy; change it to detect multipart
requests and preserve non-scalar values instead of stringifying them —
specifically, when :multipart-params exists, build the body-copy only from
scalar/text params (or omit multipart params entirely) and skip/ignore File and
InputStream instances (or serialize only their metadata if needed);
alternatively, if you must provide a copy for logging, convert only
primitive/string values to a safe representation (or JSON encode simple params)
and leave complex objects out so files remain accessible via :params and
:body-copy does not drop type information.

@zimzoom zimzoom merged commit 23f93e6 into master Jan 7, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants