Skip to content

Conversation

@chris-lightbug
Copy link
Contributor

No description provided.

@github-advanced-security
Copy link

This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.

// Add text before the link
if (match.index > lastIndex) {
const beforeText = html.substring(lastIndex, match.index);
const cleanBefore = beforeText.replace(/<[^>]+>/g, '');

Check failure

Code scanning / CodeQL

Incomplete multi-character sanitization High

This string may still contain
<script
, which may cause an HTML element injection vulnerability.

Copilot Autofix

AI 3 months ago

To fix this problem, we need to ensure that all HTML tags (including those that may be split or repeated) are fully removed from the string. The optimal solution is to repeatedly apply the replacement until no further matches are found. This approach is robust against edge cases and multi-character reintroduction.

Concretely, replace all uses of .replace(/<[^>]+>/g, '') (on lines 85, 102, and 110) with a helper function that strips HTML tags by running this pattern until the string does not change anymore. The helper function can be defined inside the <script setup> block and should be used to sanitize the beforeText, remainingText, and in line 110 when no links are found.

No extra dependencies are needed for robust tag removal if looping is sufficient for the application (PDF text).

File/region to change:

  • components/DownloadPdfButton.vue, specifically the body of parseHtmlForPdfLinks function (lines 85, 102, and 110).

Implementation details:

  • Add a stripHtmlTags helper function before parseHtmlForPdfLinks.
  • Replace direct .replace(/<[^>]+>/g, '') calls with stripHtmlTags(text).
Suggested changeset 1
components/DownloadPdfButton.vue

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/components/DownloadPdfButton.vue b/components/DownloadPdfButton.vue
--- a/components/DownloadPdfButton.vue
+++ b/components/DownloadPdfButton.vue
@@ -70,6 +70,18 @@
  * @param {string} html - HTML string that may contain anchor tags
  * @returns {Array} Array of text segments with link information
  */
+// Repeatedly remove all HTML tags until none remain
+function stripHtmlTags(input) {
+  if (!input || typeof input !== 'string') return input;
+  let output = input;
+  let prev;
+  do {
+    prev = output;
+    output = output.replace(/<[^>]+>/g, '');
+  } while (output !== prev);
+  return output;
+}
+
 function parseHtmlForPdfLinks(html) {
   if (!html || typeof html !== 'string') return [{ text: html || '', isLink: false }];
 
@@ -82,7 +94,7 @@
     // Add text before the link
     if (match.index > lastIndex) {
       const beforeText = html.substring(lastIndex, match.index);
-      const cleanBefore = beforeText.replace(/<[^>]+>/g, '');
+      const cleanBefore = stripHtmlTags(beforeText);
       if (cleanBefore) {
         segments.push({ text: cleanBefore, isLink: false });
       }
@@ -99,7 +111,7 @@
   // Add remaining text after last link
   if (lastIndex < html.length) {
     const remainingText = html.substring(lastIndex);
-    const cleanRemaining = remainingText.replace(/<[^>]+>/g, '');
+    const cleanRemaining = stripHtmlTags(remainingText);
     if (cleanRemaining) {
       segments.push({ text: cleanRemaining, isLink: false });
     }
@@ -107,7 +119,7 @@
 
   // If no links found, return the cleaned HTML
   if (segments.length === 0) {
-    const cleanText = html.replace(/<[^>]+>/g, '');
+    const cleanText = stripHtmlTags(html);
     return [{ text: cleanText, isLink: false }];
   }
 
EOF
@@ -70,6 +70,18 @@
* @param {string} html - HTML string that may contain anchor tags
* @returns {Array} Array of text segments with link information
*/
// Repeatedly remove all HTML tags until none remain
function stripHtmlTags(input) {
if (!input || typeof input !== 'string') return input;
let output = input;
let prev;
do {
prev = output;
output = output.replace(/<[^>]+>/g, '');
} while (output !== prev);
return output;
}

function parseHtmlForPdfLinks(html) {
if (!html || typeof html !== 'string') return [{ text: html || '', isLink: false }];

@@ -82,7 +94,7 @@
// Add text before the link
if (match.index > lastIndex) {
const beforeText = html.substring(lastIndex, match.index);
const cleanBefore = beforeText.replace(/<[^>]+>/g, '');
const cleanBefore = stripHtmlTags(beforeText);
if (cleanBefore) {
segments.push({ text: cleanBefore, isLink: false });
}
@@ -99,7 +111,7 @@
// Add remaining text after last link
if (lastIndex < html.length) {
const remainingText = html.substring(lastIndex);
const cleanRemaining = remainingText.replace(/<[^>]+>/g, '');
const cleanRemaining = stripHtmlTags(remainingText);
if (cleanRemaining) {
segments.push({ text: cleanRemaining, isLink: false });
}
@@ -107,7 +119,7 @@

// If no links found, return the cleaned HTML
if (segments.length === 0) {
const cleanText = html.replace(/<[^>]+>/g, '');
const cleanText = stripHtmlTags(html);
return [{ text: cleanText, isLink: false }];
}

Copilot is powered by AI and may make mistakes. Always verify output.
Unable to commit as this autofix suggestion is now outdated
// Add remaining text after last link
if (lastIndex < html.length) {
const remainingText = html.substring(lastIndex);
const cleanRemaining = remainingText.replace(/<[^>]+>/g, '');

Check failure

Code scanning / CodeQL

Incomplete multi-character sanitization High

This string may still contain
<script
, which may cause an HTML element injection vulnerability.

Copilot Autofix

AI 3 months ago

The best way to fix the problem is to avoid simple one-pass regexes for removing HTML tags, as explained. You have two main choices:

  1. Use a well-tested sanitizer library. In Vue.js/JavaScript, a highly recommended choice is the sanitize-html npm package, which reliably and recursively removes all dangerous tags and attributes, handling even fragmented or nested input.
  2. If use of a library is not an option here, repeatedly apply the replace() operation until no more changes occur, as shown in the background examples.

Given only the shown code is editable, and library installation may not always be possible, a robust and generic solution is to repeatedly apply the regex replacement in parseHtmlForPdfLinks, for each location where we strip tags. Concretely, this means:

  • On lines 85, 102, and 111, rather than doing a single .replace(/<[^>]+>/g, ''), use a helper function which loops until no more changes.
  • Implement a helper function (locally or above parseHtmlForPdfLinks) called stripHtmlTags(input), which repeatedly applies the regular expression replacement as demonstrated in the Background.

If using a library is allowed (project uses npm), you could install sanitize-html and use that instead everywhere tags are stripped for much better results.


Suggested changeset 1
components/DownloadPdfButton.vue

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/components/DownloadPdfButton.vue b/components/DownloadPdfButton.vue
--- a/components/DownloadPdfButton.vue
+++ b/components/DownloadPdfButton.vue
@@ -70,6 +70,17 @@
  * @param {string} html - HTML string that may contain anchor tags
  * @returns {Array} Array of text segments with link information
  */
+// Helper function for robust tag stripping
+function stripHtmlTags(input) {
+  if (!input || typeof input !== "string") return input;
+  let previous;
+  do {
+    previous = input;
+    input = input.replace(/<[^>]+>/g, '');
+  } while (input !== previous);
+  return input;
+}
+
 function parseHtmlForPdfLinks(html) {
   if (!html || typeof html !== 'string') return [{ text: html || '', isLink: false }];
 
@@ -82,7 +93,7 @@
     // Add text before the link
     if (match.index > lastIndex) {
       const beforeText = html.substring(lastIndex, match.index);
-      const cleanBefore = beforeText.replace(/<[^>]+>/g, '');
+      const cleanBefore = stripHtmlTags(beforeText);
       if (cleanBefore) {
         segments.push({ text: cleanBefore, isLink: false });
       }
@@ -99,7 +110,7 @@
   // Add remaining text after last link
   if (lastIndex < html.length) {
     const remainingText = html.substring(lastIndex);
-    const cleanRemaining = remainingText.replace(/<[^>]+>/g, '');
+    const cleanRemaining = stripHtmlTags(remainingText);
     if (cleanRemaining) {
       segments.push({ text: cleanRemaining, isLink: false });
     }
@@ -107,7 +118,7 @@
 
   // If no links found, return the cleaned HTML
   if (segments.length === 0) {
-    const cleanText = html.replace(/<[^>]+>/g, '');
+    const cleanText = stripHtmlTags(html);
     return [{ text: cleanText, isLink: false }];
   }
 
EOF
@@ -70,6 +70,17 @@
* @param {string} html - HTML string that may contain anchor tags
* @returns {Array} Array of text segments with link information
*/
// Helper function for robust tag stripping
function stripHtmlTags(input) {
if (!input || typeof input !== "string") return input;
let previous;
do {
previous = input;
input = input.replace(/<[^>]+>/g, '');
} while (input !== previous);
return input;
}

function parseHtmlForPdfLinks(html) {
if (!html || typeof html !== 'string') return [{ text: html || '', isLink: false }];

@@ -82,7 +93,7 @@
// Add text before the link
if (match.index > lastIndex) {
const beforeText = html.substring(lastIndex, match.index);
const cleanBefore = beforeText.replace(/<[^>]+>/g, '');
const cleanBefore = stripHtmlTags(beforeText);
if (cleanBefore) {
segments.push({ text: cleanBefore, isLink: false });
}
@@ -99,7 +110,7 @@
// Add remaining text after last link
if (lastIndex < html.length) {
const remainingText = html.substring(lastIndex);
const cleanRemaining = remainingText.replace(/<[^>]+>/g, '');
const cleanRemaining = stripHtmlTags(remainingText);
if (cleanRemaining) {
segments.push({ text: cleanRemaining, isLink: false });
}
@@ -107,7 +118,7 @@

// If no links found, return the cleaned HTML
if (segments.length === 0) {
const cleanText = html.replace(/<[^>]+>/g, '');
const cleanText = stripHtmlTags(html);
return [{ text: cleanText, isLink: false }];
}

Copilot is powered by AI and may make mistakes. Always verify output.
Unable to commit as this autofix suggestion is now outdated
// If no links found, return the cleaned HTML
if (segments.length === 0) {
const cleanText = html.replace(/<[^>]+>/g, '');

Check failure

Code scanning / CodeQL

Incomplete multi-character sanitization High

This string may still contain
<script
, which may cause an HTML element injection vulnerability.

Copilot Autofix

AI 3 months ago

The best way to fix this issue is to use a well-tested HTML sanitization library (like sanitize-html) to safely remove all HTML tags and ensure no unsafe content remains. However, if dependencies are to be minimized, another robust approach is to repeatedly apply the regex replacement until no matches remain (as shown in the background), since this will ensure that multi-character overlapping matches are removed. This applies specifically to the region where html.replace(/<[^>]+>/g, '') is used (line 110 and elsewhere for tag removal).

To implement the change:

  • Replace direct single-pass usages of .replace(/<[^>]+>/g, '') with a loop that repeatedly strips all HTML tags until none remain.
  • Optionally, encapsulate this logic in a helper function (for maintainability).
  • Insert the helper function above its usage in the file.
  • Update usages to call the new helper method.

No imports are needed for this approach. Changes are all in file components/DownloadPdfButton.vue, in the region with the PDF HTML parsing/cleaning logic.


Suggested changeset 1
components/DownloadPdfButton.vue

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/components/DownloadPdfButton.vue b/components/DownloadPdfButton.vue
--- a/components/DownloadPdfButton.vue
+++ b/components/DownloadPdfButton.vue
@@ -70,6 +70,16 @@
  * @param {string} html - HTML string that may contain anchor tags
  * @returns {Array} Array of text segments with link information
  */
+// Helper function to robustly remove all HTML tags (multi-pass)
+function stripHtmlTags(input) {
+  let previous;
+  do {
+    previous = input;
+    input = input.replace(/<[^>]+>/g, '');
+  } while (input !== previous);
+  return input;
+}
+
 function parseHtmlForPdfLinks(html) {
   if (!html || typeof html !== 'string') return [{ text: html || '', isLink: false }];
 
@@ -82,7 +92,7 @@
     // Add text before the link
     if (match.index > lastIndex) {
       const beforeText = html.substring(lastIndex, match.index);
-      const cleanBefore = beforeText.replace(/<[^>]+>/g, '');
+      const cleanBefore = stripHtmlTags(beforeText);
       if (cleanBefore) {
         segments.push({ text: cleanBefore, isLink: false });
       }
@@ -99,7 +109,7 @@
   // Add remaining text after last link
   if (lastIndex < html.length) {
     const remainingText = html.substring(lastIndex);
-    const cleanRemaining = remainingText.replace(/<[^>]+>/g, '');
+    const cleanRemaining = stripHtmlTags(remainingText);
     if (cleanRemaining) {
       segments.push({ text: cleanRemaining, isLink: false });
     }
@@ -107,7 +117,7 @@
 
   // If no links found, return the cleaned HTML
   if (segments.length === 0) {
-    const cleanText = html.replace(/<[^>]+>/g, '');
+    const cleanText = stripHtmlTags(html);
     return [{ text: cleanText, isLink: false }];
   }
 
EOF
@@ -70,6 +70,16 @@
* @param {string} html - HTML string that may contain anchor tags
* @returns {Array} Array of text segments with link information
*/
// Helper function to robustly remove all HTML tags (multi-pass)
function stripHtmlTags(input) {
let previous;
do {
previous = input;
input = input.replace(/<[^>]+>/g, '');
} while (input !== previous);
return input;
}

function parseHtmlForPdfLinks(html) {
if (!html || typeof html !== 'string') return [{ text: html || '', isLink: false }];

@@ -82,7 +92,7 @@
// Add text before the link
if (match.index > lastIndex) {
const beforeText = html.substring(lastIndex, match.index);
const cleanBefore = beforeText.replace(/<[^>]+>/g, '');
const cleanBefore = stripHtmlTags(beforeText);
if (cleanBefore) {
segments.push({ text: cleanBefore, isLink: false });
}
@@ -99,7 +109,7 @@
// Add remaining text after last link
if (lastIndex < html.length) {
const remainingText = html.substring(lastIndex);
const cleanRemaining = remainingText.replace(/<[^>]+>/g, '');
const cleanRemaining = stripHtmlTags(remainingText);
if (cleanRemaining) {
segments.push({ text: cleanRemaining, isLink: false });
}
@@ -107,7 +117,7 @@

// If no links found, return the cleaned HTML
if (segments.length === 0) {
const cleanText = html.replace(/<[^>]+>/g, '');
const cleanText = stripHtmlTags(html);
return [{ text: cleanText, isLink: false }];
}

Copilot is powered by AI and may make mistakes. Always verify output.
Unable to commit as this autofix suggestion is now outdated
});
// Strip any remaining HTML tags
out = out.replace(/<[^>]+>/g, '');

Check failure

Code scanning / CodeQL

Incomplete multi-character sanitization High

This string may still contain
<script
, which may cause an HTML element injection vulnerability.

Copilot Autofix

AI 3 months ago

To correctly remove all HTML tags, including potentially dangerous or nested occurrences, we should either (a) use a well-known and robust library such as sanitize-html to sanitize the HTML string, or (b) if that is not available, repeatedly apply the regular expression replacement until no further replacements occur. The second approach can be implemented by running the replacement in a loop until the output no longer changes, which ensures that nested or consecutive tags are also removed. As we are only permitted to modify existing code in components/DownloadPdfButton.vue, we will update the implementation of htmlAnchorsToPlainText to use this approach, with no external dependencies. No code outside this function will change.

  • Edit the function htmlAnchorsToPlainText.
  • Replace the line out = out.replace(/<[^>]+>/g, ''); with a loop that continues replacing tags until none remain (out remains unchanged by the replacement).
  • No imports or other project files need to be changed.

Suggested changeset 1
components/DownloadPdfButton.vue

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/components/DownloadPdfButton.vue b/components/DownloadPdfButton.vue
--- a/components/DownloadPdfButton.vue
+++ b/components/DownloadPdfButton.vue
@@ -188,7 +188,12 @@
   });
 
   // Strip any remaining HTML tags
-  out = out.replace(/<[^>]+>/g, '');
+  // Repeatedly remove tags to handle nested/multiple cases
+  let previous;
+  do {
+    previous = out;
+    out = out.replace(/<[^>]+>/g, '');
+  } while (out !== previous);
 
   return out;
 }
EOF
@@ -188,7 +188,12 @@
});

// Strip any remaining HTML tags
out = out.replace(/<[^>]+>/g, '');
// Repeatedly remove tags to handle nested/multiple cases
let previous;
do {
previous = out;
out = out.replace(/<[^>]+>/g, '');
} while (out !== previous);

return out;
}
Copilot is powered by AI and may make mistakes. Always verify output.
Unable to commit as this autofix suggestion is now outdated
@chris-lightbug chris-lightbug merged commit e3a43d6 into production Oct 1, 2025
5 of 6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants