fix: don't escape markdown fields #323

Merged
Bogay merged 1 commit into main from prevent-double-escape-for-markdown-field on Mar 11, 2026
Bogay commented Mar 10, 2026

We already apply DOMPurify in the frontend. Remove the escape logic in the backend to avoid double-escaping input.

Bogay requested a review from as535364 as a code owner March 10, 2026 16:54
Bogay requested review from Copilot and removed request for as535364 March 10, 2026 16:54
Bogay force-pushed the prevent-double-escape-for-markdown-field branch from 3d2eec8 to a8e071e March 10, 2026 16:56

Copilot AI left a comment

Pull request overview

Removes backend-side HTML escaping of markdown/problem description fields to avoid double-escaping now that sanitization is handled in the frontend.

Changes:

  • Removed MongoEngine pre_save escaping hooks for markdown fields (Homework/Message/Announcement/PostThread).
  • Removed ProblemDescription.escape() and its corresponding pre_save hook.
  • Deleted the unit test that asserted backend escaping behavior.
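
A minimal sketch of why an escaping hook compounds, assuming the hook runs on every save (MongoEngine's pre_save signal fires before each save, not only the first). `simulate_saves` is a hypothetical helper for illustration and is not code from this repository:

```python
import html


def simulate_saves(text: str, saves: int) -> str:
    """Apply html.escape once per simulated save, as a pre_save hook would."""
    for _ in range(saves):
        text = html.escape(text)
    return text


raw = "a < b & c"
once = simulate_saves(raw, 1)   # 'a &lt; b &amp; c'
twice = simulate_saves(raw, 2)  # 'a &amp;lt; b &amp;amp; c'
```

Because `html.escape` is not idempotent, the `&` introduced by the first pass is escaped again by the second, so a document edited several times drifts further from what the user typed.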

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File | Description
mongo/engine.py | Removes backend escaping hooks for markdown/problem description content before persistence.
tests/test_mongo_engine.py | Drops the now-obsolete test that expected ProblemDescription.escape() to HTML-escape content.
Comments suppressed due to low confidence (2)

mongo/engine.py:181

  • Removing server-side escaping means existing persisted documents (saved before this change) may still contain HTML-escaped markdown, while newly saved docs will be raw. Consider adding a one-time migration (or a background job) to unescape existing Homework.markdown values (and the other affected collections) so rendering is consistent across old/new records.
    homework_name = StringField(
        max_length=64,
        required=True,
        db_field='homeworkName',
        unique_with='course_id',
    )
    markdown = StringField(max_length=10000, default='')
    scoreboard_status = IntField(
        default=0,
        choices=[0, 1],

mongo/engine.py:174

  • Now that Homework.markdown is stored unescaped, any consumer that renders it to HTML must ensure sanitization happens at render time (client-side or server-side) to prevent stored XSS via embedded HTML/script inside markdown.
    homework_name = StringField(
        max_length=64,
        required=True,

Comment on lines 259 to 262

    def escape(self):
        self.description, self.input, self.output, self.hint = (html.escape(
            v or '') for v in (
                self.description,
                self.input,
                self.output,
                self.hint,
            ))
        _io = zip(self.sample_input, self.sample_output)
        for i, (ip, op) in enumerate(_io):
            self.sample_input[i] = ip or html.escape(ip)
            self.sample_output[i] = op or html.escape(op)


@handler(signals.pre_save)
def problem_desc_escape(sender, document):
    document.description.escape()


@problem_desc_escape.apply
class Problem(Document):

Copilot AI Mar 10, 2026

Since the ProblemDescription escaping hook was removed, existing Problem.description content in the DB may still be HTML-escaped while newly saved problems will keep raw strings. Consider a migration to unescape legacy description/input/output/hint fields to avoid inconsistent rendering across existing vs new problems.
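
One possible shape for the suggested migration, as a sketch only. It assumes legacy descriptions can be handled as plain dicts; `unescape_description` and the dict layout are hypothetical, and the field names are taken from the removed `escape()` method quoted above:

```python
import html

# Text fields that the removed ProblemDescription.escape() passed through
# html.escape(). In the quoted hunk, `ip or html.escape(ip)` returns any
# non-empty sample unchanged, so sample_input/sample_output are left alone.
TEXT_FIELDS = ("description", "input", "output", "hint")


def unescape_description(desc: dict) -> dict:
    """Return a copy of desc with HTML entities decoded in the text fields."""
    fixed = dict(desc)
    for name in TEXT_FIELDS:
        value = fixed.get(name)
        if value:  # skip None and empty strings
            fixed[name] = html.unescape(value)
    return fixed
```

A single pass undoes one round of escaping. Records that were saved several times may need repeated passes, and text that intentionally contains entities such as `&lt;` would also be decoded, so a real migration would likely want a dry run and a backup first.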

Comment on lines 26 to 28
    def test_in(self):
        d = Duration()
        assert datetime.datetime.now() in d
Copilot AI Mar 10, 2026

This PR removes the only test that asserted how ProblemDescription content is transformed, but it doesn’t add a replacement assertion for the new intended behavior (that markdown/problem description fields are not HTML-escaped). Please add a regression test that saves and reloads one of the affected documents (e.g., Announcement/Homework/Problem) containing <h1> or <script> in the text and verifies the stored value is unchanged.
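
A sketch of the assertion such a regression test could make. To stay runnable without a database it uses an in-memory stand-in; `assert_stored_unchanged`, `fake_save`, and `fake_load` are hypothetical names, and a real test would save and reload an Announcement, Homework, or Problem through the project's own fixtures instead:

```python
from typing import Callable

# Payloads that html.escape() would have altered under the old hooks.
PAYLOADS = ["<h1>Title</h1>", "<script>alert(1)</script>", "a < b & c"]


def assert_stored_unchanged(
    save: Callable[[str], str],
    load: Callable[[str], str],
) -> None:
    """Save each payload, reload it, and require an exact match."""
    for text in PAYLOADS:
        doc_id = save(text)
        assert load(doc_id) == text, f"stored value was altered: {text!r}"


# In-memory stand-in for the persistence layer (an assumption for this sketch).
_store: dict[str, str] = {}


def fake_save(text: str) -> str:
    doc_id = str(len(_store))
    _store[doc_id] = text
    return doc_id


def fake_load(doc_id: str) -> str:
    return _store[doc_id]


assert_stored_unchanged(fake_save, fake_load)
```

Keeping the check parameterized over `save` and `load` lets the same payload list be reused for each affected document type.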

mongo/engine.py Outdated
Comment on lines 42 to 47
    return decorator


@handler(signals.pre_save)
def escape_markdown(sender, document):
    document.markdown = html.escape(document.markdown)


class ZipField(FileField):

    def __init__(self, max_size=0, **ks):
Copilot AI Mar 10, 2026

With the removal of the pre_save escaping hooks, this module no longer appears to use MongoEngine signals at all. Consider removing the now-dead handler() helper plus the unused signals (and html) imports to avoid confusion and potential lint failures from unused code.

We already apply DOMPurify in frontend. Remove escape logic in backend
to avoid double escape input.
Bogay force-pushed the prevent-double-escape-for-markdown-field branch from a8e071e to 653ca7f March 10, 2026 17:04
Bogay added this pull request to the merge queue Mar 11, 2026
Merged via the queue into main with commit d7f30e3 Mar 11, 2026
1 check passed
Bogay deleted the prevent-double-escape-for-markdown-field branch March 11, 2026 07:47