Conversation
3d2eec8 to
a8e071e
Compare
There was a problem hiding this comment.
Pull request overview
Removes backend-side HTML escaping of markdown/problem description fields to avoid double-escaping now that sanitization is handled in the frontend.
Changes:
- Removed MongoEngine
pre_saveescaping hooks for markdown fields (Homework/Message/Announcement/PostThread). - Removed
ProblemDescription.escape()and its correspondingpre_savehook. - Deleted the unit test that asserted backend escaping behavior.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
mongo/engine.py |
Removes backend escaping hooks for markdown/problem description content before persistence. |
tests/test_mongo_engine.py |
Drops the now-obsolete test that expected ProblemDescription.escape() to HTML-escape content. |
Comments suppressed due to low confidence (2)
mongo/engine.py:181
- Removing server-side escaping means existing persisted documents (saved before this change) may still contain HTML-escaped markdown, while newly saved docs will be raw. Consider adding a one-time migration (or a background job) to unescape existing
Homework.markdownvalues (and the other affected collections) so rendering is consistent across old/new records.
homework_name = StringField(
max_length=64,
required=True,
db_field='homeworkName',
unique_with='course_id',
)
markdown = StringField(max_length=10000, default='')
scoreboard_status = IntField(
default=0,
choices=[0, 1],
mongo/engine.py:174
- Now that
Homework.markdownis stored unescaped, any consumer that renders it to HTML must ensure sanitization happens at render time (client-side or server-side) to prevent stored XSS via embedded HTML/script inside markdown.
homework_name = StringField(
max_length=64,
required=True,
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| def escape(self): | ||
| self.description, self.input, self.output, self.hint = (html.escape( | ||
| v or '') for v in ( | ||
| self.description, | ||
| self.input, | ||
| self.output, | ||
| self.hint, | ||
| )) | ||
| _io = zip(self.sample_input, self.sample_output) | ||
| for i, (ip, op) in enumerate(_io): | ||
| self.sample_input[i] = ip or html.escape(ip) | ||
| self.sample_output[i] = op or html.escape(op) | ||
|
|
||
|
|
||
| @handler(signals.pre_save) | ||
| def problem_desc_escape(sender, document): | ||
| document.description.escape() | ||
|
|
||
|
|
||
| @problem_desc_escape.apply | ||
| class Problem(Document): | ||
|
|
There was a problem hiding this comment.
Since the ProblemDescription escaping hook was removed, existing Problem.description content in the DB may still be HTML-escaped while newly saved problems will keep raw strings. Consider a migration to unescape legacy description/input/output/hint fields to avoid inconsistent rendering across existing vs new problems.
| def test_in(self): | ||
| d = Duration() | ||
| assert datetime.datetime.now() in d |
There was a problem hiding this comment.
This PR removes the only test that asserted how ProblemDescription content is transformed, but it doesn’t add a replacement assertion for the new intended behavior (that markdown/problem description fields are not HTML-escaped). Please add a regression test that saves and reloads one of the affected documents (e.g., Announcement/Homework/Problem) containing <h1> or <script> in the text and verifies the stored value is unchanged.
mongo/engine.py
Outdated
| return decorator | ||
|
|
||
|
|
||
| @handler(signals.pre_save) | ||
| def escape_markdown(sender, document): | ||
| document.markdown = html.escape(document.markdown) | ||
|
|
||
|
|
||
| class ZipField(FileField): | ||
|
|
||
| def __init__(self, max_size=0, **ks): |
There was a problem hiding this comment.
With the removal of the pre_save escaping hooks, this module no longer appears to use MongoEngine signals at all. Consider removing the now-dead handler() helper plus the unused signals (and html) imports to avoid confusion and potential lint failures from unused code.
We already apply DOMPurify in frontend. Remove escape logic in backend to avoid double escape input.
a8e071e to
653ca7f
Compare
We already apply DOMPurify in frontend. Remove escape logic in backend to avoid double escape input.