Updates checksum calculation from MD5 to SHA256 by happylittle7 · Pull Request #160 · Spaghetti-OJ/back_end

happylittle7 · 2025-12-27T13:54:20Z

Overview

Replaces all usage of MD5 checksum for problem testcase archives with SHA256 to improve security and integrity verification.
Updates API endpoints and documentation to reflect the new checksum algorithm.
Introduces utility functions for retrieving the SHA256 hash and file path of testcase archives.
Integrates the SHA256 hash into the payload sent to the sandbox during code submission, ensuring accurate validation.

Motivation

Switching to SHA256 enhances cryptographic strength when verifying testcase files, addressing security best practices and preventing potential hash collisions.

Details

Documentation and API responses now reference SHA256 instead of MD5.
The backend now computes and exposes SHA256 hashes for testcases.
Submissions sent to the sandbox include the correct SHA256 hash, ensuring consistency between backend and sandbox verification.
Adds error handling for missing testcase archives during submission.

No backward compatibility issues are expected, but clients must expect SHA256 hashes in responses.

Copilot

Pull request overview

This PR updates the checksum algorithm from MD5 to SHA256 for problem testcase archive verification, improving security and cryptographic strength. The changes span backend services, API endpoints, sandbox client integration, and documentation.

Key Changes:

Introduced utility functions get_problem_testcase_hash() and get_problem_testcase_path() to retrieve SHA256 hashes and file paths for testcase archives
Updated sandbox submission payload to include SHA256 hash instead of placeholder TODO value
Modified API endpoint documentation and implementation to return SHA256 checksums instead of MD5

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File	Description
`submissions/sandbox_client.py`	Integrated `get_problem_testcase_hash()` to fetch and validate SHA256 hashes before submission; added error handling for missing testcase packages; refactored settings access and payload assembly
`problems/views/api.py`	Updated `ProblemTestCaseChecksumView` to compute and return SHA256 checksums instead of MD5 for sandbox verification
`problems/services/storage.py`	Added two new utility functions for retrieving SHA256 hash and file path of testcase archives; leverages existing `_sha256_of_fileobj()` helper
`docs/problems.MD`	Updated documentation to reflect SHA256 checksum usage in API responses and endpoint descriptions

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2025-12-27T13:58:33Z

-        file_content = submission.source_code.encode('utf-8')
+        # 6. 準備程式碼檔案
+        source_code = submission.source_code or ''
+        file_extension = get_file_extension(submission.language_type)


The get_file_extension function is being called with submission.language_type, which is a numeric code (0=C, 1=C++, 2=Python, etc.). However, get_file_extension expects a language string ('c', 'cpp', 'python', etc.) as its parameter. The converted language string is already available in the language variable from line 85. This should be changed to use language instead of submission.language_type.

Suggested change

file_extension = get_file_extension(submission.language_type)

file_extension = get_file_extension(language)

Copilot · 2025-12-27T13:58:33Z

+        file_extension = get_file_extension(submission.language_type)
        files = {
-            'file': (filename, BytesIO(file_content), 'text/plain')
+            'file': (f'solution{file_extension}', BytesIO(source_code.encode('utf-8')), 'text/plain')


The filename is missing a dot before the extension. The current code generates filenames like 'solutionpy' or 'solutioncpp' instead of 'solution.py' or 'solution.cpp'. This should be changed to f'solution.{file_extension}'.

Suggested change

'file': (f'solution{file_extension}', BytesIO(source_code.encode('utf-8')), 'text/plain')

'file': (f'solution.{file_extension}', BytesIO(source_code.encode('utf-8')), 'text/plain')

Copilot · 2025-12-27T13:58:33Z

    except Problems.DoesNotExist:
-        logger.error(f'Problem not found: problem_id={submission.problem_id}')
-        raise ValueError(f'Problem {submission.problem_id} not found')
-
-    except requests.RequestException as e:
-        logger.error(f'Sandbox API error: {str(e)}')
-        raise
-
-    except Exception as e:
-        logger.error(f'Unexpected error submitting to sandbox: {str(e)}')
-        raise
+        raise ValueError(f"Problem {submission.problem_id} does not exist")


The error handling has been reduced compared to the previous implementation. The old code had specific handling for requests.RequestException and a general Exception catch with appropriate logging. Without these handlers, network errors and other exceptions will propagate without being logged, making debugging difficult. Consider re-adding error handling for requests.RequestException and ValueError (for missing testcase packages) with appropriate logging.

Copilot · 2025-12-27T13:58:33Z

        response = requests.post(
-            url,
+            f'{settings.SANDBOX_API_URL}/api/v1/submissions',
            data=data,
            files=files,
            headers=headers,
-            timeout=SANDBOX_TIMEOUT
+            timeout=getattr(settings, 'SANDBOX_TIMEOUT', 30)
        )

-        # 7. 檢查回應
        response.raise_for_status()
-        result = response.json()
-
-        logger.info(f'Sandbox response: {result}')
-        return result
+        return response.json()


The informative logging statements that were present in the old code have been removed. The old implementation logged the submission details before sending the request and the sandbox response upon success. These logs are valuable for debugging and monitoring. Consider re-adding at least a log statement before sending the request (e.g., logger.info(f'Submitting to Sandbox: submission_id={submission.id}, problem_id={submission.problem_id}')) and after receiving a successful response.

Copilot · 2025-12-27T13:58:34Z

@@ -147,12 +147,12 @@ curl "http://127.0.0.1:8000/problem/30/test-case" -H "Authorization: Bearer $TOK



This line still references "MD5" when describing the checksum verification process. Since the PR updates checksum calculation from MD5 to SHA256 throughout the system, this should be updated to say "SHA256" instead of "MD5" for consistency with the rest of the changes.

fix: update checksum method from MD5 to SHA256 and add related functions

84d443d

happylittle7 requested a review from Copilot December 27, 2025 13:54

Copilot started reviewing on behalf of happylittle7 December 27, 2025 13:54 View session

Copilot AI reviewed Dec 27, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Updates checksum calculation from MD5 to SHA256#160

Updates checksum calculation from MD5 to SHA256#160
happylittle7 wants to merge 1 commit intodevfrom
fix/checksum

happylittle7 commented Dec 27, 2025

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Dec 27, 2025

Uh oh!

Copilot AI Dec 27, 2025

Uh oh!

Copilot AI Dec 27, 2025

Uh oh!

Copilot AI Dec 27, 2025

Uh oh!

Copilot AI Dec 27, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	file_extension = get_file_extension(submission.language_type)
	file_extension = get_file_extension(language)

	'file': (f'solution{file_extension}', BytesIO(source_code.encode('utf-8')), 'text/plain')
	'file': (f'solution.{file_extension}', BytesIO(source_code.encode('utf-8')), 'text/plain')

		@@ -147,12 +147,12 @@ curl "http://127.0.0.1:8000/problem/30/test-case" -H "Authorization: Bearer $TOK

Conversation

happylittle7 commented Dec 27, 2025

Overview

Motivation

Details

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Dec 27, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 27, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 27, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 27, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 27, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants