
Adds LLM-based test case generation API endpoints#195

Open
happylittle7 wants to merge 1 commit into dev from feat/llm-generate

Conversation

@happylittle7
Contributor

Overview

Introduces integration with an external LLM service to automate test data generation for programming problems, enhancing the problem creation workflow and reducing manual effort.

Details

  • Configures LLM test generation service settings, including endpoint and timeout.
  • Implements service logic to handle solution uploads, test case generation, and service health checks.
  • Provides new API endpoints for:
    • Health checks of the LLM testgen service
    • Automated test case generation based on problem data
    • Customizable test case generation using API parameters
    • Saving generated test cases directly to a problem
  • Updates routing to expose these endpoints.
  • Ensures permission checks and consistent API response formatting.
  • Supports both direct input/output generation and generation using provided solution code.

Motivation

Automates a previously manual and error-prone process, streamlines problem authoring, and provides a scalable foundation for future AI-assisted features.

- Configure LLM test generation settings in settings.py
- Implement LLM test generation logic in llm_testgen.py
- Add API views for LLM test generation in llm_testgen.py
- Update URLs to include LLM test generation endpoints

Copilot AI left a comment


Pull request overview

This PR introduces LLM-based automated test case generation functionality, integrating with an external LLM service to reduce manual effort in creating test data for programming problems.

Key Changes:

  • Adds new API endpoints for health checks, test case generation, and saving generated test cases
  • Implements service layer to handle LLM API communication including solution uploads and test generation
  • Configures LLM service settings with endpoint URL and timeout values

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 20 comments.

File Description
problems/views/llm_testgen.py Implements four API view classes for health checks, automatic generation, custom generation, and saving test cases with permission checks
problems/services/llm_testgen.py Provides service layer functions for uploading solutions, generating test cases, and health checks of the external LLM service
problems/urls.py Adds URL routing for the four new LLM test generation endpoints
back_end/settings.py Adds configuration for LLM service URL and timeout settings

Critical Issues Found:

  • Multiple database model field mismatches (Test_cases uses path fields not data fields, Problem_subtasks missing expected fields)
  • Security concerns with hardcoded production IP addresses and missing rate limiting
  • Missing input validation and potential race conditions in test case creation
  • Lack of test coverage for new endpoints


"""

import logging
import json

Copilot AI Dec 28, 2025


The 'json' module is imported but never used in this file. Remove this unused import to keep the code clean.

Suggested change
import json

Comment on lines +359 to +364
existing_count = Test_cases.objects.filter(subtask_id=subtask).count()

# Create the test case
Test_cases.objects.create(
subtask_id=subtask,
idx=existing_count + 1,

Copilot AI Dec 28, 2025


Potential race condition when computing 'idx' for test cases. The code queries the count and then creates the test case in two separate operations. If multiple requests create test cases for the same subtask concurrently, they could compute the same 'idx' value, leading to a database constraint violation on the unique constraint (subtask_id, idx). Consider using database-level sequence generation or wrapping this in a transaction with appropriate locking.

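A minimal sketch of one way to address this. The `next_idx` helper is hypothetical (not in the PR); the model and field names (`Test_cases`, `subtask_id`, `idx`) follow the diff above, and the locking pattern shown in the comment assumes Django's ORM.

```python
def next_idx(existing_indices):
    """Pure helper: allocate the next 1-based idx given the indices already taken.

    Using max()+1 rather than count()+1 also tolerates gaps left by deleted
    test cases, which count()+1 does not.
    """
    return max(existing_indices, default=0) + 1

# In the service this must run inside a transaction that locks the subtask's
# existing rows, so two concurrent requests cannot both observe the same set
# of indices. With Django (assumed), roughly:
#
#   with transaction.atomic():
#       taken = (Test_cases.objects
#                .select_for_update()
#                .filter(subtask_id=subtask)
#                .values_list('idx', flat=True))
#       Test_cases.objects.create(subtask_id=subtask, idx=next_idx(taken), ...)
```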
response.raise_for_status()
result = response.json()

logger.info(f'Upload solution response: {result}')

Copilot AI Dec 28, 2025


The code logs the full API response payload which may contain sensitive information or large data that could flood logs. Consider using logger.debug for detailed payload logging rather than logger.info, and potentially sanitize or truncate the logged data.

Suggested change
logger.info(f'Upload solution response: {result}')
# Log the response payload at debug level, truncated to avoid flooding logs
result_str = str(result)
max_log_length = 1000
if len(result_str) > max_log_length:
    result_str = result_str[:max_log_length] + '... [truncated]'
logger.debug(f'Upload solution response: {result_str}')

Comment on lines +34 to +35
'javascript': 'python', # LLM service does not support JS; fall back to Python for now
'js': 'python',

Copilot AI Dec 28, 2025


The JavaScript to Python fallback for unsupported languages is problematic. It silently converts JavaScript code to be executed as Python, which will likely fail during execution. Instead, this should return an error indicating that the language is not supported by the LLM service.

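A sketch of the error-raising alternative the comment suggests. The helper, exception class, and the exact set of languages the LLM service accepts are all assumptions, not taken from the PR.

```python
# Assumed allow-list of languages the LLM testgen service accepts.
SUPPORTED_LANGUAGES = {
    'python': 'python',
    'py': 'python',
    'cpp': 'cpp',
    'c': 'c',
}

class UnsupportedLanguageError(ValueError):
    """Raised when the client requests a language the LLM service cannot run."""

def resolve_language(lang: str) -> str:
    """Map a client-supplied language name to the service's identifier,
    raising instead of silently substituting a different language."""
    key = lang.strip().lower()
    if key not in SUPPORTED_LANGUAGES:
        raise UnsupportedLanguageError(
            f"Language '{lang}' is not supported by the LLM testgen service"
        )
    return SUPPORTED_LANGUAGES[key]
```

The view layer can catch `UnsupportedLanguageError` and return a 400 response, so the caller learns immediately that JavaScript solutions cannot be used, rather than getting an opaque execution failure later.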
subtasks.append({
'id': st.subtask_no,
'name': f'Subtask {st.subtask_no}',
'desc': st.description or '',

Copilot AI Dec 28, 2025


The Problem_subtasks model has 'description' as a potential field based on its usage pattern in the service (line 288), but this field doesn't exist in the model definition. The model only has 'weight', 'time_limit_ms', 'memory_limit_mb', and timestamp fields. This will cause an AttributeError at runtime.

Suggested change
'desc': st.description or '',
'desc': getattr(st, 'description', '') or '',

Comment on lines +12 to +16

logger = logging.getLogger(__name__)

# LLM test data generation service settings
LLM_TESTGEN_API_URL = getattr(settings, 'LLM_TESTGEN_API_URL', 'http://34.81.90.111:8001')

Copilot AI Dec 28, 2025


The LLM service URL should be validated when loaded from settings. If the URL is malformed or points to an invalid location, requests will fail with unclear errors. Consider adding validation to ensure the URL is properly formatted and optionally check service availability on startup.

Suggested change
logger = logging.getLogger(__name__)
# LLM test data generation service settings
LLM_TESTGEN_API_URL = getattr(settings, 'LLM_TESTGEN_API_URL', 'http://34.81.90.111:8001')

from urllib.parse import urlparse

logger = logging.getLogger(__name__)

# LLM test data generation service settings
DEFAULT_LLM_TESTGEN_API_URL = 'http://34.81.90.111:8001'

def _validate_llm_service_url(url: str) -> str:
    """
    Validate and normalize the base URL of the LLM service.
    Logs a warning and falls back to the default when the configured URL is invalid.
    """
    if not url:
        logger.warning(
            "LLM_TESTGEN_API_URL is empty or not set; falling back to default '%s'.",
            DEFAULT_LLM_TESTGEN_API_URL,
        )
        return DEFAULT_LLM_TESTGEN_API_URL
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        logger.warning(
            "Invalid LLM_TESTGEN_API_URL '%s'; falling back to default '%s'.",
            url,
            DEFAULT_LLM_TESTGEN_API_URL,
        )
        return DEFAULT_LLM_TESTGEN_API_URL
    # Strip trailing slashes to avoid duplicate '/' when composing URLs
    return url.rstrip("/")

LLM_TESTGEN_API_URL = _validate_llm_service_url(
    getattr(settings, 'LLM_TESTGEN_API_URL', DEFAULT_LLM_TESTGEN_API_URL)
)

Comment on lines +350 to +351
'score': 100 // len(set(t.get('subtask_id', 1) for t in testcases)),
'description': f'Subtask {subtask_no}'

Copilot AI Dec 28, 2025


The Problem_subtasks model does not have a 'score' field as seen in the model definition (it has 'weight' instead), and there's no 'description' field either. The defaults dictionary should use 'weight' instead of 'score', and should not reference non-existent 'description' field. Additionally, the field used in line 288 ('num_testcases') doesn't exist in the Problem_subtasks model.

Suggested change
'score': 100 // len(set(t.get('subtask_id', 1) for t in testcases)),
'description': f'Subtask {subtask_no}'
'weight': 100 // len(set(t.get('subtask_id', 1) for t in testcases)),

Comment on lines +362 to +367
Test_cases.objects.create(
subtask_id=subtask,
idx=existing_count + 1,
input_data=input_data,
expected_output=output_data
)

Copilot AI Dec 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Test_cases model uses 'input_path' and 'output_path' fields for file storage, not 'input_data' and 'expected_output' as used here. According to the model definition (lines 290-295), test cases store file paths, not the actual data content. This will cause a database error when trying to create test cases.


data = request.data
testcases = data.get('testcases', [])
create_subtasks = data.get('create_subtasks', True)

Copilot AI Dec 28, 2025


The parameter 'create_subtasks' is accepted in the request but never actually used in the logic. The code always creates subtasks via get_or_create regardless of this parameter's value. Either remove this unused parameter from the API contract or implement the conditional logic to respect it.

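A sketch of what honoring the flag could look like. The helper is hypothetical; a dict stands in for the database lookup, and the placeholder dict value stands in for a `Problem_subtasks.objects.create(...)` call.

```python
def resolve_subtask(existing_subtasks, subtask_no, create_subtasks):
    """Return the subtask for subtask_no, honoring the create_subtasks flag.

    existing_subtasks maps subtask_no -> subtask object (a stand-in for a DB
    query). When the subtask is missing and creation is disabled, raise
    instead of silently creating it as the current get_or_create path does.
    """
    if subtask_no in existing_subtasks:
        return existing_subtasks[subtask_no]
    if not create_subtasks:
        raise LookupError(
            f'Subtask {subtask_no} does not exist and create_subtasks is false'
        )
    subtask = {'subtask_no': subtask_no}  # placeholder for an ORM create call
    existing_subtasks[subtask_no] = subtask
    return subtask
```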
'ok': True,
'mode': result.get('mode'),
'testcases': result.get('data', {}).get('testcases', []),
'raw_response': result.get('data')

Copilot AI Dec 28, 2025


The API response includes 'raw_response' which may expose internal details or sensitive information from the LLM service. Consider whether this detailed response data should be included in production responses, or if it should only be available in debug mode or to admin users.

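One way to gate the field, sketched: build the response through a helper that attaches the raw payload only when explicitly allowed. The helper name is hypothetical; the response keys match the snippet above, and the caller would pass something like `include_raw=settings.DEBUG` or an admin-permission check.

```python
def build_generation_response(result, include_raw=False):
    """Shape the API response from the LLM service result, attaching the raw
    payload only when the caller is explicitly allowed to see it."""
    payload = {
        'ok': True,
        'mode': result.get('mode'),
        'testcases': result.get('data', {}).get('testcases', []),
    }
    if include_raw:
        # Internal service details: expose only in debug mode / to admins.
        payload['raw_response'] = result.get('data')
    return payload
```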
