fix(api): refactors the SQL LIKE pattern escaping logic to use a centralized utility function, ensuring consistent and secure handling of special characters across all database queries. #61

Open
tomerqodo wants to merge 4 commits into copilot_combined_20260121_qodo_grep_cursor_copilot_1_base_fixapi_refactors_the_sql_like_pattern_escaping_logic_to_use_a_centralized__utility_function_ensuring_consistent_and_secure_handling_of_special from
copilot_combined_20260121_qodo_grep_cursor_copilot_1_head_fixapi_refactors_the_sql_like_pattern_escaping_logic_to_use_a_centralized__utility_function_ensuring_consistent_and_secure_handling_of_special

Conversation

@tomerqodo

Benchmark PR from qodo-benchmark#431

NeatGuyCoding and others added 4 commits January 21, 2026 15:54
…ralized

utility function, ensuring consistent and secure handling of special characters
across all database queries.

Signed-off-by: NeatGuyCoding <15627489+NeatGuyCoding@users.noreply.github.com>
…logic

Signed-off-by: NeatGuyCoding <15627489+NeatGuyCoding@users.noreply.github.com>
Copilot AI review requested due to automatic review settings January 21, 2026 19:00

Copilot AI left a comment

Pull request overview

This pull request refactors SQL LIKE pattern escaping across the Dify API codebase by introducing a centralized utility function, escape_like_pattern(), to prevent SQL injection via LIKE wildcards. The changes ensure that the special characters %, _, and \ in user-provided search terms are escaped before being used in database queries.

Changes:

  • Added centralized escape_like_pattern() utility function in api/libs/helper.py with comprehensive unit tests
  • Updated 10+ service and controller files to use the new utility function for consistent SQL LIKE pattern escaping
  • Added integration tests to verify special character handling in search functionality across multiple services
  • Fixed circular dependency issues in test files by moving AppService imports to function scope
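
A minimal sketch of the utility described above, based on the two lines quoted in the api/libs/helper.py review thread of this PR. The signature, type hints, and docstring are assumptions, and the actual function may differ:

```python
def escape_like_pattern(pattern: str) -> str:
    """Escape LIKE wildcards so user input is matched literally.

    Assumed signature; only the chain of replace() calls is taken from the PR.
    """
    # The backslash is replaced first. Otherwise the backslashes introduced
    # while escaping % and _ would themselves be doubled.
    escaped_pattern = (
        pattern.replace("\\", "\\\\")
        .replace("%", "\\%")
        .replace("_", "\\_")
    )
    return escaped_pattern


# Typical use: wrap the escaped keyword in wildcards for a "contains" search.
like_pattern = f"%{escape_like_pattern('50%_off')}%"
```

The escaped pattern only behaves as intended when the query also declares the backslash as its escape character (for example through an ESCAPE clause), which is the subject of several comments in this review.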

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 13 comments.

Show a summary per file
| File | Description |
| --- | --- |
| api/libs/helper.py | Added escape_like_pattern() utility function to centrally handle SQL LIKE pattern escaping |
| api/tests/unit_tests/libs/test_helper.py | Added comprehensive unit tests for the new escape function covering edge cases |
| api/services/workflow_app_service.py | Refactored keyword search to use centralized escaping with proper ESCAPE clause |
| api/services/tag_service.py | Updated tag name filtering to use escaped patterns |
| api/services/external_knowledge_service.py | Updated search functionality with pattern escaping |
| api/services/dataset_service.py | Updated dataset name and content search with escaping |
| api/services/conversation_service.py | Replaced inline escaping logic with centralized utility |
| api/services/app_service.py | Updated app name filtering with escaped patterns |
| api/services/annotation_service.py | Updated annotation search with pattern escaping |
| api/core/rag/retrieval/dataset_retrieval.py | Updated metadata filtering with escaped patterns |
| api/core/rag/datasource/vdb/iris/iris_vector.py | Updated full-text search with pattern escaping |
| api/core/rag/datasource/vdb/clickzetta/clickzetta_vector.py | Updated LIKE-based search with pattern escaping |
| api/controllers/console/datasets/datasets_segments.py | Updated segment search with escaped keyword patterns |
| api/controllers/console/app/conversation.py | Updated conversation search with pattern escaping |
| api/tests/test_containers_integration_tests/services/*.py | Added integration tests for special character handling and fixed circular dependencies |

Comments suppressed due to low confidence (8)

api/tests/test_containers_integration_tests/services/test_tag_service.py:512

  • This import of module uuid is redundant, as it was previously imported on line 1.
        import uuid

api/tests/test_containers_integration_tests/services/test_tag_service.py:791

  • This import of module uuid is redundant, as it was previously imported on line 1.
        import uuid

api/tests/test_containers_integration_tests/services/test_tag_service.py:882

  • This import of module uuid is redundant, as it was previously imported on line 1.
        import uuid

api/tests/test_containers_integration_tests/services/test_tag_service.py:954

  • This import of module uuid is redundant, as it was previously imported on line 1.
        import uuid

api/tests/test_containers_integration_tests/services/test_tag_service.py:1063

  • This import of module uuid is redundant, as it was previously imported on line 1.
        import uuid

api/tests/test_containers_integration_tests/services/test_tag_service.py:1198

  • This import of module uuid is redundant, as it was previously imported on line 1.
        import uuid

api/tests/test_containers_integration_tests/services/test_tag_service.py:1246

  • This import of module uuid is redundant, as it was previously imported on line 1.
        import uuid

api/tests/test_containers_integration_tests/services/test_tag_service.py:1270

  • This import of module uuid is redundant, as it was previously imported on line 1.
        import uuid

Comment thread api/libs/helper.py
Comment on lines +64 to +65
escapedPattern = pattern.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
return escapedPattern
Copilot AI Jan 21, 2026

Variable name 'escapedPattern' does not follow Python naming conventions. It should use snake_case instead of camelCase. Consider renaming to 'escaped_pattern' for consistency with Python style guidelines (PEP 8).

Suggested change
escapedPattern = pattern.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
return escapedPattern
escaped_pattern = pattern.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
return escaped_pattern

Copilot uses AI. Check for mistakes.
  SELECT TOP {top_k} id, text, meta
  FROM {self.schema}.{self.table_name}
- WHERE text LIKE ?
+ WHERE text LIKE ? ESCAPE '|'
Copilot AI Jan 21, 2026

The ESCAPE clause is inconsistent with other usages in the codebase. Throughout the PR, the escape parameter uses "\" (e.g., escape="\\" for SQLAlchemy methods), but here it uses '|'. Because escape_like_pattern escapes with a backslash, the clause should also use '\' so that the escaped wildcards are interpreted as literals.

Suggested change
WHERE text LIKE ? ESCAPE '|'
WHERE text LIKE ? ESCAPE '\\'
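
To illustrate why the declared escape character has to match the one the helper emits, here is a small sketch that uses Python's built-in sqlite3 module as a stand-in for the IRIS backend. The items table, the search helper, and the inlined copy of escape_like_pattern are hypothetical, and IRIS itself may behave differently in other respects:

```python
import sqlite3


def escape_like_pattern(pattern: str) -> str:
    # Inlined copy of the helper quoted in this PR, so the sketch is self-contained.
    return pattern.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def search(conn: sqlite3.Connection, keyword: str, escape_char: str) -> list[str]:
    # The escape character is interpolated only for this demonstration;
    # the user-provided keyword is still passed as a bound parameter.
    pattern = f"%{escape_like_pattern(keyword)}%"
    sql = f"SELECT name FROM items WHERE name LIKE ? ESCAPE '{escape_char}' ORDER BY name"
    return [row[0] for row in conn.execute(sql, (pattern,))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT)")
conn.executemany("INSERT INTO items VALUES (?)", [("50% off",), ("500 off",)])

# Escape character matches the helper: "\%" is read as a literal percent sign.
matching = search(conn, "50%", "\\")

# Escape character differs: the backslash is now an ordinary character,
# so the pattern looks for the literal text "50\" and finds nothing.
mismatched = search(conn, "50%", "|")
```

In this sketch the matching variant returns only the row that contains a literal "50%", while the mismatched variant returns no rows at all.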

from libs.helper import escape_like_pattern

escaped_query = escape_like_pattern(query).replace("'", "''")
filter_clauses.append(f"{Field.CONTENT_KEY} LIKE '%{escaped_query}%' ESCAPE '\\\\'")
Copilot AI Jan 21, 2026

The ESCAPE clause syntax is inconsistent with other usages in the codebase. The string literal uses four backslashes ('\\\\'), which may be appropriate for raw SQL, but this differs from the SQLAlchemy escape parameter pattern used elsewhere (e.g., escape="\\"). Consider documenting why this database requires a different escape syntax, or ensure this is the correct escape character for ClickZetta.

Suggested change
filter_clauses.append(f"{Field.CONTENT_KEY} LIKE '%{escaped_query}%' ESCAPE '\\\\'")
# Use '\' as the LIKE escape character to match escape_like_pattern and other usages
filter_clauses.append(f"{Field.CONTENT_KEY} LIKE '%{escaped_query}%' ESCAPE '\\'")
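
The difference between the two spellings is easiest to see by looking at the SQL text each Python literal produces. The sketch below assumes a hypothetical column name in place of Field.CONTENT_KEY and makes no claim about which form ClickZetta actually requires:

```python
CONTENT_KEY = "content"  # hypothetical stand-in for Field.CONTENT_KEY
escaped_query = "50\\%"  # what the escaping helper would return for "50%"

# Four backslashes in the Python source become two in the SQL text: ESCAPE '\\'
sql_four = f"{CONTENT_KEY} LIKE '%{escaped_query}%' ESCAPE '\\\\'"

# Two backslashes in the Python source become one in the SQL text: ESCAPE '\'
sql_two = f"{CONTENT_KEY} LIKE '%{escaped_query}%' ESCAPE '\\'"

print(sql_four)
print(sql_two)
```

Which form is correct depends on whether the target dialect treats a backslash inside a string literal as an escape character. If it does, the two-backslash SQL text denotes a single backslash and the one-backslash text is malformed. If it does not, the one-backslash text is the single-character escape. That is why the comment asks for the choice to be documented or verified.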

  case "not contains":
- filters.append(json_field.notlike(f"%{value}%"))
+ escaped_value = escape_like_pattern(str(value))
+ filters.append(json_field.notlike(f"%{escaped_value}%"))
Copilot AI Jan 21, 2026

The 'not contains' case is missing the escape parameter. While the value is properly escaped using escape_like_pattern, the notlike() call should include escape="\\" parameter for consistency with the 'contains' case on line 1203. Without this, the escape characters won't be interpreted correctly by the database.

Suggested change
filters.append(json_field.notlike(f"%{escaped_value}%"))
filters.append(json_field.notlike(f"%{escaped_value}%", escape="\\"))
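
The effect of a missing escape declaration on a negated filter can be sketched with Python's built-in sqlite3 module. SQLite has no default LIKE escape character, whereas PostgreSQL and MySQL default to the backslash, so the impact on the project's actual database may differ. The docs table and its rows are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (title TEXT)")
conn.executemany("INSERT INTO docs VALUES (?)", [("a_b",), ("axb",)])

# Escaped form of the keyword "a_b", as the escaping helper would produce it.
pattern = "%a\\_b%"

# No ESCAPE clause: the backslash is an ordinary character, the LIKE never
# matches "a_b", and NOT LIKE therefore keeps the row it should exclude.
without_escape = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM docs WHERE title NOT LIKE ? ORDER BY title", (pattern,)
    )
]

# With ESCAPE '\': "\_" is a literal underscore, so "a_b" is excluded as intended.
with_escape = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM docs WHERE title NOT LIKE ? ESCAPE '\\' ORDER BY title",
        (pattern,),
    )
]
```

Here the query without the clause returns both rows, while the query with it returns only "axb".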

  case "end with":
- filters.append(json_field.like(f"%{value}"))
+ escaped_value = escape_like_pattern(str(value))
+ filters.append(json_field.like(f"%{escaped_value}", escape="\\"))
Copilot AI Jan 21, 2026

The 'end with' case is missing the escape parameter. While the value is properly escaped using escape_like_pattern, the like() call should include escape="\\" parameter for consistency with the 'contains' and 'start with' cases. Without this, the escape characters won't be interpreted correctly by the database.

account,
)

app_with_underscore = app_service.create_app(
Copilot AI Jan 21, 2026

Variable app_with_underscore is not used.

Suggested change
app_with_underscore = app_service.create_app(
app_service.create_app(

account,
)

app_with_backslash = app_service.create_app(
Copilot AI Jan 21, 2026

Variable app_with_backslash is not used.

Suggested change
app_with_backslash = app_service.create_app(
app_service.create_app(

)

# Create app that should NOT match
app_no_match = app_service.create_app(
Copilot AI Jan 21, 2026

Variable app_no_match is not used.

Suggested change
app_no_match = app_service.create_app(
app_service.create_app(

- SQL injection via LIKE wildcards is prevented
"""
# Arrange: Create test data
fake = Faker()
Copilot AI Jan 21, 2026

Variable fake is not used.

Suggested change
fake = Faker()

- SQL injection via LIKE wildcards is prevented
"""
# Arrange: Create test data
fake = Faker()
Copilot AI Jan 21, 2026

Variable fake is not used.

Suggested change
fake = Faker()

3 participants