-
Notifications
You must be signed in to change notification settings - Fork 0
Open
Labels
bugSomething isn't workingSomething isn't workingcriticalCritical priority issues that block functionalityCritical priority issues that block functionality
Description
Title: Fix eval() security vulnerability in transform operations
Problem Description
The transform command uses eval() to evaluate user-provided expressions, which poses a security risk even with current safeguards. The validation is not comprehensive enough to prevent all attack vectors.
Current Behavior
In excel_toolkit/operations/transforming.py (line 148):
result = eval(expression, {"__builtins__": {}}, series_dict)While the code blocks dangerous patterns like import, exec, __, etc., there are still potential attack vectors:
- Attribute access bypass:
(1).__class__.__base__can access Python internals - Lambda expressions: Can be constructed to bypass filters
- List comprehensions: May allow arbitrary code execution
- Operator overloading: Could exploit dunder methods
Real-World Risk Scenarios
Scenario 1: Information disclosure
# Attacker could use:
xl transform data.xlsx --col "Amount" --expr "(1).__class__.__base__.__subclasses__()"
# This would reveal all available classes in the systemScenario 2: Resource exhaustion
# Attacker could create expressions that consume massive resources
xl transform data.xlsx --col "Amount" --expr "([1 for _ in range(1000000)], 0)[1]"Scenario 3: Bypassing filters
# Current filters block: import, exec, eval, __, org., sys., etc.
# But don't block: __class__, __base__, __subclasses__, etc.Affected Files
excel_toolkit/operations/transforming.py(lines 28-46, 148-154)excel_toolkit/commands/transform.py
Proposed Solutions
Option 1: Use AST Parsing and Whitelisting (Recommended)
import ast
def validate_expression_ast(expression: str) -> Result[None, InvalidExpressionError]:
"""Validate expression using AST parsing."""
try:
tree = ast.parse(expression, mode='eval')
except SyntaxError:
return err(InvalidExpressionError("Invalid syntax"))
# Only allow specific node types
allowed_nodes = {
ast.Expression, ast.BinOp, ast.UnaryOp,
ast.Compare, ast.BoolOp,
ast.Name, ast.Constant, ast.Num, ast.Str,
ast.Load, ast.Add, ast.Sub, ast.Mult, ast.Div,
ast.Mod, ast.Pow, ast.USub, ast.UAdd,
ast.Eq, ast.NotEq, ast.Lt, ast.LtE, ast.Gt, ast.GtE,
ast.And, ast.Or, ast.Not
}
for node in ast.walk(tree):
if type(node) not in allowed_nodes:
return err(InvalidExpressionError(
f"Operation '{type(node).__name__}' is not allowed"
))
return ok(None)Option 2: Use Simple Math Expression Parser
Replace eval() with a safe expression parser:
import simpleeval
def evaluate_expression_safe(expression: str, variables: dict) -> Result[Any, InvalidExpressionError]:
"""Evaluate expression using safe parser."""
try:
# simpleeval only allows basic math operations
evaluator = simpleeval.SimpleEval()
evaluator.names.update(variables)
result = evaluator.eval(expression)
return ok(result)
except Exception as e:
return err(InvalidExpressionError(str(e)))Requires adding dependency: simpleeval>=1.0.0
Option 3: Implement Custom Expression Language
Create a domain-specific language with only safe operations:
# Supported syntax:
# - Column references: Amount, Price
# - Basic math: Amount * 2, Price + 10
# - Functions: sum(Amount), avg(Price), round(Amount, 2)
# - Comparisons: Amount > 100, Price == 50
# Not supported:
# - Any Python code
# - Function calls
# - Imports
# - Complex expressionsOption 4: Add Comprehensive Input Validation
Keep eval() but add much stronger validation:
DANGEROUS_PATTERNS = [
r'\bimport\b', r'\bexec\b', r'\beval\b',
r'__class__', r'__base__', r'__subclasses__',
r'\.__', r'\[.*__.*\]', # Access to dunder methods
r'\blambda\b', # Lambda expressions
r'\[.*for.*in.*\]', # List comprehensions
r'\bgetattr\b', r'\bsetattr\b', r'\bhasattr\b',
r'\btype\b', r'\bisinstance\b',
]Additional Safeguards
- Sandbox execution:
import RestrictedPython
def eval_restricted(expression: str, variables: dict):
"""Evaluate in restricted environment."""
compiled = RestrictedPython.compile_restricted(expression, '<string>', 'eval')
return eval(compiled.code, {'__builtins__': {}}, variables)- Resource limits:
import signal
def timeout_handler(signum, frame):
raise TimeoutError("Expression evaluation timed out")
signal.signal(signal.SIGALRM, timeout_handler)
signal.alarm(1) # 1 second timeout
result = eval(expression, {...}, variables)
signal.alarm(0)- Expression complexity limit:
def max_expression_complexity(expression: str, max_length: int = 100):
"""Limit expression complexity."""
if len(expression) > max_length:
raise InvalidExpressionError("Expression too complex")Impact
- Severity: Critical security vulnerability
- Affected users: All users of the
transformcommand - Exploitability: Medium (requires user to run malicious expression)
- Risk: Code execution, information disclosure, DoS
Related Issues
- No input sanitization for user parameters (Non-critical warning messages pollute output (Slicer List extension) #14)
- Inconsistent security validation (No column indexing - cannot reference columns by position when names have special characters #12)
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
bugSomething isn't workingSomething isn't workingcriticalCritical priority issues that block functionalityCritical priority issues that block functionality