-
Notifications
You must be signed in to change notification settings - Fork 2
jinja support in caterpillar #25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Wiz Scan Summary
To detect these findings earlier in the dev lifecycle, try using Wiz Code VS Code Extension. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds Jinja-style template support to the caterpillar configuration system, enabling file inclusion and date string templating for more modular and dynamic configuration management.
Key Changes
- Implements Jinja-style
{% include 'path' %}directive processing with recursive inclusion and circular dependency detection - Adds
dstemplate function to retrieve date strings from environment variables (DS, EXECUTION_DATE) or current date - Integrates include preprocessing before template parsing in the config loading pipeline
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| absPath, err := filepath.Abs(resolvedPath) | ||
| if err != nil { | ||
| return "", fmt.Errorf("failed to resolve absolute path for %s: %w", filePath, err) | ||
| } | ||
|
|
||
| if visited[absPath] { | ||
| return "", fmt.Errorf("circular include detected: %s", absPath) | ||
| } | ||
|
|
||
| // Read the included file | ||
| includedContent, err := os.ReadFile(resolvedPath) | ||
| if err != nil { | ||
| return "", fmt.Errorf("failed to read included file %s: %w", filePath, err) |
Copilot
AI
Dec 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The processIncludes function is missing validation to prevent directory traversal attacks. While filepath.Clean normalizes paths, it doesn't prevent accessing files outside the intended directory structure. An attacker could use {% include '../../../etc/passwd' %} to read sensitive files.
Consider adding a check after resolving the absolute path to ensure it stays within allowed directories:
// After line 139, add validation
if !strings.HasPrefix(absPath, filepath.Clean(baseDir)) {
return "", fmt.Errorf("include path %s is outside allowed directory", filePath)
}Alternatively, maintain a whitelist of allowed base directories for includes.
| func getDateString() (string, error) { | ||
| // Check for DS environment variable first (common in data pipelines) | ||
| if ds := os.Getenv("DS"); ds != "" { | ||
| return ds, nil | ||
| } | ||
| // Check for EXECUTION_DATE (alternative common name) | ||
| if execDate := os.Getenv("EXECUTION_DATE"); execDate != "" { | ||
| return execDate, nil | ||
| } | ||
| // Default to current date | ||
| return time.Now().Format("2006-01-02"), nil |
Copilot
AI
Dec 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The getDateString function always returns nil error but is declared to return an error. Since this function never fails (both environment variable reads and time.Now().Format() cannot fail), the error return type is unnecessary and inconsistent with similar functions like getEnvironmentVariable which also returns (string, error) but could be simplified.
Consider either:
- Removing the error return type:
func getDateString() string - Validating the date format from environment variables and returning an error if invalid
If you keep the error return, you should validate that DS/EXECUTION_DATE values are in the expected YYYY-MM-DD format.
| func getDateString() (string, error) { | |
| // Check for DS environment variable first (common in data pipelines) | |
| if ds := os.Getenv("DS"); ds != "" { | |
| return ds, nil | |
| } | |
| // Check for EXECUTION_DATE (alternative common name) | |
| if execDate := os.Getenv("EXECUTION_DATE"); execDate != "" { | |
| return execDate, nil | |
| } | |
| // Default to current date | |
| return time.Now().Format("2006-01-02"), nil | |
| func getDateString() string { | |
| // Check for DS environment variable first (common in data pipelines) | |
| if ds := os.Getenv("DS"); ds != "" { | |
| return ds | |
| } | |
| // Check for EXECUTION_DATE (alternative common name) | |
| if execDate := os.Getenv("EXECUTION_DATE"); execDate != "" { | |
| return execDate | |
| } | |
| // Default to current date | |
| return time.Now().Format("2006-01-02") |
| // Replace the include directive with the file content | ||
| result = strings.ReplaceAll(result, includePath, processedContent) | ||
| } |
Copilot
AI
Dec 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using strings.ReplaceAll to replace include directives could cause incorrect replacements if the same include directive appears multiple times or if the included content itself contains text that matches other include directives in the original file.
For example, if file A has {% include 'common.yaml' %} twice, both will be replaced with the content, which is correct. However, if the included content also contains the text {% include 'common.yaml' %} as a literal string (not meant to be processed), it will be incorrectly replaced in subsequent iterations.
Consider using a different approach:
- Replace includes one at a time in order
- Build the result string progressively by replacing only the first occurrence at each iteration
- Or process matches in reverse order to avoid position shifts
| // Replace the include directive with the file content | |
| result = strings.ReplaceAll(result, includePath, processedContent) | |
| } | |
| // Replace only the first occurrence of the include directive with the file content | |
| if idx := strings.Index(result, includePath); idx != -1 { | |
| result = result[:idx] + processedContent + result[idx+len(includePath):] | |
| } |
|
|
||
| var ( | ||
| // Jinja-style include directive: {% include 'path/to/file' %} | ||
| includeRegex = regexp.MustCompile(`\{\%\s*include\s+['"]([^'"]+)['"]\s*\%\}`) |
Copilot
AI
Dec 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The regex pattern \{\%\s*include\s+['"]([^'"]+)['"]\s*\%\} doesn't properly match Jinja-style include directives that may have mixed quotes. The pattern [^'"] will match until it finds either a single or double quote, which could lead to incorrect parsing if quotes are mixed or nested.
For example, the pattern would incorrectly parse:
{% include "file's_name.yaml" %}(stops at the apostrophe in "file's"){% include 'file"name.yaml' %}(stops at the internal quote)
Consider using separate patterns for single and double quotes:
includeRegex = regexp.MustCompile(`\{\%\s*include\s+(?:'([^']+)'|"([^"]+)")\s*\%\}`)Then handle both capture groups in the matching logic.
| includeRegex = regexp.MustCompile(`\{\%\s*include\s+['"]([^'"]+)['"]\s*\%\}`) | |
| includeRegex = regexp.MustCompile(`\{\%\s*include\s+(?:'([^']+)'|"([^"]+)")\s*\%\}`) |
Description
Jinja style in caterpillar
Types of changes
Checklist