-
Notifications
You must be signed in to change notification settings - Fork 0
Trigger scan on duplicate files #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review by Korbit AI
Korbit automatically attempts to detect when you fix issues in new commits.
| Category | Issue | Status |
|---|---|---|
| Loss of Error Context ▹ view | ||
| Unsafe subprocess execution with unvalidated paths ▹ view | ||
| Print Statement Instead of Logger ▹ view | ||
| Insufficient Log Format ▹ view | ||
| Redundant thumbnail saving ▹ view | ||
| BytesIO stream position not reset ▹ view | ||
| Incomplete function docstring ▹ view | ||
| Hardcoded font path ▹ view | ||
| Misleading function name ▹ view |
Files scanned
| File Path | Reviewed |
|---|---|
| app-review.py | ✅ |
Explore our documentation to understand the languages and file types we support and the files we ignore.
Check out our docs on how you can make Korbit work best for you and your team.
| except Exception as e: | ||
| return f"{image_path}: {e}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Loss of Error Context 
Tell me more
What is the issue?
Using a bare Exception catch without logging the full stack trace loses valuable debugging information.
Why this matters
When an error occurs, only the string representation of the error is returned, discarding the stack trace that could help diagnose the root cause.
Suggested change ∙ Feature Preview
Log the full exception details before returning the error message:
except Exception as e:
logging.exception(f"Error processing {image_path}")
return f"{image_path}: {e}"Provide feedback to improve future suggestions
💬 Looking for more details? Reply to this comment to chat with Korbit.
| result = subprocess.run( | ||
| ["exiftool", f"-{tag}", "-b", str(image_path)], | ||
| stdout=subprocess.PIPE, | ||
| stderr=subprocess.PIPE, | ||
| check=False, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unsafe subprocess execution with unvalidated paths 
Tell me more
What is the issue?
Direct use of user-provided input (image_path) in subprocess calls without proper path validation could lead to command injection.
Why this matters
An attacker could craft a malicious filename that includes shell commands, potentially leading to arbitrary code execution.
Suggested change ∙ Feature Preview
from pathlib import Path
def extract_cr3_preview(image_path: Path, tag: str = "PreviewImage") -> io.BytesIO:
if not image_path.is_file():
raise ValueError("Invalid file path")
image_path_str = str(image_path.resolve())
result = subprocess.run(
["exiftool", f"-{tag}", "-b", image_path_str],
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
check=False,
)Provide feedback to improve future suggestions
💬 Looking for more details? Reply to this comment to chat with Korbit.
| if not out_path.exists(): | ||
| print(f"Thumbnail does not exist: {uri} {out_path}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Print Statement Instead of Logger 
Tell me more
What is the issue?
Using print() statement instead of logging for notifying about missing thumbnails.
Why this matters
Using print() makes it impossible to control output verbosity and breaks logging consistency. This message would not appear in log files or monitoring systems.
Suggested change ∙ Feature Preview
Replace with: logging.debug(f"Thumbnail does not exist: {uri} {out_path}")
Provide feedback to improve future suggestions
💬 Looking for more details? Reply to this comment to chat with Korbit.
|
|
||
| from tqdm import tqdm | ||
|
|
||
| logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Insufficient Log Format 
Tell me more
What is the issue?
Basic logging format missing crucial debugging information like timestamp and module name.
Why this matters
Current format makes it difficult to trace when issues occurred and from which module they originated, complicating debugging in production.
Suggested change ∙ Feature Preview
Update to: logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
Provide feedback to improve future suggestions
💬 Looking for more details? Reply to this comment to chat with Korbit.
| thumb.save(out_path, format="PNG", pnginfo=metadata) | ||
|
|
||
| file_extension = image_path.suffix.lower() | ||
| if file_extension in SUPPORTED_EXTENSIONS: | ||
| thumb = add_raw_label(thumb, label, file_extension) | ||
| thumb.save(out_path, format="PNG", pnginfo=metadata) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Redundant thumbnail saving 
Tell me more
What is the issue?
The thumbnail is being saved twice for each size, once without the label and immediately after with the label, overwriting the first save.
Why this matters
This causes unnecessary I/O operations and processing time, impacting the overall performance of the thumbnail generation process.
Suggested change ∙ Feature Preview
Remove the first save operation and only save after adding the label:
out_dir = output_base / label
out_dir.mkdir(parents=True, exist_ok=True)
out_path = out_dir / out_filename
file_extension = image_path.suffix.lower()
if file_extension in SUPPORTED_EXTENSIONS:
thumb = add_raw_label(thumb, label, file_extension)
thumb.save(out_path, format="PNG", pnginfo=metadata)Provide feedback to improve future suggestions
💬 Looking for more details? Reply to this comment to chat with Korbit.
| img_bytes_io = io.BytesIO() | ||
| img.save(img_bytes_io, format="JPEG") | ||
|
|
||
| return img_bytes_io |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BytesIO stream position not reset 
Tell me more
What is the issue?
The BytesIO object is returned without seeking to the start, making it unusable for subsequent read operations.
Why this matters
When the BytesIO object is used later, it will be at the end of the stream, resulting in no data being read.
Suggested change ∙ Feature Preview
Reset the stream position before returning:
img_bytes_io = io.BytesIO()
img.save(img_bytes_io, format="JPEG")
img_bytes_io.seek(0)
return img_bytes_ioProvide feedback to improve future suggestions
💬 Looking for more details? Reply to this comment to chat with Korbit.
| def add_raw_label(image: Image.Image, label: str, text: str) -> Image.Image: | ||
| """Add a label to the image indicating the original file format.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incomplete function docstring 
Tell me more
What is the issue?
The docstring doesn't explain the parameters or return value, and doesn't clarify why the label is being added.
Why this matters
Makes it harder for developers to understand the expected inputs and purpose of the label addition.
Suggested change ∙ Feature Preview
"""Add a label to the image showing the original file format for RAW files.
Args:
image: The PIL Image to modify
label: The thumbnail size category (e.g. 'normal', 'large')
text: The file extension to display
Returns:
The modified image with label added"""
Provide feedback to improve future suggestions
💬 Looking for more details? Reply to this comment to chat with Korbit.
| logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s") | ||
|
|
||
| SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".cr2", ".cr3", ".arw"} | ||
| FONT = "/usr/share/fonts/truetype/jetbrains-mono/JetBrainsMono-Bold.ttf" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hardcoded font path 
Tell me more
What is the issue?
Hardcoded font path that may not exist on all systems.
Why this matters
The script may fail on systems with different font paths or missing the specific font, making it less portable.
Suggested change ∙ Feature Preview
Move to configuration and provide fallback options:
FONT_PATHS = [
"/usr/share/fonts/truetype/jetbrains-mono/JetBrainsMono-Bold.ttf",
"/usr/share/fonts/TTF/DejaVuSans-Bold.ttf",
# Add more fallback paths
]
def get_system_font():
for path in FONT_PATHS:
if Path(path).exists():
return path
return None
FONT = get_system_font()Provide feedback to improve future suggestions
💬 Looking for more details? Reply to this comment to chat with Korbit.
| return image | ||
|
|
||
|
|
||
| def extract_cr3_preview(image_path: Path, tag: str = "PreviewImage") -> io.BytesIO: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Misleading function name 
Tell me more
What is the issue?
Function name is too specific (cr3) despite handling multiple RAW formats (cr2, cr3, arw).
Why this matters
The function name doesn't accurately reflect its capabilities, making it misleading for maintainers.
Suggested change ∙ Feature Preview
Rename to better reflect functionality:
def extract_raw_preview(image_path: Path, tag: str = "PreviewImage") -> io.BytesIO:Provide feedback to improve future suggestions
💬 Looking for more details? Reply to this comment to chat with Korbit.
Description by Korbit AI
What change is being made?
Add a new Python script
app.pyto generate PNG thumbnails for images, with support for freedesktop sizes, using parallel processing.Why are these changes being made?
This change introduces an automated way to generate image thumbnails in various sizes, accommodating both standard image formats and RAW formats through metadata extraction and rotational adjustments using ExifTool. The support for multiprocessing enhances processing speed, making this implementation efficient for handling large batches of images.