fix(web): fix lightbox black screen and add video thumbnails#170
fix(web): fix lightbox black screen and add video thumbnails#170GeiserX wants to merge 9 commits into
Conversation
- Extract shared media_utils.py with proper arithmetic for Telegram marked-ID resolution (fixes channel fallback producing wrong paths) - Fix ACL bypass: enforce access control against resolved path, not URL - Fix entrypoint crash when media table doesn't exist on pre-v6 databases - Remove dead _BATCH_SIZE constant and identical dialect branches in migration - Add try/except for RuntimeError (symlink loops) in fallback paths - Add debug logging to fallback and ACL legacy grant paths - Fix except syntax to use parenthesized tuple form - Add 25 unit tests for media_utils with roundtrip consistency checks - Bump version to 7.10.15
Return empty list for non-numeric folders (e.g., "chat1") instead of raising ValueError. Fixes test_web_thumbnails failure on CI.
…ry tests - Strip chat IDs from all debug log statements (PII compliance) - Fix thumbnail ACL bypass: ensure_thumbnail now returns resolved folder so serve_thumbnail checks ACL on the actual resolved path - Simplify serve_media path reconstruction (str(relative_to) vs manual join) - Guard folder_int <= 0 in legacy_folder_alternates - Fix import style in thumbnails.py (absolute → relative) - Add boundary tests for exact CHANNEL_ID_OFFSET inflection point - Add integration tests for ACL bypass scenario (deny + allow)
…-depth Keep the pre-resolution ACL check (prevents file existence leakage), and add a secondary check only when the resolved folder differs from the requested folder (prevents bypass via legacy fallback).
Lightbox from gallery was setting lightboxMedia to an array instead of a single object. Introduced lightboxList ref for gallery navigation while keeping lightboxMedia as a single item. Video thumbnails: added ffmpeg to Docker image and extended thumbnail generation to extract frames from video files (mp4, mkv, avi, mov, webm). Gallery endpoint now assigns thumb_url for video extensions too.
|
Caution Review failedPull request was closed or merged during review No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
✅ Files skipped from review due to trivial changes (1)
📝 WalkthroughWalkthroughMedia serving centralizes legacy folder/ID mapping, adds ffmpeg-based video thumbnailing that returns (thumb_path, resolved_folder), enforces ACLs against resolved folders, scopes lightbox navigation per album, hardens pre-Alembic migration detection, and bumps version to 7.11.0. ChangesMedia Legacy Handling and Video Thumbnails
🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
⚔️ Resolve merge conflicts
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #170 +/- ##
=======================================
Coverage 92.55% 92.55%
=======================================
Files 22 23 +1
Lines 6527 6599 +72
=======================================
+ Hits 6041 6108 +67
- Misses 486 491 +5
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (1)
tests/test_web_thumbnails.py (1)
217-219: ⚡ Quick winAdd a cache-hit legacy-resolution regression test.
This test covers cached return shape, but not the case where cached thumbnail originated from an alternate legacy folder. Add one case asserting
resolved_folderstays the alternate value on cache hits.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_web_thumbnails.py` around lines 217 - 219, Add a new test case that simulates a cache hit where the cached thumbnail was created from an alternate legacy folder and asserts the function returns the cached thumb path and preserves the original alternate resolved_folder; specifically, create a scenario around the existing assertions for thumb_path and resolved_folder (variables thumb_path, resolved_folder, thumb.resolve(), folder) but seed the cache or call the legacy-resolution helper so the cached entry records a different folder (e.g., alt_folder) and then assert self.assertEqual(thumb_path, thumb.resolve()) and self.assertEqual(resolved_folder, alt_folder) to ensure legacy-folder identity is preserved on cache hits.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/web/main.py`:
- Line 943: The except clause uses Python 2 multi-exception syntax ("except
OSError, ValueError, RuntimeError") which is invalid in Python 3; update the
handler in src/web/main.py so the except clause catches multiple exceptions
using a parenthesized tuple (e.g., convert the comma-separated form into an
except that takes a tuple of exceptions) and keep the existing exception
variable/handling logic unchanged; repeat this change for all other occurrences
listed (lines referenced in the review: the handlers at 336, 663, 700, 930, 943,
1078, 1230, 2027, 2085, 2224, 2278) so there are no remaining "except X, Y"
forms in the file.
In `@src/web/templates/index.html`:
- Line 2723: In both openMediaItem and openMedia, remove any previously attached
keyboard handler before adding a new one to avoid duplicate listeners: call
document.removeEventListener('keydown', handleLightboxKeydown) just before the
existing document.addEventListener('keydown', handleLightboxKeydown) line (same
pattern at the occurrences around handleLightboxKeydown in lines ~2723 and
~3546) so the handler is cleaned up prior to re-attaching.
In `@src/web/thumbnails.py`:
- Around line 186-190: The cached-path early return returns (dest,
resolved_folder) before any alternate-folder resolution can update
resolved_folder, which can bypass the secondary ACL check in serve_thumbnail;
fix by computing or updating resolved_folder (i.e., run the same
alternate-folder resolution logic that may change resolved_folder) before the
dest.exists() cache-hit return in the function that sets resolved_folder and
dest, or move the dest.exists() check to after the alternate-folder resolution,
so the returned resolved_folder reflects any alternate resolution used by
serve_thumbnail.
- Line 123: The logger.warning call in thumbnails.py is logging the full source
path (logger.warning("Video thumbnail failed for %s: %s", source, e)), which can
include chat/topic identifiers; change this to avoid logging PII by removing or
sanitizing the source variable — e.g., log a non-identifying aggregate or a
fixed message and include only safe metadata (like a boolean or count), or log
just the file basename or a checksum if needed; update the logger.warning
invocation referencing logger.warning and the source variable accordingly so no
chat IDs, topic IDs, or topic titles are emitted.
- Around line 203-204: The except clause in src/web/thumbnails.py uses Python2
syntax "except OSError, RuntimeError:" which fails in Python3; change it to
valid Python3 multi-exception syntax by catching a tuple like "except (OSError,
RuntimeError):" (or split into two except blocks) at the exception handling
point in the thumbnails generation function where that clause appears so the
code parses and handles both errors correctly.
---
Nitpick comments:
In `@tests/test_web_thumbnails.py`:
- Around line 217-219: Add a new test case that simulates a cache hit where the
cached thumbnail was created from an alternate legacy folder and asserts the
function returns the cached thumb path and preserves the original alternate
resolved_folder; specifically, create a scenario around the existing assertions
for thumb_path and resolved_folder (variables thumb_path, resolved_folder,
thumb.resolve(), folder) but seed the cache or call the legacy-resolution helper
so the cached entry records a different folder (e.g., alt_folder) and then
assert self.assertEqual(thumb_path, thumb.resolve()) and
self.assertEqual(resolved_folder, alt_folder) to ensure legacy-folder identity
is preserved on cache hits.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 82290f7c-f388-4142-9753-20e3ab406121
⛔ Files ignored due to path filters (1)
alembic/versions/20260524_013_fix_media_file_paths.pyis excluded by!alembic/versions/**
📒 Files selected for processing (11)
Dockerfilepyproject.tomlscripts/entrypoint.shsrc/__init__.pysrc/web/main.pysrc/web/media_utils.pysrc/web/templates/index.htmlsrc/web/thumbnails.pytests/test_media_utils.pytests/test_web_coverage.pytests/test_web_thumbnails.py
| logger.debug("Legacy fallback: served media via alternate folder resolution") | ||
| break | ||
| except OSError, ValueError: | ||
| except OSError, ValueError, RuntimeError: |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
rg -nP '^\s*except\s+[A-Za-z_][\w.]*\s*,\s*[A-Za-z_]' src/web/main.pyRepository: GeiserX/Telegram-Archive
Length of output: 632
🏁 Script executed:
rg -nP '^\s*except\s+[A-Za-z_][\w.]*\s*,\s*[A-Za-z_]' src/web/main.pyRepository: GeiserX/Telegram-Archive
Length of output: 632
🏁 Script executed:
rg -nP '^\s*except\s+[A-Za-z_][\w.]*\s*,\s*[A-Za-z_]' src/web/main.pyRepository: GeiserX/Telegram-Archive
Length of output: 632
🏁 Script executed:
rg -nP '^\s*except\s+[A-Za-z_][\w.]*\s*,\s*[A-Za-z_]' src/web/main.pyRepository: GeiserX/Telegram-Archive
Length of output: 632
🏁 Script executed:
rg -nP '^\s*except\s+[A-Za-z_][\w.]*\s*,\s*[A-Za-z_]' src/web/main.pyRepository: GeiserX/Telegram-Archive
Length of output: 632
Fix Python 3 invalid except A, B[, ...]: multi-exception syntax in src/web/main.py.
src/web/main.py still contains Python-2-style handlers at 336, 663, 700, 930, 943, 1078, 1230, 2027, 2085, 2224, 2278. Update each to except (A, B[, ...]):.
Suggested fix (line 943)
- except OSError, ValueError, RuntimeError:
+ except (OSError, ValueError, RuntimeError):
continueRe-run the same search; expected: no remaining except X, Y matches in this file.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| except OSError, ValueError, RuntimeError: | |
| except (OSError, ValueError, RuntimeError): | |
| continue |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/web/main.py` at line 943, The except clause uses Python 2 multi-exception
syntax ("except OSError, ValueError, RuntimeError") which is invalid in Python
3; update the handler in src/web/main.py so the except clause catches multiple
exceptions using a parenthesized tuple (e.g., convert the comma-separated form
into an except that takes a tuple of exceptions) and keep the existing exception
variable/handling logic unchanged; repeat this change for all other occurrences
listed (lines referenced in the review: the handlers at 336, 663, 700, 930, 943,
1078, 1230, 2027, 2085, 2224, 2278) so there are no remaining "except X, Y"
forms in the file.
| lightboxIndex.value = startIdx | ||
| lightboxMedia.value = mapped[startIdx] | ||
| lightboxOpen.value = true | ||
| document.addEventListener('keydown', handleLightboxKeydown) |
There was a problem hiding this comment.
Remove existing keyboard handler before adding to prevent duplicates.
Both openMediaItem and openMedia add the keyboard handler without removing any existing one. If the lightbox is already open and another item is clicked, duplicate handlers accumulate.
🔧 Suggested fix
Add cleanup before attaching:
lightboxOpen.value = true
+document.removeEventListener('keydown', handleLightboxKeydown)
document.addEventListener('keydown', handleLightboxKeydown)Apply this pattern in both functions (lines 2723 and 3546).
Also applies to: 3546-3546
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/web/templates/index.html` at line 2723, In both openMediaItem and
openMedia, remove any previously attached keyboard handler before adding a new
one to avoid duplicate listeners: call document.removeEventListener('keydown',
handleLightboxKeydown) just before the existing
document.addEventListener('keydown', handleLightboxKeydown) line (same pattern
at the occurrences around handleLightboxKeydown in lines ~2723 and ~3546) so the
handler is cleaned up prior to re-attaching.
| resolved_folder = folder | ||
|
|
||
| if dest.exists(): | ||
| return dest | ||
| return dest, resolved_folder | ||
|
|
There was a problem hiding this comment.
Preserve resolved_folder on cache hits before returning.
Line 188 returns cached thumbnails before alternate-folder resolution can update resolved_folder. That can suppress the secondary ACL check in serve_thumbnail and leak cross-chat thumbnails after cache warmup.
Suggested fix
- if dest.exists():
- return dest, resolved_folder
-
if not source.exists():
alt_folders = legacy_folder_alternates(folder)
found = False
@@
if not found:
return None
+
+ # Resolve source folder first so callers can enforce ACL on the true origin.
+ if dest.exists():
+ return dest, resolved_folderAlso applies to: 191-201
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/web/thumbnails.py` around lines 186 - 190, The cached-path early return
returns (dest, resolved_folder) before any alternate-folder resolution can
update resolved_folder, which can bypass the secondary ACL check in
serve_thumbnail; fix by computing or updating resolved_folder (i.e., run the
same alternate-folder resolution logic that may change resolved_folder) before
the dest.exists() cache-hit return in the function that sets resolved_folder and
dest, or move the dest.exists() check to after the alternate-folder resolution,
so the returned resolved_folder reflects any alternate resolution used by
serve_thumbnail.
| except OSError, RuntimeError: | ||
| continue |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
rg -nP '^\s*except\s+[A-Za-z_][\w.]*\s*,\s*[A-Za-z_]' src/web/thumbnails.pyRepository: GeiserX/Telegram-Archive
Length of output: 114
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Context around the match =="
nl -ba src/web/thumbnails.py | sed -n '185,220p'
echo
echo "== Any other invalid except A, B in this file =="
rg -nP '^\s*except\s+[A-Za-z_][\w.]*\s*,\s*[A-Za-z_]' src/web/thumbnails.pyRepository: GeiserX/Telegram-Archive
Length of output: 160
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Context around the match (lines 185-220) =="
awk 'NR>=185 && NR<=220 {printf "%d:%s\n", NR, $0}' src/web/thumbnails.py
echo
echo "== Any other invalid except A, B in this file =="
rg -nP '^\s*except\s+[A-Za-z_][\w.]*\s*,\s*[A-Za-z_]' src/web/thumbnails.pyRepository: GeiserX/Telegram-Archive
Length of output: 1348
Fix invalid Python 3 multi-exception syntax in src/web/thumbnails.py.
except OSError, RuntimeError: (lines 203-204) uses Python 2 syntax and will fail parsing under Python 3.
Suggested fix
- except OSError, RuntimeError:
+ except (OSError, RuntimeError):
continue🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/web/thumbnails.py` around lines 203 - 204, The except clause in
src/web/thumbnails.py uses Python2 syntax "except OSError, RuntimeError:" which
fails in Python3; change it to valid Python3 multi-exception syntax by catching
a tuple like "except (OSError, RuntimeError):" (or split into two except blocks)
at the exception handling point in the thumbnails generation function where that
clause appears so the code parses and handles both errors correctly.
…, tests - Add separate video semaphore (limit 2) to prevent CPU/memory pressure - Consolidate extension lists into media_utils.py (single source of truth) - Fix ffmpeg to seek before input and fallback to first frame for short videos - Remove absolute path from video thumbnail log warning (PII guideline) - Add tests for _is_video, _check_ffmpeg, _generate_video_sync - Update test_rejects_non_image_file to reflect video acceptance
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/web/main.py (1)
821-822:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRemove folder identifier from log message.
Line 821 logs
parts[0]which is the folder name (typically a chat ID).Suggested fix
- logger.warning("Blocked restricted media request for non-chat folder: %s", parts[0]) + logger.warning("Blocked restricted media request for non-chat folder")As per coding guidelines, Never log chat IDs, topic IDs, or topic titles — log only aggregated counts to avoid PII exposure.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/web/main.py` around lines 821 - 822, The log call in logger.warning("Blocked restricted media request for non-chat folder: %s", parts[0]) exposes a folder/chat identifier (parts[0]); remove the identifier from the message and log only non-PII info (e.g., a generic message or an aggregated count). Update the logger.warning invocation in the same block (where HTTPException is raised) to omit parts[0] and, if useful, include only a safe metric such as a counter or "count=1" instead of the raw folder ID.src/web/thumbnails.py (2)
143-144:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRemove path logging that can expose chat identifiers.
Line 143 logs the full
sourcepath with file size information.Suggested fix
- logger.warning("Source too large for thumbnail: %s (%d bytes)", source, source.stat().st_size) + logger.warning("Source too large for thumbnail generation")As per coding guidelines, Never log chat IDs, topic IDs, or topic titles — log only aggregated counts to avoid PII exposure.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/web/thumbnails.py` around lines 143 - 144, The warning call in thumbnails.py currently logs the full file path via logger.warning("Source too large for thumbnail: %s (%d bytes)", source, source.stat().st_size) which can expose chat/topic identifiers; change the logger.warning invocation in the thumbnail generation function to omit the full path and log only non-identifying info (e.g., the file size and a generic label or count). Locate the logger.warning usage and replace the "%s" path argument with either a static string like "source" or an aggregated indicator, keeping the source.stat().st_size value for diagnostics and removing any sensitive path or ID data.
151-152:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRemove path logging that can expose chat identifiers.
Line 151 logs the full
sourcepath, which embeds chat IDs/folder identifiers.Suggested fix
- logger.warning("Thumbnail generation failed for %s: %s", source, e) + logger.warning("Thumbnail generation failed: %s", e)As per coding guidelines, Never log chat IDs, topic IDs, or topic titles — log only aggregated counts to avoid PII exposure.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/web/thumbnails.py` around lines 151 - 152, The warning log currently emits the full source path (logger.warning("Thumbnail generation failed for %s: %s", source, e)), which can expose chat/topic identifiers; update the log in the thumbnail generation code to avoid printing the raw source variable—use only non-PII info such as a sanitized indicator (e.g., a boolean/exists flag, an aggregated counter, or a fixed message) or a one-way hash/length instead of the path, and keep the exception e for diagnostics; locate the occurrence of logger.warning and replace the message to exclude source path details while preserving useful error context.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@src/web/main.py`:
- Around line 821-822: The log call in logger.warning("Blocked restricted media
request for non-chat folder: %s", parts[0]) exposes a folder/chat identifier
(parts[0]); remove the identifier from the message and log only non-PII info
(e.g., a generic message or an aggregated count). Update the logger.warning
invocation in the same block (where HTTPException is raised) to omit parts[0]
and, if useful, include only a safe metric such as a counter or "count=1"
instead of the raw folder ID.
In `@src/web/thumbnails.py`:
- Around line 143-144: The warning call in thumbnails.py currently logs the full
file path via logger.warning("Source too large for thumbnail: %s (%d bytes)",
source, source.stat().st_size) which can expose chat/topic identifiers; change
the logger.warning invocation in the thumbnail generation function to omit the
full path and log only non-identifying info (e.g., the file size and a generic
label or count). Locate the logger.warning usage and replace the "%s" path
argument with either a static string like "source" or an aggregated indicator,
keeping the source.stat().st_size value for diagnostics and removing any
sensitive path or ID data.
- Around line 151-152: The warning log currently emits the full source path
(logger.warning("Thumbnail generation failed for %s: %s", source, e)), which can
expose chat/topic identifiers; update the log in the thumbnail generation code
to avoid printing the raw source variable—use only non-PII info such as a
sanitized indicator (e.g., a boolean/exists flag, an aggregated counter, or a
fixed message) or a one-way hash/length instead of the path, and keep the
exception e for diagnostics; locate the occurrence of logger.warning and replace
the message to exclude source path details while preserving useful error
context.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 9062b3b1-7d7e-4cf9-9ae6-8d4ca037abdf
📒 Files selected for processing (4)
src/web/main.pysrc/web/media_utils.pysrc/web/thumbnails.pytests/test_web_thumbnails.py
|
Superseded by new PR from clean branch (rebased onto main) |
Summary
openMediaItem()was settinglightboxMediato an array instead of a single objectthumb_urlfor video files (mp4, mkv, avi, etc.)Test plan
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Chores