A tool for downloading, auditing, and organizing content from Canvas LMS courses. Available as a graphical desktop application or command-line tool.
- Overview
- Requirements
- Installation
- Quick Start
- Usage
- Configuration
- Security and Privacy
- Accessibility
- Pipeline Testing
- Program Flags Reference
- Obtaining a Canvas API Access Token
- Uninstall
- Support
- License
CanvasBot is a Windows application designed for accessible media coordinators and instructional designers at universities. It connects to Canvas LMS via the REST API to:
- Download all files from courses (documents, videos, audio, images)
- Categorize embedded content by type using configurable regex patterns
- Export content inventories to Excel or JSON for accessibility auditing
- Track download progress to avoid re-downloading files
CanvasBot operates in read-only mode — it reads course content via the Canvas API but never creates, modifies, or deletes any content, grades, enrollments, or settings in Canvas.
CanvasBot can be used through a graphical user interface (GUI) or the command line (CLI). Double-click the executable or run without arguments to launch the GUI; pass command-line flags for scripted/automated workflows.
Download course content into organized folder structures that mirror the Canvas course hierarchy. Files are organized by module, assignment, and content type for easy navigation.
Canvasbot.exe --course_id 12345 --download_folder "C:\Downloads" --include_video_files
Output structure:
Biology 101 - 12345/
├── 26-01-2026/
│ ├── Module 1 - Introduction/
│ │ ├── Week 1 Assignment/
│ │ │ └── Documents/
│ │ │ ├── Content Location.lnk ← shortcut to Canvas page
│ │ │ └── syllabus.pdf
│ │ └── VideoFiles/
│ │ └── welcome_video.mp4
│ └── Module 2 - Cell Biology/
│ └── Documents/
│ └── lecture_notes.docx
CanvasBot tracks previously downloaded files in a manifest, so running it daily only downloads new content. Each run creates a date-stamped folder with only the files added since your last download.
# Run daily - only new files are downloaded
Canvasbot.exe --course_id 12345 --download_folder "C:\Downloads"
This makes it ideal for:
- Automated daily content audits
- Keeping local copies in sync with Canvas
- Archiving course materials over time
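The incremental behavior described above amounts to a manifest lookup before each download. The following is a minimal sketch, not CanvasBot's implementation: it assumes the manifest is a flat set of file identifiers persisted as JSON (CanvasBot itself uses download_manifest.yaml, whose schema is not documented here), and all function names are hypothetical.

```python
import json
from pathlib import Path


def load_manifest(path: Path) -> set[str]:
    """Return the set of file IDs recorded by earlier runs (empty if none)."""
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def select_new_files(candidates: list[str], manifest: set[str]) -> list[str]:
    """Keep only files that no earlier run has recorded, preserving order."""
    return [file_id for file_id in candidates if file_id not in manifest]


def save_manifest(path: Path, manifest: set[str]) -> None:
    """Persist the manifest so the next run can skip these files."""
    path.write_text(json.dumps(sorted(manifest)), encoding="utf-8")
```

Deleting the manifest in this sketch makes every file look new again, which mirrors the documented advice to delete the course folder to re-download everything.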
Inspect course content structure directly in the terminal with a color-coded tree view. See all content types, hidden items, caption status, and clickable URLs at a glance.
# Content tree - only resources with content (hides empty branches)
Canvasbot.exe --course_id 12345 --print_content_tree
# Full course tree - shows all resources including empty ones
Canvasbot.exe --course_id 12345 --print_full_course
Example output:
🎓 Biology 101 | Course ID: 12345
│ ↳ https://yourschool.instructure.com/courses/12345
├── 📚 Modules
│ └── 📖 Module: Introduction to Biology
│ ├── 📄 Document: Syllabus.pdf
│ │ ↳ https://yourschool.instructure.com/files/123/download
│ ├── 🎬 VideoFile: Welcome Video.mp4
│ │ ↳ https://yourschool.instructure.com/files/456/download
│ └── 📹 VideoSite: Introduction Lecture
│ ↳ https://www.youtube.com/watch?v=abc123
Features:
- Color-coded by content type
- [hidden] indicator for unpublished content
- Full URLs for easy access
- Content-only view hides empty modules and resource branches
After scanning, CanvasBot displays a summary of all discovered content by type.
Export complete course content metadata to JSON for integration with other systems, custom reporting, or programmatic analysis.
Canvasbot.exe --course_id 12345 --output_as_json "C:\Reports"
Use cases:
- Feed into accessibility scanning tools
- Build custom dashboards
- Integrate with institutional reporting systems
- Archive course metadata
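As an illustration of the first use case, a downstream script might filter the export before handing it to another tool. This sketch assumes the structure shown in the JSON example in the Usage section (a content.documents list whose entries carry title and is_hidden). The inline sample, including the Draft.docx entry, is illustrative data, and the function name is hypothetical.

```python
import json

# Inline sample mirroring the documented export shape; a real script would
# read the file CanvasBot wrote to the --output_as_json directory instead.
SAMPLE = """
{
  "course_id": "12345",
  "content": {
    "documents": [
      {"title": "Syllabus.pdf", "file_type": "pdf", "is_hidden": false},
      {"title": "Draft.docx", "file_type": "docx", "is_hidden": true}
    ]
  }
}
"""


def visible_documents(export: dict) -> list[str]:
    """Return titles of documents that are not hidden from students."""
    documents = export.get("content", {}).get("documents", [])
    return [doc["title"] for doc in documents if not doc.get("is_hidden", False)]


export = json.loads(SAMPLE)
print(visible_documents(export))
```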
Generate organized Excel workbooks (.xlsm) with content categorized across multiple sheets. Includes dropdown validation, conditional formatting, and hyperlinks for accessibility tracking workflows.
Canvasbot.exe --course_id 12345 --output_as_excel "C:\Reports"
Sheets included:
- Documents & Document Sites
- Video Files & Video Sites
- Audio Files & Audio Sites
- Image Files
- Unsorted Links
Features:
- Dropdown menus for tracking review status
- Hyperlinks to source pages and download URLs
- Hidden content flagging
- Ready for accessibility audit workflows
CanvasBot classifies content into these categories:
| Type | Description | Examples |
|---|---|---|
| Documents | Downloadable document files | PDF, DOCX, PPTX, XLSX, ODT, EPUB, Pages |
| Document Sites | Cloud document platforms | Google Docs, OneDrive |
| Video Files | Downloadable video files | MP4, MOV, MKV |
| Video Sites | Video hosting platforms | YouTube, Vimeo, Panopto, Kaltura, YuJa, Echo360, Zoom, Wistia, Brightcove, Kanopy |
| Audio Files | Downloadable audio files | MP3, M4A, WAV |
| Audio Sites | Audio/podcast platforms | Podcast links |
| Image Files | Image files | JPG, PNG, GIF |
| File Storage Sites | Cloud storage | Box, Google Drive |
| Digital Textbooks | E-textbook platforms | Cengage, McGraw-Hill |
| Canvas Studio | Canvas Studio embeds | Institution media |
| Unsorted | Unclassified links | Everything else |
- Windows 10 or later
- Canvas API Access Token (read access to courses)
- For Excel export with macros: "Trust access to the VBA project object model" must be enabled in Excel (File > Options > Trust Center > Trust Center Settings > Macro Settings)
Download the latest executable from Releases.
This is a standalone executable - no Python installation required.
git clone https://github.com/Fontaineconsult/canvas-bot-v2.git
cd canvas-bot-v2
pip install -r requirements.txt
python canvas_bot.py --help
The executable is code-signed (publisher: Daniel Fontaine). Windows SmartScreen may still show a "Windows protected your PC" warning for new releases. This is normal and stops after enough users download the same version.
To proceed:
- Click "More info"
- Verify the publisher shows "Daniel Fontaine"
- Click "Run anyway"
To verify your download, right-click the .exe → Properties → Digital Signatures tab, or compare the SHA256 checksum on the Releases page.
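If you prefer to check the checksum from a script, the comparison can be sketched with Python's standard hashlib module. The function names are hypothetical, and the published value must be copied from the Releases page yourself.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the lowercase hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path: Path, published: str) -> bool:
    """Compare case-insensitively, ignoring surrounding whitespace."""
    return sha256_of_file(path) == published.strip().lower()
```

Reading in chunks keeps memory use flat regardless of the executable's size.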
On first run, you'll be prompted for:
- Canvas identifier - Your institution's subdomain (e.g., sfsu for https://sfsu.instructure.com). All URLs are auto-generated from this.
- API Access Token - Generated from Canvas settings (see Obtaining a Canvas API Access Token)
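To make the "auto-generated" step concrete, here is a small sketch of deriving URLs from the identifier. It is an assumption about the general idea, not CanvasBot's code: the function name and returned keys are hypothetical, and /api/v1 is the standard Canvas REST API prefix rather than something this README states.

```python
def canvas_urls(identifier: str) -> dict[str, str]:
    """Build the instance and REST API base URLs from a subdomain."""
    subdomain = identifier.strip().lower()
    # Accept only ASCII letters, digits, and hyphens (a conservative assumption).
    if not subdomain.isascii() or not subdomain.replace("-", "").isalnum():
        raise ValueError(f"not a valid subdomain: {identifier!r}")
    base = f"https://{subdomain}.instructure.com"
    return {"base_url": base, "api_url": f"{base}/api/v1"}
```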
Double-click the executable or run python canvas_bot.py with no arguments to launch the graphical interface. The GUI is organized into three tabs. All settings are saved between sessions. Click About for a built-in guide, or use View Config / Reset Config in the title bar to manage credentials.
Enter a course ID (or select a batch .txt file), choose an output folder, and click Run. Download options let you include video, audio, image, hidden, or inactive content. Display options print a course tree to the log.
Browse content from previously scanned courses. Select a course from the dropdown, then use the category buttons (Documents, Videos, Audio, Images, Other, Unsorted) to view content in sortable tables. Mark items as Passed, Needs Review, or Ignore for accessibility auditing — status is saved per course. Action buttons at the bottom open files, folders, and Canvas source pages.
View and edit the regex patterns that classify content URLs. The left panel lists categories; the right panel shows patterns in the selected category. Use Add, Remove, and Validate to manage patterns, or Test a URL to see which categories match it.
Pass command-line flags for scripted or automated workflows:
# Download documents from a course
Canvasbot.exe --course_id 12345 --download_folder "C:\Downloads"
# Export course content to Excel
Canvasbot.exe --course_id 12345 --output_as_excel "C:\Reports"
# Export to JSON
Canvasbot.exe --course_id 12345 --output_as_json "C:\Reports"
Canvasbot.exe --course_id 12345 --download_folder "C:\Downloads"
Create a text file with course IDs (one per line):
Canvasbot.exe --course_id_list courses.txt --download_folder "C:\Downloads"
By default, only documents are downloaded. Add flags for other types:
Canvasbot.exe --course_id 12345 --download_folder "C:\Downloads" --include_video_files --include_audio_files --include_image_files
Download all files to a single folder instead of preserving course structure:
Canvasbot.exe --course_id 12345 --download_folder "C:\Downloads" --flatten
CanvasBot tracks downloaded files in download_manifest.yaml to prevent re-downloads. Each run creates a date-stamped folder with only new files. Delete the course folder to re-download everything.
If a file cannot be downloaded (authentication required, unavailable, etc.), CanvasBot creates a Windows shortcut (.lnk) to the URL for manual investigation.
Generate a macro-enabled workbook (.xlsm) with content organized by type:
Canvasbot.exe --course_id 12345 --output_as_excel "C:\Reports"
Sheets included:
- Documents
- Document Sites
- Video Files
- Video Sites
- Audio Files
- Audio Sites
- Image Files
- Unsorted
Features:
- Dropdown validation for tracking status
- Conditional formatting
- Hyperlinks to source pages and downloaded files
Export all content metadata to JSON:
Canvasbot.exe --course_id 12345 --output_as_json "C:\Reports"
Example output:
{
"course_id": "12345",
"course_name": "Introduction to Biology",
"course_url": "https://yourschool.instructure.com/courses/12345",
"content": {
"documents": [
{
"title": "Syllabus.pdf",
"url": "https://yourschool.instructure.com/files/123/download",
"file_type": "pdf",
"source_page_type": "Page",
"source_page_url": "https://yourschool.instructure.com/courses/12345/pages/welcome",
"is_hidden": false,
"order": 1
}
],
"videos": {
"video_sites": [...],
"video_files": [...]
}
}
}
Display the course structure in the console:
# Show only resources that contain content
Canvasbot.exe --course_id 12345 --print_content_tree
# Show complete course tree
Canvasbot.exe --course_id 12345 --print_full_course
CanvasBot decides how to classify every URL and filename it discovers by testing it against a series of regular expression (regex) patterns. These patterns are organized into categories like document_content_regex, web_video_resources_regex, ignore_list_regex, and so on. When a URL matches a pattern in a category, CanvasBot assigns it to that content type. For example, a URL ending in .pdf matches document_content_regex and becomes a Document node, while a YouTube link matches web_video_resources_regex and becomes a VideoSite node. URLs that don't match any pattern are classified as Unsorted.
The default patterns cover common file types and popular platforms (YouTube, Vimeo, Google Docs, Box, etc.), but every institution has its own tools and services. You can add patterns to recognize institution-specific video platforms, custom document hosting, or any other content source that CanvasBot doesn't detect out of the box. You can also remove patterns that produce false positives or add entries to the ignore list to skip URLs you don't care about.
Patterns use Python's re module syntax with case-insensitive matching. If you're new to regular expressions, Pythex is a helpful tool for building and testing patterns interactively before adding them to CanvasBot.
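The matching idea can be sketched in a few lines. This is a simplified illustration under stated assumptions, not CanvasBot's implementation: the patterns below are hypothetical stand-ins for the shipped defaults, the classify function is invented for this example, and whether CanvasBot uses search, match, or fullmatch is not specified in this README.

```python
import re

# Hypothetical, simplified patterns; the real defaults ship with CanvasBot.
PATTERNS = {
    "document_content_regex": [r".*\.pdf", r".*\.docx"],
    "web_video_resources_regex": [r".*youtube\.com/watch.*", r".*vimeo\.com/.*"],
}

# Compile once with case-insensitive matching, as the README describes.
COMPILED = {
    category: [re.compile(pattern, re.IGNORECASE) for pattern in patterns]
    for category, patterns in PATTERNS.items()
}


def classify(text: str) -> list[str]:
    """Return every category with at least one pattern matching the input."""
    matches = [
        category
        for category, patterns in COMPILED.items()
        if any(pattern.search(text) for pattern in patterns)
    ]
    return matches or ["Unsorted"]
```

Returning a list rather than a single category reflects that an input can match more than one category, which is what --patterns-test reports.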
Patterns can be managed from the Patterns tab in the GUI or via CLI flags:
# List all pattern categories
Canvasbot.exe --patterns-list
# List patterns in a specific category
Canvasbot.exe --patterns-list document_content_regex
# Add a pattern (with confirmation prompt)
Canvasbot.exe --patterns-add document_content_regex ".*\.odt"
# Add without confirmation
Canvasbot.exe --patterns-add document_content_regex ".*\.odt" -y
Canvasbot.exe --patterns-remove document_content_regex ".*\.odt" -y
# Test what categories match a URL or filename
Canvasbot.exe --patterns-test "myfile.pdf"
Canvasbot.exe --patterns-test "https://youtube.com/watch?v=abc123"
Canvasbot.exe --patterns-validate ".*\.pdf"
Canvasbot.exe --patterns-reset -y
Credentials are stored securely in Windows Credential Vault:
- Canvas API Token
- Canvas Studio OAuth tokens (client ID, secret, access/refresh tokens)
| File | Location | Purpose |
|---|---|---|
| config.json | %APPDATA%\canvas bot\ | Instance URLs and settings |
| re.yaml | %APPDATA%\canvas bot\ | User-customized patterns |
| gui_settings.json | %APPDATA%\canvas bot\ | Saved GUI inputs (folders, checkboxes) |
| canvas_bot.log | %APPDATA%\canvas bot\ | Application logs |
From the GUI, use the View Config and Reset Config buttons in the title bar.
From the CLI:
# Reset Canvas API credentials
Canvasbot.exe --reset_canvas_params
# Reset Canvas Studio OAuth
Canvasbot.exe --reset_canvas_studio_params
# View current configuration status
Canvasbot.exe --config_status
CanvasBot handles sensitive credentials and institutional course content. The following measures are in place to protect this data.
- Encrypted storage: API tokens and OAuth credentials are stored in the Windows Credential Vault via the keyring library, never in plaintext configuration files.
- In-memory isolation: After loading from the Credential Vault, credentials are held in a private module-level store rather than process environment variables. This prevents leakage to child processes, debugging tools, or other libraries reading os.environ.
- Automatic cleanup: Credentials are cleared from memory when the application exits via an atexit handler.
- TLS certificate verification: All Canvas API and Studio API calls validate SSL/TLS certificates. Connections to servers with invalid or expired certificates are rejected.
- Sensitive data stripped from logs: API tokens, email addresses, and other sensitive query parameters are removed from all URLs before they are written to log files or displayed in warnings.
- Audit trail: Structured log entries track course scan start/completion, download summaries, and export operations with course IDs and item counts.
- User and session identification: Every log entry includes the Windows username and a unique session ID for attribution on shared machines.
- Unhandled exception logging: All unexpected errors are captured with full tracebacks for debugging, including a global exception hook as a safety net.
- Log file permissions: Log files are stored under %APPDATA%\canvas bot\, which is per-user protected on Windows. Best-effort file permission restrictions are applied on creation.
- No sensitive content in logs: Logs contain course IDs, file counts, timestamps, and error messages. No file contents, student data, or credentials are recorded.
- Course ID validation: Course IDs are validated as numeric before being used in API requests, in both CLI and GUI modes. Batch course list files skip invalid entries with per-line warnings.
- Filename sanitization: Downloaded filenames are stripped of invalid Windows characters and truncated to respect path length limits.
- Regex validation: User-supplied patterns are validated with re.compile() before being saved.
- Executable blocklist: The "Open File" button in the Content Viewer enforces a hardcoded blocklist of dangerous file extensions (.exe, .bat, .cmd, .ps1, .vbs, .js, .msi, .dll, .lnk, .hta, macro-enabled Office formats, and others). Blocked files cannot be opened via os.startfile() and display a warning dialog instead.
- No shell injection: GUI subprocess calls use argument lists instead of shell string interpolation, preventing command injection.
- No dynamic code execution: The application does not use eval(), exec(), or similar constructs.
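As one example of the measures listed above, stripping sensitive query parameters from a URL before logging can be sketched with the standard urllib.parse module. The parameter names and the redact_url function are assumptions for illustration; the actual list CanvasBot strips is not given in this README.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed parameter names; the actual list CanvasBot strips is not given here.
SENSITIVE_PARAMS = {"access_token", "token", "verifier", "email"}


def redact_url(url: str) -> str:
    """Drop sensitive query parameters so the URL is safe to log."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SENSITIVE_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

Removing the parameter entirely, rather than masking its value, keeps the logged URL usable for debugging while leaving nothing of the secret behind.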
All application data is stored under %APPDATA%\canvas bot\, a per-user protected directory on Windows:
| Data | Location | Sensitivity |
|---|---|---|
| API tokens | Windows Credential Vault | High (encrypted) |
| Instance config | config.json | Low (URLs only) |
| Application logs | canvas_bot.log | Medium (URLs, errors; tokens stripped) |
| GUI settings | gui_settings.json | Low (paths, preferences) |
| Downloaded content | User-specified folder | Varies (course content) |
Downloaded course content is stored as-is in user-specified folders. Canvas Bot does not transmit downloaded content to any third party — all data remains on the local machine.
FERPA note: Downloaded course content may contain FERPA-protected information (student names in page titles, discussion references, etc.). Handle downloaded materials per your institution's data governance policy, store them on an encrypted drive (e.g., BitLocker), and delete downloads when they are no longer needed.
For IT administrators evaluating Canvas Bot, a detailed security summary is available at claude/IT_SECURITY.md.
The GUI is built with CustomTkinter, which has inherent limitations with screen reader support. Within those constraints, Canvas Bot implements the following accessibility features:
- Full keyboard access — every interactive element is reachable via Tab/Shift+Tab and activatable with Enter. Tab selector buttons support Left/Right arrow navigation. Pattern categories support Up/Down arrows.
- Keyboard shortcuts — Alt+key shortcuts are provided for all buttons across all tabs, with underlined mnemonic characters. Tab switching via Alt+U/N/P and Ctrl+1/2/3. All shortcuts require a modifier key (no single-character shortcuts).
- Visible focus indicators — all buttons, entries, checkboxes, and dynamically created controls display a 2px blue focus ring in both light and dark modes.
- Color is never the sole indicator — review status rows use background color (green/yellow/gray) but always include a text label in the Status column. The status bar uses a "WARNING" text prefix alongside orange color. Pattern test results include "MATCH:" / "No matches" prefixes alongside color.
- Error identification — validation errors in the Add Pattern dialog display descriptive text and return focus to the input field. Status bar errors include text prefixes.
- Logical focus and reading order — tab order matches the visual layout. Dialogs set initial focus on the primary action. Escape closes all dialogs.
- Tooltips — all controls have descriptive tooltips that appear on hover and keyboard focus, showing the associated shortcut key.
These cannot be resolved without migrating to a different GUI framework:
- No screen reader support — CustomTkinter widgets do not expose name, role, or value to Windows UI Automation or MSAA. Screen readers (NVDA, JAWS, Narrator) have severely limited support.
- No live region announcements — status bar changes and console output cannot be pushed to assistive technology.
- No semantic structure — no heading levels, landmark regions, or programmatic language declaration.
- No user text spacing control — font rendering is fixed by the Tk engine.
A detailed WCAG 2.1 conformance report is available at claude/WCAG_VPAT.md.
CanvasBot includes a testing framework to validate the content extraction pipeline.
Validates that raw Canvas API data is correctly transformed:
- display_name vs filename handling
- URL decoding of filenames
- Extension preservation
- Windows-safe filename generation
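The checks above correspond to a small transformation from raw API fields to a safe local filename. The sketch below shows one way such a step could look, based on the behavior described in this README (prefer display_name, decode + and percent escapes, keep the extension, remove characters Windows rejects). The function name and the replacement character are assumptions, and truncation for path length is omitted.

```python
import re
from typing import Optional
from urllib.parse import unquote_plus

# Characters Windows forbids in file names, plus ASCII control characters.
INVALID_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def derive_filename(display_name: Optional[str], filename: str) -> str:
    """Prefer display_name; otherwise URL-decode the raw filename."""
    name = display_name or unquote_plus(filename)
    # Replace forbidden characters, then drop trailing dots and spaces,
    # which Windows does not allow at the end of a file name.
    name = INVALID_CHARS.sub("_", name).rstrip(" .")
    return name or "unnamed"
```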
# Collect test data from courses (requires API access)
python -m test.pipeline_testing batch-collect --range 34000-35000 --output corpus.json
# Run tests offline against collected data
python -m test.pipeline_testing batch-test --corpus corpus.json
# Direct comparison of raw vs processed output
python -m test.pipeline_testing compare --raw raw.json --processed processed.json

| Command | Description |
|---|---|
| collect | Collect raw API data from single course |
| batch-collect | Collect from many courses (1 API call each) |
| batch-test | Test pipeline offline against corpus |
| test | Direct pipeline test against raw data |
| compare | Compare raw API vs processed output |
| side-by-side | Visual comparison output |
| Flag | Description |
|---|---|
| --course_id TEXT | Single course ID to process |
| --course_id_list TEXT | File containing course IDs (one per line) |
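For the course list file, the documented behavior is that IDs must be numeric and invalid entries are skipped with per-line warnings. A minimal sketch of that parsing step follows; the function name, warning wording, and silent handling of blank lines are assumptions.

```python
def parse_course_ids(lines: list[str]) -> tuple[list[str], list[str]]:
    """Split lines into valid numeric course IDs and per-line warnings."""
    valid: list[str] = []
    warnings: list[str] = []
    for number, raw in enumerate(lines, start=1):
        entry = raw.strip()
        if not entry:
            continue  # blank lines are ignored without a warning (assumption)
        if entry.isascii() and entry.isdigit():
            valid.append(entry)
        else:
            warnings.append(f"line {number}: skipped invalid course ID {entry!r}")
    return valid, warnings
```

Passing the result of open("courses.txt").readlines() to this function would yield the IDs to process plus any warnings to show the user.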
| Flag | Description |
|---|---|
| --download_folder TEXT | Directory for downloaded files |
| --output_as_json TEXT | Export content to JSON (specify directory) |
| --output_as_excel TEXT | Export content to Excel (specify directory) |
| --print_content_tree | Display course tree showing only resources with content |
| --print_full_course | Display complete course tree including all resources |
| Flag | Description | Default |
|---|---|---|
| --include_video_files | Include video files in download | False |
| --include_audio_files | Include audio files in download | False |
| --include_image_files | Include image files in download | False |
| --include_inactive_content | Include files not linked from any active Canvas page | False |
| --flatten | Download all files to single directory | False |
| --download_hidden_files | Include content hidden from students | False |
| --flush_after_download | Delete files after processing | False |
| Flag | Description |
|---|---|
| --patterns-list [CATEGORY] | List all patterns or patterns in category |
| --patterns-add CATEGORY PATTERN | Add pattern to category |
| --patterns-remove CATEGORY PATTERN | Remove pattern from category |
| --patterns-test TEXT | Test what categories match input |
| --patterns-validate TEXT | Validate regex syntax |
| --patterns-reset | Reset patterns to defaults |
| -y | Skip confirmation prompts |
| Flag | Description |
|---|---|
| --reset_canvas_params | Reset Canvas API credentials |
| --reset_canvas_studio_params | Reset Canvas Studio OAuth |
| --config_status | Show current configuration |
- Log into Canvas
- Go to Account > Settings
- Scroll to Approved Integrations
- Click + New Access Token
- Name your token and click Generate Token
- Copy the token immediately (it won't be shown again)
The token is stored encrypted in Windows Credential Vault.
CanvasBot only requires read access to courses.
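For context on how a token of this kind is used, the Canvas REST API accepts it in an Authorization: Bearer header. The sketch below only prepares a read-only GET request with the standard library and does not send it. The function name is hypothetical, the api_url argument is assumed to be your instance's /api/v1 base, and this is not presented as CanvasBot's own request code.

```python
import urllib.request


def build_course_request(api_url: str, course_id: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) a read-only GET for one course."""
    if not course_id.isdigit():
        raise ValueError("course_id must be numeric")
    return urllib.request.Request(
        f"{api_url}/courses/{course_id}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
```

Sending the token in a header rather than a query parameter keeps it out of URLs, which is consistent with the log redaction described under Security and Privacy.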
For department-wide use, we recommend creating a dedicated Canvas service account rather than using individual staff tokens:
- Create a Canvas account with read-only enrollment across the courses you need to audit
- Generate a single API token from that account
- Distribute the Canvas Bot executable to staff — each user configures the same service account token on first run
This provides centralized access control: revoking the service account token immediately disables Canvas Bot for all users. Individual staff do not need to manage their own tokens or have personal API access.
Canvas Bot is a portable application with no installer. To fully remove it:
- Delete the executable from wherever you saved it
- Delete application data: %APPDATA%\canvas bot\ (contains config, logs, GUI settings, and user patterns)
- Remove stored credentials: open Windows Credential Manager, search for entries containing "canvas", and delete them
- Revoke your API token: in Canvas, go to Account > Settings > Approved Integrations and delete the token
- Delete downloaded content from your output folders if no longer needed
Contact: fontaine@sfsu.edu
For bug reports and feature requests: GitHub Issues
GUI:
- Tabbed interface: reorganized the GUI into three tabs (Run, Content, Patterns) with Ctrl+1/2/3 keyboard shortcuts to switch between them.
- Consolidated output: replaced three separate folder pickers with a single Output Folder and three action checkboxes (Download files, Export to Excel, Export to JSON).
- Content Viewer: a persistent browser for all previously scanned courses. Scans the output folder for .manifest/JSON files and populates a course dropdown. Content is displayed in nested sub-tabs (Documents, Videos, Audio, Images, Unsorted) with sortable tables, a summary banner, a detail panel with clickable URLs, and buttons to open file locations or source pages. A "Downloaded" column checks whether each file exists at its expected path.
- Pattern Manager: full GUI for managing regex patterns from re.yaml. Left column lists all pattern categories with counts; right column shows patterns for the selected category with Add, Remove, and Validate buttons. Bottom panel tests a URL or filename against all compiled matchers with live reload. "Reset All to Defaults" restores the bundled re.yaml. Category visibility is configurable at the code level to hide internal categories. Patterns with {CANVAS_DOMAIN} placeholders display substituted values for readability.
- Reusable table widget: ContentTable class wrapping ttk.Treeview with scrollbars, column-header sorting, alternating row colors, and automatic dark/light theme matching.
- Focus rings and tooltips: all interactive elements across Content and Patterns tabs show focus rings and descriptive tooltips, matching the Run tab's accessibility features.
- Content tab auto-refresh: switching to the Content tab automatically refreshes the course list.
Default Patterns:
- Expanded document patterns: added 9 accessibility-relevant file types (ODT, ODP, ODS, Key, Numbers, Pub, EPUB, XPS, 7z).
- Expanded video site patterns: added 47 new video platform patterns covering Panopto, Kaltura, YuJa, Wistia, Brightcove, Echo360, Kanopy, Loom, ScreenPal, Flipgrid/Flip, Microsoft Stream, Twitch, Instagram Reels, LinkedIn Video, and many more.
- Institution-specific video patterns: populated the institution_video_services_regex category with 12 {CANVAS_DOMAIN}-prefixed patterns for platforms that use institution subdomains (Panopto, Kaltura, YuJa, Echo360, Kanopy, ShareStream, Ensemble, ScreenPal).
Content Pipeline:
- Module anchor URLs in source page links: when content is discovered inside a Module (which has no direct html_url), the source page URL is now constructed as {course_url}/modules#{module_id}. This creates an anchor link that scrolls directly to the correct module on the Canvas modules page, rather than linking to the generic modules listing.
- Active content filtering: downloads now skip files not linked from any active Canvas page by default. Use --include_inactive_content (CLI) or the "Include inactive content" checkbox (GUI) to override. The Content Viewer also has a "Show Inactive Content" filter toggle.
Content Viewer:
- Downloaded column shows download date — file tables now display the actual download date (from the date-stamped folder on disk) instead of "Yes", with glob-based search across date folders for files downloaded on previous days.
- Empty table placeholders — tables with no content show "No {Content Type} Found" instead of an empty table.
Content Pipeline:
- Robust file type detection: centralized get_file_type() helper with a 7-step fallback chain replaces inconsistent inline logic, improving file_type accuracy in JSON and Excel exports.
- Canvas Studio downloads use correct URL: Studio video downloads now use the DRM video stream URL instead of the Studio page URL.
Stability:
- OSError handler for disconnected drives: the downloader now catches OSError during file writes (e.g., when a network drive is disconnected mid-download) and exits cleanly with a message instead of crashing with a traceback.
- Pattern placeholder substitution fix: environment variables are now loaded at Pattern Manager init time so {CANVAS_DOMAIN} tokens display correctly.
- Regex pattern reloading: patterns with domain placeholders ({CANVAS_STUDIO_DOMAIN}, {CANVAS_DOMAIN}, etc.) are now recompiled after config loads, fixing Canvas Studio embeds and Box links being classified as Unsorted.
GUI:
- Graphical user interface: double-click the executable or run with no arguments to launch a desktop GUI built with CustomTkinter. CLI mode is still available by passing flags.
  - Course selection (single ID or batch .txt file), output folder pickers, download/display option checkboxes, real-time log output, status bar
  - Settings persistence: all GUI inputs saved to %APPDATA%\canvas bot\gui_settings.json and restored on next launch
  - About dialog: overview of Canvas Bot, guide to every GUI section, first-time setup steps, and contact info
  - View Config / Reset Config buttons: manage credentials directly from the GUI
  - Accessibility: keyboard shortcuts (Alt+R/V/C/A), Tab focus navigation with visible focus rings, tooltips on all controls, Escape to close dialogs
Excel Export:
- Robust COM automation: VBA insertion now handles corrupted type library caches, missing Trust Center permissions (with step-by-step fix instructions), and invalid hyperlink values gracefully instead of crashing
- Stale file lock detection: existing .xlsm files are removed before writing; locked files produce a clear error message
- Path normalization: GUI folder paths are normalized to prevent PermissionError on mapped network drives
Other:
- Application icon (cb.ico) displayed in window titlebar and taskbar
- Removed --export_course_list, --semester_filter, and --check_video_site_caption_status CLI flags
Improvements:
- Simplified first-run setup: only asks for the Canvas subdomain (e.g., sfsu). All URLs are auto-generated. Removed multi-step wizard and optional prompts for Box/Library Proxy domains.
- Split tree display into two modes: --print_content_tree shows only resources with content (empty branches hidden); --print_full_course shows everything. Replaces the old --show_content_tree flag.
- Content Location shortcuts: download folders now include a Content Location.lnk shortcut that links directly to the Canvas page containing the content, making it easy to navigate back for inspection or remediation.
- Safe folder deletion: clear_folder_contents() now verifies the target contains a Canvas Bot manifest before deleting, preventing accidental deletion of unrelated folders.
- Warning collector for animated spinners: network errors are now buffered silently during import and displayed in a single Error Report block after import completes, preventing error messages from corrupting spinner animations.
- Cleaner API error messages: network errors show human-readable status and message instead of raw JSON dicts. Access tokens are stripped from URLs before display.
- Canvas tree stats cleanup: container nodes filtered from Content Summary, resource labels pluralized, content URLs indented deeper than resource URLs for visual distinction.
- Security: API access tokens stripped from log files; duplicate log handler removed.
- EXE test harness: automated test suite (64 offline + 20 API tests) validates every CLI flag combination.
Bug Fixes:
- Fixed Pages import spinner incorrectly labeled as "Importing Announcements"
- Fixed AttributeError from call to deleted _print_url_legend() method
- Fixed missing manifest registration in Announcement class
- Fixed blocking input() call in caption upload error path
Major release with significant new features and stability improvements.
New Features:
- Pattern management CLI (--patterns-list, --patterns-add, --patterns-remove, --patterns-test, --patterns-validate, --patterns-reset)
- Pipeline testing framework for validating content extraction
- Course list export with semester filtering (--export_course_list, --semester_filter)
- Configuration status command (--config_status)
Bug Fixes:
- Fixed filename derivation to prefer display_name over URL-encoded filename
- Added URL decoding for filenames (converts + to spaces)
- Improved Canvas Studio embed detection
Testing:
- Validated against 27,000+ files across 499 courses with 99.7% pass rate
- Canvas Studio integration
- Many bug fixes
- Macro-enabled Excel workbook export
- YouTube API integration for caption checking
- Logging system
- Initial release
- LTI / SCORM / External Tool detection — identify third-party content that is outside institutional control for accessibility compliance review
- GUI interface (added in v1.2.0)
- Content Viewer for browsing scanned course data (added in v1.2.2)
- Pattern Manager GUI for regex CRUD (added in v1.2.2)
- Better Box/Dropbox/Google Drive support
- Batch accessibility reporting
- Long directory paths may cause issues on Windows (260 character limit)
- Some shortcut creation may fail depending on path characters
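To anticipate the path length issue before starting a download, a destination path can be checked against the classic Windows limit. This is an illustrative sketch with a hypothetical function name. It is not part of CanvasBot, and it does not account for systems where long path support has been enabled.

```python
from pathlib import PureWindowsPath

MAX_PATH = 260  # classic Windows limit, counting the terminating null


def exceeds_max_path(folder: str, filename: str) -> bool:
    """Return True if the joined path would not fit in the classic limit."""
    full_path = str(PureWindowsPath(folder) / filename)
    return len(full_path) >= MAX_PATH
```

Choosing a short download folder (for example, one near the drive root) or using --flatten leaves more room for long module and file names.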
Creative Commons Attribution-NonCommercial 4.0 International (CC-BY-NC-4.0)
Copyright (c) 2023-2026 Daniel Fontaine
You are free to:
- Share — copy and redistribute the material in any medium or format
- Adapt — remix, transform, and build upon the material
Under the following terms:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made.
- NonCommercial — You may not use the material for commercial purposes.
- No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
Full license text: https://creativecommons.org/licenses/by-nc/4.0/legalcode






