Standardize Unicode encodings to explicitly use UTF-8 #635
dvmukul wants to merge 2 commits into microsoft:main
Conversation
This change adds encoding='utf-8' to file open operations in the script runner, configuration manager, marketplace client, and other core modules. This resolves reported issues (e.g., microsoft#604) where UnicodeDecodeError occurs on Windows when handling prompts and configuration files with non-ASCII characters.
sergio-sisternes-epam left a comment
Looks good overall — the changes are correct and directly fix the Windows UnicodeDecodeError from #604. Two suggestions in the inline comments (missed open() calls + test coverage).
      if not os.path.exists(CONFIG_FILE):
    -     with open(CONFIG_FILE, "w") as f:
    +     with open(CONFIG_FILE, "w", encoding="utf-8") as f:
👍 Good fix. Note that an AST scan of the codebase found five more text-mode open() calls that still lack explicit encoding='utf-8':
| File | Line | Operation |
|---|---|---|
| adapters/client/codex.py | 65 | toml.dump(): write config |
| adapters/client/codex.py | 80 | toml.load(): read config |
| adapters/client/copilot.py | 68 | json.dump(): write config |
| adapters/client/copilot.py | 83 | json.load(): read config |
| deps/github_downloader.py | 184 | write empty gitconfig |
Since the PR title says "Standardize … across the codebase," it would be great to cover these too (same pattern, same Windows risk).
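For anyone who wants to repeat that kind of scan, here is a minimal sketch of how an AST check for text-mode `open()` calls without an `encoding` keyword might look. It is not the script used for the review above; the function name `find_open_without_encoding` and the `SAMPLE` snippet are illustrative assumptions, and it only looks at bare `open(...)` calls (not `Path.open()` or `io.open()`).

```python
import ast


def find_open_without_encoding(source: str, filename: str = "<string>") -> list:
    """Return sorted line numbers of text-mode open() calls lacking encoding=.

    Hypothetical helper for illustration; only bare `open(...)` calls are checked.
    """
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        if not (isinstance(node.func, ast.Name) and node.func.id == "open"):
            continue
        if any(kw.arg == "encoding" for kw in node.keywords):
            continue  # encoding is already explicit
        # The mode is the second positional argument or the mode= keyword.
        if len(node.args) > 1:
            mode_node = node.args[1]
        else:
            mode_node = next(
                (kw.value for kw in node.keywords if kw.arg == "mode"), None
            )
        if mode_node is None:
            mode = "r"  # open() defaults to text-mode read
        elif isinstance(mode_node, ast.Constant) and isinstance(mode_node.value, str):
            mode = mode_node.value
        else:
            mode = ""  # dynamic mode: report it so a human can check
        if "b" in mode:
            continue  # binary mode takes no encoding
        hits.append(node.lineno)
    return sorted(hits)


SAMPLE = '''\
with open(path, "w") as f:
    pass
with open(path, "r", encoding="utf-8") as f:
    pass
with open(path, "rb") as f:
    pass
data = open(path).read()
'''

print(find_open_without_encoding(SAMPLE))  # [1, 7]
```

Binary-mode calls are skipped on purpose, since passing `encoding` together with a `"b"` mode raises `ValueError`.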
      self.compiled_dir.mkdir(parents=True, exist_ok=True)

    - with open(prompt_path, "r") as f:
    + with open(prompt_path, "r", encoding="utf-8") as f:
This is the core fix for #604. Consider adding a small test that round-trips a non-ASCII string (e.g. CJK characters) through PromptCompiler.compile() to lock in the fix — the existing test_cli_encoding.py only covers console stream reconfiguration, not file I/O.
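A rough sketch of what such a round-trip test could look like follows. The real `PromptCompiler.compile()` signature is not shown in this thread, so `compile_prompt` below is a hypothetical stand-in that only mirrors the read-then-write pattern from the diff; the test name and file names are also assumptions. With pytest, `tmp_path` would be supplied by the built-in fixture.

```python
import tempfile
from pathlib import Path


def compile_prompt(prompt_path: Path, out_dir: Path) -> Path:
    """Hypothetical stand-in for PromptCompiler.compile(), not the real API.

    Reads a prompt file and writes it to out_dir, both with explicit UTF-8.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    with open(prompt_path, "r", encoding="utf-8") as f:
        content = f.read()
    out_path = out_dir / prompt_path.name
    with open(out_path, "w", encoding="utf-8") as f:
        f.write(content)
    return out_path


def test_non_ascii_prompt_round_trip(tmp_path: Path) -> None:
    # CJK plus accented Latin text; no newline, so the byte comparison is
    # not affected by platform newline translation.
    text = "請總結這份文件: café"
    src = tmp_path / "sample.prompt.md"
    src.write_bytes(text.encode("utf-8"))

    out = compile_prompt(src, tmp_path / "compiled")

    # Compare raw bytes so the check does not depend on the locale default.
    assert out.read_bytes() == text.encode("utf-8")


if __name__ == "__main__":
    # Run without pytest by providing a temporary directory manually.
    with tempfile.TemporaryDirectory() as tmp:
        test_non_ascii_prompt_round_trip(Path(tmp))
```

Writing the fixture with `write_bytes` and asserting on `read_bytes` keeps the test meaningful on Linux and macOS CI as well, where the default encoding is usually already UTF-8.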
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Standardizes text file I/O across apm_cli by explicitly using UTF-8 encodings when reading/writing JSON and configuration files, improving cross-platform behavior (notably Windows default encodings).
Changes:
- Add `encoding="utf-8"` to `open()` calls that read/write JSON config and cache files.
- Ensure plugin metadata and MCP server configuration are read consistently as UTF-8.
- Standardize marketplace registry persistence to UTF-8 for both reads and writes.
Show a summary per file
| File | Description |
|---|---|
| src/apm_cli/runtime/copilot_runtime.py | Reads MCP config JSON using explicit UTF-8 encoding. |
| src/apm_cli/models/plugin.py | Reads plugin.json metadata using explicit UTF-8 encoding. |
| src/apm_cli/marketplace/registry.py | Reads/writes marketplace registry JSON using explicit UTF-8 encoding (including temp file). |
| src/apm_cli/marketplace/client.py | Reads/writes marketplace cache JSON/meta using explicit UTF-8 encoding. |
| src/apm_cli/config.py | Reads/writes global config JSON using explicit UTF-8 encoding. |
Copilot's findings
- Files reviewed: 5/5 changed files
- Comments generated: 0
Summary of Changes
This Pull Request standardizes file operations across the codebase by adding explicit `encoding='utf-8'` to `open()` calls.
Rationale
On Windows, the default system encoding is often locale-specific (e.g., CP1252 or CP950), which causes `UnicodeDecodeError` when reading (or `UnicodeEncodeError` when writing) files that contain non-ASCII UTF-8 characters, such as prompt templates or localized marketplace metadata. This fix ensures that `apm` behaves consistently across Windows, macOS, and Linux.
Key Fixes:
- `script_runner.py`: Fixed prompt compilation and script reading (resolves #604, a UnicodeDecodeError when reading .prompt.md on Windows CP950 caused by open() missing the encoding parameter).
- `config.py`: Standardized global configuration file operations.
- `marketplace/client.py` & `registry.py`: Fixed local caching of marketplace manifests.
- `models/plugin.py`: Ensured stable reading of `plugin.json` metadata.
- `runtime/copilot_runtime.py`: Fixed MCP configuration retrieval.
Verification
Changes were audited to separate text-mode from binary-mode operations, so that the encoding argument is added only to text-mode calls and binary data is not corrupted. Existing unit tests were prioritized to ensure no regressions in core functionality.
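The failure described in the rationale can be reproduced on any platform by passing a Windows code page explicitly, which is roughly what `open()` does implicitly when no encoding is given on such a system. The sketch below is illustrative only: `read_text_as` and `demo` are made-up names, and CP1252 is used instead of CP950 because the UTF-8 bytes of "提" include 0x8F, which Python's CP1252 codec leaves undefined.

```python
import tempfile
from pathlib import Path


def read_text_as(path: Path, encoding: str) -> str:
    """Read a text file with an explicit encoding (illustrative helper)."""
    with open(path, "r", encoding=encoding) as f:
        return f.read()


def demo():
    """Return (legacy_read_failed, utf8_text) for a UTF-8 file with CJK text."""
    text = "提示"  # UTF-8 bytes: E6 8F 90 E7 A4 BA
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "sample.prompt.md"
        path.write_bytes(text.encode("utf-8"))
        try:
            # Simulates open(path) on a machine whose locale default is CP1252.
            read_text_as(path, "cp1252")
            legacy_read_failed = False
        except UnicodeDecodeError:
            legacy_read_failed = True
        # With the explicit encoding from this PR, the read succeeds.
        return legacy_read_failed, read_text_as(path, "utf-8")


if __name__ == "__main__":
    print(demo())
```

Not every non-ASCII file raises under a legacy code page; some byte sequences decode silently into the wrong characters, which is another reason to pin the encoding rather than rely on the error.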