
draft(intake): study upstream ai-skills/generic stack impact#147

Draft
nsalvacao wants to merge 4 commits into main from intake/upstream-study-ai-skills-generic-stack

Conversation

@nsalvacao
Owner

Purpose (Study-Only)

This is an evaluation Draft PR to inspect the upstream AI-skills/generic stack in isolation before deciding whether to adapt it to our fork architecture.

Target Upstream Stack

What Was Imported in This Draft

  • .github/workflows/codeql.yml
  • Release packaging/release-script deltas related to generic agent packaging
  • Agent context script updates
  • tests/test_ai_skills.py and follow-up test deltas from the stack

Explicitly Deferred (Conflict-Heavy Core)

For this study pass, conflict-heavy files were intentionally kept on fork baseline to avoid unintended replacement of current architecture:

  • src/specify_cli/__init__.py
  • README.md
  • CHANGELOG.md
  • pyproject.toml
  • AGENTS.md

Current Expected State

  • This Draft is not merge-ready.
  • tests/test_ai_skills.py currently fails to collect because core implementation symbols are not yet ported (e.g., _get_skills_dir).

Why This Is Useful

It allows us to:

  1. Review surface area, CI/script effects, and packaging impacts of the upstream stack.
  2. Estimate the real adaptation cost into our roadmap (without forcing premature adoption).
  3. Plan a future dedicated adaptation PR if approved.

Credit / Provenance

Authorship is preserved via cherry-pick -x on all four upstream commits.

mnriem and others added 4 commits February 25, 2026 12:57
* implement ai-skills command line switch

* fix: address review comments, remove breaking change for existing projects, add tests

* fix: review comments

* fix: review comments

* fix: review comments

* fix: review comments

* fix: review comments, add test cases for all the agents

* fix: review comments

* fix: review comments

* chore: trigger CI

* chore: trigger CodeQL

* ci: add CodeQL workflow for code scanning

* ci: add actions language to CodeQL workflow, disable default setup

---------

Co-authored-by: dhilipkumars <s.dhilipkumar@gmail.com>
(cherry picked from commit 9402ebd)
* feat: add GitHub Actions workflow for testing and linting Python code

* fix: resolve ruff lint errors in specify_cli

- Remove extraneous f-string prefixes (F541)
- Split multi-statement lines (E701, E702)
- Remove unused variable assignments (F841)
- Remove ruff format check from CI workflow (format-only PR to follow)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: strip ANSI codes in ai-skills help text test

The Rich/Typer CLI injects ANSI escape codes into option names in
--help output, causing plain string matching to fail.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
(cherry picked from commit 24d76b5)
…ub#1639)

- Add --ai generic option for unsupported AI agents (bring your own agent)
- Require --ai-commands-dir to specify where agent reads commands from
- Generate Markdown commands with $ARGUMENTS format (compatible with most agents)
- Rebuild CHANGELOG from GitHub releases (last 10 releases)
- Align version to 0.1.3

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
(cherry picked from commit 6150f1e)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3 to 4.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](github/codeql-action@v3...v4)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: '4'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit 04fc3fd)
@github-actions

📝 AI PR Summary

Summary

Add CodeQL security analysis workflow, support new "generic" agent in release packaging and update scripts, and introduce comprehensive unit tests for AI agent skills installation.

Changes

  • Added .github/workflows/codeql.yml for CodeQL scanning on main branch pushes and PRs.
  • Enhanced release workflow to commit version bumps automatically.
  • Updated release packaging scripts (bash and PowerShell) to include new "generic" agent.
  • Modified agent context update scripts to recognize "generic" agent with appropriate messaging.
  • Added new unit tests (tests/test_ai_skills.py) covering AI agent skills installation, YAML frontmatter parsing, and CLI validation.

Impact

  • Security scanning integrated into CI via CodeQL.
  • Release process now supports automatic version bump commits.
  • Build and release pipelines support the new "generic" agent.
  • Agent context update tools handle "generic" agent gracefully.
  • Improved test coverage for AI skills installation functionality.

🤖 Auto-generated · openai/gpt-4.1-mini · GitHub Models free tier · 0 premium requests

@gemini-code-assist

Summary of Changes

Hello @nsalvacao, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request serves as a preliminary investigation into integrating an upstream AI-skills/generic stack. It selectively imports components to assess their surface area, CI/script interactions, and packaging implications, while consciously avoiding immediate changes to core, conflict-prone files. The aim is to gather data and insights to inform a strategic decision on future adaptation, rather than to deliver a functional, mergeable feature at this stage.

Highlights

  • Evaluation Draft PR: This pull request is explicitly marked as a 'study-only' evaluation draft, intended to inspect an upstream AI-skills/generic stack without being merge-ready.
  • Upstream Stack Integration: The PR imports specific components from four upstream commits, including CI workflows, release packaging deltas, agent context script updates, and new AI skills test files.
  • Deferred Files: Several conflict-heavy core files (e.g., __init__.py, README.md, pyproject.toml) were intentionally excluded from this study pass to avoid premature architectural changes.
  • Known Limitations: The newly added tests/test_ai_skills.py currently fails to collect tests because core implementation symbols required for AI skills are not yet ported.
  • Purpose of Study: The primary goals are to review the impact of the upstream stack, estimate adaptation costs, and inform future dedicated adaptation efforts.
  • Agent Context Updates: Both Bash and PowerShell agent context update scripts were modified to recognize and handle a new 'generic' agent type.
  • New AI Skills Test Suite: A new Python test file, tests/test_ai_skills.py, was added to provide unit tests for AI agent skills installation, covering skill directory resolution, YAML frontmatter parsing, and CLI validation.
Changelog
  • scripts/bash/update-agent-context.sh
    • Added 'generic' agent type to the list of supported agents in the update_specific_agent function.
    • Included a log message for the 'generic' agent indicating no predefined context file and advising agent-specific update scripts.
  • scripts/powershell/update-agent-context.ps1
    • Updated the ValidateSet parameter for AgentType to include 'generic'.
    • Modified the Update-SpecificAgent function to handle the 'generic' agent type with an informational message.
    • Updated the usage message in Print-Summary to reflect the inclusion of the 'generic' agent.
  • tests/test_ai_skills.py
    • Added a new file containing comprehensive unit tests for AI agent skills installation.
    • Implemented tests for skill directory resolution (_get_skills_dir), YAML frontmatter parsing, and SKILL.md generation.
    • Included tests for handling missing or empty template directories, malformed YAML, and ensuring existing commands are preserved.
    • Added tests for CLI validation, specifically ensuring --ai-skills requires the --ai flag.
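The 'generic' branch described in the changelog above can be sketched roughly as follows. This is a hypothetical paraphrase of the behavior listed for update-agent-context.sh, not the script's actual code, and the non-generic agent names are placeholders:

```shell
# Hypothetical sketch of update_specific_agent handling the 'generic' type.
update_specific_agent() {
  case "$1" in
    generic)
      # 'generic' has no predefined context file; defer to agent-specific tooling.
      echo "INFO: generic agent has no predefined context file; run your agent-specific update script instead"
      ;;
    claude|copilot|gemini)
      echo "Updating context file for $1"
      ;;
    *)
      echo "ERROR: unknown agent type: $1" >&2
      return 1
      ;;
  esac
}

update_specific_agent generic
```

The key design point is that 'generic' is accepted as valid input (no error, exit status 0) but produces an informational message instead of touching a context file.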
Ignored Files
  • Ignored by pattern: .github/workflows/** (5)
    • .github/workflows/codeql.yml
    • .github/workflows/release.yml
    • .github/workflows/scripts/create-github-release.sh
    • .github/workflows/scripts/create-release-packages.ps1
    • .github/workflows/scripts/create-release-packages.sh
Activity
  • The pull request was opened as a 'draft' by nsalvacao, indicating it is under active development and not yet ready for final review or merge.
  • The author explicitly stated the PR's purpose as 'Study-Only' and an 'evaluation Draft PR', highlighting its investigative nature.
  • The description notes that tests/test_ai_skills.py currently fails, suggesting the author is aware of the current state and is likely working on further integration or debugging.

@github-actions

🔍 AI Code Review

Chunk 01

Security Vulnerabilities

No critical vulnerabilities found.
The diff mainly adds agent support, workflow automation, and tests. No secrets, unsafe shell usage, or privilege escalation are present.


Bugs

🔴 Critical

  • Release Workflow: Git Push on Main
    - name: Commit version bump to main
      ...
      git push
    If this workflow runs on a fork or without proper permissions, git push may fail or push to the wrong remote. This can break CI or cause confusion.
    Recommendation: Use git push origin main and ensure this step only runs in trusted contexts (e.g., not on forks).
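A hedged sketch of that guard as a workflow step; 'owner/repo' and the step layout are illustrative placeholders, not this project's actual release.yml:

```yaml
# Sketch only: repository name and step name are placeholders.
- name: Commit version bump to main
  # Skip on forks and on pull_request events so the push only happens
  # in the trusted upstream context, with remote and branch named explicitly.
  if: github.repository == 'owner/repo' && github.event_name == 'push'
  run: git push origin main
```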

🟡 Warning

  • CodeQL Language Matrix

    matrix:
      language: [ 'actions', 'python' ]

    If your repository does not contain GitHub Actions code or Python, this matrix is unnecessary and may waste CI resources.
    Recommendation: Only specify languages present in your codebase.

  • Shell Script: Agent List Drift

    • In create-release-packages.sh and .ps1, the agent lists are duplicated and may drift over time.
      Recommendation: Centralize agent lists to avoid inconsistencies.

Best Practice Violations

🟡 Warning

  • Workflow: Git User Configuration

    git config user.name "github-actions[bot]"
    git config user.email "41898282+github-actions[bot]@users.noreply.github.com"

    Hardcoding bot credentials is common, but consider using ${{ github.actor }} for traceability.

  • Workflow: Conditional Commit

    git diff --cached --quiet || git commit ...

    This is good, but consider adding --allow-empty if you want to commit even when nothing changed, or clarify intent.
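The short-circuit semantics behind that conditional commit can be checked in isolation. This throwaway-repo demo assumes git is on PATH and is not part of the reviewed workflow:

```shell
# `git diff --cached --quiet` exits 0 when nothing is staged, so the commit
# half of the `||` only runs when there IS a staged change.
set -eu
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.email ci@example.com
git config user.name ci
echo hello > f.txt
git add f.txt
git diff --cached --quiet || git commit -qm "add f.txt"   # runs: change is staged
git diff --cached --quiet || git commit -qm "noop"        # skipped: nothing staged
echo "commits: $(git rev-list --count HEAD)"
```

The second `git commit` never executes, so the repo ends with exactly one commit and no empty "noop" commit.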

🔵 Info

  • Tests: Large Test File

    • tests/test_ai_skills.py is very large (632 lines). Consider splitting into multiple files for maintainability.
  • Tests: Direct Use of Internal Functions

    • Tests call internal functions like _get_skills_dir. This is fine for unit tests, but consider also testing public interfaces.
  • Shell Scripts: set -euo pipefail

    • Good practice, but ensure all scripts use this for consistency.
  • YAML: Explicit Permissions

    • Good use of permissions in CodeQL workflow.

Summary Table

| Issue | Severity | Recommendation |
| --- | --- | --- |
| Git push in workflow | 🔴 Critical | Use git push origin main, restrict to trusted contexts |
| CodeQL language matrix | 🟡 Warning | Only include languages present |
| Agent list duplication | 🟡 Warning | Centralize agent lists |
| Hardcoded git user | 🟡 Warning | Consider ${{ github.actor }} |
| Large test file | 🔵 Info | Split for maintainability |
| Internal function tests | 🔵 Info | Add public interface tests |

No security vulnerabilities found. Address the critical workflow bug and warnings for improved reliability and maintainability.

Chunk 02

Review Summary

The diff consists of test cases for the install_ai_skills function and related CLI logic. The tests cover various scenarios including YAML parsing, file preservation, error handling, and CLI validation. Below is a review based on security, bugs, and best practices.


1. Security Vulnerabilities

🔵 Info

  • No direct security vulnerabilities found.
    The code is test code and does not expose sensitive operations or insecure patterns. However, always ensure that test fixtures and mocks do not leak sensitive data or credentials.

2. Bugs

🟡 Warning

  • YAML Parsing Robustness:
    In tests like test_generated_skill_has_parseable_yaml, the code splits content by "---" and parses parts[1]. If the YAML frontmatter contains "---" within its content, this could break parsing.
    Recommendation: Use a proper YAML frontmatter parser (e.g., python-frontmatter) instead of manual splitting.
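One stdlib-only way to make the split robust is to anchor the closing delimiter to the start of a line, so a literal "---" inside a value does not terminate the frontmatter. This is a sketch of the parsing idea, not the test suite's code; the tests may well prefer a dedicated library such as python-frontmatter:

```python
import re

def split_frontmatter(text):
    """Split '---' frontmatter from body, tolerating '---' inside values.

    The closing delimiter only matches at the start of a line, unlike a
    naive text.split("---"), which breaks on dashes inside the YAML.
    """
    m = re.match(r"\A---\r?\n(.*?)\r?\n---\r?\n(.*)\Z", text, re.DOTALL)
    if not m:
        return None, text  # no frontmatter present
    return m.group(1), m.group(2)

doc = "---\nname: demo\ndescription: uses --- dashes\n---\nBody text\n"
fm, body = split_frontmatter(doc)
print(fm.splitlines()[0])  # name: demo
print(body.strip())        # Body text
```

The extracted frontmatter string can then be handed to a real YAML parser (e.g. yaml.safe_load) rather than being interpreted by hand.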

  • Partial Test Name:
    The last test method, test_ai_skills_fla, is incomplete.
    Recommendation: Remove or complete this test to avoid confusion and potential test runner errors.


3. Best Practice Violations

🟡 Warning

  • Direct File Manipulation:
    Many tests manipulate files and directories directly. While this is common in integration tests, ensure proper cleanup and isolation (which pytest fixtures likely provide).
    Recommendation: Confirm that all temporary files/directories are cleaned up after tests.

🔵 Info

  • Use of assert Statements:
    Using bare assert is fine in pytest, but for more informative errors, consider using assert ... , "message" or pytest's built-in assertion introspection.

  • Mocking and Patching:
    The tests use patch for mocking. Ensure that patches are properly scoped and do not leak into other tests.

  • Hardcoded Strings:
    Some tests check for hardcoded content in files. If the templates change, these tests may break.
    Recommendation: Consider parametrizing content checks or using fixtures for template content.


4. Additional Observations

🔵 Info

  • Coverage:
    The tests provide thorough coverage for edge cases, including malformed YAML, missing directories, and preservation of user files.

  • Skill Descriptions:
    The test for SKILL_DESCRIPTIONS ensures all commands have adequate descriptions, which is good for maintainability.


Action Items

  • Replace manual YAML frontmatter parsing with a robust parser.
  • Complete or remove the partial test test_ai_skills_fla.
  • Ensure all file manipulations are isolated and cleaned up.
  • Consider improving assertion messages for easier debugging.

No critical vulnerabilities found.
Main issues are robustness and completeness of tests.

Chunk 03

Review Summary

1. Security Vulnerabilities

🔵 Info
No direct security vulnerabilities are apparent in this diff. The code is strictly test code and does not handle user input or sensitive data directly.

2. Bugs

🟡 Warning

  • Missing Imports:
    • The code uses app and re but does not show their imports in the diff. If app and re are not imported elsewhere in the test module, these tests will fail with NameError.
      • Recommendation: Ensure app and re are imported at the top of the test file.
  • Test Isolation:
    • Multiple tests use CliRunner() and invoke the CLI. If the CLI modifies global state or writes files, tests may interfere with each other or with the environment.
      • Recommendation: Use temporary directories or fixtures to isolate CLI invocations.
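A minimal illustration of that isolation pattern, using only the standard library. Here fake_cli_init is a hypothetical stand-in for the real CLI entry point, which is not shown in this diff:

```python
import os
import tempfile

def fake_cli_init(target_dir):
    """Stand-in for a CLI 'init' command that writes project files (hypothetical)."""
    os.makedirs(target_dir, exist_ok=True)
    with open(os.path.join(target_dir, "AGENTS.md"), "w") as f:
        f.write("# agents\n")

def test_init_is_isolated():
    # Each invocation gets its own temp dir, so tests cannot interfere
    # with each other or leave files behind.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "proj")
        fake_cli_init(target)
        assert os.path.exists(os.path.join(target, "AGENTS.md"))
    # The context manager removed the directory, including everything inside.
    assert not os.path.exists(target)

test_init_is_isolated()
print("isolated test passed")
```

With pytest, the tmp_path fixture gives the same guarantee without the explicit context manager.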

3. Best Practice Violations

🟡 Warning

  • Hardcoded String Checks:
    • Tests check for specific strings in CLI output (e.g., "Usage:", "--ai-skills requires --ai"). If CLI output changes (e.g., wording, formatting), tests may fail unnecessarily.
      • Recommendation: Consider using more robust checks or documenting that output changes require test updates.
  • Test Naming:
    • The test class is named TestCliValidation, which is good, but method names could be more descriptive (e.g., test_ai_skills_flag_requires_ai_option).
      • Recommendation: Use descriptive test method names for clarity.

🔵 Info

  • Description Length Assertion:
    • The assertion assert len(SKILL_DESCRIPTIONS[cmd]) > 20 is arbitrary. If descriptions are intentionally concise, this may cause false negatives.
      • Recommendation: Document rationale for minimum length or make it configurable.

Summary Table

| Issue Type | Severity | Description/Recommendation |
| --- | --- | --- |
| Missing Imports | 🟡 Warning | Ensure app and re are imported. |
| Test Isolation | 🟡 Warning | Use temp dirs/fixtures for CLI tests. |
| Hardcoded String Checks | 🟡 Warning | Document or improve robustness of output checks. |
| Test Naming | 🟡 Warning | Use descriptive method names. |
| Description Length | 🔵 Info | Document rationale for minimum length assertion. |

No critical issues found. Address warnings for improved reliability and maintainability.


🤖 AI Review · openai/gpt-4.1 · 11825 tokens · GitHub Models free tier · 0 premium requests


@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request is a draft to evaluate upstream changes for AI skills and generic agent support. The changes introduce a 'generic' agent type in the agent context update scripts and add a comprehensive test suite for the new AI skills functionality. My review focuses on the new test file, tests/test_ai_skills.py. I've provided suggestions to improve test robustness, code structure, and adherence to Python style guidelines. The changes in the shell scripts are consistent and look good.

Comment thread tests/test_ai_skills.py
Comment on lines +459 to +466
    def _fake_extract(self, agent, project_path, **_kwargs):
        """Simulate template extraction: create agent commands dir."""
        agent_cfg = AGENT_CONFIG.get(agent, {})
        agent_folder = agent_cfg.get("folder", "")
        if agent_folder:
            cmds_dir = project_path / agent_folder.rstrip("/") / "commands"
            cmds_dir.mkdir(parents=True, exist_ok=True)
            (cmds_dir / "speckit.specify.md").write_text("# spec")


medium

This helper method does not use the instance (self). It can be converted to a static method to make it clear that it doesn't depend on the state of the test class instance. This improves code clarity and structure.

Suggested change

    def _fake_extract(self, agent, project_path, **_kwargs):
        """Simulate template extraction: create agent commands dir."""
        agent_cfg = AGENT_CONFIG.get(agent, {})
        agent_folder = agent_cfg.get("folder", "")
        if agent_folder:
            cmds_dir = project_path / agent_folder.rstrip("/") / "commands"
            cmds_dir.mkdir(parents=True, exist_ok=True)
            (cmds_dir / "speckit.specify.md").write_text("# spec")

becomes:

    @staticmethod
    def _fake_extract(agent, project_path, **_kwargs):
        """Simulate template extraction: create agent commands dir."""
        agent_cfg = AGENT_CONFIG.get(agent, {})
        agent_folder = agent_cfg.get("folder", "")
        if agent_folder:
            cmds_dir = project_path / agent_folder.rstrip("/") / "commands"
            cmds_dir.mkdir(parents=True, exist_ok=True)
            (cmds_dir / "speckit.specify.md").write_text("# spec")

Comment thread tests/test_ai_skills.py

    def test_new_project_commands_removed_after_skills_succeed(self, tmp_path):
        """For new projects, commands should be removed when skills succeed."""
        from typer.testing import CliRunner


medium

According to PEP 8, imports should be at the top of the file. Please move from typer.testing import CliRunner to the top-level imports section of this file. This avoids repeated imports within test methods and improves code organization.

This comment applies to all other local imports of CliRunner in this file.

References
  1. Imports should usually be on separate lines, at the top of the file, just after any module comments and docstrings, and before module globals and constants. They should be grouped in the following order: standard library imports, third-party imports, local application/library specific imports. Each group should be separated by a blank line. (link)

Comment thread tests/test_ai_skills.py
    with patch("specify_cli.install_ai_skills", return_value=True) as mock_skills, \
         patch("specify_cli.is_git_repo", return_value=False), \
         patch("specify_cli.shutil.which", return_value="/usr/bin/git"):
        result = runner.invoke(app, ["init", str(target), "--ai", "claude", "--ai-skills", "--script", "sh", "--no-git"])


medium

The result of runner.invoke is captured but not checked. It's a good practice to assert that the command executed successfully by checking the exit code. This makes the test more robust by ensuring that the CLI command itself didn't fail for an unexpected reason before the main assertions are checked.

This also applies to the runner.invoke calls in test_commands_preserved_when_skills_fail and test_here_mode_commands_preserved.

Suggested change

    result = runner.invoke(app, ["init", str(target), "--ai", "claude", "--ai-skills", "--script", "sh", "--no-git"])

becomes:

    result = runner.invoke(app, ["init", str(target), "--ai", "claude", "--ai-skills", "--script", "sh", "--no-git"])
    assert result.exit_code == 0

Base automatically changed from baseline/upstream-intake-2026-02-25 to main February 25, 2026 16:15
