
draft(intake): study upstream ai-skills/generic stack impact#147

Draft
nsalvacao wants to merge 4 commits into main from intake/upstream-study-ai-skills-generic-stack

Conversation

@nsalvacao
Owner

Purpose (Study-Only)

This is an evaluation Draft PR to inspect the upstream AI-skills/generic stack in isolation before deciding whether to adapt it to our fork architecture.

Target Upstream Stack

What Was Imported in This Draft

  • .github/workflows/codeql.yml
  • Release packaging/release-script deltas related to generic agent packaging
  • Agent context script updates
  • tests/test_ai_skills.py and follow-up test deltas from the stack

Explicitly Deferred (Conflict-Heavy Core)

For this study pass, conflict-heavy files were intentionally kept on fork baseline to avoid unintended replacement of current architecture:

  • src/specify_cli/__init__.py
  • README.md
  • CHANGELOG.md
  • pyproject.toml
  • AGENTS.md

Current Expected State

  • This Draft is not merge-ready.
  • tests/test_ai_skills.py currently fails to collect because core implementation symbols are not yet ported (e.g., _get_skills_dir).

Why This Is Useful

It allows us to:

  1. Review surface area, CI/script effects, and packaging impacts of the upstream stack.
  2. Estimate the real adaptation cost into our roadmap (without forcing premature adoption).
  3. Plan a future dedicated adaptation PR if approved.

Credit / Provenance

Authorship is preserved via cherry-pick -x on all four upstream commits.

mnriem and others added 4 commits February 25, 2026 12:57
* implement ai-skills command line switch

* fix: address review comments, remove breaking change for existing projects, add tests

* fix: review comments

* fix: review comments

* fix: review comments

* fix: review comments

* fix: review comments, add test cases for all the agents

* fix: review comments

* fix: review comments

* chore: trigger CI

* chore: trigger CodeQL

* ci: add CodeQL workflow for code scanning

* ci: add actions language to CodeQL workflow, disable default setup

---------

Co-authored-by: dhilipkumars <s.dhilipkumar@gmail.com>
(cherry picked from commit 9402ebd)
* feat: add GitHub Actions workflow for testing and linting Python code

* fix: resolve ruff lint errors in specify_cli

- Remove extraneous f-string prefixes (F541)
- Split multi-statement lines (E701, E702)
- Remove unused variable assignments (F841)
- Remove ruff format check from CI workflow (format-only PR to follow)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: strip ANSI codes in ai-skills help text test

The Rich/Typer CLI injects ANSI escape codes into option names in
--help output, causing plain string matching to fail.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
(cherry picked from commit 24d76b5)
…ub#1639)

- Add --ai generic option for unsupported AI agents (bring your own agent)
- Require --ai-commands-dir to specify where agent reads commands from
- Generate Markdown commands with $ARGUMENTS format (compatible with most agents)
- Rebuild CHANGELOG from GitHub releases (last 10 releases)
- Align version to 0.1.3

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
(cherry picked from commit 6150f1e)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3 to 4.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](github/codeql-action@v3...v4)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: '4'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit 04fc3fd)
@github-actions

📝 AI PR Summary

Summary

Add CodeQL security analysis workflow, support new "generic" agent in release packaging and update scripts, and introduce comprehensive unit tests for AI agent skills installation.

Changes

  • Added .github/workflows/codeql.yml for CodeQL scanning on main branch pushes and PRs.
  • Enhanced release workflow to commit version bumps automatically.
  • Updated release packaging scripts (bash and PowerShell) to include new "generic" agent.
  • Modified agent context update scripts to recognize "generic" agent with appropriate messaging.
  • Added new unit tests (tests/test_ai_skills.py) covering AI agent skills installation, YAML frontmatter parsing, and CLI validation.

Impact

  • Security scanning integrated into CI via CodeQL.
  • Release process now supports automatic version bump commits.
  • Build and release pipelines support the new "generic" agent.
  • Agent context update tools handle "generic" agent gracefully.
  • Improved test coverage for AI skills installation functionality.

🤖 Auto-generated · openai/gpt-4.1-mini · GitHub Models free tier · 0 premium requests

@gemini-code-assist

Summary of Changes

Hello @nsalvacao, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request serves as a preliminary investigation into integrating an upstream AI-skills/generic stack. It selectively imports components to assess their surface area, CI/script interactions, and packaging implications, while consciously avoiding immediate changes to core, conflict-prone files. The aim is to gather data and insights to inform a strategic decision on future adaptation, rather than to deliver a functional, mergeable feature at this stage.

Highlights

  • Evaluation Draft PR: This pull request is explicitly marked as a 'study-only' evaluation draft, intended to inspect an upstream AI-skills/generic stack without being merge-ready.
  • Upstream Stack Integration: The PR imports specific components from four upstream commits, including CI workflows, release packaging deltas, agent context script updates, and new AI skills test files.
  • Deferred Files: Several conflict-heavy core files (e.g., __init__.py, README.md, pyproject.toml) were intentionally excluded from this study pass to avoid premature architectural changes.
  • Known Limitations: The newly added tests/test_ai_skills.py currently fails to collect tests because core implementation symbols required for AI skills are not yet ported.
  • Purpose of Study: The primary goals are to review the impact of the upstream stack, estimate adaptation costs, and inform future dedicated adaptation efforts.
  • Agent Context Updates: Both Bash and PowerShell agent context update scripts were modified to recognize and handle a new 'generic' agent type.
  • New AI Skills Test Suite: A new Python test file, tests/test_ai_skills.py, was added to provide unit tests for AI agent skills installation, covering skill directory resolution, YAML frontmatter parsing, and CLI validation.
Changelog
  • scripts/bash/update-agent-context.sh
    • Added 'generic' agent type to the list of supported agents in the update_specific_agent function.
    • Included a log message for the 'generic' agent indicating no predefined context file and advising agent-specific update scripts.
  • scripts/powershell/update-agent-context.ps1
    • Updated the ValidateSet parameter for AgentType to include 'generic'.
    • Modified the Update-SpecificAgent function to handle the 'generic' agent type with an informational message.
    • Updated the usage message in Print-Summary to reflect the inclusion of the 'generic' agent.
  • tests/test_ai_skills.py
    • Added a new file containing comprehensive unit tests for AI agent skills installation.
    • Implemented tests for skill directory resolution (_get_skills_dir), YAML frontmatter parsing, and SKILL.md generation.
    • Included tests for handling missing or empty template directories, malformed YAML, and ensuring existing commands are preserved.
    • Added tests for CLI validation, specifically ensuring --ai-skills requires the --ai flag.
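The 'generic' branch described in the changelog above can be sketched roughly as follows. This is a hypothetical paraphrase of the behavior listed for update-agent-context.sh, not the script's actual code, and the non-generic agent names are placeholders:

```shell
# Hypothetical sketch of update_specific_agent handling the 'generic' type.
update_specific_agent() {
  case "$1" in
    generic)
      # 'generic' has no predefined context file; defer to agent-specific tooling.
      echo "INFO: generic agent has no predefined context file; run your agent-specific update script instead"
      ;;
    claude|copilot|gemini)
      echo "Updating context file for $1"
      ;;
    *)
      echo "ERROR: unknown agent type: $1" >&2
      return 1
      ;;
  esac
}

update_specific_agent generic
```

The key design point is that 'generic' is accepted as valid input (no error, exit status 0) but produces an informational message instead of touching a context file.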
Ignored Files
  • Ignored by pattern: .github/workflows/** (5)
    • .github/workflows/codeql.yml
    • .github/workflows/release.yml
    • .github/workflows/scripts/create-github-release.sh
    • .github/workflows/scripts/create-release-packages.ps1
    • .github/workflows/scripts/create-release-packages.sh
Activity
  • The pull request was opened as a 'draft' by nsalvacao, indicating it is under active development and not yet ready for final review or merge.
  • The author explicitly stated the PR's purpose as 'Study-Only' and an 'evaluation Draft PR', highlighting its investigative nature.
  • The description notes that tests/test_ai_skills.py currently fails, suggesting the author is aware of the current state and is likely working on further integration or debugging.

@github-actions

🔍 AI Code Review

Chunk 01

Security Vulnerabilities

No critical vulnerabilities found.
The diff mainly adds agent support, workflow automation, and tests. No secrets, unsafe shell usage, or privilege escalation are present.


Bugs

🔴 Critical

  • Release Workflow: Git Push on Main
    - name: Commit version bump to main
      ...
      git push
    If this workflow runs on a fork or without proper permissions, git push may fail or push to the wrong remote. This can break CI or cause confusion.
    Recommendation: Use git push origin main and ensure this step only runs in trusted contexts (e.g., not on forks).
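A hedged sketch of that guard as a workflow step; 'owner/repo' and the step layout are illustrative placeholders, not this project's actual release.yml:

```yaml
# Sketch only: repository name and step name are placeholders.
- name: Commit version bump to main
  # Skip on forks and on pull_request events so the push only happens
  # in the trusted upstream context, with remote and branch named explicitly.
  if: github.repository == 'owner/repo' && github.event_name == 'push'
  run: git push origin main
```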

🟡 Warning

  • CodeQL Language Matrix

    matrix:
      language: [ 'actions', 'python' ]

    If your repository does not contain GitHub Actions code or Python, this matrix is unnecessary and may waste CI resources.
    Recommendation: Only specify languages present in your codebase.

  • Shell Script: Agent List Drift

    • In create-release-packages.sh and .ps1, the agent lists are duplicated and may drift over time.
      Recommendation: Centralize agent lists to avoid inconsistencies.

Best Practice Violations

🟡 Warning

  • Workflow: Git User Configuration

    git config user.name "github-actions[bot]"
    git config user.email "41898282+github-actions[bot]@users.noreply.github.com"

    Hardcoding bot credentials is common, but consider using ${{ github.actor }} for traceability.

  • Workflow: Conditional Commit

    git diff --cached --quiet || git commit ...

    This is good, but consider adding --allow-empty if you want to commit even when nothing changed, or clarify intent.
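The short-circuit semantics behind that conditional commit can be checked in isolation. This throwaway-repo demo assumes git is on PATH and is not part of the reviewed workflow:

```shell
# `git diff --cached --quiet` exits 0 when nothing is staged, so the commit
# half of the `||` only runs when there IS a staged change.
set -eu
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.email ci@example.com
git config user.name ci
echo hello > f.txt
git add f.txt
git diff --cached --quiet || git commit -qm "add f.txt"   # runs: change is staged
git diff --cached --quiet || git commit -qm "noop"        # skipped: nothing staged
echo "commits: $(git rev-list --count HEAD)"
```

The second `git commit` never executes, so the repo ends with exactly one commit and no empty "noop" commit.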

🔵 Info

  • Tests: Large Test File

    • tests/test_ai_skills.py is very large (632 lines). Consider splitting into multiple files for maintainability.
  • Tests: Direct Use of Internal Functions

    • Tests call internal functions like _get_skills_dir. This is fine for unit tests, but consider also testing public interfaces.
  • Shell Scripts: set -euo pipefail

    • Good practice, but ensure all scripts use this for consistency.
  • YAML: Explicit Permissions

    • Good use of permissions in CodeQL workflow.

Summary Table

| Issue | Severity | Recommendation |
| --- | --- | --- |
| Git push in workflow | 🔴 Critical | Use git push origin main, restrict to trusted contexts |
| CodeQL language matrix | 🟡 Warning | Only include languages present |
| Agent list duplication | 🟡 Warning | Centralize agent lists |
| Hardcoded git user | 🟡 Warning | Consider ${{ github.actor }} |
| Large test file | 🔵 Info | Split for maintainability |
| Internal function tests | 🔵 Info | Add public interface tests |

No security vulnerabilities found. Address the critical workflow bug and warnings for improved reliability and maintainability.

Chunk 02

Review Summary

The diff consists of test cases for the install_ai_skills function and related CLI logic. The tests cover various scenarios including YAML parsing, file preservation, error handling, and CLI validation. Below is a review based on security, bugs, and best practices.


1. Security Vulnerabilities

🔵 Info

  • No direct security vulnerabilities found.
    The code is test code and does not expose sensitive operations or insecure patterns. However, always ensure that test fixtures and mocks do not leak sensitive data or credentials.

2. Bugs

🟡 Warning

  • YAML Parsing Robustness:
    In tests like test_generated_skill_has_parseable_yaml, the code splits content by "---" and parses parts[1]. If the YAML frontmatter contains "---" within its content, this could break parsing.
    Recommendation: Use a proper YAML frontmatter parser (e.g., python-frontmatter) instead of manual splitting.
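One stdlib-only way to make the split robust is to anchor the closing delimiter to the start of a line, so a literal "---" inside a value does not terminate the frontmatter. This is a sketch of the parsing idea, not the test suite's code; the tests may well prefer a dedicated library such as python-frontmatter:

```python
import re

def split_frontmatter(text):
    """Split '---' frontmatter from body, tolerating '---' inside values.

    The closing delimiter only matches at the start of a line, unlike a
    naive text.split("---"), which breaks on dashes inside the YAML.
    """
    m = re.match(r"\A---\r?\n(.*?)\r?\n---\r?\n(.*)\Z", text, re.DOTALL)
    if not m:
        return None, text  # no frontmatter present
    return m.group(1), m.group(2)

doc = "---\nname: demo\ndescription: uses --- dashes\n---\nBody text\n"
fm, body = split_frontmatter(doc)
print(fm.splitlines()[0])  # name: demo
print(body.strip())        # Body text
```

The extracted frontmatter string can then be handed to a real YAML parser (e.g. yaml.safe_load) rather than being interpreted by hand.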

  • Partial Test Name:
    The last test method, test_ai_skills_fla, is incomplete.
    Recommendation: Remove or complete this test to avoid confusion and potential test runner errors.


3. Best Practice Violations

🟡 Warning

  • Direct File Manipulation:
    Many tests manipulate files and directories directly. While this is common in integration tests, ensure proper cleanup and isolation (which pytest fixtures likely provide).
    Recommendation: Confirm that all temporary files/directories are cleaned up after tests.

🔵 Info

  • Use of assert Statements:
    Using bare assert is fine in pytest, but for more informative errors, consider using assert ... , "message" or pytest's built-in assertion introspection.

  • Mocking and Patching:
    The tests use patch for mocking. Ensure that patches are properly scoped and do not leak into other tests.

  • Hardcoded Strings:
    Some tests check for hardcoded content in files. If the templates change, these tests may break.
    Recommendation: Consider parametrizing content checks or using fixtures for template content.


4. Additional Observations

🔵 Info

  • Coverage:
    The tests provide thorough coverage for edge cases, including malformed YAML, missing directories, and preservation of user files.

  • Skill Descriptions:
    The test for SKILL_DESCRIPTIONS ensures all commands have adequate descriptions, which is good for maintainability.


Action Items

  • Replace manual YAML frontmatter parsing with a robust parser.
  • Complete or remove the partial test test_ai_skills_fla.
  • Ensure all file manipulations are isolated and cleaned up.
  • Consider improving assertion messages for easier debugging.

No critical vulnerabilities found.
Main issues are robustness and completeness of tests.

Chunk 03

Review Summary

1. Security Vulnerabilities

🔵 Info
No direct security vulnerabilities are apparent in this diff. The code is strictly test code and does not handle user input or sensitive data directly.

2. Bugs

🟡 Warning

  • Missing Imports:
    • The code uses app and re but does not show their imports in the diff. If app and re are not imported elsewhere in the test module, these tests will fail with NameError.
      • Recommendation: Ensure app and re are imported at the top of the test file.
  • Test Isolation:
    • Multiple tests use CliRunner() and invoke the CLI. If the CLI modifies global state or writes files, tests may interfere with each other or with the environment.
      • Recommendation: Use temporary directories or fixtures to isolate CLI invocations.
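A minimal illustration of that isolation pattern, using only the standard library. Here fake_cli_init is a hypothetical stand-in for the real CLI entry point, which is not shown in this diff:

```python
import os
import tempfile

def fake_cli_init(target_dir):
    """Stand-in for a CLI 'init' command that writes project files (hypothetical)."""
    os.makedirs(target_dir, exist_ok=True)
    with open(os.path.join(target_dir, "AGENTS.md"), "w") as f:
        f.write("# agents\n")

def test_init_is_isolated():
    # Each invocation gets its own temp dir, so tests cannot interfere
    # with each other or leave files behind.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "proj")
        fake_cli_init(target)
        assert os.path.exists(os.path.join(target, "AGENTS.md"))
    # The context manager removed the directory, including everything inside.
    assert not os.path.exists(target)

test_init_is_isolated()
print("isolated test passed")
```

With pytest, the tmp_path fixture gives the same guarantee without the explicit context manager.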

3. Best Practice Violations

🟡 Warning

  • Hardcoded String Checks:
    • Tests check for specific strings in CLI output (e.g., "Usage:", "--ai-skills requires --ai"). If CLI output changes (e.g., wording, formatting), tests may fail unnecessarily.
      • Recommendation: Consider using more robust checks or documenting that output changes require test updates.
  • Test Naming:
    • The test class is named TestCliValidation, which is good, but method names could be more descriptive (e.g., test_ai_skills_flag_requires_ai_option).
      • Recommendation: Use descriptive test method names for clarity.

🔵 Info

  • Description Length Assertion:
    • The assertion assert len(SKILL_DESCRIPTIONS[cmd]) > 20 is arbitrary. If descriptions are intentionally concise, this may cause false negatives.
      • Recommendation: Document rationale for minimum length or make it configurable.

Summary Table

| Issue Type | Severity | Description/Recommendation |
| --- | --- | --- |
| Missing Imports | 🟡 Warning | Ensure app and re are imported. |
| Test Isolation | 🟡 Warning | Use temp dirs/fixtures for CLI tests. |
| Hardcoded String Checks | 🟡 Warning | Document or improve robustness of output checks. |
| Test Naming | 🟡 Warning | Use descriptive method names. |
| Description Length | 🔵 Info | Document rationale for minimum length assertion. |

No critical issues found. Address warnings for improved reliability and maintainability.


🤖 AI Review · openai/gpt-4.1 · 11825 tokens · GitHub Models free tier · 0 premium requests


@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request is a draft to evaluate upstream changes for AI skills and generic agent support. The changes introduce a 'generic' agent type in the agent context update scripts and add a comprehensive test suite for the new AI skills functionality. My review focuses on the new test file, tests/test_ai_skills.py. I've provided suggestions to improve test robustness, code structure, and adherence to Python style guidelines. The changes in the shell scripts are consistent and look good.

Comment thread tests/test_ai_skills.py
Comment on lines +459 to +466
    def _fake_extract(self, agent, project_path, **_kwargs):
        """Simulate template extraction: create agent commands dir."""
        agent_cfg = AGENT_CONFIG.get(agent, {})
        agent_folder = agent_cfg.get("folder", "")
        if agent_folder:
            cmds_dir = project_path / agent_folder.rstrip("/") / "commands"
            cmds_dir.mkdir(parents=True, exist_ok=True)
            (cmds_dir / "speckit.specify.md").write_text("# spec")


medium

This helper method does not use the instance (self). It can be converted to a static method to make it clear that it doesn't depend on the state of the test class instance. This improves code clarity and structure.

Suggested change

    def _fake_extract(self, agent, project_path, **_kwargs):
        """Simulate template extraction: create agent commands dir."""
        agent_cfg = AGENT_CONFIG.get(agent, {})
        agent_folder = agent_cfg.get("folder", "")
        if agent_folder:
            cmds_dir = project_path / agent_folder.rstrip("/") / "commands"
            cmds_dir.mkdir(parents=True, exist_ok=True)
            (cmds_dir / "speckit.specify.md").write_text("# spec")

becomes:

    @staticmethod
    def _fake_extract(agent, project_path, **_kwargs):
        """Simulate template extraction: create agent commands dir."""
        agent_cfg = AGENT_CONFIG.get(agent, {})
        agent_folder = agent_cfg.get("folder", "")
        if agent_folder:
            cmds_dir = project_path / agent_folder.rstrip("/") / "commands"
            cmds_dir.mkdir(parents=True, exist_ok=True)
            (cmds_dir / "speckit.specify.md").write_text("# spec")

Comment thread tests/test_ai_skills.py

    def test_new_project_commands_removed_after_skills_succeed(self, tmp_path):
        """For new projects, commands should be removed when skills succeed."""
        from typer.testing import CliRunner


medium

According to PEP 8, imports should be at the top of the file. Please move from typer.testing import CliRunner to the top-level imports section of this file. This avoids repeated imports within test methods and improves code organization.

This comment applies to all other local imports of CliRunner in this file.

References
  1. Imports should usually be on separate lines, at the top of the file, just after any module comments and docstrings, and before module globals and constants. They should be grouped in the following order: standard library imports, third-party imports, local application/library specific imports. Each group should be separated by a blank line. (link)

Comment thread tests/test_ai_skills.py
    with patch("specify_cli.install_ai_skills", return_value=True) as mock_skills, \
         patch("specify_cli.is_git_repo", return_value=False), \
         patch("specify_cli.shutil.which", return_value="/usr/bin/git"):
        result = runner.invoke(app, ["init", str(target), "--ai", "claude", "--ai-skills", "--script", "sh", "--no-git"])


medium

The result of runner.invoke is captured but not checked. It's a good practice to assert that the command executed successfully by checking the exit code. This makes the test more robust by ensuring that the CLI command itself didn't fail for an unexpected reason before the main assertions are checked.

This also applies to the runner.invoke calls in test_commands_preserved_when_skills_fail and test_here_mode_commands_preserved.

Suggested change

    result = runner.invoke(app, ["init", str(target), "--ai", "claude", "--ai-skills", "--script", "sh", "--no-git"])

becomes:

    result = runner.invoke(app, ["init", str(target), "--ai", "claude", "--ai-skills", "--script", "sh", "--no-git"])
    assert result.exit_code == 0

Base automatically changed from baseline/upstream-intake-2026-02-25 to main February 25, 2026 16:15
