
Proposal: Semantic Issue and PR Triage via Simili Bot #400

@Kavirubc

Problem or use case

As the entireio/cli repository grows, managing and triaging issues becomes increasingly time-consuming. Duplicate issues are filed frequently, related discussions stay disconnected, and maintainers spend significant time on manual triage that could be automated.

Keyword-based search is insufficient: users describing the same problem often use different terminology, so duplicates go undetected.

Desired behavior

Image

Proposed solution

Integrate Simili Bot, an open-source semantic issue triage tool that uses vector embeddings to find similar issues based on meaning rather than keywords. The bot would:

  • Automatically comment on new issues or PRs with links to semantically similar existing issues
  • Warn maintainers when a likely duplicate is detected
  • Suggest relevant labels based on issue content
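To make the "meaning rather than keywords" idea concrete, here is a minimal sketch of similarity matching over embeddings. It is not Simili Bot's actual code: the issue numbers, the 3-dimensional toy vectors, the `0.8` threshold, and the function names are all assumptions for illustration. Real embeddings would come from the embedding API and have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def find_similar(new_vec, indexed, threshold=0.8, top_k=3):
    """Return up to top_k (issue_number, score) pairs at or above threshold,
    best match first. `threshold` is a hypothetical tuning knob."""
    scored = [(num, cosine_similarity(new_vec, vec)) for num, vec in indexed.items()]
    hits = [(num, score) for num, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)[:top_k]

# Hypothetical index: issue number -> toy embedding.
indexed = {
    101: [0.9, 0.1, 0.0],
    102: [0.0, 1.0, 0.0],
    103: [0.8, 0.2, 0.1],
}
matches = find_similar([1.0, 0.1, 0.0], indexed)
```

In this toy data, issues 101 and 103 point in nearly the same direction as the new issue and are returned, while 102 falls below the threshold. A vector database would replace the linear scan with an approximate nearest-neighbor search.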

Technical Approach

  • Gemini or OpenAI API for generating issue embeddings
  • Gemini (gemini-2.5-flash) or OpenAI for label suggestions and duplicate detection
  • Qdrant as the vector database (self-hostable, no vendor lock-in)
  • GitHub Actions workflow triggered on issue and PR events: [opened, reopened]
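The trigger condition in the last bullet could be expressed roughly as follows. This is a sketch of the filtering logic only, assuming the standard GitHub webhook payload shape (an `action` field plus an `issue` or `pull_request` object with `title` and `body`); `should_triage` and `extract_text` are hypothetical helper names, not part of Simili Bot.

```python
TRIGGER_EVENTS = {"issues", "pull_request"}
TRIGGER_ACTIONS = {"opened", "reopened"}

def should_triage(event_name, payload):
    """True only for issue/PR events whose action is opened or reopened."""
    return event_name in TRIGGER_EVENTS and payload.get("action") in TRIGGER_ACTIONS

def extract_text(event_name, payload):
    """Build the text to embed from the title and body of the issue or PR."""
    key = "issue" if event_name == "issues" else "pull_request"
    item = payload[key]
    title = item.get("title") or ""
    body = item.get("body") or ""  # body is null when the description is empty
    return f"{title}\n\n{body}".strip()
```

Note that `edited` is not in the trigger set, which is why later edits to an issue would not reach the index under this setup.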

Infrastructure Requirements

Three secrets would need to be added to the repository:

  • GEMINI_API_KEY
  • QDRANT_URL
  • QDRANT_API_KEY
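A step that consumes these secrets would typically fail fast when one is missing. The following is a minimal sketch assuming the secrets are exposed to the job as environment variables under the same names; `load_secrets` is a hypothetical helper.

```python
import os

REQUIRED_SECRETS = ("GEMINI_API_KEY", "QDRANT_URL", "QDRANT_API_KEY")

def load_secrets(env=None):
    """Return the required secrets, or raise listing every missing name."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_SECRETS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required secrets: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_SECRETS}
```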

An initial backfill run (which can be done with the CLI) would also be needed to index existing issues.
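Conceptually, a backfill walks the existing issues in batches, embeds each batch, and writes the vectors to the index. The sketch below shows that loop only; it does not reflect the Simili Bot CLI. `embed_batch` and `upsert` are hypothetical callables standing in for the embedding API and the vector database client, and the batch size of 50 is an arbitrary assumption.

```python
from itertools import islice

def batched(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

def backfill(issues, embed_batch, upsert, batch_size=50):
    """Embed and index existing issues in batches; return how many were indexed."""
    indexed = 0
    for chunk in batched(issues, batch_size):
        texts = [f"{i['title']}\n\n{i.get('body') or ''}" for i in chunk]
        vectors = embed_batch(texts)  # hypothetical: one vector per text
        upsert([(i["number"], v) for i, v in zip(chunk, vectors)])  # hypothetical
        indexed += len(chunk)
    return indexed
```

Batching keeps the number of API calls low and makes it easier to stay within rate limits on the embedding provider.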

Considerations / Open Questions

  • Is the team comfortable taking a dependency on a third-party GitHub Action (similigh/simili-bot) with access to repository secrets and issue write permissions?
  • Should the action be pinned to a commit SHA rather than a version tag for supply-chain security?
  • Who would own the Qdrant instance and API keys?
  • How should index staleness be handled? (Edits to issues are not currently re-indexed.)
  • Should duplicate auto-closing ever be considered, or should it remain warn-only?
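On the staleness question, one possible approach (an assumption, not something Simili Bot is known to do) is to store a content hash alongside each indexed issue and re-embed only when the hash changes. `content_hash`, `needs_reindex`, and the `stored_hashes` mapping are hypothetical names.

```python
import hashlib

def content_hash(title, body):
    """Stable fingerprint of the text that gets embedded."""
    data = f"{title}\n\n{body or ''}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()

def needs_reindex(issue, stored_hashes):
    """True if the issue is new to the index or its text changed since indexing."""
    current = content_hash(issue["title"], issue.get("body"))
    return stored_hashes.get(issue["number"]) != current
```

This would require the workflow to also run on `edited` events, or a periodic job that compares hashes across all open issues.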

Related

PR #283 was opened proposing this integration. This issue is intended to establish alignment on the approach before that PR is reviewed.

Alternatives or workarounds

No response

Labels: enhancement (New feature or request)
