The current design (see #6) has three distinct phases:
data collection → data analysis → report generation
For data collection we seem well covered: basic API queries should already provide a rich data set.
Data analysis could clearly benefit from AI assistance. Alongside simple analyzers with deterministic rules (e.g. checking for the presence or absence of data, or counting data points and matching the count against a threshold), we could experiment with prompt-based analyzers that, for example, evaluate documentation quality (SECURITY.md or other community files) against existing best-practice examples.
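To make the distinction concrete, here is a minimal sketch of two deterministic analyzers of the kind described above. All names (`CheckResult`, `presence_analyzer`, `threshold_analyzer`) are illustrative, not part of any existing design:

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    """Outcome of a single deterministic check (illustrative shape)."""
    check: str
    passed: bool
    detail: str


def presence_analyzer(data: dict, key: str) -> CheckResult:
    """Deterministic rule: pass if the collected data contains `key`."""
    present = data.get(key) is not None
    return CheckResult(
        check=f"presence:{key}",
        passed=present,
        detail="found" if present else "missing",
    )


def threshold_analyzer(data: dict, key: str, minimum: int) -> CheckResult:
    """Deterministic rule: count data points and compare against a threshold."""
    count = len(data.get(key, []))
    return CheckResult(
        check=f"threshold:{key}>={minimum}",
        passed=count >= minimum,
        detail=f"{count} data points",
    )
```

A prompt-based analyzer would have the same `data -> CheckResult` shape, but delegate the judgment (e.g. "does this SECURITY.md follow best practice?") to an LLM instead of a fixed rule.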
At a later stage, we might also want to experiment with AI-assisted report generation, e.g. to synthesize analysis results into summaries or recommendations. (Note that the separation between analysis and report generation lets us keep individual analyzers independent, while the report generation phase operates on the full set of analyzer results.)
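The analysis / report-generation split could be sketched roughly as follows. This is a hypothetical illustration (function names and the dict-based result shape are assumptions, not an existing API): each analyzer runs independently, and only the report phase sees the complete result set.

```python
def run_pipeline(data, analyzers):
    # Analysis phase: each analyzer is independent and sees only the
    # collected data, never the other analyzers' results.
    results = [analyze(data) for analyze in analyzers]
    # Report phase: operates on the complete set of results, so it can
    # synthesize summaries or (later, with AI assistance) recommendations.
    return generate_report(results)


def generate_report(results):
    # Plain deterministic report; an AI-assisted variant could replace
    # this function without touching any analyzer.
    passed = [r for r in results if r["passed"]]
    lines = [f"{len(passed)}/{len(results)} checks passed"]
    lines += [f"- {r['check']}: {'ok' if r['passed'] else 'FAIL'}"
              for r in results]
    return "\n".join(lines)
```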
Notes on security and privacy
- If we need to make authenticated requests, we must retain full control of the credentials
- Some of the collected data may be privacy-sensitive and must not be shared with third-party AI providers (we should support local LLMs)
Alternatives
In an alternative approach, an overarching "agentic tool" could orchestrate the entire process, deciding which data to collect, what analysis to run, and when to stop. The check runner implemented here would become a set of tools (or skills) which the AI can call as needed.
While this sounds interesting, it has a few important downsides:
- Less predictable/reproducible (might skip checks or hallucinate conclusions between runs)
- Less control (see privacy/security concerns above)
- More expensive (in terms of money for 3rd party provider, or local compute)
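For comparison, the agentic alternative would look roughly like this. Everything here is a hypothetical sketch (tool names and shapes are invented for illustration): the pipeline's phases become entries in a tool registry, and the agent, not a fixed pipeline, decides what to call and when.

```python
# Illustrative tool registry: each pipeline phase exposed as a callable
# the agent may invoke in any order -- or skip entirely.
TOOLS = {
    "collect": lambda repo: {"repo": repo, "files": ["README.md"]},
    "analyze": lambda data: [
        {"check": "has_readme", "passed": "README.md" in data["files"]}
    ],
    "report": lambda results: f"{sum(r['passed'] for r in results)} checks passed",
}


def agent_step(tool_name, *args):
    # The agent decides which tool to call and when to stop, which is
    # precisely what makes runs less predictable and reproducible.
    return TOOLS[tool_name](*args)
```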
I suggest we start with the "pipeline tool" described above: it already covers most of what we are interested in without any AI use at all, while allowing us to experiment with AI in a controlled manner.