| 1 | +# Confetti Component Reuse Plan for Galette-Knarr Integration |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +Our Knarr-to-Galette migration is essentially complete (95%). [Confetti](https://github.com/neu-se/confetti) (CONcolic Fuzzer Employing Taint Tracking Information) provides a higher-level hybrid fuzzing architecture built *on top of* Knarr. This plan assesses which of those orchestration components add value for model transformation analysis. |
| 6 | + |
| 7 | +## Confetti Architecture Summary |
| 8 | + |
| 9 | +Confetti has a three-component architecture: |
| 10 | + |
| 11 | +1. **Parametric Fuzzer (Zest-based)** — input generation and execution |
| 12 | +2. **Whitebox Analysis Process (Knarr)** — dynamic taint tracking and constraint collection |
| 13 | +3. **Coordinator** — orchestrates fuzzer, analyzer, and Z3 solver interactions |
| 14 | + |
| 15 | +## Key Confetti Components Beyond Knarr |
| 16 | + |
| 17 | +| Component | Purpose | Relevance for Our Use Case | |
| 18 | +|-----------|---------|---------------------------| |
| 19 | +| **Coordinator** | Multi-process orchestration of fuzzer + analyzer + solver | Low — adds complexity our use case doesn't need | |
| 20 | +| **Branch State Tracking** (`Branch` class) | Tracks `trueExplored`/`falseExplored`, controlling bytes, solved status | **High** — useful for systematic path exploration | |
| 21 | +| **Global Hinting** | Inserts "interesting bytes" from constraints at any position in inputs | Medium — valuable if generating test inputs | |
| 22 | +| **String Hint Types** | `EQUALS`, `INDEXOF`, `STARTSWITH`, `ENDSWITH`, `LENGTH`, `ISEMPTY`, `Z3`, `CHAR`, `GLOBAL_DICTIONARY` | Medium — could enrich string analysis | |
| 23 | +| **Constraint Serialization & GC** | File-based storage, eviction of unhelpful constraints | Medium — improves scalability | |
| 24 | +| **Controlling Bytes Analysis** | `findControllingBytes()` identifies which inputs influence each branch | Medium — helps identify impactful user inputs | |
| 25 | +| **Fuzzing Loop (Zest integration)** | Coverage-guided fuzzing | Low — not needed for model transformation | |
| 26 | +| **Remote Z3 Worker** | Out-of-process Z3 execution with crash recovery | Low — Green solver integration is sufficient | |
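
To make the "Controlling Bytes Analysis" row concrete, the idea can be sketched as a walk over a constraint's expression tree that collects every symbolic input it mentions. This is a minimal illustration only: `ExprSketch`, `InputSketch`, `ConstSketch`, and `BinOpSketch` are hypothetical stand-ins, not the expression classes used by Knarr, Green, or Confetti, and the sketch does not reproduce Confetti's `findControllingBytes()`.

```java
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical expression tree; real constraints would use the solver's own node types.
abstract class ExprSketch {
    // Adds the index of every symbolic input reachable from this node.
    abstract void collectInputs(SortedSet<Integer> out);

    // Returns the sorted indices of all inputs that appear in the constraint.
    static SortedSet<Integer> controllingInputs(ExprSketch constraint) {
        SortedSet<Integer> out = new TreeSet<>();
        constraint.collectInputs(out);
        return out;
    }

    public static void main(String[] args) {
        // Constraint: (input[3] + input[0]) == 7
        ExprSketch sum = new BinOpSketch("+", new InputSketch(3), new InputSketch(0));
        ExprSketch constraint = new BinOpSketch("==", sum, new ConstSketch(7));
        System.out.println(controllingInputs(constraint)); // prints [0, 3]
    }
}

// Leaf node: one symbolic input, identified by its position.
final class InputSketch extends ExprSketch {
    final int index;
    InputSketch(int index) { this.index = index; }
    @Override void collectInputs(SortedSet<Integer> out) { out.add(index); }
}

// Leaf node: a concrete constant, which depends on no input.
final class ConstSketch extends ExprSketch {
    final long value;
    ConstSketch(long value) { this.value = value; }
    @Override void collectInputs(SortedSet<Integer> out) { }
}

// Inner node: a binary operation over two sub-expressions.
final class BinOpSketch extends ExprSketch {
    final String op;
    final ExprSketch left;
    final ExprSketch right;
    BinOpSketch(String op, ExprSketch left, ExprSketch right) {
        this.op = op;
        this.left = left;
        this.right = right;
    }
    @Override void collectInputs(SortedSet<Integer> out) {
        left.collectInputs(out);
        right.collectInputs(out);
    }
}
```

In a model transformation setting, the "input indices" would more plausibly be identifiers of symbolic user inputs than raw byte offsets; the traversal itself would stay the same.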
| 27 | + |
| 28 | +## Recommendations |
| 29 | + |
| 30 | +### 1. Keep Current Architecture ✅ |
| 31 | + |
| 32 | +Our single-process Galette+Knarr integration is sufficient for model transformation analysis. Confetti's multi-process coordinator adds unnecessary complexity for our use case. |
| 33 | + |
| 34 | +### 2. Consider Adopting Branch State Tracking ⭐⭐⭐ |
| 35 | + |
| 36 | +Implement branch exploration state tracking in `knarr-runtime` to track which branches are fully explored vs. unsolved. Key elements from Confetti's `Branch` class: |
| 37 | + |
| 38 | +- `trueExplored` / `falseExplored` flags |
| 39 | +- `controllingBytes` — which inputs influence the branch |
| 40 | +- `inputsTried` — which inputs have been used for this branch |
| 41 | +- `isSolved` / `isTimedOut` — exploration state |
| 42 | +- `armsExplored[]` — for switch statements |
| 43 | + |
| 44 | +This would help prioritize which paths to explore during systematic path exploration. |
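
The bookkeeping listed above could look roughly like the following. This is a sketch under assumptions, not Confetti's actual `Branch` class: the class name `BranchStateSketch`, the string `branchId`, the integer input IDs, and the `recordOutcome`/`needsSolving` methods are all hypothetical, and the `armsExplored[]` array for switch statements is left out for brevity.

```java
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;

// Hypothetical per-branch exploration record, modeled on the fields listed above.
final class BranchStateSketch {
    final String branchId;                           // assumed identifier, e.g. "Class.method:line"
    boolean trueExplored;
    boolean falseExplored;
    boolean solved;                                  // solver produced an input for the missing arm
    boolean timedOut;                                // solver gave up on this branch
    final BitSet controllingInputs = new BitSet();   // indices of inputs that influence the branch
    final Set<Integer> inputsTried = new HashSet<>();

    BranchStateSketch(String branchId) {
        this.branchId = branchId;
    }

    // Records that the given input reached this branch and which arm it took.
    void recordOutcome(int inputId, boolean taken) {
        inputsTried.add(inputId);
        if (taken) {
            trueExplored = true;
        } else {
            falseExplored = true;
        }
    }

    boolean isFullyExplored() {
        return trueExplored && falseExplored;
    }

    // A branch is worth sending to the solver only if an arm is missing
    // and no earlier attempt has already solved it or timed out.
    boolean needsSolving() {
        return !isFullyExplored() && !solved && !timedOut;
    }

    public static void main(String[] args) {
        BranchStateSketch b = new BranchStateSketch("Transform.apply:17");
        b.recordOutcome(0, true);
        System.out.println(b.needsSolving());    // false arm still unexplored
        b.recordOutcome(1, false);
        System.out.println(b.isFullyExplored());
    }
}
```

A path explorer could then keep a map from branch ID to such a record and pick the next target among entries where `needsSolving()` returns true.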
| 45 | + |
| 46 | +### 3. Optionally Port String Hint Types ⭐⭐ |
| 47 | + |
| 48 | +Confetti's `StringHint` categories could enrich our `StringSymbolicTracker` for string-heavy models: |
| 49 | + |
| 50 | +- `EQUALS` — exact string matches |
| 51 | +- `INDEXOF` — substring searches |
| 52 | +- `STARTSWITH` / `ENDSWITH` — prefix/suffix checks |
| 53 | +- `LENGTH` / `ISEMPTY` — string length constraints |
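
As an illustration of how such hint categories might be used, the sketch below turns an observed string comparison into a candidate value that would satisfy it. Everything here is an assumption made for the example: `HintKindSketch`, `StringHintSketch`, and the `apply` method are hypothetical names, the repair strategies (prepend, append, pad with `'a'`) are arbitrary choices, and none of it reflects how Confetti's `StringHint` or our `StringSymbolicTracker` is actually implemented.

```java
// Hypothetical subset of hint kinds, named after the categories listed above.
enum HintKindSketch { EQUALS, INDEXOF, STARTSWITH, ENDSWITH, LENGTH, ISEMPTY }

// Hypothetical hint: the kind of comparison observed plus its operand.
// For LENGTH the operand is assumed to hold the target length as a decimal string.
final class StringHintSketch {
    final HintKindSketch kind;
    final String operand;

    StringHintSketch(HintKindSketch kind, String operand) {
        this.kind = kind;
        this.operand = operand;
    }

    // Returns a candidate derived from `current` that satisfies the observed comparison.
    String apply(String current) {
        switch (kind) {
            case EQUALS:
                return operand;
            case STARTSWITH:
                return current.startsWith(operand) ? current : operand + current;
            case ENDSWITH:
                return current.endsWith(operand) ? current : current + operand;
            case INDEXOF:
                return current.contains(operand) ? current : current + operand;
            case ISEMPTY:
                return "";
            case LENGTH:
                return resize(current, Integer.parseInt(operand));
            default:
                throw new IllegalStateException("unhandled hint kind: " + kind);
        }
    }

    // Truncates or pads with 'a' so the result has exactly n characters.
    private static String resize(String s, int n) {
        StringBuilder sb = new StringBuilder(s);
        while (sb.length() < n) {
            sb.append('a');
        }
        sb.setLength(n);
        return sb.toString();
    }

    public static void main(String[] args) {
        StringHintSketch hint = new StringHintSketch(HintKindSketch.STARTSWITH, "model:");
        System.out.println(hint.apply("x")); // prints model:x
    }
}
```

For pure path analysis the `apply` step may be unnecessary; recording only the kind and operand per string comparison would already tell an analyst which string properties a transformation branches on.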
| 54 | + |
| 55 | +### 4. Add Constraint GC If Scaling Issues Arise ⭐⭐ |
| 56 | + |
| 57 | +Confetti's constraint eviction logic (`Coordinator.garbageCollectConstraints()`) can be adapted if memory becomes a bottleneck during large model transformation analyses. |
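
One possible shape for such eviction is sketched below. It is not a port of `Coordinator.garbageCollectConstraints()`: the policy (drop constraints that were attempted several times without a useful result, then trim to a capacity by usefulness) and the names `ConstraintEntrySketch`, `ConstraintStoreSketch`, `solveAttempts`, and `usefulResults` are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical bookkeeping for one stored constraint.
final class ConstraintEntrySketch {
    final String id;
    int solveAttempts;   // how often the solver was asked about this constraint
    int usefulResults;   // how often the answer led to a new path

    ConstraintEntrySketch(String id, int solveAttempts, int usefulResults) {
        this.id = id;
        this.solveAttempts = solveAttempts;
        this.usefulResults = usefulResults;
    }
}

// Hypothetical in-memory store with a simple two-stage eviction policy.
final class ConstraintStoreSketch {
    private final List<ConstraintEntrySketch> entries = new ArrayList<>();
    private final int capacity;

    ConstraintStoreSketch(int capacity) {
        this.capacity = capacity;
    }

    void add(ConstraintEntrySketch e) {
        entries.add(e);
    }

    int size() {
        return entries.size();
    }

    // Stage 1: drop entries tried at least minAttempts times with no useful result.
    // Stage 2: if still over capacity, keep only the most useful entries.
    // Returns the number of evicted entries.
    int garbageCollect(int minAttempts) {
        int before = entries.size();
        entries.removeIf(e -> e.solveAttempts >= minAttempts && e.usefulResults == 0);
        if (entries.size() > capacity) {
            entries.sort(Comparator.comparingInt((ConstraintEntrySketch e) -> e.usefulResults).reversed());
            entries.subList(capacity, entries.size()).clear();
        }
        return before - entries.size();
    }

    public static void main(String[] args) {
        ConstraintStoreSketch store = new ConstraintStoreSketch(2);
        store.add(new ConstraintEntrySketch("a", 5, 0));
        store.add(new ConstraintEntrySketch("b", 1, 0));
        store.add(new ConstraintEntrySketch("c", 2, 3));
        store.add(new ConstraintEntrySketch("d", 4, 1));
        System.out.println(store.garbageCollect(3)); // prints 2
    }
}
```

Keeping the policy behind a single method like this would let the thresholds be tuned, or the whole step disabled, until memory pressure is actually observed.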
| 58 | + |
| 59 | +## Decision Points |
| 60 | + |
| 61 | +Before implementing any Confetti components, answer these questions: |
| 62 | + |
| 63 | +1. **Is automated test input generation a goal?** |
| 64 | + - If yes → Global hinting strategy adds significant value |
| 65 | + - If no (primarily path analysis) → Not needed |
| 66 | + |
| 67 | +2. **Do we need systematic branch exploration?** |
| 68 | + - If yes → Adopt `Branch` class pattern |
| 69 | + - If no → Skip the bookkeeping overhead |
| 70 | + |
| 71 | +3. **Are we hitting memory limits?** |
| 72 | + - If yes → Implement constraint serialization and GC |
| 73 | + - If no → Current approach is sufficient |
| 74 | + |
| 75 | +4. **Is Z3 stability an issue?** |
| 76 | + - If yes → Consider remote Z3 worker for crash isolation |
| 77 | + - If no → Green solver integration is fine |
| 78 | + |
| 79 | +## Current Gaps vs. Confetti |
| 80 | + |
| 81 | +| Aspect | Our Current State | Confetti Addition Possible | |
| 82 | +|--------|------------------|---------------------------| |
| 83 | +| **Knarr Core** | ✅ Migrated to Galette | N/A (complete) | |
| 84 | +| **Path Constraints** | ✅ Automatic collection | Branch state tracking | |
| 85 | +| **Solver Integration** | ✅ Green/Z3 | Remote Z3 worker for stability | |
| 86 | +| **Input Tracking** | ✅ Symbolic values | Controlling bytes analysis | |
| 87 | +| **String Analysis** | ✅ Basic | String hint types | |
| 88 | +| **Exploration** | ❌ Manual | Branch exploration tracking, input scoring | |
| 89 | +| **Scalability** | ⚠️ Memory-bound | Constraint serialization, GC | |
| 90 | + |
| 91 | +## Conclusion |
| 92 | + |
| 93 | +Our current integration is **production-ready for model transformation analysis**. The Confetti components that would add the most value are: |
| 94 | + |
| 95 | +1. **Branch exploration tracking** — for systematic path exploration |
| 96 | +2. **Global hinting strategy** — only if automated test generation becomes a goal |
| 97 | + |
| 98 | +The fuzzing-specific components (Zest integration, multi-process coordinator) are not needed for our model transformation use case. |
| 99 | + |
| 100 | +## References |
| 101 | + |
| 102 | +- [Confetti GitHub Repository](https://github.com/neu-se/confetti) |
| 103 | +- [ICSE 2022 Paper](https://github.com/neu-se/confetti): "CONFETTI: Amplifying Concolic Guidance for Fuzzers" |
| 104 | +- Our existing documentation: `KNARR_INTEGRATION.md`, `knarr-integration-plan.md` |