Description
FireForm currently relies on LLM-generated structured outputs from unstructured incident reports. In practice, these outputs are not always consistent: they can be incomplete, slightly malformed, or contain incorrect values.
This can cause failures in downstream steps such as PDF auto-fill and reduces the overall reliability of the pipeline.
This issue focuses on improving the reliability of the LLM → structured JSON → PDF flow.
Rationale
LLM outputs are not guaranteed to strictly follow a schema. Some common issues observed:
- Missing required fields
- Incorrect data types
- Partially structured or noisy responses
Right now, there is no dedicated validation layer to catch or handle these issues before the data is used further.
Adding a validation + scoring layer would help ensure safer and more reliable processing.
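
To make the "partially structured or noisy responses" case concrete, here is a minimal Python sketch of how a JSON object might be recovered from a response that wraps it in prose or a code fence. The helper name `extract_json_object` and the sample response are hypothetical and not taken from the FireForm codebase; the only library call used is the standard library's `json.JSONDecoder.raw_decode`.

```python
import json
from typing import Any, Optional


def extract_json_object(raw: str) -> Optional[dict[str, Any]]:
    """Return the first JSON object embedded in `raw`, or None if none parses.

    Hypothetical helper: it scans for '{' and tries to decode from there,
    so surrounding prose or code fences are ignored.
    """
    decoder = json.JSONDecoder()
    start = raw.find("{")
    while start != -1:
        try:
            value, _end = decoder.raw_decode(raw, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        # Not a parsable object at this position; try the next '{'.
        start = raw.find("{", start + 1)
    return None


# Made-up example of a noisy LLM response (field names are assumptions).
noisy = 'Sure! Here is the data:\n```json\n{"incident_type": "fire", "injuries": "2"}\n```'
parsed = extract_json_object(noisy)
```

Note that the recovered object still carries `injuries` as a string and has no location at all, which illustrates why parsing alone is not enough and a separate validation step is useful.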
Proposed Solution
Introduce a lightweight validation and scoring step after LLM extraction:
- Schema-Based Validation
- Confidence Scoring
- Structured Error Handling
This can be implemented as a modular component in the existing extraction pipeline.
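
As a rough illustration of what such a component could look like, the sketch below combines a schema check, a simple confidence score, and structured error records. Everything in it is an assumption made for illustration: the schema fields, the `ValidationResult` and `validate_extraction` names, the error codes, and the scoring heuristic (the fraction of schema fields that pass their checks) are not part of FireForm.

```python
from dataclasses import dataclass, field
from typing import Any

# Hypothetical schema: field name -> (expected type, required?).
# The real FireForm fields may differ.
INCIDENT_SCHEMA: dict[str, tuple[type, bool]] = {
    "incident_type": (str, True),
    "location": (str, True),
    "injuries": (int, False),
}


@dataclass
class ValidationResult:
    data: dict[str, Any]
    errors: list[dict[str, str]] = field(default_factory=list)
    confidence: float = 0.0

    @property
    def is_valid(self) -> bool:
        return not self.errors


def validate_extraction(
    data: dict[str, Any],
    schema: dict[str, tuple[type, bool]] = INCIDENT_SCHEMA,
) -> ValidationResult:
    """Check `data` against `schema` and collect errors instead of raising."""
    result = ValidationResult(data=data)
    passed = 0
    for name, (expected_type, required) in schema.items():
        value = data.get(name)
        if value is None:
            if required:
                result.errors.append({"field": name, "code": "missing_required"})
            else:
                passed += 1  # an absent optional field is acceptable
            continue
        # bool is a subclass of int, so reject it explicitly for int fields.
        is_bool_as_int = expected_type is int and isinstance(value, bool)
        if not isinstance(value, expected_type) or is_bool_as_int:
            result.errors.append({"field": name, "code": "wrong_type"})
            continue
        passed += 1
    # Illustrative heuristic: share of schema fields that passed their checks.
    result.confidence = passed / len(schema)
    return result


# Made-up extraction with a missing required field and a wrong type.
result = validate_extraction({"incident_type": "fire", "injuries": "2"})
```

With a result like this, the pipeline could decide to skip PDF auto-fill or flag the report for review when `result.is_valid` is false or `result.confidence` falls below some threshold; the threshold itself would need to be chosen by the project.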
Acceptance Criteria
Additional Context
This would be an initial implementation focused on improving robustness.
It can later be extended with more advanced validation rules or human-in-the-loop correction if needed.
This can serve as a foundation for improving extraction quality during the GSoC development phase.