Improve extraction accuracy and consistency by optimizing prompts, data formatting, and model selection.
Description:
- Apply prompt engineering techniques including few-shot examples and clearer instruction formatting
- Optimize how preprocessed text is structured and chunked before passing to the LLM
- Strengthen Pydantic validation and implement robust error handling for edge cases
- Benchmark different Ollama models to find the best speed/accuracy tradeoff
Goal: Enhance extraction reliability and performance across diverse paper formats while balancing computational efficiency.
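The prompt-engineering item could look roughly like this in practice: a helper that assembles a clear instruction, a few worked input/output examples, and the target text into one prompt. The function name and the extracted fields (title, authors, year) are illustrative, not taken from the codebase.

```python
def build_extraction_prompt(text: str, examples: list[tuple[str, str]]) -> str:
    """Assemble a few-shot extraction prompt: instruction first, then
    worked examples, then the target text. Field names are illustrative."""
    instruction = (
        "Extract the title, authors, and publication year from the paper "
        'text below. Respond with JSON only, using the keys "title", '
        '"authors", and "year".'
    )
    shots = "\n\n".join(f"Text:\n{src}\nJSON:\n{out}" for src, out in examples)
    return f"{instruction}\n\n{shots}\n\nText:\n{text}\nJSON:\n"

# Example usage with a single few-shot pair
prompt = build_extraction_prompt(
    "Attention Is All You Need. Vaswani et al., 2017.",
    [("Deep Residual Learning. He et al., 2016.",
      '{"title": "Deep Residual Learning", "authors": ["He"], "year": 2016}')],
)
```

Ending the prompt with `JSON:` nudges the model to answer in the same shape as the examples, which tends to stabilize output formatting.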
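For the chunking item, one plausible approach is to split on paragraph boundaries and carry a short character overlap across chunk edges so no field is cut in half at a boundary. The function and its defaults are a sketch, not the existing preprocessing code.

```python
def chunk_text(text: str, max_chars: int = 2000, overlap: int = 200) -> list[str]:
    """Split preprocessed text into chunks no larger than roughly
    max_chars, breaking at paragraph boundaries and repeating the last
    `overlap` characters at the start of the next chunk for context."""
    paragraphs = text.split("\n\n")
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            # carry a small tail of the previous chunk across the boundary
            current = current[-overlap:] + "\n\n" + para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

Chunk size would ultimately be tuned against the context window of whichever Ollama model wins the benchmark.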
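The validation item could take this shape: a Pydantic schema for the expected output plus a parser that catches both malformed JSON and schema violations, returning `None` so one bad response does not abort a batch run. The model name and fields are placeholders for whatever schema the project actually uses.

```python
import json
from typing import Optional

from pydantic import BaseModel, ValidationError


class PaperMetadata(BaseModel):
    """Schema for one extraction result; fields are illustrative."""
    title: str
    authors: list[str]
    year: Optional[int] = None  # tolerate papers with no visible year


def parse_llm_output(raw: str) -> Optional[PaperMetadata]:
    """Validate the model's JSON output; return None instead of raising
    so edge cases are logged and skipped rather than crashing the run."""
    try:
        return PaperMetadata(**json.loads(raw))
    except (json.JSONDecodeError, ValidationError):
        return None
```

Constructing the model via `PaperMetadata(**data)` validates on init in both Pydantic v1 and v2, which keeps the sketch portable across versions.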
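For the model benchmark, a minimal harness can time any generation callable over a prompt set and score exact-match accuracy, producing one comparable row per model. The harness below takes a plain callable so it is testable without a running Ollama server; in practice one would wrap the `ollama` Python client (assumed here), e.g. `lambda p: ollama.generate(model=name, prompt=p)["response"]`.

```python
import time
from statistics import mean


def benchmark_model(generate, prompts, expected):
    """Time `generate` over `prompts` and score exact-match accuracy
    against `expected`, yielding one speed/accuracy row per model.
    `generate` is any str -> str callable (an Ollama wrapper in practice)."""
    latencies, hits = [], 0
    for prompt, truth in zip(prompts, expected):
        start = time.perf_counter()
        output = generate(prompt)
        latencies.append(time.perf_counter() - start)
        hits += output.strip() == truth.strip()
    return {"mean_latency_s": mean(latencies), "accuracy": hits / len(prompts)}


# Example usage with a dummy "model" that uppercases its input
result = benchmark_model(str.upper, ["a", "b"], ["A", "X"])
```

Exact match is a deliberately strict metric; a field-level comparison against the Pydantic schema would give a finer-grained accuracy signal.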