Consolidate and Fix Both Pipeline Entry Points #58

**Problem:**
The two entry-point scripts, classify_extract.py (the main pipeline) and extract-from-txt.py (the secondary pipeline), duplicate their CSV-writing and row-building logic, apply inconsistent preprocessing steps, and use different default model names. The secondary pipeline runs text cleaning and section filtering that the main pipeline lacks, so the main pipeline is effectively the less capable of the two.

**Tasks:**
- Fold the text_cleaner.py and section_filter.py preprocessing steps from extract-from-txt.py into classify_extract.py
- Consolidate the duplicated CSV-writing and row-building logic into a shared utility
- Expose a --skip-classifier flag on classify_extract.py
- Align default model names across both scripts
- Mark extract-from-txt.py as deprecated pending removal
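
As a starting point for discussion, the shared utility and the new flag could look roughly like the sketch below. Everything named here is an assumption rather than existing code: the helper names `build_row`, `write_rows`, and `build_parser`, the `FIELDNAMES` columns, and the `DEFAULT_MODEL` placeholder are all hypothetical, since this issue does not state the real column set or model names. Only the `--skip-classifier` flag name comes from the task list.

```python
import argparse
import csv
from pathlib import Path
from typing import Iterable, Mapping, Sequence

# Hypothetical placeholder: the real default model name is not stated in this issue.
# Defining it once lets both entry points import the same value.
DEFAULT_MODEL = "example-model"

# Hypothetical column set; the real scripts define their own fields.
FIELDNAMES = ("source_file", "field", "value")


def build_row(source_file: str, field: str, value: object) -> dict:
    """Build one CSV row as a dict, rendering a missing value as an empty string."""
    return {
        "source_file": source_file,
        "field": field,
        "value": "" if value is None else str(value),
    }


def write_rows(
    path,
    rows: Iterable[Mapping[str, str]],
    fieldnames: Sequence[str] = FIELDNAMES,
) -> int:
    """Write a header plus all rows to `path` and return the number of rows written."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    count = 0
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(fieldnames))
        writer.writeheader()
        for row in rows:
            writer.writerow(row)
            count += 1
    return count


def build_parser() -> argparse.ArgumentParser:
    """Argument parser sketch for classify_extract.py with the proposed flag."""
    parser = argparse.ArgumentParser(description="Classify and extract (sketch).")
    parser.add_argument("input", help="Path to the input file or directory.")
    parser.add_argument("--model", default=DEFAULT_MODEL, help="Model name to use.")
    parser.add_argument(
        "--skip-classifier",
        action="store_true",
        help="Skip the classification step and run extraction only.",
    )
    return parser
```

With this shape, `build_parser().parse_args(["paper.txt", "--skip-classifier"])` yields `skip_classifier=True`, and both scripts would call `write_rows` instead of carrying their own copies of the CSV code.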

**Context:**
Once this is complete, both pipelines should produce equivalent output quality on the same input, and the duplicate code paths should be eliminated. Source: [classify_extract.py](https://github.com/NovakLabOSU/FracFeedExtractor/blob/main/classify_extract.py) and [extract-from-txt.py](https://github.com/NovakLabOSU/FracFeedExtractor/blob/main/extract-from-txt.py)
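
One way to check the equivalence goal during the migration is to run both scripts on the same input and compare their CSV outputs. The sketch below is only a suggestion: `load_csv` and `csv_outputs_match` are hypothetical helper names, and exact row equality is a stricter stand-in for "equivalent output quality" that assumes both scripts emit the same columns in the same order.

```python
import csv
from collections import Counter
from pathlib import Path
from typing import List, Optional, Tuple


def load_csv(path) -> Tuple[Optional[List[str]], Counter]:
    """Return the header row and a multiset of the data rows from a CSV file."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)  # None if the file is empty
        rows = Counter(tuple(row) for row in reader)
    return header, rows


def csv_outputs_match(path_a, path_b) -> bool:
    """True if both files share a header and contain the same rows, in any order."""
    header_a, rows_a = load_csv(path_a)
    header_b, rows_b = load_csv(path_b)
    return header_a == header_b and rows_a == rows_b
```

Using a `Counter` keeps duplicate rows significant while ignoring row order, so a difference in processing order between the two scripts would not be reported as a mismatch.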
