[DMP 2026]: Logs-to-training pipeline for agentic setups

### Ticket Contents

## Description
Build a pipeline that ingests production logs (question answering and agentic setup sessions), removes personally identifiable information and other sensitive content, and produces training datasets suitable for improving a language model used in that setup. Agentic portions must be treated as trajectory-oriented behavior cloning (state, tool actions, observations, multi-step recovery), not only single-turn instruction pairs. Exports must support supervised fine-tuning with LoRA and Direct Preference Optimization, and the data design should support eventually training a smaller model to replace a larger teacher model while preserving behavior. The work includes defining schemas, tagging compositional trajectory complexity, diversity and scheduling hooks for training, and validation so trajectories are consistent with tool results where possible.


### Goals & Mid-Point Milestone

## Goals
- [ ] log event schema and parsers for Q&A and agentic traces (user, assistant, tool call, tool result, errors).
- [ ] PII detection, redaction or placeholder replacement, and audit sampling workflow with documented residual risk.
- [ ] Export paths for SFT (LoRA-ready) JSONL and DPO JSONL aligned to one chat template and inference format.
- [ ] Trajectory tagging for compositional complexity (steps, tools, ambiguity, error recovery) and stratified sampling setup for diversity 
- [ ] Goals Achieved By Mid-point Milestone: working end-to-end prototype on a sampled log subset—PII-stripped JSONL for SFT and a small validated DPO pair set; documented schemas; gold-set alignment for one representative agent workflow; automated checks for schema, tool name validity, and basic trajectory consistency.


### Setup/Installation

NA


### Expected Outcome

A documented, repeatable pipeline runnable against designated log sources. It normalizes logs, segments Q&A versus agent trajectories, applies configurable PII rules and review hooks, and emits versioned datasets. SFT exports are valid multi-turn chat or completion records matching production templates including tool syntax. DPO exports provide shared prompts with chosen and rejected completions from governed sources (feedback, failure pairs, or approved synthetics). Each row or shard carries metadata for trajectory complexity and domain so trainers can apply staged mixtures or model-aware sampling without ad-hoc rewrites. Evaluation hooks or scripts exist to compare teacher and student checkpoints on held-out behavioral and tool-use checks. Student-training filters respect smaller context and tool sets.


### Acceptance Criteria

- No training artifact ships without passing the configured PII pipeline and a documented audit sample.
- SFT JSONL validates against the chosen trainer dry run (LoRA) on toy and production-shaped samples without template mismatch.
- DPO JSONL validates against a small DPO dry run with required prompt, chosen, and rejected fields.
- Agent trajectories excluded or flagged when tool calls contradict observations or fail schema validation.
- Documentation lists field definitions, split strategy (no near-duplicate leakage across train and preference sets), and how complexity tags map to recommended training schedules.
- Clear criteria defined for when a smaller replacement model is acceptable relative to the teacher on the agreed eval set.


### Implementation Details

Python-first CLI or service modules for ingest, transform, and export. Rule-based and model-assisted PII detection as appropriate; consistent synthetic placeholders. JSONL as primary interchange. Optional integration with Hugging Face datasets, PEFT, and TRL-style trainers for validation only unless scope expands. Tagging pipeline computes step counts, tool sets, recovery flags, and optional loss-based hardness when a reference model is available. No storage of raw secrets in config or outputs. Tests for redaction, schema validation, and split integrity.


### Mockups/Wireframes

Not applicable.


### Product Name

OpenAgriNet

### Organisation Name

COSS

### Domain

⁠Agriculture

### Tech Skills Needed

Python

### Mentor(s)

@Gautam-Rajeev 

### Category

Data Science

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[DMP 2026]: Logs-to-training pipeline for agentic setups #1

Ticket Contents

Description

Goals & Mid-Point Milestone

Goals

Setup/Installation

Expected Outcome

Acceptance Criteria

Implementation Details

Mockups/Wireframes

Product Name

Organisation Name

Domain

Tech Skills Needed

Mentor(s)

Category

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[DMP 2026]: Logs-to-training pipeline for agentic setups #1

Description

Ticket Contents

Description

Goals & Mid-Point Milestone

Goals

Setup/Installation

Expected Outcome

Acceptance Criteria

Implementation Details

Mockups/Wireframes

Product Name

Organisation Name

Domain

Tech Skills Needed

Mentor(s)

Category

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions