Data Operations Pipeline Validator

Small Python scripts that generate sample pipeline output data and run daily load quality checks on it.

Files

csv_data.py: Creates a sample pipeline_output.csv file for testing.
pipeline_validator.py: Validates today's records for duplicates, nulls, and status counts, then writes a report.
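
As a rough illustration of what a generator like csv_data.py might do, the sketch below writes a small pipeline_output.csv using only the standard library. The load_date column, the row values, and the function name write_sample_csv are assumptions made for illustration; only customer_id, the status values, and the output filename come from this README.

```python
import csv
from datetime import date, timedelta


def write_sample_csv(path="pipeline_output.csv", today=None):
    """Write a few hypothetical rows, including a duplicate id and a missing value."""
    today = today or date.today()
    yesterday = today - timedelta(days=1)
    t, y = today.isoformat(), yesterday.isoformat()
    # load_date is an assumed column name used to mark the load day.
    rows = [
        {"customer_id": "C001", "status": "SUCCESS", "load_date": t},
        {"customer_id": "C002", "status": "FAILED", "load_date": t},
        {"customer_id": "C002", "status": "SUCCESS", "load_date": t},  # duplicate id
        {"customer_id": "C003", "status": "", "load_date": t},  # missing status
        {"customer_id": "C004", "status": "SUCCESS", "load_date": y},  # not today's load
    ]
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["customer_id", "status", "load_date"])
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Seeding the file with one duplicate, one blank field, and one row from the previous day gives each validator check something to report.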

Requirements

Python 3.8+
pandas

Install dependency:

pip install pandas

Usage

Generate sample CSV:

python csv_data.py

Run validator:

python pipeline_validator.py

Check generated report file:

pipeline_report_<YYYY-MM-DD>.txt
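
The report name follows an ISO date pattern. A minimal sketch of how such a name could be built is shown here; the helper name report_filename is hypothetical, and using the local current date is an assumption about how the validator picks the day.

```python
from datetime import date


def report_filename(day=None):
    """Build a report name of the form pipeline_report_<YYYY-MM-DD>.txt."""
    day = day or date.today()  # assumption: the report is named after the local date
    return f"pipeline_report_{day.isoformat()}.txt"


print(report_filename(date(2024, 1, 5)))  # pipeline_report_2024-01-05.txt
```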

What the validator checks

Records loaded for today
Duplicate customer_id values
Null values per column
Status distribution (for example: SUCCESS, FAILED)
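
The four checks above can be sketched with pandas roughly as follows. This is not the code in pipeline_validator.py: the load_date column, the function name run_checks, and the inline sample data are assumptions for illustration, while customer_id and the SUCCESS/FAILED status values come from this README.

```python
import io
from datetime import date

import pandas as pd


def run_checks(df, today, date_col="load_date"):
    """Return summary figures for the rows loaded on `today`."""
    # Keep only today's rows; date_col is an assumed column of ISO date strings.
    todays = df[df[date_col] == today.isoformat()]
    return {
        "records_today": len(todays),
        # duplicated() flags every repeat after the first occurrence of an id.
        "duplicate_customer_ids": int(todays["customer_id"].duplicated().sum()),
        "nulls_per_column": todays.isnull().sum().to_dict(),
        # value_counts() ignores missing statuses by default.
        "status_counts": todays["status"].value_counts().to_dict(),
    }


# Hypothetical sample data standing in for pipeline_output.csv.
sample = io.StringIO(
    "customer_id,status,load_date\n"
    "C001,SUCCESS,2024-01-15\n"
    "C002,FAILED,2024-01-15\n"
    "C002,SUCCESS,2024-01-15\n"
    "C003,,2024-01-15\n"
    "C004,SUCCESS,2024-01-14\n"
)
df = pd.read_csv(sample)
result = run_checks(df, date(2024, 1, 15))
print(result)
```

With this sample, four rows fall on the chosen day, one customer_id repeats, and one status is null. Counting repeats rather than distinct duplicated ids is a design choice; a real validator might report the affected ids instead.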

Background

Based on production operations experience supporting AWS and GCP data pipelines at Accenture for global enterprise clients. The validation logic mirrors the day-to-day data quality checks performed during live pipeline monitoring for Ingredion (GCP/Airflow) and Essilor (AWS batch pipeline).
