CTD Cast Processor

A full-stack Seabird CTD processing pipeline: raw .cnv → CF-1.8 NetCDF with TEOS-10 derived variables, rigorous quality control, and IOOS compliance validation.

T–S diagram with TEOS-10 isopycnals — four Gulf of Mexico CTD casts colored by pressure

Overview
Feature Summary
System Architecture
Installation
Quick Start
Step-by-Step: How the Pipeline Works
CLI Reference
Python API Reference
Quality Control Algorithms
TEOS-10 Derived Variables
Output Formats
Visualizations
Testing
Configuration & Tuning
Debugging Guide
Performance & Optimization
Standards Compliance
Project Structure
Sample Data
References

Overview

CTD (Conductivity-Temperature-Depth) profilers are the workhorses of physical oceanography. Every research cruise deploys them hundreds of times. The raw output — Seabird's .cnv format — requires a standardized processing chain before the data can be archived, shared, or used in models.

This project implements that complete chain:

Raw .cnv (Seabird binary/ASCII)
  → parsed header + data ingestion
  → loop editing (remove pressure reversals)
  → MAD spike detection
  → 1-dbar pressure bin averaging
  → TEOS-10 derived variables (GSW)
  → CF-1.8 / ACDD-1.3 NetCDF export
  → IOOS compliance validation
  → T-S diagrams + profile visualizations

The result is archive-ready, FAIR-compliant NetCDF files with complete provenance, fully auditable from raw instrument output.

Feature Summary

Category	What it does
Ingest	Parse Seabird `.cnv` format — header metadata + data block; seabird-package or manual fallback
Loop Edit	Detect and remove upcast pressure reversals (ship heave)
Despike	Rolling Median Absolute Deviation (MAD) spike detection on T and C
Bin Average	Standard 1-dbar pressure bins per SBE Application Note 51
TEOS-10	Absolute Salinity, Conservative Temperature, potential density anomaly, sound speed (gsw)
NetCDF Export	CF-1.8 + ACDD-1.3 compliant, compressed, with full provenance history
CSV Export	Flat CSV with per-variable QC flag columns (`_qc` suffix)
Metadata	JSON sidecar with processing log, QC summary, instrument info
Visualizations	T-S diagrams with σ₀ contours; 3-panel profile plots; multi-cast sections
Compliance	IOOS Compliance Checker + manual CF/ACDD attribute checks
CLI	Click-based commands: `process`, `batch`, `plot-ts`, `check-compliance`
Testing	20+ pytest tests covering I/O, QC, derivations, validation

System Architecture

ctd-cast-processor/
│
├── ctd_processor/              # Core package
│   ├── __init__.py             # Public API: CTDProfile, QCReport
│   ├── core.py                 # CTDProfile class (ingestion, QC, export)
│   ├── qc.py                   # QCReport dataclass + detection algorithms
│   ├── utils.py                # Timestamp helpers, range validators
│   ├── visualization.py        # T-S, profile, section plots
│   ├── compliance.py           # IOOS/CF/ACDD compliance checks
│   └── cli.py                  # Click CLI entry point
│
├── tests/                      # pytest test suite
│   ├── test_core.py            # CTDProfile init, I/O, processing
│   ├── test_qc.py              # Spike detection, density inversions, QC report
│   └── test_utils.py           # Timestamps, range validators
│
├── sample_data/
│   ├── raw/                    # Synthetic .cnv files (Gulf of Mexico profiles)
│   └── generate_sample_cnv.py  # Script to regenerate synthetic data
│
├── docs/
│   └── METHODS.md              # Algorithm deep-dive (math + references)
│
├── .github/workflows/ci.yml   # CI: lint → test (3.10/3.11/3.12) → build
├── pyproject.toml              # Package metadata and dependencies
├── Makefile                    # Developer convenience targets
└── run.sh                      # One-command pipeline runner

Data flow between modules

CLI (cli.py)
  └─► CTDProfile.from_seabird_cnv()  ← core.py
         ├─► manual CNV parser  (fallback)
         └─► seabird.cnv.fCNV  (if installed)

CTDProfile.apply_qc()
  ├─► _loop_edit()        → flags pressure reversals
  ├─► _despike()          → MAD spike detection  ← qc.py detect_spikes_mad()
  └─► _bin_average()      → 1-dbar bins

CTDProfile.calculate_derived_parameters()
  └─► gsw.*               → SA, CT, rho, sigma0, sound_speed

CTDProfile.to_netcdf()   ← xarray + netCDF4
CTDProfile.to_csv()
CTDProfile.save_metadata()

CTDVisualizer.plot_ts_diagram()  ← visualization.py
ComplianceChecker.check_ioos_compliance()  ← compliance.py

Installation

Prerequisites

Python 3.10, 3.11, or 3.12
System libraries for NetCDF4: libhdf5-dev libnetcdf-dev (Linux) or brew install hdf5 netcdf (macOS)

Standard install

git clone https://github.com/ranjithguggilla/ctd-cast-processor
cd ctd-cast-processor
pip install -e ".[dev]"

Verify installation

ctd-processor --help
pytest -v tests/

Optional extras

# Advanced CNV parser (handles edge-case headers better)
pip install seabird

# IOOS compliance checker
pip install ioos-compliance-checker

Quick Start

One command (uses sample data)

./run.sh
# or: make process

Single cast (Python)

from ctd_processor import CTDProfile
from ctd_processor.visualization import CTDVisualizer

# 1. Load raw CNV
ctd = CTDProfile.from_seabird_cnv(
    "sample_data/raw/cast_001.cnv",
    cruise="GOMECC-4",
    vessel="R/V Ronald H. Brown"
)

# 2. Apply standard QC
ctd.apply_qc(
    loop_edit_threshold=0.02,   # temp diff threshold for loop flagging
    despike_threshold=3.0,       # MAD multiplier
    bin_size=1.0                 # dbar
)

# 3. TEOS-10 derived variables
ctd.calculate_derived_parameters()

# 4. Export
ctd.to_netcdf("output/cast_001.nc")
ctd.to_csv("output/cast_001_processed.csv")
ctd.save_metadata("output/cast_001_metadata.json")

# 5. Visualize
viz = CTDVisualizer()
viz.plot_profile(ctd, output_path="output/cast_001_profile.png")

Step-by-Step: How the Pipeline Works

Visual overview: The figure below shows the raw 24 Hz scan data (left), after MAD spike removal (centre), and the final 1-dbar bin-averaged output (right) — all from a single 800 m cast.

Step 1 — CNV ingestion

File: ctd_processor/core.py → CTDProfile.from_seabird_cnv()

The Seabird .cnv file has two sections separated by *END*:

* Sea-Bird SBE 9/11 CTD
* Cast ID: cast_001
* Cruise: GOMECC-4
* Latitude: 27.50
* Longitude: -96.50
* Start time: 2024-04-12T14:23:00
*END*
   0.0  28.5123  54.2145  35.021  7.51
   1.0  28.5098  54.2100  35.019  7.49
   ...

The parser:

Reads all lines; finds *END* index
Scans header lines for latitude, longitude, start_time
Reads data lines as whitespace-delimited floats into a numpy array
Assigns column names: pressure, temperature, conductivity, then var_3, var_4, … for additional columns
Stores raw data in CTDProfile.raw_data; CTDProfile.data is a copy that gets mutated through QC steps

If the seabird package is installed, it is tried first (handles edge cases like multi-channel headers). The manual parser is the guaranteed fallback.

What you get: profile.data — a DataFrame with one row per raw scan at native 24 Hz sampling.

Step 2 — Loop edit

File: ctd_processor/core.py → CTDProfile._loop_edit()

Ship heave causes the CTD to momentarily ascend during a downcast, creating a pressure loop. These duplicated depth ranges produce erroneous T/S profiles.

Algorithm:

for i in range(1, len(pressure) - 1):
    if pressure[i] < pressure[i-1] and pressure[i] < pressure[i+1]:
        if abs(temp[i] - temp[i-1]) > threshold:
            bad[i-1:i+1] = True   # QC flag 3

Flags are stored in profile.qc_flags[column] — one integer array per variable (1 = good, 2 = suspicious, 3 = bad).

Step 3 — Despike (MAD)

File: ctd_processor/core.py → CTDProfile._despike() File: ctd_processor/qc.py → detect_spikes_mad()

Rolling window MAD test:

For each point i:
    window = data[i-w : i+w+1]
    median_i = median(window)
    MAD_i    = median(|window - median_i|)
    if |data[i] - median_i| > k * MAD_i:
        flag as spike (QC flag 2)

Applied to: temperature, conductivity
Default: window=3, k=3.0

MAD is preferred over standard deviation because a single large spike cannot inflate the reference statistic — MAD is bounded by the spike itself.

Step 4 — Pressure bin averaging

File: ctd_processor/core.py → CTDProfile._bin_average()

Reduces 24 Hz raw data to standard 1-dbar levels for archival:

bins = np.arange(int(p_min), int(p_max) + bin_size, bin_size)
for each bin [n, n+1):
    row = mean of all data points in this pressure range
    row["pressure"] = n   # bin centre

After binning, QC flags are reset to 1 (good) for all binned columns — the averaging process suppresses isolated noise below the detection floor.

What you get: profile.data reduced from ~24,000 rows (1000 m cast at 24 Hz) to ~1000 rows (one per dbar).

Step 5 — TEOS-10 derived variables

File: ctd_processor/core.py → CTDProfile.calculate_derived_parameters()

Uses the gsw (Gibbs Seawater) library:

# Practical salinity from conductivity (mS/cm), temperature, pressure
SP = gsw.SP_from_C(conductivity, temperature, pressure)

# Absolute salinity (accounts for seawater composition anomalies)
SA = gsw.SA_from_SP(SP, pressure, longitude, latitude)

# Conservative temperature (heat content per unit mass)
CT = gsw.CT_from_t(SA, temperature, pressure)

# In-situ density and potential density anomaly
rho   = gsw.rho(SA, CT, pressure)
sigma0 = rho - 1000            # potential density anomaly

# Speed of sound
c = gsw.sound_speed(SA, CT, pressure)

All results are added as new columns in profile.data and listed in profile.derived_params.

Step 6 — CF-1.8 / ACDD-1.3 NetCDF export

File: ctd_processor/core.py → CTDProfile.to_netcdf()

Uses xarray to build the Dataset, then writes with netCDF4 (zlib, complevel=4):

ds = xr.Dataset(
    data_vars={
        "pressure":    (["obs"], data["pressure"].values),
        "temperature": (["obs"], data["temperature"].values),
        ...
    }
)
ds["pressure"].attrs = {
    "standard_name": "sea_water_pressure",
    "units": "dbar",
    "long_name": "Sea water pressure",
}
ds.attrs = {
    "Conventions": "CF-1.8, ACDD-1.3",
    "title": f"CTD Profile {cast_id}",
    "history": " → ".join([p["action"] for p in processing_log]),
    ...
}

Step 7 — IOOS compliance validation

File: ctd_processor/compliance.py

Two-stage check:

IOOS Compliance Checker (external tool via subprocess): validates CF-1.8 and ACDD-1.3 rules if compliance-checker is installed.
Manual CF checks: verifies Conventions global attribute, standard_name and units on required variables.

ctd-processor check-compliance output/cast_001.nc

CLI Reference

Usage: ctd-processor [OPTIONS] COMMAND [ARGS]...

Commands:
  process           Process a single CTD cast (.cnv → NC/CSV/JSON)
  batch             Batch process all .cnv files in a directory
  plot-ts           Generate T-S diagram from processed casts
  check-compliance  Validate NetCDF files against IOOS/CF standards

`process`

ctd-processor process INPUTFILE [OPTIONS]

  --output, -o TEXT         Output directory            [default: output/]
  --cruise TEXT             Cruise identifier           [default: UNKNOWN]
  --vessel TEXT             Vessel name                 [default: UNKNOWN]
  --mad-threshold FLOAT     Despike MAD threshold       [default: 3.0]
  --bin-size FLOAT          Depth bin size (dbar)       [default: 1.0]

Example:

ctd-processor process sample_data/raw/cast_001.cnv \
  --output output/ \
  --cruise GOMECC-4 \
  --vessel "R/V Ronald H. Brown" \
  --mad-threshold 2.5

`batch`

ctd-processor batch INPUTDIR [OPTIONS]

  --output, -o TEXT         Output directory            [default: output/]
  --cruise TEXT             Cruise identifier
  --pattern TEXT            Glob pattern                [default: *.cnv]
  --parallel, -p INTEGER    Number of parallel workers  [default: 1]

Example:

ctd-processor batch sample_data/raw/ --output output/ --cruise GOMECC-4

`plot-ts`

ctd-processor plot-ts INPUTDIR [OPTIONS]

  --output, -o TEXT         Save plots here (optional)

`check-compliance`

ctd-processor check-compliance FILE [FILE ...]

Python API Reference

`CTDProfile`

class CTDProfile:
    cast_id: str
    cruise: str
    vessel: str
    raw_data: pd.DataFrame     # Raw 24-Hz scans from CNV
    data: pd.DataFrame         # QC'd and derived data (mutated in-place)
    metadata: dict             # Cruise/instrument metadata
    qc_flags: dict[str, ndarray]  # Per-variable integer flag arrays
    derived_params: list[str]  # Names of computed TEOS-10 variables
    processing_log: list[dict] # Timestamped action history

Class methods

CTDProfile.from_seabird_cnv(filepath, cruise="UNKNOWN", vessel="UNKNOWN")
    # → CTDProfile
    # Loads a .cnv file; tries seabird package first, then manual parser

Instance methods

profile.apply_qc(
    loop_edit_threshold=0.02,  # Temperature diff threshold for loop flagging
    despike_window=3,          # MAD rolling window half-width
    despike_threshold=3.0,     # MAD multiplier
    bin_size=1.0               # Pressure bin size (dbar)
)
# Mutates profile.data and populates profile.qc_flags

profile.calculate_derived_parameters()
# Adds columns: salinity_practical, salinity_absolute,
#               temperature_conservative, density_potential_anomaly, sound_speed

profile.to_netcdf(filepath, compress=True)
# Writes CF-1.8 + ACDD-1.3 NetCDF-4 with zlib compression

profile.to_csv(filepath)
# Writes CSV with QC flag columns (_qc suffix per variable)

profile.save_metadata(filepath)
# Writes JSON with profile metadata, processing_log, qc_summary

`QCReport`

@dataclass
class QCReport:
    cast_id: str
    total_observations: int
    good_count: dict[str, int]
    suspicious_count: dict[str, int]
    bad_count: dict[str, int]
    density_inversion_found: bool

# Generate:
from ctd_processor.qc import calculate_qc_report
report = calculate_qc_report(cast_id, data_df, qc_flags_dict)

`CTDVisualizer`

from ctd_processor.visualization import CTDVisualizer

viz = CTDVisualizer(figsize=(12, 8))
viz.plot_ts_diagram(profiles, output_path="ts.png", show=False)
viz.plot_profile(profile, output_path="profile.png", show=False)
CTDVisualizer.plot_section(profiles, output_path="section.png", show=False)

Standalone QC functions

from ctd_processor.qc import detect_spikes_mad, detect_density_inversions

spikes = detect_spikes_mad(series, window=3, threshold=3.0)
# → np.ndarray of bool

inversions, max_inv = detect_density_inversions(
    pressure, salinity, temperature, threshold=0.1
)
# → (np.ndarray of bool, float)

Validation utilities

from ctd_processor.utils import (
    iso8601_timestamp,
    validate_pressure_range,
    validate_temperature_range,
    validate_salinity_range,
)

is_valid, msg = validate_pressure_range(pressure, min_p=0, max_p=6000)
is_valid, msg = validate_temperature_range(temp, min_t=-2, max_t=40)
is_valid, msg = validate_salinity_range(sal, min_s=0, max_s=41)

Quality Control Algorithms

MAD spike detection at 24 Hz resolution — orange triangles are suspicious, red × marks are flagged bad

Loop Edit

Detects pressure reversals — a segment where the instrument briefly ascends before continuing the downcast (caused by ship heave). These segments cause the same depth range to appear twice with different T/S values, corrupting subsequent bin averaging.

Detection logic:

if pressure[i] < pressure[i-1]  (instrument ascending)
   AND pressure[i] < pressure[i+1]  (then descending again)
   AND |temperature[i] - temperature[i-1]| > threshold:
   flag both scans as bad (QC=3)

Why temperature check? Pure pressure reversals below ≈0.02 dbar might be sensor noise rather than real loops. The temperature check confirms whether the reversal crossed a real water mass boundary.

Spike Detection — MAD

The MAD test is more resistant to contamination than standard deviation: a single large spike inflates σ but cannot inflate MAD by more than a bounded amount (the spike minus the median).

Threshold: |x_i - rolling_median| > k × rolling_MAD

Default k=3.0 is a conservative setting. Reduce to k=2.0 for aggressive spike removal in noisy coastal data.

Density Inversion Check

Computed from σ₀ ≈ 1027.6 − 0.5T + 0.78S (linear approximation for speed). Used in QCReport generation and the detect_density_inversions() utility.

The full TEOS-10 calculation is used in calculate_derived_parameters(): σ₀ = gsw.rho(SA, CT, 0) − 1000.

TEOS-10 Derived Variables

TEOS-10 (Thermodynamic Equation of Seawater – 2010) is the international standard for seawater thermodynamics, replacing EOS-80.

Variable	Symbol	Units	Physical Meaning
Practical Salinity	SP	PSU	Electrical conductivity ratio
Absolute Salinity	SA	g/kg	True mass fraction of dissolved salts
Conservative Temperature	CT	°C	Proportional to heat content per unit mass
Potential Density Anomaly	σ₀	kg/m³	Density at surface pressure − 1000
In-situ Density	ρ	kg/m³	Actual density at measurement pressure
Sound Speed	c	m/s	Medwin/Del Grosso formula via gsw

Why TEOS-10?

SA corrects for regional anomalies in seawater composition (Baltic, Arctic)
CT is exactly conserved in mixing (unlike potential temperature θ)
σ₀ is the standard for water mass identification on T-S diagrams

Output Formats

NetCDF-4 (`.nc`)

Primary archival format. Fully CF-1.8 and ACDD-1.3 compliant:

Dimensions: obs = N (one per dbar bin)
Variables:
  pressure                  (obs) [dbar]
  temperature               (obs) [degree_Celsius]
  conductivity              (obs) [S/m]
  salinity_practical        (obs) [1]
  salinity_absolute         (obs) [g/kg]
  temperature_conservative  (obs) [degree_Celsius]
  density_potential_anomaly (obs) [kg/m3]
  sound_speed               (obs) [m/s]

Global attributes:
  Conventions = "CF-1.8, ACDD-1.3"
  title, summary, cruise_id, platform, instrument
  date_created, history, source, references
  geospatial_lat_min/max, geospatial_lon_min/max

CSV (`_processed.csv`)

One row per dbar bin, all derived columns included, plus _qc suffix columns:

pressure,temperature,conductivity,...,pressure_qc,temperature_qc,...
0.0,28.51,54.21,...,1,1,...
1.0,28.50,54.20,...,1,1,...

JSON sidecar (`_metadata.json`)

{
  "profile": {
    "cast_id": "cast_001",
    "cruise": "GOMECC-4",
    "vessel": "R/V Ronald H. Brown",
    "latitude": 27.5,
    "longitude": -96.5,
    "created_at": "2026-05-14T10:00:00Z"
  },
  "processing_log": [
    {"timestamp": "...", "action": "load_seabird_cnv", "rows": 150},
    {"timestamp": "...", "action": "apply_qc", "suspicious_points": 3},
    {"timestamp": "...", "action": "bin_average", "output_rows": 145}
  ],
  "qc_summary": {
    "temperature": {"good": 142, "suspicious": 3, "bad": 0}
  }
}

Visualizations

All plots are generated from real pipeline output using the four synthetic Gulf of Mexico CTD casts in sample_data/raw/.

T-S Diagram

Reveals water masses and mixing. TEOS-10 isopycnal contours (σ₀) overlaid; data points colored by pressure to show the depth structure of each water mass.

viz = CTDVisualizer()
viz.plot_ts_diagram(profiles, output_path="output/ts_diagram.png")

Profile Comparison — All 4 Casts

Four oceanographic variables plotted vs. pressure for all four Gulf of Mexico casts. Each colored line represents one station; shared y-axis makes cross-cast comparison straightforward.

Single-Cast Detail — Cast 001 (Corpus Christi Shelf, 800 m)

Conservative Temperature, Absolute Salinity, σ₀, and sound speed with filled area shading to emphasize the vertical gradient structure.

viz.plot_profile(ctd, output_path="output/cast_001_profile.png")

QC Flag Visualization

MAD spike detection flags at full 24 Hz scan resolution.

Green dots — Good (flag 1)
Orange triangles — Suspicious / spike (flag 2)
Red × — Bad / loop reversal (flag 3)

Processing Pipeline — Before → After

Step-by-step transformation: raw 24 Hz data → QC applied (spikes removed) → 1-dbar bin-averaged archive-ready profile.

Testing

Running tests

# All tests
pytest -v tests/

# Individual modules
pytest tests/test_core.py -v
pytest tests/test_qc.py -v
pytest tests/test_utils.py -v

# With coverage
pytest --cov=ctd_processor --cov-report=term-missing tests/

Test structure

Module	Tests	What is covered
`test_core.py`	6	Profile init, CSV/NetCDF export, QC application, derived params, processing log
`test_qc.py`	5	MAD spike detection, density inversion detection, QC report generation
`test_utils.py`	7	ISO 8601 timestamps, pressure/temperature/salinity range validation

CI matrix

GitHub Actions runs tests on Python 3.10, 3.11, and 3.12 on Ubuntu. A separate build job verifies that python -m build produces a valid sdist and wheel.

Configuration & Tuning

Parameter	Default	Effect
`loop_edit_threshold`	0.02 °C	Lower = more aggressive loop flagging
`despike_window`	3 scans	Larger window = smoother reference median
`despike_threshold`	3.0 σ	Lower = more aggressive spike removal
`bin_size`	1.0 dbar	Reduce to 0.5 for high-resolution output

Coastal vs. open-ocean:

Coastal (high variability): despike_threshold=2.5, loop_edit_threshold=0.05
Open ocean (stable): defaults are appropriate

Debugging Guide

No data loaded

Error: No data loaded

Check that *END* line is present in the CNV file
Ensure data lines contain only whitespace-separated numbers
Run with Python logging: import logging; logging.basicConfig(level=logging.DEBUG)

TEOS-10 derivation fails

Derivation failed: ...

Verify conductivity column is in mS/cm (Seabird standard), not S/m
Typical mS/cm range for seawater: 40–60 mS/cm

NetCDF write fails

FileNotFoundError / PermissionError

Output directory is created automatically; check write permissions
Ensure libhdf5 and libnetcdf4 system libraries are installed

Compliance checker not installed

Warning: compliance-checker not installed

pip install ioos-compliance-checker
Or ignore — manual CF checks in ComplianceChecker.check_cf_conventions() run regardless

Tests fail with import errors

ModuleNotFoundError: No module named 'ctd_processor'

Run pip install -e ".[dev]" from the repo root
Or make dev

Performance & Optimization

Single cast (150 raw samples, 1000 m): < 0.5 s wall time
Batch of 100 casts: ~30 s single-threaded
--parallel N flag uses Python multiprocessing for batch jobs
NetCDF compression (zlib level 4) reduces file size ~60–70% with negligible write overhead

Memory: Each cast is held in memory as a DataFrame (~10–100 KB for typical cast lengths). Batch processing loads one cast at a time.

Standards Compliance

Standard	Scope	Implementation
CF-1.8	NetCDF variable/attribute naming	`standard_name`, `units`, `long_name`, `Conventions` global attr
ACDD-1.3	Dataset discovery metadata	`title`, `summary`, `date_created`, `geospatial_*` attrs
TEOS-10	Seawater thermodynamics	gsw library, SA/CT derivation chain
WOCE	QC flag scale	Integer flags 1–4, 9 per variable
ARGO QC	Spike detection	MAD test inspired by ARGO QC Manual v3.4 §3.4
GO-SHIP	CTD QC procedures	Loop edit, bin average per IOCCP Report 14

Project Structure

ctd-cast-processor/
├── ctd_processor/
│   ├── __init__.py          # CTDProfile, QCReport exports
│   ├── core.py              # Main CTDProfile class
│   ├── qc.py                # QCReport dataclass + algorithms
│   ├── utils.py             # Timestamp, range validators
│   ├── visualization.py     # T-S, profile, section plots
│   ├── compliance.py        # IOOS/CF/ACDD checker
│   └── cli.py               # Click CLI
├── tests/
│   ├── test_core.py
│   ├── test_qc.py
│   └── test_utils.py
├── sample_data/
│   ├── raw/                 # Synthetic .cnv files
│   └── generate_sample_cnv.py
├── docs/
│   └── METHODS.md           # Algorithm reference with equations
├── .github/workflows/
│   └── ci.yml               # CI: lint + test (3.10–3.12) + build
├── pyproject.toml
├── Makefile
├── run.sh
├── CONTRIBUTORS.md
└── LICENSE

Sample Data

Four synthetic Gulf of Mexico CTD profiles (GOMECC-4 style), generated with realistic T/S structures, sensor noise, and intentional spikes for QC testing:

File	Location	Max depth
`cast_001.cnv`	Corpus Christi Shelf (27.5°N 96.5°W)	800 m
`cast_002.cnv`	Matagorda Slope (27.75°N 95.8°W)	1000 m
`cast_003.cnv`	Galveston Approach (28.1°N 94.5°W)	500 m
`cast_004.cnv`	De Soto Canyon (26.8°N 92.0°W)	1500 m

Each profile includes:

Warm surface mixed layer (24–28 °C)
Sharp thermocline at 50–200 m
Cold deep water (~4 °C)
2 artificial spikes in temperature for QC validation

Generate or regenerate:

python sample_data/generate_sample_cnv.py
# or: make sample

References

WOCE Hydrographic Operations Manual (1994).
ARGO Quality Control Manual for CTD and Trajectory Data, v3.4 (2023). https://doi.org/10.13155/33951
IOC, SCOR and IAPSO (2010). TEOS-10. http://www.teos-10.org
McDougall & Barker (2011). Getting started with TEOS-10. SCOR/IAPSO WG127. ISBN 978-0-646-55621-5.
GO-SHIP Repeat Hydrography Manual (2010). IOCCP Report No. 14.
Seabird Scientific (2023). SBE Data Processing User's Manual.
CF Conventions v1.8. http://cfconventions.org/
Attribute Convention for Data Discovery 1.3. https://wiki.esipfed.org/ACDD_1.3
IOOS Compliance Checker. https://github.com/ioos/compliance-checker

License

MIT License — see LICENSE file.

Author

Ranjith Guggilla

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
.github/workflows		.github/workflows
ctd_processor		ctd_processor
docs		docs
sample_data		sample_data
tests		tests
.gitignore		.gitignore
CONTRIBUTORS.md		CONTRIBUTORS.md
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
pyproject.toml		pyproject.toml
run.sh		run.sh

Folders and files

Latest commit

History

Repository files navigation

CTD Cast Processor

Table of Contents

Overview

Feature Summary

System Architecture

Data flow between modules

Installation

Prerequisites

Standard install

Verify installation

Optional extras

Quick Start

One command (uses sample data)

Single cast (Python)

Step-by-Step: How the Pipeline Works

Step 1 — CNV ingestion

Step 2 — Loop edit

Step 3 — Despike (MAD)

Step 4 — Pressure bin averaging

Step 5 — TEOS-10 derived variables

Step 6 — CF-1.8 / ACDD-1.3 NetCDF export

Step 7 — IOOS compliance validation

CLI Reference

process

batch

plot-ts

check-compliance

Python API Reference

CTDProfile

Class methods

Instance methods

QCReport

CTDVisualizer

Standalone QC functions

Validation utilities

Quality Control Algorithms

Loop Edit

Spike Detection — MAD

Density Inversion Check

TEOS-10 Derived Variables

Output Formats

NetCDF-4 (.nc)

CSV (_processed.csv)

JSON sidecar (_metadata.json)

Visualizations

T-S Diagram

Profile Comparison — All 4 Casts

Single-Cast Detail — Cast 001 (Corpus Christi Shelf, 800 m)

QC Flag Visualization

Processing Pipeline — Before → After

Testing

Running tests

Test structure

CI matrix

Configuration & Tuning

Debugging Guide

No data loaded

TEOS-10 derivation fails

NetCDF write fails

Compliance checker not installed

Tests fail with import errors

Performance & Optimization

Standards Compliance

Project Structure

Sample Data

References

License

Author

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

`process`

`batch`

`plot-ts`

`check-compliance`

`CTDProfile`

`QCReport`

`CTDVisualizer`

NetCDF-4 (`.nc`)

CSV (`_processed.csv`)

JSON sidecar (`_metadata.json`)

Packages