TRACER: TRAjectory Clustering & Environment Research
Version 0.9 (SASA-enabled build)
Author: Dr. Alessandro Mariani
Description
TRACER.py is an interactive toolkit for analyzing molecular dynamics (MD) trajectories. It provides RDF-driven cutoff discovery (with an interactive first-minimum picker), cluster analysis (residue COMs + union–find/DSU), multi-type hydrogen-bond analysis with RDF-based distance selection and parallel processing, radius of gyration (Rg) of clusters, system introspection (bond-length matrices and residue counts), and Solvent Accessible Surface Area (SASA) via FreeSASA. The program is menu-driven; no config.ini is used.
Main Features
• Input and setup
  – Auto-detects topology (.prmtop/.top) and trajectory (.mdcrd) in the current directory; otherwise prompts for paths
  – Creates an MDAnalysis Universe and unwraps atoms by fragments
  – Lets you choose a frame interval (e.g., 1 = every frame, 2 = every second frame)
  – Redirects warnings/stderr to analysis.log
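The auto-detection and frame-interval steps can be pictured with a small standard-library sketch. This is not TRACER's own code: the helper names pick_by_suffix and select_frames are hypothetical, and taking the first match in sorted order is an assumption about how ties might be resolved.

```python
import os

# File extensions named in this README.
TOPOLOGY_SUFFIXES = (".prmtop", ".top")
TRAJECTORY_SUFFIXES = (".mdcrd",)

def pick_by_suffix(filenames, suffixes):
    """Return the first name (sorted) whose extension is in suffixes, or None."""
    matches = sorted(
        name for name in filenames
        if os.path.splitext(name)[1].lower() in suffixes
    )
    return matches[0] if matches else None

def select_frames(n_frames, interval):
    """Frame indices kept for a given interval (1 = every frame)."""
    if interval < 1:
        raise ValueError("interval must be a positive integer")
    return list(range(0, n_frames, interval))

if __name__ == "__main__":
    names = os.listdir(".")
    print("topology:", pick_by_suffix(names, TOPOLOGY_SUFFIXES))
    print("trajectory:", pick_by_suffix(names, TRAJECTORY_SUFFIXES))
```

When pick_by_suffix returns None, a tool like this would fall back to prompting for a path, as described above.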
• RDF analysis
  – Averages an InterRDF over the selected frame range
  – Provides g(r) and bin edges; serves as backend for cutoff picking
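To make the frame-averaged g(r) concrete, here is a minimal pure-Python sketch of a pair-distance histogram with ideal-gas normalization. It is only an illustration of the idea: TRACER delegates this work to MDAnalysis InterRDF, while the function name rdf_histogram, the fixed box volume, and the absence of periodic images are all simplifying assumptions made here.

```python
import math

def rdf_histogram(frames, box_volume, r_max, n_bins):
    """Average g(r) between two point sets over frames (no periodic images).

    frames: iterable of (positions_a, positions_b), each a list of (x, y, z).
    Assumes the same number of A and B points in every frame.
    Returns (bin_edges, g) with len(bin_edges) == n_bins + 1.
    """
    dr = r_max / n_bins
    edges = [i * dr for i in range(n_bins + 1)]
    counts = [0.0] * n_bins
    n_frames = 0
    n_pairs = 0
    for pos_a, pos_b in frames:
        n_frames += 1
        n_pairs = len(pos_a) * len(pos_b)
        for a in pos_a:
            for b in pos_b:
                d = math.dist(a, b)
                if d < r_max:
                    # Clamp to guard against floating-point edge cases.
                    counts[min(int(d / dr), n_bins - 1)] += 1
    if n_frames == 0 or n_pairs == 0:
        raise ValueError("need at least one frame with non-empty selections")
    g = []
    for i in range(n_bins):
        shell = 4.0 / 3.0 * math.pi * (edges[i + 1] ** 3 - edges[i] ** 3)
        ideal = n_pairs / box_volume * shell  # pair count expected for an ideal gas
        g.append(counts[i] / (n_frames * ideal))
    return edges, g
```

The returned edges and g correspond to the "g(r) and bin edges" that the cutoff picker consumes.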
• Interactive RDF-based cutoff selection
  – Computes and smooths g(r) from roughly the first 10 frames, and skips the first 10 points to avoid the initial spike
  – Auto-detects the first major peak and the subsequent minimum; you can click to confirm the cutoff (fallback = auto minimum)
  – Reused for cluster and Rg analysis (with an X-factor)
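The automatic part of the picker (smooth, skip the first points, find the first major peak, then walk down to the next minimum) can be sketched as follows. This is an illustrative stand-in, not TRACER's routine: the dependency list suggests the tool uses scipy.signal.find_peaks, whereas this version uses a plain loop, and the window and min_height parameters are hypothetical defaults.

```python
def moving_average(values, window=5):
    """Centered moving average; the window shrinks near the ends."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def first_minimum_after_peak(r, g, skip=10, window=5, min_height=1.0):
    """Return r at the first local minimum after the first major peak.

    The first `skip` points are ignored (initial spike). A "major" peak is
    taken to be a local maximum of the smoothed curve with height of at
    least min_height (an assumption). Returns None if no peak or no
    minimum is found.
    """
    s = moving_average(g, window)
    peak = None
    for i in range(max(skip, 1), len(s) - 1):
        if s[i] >= min_height and s[i - 1] < s[i] >= s[i + 1]:
            peak = i
            break
    if peak is None:
        return None
    i = peak
    while i + 1 < len(s) and s[i + 1] < s[i]:
        i += 1  # keep walking downhill
    if i == len(s) - 1:
        return None  # curve never turned upward again
    return r[i]
```

A cutoff for cluster or Rg analysis would then be the X-factor times the returned value, with the interactive click overriding it when given.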
• Cluster analysis (residue COMs + DSU)
  – Cutoff options: manual; X × molecule length (by a chosen resid); X × RDF first-minimum (interactive)
  – Outputs per-frame cluster counts and final averages
  – Saves distributions and plots (probability-normalized):
    • cluster_analysis_results.txt
    • cluster_size_distributions.csv
    • cluster_size_P_cluster.png (fraction of clusters vs size)
    • cluster_size_F_molecule.png (molecule-weighted fraction vs size)
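The clustering idea (link residue COMs closer than the cutoff, merge them with a union–find structure, then normalize the size histogram two ways) can be sketched like this. It is a simplified illustration: neighbor search here is a brute-force double loop rather than the KD-tree mentioned in the version history, periodic boundaries are ignored, and the names DSU, cluster_sizes, and size_distributions are hypothetical.

```python
import math
from collections import Counter

class DSU:
    """Disjoint-set union with path halving and union by size."""
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]

def cluster_sizes(coms, cutoff):
    """Sorted cluster sizes for one frame of centre-of-mass positions."""
    dsu = DSU(len(coms))
    for i in range(len(coms)):
        for j in range(i + 1, len(coms)):
            if math.dist(coms[i], coms[j]) <= cutoff:
                dsu.union(i, j)
    return sorted(Counter(dsu.find(i) for i in range(len(coms))).values())

def size_distributions(sizes):
    """Return (P_cluster, F_molecule) as dicts keyed by cluster size.

    P_cluster: fraction of clusters having that size.
    F_molecule: fraction of molecules living in clusters of that size.
    """
    hist = Counter(sizes)
    n_clusters, n_molecules = len(sizes), sum(sizes)
    p_cluster = {s: c / n_clusters for s, c in hist.items()}
    f_molecule = {s: s * c / n_molecules for s, c in hist.items()}
    return p_cluster, f_molecule
```

Both distributions sum to 1, which is what "probability-normalized" refers to in the plots listed above.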
• Hydrogen-bond analysis (multi-type, parallel)
  – Define N HB “types” via donor and acceptor selections (MDAnalysis syntax)
  – For each type: max donor–acceptor distance picked via RDF-first-minimum; min distance fixed at 1.0 Å
  – Parallel frame processing; outputs:
    • optimized_hbond_results.csv (per-contact distances/angles per frame)
    • hbond_analysis_with_std_dev.csv (per-type ⟨distance⟩±σ and ⟨angle⟩±σ with sensible rounding)
    • hbond_analysis_results.txt (water-only vs mixed fractions, α and β summary, per-frame counts)
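The per-contact distance and angle reported in the CSV output can be illustrated with a small geometric sketch. The distance window (1.0 Å up to the RDF-picked maximum) follows the description above; the angle threshold angle_min and the function names are assumptions, since this README does not state which angle criterion TRACER applies.

```python
import math

def hbond_geometry(donor, hydrogen, acceptor):
    """Return (donor-acceptor distance, D-H...A angle in degrees)."""
    d_da = math.dist(donor, acceptor)
    v_hd = [d - h for d, h in zip(donor, hydrogen)]     # vector H -> D
    v_ha = [a - h for a, h in zip(acceptor, hydrogen)]  # vector H -> A
    dot = sum(x * y for x, y in zip(v_hd, v_ha))
    norm = math.hypot(*v_hd) * math.hypot(*v_ha)
    cos_theta = max(-1.0, min(1.0, dot / norm))  # clamp rounding noise
    return d_da, math.degrees(math.acos(cos_theta))

def is_hbond(donor, hydrogen, acceptor, d_max, d_min=1.0, angle_min=150.0):
    """Distance window plus a hypothetical minimum D-H...A angle."""
    distance, angle = hbond_geometry(donor, hydrogen, acceptor)
    return d_min <= distance <= d_max and angle >= angle_min
```

Here d_max would come from the RDF first minimum for the given donor and acceptor type.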
• Radius of gyration of clusters
  – Builds clusters at chosen cutoff; computes per-frame average Rg
  – Writes radius_of_gyration_clusters.txt
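For reference, a mass-weighted radius of gyration and its per-frame average over clusters can be sketched as below. Whether TRACER weights by mass, and whether it uses atoms or residue COMs as the points, is not stated here, so both choices are assumptions of this example; the function names are hypothetical.

```python
import math

def radius_of_gyration(positions, masses):
    """Mass-weighted Rg of one cluster: sqrt(sum m_i |r_i - r_com|^2 / sum m_i)."""
    total = sum(masses)
    com = [
        sum(m * p[k] for m, p in zip(masses, positions)) / total
        for k in range(3)
    ]
    msd = sum(
        m * sum((p[k] - com[k]) ** 2 for k in range(3))
        for m, p in zip(masses, positions)
    ) / total
    return math.sqrt(msd)

def mean_cluster_rg(clusters):
    """Average Rg over the clusters of one frame.

    clusters: list of (positions, masses) pairs. Returns None when empty.
    """
    values = [radius_of_gyration(p, m) for p, m in clusters]
    return sum(values) / len(values) if values else None
```

Note that a cluster consisting of a single point has Rg = 0 under this definition, which pulls the frame average down when many monomers are present.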
• SASA (Solvent Accessible Surface Area)
  – Function: perform_sasa_analysis(u, selected_frames)
  – Excludes water (not resname WAT) and computes SASA for the remaining atoms
  – Workflow per selected frame:
    - write non-water atoms to a temporary PDB
    - call FreeSASA (Python bindings) to compute SASA
    - accumulate the total, then average over frames
  – Output:
    • sasa_results.txt, containing the average SASA (excluding water) in Å^2
    • Any FreeSASA stdout/stderr captured and appended to analysis.log
  – Temporary PDBs are cleaned up after use
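The per-frame SASA workflow reduces to "compute one total area per temporary PDB, then average". A minimal sketch of that loop is shown below; it is not the body of perform_sasa_analysis. The helper names sasa_from_pdb and average_sasa are hypothetical, and freesasa is imported lazily so the averaging helper can be exercised without the library installed.

```python
def sasa_from_pdb(pdb_path):
    """Total SASA (in Å^2) of one PDB file via the FreeSASA Python bindings."""
    import freesasa  # lazy import: only needed when a real PDB is processed
    structure = freesasa.Structure(pdb_path)
    result = freesasa.calc(structure)
    return result.totalArea()

def average_sasa(pdb_paths, sasa_of=sasa_from_pdb):
    """Accumulate per-frame SASA values and return their mean.

    sasa_of is injectable so the loop can be checked with a stand-in
    calculator; by default it calls FreeSASA on each path.
    """
    total = 0.0
    count = 0
    for path in pdb_paths:
        total += sasa_of(path)
        count += 1
    if count == 0:
        raise ValueError("no frames selected")
    return total / count
```

In an MDAnalysis-based tool, each temporary PDB could be produced with something like u.select_atoms("not resname WAT").write(path) after moving the trajectory to the desired frame; the exact mechanism TRACER uses is not shown in this README.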
Outputs (files)
• analysis.log
• cluster_analysis_results.txt
• cluster_size_distributions.csv
• cluster_size_P_cluster.png
• cluster_size_F_molecule.png
• optimized_hbond_results.csv
• hbond_analysis_with_std_dev.csv
• hbond_analysis_results.txt
• radius_of_gyration_clusters.txt
• sasa_results.txt
Dependencies
Python 3.x with:
• numpy
• matplotlib
• pandas
• scipy (signal.find_peaks, spatial.cKDTree)
• MDAnalysis (core, analysis.rdf.InterRDF, lib.distances.distance_array)
• freesasa (Python bindings) and a working FreeSASA installation (C library)
Example installation (Python packages):
pip install numpy matplotlib pandas scipy MDAnalysis freesasa
Note: depending on your OS, you may need to install the FreeSASA C library via your package manager (e.g., apt, brew) before pip install freesasa works.
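Before launching the tool, it can help to confirm that the listed packages are importable. The following standard-library sketch is an optional convenience, not part of TRACER; missing_modules is a hypothetical helper, and it only checks that each module can be located, not that it works (for example, that FreeSASA's compiled extension actually loads).

```python
import importlib.util

# Top-level module names taken from the dependency list above.
REQUIRED = ["numpy", "matplotlib", "pandas", "scipy", "MDAnalysis", "freesasa"]

def missing_modules(names):
    """Return the names that cannot be located on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("missing:", ", ".join(missing) if missing else "none")
```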
Usage
1. Place your topology (.prmtop or .top) and trajectory (.mdcrd) in the current directory, or be ready to provide paths.
2. Run: python TRACER.py
   – Choose a frame interval (e.g., 1, 2, 5).
3. Initial introspection prints per-residue bond-length matrices (pm) and residue counts.
4. Menu:
   1. Radial Distribution Function (RDF)
   2. Solvent Accessible Surface Area (SASA)
   3. Hydrogen Bond Analysis
   4. Cluster Analysis
   5. Radius of Gyration
   h. Help for selecting atoms
   6. Exit
5. For SASA: select option 2. The code will exclude water, run FreeSASA on temporary PDBs for the chosen frames, and write sasa_results.txt.
6. For RDF-based cutoffs: an RDF plot appears; click once to set the cutoff or press Enter to accept the auto minimum.
Version History
v0.9 (SASA-enabled build)
  – Added SASA analysis via FreeSASA, excluding water (temporary PDB per frame; average over selected_frames; results in sasa_results.txt; stdout/stderr captured to analysis.log).
  – Retains the v0.9 hydrogen-bond overhaul (multi-type; RDF-picked max distance; parallel processing), improved logging, and cluster/Rg workflows.
v0.8 – Robust interactive RDF cutoff picker; reused for cluster and Rg; improved GUI/backend handling.
v0.7 – Cluster analysis polish; probability-normalized histograms; residue counter and per-residue bond-length matrices.
v0.6 – Refactor (centralized KD-tree queries; unified validators; stronger logging).
v0.5 – Added radius-of-gyration analysis for clusters (radius_of_gyration_clusters.txt).
v0.4 – Generalized RDF module (perform_rdf_analysis with arbitrary selections and frame ranges).
v0.3 – Introduced cluster analysis (KD-tree + DSU; text summary + histogram/plots).
v0.2 – Interactive CLI menu and selection-language help; logging initialization.
v0.1 – Initial prototype (Universe setup and frame iteration; placeholder hooks).
Notes and Limitations
• The SASA routine relies on freesasa and a working FreeSASA installation; if either is missing, SASA will fail.
• Formats: expected Amber-style .prmtop/.top and .mdcrd; convert other formats beforehand.
• Interactive cutoff selection requires a GUI backend that supports mouse clicks (matplotlib ginput).
• Large trajectories can be heavy; increase the frame interval to reduce load.