Core: Trajectory I/O and Parsers
The mdpp.core subpackage handles trajectory loading, atom selection, and parsing of MD engine output files.
Loading Trajectories
Single trajectory
load_trajectory reads a single trajectory file, optionally with an atom selection and a stride.
Use start and stop to skip equilibration or limit the frame range.
Frame selection follows Python's range(start, stop, stride) convention:
from mdpp.core import load_trajectory

# Skip the first 1000 frames, read up to (but not including) frame 5000, every 10th frame
traj = load_trajectory(
    "md.xtc",
    topology_path="topol.gro",
    start=1000,
    stop=5000,
    stride=10,
)
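To check which frames a given start, stop, and stride pick out, the range convention can be tried in plain Python. This is only an illustrative sketch: selected_frames is a hypothetical helper, not part of mdpp, and clipping stop to the file length is an assumption about how out-of-range values are handled.

```python
def selected_frames(n_frames, start=0, stop=None, stride=1):
    """Return the frame indices picked by range(start, stop, stride).

    Hypothetical helper for illustration only. stop is clipped to the
    number of frames in the file (an assumption, not documented behavior).
    """
    if stop is None or stop > n_frames:
        stop = n_frames
    return list(range(start, stop, stride))


# A 10,000-frame file read with the settings from the example above.
frames = selected_frames(10_000, start=1000, stop=5000, stride=10)
print(len(frames))   # 400
print(frames[0])     # 1000
print(frames[-1])    # 4990
```

Because stop is exclusive, frame 5000 itself is not read; the last frame loaded is 4990.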
When atom_selection is provided, only the selected atoms are loaded
(atom indices are passed directly to mdtraj's reader, avoiding a
full-atom load followed by slicing).
Multiple trajectories
from mdpp.core import load_trajectories

trajs = load_trajectories(
    ["run1.xtc", "run2.xtc", "run3.xtc"],
    topology_paths=["topol.gro", "topol.gro", "topol.gro"],
    stride=5,
)
For parallel loading, set max_workers to use multiprocessing.Pool.
Processes are used instead of threads because mdtraj's C-level parsers
hold the GIL during decoding:
trajs = load_trajectories(
    ["run1.xtc", "run2.xtc", "run3.xtc"],
    topology_paths=["topol.gro", "topol.gro", "topol.gro"],
    stride=10,
    max_workers=3,
)
| Method | Time | Speedup | Parent RSS delta |
|---|---|---|---|
| Sequential | 9.7 s | 1.0x | +16.8 MB |
| Threads (6) | 4.5 s | 2.2x | +7.7 MB |
| mp.Pool (6) | 0.9 s | 11.2x | +0.0 MB |
Worker processes allocate trajectory data in their own address space. When the pool closes, that memory is fully released to the OS, leaving zero RSS growth in the parent process.
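The process-based pattern described above can be sketched with the standard library alone. This is not mdpp's implementation: fake_load is a hypothetical stand-in for the real per-file reader, and load_many only illustrates how a max_workers argument could map onto multiprocessing.Pool.

```python
from multiprocessing import Pool


def fake_load(path):
    """Hypothetical stand-in for a per-file trajectory reader."""
    return {"path": path, "name_length": len(path)}  # placeholder payload


def load_many(paths, max_workers=None):
    """Load every path in order; use a process pool when max_workers > 1."""
    if max_workers is None or max_workers <= 1:
        # Sequential fallback: no worker processes are started.
        return [fake_load(p) for p in paths]
    # Pool.map preserves input order, so results line up with paths.
    # The worker function must live at module level so it can be pickled.
    with Pool(processes=max_workers) as pool:
        return pool.map(fake_load, paths)
```

A call such as load_many(["run1.xtc", "run2.xtc", "run3.xtc"], max_workers=3) would normally sit under an if __name__ == "__main__": guard, which multiprocessing requires on platforms that start workers by spawning a fresh interpreter.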
Atom Selection
Selections use MDTraj's atom selection DSL:
from mdpp.core import select_atom_indices
indices = select_atom_indices(traj.topology, "name CA and resid 1 to 50")
Map atom indices back to residue IDs:
from mdpp.core import residue_ids_from_indices
res_ids = residue_ids_from_indices(traj.topology, indices)
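Conceptually, the two helpers filter atoms by a predicate and then look up the residue each selected atom belongs to. The toy model below uses plain tuples instead of an MDTraj topology, so every name in it is hypothetical; it only illustrates the index bookkeeping. In MDTraj's selection language, resid is the zero-based residue index, while resSeq is the residue number stored in the topology file.

```python
# Toy topology: one (atom_name, residue_index) tuple per atom, in atom order.
atoms = [
    ("N", 0), ("CA", 0), ("C", 0),
    ("N", 1), ("CA", 1), ("C", 1),
    ("N", 2), ("CA", 2), ("C", 2),
]

# Equivalent in spirit to the selection "name CA and resid 1 to 2".
ca_indices = [
    i for i, (name, resid) in enumerate(atoms)
    if name == "CA" and 1 <= resid <= 2
]

# Map each selected atom index back to the residue it belongs to.
residues = [atoms[i][1] for i in ca_indices]

print(ca_indices)  # [4, 7]
print(residues)    # [1, 2]
```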
Time Extraction
Get the raw time array in picoseconds:
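MDTraj stores frame times in picoseconds on Trajectory.time. As a small sketch with made-up values (no trajectory file or mdpp call is involved), converting such an array to nanoseconds and recovering the sampling interval could look like this:

```python
# Hypothetical frame times in picoseconds, e.g. 2 ps output read with stride=10.
times_ps = [0.0, 20.0, 40.0, 60.0]

# Convert to nanoseconds for plotting (1 ns = 1000 ps).
times_ns = [t / 1000.0 for t in times_ps]

# Interval between consecutive loaded frames, in picoseconds.
timestep_ps = times_ps[1] - times_ps[0]

print(times_ns)     # [0.0, 0.02, 0.04, 0.06]
print(timestep_ps)  # 20.0
```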
Trajectory Alignment
Superpose all frames onto a reference frame:
from mdpp.core import align_trajectory
aligned = align_trajectory(traj, atom_selection="name CA", reference_frame=0)
Parsing GROMACS Files
XVG files
Parse any GROMACS .xvg file (from gmx energy, gmx rms, etc.):
from mdpp.core import read_xvg
df = read_xvg("rmsd.xvg")
# DataFrame columns are labeled from XVG legend entries
print(df.head())
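The internals of read_xvg are not shown here. As a rough sketch of what XVG parsing involves, the format can be handled with the standard library: lines starting with # are comments, lines starting with @ are Grace plotting directives (including the axis label and series legends), and the remaining lines are whitespace-separated numbers. parse_xvg_text is a hypothetical helper and the sample text is made up for illustration.

```python
import shlex


def parse_xvg_text(text):
    """Parse XVG text into (column_names, rows). Minimal sketch only."""
    x_label = "x"
    legends = {}
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and the '&' end-of-dataset marker.
        if not line or line.startswith("#") or line.startswith("&"):
            continue
        if line.startswith("@"):
            tokens = shlex.split(line[1:])
            # e.g. @ xaxis label "Time (ps)"
            if tokens[:2] == ["xaxis", "label"] and len(tokens) > 2:
                x_label = tokens[2]
            # e.g. @ s0 legend "Backbone"
            elif (len(tokens) > 2 and tokens[1] == "legend"
                  and tokens[0].startswith("s") and tokens[0][1:].isdigit()):
                legends[int(tokens[0][1:])] = tokens[2]
            continue
        rows.append([float(value) for value in line.split()])

    # First column is x; remaining columns take their legend if one exists.
    n_data = len(rows[0]) - 1 if rows else 0
    columns = [x_label] + [legends.get(i, f"y{i}") for i in range(n_data)]
    return columns, rows


sample = """
# illustrative sample, not real data
@    title "RMSD"
@    xaxis  label "Time (ps)"
@    yaxis  label "RMSD (nm)"
@TYPE xy
@ s0 legend "Backbone"
   0.000    0.000
  10.000    0.102
"""

columns, rows = parse_xvg_text(sample)
print(columns)  # ['Time (ps)', 'Backbone']
print(rows)     # [[0.0, 0.0], [10.0, 0.102]]
```

The returned column names and rows map directly onto a table, which is why legend entries make natural DataFrame column labels.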
EDR files
Parse binary energy files using panedr: