Core API Reference

Trajectory

`mdpp.core.trajectory`

Trajectory loading and selection helpers based on MDTraj.

`select_atom_indices(topology, selection)`

Return atom indices selected by an MDTraj DSL selection.

Parameters:

Name	Type	Description	Default
`topology`	`Topology`	Trajectory topology.	required
`selection`	`str`	MDTraj selection string (for example, `"name CA"`).	required

Returns:

Type	Description
`NDArray[int_]`	Atom indices matching the selection.

Raises:

Type	Description
`ValueError`	If the selection matches no atoms.

`residue_ids_from_indices(topology, atom_indices)`

Map atom indices to residue sequence IDs.

Parameters:

Name	Type	Description	Default
`topology`	`Topology`	Trajectory topology.	required
`atom_indices`	`NDArray[int_]`	Atom indices to map.	required

Returns:

Type	Description
`NDArray[int_]`	Residue IDs for each atom index.

`trajectory_time_ps(traj, *, timestep_ps=None, dtype=None)`

Return per-frame time values in picoseconds.

Parameters:

Name	Type	Description	Default
`traj`	`Trajectory`	Input trajectory.	required
`timestep_ps`	`float \| None`	Optional fixed timestep to enforce. If provided, generated time values are `np.arange(n_frames) * timestep_ps`.	`None`
`dtype`	`DtypeArg`	Output float dtype. If `None`, uses the package default (see `mdpp.set_default_dtype`).	`None`

Returns:

Type	Description
`NDArray[floating]`	Time array in picoseconds.
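
The `timestep_ps` override described above is plain arithmetic. A minimal sketch with NumPy, where `frame_times_ps` is a hypothetical helper (not part of `mdpp`) and `float32` is an assumed dtype rather than the package default:

```python
import numpy as np


def frame_times_ps(n_frames, timestep_ps, dtype=np.float32):
    """Hypothetical helper: evenly spaced frame times in picoseconds."""
    # Frame i is assigned the time i * timestep_ps, starting at 0 ps.
    return (np.arange(n_frames) * timestep_ps).astype(dtype)


times = frame_times_ps(n_frames=5, timestep_ps=2.0)
print(times)  # five values from 0 ps to 8 ps in steps of 2 ps
```

Enforcing a fixed timestep this way can be useful when the times stored in a trajectory file are missing or inconsistent across replicas.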

`load_trajectory(trajectory_path, *, topology_path=None, start=0, stop=None, stride=1, atom_selection=None)`

Load a single trajectory with optional frame and atom selection.

Frame selection follows Python's `range(start, stop, stride)` convention: `start` is included, `stop` is excluded, and `stride` controls the step size. All three refer to raw frame indices in the trajectory file.

When `atom_selection` is provided, the selected atom indices are passed directly to the underlying MDTraj reader so that only those atoms are read from disk. This avoids loading the full-atom trajectory into memory.

Parameters:

Name	Type	Description	Default
`trajectory_path`	`PathLike`	Path to trajectory file (for example, `.xtc`).	required
`topology_path`	`PathLike \| None`	Optional topology path (for example, `.pdb`).	`None`
`start`	`int`	First raw frame index to load (inclusive). Default is 0.	`0`
`stop`	`int \| None`	Raw frame index at which to stop loading (exclusive). If `None`, read to the end of the file.	`None`
`stride`	`int`	Frame stride (step size). Default is 1.	`1`
`atom_selection`	`str \| None`	Optional MDTraj atom selection string. Matching atoms are loaded directly from disk (no post-load slicing).	`None`

Returns:

Type	Description
`Trajectory`	Loaded trajectory containing only the selected frames and atoms.

Raises:

Type	Description
`ValueError`	If `stride` is less than 1, `start` is negative, or `stop` is not greater than `start`.
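
The frame-selection convention and the validation rules listed under Raises can be illustrated without reading any file. The sketch below uses a hypothetical helper, `selected_frame_indices`, which is not part of `mdpp`. Clipping `stop` to the file length is an assumption of this sketch.

```python
def selected_frame_indices(n_frames, start=0, stop=None, stride=1):
    """Hypothetical helper: raw frame indices kept for a file with n_frames frames."""
    if stride < 1:
        raise ValueError("stride must be >= 1")
    if start < 0:
        raise ValueError("start must be >= 0")
    if stop is not None and stop <= start:
        raise ValueError("stop must be greater than start")
    # stop=None means "read to the end of the file".
    # Clipping stop to n_frames is an assumption of this sketch.
    end = n_frames if stop is None else min(stop, n_frames)
    return list(range(start, end, stride))


print(selected_frame_indices(10, start=2, stop=9, stride=3))  # [2, 5, 8]
print(selected_frame_indices(10, stride=4))  # [0, 4, 8]
```

Because the indices refer to raw frames, `start=2, stride=3` keeps frames 2, 5, 8 of the file rather than every third frame counted from 0.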

`load_trajectories(trajectory_paths, *, topology_paths=None, start=0, stop=None, stride=1, atom_selection=None, max_workers=None)`

Load a list of trajectories with a shared interface.

Frame selection follows Python's `range(start, stop, stride)` convention: `start` is included, `stop` is excluded, and `stride` controls the step size. All three refer to raw frame indices.

When `max_workers` is set, trajectories are loaded in parallel using `multiprocessing.Pool` (process-based parallelism).

Why processes instead of threads

MDTraj's C-level XTC/TRR parsers hold the GIL during frame decoding, so threads cannot run the CPU-bound parsing step concurrently. Benchmarks on 6 replicas (`stride=10`, 1000 frames each, ~5000 atoms) show:

Method	Time	Speedup	RSS delta
Sequential	9.7 s	1.0x	+16.8 MB
Threads (6)	4.5 s	2.2x	+7.7 MB
mp.Pool (6)	0.9 s	11.2x	+0.0 MB

Processes win on both speed and memory. Worker processes allocate trajectory data in their own address space; when the pool closes that memory is fully released to the OS, leaving zero RSS growth in the parent. Threads allocate within the parent and rely on Python's allocator to (possibly) return pages.

Why `multiprocessing.Pool` instead of `ProcessPoolExecutor`

Both perform identically in benchmarks for this workload. `Pool` is chosen for its simpler API (`map` returns results directly) and `maxtasksperchild` support, which can guard against memory leaks from large trajectory allocations.

Parameters:

Name	Type	Description	Default
`trajectory_paths`	`Sequence[PathLike]`	Trajectory paths.	required
`topology_paths`	`Sequence[PathLike \| None] \| None`	Optional topology paths. If provided, must match `trajectory_paths` length.	`None`
`start`	`int`	First raw frame index to load (inclusive). Default is 0.	`0`
`stop`	`int \| None`	Raw frame index at which to stop (exclusive). If `None`, read to the end of each file.	`None`
`stride`	`int`	Frame stride (step size). Default is 1.	`1`
`atom_selection`	`str \| None`	Optional atom selection for slicing.	`None`
`max_workers`	`int \| None`	If set, load trajectories in parallel using processes. The value controls the maximum number of concurrent worker processes. If `None`, trajectories are loaded sequentially.	`None`

Returns:

Type	Description
`list[Trajectory]`	Loaded trajectories in the same order as `trajectory_paths`.

Raises:

Type	Description
`ValueError`	If `topology_paths` length does not match trajectories.
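
The switch between sequential and process-based loading can be sketched with the standard library alone. Everything here is illustrative: `_load_one` is a stand-in that returns a summary tuple instead of reading a file, `load_many` is a hypothetical name, and `maxtasksperchild=1` and the 1000-frame file length are assumed values, not the package's settings.

```python
import multiprocessing as mp


def _load_one(job):
    """Stand-in for a real reader: returns a summary instead of a trajectory."""
    path, start, stop, stride = job
    n_frames_in_file = 1000  # assumed file length, for illustration only
    end = n_frames_in_file if stop is None else min(stop, n_frames_in_file)
    return path, len(range(start, end, stride))


def load_many(paths, *, start=0, stop=None, stride=1, max_workers=None):
    """Hypothetical helper mirroring the sequential/parallel switch."""
    jobs = [(path, start, stop, stride) for path in paths]
    if max_workers is None:
        # Sequential path: no worker processes are started.
        return [_load_one(job) for job in jobs]
    # Pool.map returns results in input order, whichever worker finishes first.
    # maxtasksperchild=1 (assumed value) replaces a worker after each task chunk.
    with mp.Pool(processes=max_workers, maxtasksperchild=1) as pool:
        return pool.map(_load_one, jobs)


if __name__ == "__main__":
    replicas = [f"replica_{i}.xtc" for i in range(3)]
    print(load_many(replicas, stride=10, max_workers=2))
```

The worker function is defined at module level so that it can be pickled and sent to the worker processes. The `__main__` guard keeps the pool from being re-created when a child process imports the module.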

`align_trajectory(traj, *, atom_selection='name CA', reference_frame=0, inplace=False)`

Align a trajectory to a reference frame.

`md.Trajectory.superpose` modifies coordinates in place. When `inplace=False` (the default), only the `xyz` array is copied; topology and time are shared with the original trajectory. This avoids the expensive `deepcopy(topology)` that `traj[:]` performs.

Parameters:

Name	Type	Description	Default
`traj`	`Trajectory`	Input trajectory.	required
`atom_selection`	`str`	Atoms used for alignment.	`'name CA'`
`reference_frame`	`int`	Reference frame index.	`0`
`inplace`	`bool`	If `True`, align `traj` in place and return it. If `False` (default), return a new trajectory that shares topology and time but has its own aligned coordinates.	`False`

Returns:

Type	Description
`Trajectory`	The aligned trajectory.

Raises:

Type	Description
`ValueError`	If `reference_frame` is out of range.
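
The copy semantics of `inplace=False` can be illustrated with a small stand-in, independent of MDTraj. In this sketch `Frames` is a hypothetical container, and centering each frame on its centroid substitutes for the actual superposition. Only the sharing behaviour is meant to match the description above.

```python
from dataclasses import dataclass, replace

import numpy as np


@dataclass
class Frames:
    """Minimal stand-in for a trajectory: coordinates plus shared metadata."""

    xyz: np.ndarray  # shape (n_frames, n_atoms, 3)
    topology: object  # potentially expensive to deep-copy
    time: np.ndarray


def center_frames(frames, *, inplace=False):
    """Stand-in for superposition: shift each frame's centroid to the origin."""
    if not inplace:
        # Copy only the coordinates; topology and time stay shared.
        frames = replace(frames, xyz=frames.xyz.copy())
    frames.xyz -= frames.xyz.mean(axis=1, keepdims=True)  # modifies xyz in place
    return frames


original = Frames(xyz=np.ones((2, 3, 3)), topology={"n_atoms": 3}, time=np.arange(2.0))
centered = center_frames(original)
print(centered.topology is original.topology)  # True: topology is shared
print(np.shares_memory(centered.xyz, original.xyz))  # False: coordinates were copied
```

Since topology and time are shared rather than copied, mutating either on the returned object would also affect the original.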

Parsers

`mdpp.core.parsers`

Thin wrappers around external parsers for MD engine output files.

`read_xvg(path, *, dtype=None)`

Read a GROMACS XVG file into a DataFrame.

Parses metadata lines (lines starting with `@`) to extract column labels from legend entries. Data lines are read with NumPy for performance.

Parameters:

Name	Type	Description	Default
`path`	`StrPath`	Path to a `.xvg` file.	required
`dtype`	`DtypeArg`	Float dtype for the data. If `None`, uses the package default.	`None`

Returns:

Type	Description
`DataFrame`	DataFrame whose first column is typically time and whose remaining columns are labeled from the XVG legend entries (or `"col_0"`, `"col_1"`, etc. when legends are absent).
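
The legend-to-column mapping can be sketched as follows. `parse_xvg_text` is a hypothetical function, not the package's implementation: it returns labels and a NumPy array rather than a DataFrame, and it leaves the first column under its positional name `col_0`, which is an assumption of this sketch.

```python
import io
import re

import numpy as np

# Matches legend entries such as: @ s0 legend "Potential"
LEGEND = re.compile(r'^@\s*s(\d+)\s+legend\s+"(.*)"')


def parse_xvg_text(text, dtype=np.float64):
    """Hypothetical parser: returns (column_labels, data) for XVG-formatted text."""
    legends = {}
    data_lines = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments carry no data
        if stripped.startswith("@"):
            match = LEGEND.match(stripped)
            if match:
                legends[int(match.group(1))] = match.group(2)
            continue
        data_lines.append(stripped)
    data = np.loadtxt(io.StringIO("\n".join(data_lines)), dtype=dtype, ndmin=2)
    # Start from positional names, then overwrite columns that have a legend.
    labels = [f"col_{i}" for i in range(data.shape[1])]
    for series, name in legends.items():
        if series + 1 < len(labels):
            labels[series + 1] = name  # series s0 describes the second column
    return labels, data


sample = """\
# produced by a GROMACS tool
@    title "Energies"
@ s0 legend "Potential"
@ s1 legend "Kinetic En."
0.0  -100.0  25.0
2.0  -101.5  24.5
"""
labels, data = parse_xvg_text(sample)
print(labels)  # ['col_0', 'Potential', 'Kinetic En.']
print(data.shape)  # (2, 3)
```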

`read_edr(path)`

Read a GROMACS EDR energy file into a DataFrame.

Uses `panedr` internally. Install it with `pip install panedr`.

Parameters:

Name	Type	Description	Default
`path`	`StrPath`	Path to a `.edr` file.	required

Returns:

Type	Description
`DataFrame`	DataFrame with a `Time` column and one column per energy term.

Raises:

Type	Description
`ImportError`	If `panedr` is not installed.
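
An optional dependency of this kind is commonly handled by importing it lazily and re-raising with an installation hint. The sketch below shows that pattern with a hypothetical helper, `require_optional`. It is an assumption about how such a check could look, not the package's actual code.

```python
import importlib


def require_optional(module_name, install_hint):
    """Hypothetical helper: import an optional dependency or explain how to get it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is required for this function. {install_hint}"
        ) from exc


try:
    panedr = require_optional("panedr", "Install it with `pip install panedr`.")
except ImportError as err:
    print(err)
```

Deferring the import to call time lets the rest of the module be used without the optional package installed.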
