Preparation API Reference¶

Protein¶

`mdpp.prep.protein` ¶

Protein structure preparation and manipulation utilities.

`PropkaResidue(residue_type, res_num, chain_id, pka, model_pka)` `dataclass` ¶

PROPKA pKa prediction for a single titratable residue.

Attributes:

Name	Type	Description
`residue_type`	`str`	Group label (e.g. `ASP`, `HIS`, `N+`, `C-`).
`res_num`	`int`	Residue sequence number.
`chain_id`	`str`	PDB chain identifier.
`pka`	`float`	PROPKA-predicted pKa value.
`model_pka`	`float`	Reference model pKa value.

`label` `property` ¶

Formatted residue label matching PROPKA output style.

`is_protonated_at(pH)` ¶

Whether PROPKA predicts the residue to be protonated at the given pH.

`is_default_protonated_at(pH)` ¶

Whether the model pKa predicts the residue to be protonated at the given pH.

`PropkaResult(residues)` `dataclass` ¶

PROPKA pKa prediction results for all titratable residues.

Attributes:

Name	Type	Description
`residues`	`tuple[PropkaResidue, ...]`	pKa predictions for each titratable residue.

`get_nonstandard(pH)` ¶

Return residues where PROPKA and model pKa disagree on protonation state.

A residue is "non-standard" when `pka > pH` and `model_pka <= pH` (or vice versa), meaning PDBFixer would assign a different protonation state than the one PROPKA predicts.

Parameters:

Name	Type	Description	Default
`pH`	`float`	pH value for protonation state comparison.	required

Returns:

Type	Description
`tuple[PropkaResidue, ...]`	Residues with non-standard predicted protonation.
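
The comparison described above can be sketched without PROPKA. This is a minimal illustration, not the library's implementation: `ResiduePka` is a hypothetical stand-in for `PropkaResidue`, the "protonated when pKa > pH" rule is taken from the definition of "non-standard" given here, and the pKa values are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResiduePka:
    """Hypothetical stand-in for PropkaResidue (same field names)."""

    residue_type: str
    res_num: int
    chain_id: str
    pka: float
    model_pka: float

    def is_protonated_at(self, pH: float) -> bool:
        # Assumed rule: protonated when the predicted pKa exceeds the pH.
        return self.pka > pH

    def is_default_protonated_at(self, pH: float) -> bool:
        # Same rule, applied to the reference model pKa.
        return self.model_pka > pH


def nonstandard(residues, pH):
    """Keep residues whose predicted and default protonation states differ."""
    return tuple(
        r for r in residues
        if r.is_protonated_at(pH) != r.is_default_protonated_at(pH)
    )


# Illustrative values only, not real PROPKA output.
residues = (
    ResiduePka("ASP", 25, "A", pka=8.1, model_pka=3.8),
    ResiduePka("HIS", 57, "A", pka=6.2, model_pka=6.5),
    ResiduePka("LYS", 12, "A", pka=10.4, model_pka=10.5),
)
flagged = nonstandard(residues, pH=7.0)
```

Only the aspartate is flagged here: its shifted pKa lies above pH 7 while its model pKa lies below it.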

`ChainSelect(chain_ids)` ¶

Bases: Select

Biopython Select subclass that accepts only specified PDB chains.

Parameters:

Name	Type	Description	Default
`chain_ids`	`str \| list[str]`	One or more chain identifiers to keep.	required

Example:

```python
from Bio.PDB import PDBIO, PDBParser
from mdpp.prep import ChainSelect

# Parse the complex, then write only chain A to a new PDB file.
parser = PDBParser(QUIET=True)
structure = parser.get_structure("complex", "complex.pdb")
io = PDBIO()
io.set_structure(structure)
io.save("protein.pdb", ChainSelect("A"))
```

Initialize the ChainSelect object.

Parameters:

Name	Type	Description	Default
`chain_ids`	`str \| list[str]`	The chain IDs to keep.	required

`accept_chain(chain)` ¶

Return 1 if the chain should be kept, 0 otherwise.

`run_propka(pdb_path)` ¶

Run PROPKA to predict pKa values for titratable protein residues.

Parameters:

Name	Type	Description	Default
`pdb_path`	`StrPath`	Path to the input PDB file.	required

Returns:

Type	Description
`PropkaResult`	pKa predictions for all titratable residues found.

`fix_pdb(pdb_path, fixed_pdb_path, pH=7.0, *, protonation='model')` ¶

Fix a PDB file by adding missing residues, atoms, and hydrogens.

Removes heterogens (excluding water by default), identifies missing residues and atoms, then adds them back along with hydrogens at the specified pH.

Runs PROPKA to check for residues whose environment-shifted pKa predicts a different protonation state than the model-pKa default used by PDBFixer, and logs a warning for each such residue.

Parameters:

Name	Type	Description	Default
`pdb_path`	`StrPath`	Path to the input PDB file.	required
`fixed_pdb_path`	`StrPath`	Path where the fixed PDB will be written.	required
`pH`	`float`	pH value for hydrogen placement.	`7.0`
`protonation`	`Literal['model', 'propka']`	Protonation policy. `"model"` (default) uses PDBFixer's built-in model pKa values. `"propka"` keeps the model default for most residues but overrides the residues where PROPKA disagrees (`PropkaResult.get_nonstandard`) with PROPKA's predicted state, applied via OpenMM `Modeller` variants. Supported overrides are ASP/GLU/LYS/HIS/CYS (a neutral histidine uses the HIE tautomer); unsupported residue types (e.g. termini) keep the default and are logged.	`'model'`
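
The `"propka"` policy splits the disagreeing residues into those that receive an override and those that keep the default and are only logged. The sketch below illustrates that split. `SUPPORTED_OVERRIDES` and `partition_overrides` are hypothetical names, and the real function works on `PropkaResidue` objects rather than bare type labels.

```python
# Residue types the documentation lists as supported overrides.
SUPPORTED_OVERRIDES = {"ASP", "GLU", "LYS", "HIS", "CYS"}


def partition_overrides(residue_types):
    """Split residue-type labels into (overridden, logged-only) lists."""
    applied = [t for t in residue_types if t in SUPPORTED_OVERRIDES]
    skipped = [t for t in residue_types if t not in SUPPORTED_OVERRIDES]
    return applied, skipped


# "N+" is a terminus label, which the policy leaves at its default state.
applied, skipped = partition_overrides(["ASP", "N+", "HIS", "TYR"])
```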

`strip_solvent(traj, *, keep_ions=False)` ¶

Remove solvent molecules from a trajectory.

Parameters:

Name	Type	Description	Default
`traj`	`Trajectory`	Input trajectory.	required
`keep_ions`	`bool`	If `True`, retain common ions (Na+, Cl-, K+, etc.) while still removing water.	`False`

Returns:

Type	Description
`Trajectory`	A new trajectory with solvent removed.

`extract_chain(traj, chain_id)` ¶

Extract a single chain from a trajectory.

Parameters:

Name	Type	Description	Default
`traj`	`Trajectory`	Input trajectory.	required
`chain_id`	`int`	Zero-based chain index to extract.	required

Returns:

Type	Description
`Trajectory`	A new trajectory containing only the specified chain.

Raises:

Type	Description
`ValueError`	If `chain_id` is out of range.

Ligand¶

`mdpp.prep.ligand` ¶

Ligand parameterization and topology assignment utilities.

`assign_topology(mol, template_mol)` ¶

Assign bond orders and hydrogens from a template molecule to a ligand.

Uses the template (typically from SMILES) without hydrogens to assign bond orders to the ligand's heavy-atom coordinates, then adds hydrogens with 3D coordinates.

Parameters:

Name	Type	Description	Default
`mol`	`Mol`	The ligand molecule (usually from a PDB/MOL2 with no bond orders).	required
`template_mol`	`Mol`	The reference molecule with correct bond orders.	required

Returns:

Type	Description
`Mol`	A new molecule with assigned bond orders and added hydrogens.

`constraint_minimization(mol, *, max_iters=5000)` ¶

Minimize hydrogen positions while keeping heavy atoms fixed.

Uses the Universal Force Field (UFF) with fixed-point constraints on all non-hydrogen atoms.

Parameters:

Name	Type	Description	Default
`mol`	`Mol`	Input molecule with 3D coordinates (conformer 0).	required
`max_iters`	`int`	Maximum number of minimization iterations.	`5000`

Returns:

Type	Description
`Mol`	The molecule with optimized hydrogen positions.

Topology¶

`mdpp.prep.topology` ¶

Trajectory manipulation utilities for system preparation.

`merge_trajectories(trajectories)` ¶

Concatenate multiple trajectories along the time axis.

All trajectories must share the same topology and number of atoms.

Parameters:

Name	Type	Description	Default
`trajectories`	`Sequence[Trajectory]`	Sequence of trajectories to concatenate.	required

Returns:

Type	Description
`Trajectory`	A single trajectory containing all frames in order.

Raises:

Type	Description
`ValueError`	If fewer than two trajectories are provided or topologies do not match.

`slice_trajectory(traj, *, start=None, stop=None, stride=None)` ¶

Slice a trajectory by frame range with validation.

Parameters:

Name	Type	Description	Default
`traj`	`Trajectory`	Input trajectory.	required
`start`	`int \| None`	Starting frame index (inclusive). Defaults to `0`.	`None`
`stop`	`int \| None`	Stopping frame index (exclusive). Defaults to `n_frames`.	`None`
`stride`	`int \| None`	Frame stride. Defaults to `1`.	`None`

Returns:

Type	Description
`Trajectory`	A new trajectory with the selected frames.
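
As a rough sketch of how the `None` defaults resolve, the helper below maps the three arguments to a range of frame indices. The specific validation rules (positive stride, `0 <= start < stop <= n_frames`) are assumptions made for illustration; the checks the library performs are not specified here.

```python
def resolve_frame_range(n_frames, start=None, stop=None, stride=None):
    """Return the frame indices selected by start/stop/stride."""
    start = 0 if start is None else start
    stop = n_frames if stop is None else stop
    stride = 1 if stride is None else stride
    # Assumed validation rules; the library's checks may differ.
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    if not 0 <= start < stop <= n_frames:
        raise ValueError("expected 0 <= start < stop <= n_frames")
    return range(start, stop, stride)


frames = list(resolve_frame_range(10, start=2, stride=3))
```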

`subsample_trajectory(traj, n_frames)` ¶

Evenly subsample a trajectory to a target number of frames.

Parameters:

Name	Type	Description	Default
`traj`	`Trajectory`	Input trajectory.	required
`n_frames`	`int`	Desired number of output frames.	required

Returns:

Type	Description
`Trajectory`	A new trajectory with approximately `n_frames` evenly spaced frames.

Raises:

Type	Description
`ValueError`	If `n_frames` is less than 1 or exceeds the trajectory length.
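
One way to pick evenly spaced frames is shown below. It is a sketch under stated assumptions: `evenly_spaced_indices` is a hypothetical helper that always returns exactly `n_frames` indices including the first and last frame, whereas the documented function returns "approximately" that many, so its selection scheme may differ (for example, a fixed stride).

```python
def evenly_spaced_indices(total_frames, n_frames):
    """Return n_frames strictly increasing indices spanning 0..total_frames - 1."""
    if n_frames < 1 or n_frames > total_frames:
        raise ValueError("n_frames must be between 1 and the trajectory length")
    if n_frames == 1:
        return [0]
    # Integer arithmetic avoids floating-point rounding surprises.
    return [i * (total_frames - 1) // (n_frames - 1) for i in range(n_frames)]


indices = evenly_spaced_indices(10, 4)
```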

APBS¶

`mdpp.prep.apbs` ¶

APBS Poisson-Boltzmann input generation and log parsing.

Helpers for driving APBS (Adaptive Poisson-Boltzmann Solver) from Python.

Two pure functions are exposed:

- `write_apbs_input` generates a multigrid APBS `.in` file from an existing `.pqr` by deriving the grid bounding box from the radius-inflated atom coordinates and rounding `dime` up to the nearest `c * 2**n + 1` value required by APBS multigrid.
- `infer_debye_length` parses a Debye length (in Angstrom) out of an APBS log; it is used to bootstrap downstream BrownDye input from the same APBS run.

Both functions are pure Python with no subprocess calls; they only read and write text files. Run APBS itself by calling the `apbs` CLI separately.

`write_apbs_input(stem, work_dir, *, ionic_strength_m=DEFAULT_IONIC_STRENGTH_M, solute_dielectric=DEFAULT_SOLUTE_DIELECTRIC, solvent_dielectric=DEFAULT_SOLVENT_DIELECTRIC, solvent_radius_a=DEFAULT_SOLVENT_RADIUS_A, temperature_k=DEFAULT_TEMPERATURE_K, fine_spacing_a=DEFAULT_FINE_SPACING_A, fine_padding_a=DEFAULT_FINE_PADDING_A, coarse_padding_a=DEFAULT_COARSE_PADDING_A)` ¶

Write an APBS multigrid input file `{stem}.in` for `{stem}.pqr`.

Physics defaults mirror the canonical defaults of `pdb2pqr --apbs-input` (`lpbe` / `bcfl sdh` / `srfm smol` / `chgm spl2` / `pdie 2.0` / `sdie 78.54` / `srad 1.4` / `sdens 10` / `swin 0.30` / `temp 298.15`) with two intentional overrides:

- Explicit Na+/Cl- ion lines at `ionic_strength_m` with Pauling radii (1.875 / 1.815 Angstrom). The pdb2pqr canonical input omits ions, which sets the Debye length to infinity. Explicit ions are required when the resulting `.dx` feeds a BrownDye2 simulation that needs a finite Debye length for far-field electrostatics.
- Larger grid padding (`fine_padding_a` / `coarse_padding_a`) than pdb2pqr's `fadd=20` / `cfac=1.7` defaults, so the outer grid comfortably exceeds the BrownDye b-radius. `dime` is rounded up to the nearest `c * 2**n + 1` value required by APBS multigrid.
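
The `dime` rounding can be sketched as follows. This is an illustration rather than the library's code: `round_up_dime` and `grid_points` are hypothetical helpers, and the default `n = 5` is an assumption corresponding to APBS's usual `nlev = 4` (valid sizes 33, 65, 97, 129, ...).

```python
import math


def round_up_dime(min_points, n=5):
    """Smallest value of the form c * 2**n + 1 (c >= 1) that is >= min_points."""
    block = 2 ** n
    c = max(1, math.ceil((min_points - 1) / block))
    return c * block + 1


def grid_points(length_a, spacing_a, n=5):
    """Grid points needed to span length_a at no coarser than spacing_a."""
    needed = math.ceil(length_a / spacing_a) + 1
    return round_up_dime(needed, n)


# A 60 Angstrom axis at 0.5 Angstrom spacing needs 121 points, rounded up to 129.
dime_x = grid_points(60.0, 0.5)
```

Rounding up rather than to the nearest value keeps the effective spacing at or below the requested `fine_spacing_a`.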

Parameters:

Name	Type	Description	Default
`stem`	`str`	PQR file stem (without extension). `work_dir / "{stem}.pqr"` must already exist; the input file is written next to it at `work_dir / "{stem}.in"`.	required
`work_dir`	`StrPath`	Directory containing `{stem}.pqr`.	required
`ionic_strength_m`	`float`	Ion concentration in mol/L (default 0.150 M).	`DEFAULT_IONIC_STRENGTH_M`
`solute_dielectric`	`float`	Interior (solute) dielectric constant.	`DEFAULT_SOLUTE_DIELECTRIC`
`solvent_dielectric`	`float`	Bulk solvent dielectric constant.	`DEFAULT_SOLVENT_DIELECTRIC`
`solvent_radius_a`	`float`	Probe solvent radius in Angstrom.	`DEFAULT_SOLVENT_RADIUS_A`
`temperature_k`	`float`	Simulation temperature in Kelvin.	`DEFAULT_TEMPERATURE_K`
`fine_spacing_a`	`float`	Target fine grid spacing in Angstrom.	`DEFAULT_FINE_SPACING_A`
`fine_padding_a`	`float`	Fine grid padding added to the radius-inflated atom bounding box (per axis).	`DEFAULT_FINE_PADDING_A`
`coarse_padding_a`	`float`	Coarse grid padding added to the radius-inflated atom bounding box (per axis); the coarse grid never shrinks below the fine grid.	`DEFAULT_COARSE_PADDING_A`

Returns:

Type	Description
`Path`	Path to the written `{stem}.in` file.

Raises:

Type	Description
`ValueError`	If `{stem}.pqr` has no ATOM/HETATM records.

`infer_debye_length(*apbs_logs)` ¶

Return the first Debye length (Angstrom) parsed from any APBS log.

Scans logs in argument order; returns as soon as a Debye length is found in any one of them. Missing log files are skipped silently.

Parameters:

Name	Type	Description	Default
`*apbs_logs`	`StrPath`	Paths to one or more APBS log files.	`()`

Returns:

Type	Description
`float`	Debye length in Angstrom.

Raises:

Type	Description
`RuntimeError`	If no log contains a recognisable Debye length entry.
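
A value parsed from the log can be cross-checked against the textbook Debye length of a 1:1 electrolyte. The sketch below is independent of the library: `estimate_debye_length_a` is a hypothetical helper that evaluates the standard formula with CODATA constants, using the APBS `sdie` and `temp` defaults quoted in this section.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
KB = 1.380649e-23         # Boltzmann constant, J/K
NA = 6.02214076e23        # Avogadro constant, 1/mol
E = 1.602176634e-19       # elementary charge, C


def estimate_debye_length_a(ionic_strength_m, solvent_dielectric=78.54,
                            temperature_k=298.15):
    """Debye length in Angstrom for a 1:1 electrolyte of the given molarity."""
    if ionic_strength_m <= 0:
        # No mobile ions: no screening, so the Debye length is infinite.
        return math.inf
    ionic_strength_si = ionic_strength_m * 1000.0  # mol/L -> mol/m^3
    kappa_sq = (2.0 * NA * E**2 * ionic_strength_si) / (
        EPS0 * solvent_dielectric * KB * temperature_k
    )
    return 1e10 / math.sqrt(kappa_sq)  # metres -> Angstrom


length_a = estimate_debye_length_a(0.150)
```

At 0.150 M this gives a little under 8 Angstrom. A parsed value far from that for the same ionic strength suggests the APBS input lacked ion lines or used different solvent settings.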

BrownDye2¶

`mdpp.prep.browndye` ¶

BrownDye2 input.xml and contact_types.xml generation.

Helpers for building the XML inputs consumed by BrownDye2's bd_top:

- `write_contact_types` writes `contact_types.xml` with one entry per unique `(atom_name, residue_name)` heavy-atom pair per body.
- `build_input_xml` and `write_input_xml` produce the top-level `input.xml` from two `BrownDyeBody` descriptors and a shared `BrownDyeSolvent` configuration.

All helpers are pure Python: no subprocess calls, no XML schema validation. Run BrownDye's own tools (`pqr2xml`, `make_rxn_pairs`, `make_rxn_file`, `bd_top`, `nam_simulation`) separately.

The Debye length feeding `BrownDyeSolvent` is typically obtained from an APBS run via `mdpp.prep.apbs.infer_debye_length`.

`BrownDyeBody(name, atoms_xml, grid_dx, is_protein=True, dielectric=DEFAULT_BODY_DIELECTRIC, all_in_surface=False)` `dataclass` ¶

Configuration for one BrownDye core/body block in input.xml.

Attributes:

Name	Type	Description
`name`	`str`	BrownDye body name. Also used as the `<core><name>` tag.
`atoms_xml`	`str`	Relative path to the `atoms.xml` produced by `pqr2xml`, as it should appear inside `input.xml` (typically just `"{name}_atoms.xml"` when running BrownDye from the same directory).
`grid_dx`	`str`	Relative path to the APBS `.dx` grid for this body.
`is_protein`	`bool`	Maps to the `<is_protein>` tag (lowercase `true`/`false` in the serialised XML).
`dielectric`	`float`	Interior dielectric for this body.
`all_in_surface`	`bool`	Maps to the `<all_in_surface>` tag.
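
The boolean-to-tag mapping can be illustrated with the standard library. This is only a sketch: `core_element` is a hypothetical helper, it emits just the three tags named in the table above, and it does not reproduce the full BrownDye `<core>` schema.

```python
import xml.etree.ElementTree as ET


def xml_bool(value):
    """Serialise a Python bool as the lowercase text used in the XML."""
    return "true" if value else "false"


def core_element(name, is_protein=True, all_in_surface=False):
    """Build a partial <core> element from a few body attributes."""
    core = ET.Element("core")
    ET.SubElement(core, "name").text = name
    ET.SubElement(core, "is_protein").text = xml_bool(is_protein)
    ET.SubElement(core, "all_in_surface").text = xml_bool(all_in_surface)
    return core


snippet = ET.tostring(core_element("receptor"), encoding="unicode")
```

Writing `str(True)` directly would produce `True`, which is why an explicit lowercase conversion is needed.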

`BrownDyeSolvent(debye_length_a, dielectric=DEFAULT_BD_SOLVENT_DIELECTRIC, relative_viscosity=DEFAULT_RELATIVE_VISCOSITY, kT=DEFAULT_KT, desolvation_parameter=DEFAULT_DESOLVATION_PARAMETER, solvent_radius_a=DEFAULT_SOLVENT_RADIUS_A)` `dataclass` ¶

Solvent block parameters shared by all bodies in a BrownDye system.

BrownDye uses kT-units internally, so `dielectric` is the BrownDye solvent dielectric (typically 78.0) and may differ from the APBS `sdie` value used to compute the electrostatic grid.

Attributes:

Name	Type	Description
`debye_length_a`	`float`	Debye length in Angstrom (usually obtained from the APBS log via `mdpp.prep.apbs.infer_debye_length`).
`dielectric`	`float`	BrownDye solvent dielectric (kT-units).
`relative_viscosity`	`float`	Relative solvent viscosity.
`kT`	`float`	Thermal energy unit (BrownDye uses `kT = 1`).
`desolvation_parameter`	`float`	BrownDye desolvation scale factor.
`solvent_radius_a`	`float`	Probe solvent radius in Angstrom.

`write_contact_types(mol0_pqr, mol1_pqr, out_path)` ¶

Write a BrownDye contact_types.xml from two PQR files.

Lists every unique heavy-atom `(atom_name, residue_name)` pair per body. The output is consumed by `make_rxn_pairs` to enumerate candidate contact pairs between the two bodies.

Parameters:

Name	Type	Description	Default
`mol0_pqr`	`StrPath`	PQR file for the first body (writes `<molecule0>` block).	required
`mol1_pqr`	`StrPath`	PQR file for the second body (writes `<molecule1>` block).	required
`out_path`	`StrPath`	Destination `contact_types.xml` path.	required

Returns:

Type	Description
`Path`	`out_path` as a `Path`, for chaining.
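
Collecting the unique heavy-atom pairs from a PQR file might look like the sketch below. It rests on several assumptions: `heavy_atom_types` is a hypothetical helper, the PQR is whitespace-delimited without a chain column, and an atom counts as hydrogen when its name starts with `H` (a crude test that the library may refine). The sample records are invented.

```python
def heavy_atom_types(pqr_text):
    """Return sorted unique (atom_name, residue_name) pairs for heavy atoms."""
    pairs = set()
    for line in pqr_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] not in ("ATOM", "HETATM"):
            continue
        atom_name, residue_name = fields[2], fields[3]
        if atom_name.startswith("H"):  # assumed hydrogen test
            continue
        pairs.add((atom_name, residue_name))
    return sorted(pairs)


# Invented records: record, serial, atom, residue, resnum, x, y, z, charge, radius.
SAMPLE_PQR = """\
ATOM      1  N   ALA     1      -0.677  -1.230  -0.491 -0.3000 1.8240
ATOM      2  H   ALA     1      -0.131  -2.162  -0.491  0.3300 0.6000
ATOM      3  CA  ALA     1      -0.001   0.064  -0.491  0.0337 1.9080
ATOM      4  N   GLY     2       1.499  -0.110  -0.491 -0.4157 1.8240
ATOM      5  CA  GLY     2       2.100   1.200  -0.491 -0.0252 1.9080
ATOM      6  CA  ALA     3       3.000   2.000  -0.491  0.0337 1.9080
"""
types = heavy_atom_types(SAMPLE_PQR)
```

The second alanine contributes nothing new, since its `("CA", "ALA")` pair is already present.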

`build_input_xml(body0, body1, *, solvent, reaction_file=DEFAULT_REACTION_FILE, n_threads=DEFAULT_N_THREADS, seed=DEFAULT_SEED, n_trajectories=DEFAULT_N_TRAJECTORIES, n_trajectories_per_output=DEFAULT_N_TRAJECTORIES_PER_OUTPUT, max_n_steps=DEFAULT_MAX_N_STEPS, n_steps_per_output=DEFAULT_N_STEPS_PER_OUTPUT, results_file=DEFAULT_RESULTS_FILE, trajectory_file=DEFAULT_TRAJECTORY_FILE)` ¶

Build the BrownDye top-level input.xml as a string.

The minimum core dt tolerances are hardcoded to 0.0 (BrownDye's own defaults); the time step is determined dynamically. Override after the fact if you need non-default tolerances.

Parameters:

Name	Type	Description	Default
`body0`	`BrownDyeBody`	First body descriptor.	required
`body1`	`BrownDyeBody`	Second body descriptor.	required
`solvent`	`BrownDyeSolvent`	Shared solvent parameters (including Debye length).	required
`reaction_file`	`str`	Filename of the BrownDye reaction definition XML.	`DEFAULT_REACTION_FILE`
`n_threads`	`int`	Number of BrownDye worker threads.	`DEFAULT_N_THREADS`
`seed`	`int`	Random seed for trajectory propagation.	`DEFAULT_SEED`
`n_trajectories`	`int`	Total number of trajectories to launch.	`DEFAULT_N_TRAJECTORIES`
`n_trajectories_per_output`	`int`	Trajectories per `results.xml` flush.	`DEFAULT_N_TRAJECTORIES_PER_OUTPUT`
`max_n_steps`	`int`	Maximum BrownDye steps per trajectory.	`DEFAULT_MAX_N_STEPS`
`n_steps_per_output`	`int`	Stride between trajectory frames written to `trajectory{N}.xml`. Set to `1` to record every step.	`DEFAULT_N_STEPS_PER_OUTPUT`
`results_file`	`str`	Filename for cumulative results.	`DEFAULT_RESULTS_FILE`
`trajectory_file`	`str`	Base name for per-thread trajectory XML dumps (BrownDye writes `{trajectory_file}{thread}.xml` plus a matching `.index.xml`).	`DEFAULT_TRAJECTORY_FILE`

Returns:

Type	Description
`str`	The full `input.xml` content as a UTF-8 string.

`write_input_xml(out_path, body0, body1, *, solvent, reaction_file=DEFAULT_REACTION_FILE, n_threads=DEFAULT_N_THREADS, seed=DEFAULT_SEED, n_trajectories=DEFAULT_N_TRAJECTORIES, n_trajectories_per_output=DEFAULT_N_TRAJECTORIES_PER_OUTPUT, max_n_steps=DEFAULT_MAX_N_STEPS, n_steps_per_output=DEFAULT_N_STEPS_PER_OUTPUT, results_file=DEFAULT_RESULTS_FILE, trajectory_file=DEFAULT_TRAJECTORY_FILE)` ¶

Write the BrownDye top-level input.xml to out_path.

Thin filesystem wrapper around `build_input_xml`; see that function for parameter semantics.

Returns:

Type	Description
`Path`	`out_path` as a `Path`, for chaining.
