Preparation API Reference
Protein
mdpp.prep.protein
Protein structure preparation and manipulation utilities.
PropkaResidue(residue_type, res_num, chain_id, pka, model_pka)
dataclass
PROPKA pKa prediction for a single titratable residue.
Attributes:
| Name | Type | Description |
|---|---|---|
| residue_type | str | Group label (e.g. |
| res_num | int | Residue sequence number. |
| chain_id | str | PDB chain identifier. |
| pka | float | PROPKA-predicted pKa value. |
| model_pka | float | Reference model pKa value. |
PropkaResult(residues)
dataclass
PROPKA pKa prediction results for all titratable residues.
Attributes:
| Name | Type | Description |
|---|---|---|
| residues | tuple[PropkaResidue, ...] | pKa predictions for each titratable residue. |
get_nonstandard(pH)
Return residues where PROPKA and model pKa disagree on protonation state.
A residue is "non-standard" when pKa > pH and model_pKa <= pH
(or vice versa), meaning PDBFixer would assign a different protonation
state than what PROPKA predicts.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pH | float | pH value for protonation state comparison. | required |
Returns:
| Type | Description |
|---|---|
| tuple[PropkaResidue, ...] | Residues with non-standard predicted protonation. |
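The comparison rule above can be sketched in a few lines. This is a minimal illustration rather than the library's implementation: `Residue` is a hypothetical stand-in for `PropkaResidue`, the free function takes the residues as an argument instead of being a method, and the pKa values are invented for demonstration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Residue:
    """Hypothetical stand-in for PropkaResidue (illustration only)."""

    residue_type: str
    res_num: int
    chain_id: str
    pka: float
    model_pka: float


def get_nonstandard(residues: tuple[Residue, ...], pH: float) -> tuple[Residue, ...]:
    """Keep residues whose predicted and model pKa fall on opposite sides of pH."""
    # (pka > pH) differs from (model_pka > pH) exactly when one value is
    # above pH and the other is at or below it.
    return tuple(r for r in residues if (r.pka > pH) != (r.model_pka > pH))


# Made-up values for demonstration only.
residues = (
    Residue("ASP", 25, "A", pka=7.9, model_pka=3.8),    # shifted above pH 7
    Residue("LYS", 40, "A", pka=10.2, model_pka=10.5),  # same side as the model
)
flagged = get_nonstandard(residues, pH=7.0)
```

With these invented values only the first residue is flagged at pH 7.0, because its predicted pKa sits above the pH while its model pKa sits below it.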
ChainSelect(chain_ids)
Bases: Select
Biopython Select subclass that accepts only specified PDB chains.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| chain_ids | str \| list[str] | One or more chain identifiers to keep. | required |
Example::
    from Bio.PDB import PDBIO, PDBParser
    from mdpp.prep import ChainSelect

    parser = PDBParser(QUIET=True)
    structure = parser.get_structure("complex", "complex.pdb")
    io = PDBIO()
    io.set_structure(structure)
    io.save("protein.pdb", ChainSelect("A"))
Initialize the ChainSelect object.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| chain_ids | str \| list[str] | The chain IDs to keep. | required |
accept_chain(chain)
Return 1 if the chain should be kept, 0 otherwise.
run_propka(pdb_path)
Run PROPKA to predict pKa values for titratable protein residues.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pdb_path | StrPath | Path to the input PDB file. | required |
Returns:
| Type | Description |
|---|---|
| PropkaResult | pKa predictions for all titratable residues found. |
fix_pdb(pdb_path, fixed_pdb_path, pH=7.0, *, protonation='model')
Fix a PDB file by adding missing residues, atoms, and hydrogens.
Removes heterogens (excluding water by default), identifies missing residues and atoms, then adds them back along with hydrogens at the specified pH.
Runs PROPKA to check for residues whose environment-shifted pKa predicts a different protonation state than the model-pKa default used by PDBFixer, and logs a warning for each such residue.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pdb_path | StrPath | Path to the input PDB file. | required |
| fixed_pdb_path | StrPath | Path where the fixed PDB will be written. | required |
| pH | float | pH value for hydrogen placement. | 7.0 |
| protonation | Literal['model', 'propka'] | Protonation policy. | 'model' |
strip_solvent(traj, *, keep_ions=False)
Remove solvent molecules from a trajectory.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| traj | Trajectory | Input trajectory. | required |
| keep_ions | bool | If | False |
Returns:
| Type | Description |
|---|---|
| Trajectory | A new trajectory with solvent removed. |
extract_chain(traj, chain_id)
Extract a single chain from a trajectory.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| traj | Trajectory | Input trajectory. | required |
| chain_id | int | Zero-based chain index to extract. | required |
Returns:
| Type | Description |
|---|---|
| Trajectory | A new trajectory containing only the specified chain. |
Raises:
| Type | Description |
|---|---|
| ValueError | If |
Ligand
mdpp.prep.ligand
Ligand parameterization and topology assignment utilities.
assign_topology(mol, template_mol)
Assign bond orders and hydrogens from a template molecule to a ligand.
Uses the template (typically from SMILES) without hydrogens to assign bond orders to the ligand's heavy-atom coordinates, then adds hydrogens with 3D coordinates.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mol | Mol | The ligand molecule (usually from a PDB/MOL2 with no bond orders). | required |
| template_mol | Mol | The reference molecule with correct bond orders. | required |
Returns:
| Type | Description |
|---|---|
| Mol | A new molecule with assigned bond orders and added hydrogens. |
constraint_minimization(mol, *, max_iters=5000)
Minimize hydrogen positions while keeping heavy atoms fixed.
Uses the Universal Force Field (UFF) with fixed-point constraints on all non-hydrogen atoms.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mol | Mol | Input molecule with 3D coordinates (conformer 0). | required |
| max_iters | int | Maximum number of minimization iterations. | 5000 |
Returns:
| Type | Description |
|---|---|
| Mol | The molecule with optimized hydrogen positions. |
Topology
mdpp.prep.topology
Trajectory manipulation utilities for system preparation.
merge_trajectories(trajectories)
Concatenate multiple trajectories along the time axis.
All trajectories must share the same topology and number of atoms.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| trajectories | Sequence[Trajectory] | Sequence of trajectories to concatenate. | required |
Returns:
| Type | Description |
|---|---|
| Trajectory | A single trajectory containing all frames in order. |
Raises:
| Type | Description |
|---|---|
| ValueError | If fewer than two trajectories are provided or topologies do not match. |
slice_trajectory(traj, *, start=None, stop=None, stride=None)
Slice a trajectory by frame range with validation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| traj | Trajectory | Input trajectory. | required |
| start | int \| None | Starting frame index (inclusive). Defaults to | None |
| stop | int \| None | Stopping frame index (exclusive). Defaults to | None |
| stride | int \| None | Frame stride. Defaults to | None |
Returns:
| Type | Description |
|---|---|
| Trajectory | A new trajectory with the selected frames. |
subsample_trajectory(traj, n_frames)
Evenly subsample a trajectory to a target number of frames.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| traj | Trajectory | Input trajectory. | required |
| n_frames | int | Desired number of output frames. | required |
Returns:
| Type | Description |
|---|---|
| Trajectory | A new trajectory with approximately |
Raises:
| Type | Description |
|---|---|
| ValueError | If |
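One way to pick evenly spaced frames is sketched below. This is an assumption about how such subsampling could work, not the library's code: `evenly_spaced_indices` is a hypothetical helper, and both the index formula and the `n_frames < 1` check are illustrative choices.

```python
from __future__ import annotations


def evenly_spaced_indices(n_total: int, n_frames: int) -> list[int]:
    """Pick up to n_frames frame indices spread evenly over range(n_total)."""
    if n_frames < 1:
        # Assumed validation; the library's exact error condition may differ.
        raise ValueError("n_frames must be at least 1")
    if n_frames >= n_total:
        # Nothing to drop: keep every frame.
        return list(range(n_total))
    if n_frames == 1:
        return [0]
    last = n_total - 1
    steps = n_frames - 1
    # Integer round-half-up of i * last / steps, so that the first and the
    # last frame are always included.
    return [(2 * i * last + steps) // (2 * steps) for i in range(n_frames)]


indices = evenly_spaced_indices(10, 4)
```

The resulting index list could then be used to select frames from a trajectory object that supports index-based selection.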
APBS
mdpp.prep.apbs
APBS Poisson-Boltzmann input generation and log parsing.
Helpers for driving APBS (Adaptive Poisson-Boltzmann Solver) from Python.
Two pure functions are exposed:
- write_apbs_input generates a multigrid APBS .in file from an existing .pqr by deriving the grid bounding box from the radius-inflated atom coordinates and rounding dime up to the nearest c * 2**n + 1 value required by APBS multigrid.
- infer_debye_length parses a Debye length (in Angstrom) out of an APBS log; used to bootstrap downstream BrownDye input from the same APBS run.
Both functions are pure Python with no subprocess calls; they only read/write
text files. Run APBS itself by calling the apbs CLI separately.
write_apbs_input(stem, work_dir, *, ionic_strength_m=DEFAULT_IONIC_STRENGTH_M, solute_dielectric=DEFAULT_SOLUTE_DIELECTRIC, solvent_dielectric=DEFAULT_SOLVENT_DIELECTRIC, solvent_radius_a=DEFAULT_SOLVENT_RADIUS_A, temperature_k=DEFAULT_TEMPERATURE_K, fine_spacing_a=DEFAULT_FINE_SPACING_A, fine_padding_a=DEFAULT_FINE_PADDING_A, coarse_padding_a=DEFAULT_COARSE_PADDING_A)
Write an APBS multigrid input file {stem}.in for {stem}.pqr.
Physics defaults mirror pdb2pqr --apbs-input canonical defaults
(lpbe / bcfl sdh / srfm smol / chgm spl2 / pdie 2.0 /
sdie 78.54 / srad 1.4 / sdens 10 / swin 0.30 /
temp 298.15) with two intentional overrides:
- Explicit Na+/Cl- ion lines at ionic_strength_m with Pauling radii (1.875 / 1.815 A). The pdb2pqr canonical input omits ions, which sets the Debye length to infinity. Explicit ions are required when the resulting .dx feeds a BrownDye2 simulation that needs a finite Debye length for far-field electrostatics.
- Larger grid padding (fine_padding_a / coarse_padding_a) than pdb2pqr's fadd=20 / cfac=1.7 defaults so the outer grid comfortably exceeds the BrownDye b-radius.
dime is rounded up to the nearest c * 2**n + 1 value required by APBS multigrid.
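The rounding of dime to the c * 2**n + 1 form can be illustrated as follows. This is a sketch under stated assumptions: `round_up_dime` and `grid_points_needed` are hypothetical helper names, and `n = 5` is an assumed exponent chosen for illustration; the value the library actually uses is not stated here.

```python
import math


def round_up_dime(min_points: int, n: int = 5) -> int:
    """Smallest value of the form c * 2**n + 1 (c >= 1) that is >= min_points."""
    block = 2**n
    c = max(1, math.ceil((min_points - 1) / block))
    return c * block + 1


def grid_points_needed(length_a: float, spacing_a: float) -> int:
    """Number of grid points needed to span length_a at spacing_a or finer."""
    return math.ceil(length_a / spacing_a) + 1


# An 80 A axis at a 0.5 A target spacing needs 161 points, which already has
# the form 5 * 2**5 + 1 and is therefore left unchanged.
dime_x = round_up_dime(grid_points_needed(80.0, 0.5))

# One more required point pushes the count to the next valid value.
dime_y = round_up_dime(162)
```

Because the point count only ever rounds up, the effective grid spacing ends up equal to or finer than the requested target.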
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| stem | str | PQR file stem (without extension). | required |
| work_dir | StrPath | Directory containing | required |
| ionic_strength_m | float | Ion concentration in mol/L (default 0.150 M). | DEFAULT_IONIC_STRENGTH_M |
| solute_dielectric | float | Interior (solute) dielectric constant. | DEFAULT_SOLUTE_DIELECTRIC |
| solvent_dielectric | float | Bulk solvent dielectric constant. | DEFAULT_SOLVENT_DIELECTRIC |
| solvent_radius_a | float | Probe solvent radius in Angstrom. | DEFAULT_SOLVENT_RADIUS_A |
| temperature_k | float | Simulation temperature in Kelvin. | DEFAULT_TEMPERATURE_K |
| fine_spacing_a | float | Target fine grid spacing in Angstrom. | DEFAULT_FINE_SPACING_A |
| fine_padding_a | float | Fine grid padding added to the radius-inflated atom bounding box (per axis). | DEFAULT_FINE_PADDING_A |
| coarse_padding_a | float | Coarse grid padding added to the radius-inflated atom bounding box (per axis); the coarse grid never shrinks below the fine grid. | DEFAULT_COARSE_PADDING_A |
Returns:
| Type | Description |
|---|---|
| Path | Path to the written |
Raises:
| Type | Description |
|---|---|
| ValueError | If |
infer_debye_length(*apbs_logs)
Return the first Debye length (Angstrom) parsed from any APBS log.
Scans logs in argument order; returns as soon as a Debye length is found in any one of them. Missing log files are skipped silently.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| *apbs_logs | StrPath | Paths to one or more APBS log files. | () |
Returns:
| Type | Description |
|---|---|
| float | Debye length in Angstrom. |
Raises:
| Type | Description |
|---|---|
| RuntimeError | If no log contains a recognisable Debye length entry. |
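The scan-in-order behaviour described above might look like the sketch below. It is not the library's implementation: `first_debye_length` is a hypothetical name, and the regular expression assumes a log line shaped like `Debye length:  7.86 A`, which this reference does not confirm.

```python
from __future__ import annotations

import re
import tempfile
from pathlib import Path

# Assumed log line shape, e.g. "Debye length:  7.86 A" (not confirmed here).
_DEBYE_RE = re.compile(r"Debye length:\s*([-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?)\s*A")


def first_debye_length(*logs: str | Path) -> float:
    """Return the first Debye length found, scanning logs in argument order."""
    for log in logs:
        path = Path(log)
        if not path.is_file():
            continue  # missing logs are skipped silently
        match = _DEBYE_RE.search(path.read_text())
        if match:
            return float(match.group(1))
    raise RuntimeError("no Debye length found in any APBS log")


# Demonstration with a throwaway log file containing an invented value.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "apbs.log"
    log.write_text("Some header line\nDebye length:  7.86 A\n")
    # The first path does not exist and is skipped; the value comes from the second.
    value = first_debye_length(Path(tmp) / "missing.log", log)
```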
BrownDye2
mdpp.prep.browndye
BrownDye2 input.xml and contact_types.xml generation.
Helpers for building the XML inputs consumed by BrownDye2's bd_top:
- write_contact_types writes contact_types.xml with one entry per unique (atom_name, residue_name) heavy-atom pair per body.
- build_input_xml and write_input_xml produce the top-level input.xml from two BrownDyeBody descriptors and a shared BrownDyeSolvent configuration.
All helpers are pure Python: no subprocess calls, no XML schema validation.
Run BrownDye's own tools (pqr2xml, make_rxn_pairs, make_rxn_file,
bd_top, nam_simulation) separately.
The Debye length feeding BrownDyeSolvent is typically obtained from
an APBS run via mdpp.prep.apbs.infer_debye_length.
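As a sanity check on a value parsed from an APBS log, the Debye length of a 1:1 electrolyte can be estimated from the ionic strength with the standard Debye-Hückel expression. The function below is a hypothetical helper, not part of the module; its defaults reuse the solvent dielectric (78.54) and temperature (298.15 K) quoted for the APBS input on this page.

```python
import math

EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
KB = 1.380649e-23            # Boltzmann constant, J/K
NA = 6.02214076e23           # Avogadro constant, 1/mol
E_CHARGE = 1.602176634e-19   # elementary charge, C


def debye_length_angstrom(
    ionic_strength_m: float,
    dielectric: float = 78.54,
    temperature_k: float = 298.15,
) -> float:
    """Estimate the Debye length (Angstrom) for a 1:1 electrolyte."""
    if ionic_strength_m <= 0:
        # No mobile ions means no screening: the Debye length is infinite.
        return math.inf
    ionic_strength_si = ionic_strength_m * 1000.0  # mol/L -> mol/m^3
    numerator = EPS0 * dielectric * KB * temperature_k
    denominator = 2.0 * NA * E_CHARGE**2 * ionic_strength_si
    return math.sqrt(numerator / denominator) * 1e10  # m -> Angstrom


# At 0.150 M this gives a little under 8 Angstrom.
estimate = debye_length_angstrom(0.150)
```

A parsed value that differs substantially from this estimate may indicate that the log came from a run with different ion settings.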
BrownDyeBody(name, atoms_xml, grid_dx, is_protein=True, dielectric=DEFAULT_BODY_DIELECTRIC, all_in_surface=False)
dataclass
Configuration for one BrownDye core/body block in input.xml.
Attributes:
| Name | Type | Description |
|---|---|---|
| name | str | BrownDye body name. Also used as the |
| atoms_xml | str | Relative path to the |
| grid_dx | str | Relative path to the APBS |
| is_protein | bool | Maps to the |
| dielectric | float | Interior dielectric for this body. |
| all_in_surface | bool | Maps to the |
BrownDyeSolvent(debye_length_a, dielectric=DEFAULT_BD_SOLVENT_DIELECTRIC, relative_viscosity=DEFAULT_RELATIVE_VISCOSITY, kT=DEFAULT_KT, desolvation_parameter=DEFAULT_DESOLVATION_PARAMETER, solvent_radius_a=DEFAULT_SOLVENT_RADIUS_A)
dataclass
Solvent block parameters shared by all bodies in a BrownDye system.
BrownDye uses kT-units internally, so dielectric is the BrownDye
solvent dielectric (typically 78.0) and may differ from the APBS
sdie value used to compute the electrostatic grid.
Attributes:
| Name | Type | Description |
|---|---|---|
| debye_length_a | float | Debye length in Angstrom (usually obtained from the APBS log via :func: |
| dielectric | float | BrownDye solvent dielectric (kT-units). |
| relative_viscosity | float | Relative solvent viscosity. |
| kT | float | Thermal energy unit (BrownDye uses |
| desolvation_parameter | float | BrownDye desolvation scale factor. |
| solvent_radius_a | float | Probe solvent radius in Angstrom. |
write_contact_types(mol0_pqr, mol1_pqr, out_path)
Write a BrownDye contact_types.xml from two PQR files.
Lists every unique heavy-atom (atom_name, residue_name) per body.
The output is consumed by make_rxn_pairs to enumerate candidate
contact pairs between the two bodies.
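Collecting the unique heavy-atom pairs for one body could be sketched as below. This covers only the enumeration step, not the XML output, and rests on assumptions: `unique_heavy_atom_types` is a hypothetical helper, PQR records are taken to be whitespace-separated with the atom name and residue name in the third and fourth fields, and an atom counts as hydrogen when its name, after leading digits, starts with "H".

```python
from __future__ import annotations


def unique_heavy_atom_types(pqr_text: str) -> list[tuple[str, str]]:
    """Return unique (atom_name, residue_name) pairs in first-seen order."""
    seen: set[tuple[str, str]] = set()
    ordered: list[tuple[str, str]] = []
    for line in pqr_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0] not in ("ATOM", "HETATM"):
            continue  # skip REMARK, TER, END and malformed lines
        atom_name, residue_name = fields[2], fields[3]
        # Assumed hydrogen test: names such as "H", "HB2" or "1HB".
        if atom_name.lstrip("0123456789").startswith("H"):
            continue
        key = (atom_name, residue_name)
        if key not in seen:
            seen.add(key)
            ordered.append(key)
    return ordered


# Invented PQR fragment for demonstration only.
example_pqr = """\
ATOM      1  N   ALA     1      -0.677  -1.230  -0.491 -0.3000 1.8240
ATOM      2  H   ALA     1      -0.131  -2.162  -0.491  0.2700 0.6000
ATOM      3  CA  ALA     1       0.000   0.000   0.000  0.0300 1.9080
ATOM      4  N   ALA     2       1.200   0.500   0.300 -0.3000 1.8240
ATOM      5  CA  GLY     3       2.400   1.000   0.600  0.0300 1.9080
"""
pairs = unique_heavy_atom_types(example_pqr)
```

The second alanine nitrogen adds no new entry because its (atom_name, residue_name) pair has already been seen.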
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| mol0_pqr | StrPath | PQR file for the first body (writes | required |
| mol1_pqr | StrPath | PQR file for the second body (writes | required |
| out_path | StrPath | Destination | required |
Returns:
| Type | Description |
|---|---|
| Path | |
build_input_xml(body0, body1, *, solvent, reaction_file=DEFAULT_REACTION_FILE, n_threads=DEFAULT_N_THREADS, seed=DEFAULT_SEED, n_trajectories=DEFAULT_N_TRAJECTORIES, n_trajectories_per_output=DEFAULT_N_TRAJECTORIES_PER_OUTPUT, max_n_steps=DEFAULT_MAX_N_STEPS, n_steps_per_output=DEFAULT_N_STEPS_PER_OUTPUT, results_file=DEFAULT_RESULTS_FILE, trajectory_file=DEFAULT_TRAJECTORY_FILE)
Build the BrownDye top-level input.xml as a string.
The minimum core dt tolerances are hardcoded to 0.0 (BrownDye's
own defaults); the time step is determined dynamically. Override after
the fact if you need non-default tolerances.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| body0 | BrownDyeBody | First body descriptor. | required |
| body1 | BrownDyeBody | Second body descriptor. | required |
| solvent | BrownDyeSolvent | Shared solvent parameters (including Debye length). | required |
| reaction_file | str | Filename of the BrownDye reaction definition XML. | DEFAULT_REACTION_FILE |
| n_threads | int | Number of BrownDye worker threads. | DEFAULT_N_THREADS |
| seed | int | Random seed for trajectory propagation. | DEFAULT_SEED |
| n_trajectories | int | Total number of trajectories to launch. | DEFAULT_N_TRAJECTORIES |
| n_trajectories_per_output | int | Trajectories per | DEFAULT_N_TRAJECTORIES_PER_OUTPUT |
| max_n_steps | int | Maximum BrownDye steps per trajectory. | DEFAULT_MAX_N_STEPS |
| n_steps_per_output | int | Stride between trajectory frames written to | DEFAULT_N_STEPS_PER_OUTPUT |
| results_file | str | Filename for cumulative results. | DEFAULT_RESULTS_FILE |
| trajectory_file | str | Base name for per-thread trajectory XML dumps (BrownDye writes | DEFAULT_TRAJECTORY_FILE |
Returns:
| Type | Description |
|---|---|
| str | The full |
write_input_xml(out_path, body0, body1, *, solvent, reaction_file=DEFAULT_REACTION_FILE, n_threads=DEFAULT_N_THREADS, seed=DEFAULT_SEED, n_trajectories=DEFAULT_N_TRAJECTORIES, n_trajectories_per_output=DEFAULT_N_TRAJECTORIES_PER_OUTPUT, max_n_steps=DEFAULT_MAX_N_STEPS, n_steps_per_output=DEFAULT_N_STEPS_PER_OUTPUT, results_file=DEFAULT_RESULTS_FILE, trajectory_file=DEFAULT_TRAJECTORY_FILE)
Write the BrownDye top-level input.xml to out_path.
Thin filesystem wrapper around build_input_xml; see that
function for parameter semantics.
Returns:
| Type | Description |
|---|---|
| Path | |