vermouth.dssp.dssp module

Assign protein secondary structures using DSSP.

class vermouth.dssp.dssp.AnnotateDSSP(executable=None, savedir=None)[source]

Bases: Processor

name = 'AnnotateDSSP'
run_molecule(molecule)[source]
class vermouth.dssp.dssp.AnnotateMartiniSecondaryStructures[source]

Bases: Processor

name = 'AnnotateMartiniSecondaryStructures'
static run_molecule(molecule)[source]
class vermouth.dssp.dssp.AnnotateResidues(attribute, sequence, molecule_selector=<function select_all>)[source]

Bases: Processor

Set an attribute of the nodes from a sequence with one element per residue.

Read a sequence with one element per residue and assign an attribute of each node based on that sequence, so that each node gets the value corresponding to its residue. In most cases, the length of the sequence must match the total number of residues in the system, and the sequence must be ordered in the same way as the residues in the system. If all the molecules have the same number of residues and the length of the sequence equals the number of residues in one molecule, then the sequence is applied to every molecule. If the sequence contains only one element, then that element is applied to every residue of the system.
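The three length rules described above can be sketched in plain Python. This is an illustration only, not vermouth's implementation: `expand_sequence` and `residues_per_molecule` are hypothetical names, the order in which the rules are checked is an assumption, and so is the `ValueError` raised when no rule applies.

```python
def expand_sequence(sequence, residues_per_molecule):
    """Split or repeat `sequence` so each molecule gets one value per residue.

    `residues_per_molecule` lists the residue count of each molecule, in the
    order the molecules appear in the system. Hypothetical helper, not part
    of vermouth.
    """
    sequence = list(sequence)
    counts = list(residues_per_molecule)

    # Rule 1: one element per residue in the whole system, sliced in order.
    if len(sequence) == sum(counts):
        chunks, start = [], 0
        for count in counts:
            chunks.append(sequence[start:start + count])
            start += count
        return chunks

    # Rule 2: all molecules have the same size and the sequence covers
    # exactly one molecule, so every molecule receives a copy.
    if len(set(counts)) == 1 and len(sequence) == counts[0]:
        return [list(sequence) for _ in counts]

    # Rule 3: a single element is applied to every residue.
    if len(sequence) == 1:
        return [sequence * count for count in counts]

    # Assumption: the exception type for a mismatch is chosen for the sketch.
    raise ValueError(
        "Sequence of length {} does not fit molecules with {} residues."
        .format(len(sequence), counts)
    )
```

For instance, `expand_sequence("C", [2, 3])` gives every residue of both molecules the value `"C"`.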

Parameters:
name = 'AnnotateResidues'
run_molecule(molecule)[source]

Run the processor on a single molecule.

Parameters:

molecule (vermouth.molecule.Molecule)

Return type:

vermouth.molecule.Molecule

run_system(system)[source]

Run the processor on a system.

Parameters:

system (vermouth.system.System)

Return type:

vermouth.system.System

exception vermouth.dssp.dssp.DSSPError[source]

Bases: Exception

Exception raised if DSSP fails.

vermouth.dssp.dssp.annotate_dssp(molecule, callable=None, attribute='secstruct')[source]

Adds the DSSP assignment to the atoms of a molecule.

Runs DSSP on the molecule and adds the secondary structure assignment as an attribute of its atoms. The name of the attribute in which the assignment is stored is controlled with the “attribute” argument.

Only proteins can be annotated. Non-protein molecules, empty molecules, and molecules for which no positions are set are returned unmodified.

The atom names are assumed to be compatible with DSSP. Atoms with no known position are not passed to DSSP, which may cause DSSP to fail.

Warning

The molecule is annotated in-place.

Parameters:
  • molecule (Molecule) – The molecule to annotate. Its atoms must have the attributes required to write a PDB file; other atom attributes, edges, or molecule attributes are not used.

  • callable (Callable) – The function to call to generate DSSP secondary structure assignments. See also: run_dssp(), run_mdtraj()

  • attribute (str) – The name of the atom attribute in which to store the annotation.

vermouth.dssp.dssp.annotate_residues_from_sequence(molecule, attribute, sequence)[source]

Sets the attribute attribute to a value from sequence for every node in molecule. Nodes in the n’th residue of molecule are given the n’th value of sequence.

Parameters:
Raises:

ValueError – If the length of sequence is different from the number of residues in molecule.
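The per-residue assignment can be sketched without vermouth, using a plain list of node dictionaries in place of a molecule. This is an illustration under stated assumptions, not the library's code: `annotate_nodes_from_sequence` is a hypothetical name, and identifying a residue by the `chain`, `resid`, and `resname` keys of consecutive nodes is an assumption.

```python
from itertools import groupby


def residue_key(node):
    # Assumption: consecutive nodes sharing these keys form one residue.
    return (node.get("chain"), node.get("resid"), node.get("resname"))


def annotate_nodes_from_sequence(nodes, attribute, sequence):
    """Give every node of the n-th residue the n-th value of `sequence`.

    The node dictionaries are modified in place.
    """
    residues = [list(group) for _, group in groupby(nodes, key=residue_key)]
    if len(sequence) != len(residues):
        raise ValueError(
            "The sequence has {} elements but there are {} residues."
            .format(len(sequence), len(residues))
        )
    for value, residue_nodes in zip(sequence, residues):
        for node in residue_nodes:
            node[attribute] = value
```

The length check comes first so that a mismatched sequence leaves the nodes untouched.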

vermouth.dssp.dssp.convert_dssp_annotation_to_martini(molecule, from_attribute='secstruct', to_attribute='cgsecstruct')[source]

For every node in molecule, translate the from_attribute with convert_dssp_to_martini(), and assign it to the attribute to_attribute.

Parameters:
Raises:

ValueError – If not all nodes have a from_attribute.

vermouth.dssp.dssp.convert_dssp_to_martini(sequence)[source]

Convert a sequence of DSSP secondary structures to a sequence of Martini secondary structures.

Martini treats some secondary structures with less resolution than DSSP. For instance, the different types of helices that DSSP distinguishes are treated identically by Martini. However, different parts of the same helix are treated differently in Martini.

In the Martini force field, the B and E secondary structures from DSSP are both treated as extended regions. All the DSSP helices are treated the same, but the different parts of a helix (beginning, end, core of a short helix, core of a long helix) are treated differently.

After the conversion, the secondary structures are:

  • F: Collagenous fiber

  • E: Extended structure (β sheet)

  • H: Helix structure

  • 1: Helix start (H-bond donor)

  • 2: Helix end (H-bond acceptor)

  • 3: Ambivalent helix type (short helices)

  • T: Turn

  • S: Bend

  • C: Coil

Parameters:

sequence (str) – A sequence of secondary structures as read from DSSP. One letter per residue.

Returns:

A sequence of secondary structures usable for martini. One letter per residue.

Return type:

str
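The loss of resolution described above can be illustrated with a partial sketch. It covers only the merging of DSSP codes (all helix types become H; B and E become E) and deliberately leaves out the labelling of helix starts, ends, and short helices with 1, 2, and 3, because the length thresholds for that step are not given here. `collapse_dssp_codes` is a hypothetical name, not a vermouth function.

```python
# Mapping taken from the description above: every DSSP helix type is seen as
# a helix, and both B and E are seen as extended. T, S, and C are kept.
DSSP_TO_COARSE = {
    "H": "H", "G": "H", "I": "H",
    "B": "E", "E": "E",
    "T": "T", "S": "S", "C": "C",
}


def collapse_dssp_codes(sequence):
    """Merge DSSP codes that Martini does not distinguish.

    Only the first stage of the conversion is sketched; helix positions are
    not relabelled as 1, 2, or 3.
    """
    return "".join(DSSP_TO_COARSE[code] for code in sequence)
```

A code outside the mapping raises `KeyError` in this sketch; how the library reports unknown codes is not covered here.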

vermouth.dssp.dssp.read_dssp2(lines)[source]

Read the secondary structure from a DSSP output.

Only the first column of the “STRUCTURE” block is read. See the documentation of the DSSP format for more details.

The secondary structures that can be read are:

H:

α-helix

B:

residue in isolated β-bridge

E:

extended strand, participates in β ladder

G:

3-helix (3-10 helix)

I:

5 helix (π-helix)

T:

hydrogen bonded turn

S:

bend

C:

loop or irregular

The “C” code for loops and random coil replaces the blank used in the DSSP file, for improved readability.

Only versions 2 and 3 of DSSP are supported. If the format is not recognized as coming from one of these versions, an IOError is raised.

Parameters:

lines – An iterable over the lines of the DSSP output. This can be e.g. a list of lines, or a file object. The newline character is ignored.

Returns:

secstructs – The secondary structure assigned by DSSP as a list of one-letter secondary structure codes.

Return type:

list[str]

Raises:

IOError – When a line could not be parsed, or if the version of DSSP is not supported.
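A simplified reader gives an idea of what parsing the first column of the “STRUCTURE” block involves. This sketch is not read_dssp2() itself. It rests on assumptions about the classic fixed-width DSSP layout: the residue table starts after a line beginning with `  #  RESIDUE`, the secondary structure code sits at character index 16, and break records carry `!` in the amino acid column at index 13. `read_structure_column` is a hypothetical name.

```python
def read_structure_column(lines):
    """Return the one-letter codes from the first STRUCTURE column.

    Blank codes are translated to "C". Assumes the classic fixed-width DSSP
    layout described in the accompanying text.
    """
    lines = iter(lines)

    # Skip the header, up to and including the title row of the table.
    for line in lines:
        if line.startswith("  #  RESIDUE"):
            break
    else:
        raise IOError("The residue table was not found.")

    codes = []
    for line in lines:
        line = line.rstrip("\n")
        if len(line) < 17:
            raise IOError("Line too short to be parsed: {!r}".format(line))
        if line[13] == "!":
            # Break record (assumption): it carries no secondary structure.
            continue
        code = line[16]
        codes.append("C" if code == " " else code)
    return codes
```

Unlike the documented function, this sketch does not check the DSSP version or validate the codes it reads.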

vermouth.dssp.dssp.run_dssp(system, executable='dssp', savedir=None, defer_writing=True)[source]

Run DSSP on a system and return the assigned secondary structures.

Run DSSP using the path (or the name in the search PATH) given by “executable”. Return the secondary structure parsed from the output of the program.

In order to call DSSP, a PDB file is produced. Therefore, all the molecules in the system must contain the required attributes for such a file to be generated. Also, the atom names are assumed to be compatible with the ‘charmm’ force field for DSSP to recognize them. However, the molecules do not require the edges to be defined.

DSSP is assumed to be in version 2 or 3. The secondary structure codes are described in read_dssp2().

Parameters:
  • system (System)

  • executable (str) – Where to find the DSSP executable.

  • savedir (None or str or pathlib.Path) – If set to a path, the output of DSSP is written in this directory.

  • defer_writing (bool) – Whether to use write() for writing data

Returns:

The assigned secondary structures as a list of one-letter codes. The secondary structure sequences of all the molecules are combined in a single list without delimitation.

Return type:

list[str]

Raises:
  • DSSPError – DSSP failed to run.

  • IOError – The output of DSSP could not be parsed.

See also

read_dssp2

Parse a DSSP output.
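Because the returned list combines all molecules without delimitation, the caller has to cut it back into per-molecule pieces. A minimal sketch, assuming the number of residues reported for each molecule is known; `split_by_molecule` and `residue_counts` are hypothetical names, not part of vermouth.

```python
from itertools import islice


def split_by_molecule(secstructs, residue_counts):
    """Cut a flat list of codes into one list per molecule.

    `residue_counts` gives, in order, how many residues of `secstructs`
    belong to each molecule.
    """
    secstructs = list(secstructs)
    if len(secstructs) != sum(residue_counts):
        raise ValueError(
            "Got {} codes for {} residues."
            .format(len(secstructs), sum(residue_counts))
        )
    iterator = iter(secstructs)
    # Each islice call consumes the next `count` codes from the shared iterator.
    return [list(islice(iterator, count)) for count in residue_counts]
```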

vermouth.dssp.dssp.run_mdtraj(system)[source]

Compute DSSP secondary structure assignments for the system by using mdtraj.compute_dssp.

During processing, a PDB file is produced. Therefore, all the molecules in the system must contain the required attributes for such a file to be generated. Also, the atom names are assumed to be compatible with the ‘charmm’ force field for MDTraj to recognize them. However, the molecules do not require the edges to be defined.

Parameters:

system (System) – The system to process

Returns:

The assigned secondary structures as a list of one-letter codes. The secondary structure sequences of all the molecules are combined in a single list without delimitation.

Return type:

list[str]

vermouth.dssp.dssp.sequence_from_residues(molecule, attribute, default=None)[source]

Generates a sequence of attribute, one per residue in molecule.

Parameters:
Yields:

object – The value of attribute for every residue in molecule.
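The generator behaviour can be sketched on a plain list of node dictionaries. This is not the library's implementation: `sequence_from_nodes` is a hypothetical name, grouping consecutive nodes by `chain`, `resid`, and `resname` is an assumption, and so is reading the value from the first node of each residue.

```python
from itertools import groupby


def sequence_from_nodes(nodes, attribute, default=None):
    """Yield one value of `attribute` per residue.

    The value is read from the first node of each residue (assumption);
    `default` is used when that node lacks the attribute.
    """
    def residue_key(node):
        return (node.get("chain"), node.get("resid"), node.get("resname"))

    for _, residue_nodes in groupby(nodes, key=residue_key):
        first_node = next(residue_nodes)
        yield first_node.get(attribute, default)
```

Since this is a generator, wrap the call in `list()` or `"".join()` to obtain the whole sequence at once.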