
Frames are representations that encapsulate the 3D positional and rotational information of residue backbones and sidechains in a protein structure. We use a formulation similar to Ingraham et al. (66). Each frame $T \in SE(3)$ consists of a rotation matrix $\mathbf{R} \in SO(3)$ and a translation vector $\mathbf{t} \in \mathbb{R}^{3}$.

Definition: A frame $T_{i}$ for residue $i$ is defined as:

$$ T_{i}=\left[\begin{array}{cc} \mathbf{R}_{i} & \mathbf{t}_{i} \\ \mathbf{0}_{1 \times 3} & 1 \end{array}\right] \in SE(3) $$

where $\mathbf{R}_{i} \in SO(3)$ and $\mathbf{t}_{i} \in \mathbb{R}^{3}$.

Rotation Matrix: The rotation matrix $\mathbf{R}_{i}$ for residue $i$ is composed of three 3-dimensional vectors $\left[\hat{x}, \hat{e}_{1}, \hat{e}_{2}\right]$:

  1. $\hat{x}$ and $\hat{e}_{1}$ are orthogonal unit vectors on the $N-C_{\alpha}-C$ plane.
  2. $\hat{e}_{2}$ is a unit vector perpendicular to both $\hat{x}$ and $\hat{e}_{1}$.

This matrix rotates vectors to a local coordinate system where the $N-C_{\alpha}-C$ plane for the corresponding residue spans the $xy$ plane.

Translation Vector: The translation vector $\mathbf{t}_{i}$ specifies the position of the residue's $C_{\alpha}$.

Transformation: To transform a point $\mathbf{p} \in \mathbb{R}^{3}$ from the local frame of residue $i$ to the global coordinate system, the following equation is used:

$$ \mathbf{p}_{\text{global}}=T_{i}(\mathbf{p})=\mathbf{R}_{i} \mathbf{p}+\mathbf{t}_{i} $$

Inverse Transformation: To transform a point $\mathbf{p}_{\text{global}} \in \mathbb{R}^{3}$ from the global coordinate system back to the local frame of residue $i$, the following equation is used:

$$ \mathbf{p}=T_{i}^{-1}\left(\mathbf{p}_{\text{global}}\right)=\mathbf{R}_{i}^{-1}\left(\mathbf{p}_{\text{global}}-\mathbf{t}_{i}\right) $$

Please explain this in detail to a beginning grad student in machine learning.

Response:


Understanding Frames in Protein Structures

Welcome! Let's delve into the concept of frames as they apply to protein structures, specifically focusing on how they represent the 3D positions and orientations of residues (the building blocks of proteins). We'll break down each component to ensure clarity.


What Are Frames in This Context?

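In the study of protein structures, a frame is a mathematical construct used to represent both the position and orientation of a residue in three-dimensional space. Frames allow us to say where each residue sits, how it is oriented, and how to convert points between a residue's local coordinate system and the global coordinate system of the whole protein.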

This is crucial because proteins are complex molecules where the spatial arrangement of residues influences their function.


Mathematical Representation of a Frame

A frame $T$ is represented as an element of the group $SE(3)$, the Special Euclidean group in three dimensions. This group consists of all rigid-body transformations of 3D space, that is, every combination of a rotation and a translation.

Definition of a Frame $T_i$ for Residue $i$:

$$ T_i = \begin{bmatrix} \mathbf{R}_i & \mathbf{t}_i \\ \mathbf{0}_{1 \times 3} & 1 \end{bmatrix} \in SE(3) $$

This matrix $T_i$ combines rotation and translation into a single transformation.
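To see how the 4×4 matrix packs both pieces together, here is a minimal NumPy sketch. The helper name `make_frame` and the numeric values are hypothetical, chosen only for illustration; they do not come from the excerpt.

```python
import numpy as np

def make_frame(R, t):
    """Pack a 3x3 rotation R and a length-3 translation t into a 4x4 matrix."""
    T = np.eye(4)          # bottom row is already [0, 0, 0, 1]
    T[:3, :3] = R          # upper-left block: rotation
    T[:3, 3] = t           # upper-right column: translation
    return T

# Illustrative values: a 90-degree rotation about the z-axis and a shift.
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
t = np.array([1.0, 2.0, 3.0])
T = make_frame(R, t)

# Append a 1 to the point (homogeneous coordinates) so that a single
# matrix product performs the rotation and the translation together.
p = np.array([1.0, 0.0, 0.0])
p_h = np.append(p, 1.0)
p_global = (T @ p_h)[:3]
```

Multiplying by the 4×4 matrix gives the same result as computing `R @ p + t` directly, which is why the block-matrix notation and the "rotate, then translate" formula are interchangeable.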


Rotation Matrix $\mathbf{R}_i$

The rotation matrix $\mathbf{R}_i$ defines how the residue is oriented in space. It is built from three orthonormal vectors (unit length and mutually perpendicular), which together form a local coordinate system for the residue.

Components of $\mathbf{R}_i$:

  1. $\hat{x}$: A unit vector lying in the plane formed by the backbone atoms $N$, $C_\alpha$, and $C$. The excerpt does not fix its direction within the plane; one possible choice, used here, is the direction from the nitrogen atom $N$ to the alpha carbon $C_\alpha$.
  2. $\hat{e}_1$: A unit vector orthogonal to $\hat{x}$ that also lies in the $N$-$C_\alpha$-$C$ plane.
  3. $\hat{e}_2$: A unit vector perpendicular to both $\hat{x}$ and $\hat{e}_1$, completing a right-handed coordinate system (required for $\mathbf{R}_i \in SO(3)$, whose determinant is $+1$).
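The excerpt does not spell out how the three vectors are computed from atom coordinates. A minimal sketch using Gram-Schmidt orthogonalization is shown below. It assumes that $\hat{x}$ points from $N$ to $C_\alpha$ and that the three vectors are the columns of $\mathbf{R}_i$; the function name `frame_from_backbone` and the coordinates are hypothetical.

```python
import numpy as np

def frame_from_backbone(n, ca, c):
    """Build (R, t) from N, C-alpha, and C coordinates (each a length-3 array).

    Assumption of this sketch: x_hat points from N to C-alpha, and the
    columns of R are [x_hat, e1_hat, e2_hat].
    """
    x_hat = ca - n
    x_hat = x_hat / np.linalg.norm(x_hat)

    # Gram-Schmidt: remove from (C - CA) its component along x_hat, leaving
    # an in-plane vector that is orthogonal to x_hat.
    v = c - ca
    e1 = v - np.dot(v, x_hat) * x_hat
    e1_hat = e1 / np.linalg.norm(e1)

    # The cross product is perpendicular to the plane and makes the
    # coordinate system right-handed (det R = +1).
    e2_hat = np.cross(x_hat, e1_hat)

    R = np.stack([x_hat, e1_hat, e2_hat], axis=1)
    return R, ca.copy()

# Made-up coordinates (in angstroms), for illustration only.
n = np.array([0.0, 0.0, 0.0])
ca = np.array([1.46, 0.0, 0.0])
c = np.array([2.0, 1.4, 0.0])
R, t = frame_from_backbone(n, ca, c)
```

With this construction, all three backbone atoms have a zero third coordinate when expressed in the local frame, which matches the statement that the $N$-$C_\alpha$-$C$ plane spans the local $xy$ plane.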

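Purpose of $\mathbf{R}_i$: It relates the residue's local coordinate system, in which the $N$-$C_\alpha$-$C$ plane spans the $xy$ plane, to the global coordinate system. Reading the three vectors as the columns of $\mathbf{R}_i$ (which is consistent with the transformation equations in the excerpt), the columns are the residue's local axes expressed in global coordinates.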


Translation Vector $\mathbf{t}_i$

The translation vector $\mathbf{t}_i$ specifies the position of the residue's alpha carbon $C_\alpha$ in the global coordinate system.


Transforming Points Using the Frame

Goal: To transform a point from the local coordinate system of a residue to the global coordinate system or vice versa.

From Local to Global Coordinates

To transform a point $\mathbf{p}$ (in the local frame of residue $i$) to the global coordinate system:

$$ \mathbf{p}_{\text{global}} = T_i(\mathbf{p}) = \mathbf{R}_i \mathbf{p} + \mathbf{t}_i $$

Interpretation:

  1. Rotation: Adjust the point's orientation from the local frame to match the global frame.
  2. Translation: Move the point to the appropriate location in the global coordinate system.
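The two steps above can be sketched in a few lines of NumPy. The function name `local_to_global` and the example values are hypothetical; the function accepts either a single point or an array of points stored as rows.

```python
import numpy as np

def local_to_global(R, t, p):
    """Apply p_global = R p + t to one point (3,) or many points (N, 3)."""
    p = np.asarray(p, dtype=float)
    # For points stored as row vectors, p @ R.T applies R to each point.
    return p @ R.T + t

# Illustrative frame: a 90-degree rotation about z and a shift.
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
t = np.array([1.0, 2.0, 3.0])

# Two points given in the residue's local coordinates.
points = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
points_global = local_to_global(R, t, points)
```

Note that the local origin `[0, 0, 0]` always maps to `t`, the position of the residue's $C_\alpha$.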

From Global to Local Coordinates (Inverse Transformation)

To transform a global point $\mathbf{p}_{\text{global}}$ back to the local frame of residue $i$:

$$ \mathbf{p} = T_i^{-1}(\mathbf{p}_{\text{global}}) = \mathbf{R}_i^{-1} (\mathbf{p}_{\text{global}} - \mathbf{t}_i) $$

Interpretation:

  1. Translation: Shift the global point so that the origin is at $\mathbf{t}_i$ (the position of the residue).
  2. Inverse Rotation: Rotate the point to match the local orientation of the residue.
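In practice the inverse rotation needs no matrix inversion: a rotation matrix is orthogonal, so $\mathbf{R}_i^{-1} = \mathbf{R}_i^{\top}$. The sketch below uses that fact; the function name `global_to_local` and the values are hypothetical.

```python
import numpy as np

def global_to_local(R, t, p_global):
    """Apply p = R^T (p_global - t) to one point (3,) or many points (N, 3)."""
    p_global = np.asarray(p_global, dtype=float)
    # For points stored as row vectors, q @ R applies R^T to each point.
    return (p_global - t) @ R

# Illustrative frame: a 90-degree rotation about z and a shift.
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
t = np.array([1.0, 2.0, 3.0])

p_global = np.array([1.0, 3.0, 3.0])
p_local = global_to_local(R, t, p_global)

# Round trip: mapping the local point forward recovers the global point.
p_back = R @ p_local + t
```

The round trip at the end is a useful sanity check whenever frames are implemented by hand.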

Why Is This Important in Machine Learning?

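In machine learning, especially in computational biology and bioinformatics, understanding the spatial configuration of proteins is essential for tasks such as protein structure prediction and protein design.

By representing residues with frames, we can describe each residue's position and orientation explicitly, and we can express nearby atoms or residues in a residue's local coordinates. Those local coordinates stay the same when the whole protein is rotated or translated, which is a useful property for a model's inputs.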
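One property that makes frames convenient for learning is that the local coordinates of a point do not change when the whole structure is rotated and translated, provided the frame moves with it. The sketch below checks this numerically with random values; the helper `random_rotation` and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_rotation(rng):
    """Return a random 3x3 rotation matrix via QR decomposition."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    # Flip one axis if needed so that det = +1 (a proper rotation).
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q

# A hypothetical residue frame and a point given in global coordinates.
R, t = random_rotation(rng), rng.normal(size=3)
p_global = rng.normal(size=3)
local_before = R.T @ (p_global - t)

# Rigidly move the whole structure: x -> Q x + s. The frame moves with it.
Q, s = random_rotation(rng), rng.normal(size=3)
R_moved, t_moved = Q @ R, Q @ t + s
p_moved = Q @ p_global + s
local_after = R_moved.T @ (p_moved - t_moved)
```

Here `local_before` and `local_after` agree up to floating-point error, because the global rotation `Q` and shift `s` cancel inside the inverse transformation.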


Step-by-Step Example

Let's walk through an example to solidify the concepts.

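Suppose we have a residue $i$ whose frame is given by a rotation matrix $\mathbf{R}_i$ and a translation vector $\mathbf{t}_i$, together with a point $\mathbf{p}$ expressed in the local frame of that residue.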

Transformation to Global Coordinates:

  1. Rotate $\mathbf{p}$:

    Compute $\mathbf{R}_i \mathbf{p}$. This adjusts the orientation based on how residue $i$ is rotated in space.

  2. Translate $\mathbf{p}$:

    Add $\mathbf{t}_i$ to the rotated point to position it correctly in the global coordinate system.

    $$ \mathbf{p}_{\text{global}} = \mathbf{R}_i \mathbf{p} + \mathbf{t}_i $$

Transformation Back to Local Coordinates:

  1. Translate $\mathbf{p}_{\text{global}}$:

    Subtract $\mathbf{t}_i$ from the global point to move the origin back to the local frame.

  2. Inverse Rotate $\mathbf{p}_{\text{global}}$:

    Apply $\mathbf{R}_i^{-1}$ to the translated point to adjust the orientation back to the local frame.

    $$ \mathbf{p} = \mathbf{R}_i^{-1} (\mathbf{p}_{\text{global}} - \mathbf{t}_i) $$
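To make these steps concrete, here is a worked numeric instance. The values are made up for illustration (a 90-degree rotation about the $z$-axis and an arbitrary translation); they do not come from the excerpt.

$$ \mathbf{R}_i = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad \mathbf{t}_i = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \quad \mathbf{p} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} $$

Local to global:

$$ \mathbf{R}_i \mathbf{p} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \qquad \mathbf{p}_{\text{global}} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ 3 \end{bmatrix} $$

Global to local, using $\mathbf{R}_i^{-1} = \mathbf{R}_i^{\top}$ (true for any rotation matrix):

$$ \mathbf{p}_{\text{global}} - \mathbf{t}_i = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \qquad \mathbf{R}_i^{\top} \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \mathbf{p} $$

The round trip recovers the original local point, as expected.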


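Key Takeaways

  1. A frame $T_i \in SE(3)$ pairs a rotation matrix $\mathbf{R}_i \in SO(3)$ with a translation vector $\mathbf{t}_i \in \mathbb{R}^3$, giving each residue its own local coordinate system.
  2. $\mathbf{R}_i$ is built from orthonormal vectors tied to the $N$-$C_\alpha$-$C$ plane, and $\mathbf{t}_i$ is the position of the residue's $C_\alpha$.
  3. Local to global: $\mathbf{p}_{\text{global}} = \mathbf{R}_i \mathbf{p} + \mathbf{t}_i$. Global to local: $\mathbf{p} = \mathbf{R}_i^{-1} (\mathbf{p}_{\text{global}} - \mathbf{t}_i)$.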


Further Reading

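To deepen your understanding, consider exploring the following topics: rotation matrices and the group $SO(3)$, rigid-body transformations and the group $SE(3)$, homogeneous coordinates, and the formulation of Ingraham et al. (66) cited in the excerpt.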


Feel free to ask any questions or request clarification on any of the points above!