
 
A sequence motif is a recurring nucleotide or amino-acid sequence pattern. It differs from a structural motif, which is formed by a three-dimensional arrangement of amino acids that may not be adjacent in the sequence. Biopython provides a separate module, Bio.motifs, for working with sequence motifs. Import it as shown below −
from Bio import motifs
Let us create a simple DNA motif sequence using the below command −
>>> from Bio import motifs 
>>> from Bio.Seq import Seq 
>>> DNA_motif = [ Seq("AGCT"), 
...               Seq("TCGA"), 
...               Seq("AACT"), 
...             ] 
>>> seq = motifs.create(DNA_motif) 
>>> print(seq)
AGCT
TCGA
AACT
To see how many times each nucleotide occurs at each position of the motif, use the below command −
>>> print(seq.counts) 
         0       1      2       3 
A:    2.00    1.00   0.00    1.00 
C:    0.00    1.00   2.00    0.00 
G:    0.00    1.00   1.00    0.00 
T:    1.00    0.00   0.00    2.00
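To make clear what the counts matrix holds, here is a minimal pure-Python sketch that rebuilds the same table by hand from the three sequences. It does not use Biopython; the names instances, alphabet and counts are illustrative only and are not part of the Bio.motifs API:

```python
# Pure-Python sketch (no Biopython needed): build a position count matrix.
instances = ["AGCT", "TCGA", "AACT"]  # the three sequences used above
alphabet = "ACGT"
length = len(instances[0])  # all instances of a motif share the same length

# One row per letter, one column per motif position.
counts = {letter: [0] * length for letter in alphabet}
for instance in instances:
    for position, letter in enumerate(instance):
        counts[letter][position] += 1

# Print one row per nucleotide, in the same order as the table above.
for letter in alphabet:
    print(letter, counts[letter])
```

Each row lists how often one nucleotide appears in each column, which is the same information that seq.counts reports as floating-point values.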
Use the following code to get the count of ‘A’ at each position of the motif −
>>> seq.counts["A", :]
(2, 1, 0, 1)
If you want to access the columns of counts, use the below command −
>>> seq.counts[:, 3] 
{'A': 1, 'C': 0, 'T': 2, 'G': 0}
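The row and column lookups above can be pictured with a plain dictionary of lists. The following is only a sketch of the idea, assuming the counts are stored one row per nucleotide; it does not reflect how Biopython stores them internally, and row_a and column_3 are illustrative names:

```python
# Count matrix from the example above, written out as a plain dictionary.
counts = {
    "A": [2, 1, 0, 1],
    "C": [0, 1, 2, 0],
    "G": [0, 1, 1, 0],
    "T": [1, 0, 0, 2],
}

# A row: the count of one nucleotide at every position.
row_a = tuple(counts["A"])

# A column: the count of every nucleotide at one position (here position 3).
column_3 = {letter: row[3] for letter, row in counts.items()}

print(row_a)
print(column_3)
```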
We shall now discuss how to create a Sequence Logo.
Consider the below sequence −
AGCTTACG
ATCGTACC
TTCCGAAT
GGTACGTA
AAGCTTGG
You can create your own logo using the following link − http://weblogo.berkeley.edu/
Paste the above sequences into the form, create a new logo, and save the image as seq.png in your biopython folder.
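A motif object can also request its own logo: the weblogo method sends the motif to the WebLogo service and saves the returned image under the given file name, so it needs an internet connection −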
>>> seq.weblogo("seq.png")
This DNA sequence motif is represented as a sequence logo for the LexA-binding motif.
JASPAR is one of the most popular motif databases. Bio.motifs can read JASPAR motifs in the sites, pfm and jaspar formats. JASPAR also stores meta-information for each motif. The module Bio.motifs contains a specialized class, jaspar.Motif, that represents this meta-information as attributes.
Its notable attributes include matrix_id and name, which appear as "Matrix ID" and "TF name" when a motif is printed.
Let us create a file in JASPAR sites format named sample.sites in the biopython folder. Its contents are given below −
>MA0001 ARNT 1
AACGTGatgtccta
>MA0001 ARNT 2
CAGGTGggatgtac
>MA0001 ARNT 3
TACGTAgctcatgc
>MA0001 ARNT 4
AACGTGacagcgct
>MA0001 ARNT 5
CACGTGcacgtcgt
>MA0001 ARNT 6
cggcctCGCGTGc
In the above file, we have created motif instances. Now, let us create a motif object from the above instances −
>>> from Bio import motifs 
>>> with open("sample.sites") as handle: 
...     data = motifs.read(handle, "sites")
... 
>>> print(data) 
TF name None 
Matrix ID None 
Matrix:
            0       1       2       3       4       5 
A:       2.00    5.00    0.00    0.00    0.00    1.00 
C:       3.00    0.00    5.00    0.00    0.00    0.00 
G:       0.00    1.00    1.00    6.00    0.00    5.00 
T:       1.00    0.00    0.00    0.00    6.00    0.00
Here, data holds the motif built from all the instances in the sample.sites file.
To print all the instances from data, use the below command −
>>> for instance in data.instances:
...     print(instance)
...
AACGTG
CAGGTG
TACGTA
AACGTG
CACGTG
CGCGTG
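Each printed instance matches the upper-case run of the corresponding line in sample.sites. The sketch below reproduces that extraction in plain Python, assuming upper-case letters mark the site and lower-case letters the flanking sequence, which is consistent with the output above. The names sites_text and SITE_RE are illustrative and not part of Biopython:

```python
import re

# Contents of sample.sites, inlined so the sketch is self-contained.
sites_text = """>MA0001 ARNT 1
AACGTGatgtccta
>MA0001 ARNT 2
CAGGTGggatgtac
>MA0001 ARNT 3
TACGTAgctcatgc
>MA0001 ARNT 4
AACGTGacagcgct
>MA0001 ARNT 5
CACGTGcacgtcgt
>MA0001 ARNT 6
cggcctCGCGTGc
"""

# Assumption: one upper-case site per line, with optional lower-case flanks.
SITE_RE = re.compile(r"^[acgt]*([ACGT]+)[acgt]*$")

instances = []
for line in sites_text.splitlines():
    line = line.strip()
    if not line or line.startswith(">"):
        continue  # skip blank lines and header lines
    match = SITE_RE.match(line)
    if match:
        instances.append(match.group(1))  # keep only the upper-case site

for instance in instances:
    print(instance)
```

Lines that do not fit the assumed pattern are skipped silently here; a stricter reader might raise an error instead.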
Use the below command to print the nucleotide counts at each position −
>>> print(data.counts)
            0       1       2       3       4       5 
A:       2.00    5.00    0.00    0.00    0.00    1.00 
C:       3.00    0.00    5.00    0.00    0.00    0.00 
G:       0.00    1.00    1.00    6.00    0.00    5.00 
T:       1.00    0.00    0.00    0.00    6.00    0.00