EMBOSS
Description
EMBOSS is the European Molecular Biology Open Software Suite, which contains a wide array of general-purpose bioinformatics programs. For the GEM-PRO pipeline, we mainly need needle, the pairwise alignment tool (although this can be replaced with Biopython’s built-in pairwise alignment function), and pepstats, the protein sequence statistics tool.
Installation instructions (Ubuntu)
Note
These instructions were created on an Ubuntu 17.04 system.
Install the EMBOSS package, which contains many programs:
sudo apt-get install emboss
Once that has installed, try running the needle program:
needle
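The same check can be done from Python. The following is a minimal sketch, not part of ssbio: the helper names and file names are hypothetical, and it assumes the standard needle options -asequence, -bsequence, -gapopen, -gapextend and -outfile.

```python
import shutil
import subprocess


def needle_available():
    """Return True if the needle executable can be found on the PATH."""
    return shutil.which("needle") is not None


def build_needle_command(a_fasta, b_fasta, outfile, gapopen=10.0, gapextend=0.5):
    """Build the argument list for a needle pairwise alignment.

    Hypothetical helper: the gap penalties are needle's usual defaults,
    passed explicitly so that the program does not prompt for them.
    """
    return [
        "needle",
        "-asequence", a_fasta,
        "-bsequence", b_fasta,
        "-gapopen", str(gapopen),
        "-gapextend", str(gapextend),
        "-outfile", outfile,
    ]


def run_needle(a_fasta, b_fasta, outfile):
    """Run needle and return the path to the alignment file."""
    if not needle_available():
        raise RuntimeError("needle was not found; is EMBOSS installed?")
    subprocess.run(build_needle_command(a_fasta, b_fasta, outfile), check=True)
    return outfile
```

Calling needle_available() first gives a clearer error than a failed subprocess call when EMBOSS is not installed.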
Installation instructions (Mac OSX, other Unix)
Download the EMBOSS source code, then build and install it from the source directory:
./configure
make
sudo make install
FAQs
How do I cite EMBOSS?
- Rice P, Longden I & Bleasby A (2000) EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet. 16: 276–277 Available at: http://www.ncbi.nlm.nih.gov/pubmed/10827456
I’m having issues running EMBOSS programs…
- See the ssbio wiki for (hopefully) some solutions - or add yours in when you find the answer!
API
ssbio.protein.sequence.properties.residues.biopython_protein_analysis(inseq)
Utilize Biopython’s ProteinAnalysis module to return general sequence properties of an amino acid string.
For full definitions see: http://biopython.org/DIST/docs/api/Bio.SeqUtils.ProtParam.ProteinAnalysis-class.html
Parameters: inseq – Amino acid sequence
Returns: Dictionary of sequence properties. Some definitions include:
- instability_index: Any value above 40 means the protein is unstable (has a short half life).
- secondary_structure_fraction: Fraction of the protein in helix, turn or sheet
Return type: dict
Todo: Finish definitions of dictionary
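For orientation, here is a rough sketch of how such properties can be pulled from Biopython's ProteinAnalysis class. It is not the ssbio implementation: the function names and dictionary keys below are made up for illustration, and Biopython must be installed for basic_sequence_properties to run.

```python
def is_predicted_unstable(instability_index, threshold=40.0):
    """Apply the rule quoted above: values above 40 indicate instability."""
    return instability_index > threshold


def basic_sequence_properties(inseq):
    """Collect a few ProteinAnalysis results into a dictionary.

    Illustrative only: the keys are made up for this sketch and are not
    the keys returned by biopython_protein_analysis.
    """
    # Imported here so this module still loads when Biopython is missing.
    from Bio.SeqUtils.ProtParam import ProteinAnalysis

    analysis = ProteinAnalysis(inseq)
    # secondary_structure_fraction() returns three fractions: helix, turn, sheet.
    helix, turn, sheet = analysis.secondary_structure_fraction()
    instability = analysis.instability_index()
    return {
        "molecular_weight": analysis.molecular_weight(),
        "isoelectric_point": analysis.isoelectric_point(),
        "instability_index": instability,
        "predicted_unstable": is_predicted_unstable(instability),
        "fraction_helix": helix,
        "fraction_turn": turn,
        "fraction_sheet": sheet,
    }
```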
ssbio.protein.sequence.properties.residues.emboss_pepstats_on_fasta(infile, outfile='', outdir='', outext='.pepstats', force_rerun=False)
Run EMBOSS pepstats on a FASTA file.
Parameters:
- infile – Path to FASTA file
- outfile – Name of output file without extension
- outdir – Path to output directory
- outext – Extension of results file, default is “.pepstats”
- force_rerun – Flag to rerun pepstats
Returns: Path to output file.
Return type: str
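A wrapper with these parameters might look roughly like the sketch below. It is an assumption-laden illustration rather than the ssbio source: the function names are hypothetical, and it assumes that an empty outfile falls back to the input file's base name and an empty outdir to the input file's directory. The pepstats options -sequence, -outfile and -auto are standard EMBOSS qualifiers.

```python
import os
import subprocess


def pepstats_output_path(infile, outfile="", outdir="", outext=".pepstats"):
    """Derive the results path from the documented parameters.

    Assumption: an empty outfile falls back to the input file's base name
    and an empty outdir falls back to the input file's directory.
    """
    name = outfile or os.path.splitext(os.path.basename(infile))[0]
    directory = outdir or os.path.dirname(infile)
    return os.path.join(directory, name + outext)


def run_pepstats(infile, outfile="", outdir="", outext=".pepstats", force_rerun=False):
    """Run pepstats unless a results file already exists."""
    result = pepstats_output_path(infile, outfile, outdir, outext)
    if force_rerun or not os.path.isfile(result):
        # -auto turns off interactive prompting.
        subprocess.run(
            ["pepstats", "-sequence", infile, "-outfile", result, "-auto"],
            check=True,
        )
    return result
```

Skipping the run when the results file already exists is what makes a force_rerun flag useful: repeated pipeline runs reuse earlier output unless told otherwise.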
ssbio.protein.sequence.properties.residues.emboss_pepstats_parser(infile)
Get dictionary of pepstats results.
Parameters: infile – Path to pepstats outfile
Returns: Parsed information from pepstats
Return type: dict
Todo: Currently only parsing the bottom of the file for percentages of properties.
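To give an idea of what parsing the bottom of the file involves, here is a small sketch that reads the property table into a dictionary. It assumes the table has the column layout shown in SAMPLE (property name, residue group in parentheses, count, mole percentage); the sample numbers are invented and the function is hypothetical, not the ssbio parser.

```python
import re

# Illustrative excerpt in the layout assumed by this sketch; the numbers are invented.
SAMPLE = """\
Property\tResidues\t\tNumber\t\tMole%
Tiny\t\t(A+C+G+S+T)\t\t41\t\t26.623
Small\t\t(A+B+C+D+G+N+P+S+T+V)\t73\t\t47.403
Non-polar\t(A+C+F+G+I+L+M+P+V+W+Y)\t85\t\t55.195
"""

# Property name, residue group in parentheses, count, mole percentage.
ROW = re.compile(r"^([A-Za-z-]+)\s+\(([A-Z+]+)\)\s+(\d+)\s+(\d+(?:\.\d+)?)\s*$")


def parse_property_table(text):
    """Map each property name to its mole percentage."""
    percentages = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:  # the header and unrelated lines do not match
            percentages[match.group(1)] = float(match.group(4))
    return percentages
```

Matching on the parenthesised residue group means the header row and the rest of the pepstats report are ignored without any need to track position in the file.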
ssbio.protein.sequence.properties.residues.flexibility_index(aa_one)
From Smith DK, Radivojac P, Obradovic Z, et al. Improved amino acid flexibility parameters. Protein Sci. 2003, 12:1060
Author: Ke Chen
Parameters: aa_one –
Returns:
ssbio.protein.sequence.properties.residues.grantham_score(ref_aa, mut_aa)
https://github.com/ashutoshkpandey/Annotation/blob/master/Grantham_score_calculator.py
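The Grantham score is a measure of the physicochemical distance between two amino acids, and it is usually computed with a symmetric table lookup. The sketch below shows that idea under stated assumptions: it is not the linked script or the ssbio code, the function name is hypothetical, and the table holds only three entries from Grantham's published matrix.

```python
# Three entries from Grantham's published distance matrix; a real
# implementation needs all 190 amino acid pairs.
GRANTHAM_EXCERPT = {
    frozenset("LI"): 5,    # leucine / isoleucine, the most similar pair
    frozenset("SR"): 110,  # serine / arginine
    frozenset("CW"): 215,  # cysteine / tryptophan, the most dissimilar pair
}


def grantham_lookup(ref_aa, mut_aa, table=GRANTHAM_EXCERPT):
    """Return the Grantham distance for a substitution, or None if unknown.

    Keys are frozensets, so the lookup is independent of argument order.
    """
    if ref_aa == mut_aa:
        return 0  # no substitution, no distance
    return table.get(frozenset((ref_aa, mut_aa)))
```

For example, grantham_lookup("W", "C") and grantham_lookup("C", "W") give the same value, because the distance does not depend on which residue is the reference.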