Part 1: Document Description
|
Citation |
|
---|---|
Title: |
Expansin homologues in actinobacterial genomes from South Africa |
Identification Number: |
doi:10.18419/darus-699 |
Distributor: |
DaRUS |
Date of Distribution: |
2020-04-09 |
Version: |
1 |
Bibliographic Citation: |
Buchholz, Patrick C. F., 2020, "Expansin homologues in actinobacterial genomes from South Africa", https://doi.org/10.18419/darus-699, DaRUS, V1 |
Citation |
|
Title: |
Expansin homologues in actinobacterial genomes from South Africa |
Identification Number: |
doi:10.18419/darus-699 |
Authoring Entity: |
Buchholz, Patrick C. F. (Universität Stuttgart) |
Other identifications and acknowledgements: |
Le Roes-Hill, Marilize (Cape Peninsula University of Technology, South Africa) |
Distributor: |
DaRUS |
Access Authority: |
Pleiss, Jürgen |
Depositor: |
Buchholz, Patrick C. F. |
Date of Deposit: |
2020-03-06 |
Holdings Information: |
https://doi.org/10.18419/darus-699 |
Study Scope |
|
Keywords: |
Medicine, Health and Life Sciences, protein sequence, amino acid sequence, genome mining, nucleic acid sequence, DNA sequence |
Abstract: |
Hit sequences for putative expansins (or expansin domains) are reported from an exemplary genome screening. Five actinobacterial genomes were selected to demonstrate how the Expansin Engineering Database (ExED) can be used to identify expansin domains. The original nucleic acid sequences were translated with the transeq tool from the EMBOSS software suite, using the standard codon table. The translated amino acid sequences were then scanned with the hmmscan tool from the HMMER software suite against profile hidden Markov models (profile HMMs) representing the N- and C-terminal expansin domains. The hmmscan hits were filtered by a minimum domain-based score of 35 and a minimum coverage of 75% (coverage being the hit length without insertions divided by the length of the profile HMM). Matches to the profile HMMs of both expansin domains were extended along the contig sequence of each match to the adjacent start methionine and stop codon. Where a start or stop codon was missing, the first or last available amino acid position in the contig, respectively, was used to extend the hit. |
Notes: |
The five extended hit sequences correspond to five actinobacterial genomes from different habitats in South Africa: (1) Rooibos (Aspalathus linearis) plant material, Clanwilliam; (2) Sediments collected from the banks of the Gamka River, Swartberg Mountain Range; (3) Marine ascidian collected from Algoa Bay; (4) Sediment from the roots of a giant quiver tree, Worcester; (5) Wild garlic plant material, Wellington. |
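The filtering and extension steps described in the abstract can be illustrated with a minimal Python sketch. It is not the code used to produce this dataset. The names `DomainHit`, `coverage`, `passes_filter`, and `extend_hit` are hypothetical. The sketch assumes that the "hit length without insertions" can be approximated by the span of profile positions covered by the hit, that the translated contig marks stop codons with `*`, and that "adjacent start methionine" means the nearest upstream `M`. Stop codons upstream of the hit are not treated specially, because the abstract does not say how they were handled.

```python
from dataclasses import dataclass

MIN_SCORE = 35.0     # minimum domain-based score stated in the abstract
MIN_COVERAGE = 0.75  # minimum coverage stated in the abstract


@dataclass
class DomainHit:
    """One domain hit against a profile HMM (hypothetical container)."""
    score: float   # domain-based score
    hmm_from: int  # first profile position covered (1-based, inclusive)
    hmm_to: int    # last profile position covered (1-based, inclusive)
    hmm_len: int   # total length of the profile HMM


def coverage(hit: DomainHit) -> float:
    # Assumption: the covered span of profile positions stands in for
    # the "hit length without insertions" named in the abstract.
    return (hit.hmm_to - hit.hmm_from + 1) / hit.hmm_len


def passes_filter(hit: DomainHit) -> bool:
    # Keep hits that reach both thresholds.
    return hit.score >= MIN_SCORE and coverage(hit) >= MIN_COVERAGE


def extend_hit(protein: str, hit_start: int, hit_end: int) -> str:
    """Extend a hit along one translated contig frame.

    protein   -- translated frame, with '*' marking stop codons (assumed)
    hit_start -- 0-based index of the first hit residue (inclusive)
    hit_end   -- 0-based index just past the last hit residue (exclusive)
    """
    # Nearest methionine at or upstream of the hit start;
    # fall back to the first available position in the contig.
    start = protein.rfind("M", 0, hit_start + 1)
    if start == -1:
        start = 0
    # First stop codon at or downstream of the hit end;
    # fall back to the last available position in the contig.
    stop = protein.find("*", hit_end)
    if stop == -1:
        stop = len(protein)
    return protein[start:stop]
```

For example, `extend_hit("AAMKKGGGTT*LL", 5, 8)` extends the three-residue hit `GGG` back to the upstream `M` and forward to the residue before `*`. A frame without `M` or `*` is returned in full.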
Methodology and Processing |
|
Sources Statement |
|
Data Access |
|
Other Study Description Materials |
|
Related Studies |
|
Lohoff, Caroline, 2020, "Profile hidden Markov models of the ExED", doi:10.18419/darus-623, DaRUS |
|
Related Publications |
|
Citation |
|
Title: |
Lohoff C., Buchholz P. C. F., Le Roes-Hill M. & Pleiss J. (2020). The Expansin Engineering Database: a navigation and classification tool for expansins and homologues. Proteins: Structure, Function, and Bioinformatics 89:2. |
Identification Number: |
10.1002/prot.26001 |
Bibliographic Citation: |
Lohoff C., Buchholz P. C. F., Le Roes-Hill M. & Pleiss J. (2020). The Expansin Engineering Database: a navigation and classification tool for expansins and homologues. Proteins: Structure, Function, and Bioinformatics 89:2. |
Other Reference Note(s) |
|
Coil D, Jospin G, Darling AE. A5-miseq: An updated pipeline to assemble microbial genomes from Illumina MiSeq data. Bioinformatics. 2015;31:587-589. doi:10.1093/bioinformatics/btu661
|
Label: |
hits_aa.fasta |
Text: |
FASTA file of protein sequences (amino acid symbols). The numbers in the headers correspond to the hits mentioned in the Supporting Information file from Lohoff et al. 2020. |
Notes: |
application/octet-stream |
Label: |
hits_nt.fasta |
Text: |
FASTA file of nucleic acid sequences. The numbers in the headers correspond to the hits mentioned in the Supporting Information file from Lohoff et al. 2020. |
Notes: |
application/octet-stream |