Expansin homologues in actinobacterial genomes from South Africa (doi:10.18419/darus-699)

Document Description

Citation

Title:

Expansin homologues in actinobacterial genomes from South Africa

Identification Number:

doi:10.18419/darus-699

Distributor:

DaRUS

Date of Distribution:

2020-04-09

Version:

1

Bibliographic Citation:

Buchholz, Patrick C. F., 2020, "Expansin homologues in actinobacterial genomes from South Africa", https://doi.org/10.18419/darus-699, DaRUS, V1

Study Description

Citation

Title:

Expansin homologues in actinobacterial genomes from South Africa

Identification Number:

doi:10.18419/darus-699

Authoring Entity:

Buchholz, Patrick C. F. (Universität Stuttgart)

Other identifications and acknowledgements:

Le Roes-Hill, Marilize (Cape Peninsula University of Technology, South Africa)

Distributor:

DaRUS

Access Authority:

Pleiss, Jürgen

Depositor:

Buchholz, Patrick C. F.

Date of Deposit:

2020-03-06

Holdings Information:

https://doi.org/10.18419/darus-699

Study Scope

Keywords:

Medicine, Health and Life Sciences, protein sequence, amino acid sequence, genome mining, nucleic acid sequence, DNA sequence

Abstract:

Hit sequences for putative expansins (or expansin domains) are reported from an exemplary genome screening. Five actinobacterial genomes were selected to demonstrate how the Expansin Engineering Database (ExED) can be applied to identify expansin domains. The original nucleic acid sequences were translated with the transeq tool from the EMBOSS software suite, using its standard codon table. The hmmscan tool from the HMMER software suite was then used to scan the translated amino acid sequences against profile hidden Markov models (profile HMMs) representing the N- and C-terminal expansin domains. The hmmscan hits were filtered by a minimum domain-based score of 35 and a minimum coverage of 75% (coverage being the hit length without insertions divided by the length of the profile HMM). Matches to the profile HMMs of both expansin domains were extended along the contig sequence of each match to the adjacent start methionine and stop codon. Where a start or stop codon was missing, the first or last available amino acid position in the contig, respectively, was used instead.
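
The filtering and extension steps described in the abstract can be illustrated with a minimal Python sketch. It is not the code used to produce this dataset. The class DomainHit, its field names, and the functions coverage, passes_filter and extend_hit are hypothetical names introduced here for illustration. The sketch assumes that the hit length without insertions is already known for each hit, that a translated reading frame is given as a string in which "*" marks a stop codon, and that "adjacent" means the nearest methionine upstream of the hit. The source does not say what happens when a stop codon lies upstream of the hit with no methionine in between; the sketch then starts the extended hit right after that stop codon, which is an assumption.

```python
from dataclasses import dataclass
from typing import Tuple

MIN_SCORE = 35.0      # minimum domain-based score stated in the abstract
MIN_COVERAGE = 0.75   # minimum coverage (75%) stated in the abstract


@dataclass
class DomainHit:
    """One profile HMM hit. Field names are hypothetical."""
    score: float         # domain-based score of the hit
    aligned_length: int  # hit length without insertions
    hmm_length: int      # length of the profile HMM


def coverage(hit: DomainHit) -> float:
    """Hit length without insertions divided by the profile HMM length."""
    return hit.aligned_length / hit.hmm_length


def passes_filter(hit: DomainHit) -> bool:
    """Keep a hit only if it reaches both thresholds."""
    return hit.score >= MIN_SCORE and coverage(hit) >= MIN_COVERAGE


def extend_hit(frame: str, hit_start: int, hit_end: int) -> Tuple[int, int]:
    """Extend a hit within one translated reading frame.

    hit_start and hit_end are 0-based, end-exclusive positions in frame.
    Returns (start, end) of the extended hit, also end-exclusive and
    without the stop symbol.
    """
    # Upstream: do not read across an earlier stop codon (assumption).
    region_start = frame.rfind("*", 0, hit_start) + 1
    # Nearest methionine at or before the first hit position.
    met = frame.rfind("M", region_start, hit_start + 1)
    # No methionine: fall back to the first available position.
    start = met if met != -1 else region_start

    # Downstream: first stop codon at or after the end of the hit.
    stop = frame.find("*", hit_end)
    # No stop codon: fall back to the last available position.
    end = stop if stop != -1 else len(frame)
    return start, end
```

For example, extend_hit("AAMKKMHITSEQKK*RR", 6, 12) extends the hit HITSEQ to the span (5, 14), which corresponds to MHITSEQKK.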

Notes:

The five extended hit sequences correspond to five actinobacterial genomes from different habitats in South Africa: (1) Rooibos (Aspalathus linearis) plant material, Clanwilliam; (2) Sediments collected from the banks of the Gamka River, Swartberg Mountain Range; (3) Marine ascidian collected from Algoa Bay; (4) Sediment from the roots of a giant quiver tree, Worcester; (5) Wild garlic plant material, Wellington.

Methodology and Processing

Sources Statement

Data Access

Other Study Description Materials

Related Studies

Lohoff, Caroline, 2020, "Profile hidden Markov models of the ExED", https://doi.org/10.18419/darus-623, DaRUS

Related Publications

Citation

Title:

The Expansin Engineering Database: a navigation and classification tool for expansins and homologues

Identification Number:

10.1002/prot.26001

Bibliographic Citation:

Lohoff C., Buchholz P. C. F., Le Roes-Hill M. & Pleiss J. (2020). The Expansin Engineering Database: a navigation and classification tool for expansins and homologues. Proteins: Structure, Function, and Bioinformatics 89:2.

Other Reference Note(s)

Coil D, Jospin G, Darling AE. A5-miseq: An updated pipeline to assemble microbial genomes from Illumina MiSeq data. Bioinformatics. 2015;31:587-589. doi:10.1093/bioinformatics/btu661

Other Study-Related Materials

Label:

hits_aa.fasta

Text:

FASTA file of protein sequences (amino acid symbols). The numbers in the headers correspond to the hits mentioned in the Supporting Information file from Lohoff et al. 2020.

Notes:

application/octet-stream

Other Study-Related Materials

Label:

hits_nt.fasta

Text:

FASTA file of nucleic acid sequences. The numbers in the headers correspond to the hits mentioned in the Supporting Information file from Lohoff et al. 2020.

Notes:

application/octet-stream
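
Both files are in FASTA format, so they can be read with any FASTA parser. The following Python sketch shows one minimal way to do so without third-party libraries. The function name read_fasta is hypothetical, and the sketch makes no assumption about the header layout beyond the leading ">" of the FASTA format.

```python
from typing import Dict, Iterable, List


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each FASTA header (without the leading '>') to its sequence."""
    parts: Dict[str, List[str]] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record begins; later lines belong to this header.
            header = line[1:]
            parts[header] = []
        elif header is not None:
            # Sequence lines may be wrapped; collect them in order.
            parts[header].append(line)
    # Join the wrapped lines of every record into one string.
    return {name: "".join(chunks) for name, chunks in parts.items()}
```

Because an open text file is an iterable of lines, the function can be called as read_fasta(handle) inside a block such as with open("hits_aa.fasta") as handle, and likewise for hits_nt.fasta.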