GraphML files for sequence networks of PETases and PURases (doi:10.18419/darus-2054)

Document Description

Citation

Title:

GraphML files for sequence networks of PETases and PURases

Identification Number:

doi:10.18419/darus-2054

Distributor:

DaRUS

Date of Distribution:

2021-12-01

Version:

1

Bibliographic Citation:

Buchholz, Patrick C. F., 2021, "GraphML files for sequence networks of PETases and PURases", https://doi.org/10.18419/darus-2054, DaRUS, V1

Study Description

Citation

Title:

GraphML files for sequence networks of PETases and PURases

Identification Number:

doi:10.18419/darus-2054

Authoring Entity:

Buchholz, Patrick C. F. (Universität Stuttgart)

Distributor:

DaRUS

Access Authority:

Pleiss, Jürgen

Depositor:

Buchholz, Patrick C. F.

Date of Deposit:

2021-06-28

Holdings Information:

https://doi.org/10.18419/darus-2054

Study Scope

Keywords:

Medicine, Health and Life Sciences, Alignment, Network, Amino Acid Sequence, Graph, Protein Sequence, Sequence Clustering

Abstract:

The GraphML files contain protein sequence networks of PETase and PURase homologues, together with annotated metadata for each protein sequence.

Notes:

Each edge carries one GraphML attribute, the edge weight ("weight"), which is the pairwise sequence identity. Each node carries the identifiers from the ExED ("sequence_id", "protein_id", "hfam_id", and "sfam_id" for the sequence, protein, homologous family, and superfamily identifiers, respectively), the NCBI taxonomy ID ("tax_id"), the annotated name of the source organism ("tax_name"), the taxonomic lineage of the source organism ("lineage", with taxa separated by "<--"), and the length of the amino acid sequence ("sequence_length"). In addition, each node carries suggested color names for its fill color ("color") and its border color ("color_border").
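The attribute layout described in these notes can be read with a standard XML parser. The following is a minimal sketch using only the Python standard library, not the tooling used to produce the dataset. The embedded sample document, its node IDs, key IDs ("d0" to "d3"), declared attribute types, and all values are made up for illustration. The name `read_graphml` is hypothetical. Only the attribute names ("sequence_id", "lineage", "sequence_length", "weight") and the "<--" lineage separator are taken from the notes.

```python
import xml.etree.ElementTree as ET

# Standard GraphML namespace.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Made-up miniature file: IDs, types, and values are NOT taken from the dataset.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="d0" for="node" attr.name="sequence_id" attr.type="string"/>
  <key id="d1" for="node" attr.name="lineage" attr.type="string"/>
  <key id="d2" for="node" attr.name="sequence_length" attr.type="int"/>
  <key id="d3" for="edge" attr.name="weight" attr.type="double"/>
  <graph edgedefault="undirected">
    <node id="n0">
      <data key="d0">seq_A</data>
      <data key="d1">taxon_c&lt;--taxon_b&lt;--taxon_a</data>
      <data key="d2">290</data>
    </node>
    <node id="n1">
      <data key="d0">seq_B</data>
      <data key="d1">taxon_d&lt;--taxon_b&lt;--taxon_a</data>
      <data key="d2">301</data>
    </node>
    <edge source="n0" target="n1"><data key="d3">0.62</data></edge>
  </graph>
</graphml>"""

# Converters for the GraphML attr.type values; anything else stays a string.
CASTS = {
    "int": int,
    "long": int,
    "float": float,
    "double": float,
    "boolean": lambda s: s.strip().lower() == "true",
}


def read_graphml(text):
    """Return (nodes, edges) with attributes keyed by their attr.name."""
    root = ET.fromstring(text)

    # Map each <key id="..."> to its human-readable name and a type converter.
    keys = {}
    for key in root.findall("g:key", NS):
        cast = CASTS.get(key.get("attr.type"), str)
        keys[key.get("id")] = (key.get("attr.name"), cast)

    def attributes(element):
        out = {}
        for data in element.findall("g:data", NS):
            name, cast = keys[data.get("key")]
            out[name] = cast(data.text) if data.text is not None else None
        return out

    graph = root.find("g:graph", NS)
    nodes = {n.get("id"): attributes(n) for n in graph.findall("g:node", NS)}
    edges = [
        (e.get("source"), e.get("target"), attributes(e))
        for e in graph.findall("g:edge", NS)
    ]
    return nodes, edges


nodes, edges = read_graphml(SAMPLE)
# Split the lineage string into its taxa at the "<--" separator.
lineage_taxa = nodes["n0"]["lineage"].split("<--")
```

To read one of the deposited files instead of the sample, the file content could be passed in as text, for example `read_graphml(open("PET_local_08_55sim.graphml", encoding="utf-8").read())`. The attribute types declared in the real files may differ from those assumed in the sample.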

Other Study-Related Materials

Label:

PET_local_08_55sim.graphml

Text:

Protein sequence network for PETase homologues. Edges were selected at a threshold of 55% pairwise sequence identity.

Notes:

text/xml-graphml

Other Study-Related Materials

Label:

Sfam11_09_60sim.graphml

Text:

Protein sequence network for PURase homologues from LED superfamily 11. Edges were selected at a threshold of 60% pairwise sequence identity.

Notes:

text/xml-graphml

Other Study-Related Materials

Label:

Sfam13_09_60sim.graphml

Text:

Protein sequence network for PURase homologues from LED superfamily 13. Edges were selected at a threshold of 60% pairwise sequence identity.

Notes:

text/xml-graphml
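
The networks listed above contain only edges selected at an identity threshold (55% for the PETase homologues, 60% for the PURase superfamilies). As a rough illustration of what such a selection does, the sketch below filters a weighted edge list and groups the remaining nodes into connected clusters. It makes several assumptions that the record does not confirm: that "selected at a threshold" means weight greater than or equal to the threshold, and that weights are fractions between 0 and 1 (they might instead be stored as percentages). The node names, weights, and the function names `filter_edges` and `connected_components` are hypothetical.

```python
def filter_edges(edges, threshold):
    """Keep edges whose weight is at or above the threshold (assumed rule)."""
    return [(u, v, w) for u, v, w in edges if w >= threshold]


def connected_components(nodes, edges):
    """Group nodes into clusters linked by the given edges (union-find)."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the root, shortening the path on the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for u, v, _ in edges:
        parent[find(u)] = find(v)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    # Sort clusters by their members so the result order is deterministic.
    return sorted(groups.values(), key=lambda g: sorted(g))


# Made-up example: four sequences with three pairwise identities.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b", 0.72), ("b", "c", 0.58), ("c", "d", 0.61)]

kept = filter_edges(edges, 0.60)              # drops the 0.58 edge
clusters = connected_components(nodes, kept)  # two clusters remain
```

Raising the threshold removes weaker edges, so a network can only split into more, smaller clusters; the deposited files cannot be relaxed below their stated thresholds because edges under 55% or 60% identity are not included.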