Part 1: Document Description
|
Citation |
|
---|---|
Title: |
GraphML files for sequence networks of PETases and PURases |
Identification Number: |
doi:10.18419/darus-2054 |
Distributor: |
DaRUS |
Date of Distribution: |
2021-12-01 |
Version: |
1 |
Bibliographic Citation: |
Buchholz, Patrick C. F., 2021, "GraphML files for sequence networks of PETases and PURases", https://doi.org/10.18419/darus-2054, DaRUS, V1 |
Citation |
|
Title: |
GraphML files for sequence networks of PETases and PURases |
Identification Number: |
doi:10.18419/darus-2054 |
Authoring Entity: |
Buchholz, Patrick C. F. (Universität Stuttgart) |
Distributor: |
DaRUS |
Access Authority: |
Pleiss, Jürgen |
Depositor: |
Buchholz, Patrick C. F. |
Date of Deposit: |
2021-06-28 |
Holdings Information: |
https://doi.org/10.18419/darus-2054 |
Study Scope |
|
Keywords: |
Medicine, Health and Life Sciences, Alignment, Network, Amino Acid Sequence, Graph, Protein Sequence, Sequence Clustering |
Abstract: |
The GraphML files contain the sequence networks and annotated metadata for protein sequences. |
Notes: |
The GraphML attributes for the edges comprise the edge weights (pairwise sequence identity, "weight"). The GraphML attributes for the nodes comprise the identifiers from the ExED ("sequence_id", "protein_id", "hfam_id", and "sfam_id" for sequence, protein, homologous family and superfamily identifiers, respectively), the NCBI taxonomy ID ("tax_id"), the annotated (organism) source name ("tax_name"), the taxonomic lineage of the source organism ("lineage", with taxa separated by "<--"), and the length of the amino acid sequence ("sequence_length"). In addition, suggested color names are given for both fill color and border color of each node ("color" and "color_border"). |
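As a rough illustration of how the attributes described in these notes might be read, the following sketch parses a small made-up GraphML snippet with Python's standard library. The key ids (d0 to d3), the node ids, the placeholder taxon names, and all values are invented for the example and are not taken from the deposited files; only the attribute names follow the notes. The scale of "weight" (fraction or percentage) is not stated in this record, so the value used here is an assumption. Inside the XML the lineage separator "<--" has to appear escaped as "&lt;--"; the parser returns it unescaped.

```python
import xml.etree.ElementTree as ET

NS = {"g": "http://graphml.graphdrawing.org/xmlns"}
# Map GraphML attr.type names to Python casts; anything else stays a string.
CASTS = {"int": int, "long": int, "float": float, "double": float}

# Hypothetical miniature file: ids, taxa, and values are invented.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="d0" for="node" attr.name="sequence_id" attr.type="string"/>
  <key id="d1" for="node" attr.name="lineage" attr.type="string"/>
  <key id="d2" for="node" attr.name="sequence_length" attr.type="int"/>
  <key id="d3" for="edge" attr.name="weight" attr.type="double"/>
  <graph edgedefault="undirected">
    <node id="n0">
      <data key="d0">1</data>
      <data key="d1">taxon_a&lt;--taxon_b&lt;--taxon_c</data>
      <data key="d2">290</data>
    </node>
    <node id="n1">
      <data key="d0">2</data>
      <data key="d1">taxon_d&lt;--taxon_b&lt;--taxon_c</data>
      <data key="d2">301</data>
    </node>
    <edge source="n0" target="n1">
      <data key="d3">0.62</data>
    </edge>
  </graph>
</graphml>"""


def read_graphml(text):
    """Return (nodes, edges) with attributes keyed by their attr.name."""
    root = ET.fromstring(text)
    # <key> elements declare the name and type behind each data key id.
    keys = {
        k.get("id"): (k.get("attr.name"), k.get("attr.type"))
        for k in root.findall("g:key", NS)
    }

    def attrs(element):
        values = {}
        for data in element.findall("g:data", NS):
            name, type_name = keys[data.get("key")]
            values[name] = CASTS.get(type_name, str)(data.text or "")
        return values

    nodes = {n.get("id"): attrs(n) for n in root.iterfind(".//g:node", NS)}
    edges = [
        (e.get("source"), e.get("target"), attrs(e))
        for e in root.iterfind(".//g:edge", NS)
    ]
    return nodes, edges


nodes, edges = read_graphml(SAMPLE)
# Split the lineage string into its taxa on the documented separator.
lineage = nodes["n0"]["lineage"].split("<--")
```

For the real files, a graph library with a GraphML reader would likely be more convenient; the hand-written parser here only serves to show where each attribute lives in the XML.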
Methodology and Processing |
|
Sources Statement |
|
Data Access |
|
Other Study Description Materials |
|
Label: |
PET_local_08_55sim.graphml |
Text: |
Protein sequence network for PETase homologues. Edges were selected at a threshold of 55% pairwise sequence identity. |
Notes: |
text/xml-graphml |
Label: |
Sfam11_09_60sim.graphml |
Text: |
Protein sequence network for PURase homologues from LED superfamily 11. Edges were selected at a threshold of 60% pairwise sequence identity. |
Notes: |
text/xml-graphml |
Label: |
Sfam13_09_60sim.graphml |
Text: |
Protein sequence network for PURase homologues from LED superfamily 13. Edges were selected at a threshold of 60% pairwise sequence identity. |
Notes: |
text/xml-graphml |
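The file descriptions above state that edges were selected at a pairwise sequence identity threshold (55% or 60%). A minimal sketch of what such a selection implies for the resulting clusters is given below: edges below a chosen threshold are dropped and the remaining connected components are collected with a simple union-find. The node names, edge weights, and the function name `components` are hypothetical, and weights are assumed to be fractions between 0 and 1, which this record does not confirm. This is not the procedure used to build the deposited networks, only an illustration of threshold-based edge selection.

```python
def components(node_ids, weighted_edges, threshold):
    """Group nodes connected by edges whose weight is >= threshold."""
    parent = {n: n for n in node_ids}

    def find(x):
        # Follow parent links to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for source, target, weight in weighted_edges:
        if weight >= threshold:
            parent[find(source)] = find(target)

    groups = {}
    for n in node_ids:
        groups.setdefault(find(n), set()).add(n)
    # Sort for a stable, reproducible order of the clusters.
    return sorted(groups.values(), key=lambda g: sorted(g))


# Hypothetical toy network; weights assumed to be identity fractions.
toy_nodes = ["a", "b", "c", "d"]
toy_edges = [("a", "b", 0.80), ("b", "c", 0.62), ("c", "d", 0.56)]

clusters_55 = components(toy_nodes, toy_edges, 0.55)
clusters_60 = components(toy_nodes, toy_edges, 0.60)
```

In this toy case, raising the threshold from 0.55 to 0.60 removes the weakest edge and splits off node "d" into its own cluster.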