| The tab-separated file comprises nine columns: (1) the GH19ED sequence identifier, an integer (Sequence_id); (2) the NCBI protein sequence accessions, semicolon-separated (NCBI_accessions); (3) the PDB accessions, semicolon-separated (PDB_accessions); (4) the name of the source or source organism (Source_name); (5) the NCBI taxonomy identifier of the source (NCBI_taxonomy_id); (6) the taxonomic lineage from the lowest to the highest rank, as inferred from the NCBI taxonomy (Lineage); (7) the GH19ED "protein" identifier, an integer (Protein_id); (8) the GH19ED "homologous family" (or group) identifier, an integer (Homologous_family_id); (9) the GH19ED "superfamily" (or subfamily) identifier, an integer (Superfamily_id). For sequence entries assigned to more than one source organism name, only the first taxonomic lineage found in the GH19ED is listed. |
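
The nine-column layout described above can be read with a few lines of Python. The following is a minimal sketch, not part of the GH19ED distribution: it assumes that the file starts with a header row carrying the column names given in parentheses, that empty accession fields are simply left blank, and that the taxonomy identifier and lineage are best kept as raw strings because their exact formatting is not specified here. The names `Gh19Record`, `split_accessions`, and `read_records` are hypothetical, and the demo row contains made-up placeholder values only.

```python
import csv
import io
from dataclasses import dataclass
from typing import Iterator, List, TextIO

# Column names as listed in the file description (assumed to form the header row).
COLUMNS = [
    "Sequence_id", "NCBI_accessions", "PDB_accessions", "Source_name",
    "NCBI_taxonomy_id", "Lineage", "Protein_id",
    "Homologous_family_id", "Superfamily_id",
]


@dataclass
class Gh19Record:
    """One row of the tab-separated file (hypothetical container type)."""
    sequence_id: int
    ncbi_accessions: List[str]
    pdb_accessions: List[str]
    source_name: str
    ncbi_taxonomy_id: str  # kept as text: assumption, the field may be empty
    lineage: str           # kept as text: the rank separator is not specified
    protein_id: int
    homologous_family_id: int
    superfamily_id: int


def split_accessions(field: str) -> List[str]:
    """Split a semicolon-separated accession field, dropping empty items."""
    return [item.strip() for item in field.split(";") if item.strip()]


def read_records(handle: TextIO) -> Iterator[Gh19Record]:
    """Yield one Gh19Record per data row of an open tab-separated file."""
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    missing = set(COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    for row in reader:
        yield Gh19Record(
            sequence_id=int(row["Sequence_id"]),
            ncbi_accessions=split_accessions(row["NCBI_accessions"]),
            pdb_accessions=split_accessions(row["PDB_accessions"]),
            source_name=row["Source_name"],
            ncbi_taxonomy_id=row["NCBI_taxonomy_id"],
            lineage=row["Lineage"],
            protein_id=int(row["Protein_id"]),
            homologous_family_id=int(row["Homologous_family_id"]),
            superfamily_id=int(row["Superfamily_id"]),
        )


if __name__ == "__main__":
    # Placeholder row for illustration only; these are not real GH19ED values.
    demo = "\t".join(COLUMNS) + "\n" + "\t".join(
        ["1", "ACC_A;ACC_B", "", "Example organism", "0", "rank1; rank2",
         "7", "3", "2"]
    ) + "\n"
    for record in read_records(io.StringIO(demo)):
        print(record.sequence_id, record.ncbi_accessions, record.pdb_accessions)
```

For a file on disk, `read_records` would be called on a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` is whatever name the downloaded file has. Parsing the three GH19ED identifiers as integers makes it straightforward to group sequences by protein, homologous family, or superfamily afterwards.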