diff test-data/input_otu_rps_s1.tab @ 4:bb29ae8708b5 draft default tip

planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/virAnnot commit 7036ce0e06b6dc64332b1a5642fc58928523c5c6
author iuc
date Tue, 13 May 2025 11:52:17 +0000
parents f8ebd1e802d7
children
--- a/test-data/input_otu_rps_s1.tab	Sun Sep 08 14:09:19 2024 +0000
+++ b/test-data/input_otu_rps_s1.tab	Tue May 13 11:52:17 2025 +0000
@@ -1,45 +1,45 @@
-#query_id	query_length	cdd_id	hit_id	evalue	startQ	endQ	frame	description	superkingdom
-Query_2	2436	pfam02123	gnl|CDD|280316	2.04111e-21	184	1476	1	pfam02123, RdRP_4, Viral RNA-directed RNA-polymerase.  This family includes RNA-dependent RNA polymerase proteins (RdRPs) from Luteovirus, Totivirus and Rotavirus.	Viruses(1);Riboviria(1);Orthornavirae(1);Duplornaviricota(1)
-Query_4	2297	pfam00680	gnl|CDD|279070	3.12197e-05	995	1873	-2	pfam00680, RdRP_1, RNA dependent RNA polymerase.  	Viruses(1);Riboviria(1);Orthornavirae(1);Pisuviricota(1)
-Query_5	2029	pfam00680	gnl|CDD|279070	8.86955e-06	840	1706	3	pfam00680, RdRP_1, RNA dependent RNA polymerase.  	Viruses(1);Riboviria(1);Orthornavirae(1);Pisuviricota(1)
-Query_6	1860	pfam02123	gnl|CDD|280316	1.27376e-17	1147	1764	-1	pfam02123, RdRP_4, Viral RNA-directed RNA-polymerase.  This family includes RNA-dependent RNA polymerase proteins (RdRPs) from Luteovirus, Totivirus and Rotavirus.	Viruses(1);Riboviria(1);Orthornavirae(1);Duplornaviricota(1)
-Query_8	1703	pfam00680	gnl|CDD|279070	3.19349e-12	685	1458	-3	pfam00680, RdRP_1, RNA dependent RNA polymerase.  	Viruses(1);Riboviria(1);Orthornavirae(1);Pisuviricota(1)
-Query_19	425	pfam00005	gnl|CDD|306511	3.70622e-07	129	275	-1	pfam00005, ABC_tran, ABC transporter.  ABC transporters for a large family of proteins responsible for translocation of a variety of compounds across biological membranes. ABC transporters are the largest family of proteins in many completely sequenced bacteria. ABC transporters are composed of two copies of this domain and two copies of a transmembrane domain pfam00664. These four domains may belong to a single polypeptide as in CFTR, or belong in different polypeptide chains.	Bacteria(2);cellular organisms(1);Terrabacteria group(1)
-Query_38	386	pfam01347	gnl|CDD|279663	0.000262768	129	275	-1	pfam01347, Vitellogenin_N, Lipoprotein amino terminal region.  This family contains regions from: Vitellogenin, Microsomal triglyceride transfer protein and apolipoprotein B-100. These proteins are all involved in lipid transport. This family contains the LV1n chain from lipovitellin, that contains two structural domains.	cellular organisms(1);Eukaryota(1);Opisthokonta(1);Metazoa(1)
-Query_41	380	pfam04879	gnl|CDD|282703	2.77416e-08	125	274	-2	pfam04879, Molybdop_Fe4S4, Molybdopterin oxidoreductase Fe4S4 domain.  This domain is found in formate dehydrogenase H for which the structure is known. This first domain (residues 1 to 60) of Structure 1aa6 is an Fe4S4 cluster just below the protein surface.	Bacteria(2);cellular organisms(1);Pseudomonadota(1)
-Query_42	379	pfam16203	gnl|CDD|318443	8.05104e-30	131	280	-1	pfam16203, ERCC3_RAD25_C, ERCC3/RAD25/XPB C-terminal helicase.  This is the C-terminal helicase domain of ERCC3, RAD25 and XPB helicases.	cellular organisms(2);Bacteria(1);Terrabacteria group(1)
-Query_44	376	pfam00401	gnl|CDD|306831	6.62013e-05	81	215	-3	pfam00401, ATP-synt_DE, ATP synthase, Delta/Epsilon chain, long alpha-helix domain.  Part of the ATP synthase CF(1). These subunits are part of the head unit of the ATP synthase. This subunit is called epsilon in bacteria and delta in mitochondria. In bacteria the delta (D) subunit is equivalent to the mitochondrial Oligomycin sensitive subunit, OSCP (pfam00213).	cellular organisms(2);Eukaryota(1);Viridiplantae(1)
-Query_58	347	pfam00471	gnl|CDD|306877	8.86568e-13	132	302	3	pfam00471, Ribosomal_L33, Ribosomal protein L33.  	cellular organisms(2);Bacteria(1);Eukaryota(1)
-Query_61	344	pfam00252	gnl|CDD|306711	1.17482e-22	107	295	2	pfam00252, Ribosomal_L16, Ribosomal protein L16p/L10e.  	cellular organisms(2);Eukaryota(1);Viridiplantae(1)
-Query_62	343	pfam00421	gnl|CDD|306845	7.93928e-41	92	337	-1	pfam00421, PSII, Photosystem II protein.  	cellular organisms(1);Eukaryota(1);Viridiplantae(1);Streptophyta(1)
-Query_64	339	pfam01333	gnl|CDD|307480	0.000362606	197	325	-3	pfam01333, Apocytochr_F_C, Apocytochrome F, C-terminal.  This is a sub-family of cytochrome C. See pfam00034.	cellular organisms(1);Eukaryota(1);Viridiplantae(1);Streptophyta(1)
-Query_74	330	pfam00680	gnl|CDD|279070	4.51414e-05	124	282	1	pfam00680, RdRP_1, RNA dependent RNA polymerase.  	Viruses(1);Riboviria(1);Orthornavirae(1);Pisuviricota(1)
-Query_83	320	pfam05860	gnl|CDD|310447	1.29746e-13	167	298	2	pfam05860, Haemagg_act, haemagglutination activity domain.  This domain is suggested to be a carbohydrate- dependent haemagglutination activity site. It is found in a range of haemagglutinins and haemolysins.	Bacteria(2);cellular organisms(1);Pseudomonadota(1)
-Query_87	252	pfam00585	gnl|CDD|278982	1.42752e-05	29	166	2	pfam00585, Thr_dehydrat_C, C-terminal regulatory domain of Threonine dehydratase.  Threonine dehydratases pfam00291 all contain a carboxy terminal region. This region may have a regulatory role. Some members contain two copies of this region. This family is homologous to the pfam01842 domain.	Bacteria(2);cellular organisms(1);Pseudomonadota(1)
-Query_90	251	pfam13188	gnl|CDD|315779	0.000739897	32	241	2	pfam13188, PAS_8, PAS domain.  	Bacteria(2);cellular organisms(1);Pseudomonadota(1)
-Query_91	251	pfam02123	gnl|CDD|280316	3.2928e-08	28	228	-3	pfam02123, RdRP_4, Viral RNA-directed RNA-polymerase.  This family includes RNA-dependent RNA polymerase proteins (RdRPs) from Luteovirus, Totivirus and Rotavirus.	Viruses(1);Riboviria(1);Orthornavirae(1);Duplornaviricota(1)
-Query_93	251	pfam00252	gnl|CDD|306711	7.50297e-12	78	206	-1	pfam00252, Ribosomal_L16, Ribosomal protein L16p/L10e.  	cellular organisms(2);Eukaryota(1);Viridiplantae(1)
-Query_98	250	pfam00227	gnl|CDD|306690	4.91252e-09	10	150	-2	pfam00227, Proteasome, Proteasome subunit.  The proteasome is a multisubunit structure that degrades proteins. Protein degradation is an essential component of regulation because proteins can become misfolded, damaged, or unnecessary. Proteasomes and their homologs vary greatly in complexity: from HslV (heat shock locus v), which is encoded by 1 gene in bacteria, to the eukaryotic 20S proteasome, which is encoded by more than 14 genes. Recently evidence of two novel groups of bacterial proteasomes was proposed. The first is Anbu, which is sparsely distributed among cyanobacteria and proteobacteria. The second is call beta-proteobacteria proteasome homolog (BPH).	cellular organisms(2);Eukaryota(1);Opisthokonta(1)
-Query_104	249	pfam13173	gnl|CDD|315764	2.6724e-08	106	249	1	pfam13173, AAA_14, AAA domain.  This family of domains contain a P-loop motif that is characteristic of the AAA superfamily.	Bacteria(2);cellular organisms(1);FCB group(1)
-Query_111	248	pfam00113	gnl|CDD|278539	3.9331e-13	15	116	-1	pfam00113, Enolase_C, Enolase, C-terminal TIM barrel domain.  	cellular organisms(2);Bacteria(2)
-Query_127	245	pfam00946	gnl|CDD|307203	3.13472e-05	1	141	1	pfam00946, Mononeg_RNA_pol, Mononegavirales RNA dependent RNA polymerase.  Members of the Mononegavirales including the Paramyxoviridae, like other non-segmented negative strand RNA viruses, have an RNA-dependent RNA polymerase composed of two subunits, a large protein L and a phosphoprotein P. This is a protein family of the L protein. The L protein confers the RNA polymerase activity on the complex. The P protein acts as a transcription factor.	Viruses(1);Riboviria(1);Orthornavirae(1);Negarnaviricota(1)
-Query_138	243	pfam00416	gnl|CDD|306841	5.30772e-05	15	134	-2	pfam00416, Ribosomal_S13, Ribosomal protein S13/S18.  This family includes ribosomal protein S13 from prokaryotes and S18 from eukaryotes.	cellular organisms(2);Bacteria(2)
-Query_139	243	pfam00216	gnl|CDD|306682	1.89202e-10	134	241	-3	pfam00216, Bac_DNA_binding, Bacterial DNA-binding protein.  	Bacteria(2);cellular organisms(1);Pseudomonadota(1)
-Query_140	243	pfam13041	gnl|CDD|315669	0.000344884	134	241	-3	pfam13041, PPR_2, PPR repeat family.  This repeat has no known function. It is about 35 amino acids long and is found in up to 18 copies in some proteins. The family appears to be greatly expanded in plants and fungi. The repeat has been called PPR.	cellular organisms(1);Eukaryota(1);Viridiplantae(1);Streptophyta(1)
-Query_144	243	pfam12137	gnl|CDD|314930	3.71293e-05	137	217	-3	pfam12137, RapA_C, RNA polymerase recycling family C-terminal.  This domain is found in bacteria. This domain is about 360 amino acids in length. This domain is found associated with pfam00271, pfam00176. The function of this domain is not known, but structurally it forms an alpha-beta fold in nature with a central beta-sheet flanked by helices and loops, the beta-sheet being mainly antiparallel and flanked by four alpha helices, among which the two longer helices exhibit a coiled-coil arrangement.	cellular organisms(1);Bacteria(1);Pseudomonadota(1);Gammaproteobacteria(1)
-Query_145	242	pfam00146	gnl|CDD|306623	2.12078e-10	22	111	1	pfam00146, NADHdh, NADH dehydrogenase.  	cellular organisms(1);Eukaryota(1);Opisthokonta(1);Metazoa(1)
-Query_149	242	pfam00124	gnl|CDD|306604	4.44151e-07	21	125	3	pfam00124, Photo_RC, Photosynthetic reaction centre protein.  	cellular organisms(1);Eukaryota(1);Viridiplantae(1);Streptophyta(1)
-Query_163	241	pfam02123	gnl|CDD|280316	5.78854e-08	35	214	-1	pfam02123, RdRP_4, Viral RNA-directed RNA-polymerase.  This family includes RNA-dependent RNA polymerase proteins (RdRPs) from Luteovirus, Totivirus and Rotavirus.	Viruses(1);Riboviria(1);Orthornavirae(1);Duplornaviricota(1)
-Query_177	239	pfam06122	gnl|CDD|310603	1.30391e-05	29	172	2	pfam06122, TraH, Conjugative relaxosome accessory transposon protein.  The TraH protein is thought to be a relaxosome accessory component, also necessary for transfer but not for H-pilus synthesis within the conjugative transposon.	cellular organisms(1);Bacteria(1);Pseudomonadota(1);Gammaproteobacteria(1)
-Query_179	239	pfam00361	gnl|CDD|306795	3.63199e-05	70	219	1	pfam00361, Proton_antipo_M, Proton-conducting membrane transporter.  This is a family of membrane transporters that inlcudes some 7 of potentially 14-16 TM regions. In many instances the family forms part of complex I that catalyzes the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane, and in this context is a combination predominantly of subunits 2, 4, 5, 14, L, M and N. In many bacterial species these proteins are probable stand-alone transporters not coupled with oxidoreduction. The family in total represents homologs across the phyla.	cellular organisms(1);Eukaryota(1);Opisthokonta(1);Metazoa(1)
-Query_182	239	pfam00177	gnl|CDD|306646	1.05327e-06	28	126	1	pfam00177, Ribosomal_S7, Ribosomal protein S7p/S5e.  This family contains ribosomal protein S7 from prokaryotes and S5 from eukaryotes.	cellular organisms(2);Eukaryota(1);Viridiplantae(1)
-Query_202	235	pfam03154	gnl|CDD|308660	0.000842762	28	126	1	pfam03154, Atrophin-1, Atrophin-1 family.  Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.	Eukaryota(1);cellular organisms(1);Opisthokonta(1);Metazoa(1)
-Query_203	235	pfam00164	gnl|CDD|278589	1.83229e-23	3	182	3	pfam00164, Ribosom_S12_S23, Ribosomal protein S12/S23.  This protein is known as S12 in bacteria and archaea and S23 in eukaryotes.	cellular organisms(2);Eukaryota(1);Viridiplantae(1)
-Query_211	234	pfam00155	gnl|CDD|306629	0.000251531	3	182	3	pfam00155, Aminotran_1_2, Aminotransferase class I and II.  	Bacteria(2);cellular organisms(1);Pseudomonadota(1)
-Query_219	233	pfam00680	gnl|CDD|279070	0.000703744	3	182	3	pfam00680, RdRP_1, RNA dependent RNA polymerase.  	Viruses(1);Riboviria(1);Orthornavirae(1);Pisuviricota(1)
-Query_232	231	pfam00481	gnl|CDD|306885	0.00063843	3	182	3	pfam00481, PP2C, Protein phosphatase 2C.  Protein phosphatase 2C is a Mn++ or Mg++ dependent protein serine/threonine phosphatase.	Eukaryota(2);cellular organisms(1);Viridiplantae(1)
-Query_241	230	pfam00072	gnl|CDD|306560	5.30837e-08	50	208	2	pfam00072, Response_reg, Response regulator receiver domain.  This domain receives the signal from the sensor partner in bacterial two-component systems. It is usually found N-terminal to a DNA binding effector domain.	Bacteria(2);cellular organisms(1);Pseudomonadota(1)
-Query_246	230	pfam00201	gnl|CDD|278624	2.93544e-07	46	210	1	pfam00201, UDPGT, UDP-glucoronosyl and UDP-glucosyl transferase.  	cellular organisms(1);Eukaryota(1);Viridiplantae(1);Streptophytina(1)
-Query_261	228	pfam17035	gnl|CDD|319097	3.87403e-09	108	203	3	pfam17035, BET, Bromodomain extra-terminal - transcription regulation.  The BET, or bromodomain extra-terminal domain, is found on bromodomain proteins that play key roles in development, cancer progression and virus-host pathogenesis. It interacts with NSD3, JMJD6, CHD4, GLTSCR1, and ATAD5 all of which are shown to impart a pTEFb-independent transcriptional activation function on the bromodomain proteins.	cellular organisms(1);Eukaryota(1);Opisthokonta(1);Metazoa(1)
-Query_280	207	pfam04061	gnl|CDD|309259	7.30581e-19	1	159	1	pfam04061, ORMDL, ORMDL family.  Evidence form suggests that ORMDLs are involved in protein folding in the ER. Orm proteins have been identified as negative regulators of sphingolipid synthesis that form a conserved complex with serine palmitoyltransferase, the first and rate-limiting enzyme in sphingolipid production. This novel and conserved protein complex, has been termed the SPOTS complex (serine palmitoyltransferase, Orm1/2, Tsc3, and Sac1).	cellular organisms(1);Eukaryota(1);Opisthokonta(1);Metazoa(1)
-Query_326	206	pfam10775	gnl|CDD|313884	0.00091969	1	159	1	pfam10775, ATP_sub_h, ATP synthase complex subunit h.  Subunit h is a component of the yeast mitochondrial F1-F0 ATP synthase. It is essential for the correct assembly and functioning of this enzyme. Subunit h occupies a central place in the peripheral stalk between the F1 sector and the membrane.	cellular organisms(1);Eukaryota(1);Opisthokonta(1);Fungi(1)
+#query_id	query_length	cdd_id	hit_id	evalue	startQ	endQ	frame	description	superkingdom	pident
+ds2020-267_5	2436	pfam02123	gnl|CDD|280316	2.04111e-21	184	1476	1	pfam02123, RdRP_4, Viral RNA-directed RNA-polymerase.  This family includes RNA-dependent RNA polymerase proteins (RdRPs) from Luteovirus, Totivirus and Rotavirus.	Viruses(1);Riboviria(1);Orthornavirae(1);Duplornaviricota(1)	22.535
+ds2020-267_7	2297	pfam00680	gnl|CDD|279070	3.12197e-05	995	1873	-2	pfam00680, RdRP_1, RNA dependent RNA polymerase.  	Viruses(1);Riboviria(1);Orthornavirae(1);Pisuviricota(1)	19.742
+ds2020-267_8	2029	pfam00680	gnl|CDD|279070	8.86955e-06	840	1706	3	pfam00680, RdRP_1, RNA dependent RNA polymerase.  	Viruses(1);Riboviria(1);Orthornavirae(1);Pisuviricota(1)	25.314
+ds2020-267_10	1860	pfam02123	gnl|CDD|280316	1.27376e-17	1147	1764	-1	pfam02123, RdRP_4, Viral RNA-directed RNA-polymerase.  This family includes RNA-dependent RNA polymerase proteins (RdRPs) from Luteovirus, Totivirus and Rotavirus.	Viruses(1);Riboviria(1);Orthornavirae(1);Duplornaviricota(1)	18.868
+ds2020-267_12	1703	pfam00680	gnl|CDD|279070	3.19349e-12	685	1458	-3	pfam00680, RdRP_1, RNA dependent RNA polymerase.  	Viruses(1);Riboviria(1);Orthornavirae(1);Pisuviricota(1)	27.456
+ds2020-267_75	425	pfam00005	gnl|CDD|306511	3.70622e-07	129	275	-1	pfam00005, ABC_tran, ABC transporter.  ABC transporters for a large family of proteins responsible for translocation of a variety of compounds across biological membranes. ABC transporters are the largest family of proteins in many completely sequenced bacteria. ABC transporters are composed of two copies of this domain and two copies of a transmembrane domain pfam00664. These four domains may belong to a single polypeptide as in CFTR, or belong in different polypeptide chains.	Bacteria(2);cellular organisms(1);Terrabacteria group(1)	33.974
+ds2020-267_76	386	pfam01347	gnl|CDD|279663	0.000262768	129	275	-1	pfam01347, Vitellogenin_N, Lipoprotein amino terminal region.  This family contains regions from: Vitellogenin, Microsomal triglyceride transfer protein and apolipoprotein B-100. These proteins are all involved in lipid transport. This family contains the LV1n chain from lipovitellin, that contains two structural domains.	cellular organisms(1);Eukaryota(1);Opisthokonta(1);Metazoa(1)	24.167
+ds2020-267_79	380	pfam04879	gnl|CDD|282703	2.77416e-08	125	274	-2	pfam04879, Molybdop_Fe4S4, Molybdopterin oxidoreductase Fe4S4 domain.  This domain is found in formate dehydrogenase H for which the structure is known. This first domain (residues 1 to 60) of Structure 1aa6 is an Fe4S4 cluster just below the protein surface.	Bacteria(2);cellular organisms(1);Pseudomonadota(1)	22.921
+ds2020-267_80	379	pfam16203	gnl|CDD|318443	8.05104e-30	131	280	-1	pfam16203, ERCC3_RAD25_C, ERCC3/RAD25/XPB C-terminal helicase.  This is the C-terminal helicase domain of ERCC3, RAD25 and XPB helicases.	cellular organisms(2);Bacteria(1);Terrabacteria group(1)	29.017
+ds2020-267_81	376	pfam00401	gnl|CDD|306831	6.62013e-05	81	215	-3	pfam00401, ATP-synt_DE, ATP synthase, Delta/Epsilon chain, long alpha-helix domain.  Part of the ATP synthase CF(1). These subunits are part of the head unit of the ATP synthase. This subunit is called epsilon in bacteria and delta in mitochondria. In bacteria the delta (D) subunit is equivalent to the mitochondrial Oligomycin sensitive subunit, OSCP (pfam00213).	cellular organisms(2);Eukaryota(1);Viridiplantae(1)	27.296
+ds2020-267_320	347	pfam00471	gnl|CDD|306877	8.86568e-13	132	302	3	pfam00471, Ribosomal_L33, Ribosomal protein L33.  	cellular organisms(2);Bacteria(1);Eukaryota(1)	27.649
+ds2020-267_322	344	pfam00252	gnl|CDD|306711	1.17482e-22	107	295	2	pfam00252, Ribosomal_L16, Ribosomal protein L16p/L10e.  	cellular organisms(2);Eukaryota(1);Viridiplantae(1)	18.354
+ds2020-267_323	343	pfam00421	gnl|CDD|306845	7.93928e-41	92	337	-1	pfam00421, PSII, Photosystem II protein.  	cellular organisms(1);Eukaryota(1);Viridiplantae(1);Streptophyta(1)	21.070
+ds2020-267_324	339	pfam01333	gnl|CDD|307480	0.000362606	197	325	-3	pfam01333, Apocytochr_F_C, Apocytochrome F, C-terminal.  This is a sub-family of cytochrome C. See pfam00034.	cellular organisms(1);Eukaryota(1);Viridiplantae(1);Streptophyta(1)	26.684
+ds2020-267_327	330	pfam00680	gnl|CDD|279070	4.51414e-05	124	282	1	pfam00680, RdRP_1, RNA dependent RNA polymerase.  	Viruses(1);Riboviria(1);Orthornavirae(1);Pisuviricota(1)	24.942
+ds2020-267_332	320	pfam05860	gnl|CDD|310447	1.29746e-13	167	298	2	pfam05860, Haemagg_act, haemagglutination activity domain.  This domain is suggested to be a carbohydrate- dependent haemagglutination activity site. It is found in a range of haemagglutinins and haemolysins.	Bacteria(2);cellular organisms(1);Pseudomonadota(1)	22.222
+ds2020-267_333	252	pfam00585	gnl|CDD|278982	1.42752e-05	29	166	2	pfam00585, Thr_dehydrat_C, C-terminal regulatory domain of Threonine dehydratase.  Threonine dehydratases pfam00291 all contain a carboxy terminal region. This region may have a regulatory role. Some members contain two copies of this region. This family is homologous to the pfam01842 domain.	Bacteria(2);cellular organisms(1);Pseudomonadota(1)	25.916
+ds2020-267_336	251	pfam13188	gnl|CDD|315779	0.000739897	32	241	2	pfam13188, PAS_8, PAS domain.  	Bacteria(2);cellular organisms(1);Pseudomonadota(1)	27.014
+ds2020-267_337	251	pfam02123	gnl|CDD|280316	3.2928e-08	28	228	-3	pfam02123, RdRP_4, Viral RNA-directed RNA-polymerase.  This family includes RNA-dependent RNA polymerase proteins (RdRPs) from Luteovirus, Totivirus and Rotavirus.	Viruses(1);Riboviria(1);Orthornavirae(1);Duplornaviricota(1)	37.500
+ds2020-267_338	251	pfam00252	gnl|CDD|306711	7.50297e-12	78	206	-1	pfam00252, Ribosomal_L16, Ribosomal protein L16p/L10e.  	cellular organisms(2);Eukaryota(1);Viridiplantae(1)	17.308
+ds2020-267_339	250	pfam00227	gnl|CDD|306690	4.91252e-09	10	150	-2	pfam00227, Proteasome, Proteasome subunit.  The proteasome is a multisubunit structure that degrades proteins. Protein degradation is an essential component of regulation because proteins can become misfolded, damaged, or unnecessary. Proteasomes and their homologs vary greatly in complexity: from HslV (heat shock locus v), which is encoded by 1 gene in bacteria, to the eukaryotic 20S proteasome, which is encoded by more than 14 genes. Recently evidence of two novel groups of bacterial proteasomes was proposed. The first is Anbu, which is sparsely distributed among cyanobacteria and proteobacteria. The second is call beta-proteobacteria proteasome homolog (BPH).	cellular organisms(2);Eukaryota(1);Opisthokonta(1)	21.244
+ds2020-267_343	249	pfam13173	gnl|CDD|315764	2.6724e-08	106	249	1	pfam13173, AAA_14, AAA domain.  This family of domains contain a P-loop motif that is characteristic of the AAA superfamily.	Bacteria(2);cellular organisms(1);FCB group(1)	24.583
+ds2020-267_362	248	pfam00113	gnl|CDD|278539	3.9331e-13	15	116	-1	pfam00113, Enolase_C, Enolase, C-terminal TIM barrel domain.  	cellular organisms(2);Bacteria(2)	21.656
+ds2020-267_363	245	pfam00946	gnl|CDD|307203	3.13472e-05	1	141	1	pfam00946, Mononeg_RNA_pol, Mononegavirales RNA dependent RNA polymerase.  Members of the Mononegavirales including the Paramyxoviridae, like other non-segmented negative strand RNA viruses, have an RNA-dependent RNA polymerase composed of two subunits, a large protein L and a phosphoprotein P. This is a protein family of the L protein. The L protein confers the RNA polymerase activity on the complex. The P protein acts as a transcription factor.	Viruses(1);Riboviria(1);Orthornavirae(1);Negarnaviricota(1)	26.562
+ds2020-267_364	243	pfam00416	gnl|CDD|306841	5.30772e-05	15	134	-2	pfam00416, Ribosomal_S13, Ribosomal protein S13/S18.  This family includes ribosomal protein S13 from prokaryotes and S18 from eukaryotes.	cellular organisms(2);Bacteria(2)	26.276
+ds2020-267_365	243	pfam00216	gnl|CDD|306682	1.89202e-10	134	241	-3	pfam00216, Bac_DNA_binding, Bacterial DNA-binding protein.  	Bacteria(2);cellular organisms(1);Pseudomonadota(1)	25.178
+ds2020-267_366	243	pfam13041	gnl|CDD|315669	0.000344884	134	241	-3	pfam13041, PPR_2, PPR repeat family.  This repeat has no known function. It is about 35 amino acids long and is found in up to 18 copies in some proteins. The family appears to be greatly expanded in plants and fungi. The repeat has been called PPR.	cellular organisms(1);Eukaryota(1);Viridiplantae(1);Streptophyta(1)	17.600
+ds2020-267_370	243	pfam12137	gnl|CDD|314930	3.71293e-05	137	217	-3	pfam12137, RapA_C, RNA polymerase recycling family C-terminal.  This domain is found in bacteria. This domain is about 360 amino acids in length. This domain is found associated with pfam00271, pfam00176. The function of this domain is not known, but structurally it forms an alpha-beta fold in nature with a central beta-sheet flanked by helices and loops, the beta-sheet being mainly antiparallel and flanked by four alpha helices, among which the two longer helices exhibit a coiled-coil arrangement.	cellular organisms(1);Bacteria(1);Pseudomonadota(1);Gammaproteobacteria(1)	24.942
+ds2020-267_372	242	pfam00146	gnl|CDD|306623	2.12078e-10	22	111	1	pfam00146, NADHdh, NADH dehydrogenase.  	cellular organisms(1);Eukaryota(1);Opisthokonta(1);Metazoa(1)	24.942
+ds2020-267_373	242	pfam00124	gnl|CDD|306604	4.44151e-07	21	125	3	pfam00124, Photo_RC, Photosynthetic reaction centre protein.  	cellular organisms(1);Eukaryota(1);Viridiplantae(1);Streptophyta(1)	33.663
+ds2020-267_374	241	pfam02123	gnl|CDD|280316	5.78854e-08	35	214	-1	pfam02123, RdRP_4, Viral RNA-directed RNA-polymerase.  This family includes RNA-dependent RNA polymerase proteins (RdRPs) from Luteovirus, Totivirus and Rotavirus.	Viruses(1);Riboviria(1);Orthornavirae(1);Duplornaviricota(1)	21.831
+ds2020-267_380	239	pfam06122	gnl|CDD|310603	1.30391e-05	29	172	2	pfam06122, TraH, Conjugative relaxosome accessory transposon protein.  The TraH protein is thought to be a relaxosome accessory component, also necessary for transfer but not for H-pilus synthesis within the conjugative transposon.	cellular organisms(1);Bacteria(1);Pseudomonadota(1);Gammaproteobacteria(1)	37.888
+ds2020-267_385	239	pfam00361	gnl|CDD|306795	3.63199e-05	70	219	1	pfam00361, Proton_antipo_M, Proton-conducting membrane transporter.  This is a family of membrane transporters that inlcudes some 7 of potentially 14-16 TM regions. In many instances the family forms part of complex I that catalyzes the transfer of two electrons from NADH to ubiquinone in a reaction that is associated with proton translocation across the membrane, and in this context is a combination predominantly of subunits 2, 4, 5, 14, L, M and N. In many bacterial species these proteins are probable stand-alone transporters not coupled with oxidoreduction. The family in total represents homologs across the phyla.	cellular organisms(1);Eukaryota(1);Opisthokonta(1);Metazoa(1)	18.868
+ds2020-267_386	239	pfam00177	gnl|CDD|306646	1.05327e-06	28	126	1	pfam00177, Ribosomal_S7, Ribosomal protein S7p/S5e.  This family contains ribosomal protein S7 from prokaryotes and S5 from eukaryotes.	cellular organisms(2);Eukaryota(1);Viridiplantae(1)	29.545
+ds2020-267_395	235	pfam03154	gnl|CDD|308660	0.000842762	28	126	1	pfam03154, Atrophin-1, Atrophin-1 family.  Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.	Eukaryota(1);cellular organisms(1);Opisthokonta(1);Metazoa(1)	36.317
+ds2020-267_403	235	pfam00164	gnl|CDD|278589	1.83229e-23	3	182	3	pfam00164, Ribosom_S12_S23, Ribosomal protein S12/S23.  This protein is known as S12 in bacteria and archaea and S23 in eukaryotes.	cellular organisms(2);Eukaryota(1);Viridiplantae(1)	21.831
+ds2020-267_404	234	pfam00155	gnl|CDD|306629	0.000251531	3	182	3	pfam00155, Aminotran_1_2, Aminotransferase class I and II.  	Bacteria(2);cellular organisms(1);Pseudomonadota(1)	25.314
+ds2020-267_835	233	pfam00680	gnl|CDD|279070	0.000703744	3	182	3	pfam00680, RdRP_1, RNA dependent RNA polymerase.  	Viruses(1);Riboviria(1);Orthornavirae(1);Pisuviricota(1)	28.244
+ds2020-267_837	231	pfam00481	gnl|CDD|306885	0.00063843	3	182	3	pfam00481, PP2C, Protein phosphatase 2C.  Protein phosphatase 2C is a Mn++ or Mg++ dependent protein serine/threonine phosphatase.	Eukaryota(2);cellular organisms(1);Viridiplantae(1)	22.921
+ds2020-267_838	230	pfam00072	gnl|CDD|306560	5.30837e-08	50	208	2	pfam00072, Response_reg, Response regulator receiver domain.  This domain receives the signal from the sensor partner in bacterial two-component systems. It is usually found N-terminal to a DNA binding effector domain.	Bacteria(2);cellular organisms(1);Pseudomonadota(1)	34.356
+ds2020-267_843	230	pfam00201	gnl|CDD|278624	2.93544e-07	46	210	1	pfam00201, UDPGT, UDP-glucoronosyl and UDP-glucosyl transferase.  	cellular organisms(1);Eukaryota(1);Viridiplantae(1);Streptophytina(1)	26.684
+ds2020-267_852	228	pfam17035	gnl|CDD|319097	3.87403e-09	108	203	3	pfam17035, BET, Bromodomain extra-terminal - transcription regulation.  The BET, or bromodomain extra-terminal domain, is found on bromodomain proteins that play key roles in development, cancer progression and virus-host pathogenesis. It interacts with NSD3, JMJD6, CHD4, GLTSCR1, and ATAD5 all of which are shown to impart a pTEFb-independent transcriptional activation function on the bromodomain proteins.	cellular organisms(1);Eukaryota(1);Opisthokonta(1);Metazoa(1)	34.188
+ds2020-267_855	207	pfam04061	gnl|CDD|309259	7.30581e-19	1	159	1	pfam04061, ORMDL, ORMDL family.  Evidence form suggests that ORMDLs are involved in protein folding in the ER. Orm proteins have been identified as negative regulators of sphingolipid synthesis that form a conserved complex with serine palmitoyltransferase, the first and rate-limiting enzyme in sphingolipid production. This novel and conserved protein complex, has been termed the SPOTS complex (serine palmitoyltransferase, Orm1/2, Tsc3, and Sac1).	cellular organisms(1);Eukaryota(1);Opisthokonta(1);Metazoa(1)	21.368
+ds2020-267_858	206	pfam10775	gnl|CDD|313884	0.00091969	1	159	1	pfam10775, ATP_sub_h, ATP synthase complex subunit h.  Subunit h is a component of the yeast mitochondrial F1-F0 ATP synthase. It is essential for the correct assembly and functioning of this enzyme. Subunit h occupies a central place in the peripheral stalk between the F1 sector and the membrane.	cellular organisms(1);Eukaryota(1);Opisthokonta(1);Fungi(1)	32.258