annotate test-data/cd_hit_protein_in.fasta @ 3:43724ea1c85f

Add cd-hit for protein fastas
author Jim Johnson <jj@umn.edu>
date Thu, 27 Jun 2013 21:37:08 -0500
rev   line source
3
43724ea1c85f Add cd-hit for protein fastas
Jim Johnson <jj@umn.edu>
parents:
diff changeset
1 >sp|P00325|ADH1B_HUMAN Alcohol dehydrogenase 1B OS=Homo sapiens GN=ADH1B PE=1 SV=2
2 MSTAGKVIKCKAAVLWEVKKPFSIEDVEVAPPKAYEVRIKMVAVGICRTDDHVVSGNLVT
3 PLPVILGHEAAGIVESVGEGVTTVKPGDKVIPLFTPQCGKCRVCKNPESNYCLKNDLGNP
4 RGTLQDGTRRFTCRGKPIHHFLGTSTFSQYTVVDENAVAKIDAASPLEKVCLIGCGFSTG
5 YGSAVNVAKVTPGSTCAVFGLGGVGLSAVMGCKAAGAARIIAVDINKDKFAKAKELGATE
6 CINPQDYKKPIQEVLKEMTDGGVDFSFEVIGRLDTMMASLLCCHEACGTSVIVGVPPASQ
7 NLSINPMLLLTGRTWKGAVYGGFKSKEGIPKLVADFMAKKFSLDALITHVLPFEKINEGF
8 DLLHSGKSIRTVLTF
9 >tr|K7D361|K7D361_PANTR Alcohol dehydrogenase 1B (Class I), beta polypeptide OS=Pan troglodytes GN=ADH1B PE=2 SV=1
10 MSTAGKVIKCKAAVLWEVKKPFSIEDVEVAPPKAYEVRIKMVAVGICRTDDHVVSGNLVT
11 PLPAILGHEAAGIVESVGEGVTTVKPGDKVIPLFTPQCGKCRVCKNPESNYCLKNDLGNP
12 RGTLQDGTRRFTCRGKPIHHFLGTSTFSQYTVVDENAVAKIDAASPLEKVCLIGCGFSTG
13 YGSAVNVAKVTPGSTCAVFGLGGVGLSAVMGCKAAGAARIIAVDINKDKFAKAKELGATE
14 CINPQDYKKPIQEVLKEMTDGGVDFSFEVIGRLDTMMASLLCCHEACGTSVIVGVPPASQ
15 NLSINPMLLLTGRTWKGAVYGGFKSKEGIPKLVADFMAKKFSLDALITHVLPFEKINEGF
16 DLLHSGKSIRTVLTF
17 >sp|P00329|ADH1_MOUSE Alcohol dehydrogenase 1 OS=Mus musculus GN=Adh1 PE=2 SV=2
18 MSTAGKVIKCKAAVLWELHKPFTIEDIEVAPPKAHEVRIKMVATGVCRSDDHVVSGTLVT
19 PLPAVLGHEGAGIVESVGEGVTCVKPGDKVIPLFSPQCGECRICKHPESNFCSRSDLLMP
20 RGTLREGTSRFSCKGKQIHNFISTSTFSQYTVVDDIAVAKIDGASPLDKVCLIGCGFSTG
21 YGSAVKVAKVTPGSTCAVFGLGGVGLSVIIGCKAAGAARIIAVDINKDKFAKAKELGATE
22 CINPQDYSKPIQEVLQEMTDGGVDFSFEVIGRLDTMTSALLSCHAACGVSVVVGVPPNAQ
23 NLSMNPMLLLLGRTWKGAIFGGFKSKDSVPKLVADFMAKKFPLDPLITHVLPFEKINEAF
24 DLLRSGKSIRTVLTF
25 >sp|P00338-2|LDHA_HUMAN Isoform 2 of L-lactate dehydrogenase A chain OS=Homo sapiens GN=LDHA
26 MATLKDQLIYNLLKEEQTPQNKITVVGVGAVGMACAISILMKDLADELALVDVIEDKLKG
27 EMMDLQHGSLFLRTPKIVSGKDYNVTANSKLVIITAGARQQEGESRLNLVQRNVNIFKFI
28 IPNVVKYSPNCKLLIVSNPVDILTYVAWKISGFPKNRVIGSGCNLDSARFRYLMGERLGV
29 HPLSCHGWVLGEHGDSSVPVWSGMNVAGVSLKTLHPDLGTDKDKEQWKECRYTLGDPKGA
30 AILKSSDVISFHCLGYNRILGGGCACCPFYLICD
31 >sp|P00338-5|LDHA_HUMAN Isoform 5 of L-lactate dehydrogenase A chain OS=Homo sapiens GN=LDHA
32 MATLKDQLIYNLLKEEQTPQNKITVVGVGAVGMACAISILMKDLADELALVDVIEDKLKG
33 EMMDLQHGSLFLRTPKIVSGKDYNVTANSKLVIITAGARQQEGESRLNLVQRNVNIFKFI
34 IPNVVKYSPNCKLLIVSNPVDILTYVAWKISGFPKNRVIGSGCNLDSARFRYLMGERLGV
35 HPLSCHGWVLGEHGDSSVPVWSGMNVAGVSLKTLHPDLGTDKDKEQWKEVHKQVVERVFT
36 E
37 >sp|P00340|LDHA_CHICK L-lactate dehydrogenase A chain OS=Gallus gallus GN=LDHA PE=1 SV=3
38 MSLKDHLIHNVHKEEHAHAHNKISVVGVGAVGMACAISILMKDLADELTLVDVVEDKLKG
39 EMLDLQHGSLFLKTPKIISGKDYSVTAHSKLVIVTAGARQQEGESRLNLVQRNVNIFKFI
40 IPNVVKYSPDCKLLIVSNPVDILTYVAWKISGFPKHRVIGSGCNLDSARFRHLMGERLGI
41 HPLSCHGWIVGEHGDSSVPVWSGVNVAGVSLKALHPDMGTDADKEHWKEVHKQVVDSAYE
42 VIKLKGYTSWAIGLSVADLAETIMKNLRRVHPISTAVKGMHGIKDDVFLSVPCVLGSSGI
43 TDVVKMILKPDEEEKIKKSADTLWGIQKELQF
44 >sp|P19858|LDHA_BOVIN L-lactate dehydrogenase A chain OS=Bos taurus GN=LDHA PE=2 SV=2
45 MATLKDQLIQNLLKEEHVPQNKITIVGVGAVGMACAISILMKDLADEVALVDVMEDKLKG
46 EMMDLQHGSLFLRTPKIVSGKDYNVTANSRLVIITAGARQQEGESRLNLVQRNVNIFKFI
47 IPNIVKYSPNCKLLVVSNPVDILTYVAWKISGFPKNRVIGSGCNLDSARFRYLMGERLGV
48 HPLSCHGWILGEHGDSSVPVWSGVNVAGVSLKNLHPELGTDADKEQWKAVHKQVVDSAYE
49 VIKLKGYTSWAIGLSVADLAESIMKNLRRVHPISTMIKGLYGIKEDVFLSVPCILGQNGI
50 SDVVKVTLTHEEEACLKKSADTLWGIQKELQF