comparison tool-data/blastdb_d.loc.sample @ 15:c16c30e9ad5b draft

Uploaded v0.1.03 (internal changes); v0.1.02 (BLAST+ 2.2.30 etc)
author peterjc
date Sun, 05 Jul 2015 10:37:27 -0400
parents 9dabbfd73c8a
children
14:2fe07f50a41e 15:c16c30e9ad5b
1 #This is a sample file distributed with Galaxy that is used to define a 1 # This is a sample file distributed with Galaxy that is used to define a
2 #list of protein domain databases, using three columns tab separated 2 # list of protein domain databases, using three columns tab separated
3 #(longer whitespace are TAB characters): 3 # (longer whitespace are TAB characters):
4 # 4 #
5 #<unique_id> <database_caption> <base_name_path> 5 # <unique_id>{tab}<database_caption>{tab}<base_name_path>
6 # 6 #
7 #The captions typically contain spaces and might end with the build date. 7 # The captions typically contain spaces and might end with the build date.
8 #It is important that the actual database name does not have a space in it, 8 # It is important that the actual database name does not have a space in
9 #and that there are only two tabs on each line. 9 # it, and that there are only two tabs on each line.
10 # 10 #
11 #You can download the NCBI provided databases as tar-balls from here: 11 # You can download the NCBI provided databases as tar-balls from here:
12 #ftp://ftp.ncbi.nih.gov/pub/mmdb/cdd/little_endian/ 12 # ftp://ftp.ncbi.nih.gov/pub/mmdb/cdd/little_endian/
13 # 13 #
14 #So, for example, if your database is CDD and the path to your base name 14 # For simplicity, many Galaxy servers are configured to offer just a live
15 #is /data/blastdb/Cdd, then the blastdb_d.loc entry would look like this: 15 # version of each NCBI BLAST database (updated with the NCBI provided
16 # Perl scripts or similar). In this case, we recommend using the case
17 # sensitive base-name of the NCBI BLAST databases as the unique id.
18 # Consistent naming is important for sharing workflows between Galaxy
19 # servers.
16 # 20 #
17 #Cdd{tab}NCBI Conserved Domains Database (CDD){tab}/data/blastdb/Cdd 21 # For example, consider the NCBI Conserved Domains Database (CDD), where
22 # you have downloaded and decompressed the files under the directory
23 # /data/blastdb/domains/ meaning at the command line BLAST+ would be
24 # run as follows and would look at the files /data/blastdb/domains/Cdd.*:
18 # 25 #
19 #and your /data/blastdb directory would contain all of the files associated 26 # $ rpsblast -db /data/blastdb/domains/Cdd -query ...
20 #with the database, /data/blastdb/Cdd.*.
21 # 27 #
22 #Your blastdb_d.loc file should include an entry per line for each "base name" 28 # In this case use Cdd (title case to match the NCBI file naming) as the
23 #you have stored. For example: 29 # unique id in the first column of blastdb_d.loc, giving an entry like
30 # this:
24 # 31 #
25 #Cdd NCBI CDD /data/blastdb/domains/Cdd 32 # Cdd{tab}NCBI Conserved Domains Database (CDD){tab}/data/blastdb/domains/Cdd
26 #Kog KOG (eukaryotes) /data/blastdb/domains/Kog
27 #Cog COG (prokaryotes) /data/blastdb/domains/Cog
28 #Pfam Pfam-A /data/blastdb/domains/Pfam
29 #Smart SMART /data/blastdb/domains/Smart
30 #Tigr TIGR /data/blastdb/domains/Tigr
31 #Prk Protein Clusters database /data/blastdb/domains/Prk
32 #...etc...
33 # 33 #
34 #See also blastdb.loc which is for any nucleotide BLAST database, and 34 # Your blastdb_d.loc file should include an entry per line for each "base name"
35 #blastdb_p.loc which is for any protein BLAST databases. 35 # you have stored. For example:
36 #
37 # Cdd{tab}NCBI CDD{tab}/data/blastdb/domains/Cdd
38 # Kog{tab}KOG (eukaryotes){tab}/data/blastdb/domains/Kog
39 # Cog{tab}COG (prokaryotes){tab}/data/blastdb/domains/Cog
40 # Pfam{tab}Pfam-A{tab}/data/blastdb/domains/Pfam
41 # Smart{tab}SMART{tab}/data/blastdb/domains/Smart
42 # Tigr{tab}TIGR{tab}/data/blastdb/domains/Tigr
43 # Prk{tab}Protein Clusters database{tab}/data/blastdb/domains/Prk
44 # ...etc...
45 #
46 # Alternatively, rather than a "live" mirror of the NCBI databases which
47 # are updated automatically, for full reproducibility the Galaxy Team
48 # recommend saving date-stamped copies of the databases. In this case
49 # your blastdb_d.loc file should include an entry per line for each
50 # version you have stored. For example:
51 #
52 # Cdd_05Jun2010{tab}NCBI CDD 05 Jun 2010{tab}/data/blastdb/domains/05Jun2010/Cdd
53 # Cdd_15Aug2010{tab}NCBI CDD 15 Aug 2010{tab}/data/blastdb/domains/15Aug2010/Cdd
54 # ...etc...
55 #
56 # See also blastdb.loc which is for any nucleotide BLAST database, and
57 # blastdb_p.loc which is for any protein BLAST databases.
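
The format rules stated in the sample file (three tab-separated columns, exactly two tabs per line, no space in the database path, a unique id in the first column) can be checked mechanically. The following is a minimal sketch in Python, not part of Galaxy or of this changeset: the names `LocEntry` and `parse_loc` are hypothetical, and treating lines that start with `#` as comments is an assumption based on how the sample file itself is written.

```python
from typing import List, NamedTuple


class LocEntry(NamedTuple):
    """One data line of a blastdb_d.loc-style file (hypothetical helper type)."""
    unique_id: str
    caption: str
    base_path: str


def parse_loc(text: str) -> List[LocEntry]:
    """Parse blastdb_d.loc-style text, raising ValueError on malformed lines."""
    entries = []
    seen_ids = set()
    for number, line in enumerate(text.splitlines(), start=1):
        # Skip blank lines and comment lines (assumed to start with '#').
        if not line.strip() or line.startswith("#"):
            continue
        # Exactly two tabs means exactly three columns.
        if line.count("\t") != 2:
            raise ValueError(f"line {number}: expected exactly two tabs")
        unique_id, caption, base_path = line.split("\t")
        # Captions may contain spaces; the database path must not.
        if " " in base_path:
            raise ValueError(f"line {number}: space in database path")
        # The first column is documented as a unique id.
        if unique_id in seen_ids:
            raise ValueError(f"line {number}: duplicate id {unique_id!r}")
        seen_ids.add(unique_id)
        entries.append(LocEntry(unique_id, caption, base_path))
    return entries


sample = (
    "# comment line\n"
    "Cdd\tNCBI CDD\t/data/blastdb/domains/Cdd\n"
    "Kog\tKOG (eukaryotes)\t/data/blastdb/domains/Kog\n"
)
entries = parse_loc(sample)
```

A line in which a space has replaced one of the tabs, such as `Tigr{tab}TIGR /data/blastdb/domains/Tigr`, has only one tab and is rejected by this check.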