comparison tool-data/blastdb_p.loc.sample @ 15:c16c30e9ad5b draft

Uploaded v0.1.03 (internal changes); v0.1.02 (BLAST+ 2.2.30 etc)
author peterjc
date Sun, 05 Jul 2015 10:37:27 -0400
parents 9dabbfd73c8a
children
14:2fe07f50a41e 15:c16c30e9ad5b
1 #This is a sample file distributed with Galaxy that is used to define a 1 # This is a sample file distributed with Galaxy that is used to define a
2 #list of protein BLAST databases, using three columns tab separated 2 # list of protein BLAST databases, using three columns tab separated:
3 #(longer whitespace are TAB characters):
4 # 3 #
5 #<unique_id> <database_caption> <base_name_path> 4 # <unique_id>{tab}<database_caption>{tab}<base_name_path>
6 # 5 #
7 #The captions typically contain spaces and might end with the build date. 6 # The captions typically contain spaces and might end with the build date.
8 #It is important that the actual database name does not have a space in 7 # It is important that the actual database name does not have a space in
9 #it, and that there are only two tabs on each line. 8 # it, and that there are only two tabs on each line.
10 # 9 #
11 #So, for example, if your database is NR and the path to your base name 10 # You can download the NCBI provided protein databases like NR from here:
12 #is /data/blastdb/nr, then the blastdb_p.loc entry would look like this: 11 # ftp://ftp.ncbi.nlm.nih.gov/blast/db/
13 # 12 #
14 #nr{tab}NCBI NR (non redundant){tab}/data/blastdb/nr 13 # For simplicity, many Galaxy servers are configured to offer just a live
14 # version of each NCBI BLAST database (updated with the NCBI provided
15 # Perl scripts or similar). In this case, we recommend using the case
16 # sensitive base-name of the NCBI BLAST databases as the unique id.
17 # Consistent naming is important for sharing workflows between Galaxy
18 # servers.
15 # 19 #
16 #and your /data/blastdb directory would contain all of the files associated 20 # For example, consider the NCBI "non-redundant" protein BLAST database
17 #with the database, /data/blastdb/nr.*. 21 # where you have downloaded and decompressed the files under /data/blastdb/
22 # meaning that at the command line BLAST+ would be run with something like
23 # the following, which would look at the files /data/blastdb/nr.p*:
18 # 24 #
19 #Your blastdb_p.loc file should include an entry per line for each "base name" 25 # $ blastp -db /data/blastdb/nr -query ...
20 #you have stored. For example:
21 # 26 #
22 #nr_05Jun2010 NCBI NR (non redundant) 05 Jun 2010 /data/blastdb/05Jun2010/nr 27 # In this case use nr (lower case to match the NCBI file naming) as the
23 #nr_15Aug2010 NCBI NR (non redundant) 15 Aug 2010 /data/blastdb/15Aug2010/nr 28 # unique id in the first column of blastdb_p.loc, giving an entry like
24 #...etc... 29 # this:
25 # 30 #
26 #You can download the NCBI provided protein databases like NR from here: 31 # nr{tab}NCBI non-redundant (nr){tab}/data/blastdb/nr
27 #ftp://ftp.ncbi.nlm.nih.gov/blast/db/
28 # 32 #
29 #See also blastdb.loc which is for any nucleotide BLAST database, and 33 # Alternatively, rather than a "live" mirror of the NCBI databases which
30 #blastdb_d.loc which is for any protein domains databases (like CDD). 34 # are updated automatically, for full reproducibility the Galaxy Team
35 # recommend saving date-stamped copies of the databases. In this case
36 # your blastdb_p.loc file should include an entry per line for each
37 # version you have stored. For example:
38 #
39 # nr_05Jun2010{tab}NCBI NR (non redundant) 05 Jun 2010{tab}/data/blastdb/05Jun2010/nr
40 # nr_15Aug2010{tab}NCBI NR (non redundant) 15 Aug 2010{tab}/data/blastdb/15Aug2010/nr
41 # ...etc...
42 #
43 # See also blastdb.loc which is for any nucleotide BLAST database, and
44 # blastdb_d.loc which is for any protein domains databases (like CDD).
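
The three-column layout described in the new version of the file (unique id, caption, base name path, separated by exactly two tabs) can be checked mechanically. The following is a minimal sketch, not part of Galaxy or of this changeset: the function names `parse_loc_line` and `parse_loc` are hypothetical, and it assumes that "the actual database name" in the comments refers to the base name path in the third column.

```python
def parse_loc_line(line):
    """Parse one line of a blastdb_p.loc style file.

    Returns (unique_id, caption, base_name_path), or None for blank
    lines and '#' comment lines. Raises ValueError for malformed entries.
    """
    stripped = line.rstrip("\r\n")
    if not stripped.strip() or stripped.lstrip().startswith("#"):
        return None
    fields = stripped.split("\t")
    # Exactly two tabs per line means exactly three columns.
    if len(fields) != 3:
        raise ValueError(
            f"expected 3 tab-separated columns, got {len(fields)}: {stripped!r}"
        )
    unique_id, caption, base_name_path = fields
    # Assumption: the "no space" rule applies to the base name path.
    if " " in base_name_path:
        raise ValueError(f"base name path contains a space: {base_name_path!r}")
    return unique_id, caption, base_name_path


def parse_loc(text):
    """Parse the full text of a .loc file into a list of entries."""
    entries = []
    for line in text.splitlines():
        entry = parse_loc_line(line)
        if entry is not None:
            entries.append(entry)
    return entries


if __name__ == "__main__":
    sample = (
        "# comment lines are skipped\n"
        "nr\tNCBI non-redundant (nr)\t/data/blastdb/nr\n"
    )
    for unique_id, caption, path in parse_loc(sample):
        print(unique_id, caption, path, sep=" | ")
```

Captions are allowed to contain spaces, so only the column count and the path are validated here; a stricter check (for example, that the id matches the NCBI base name) would depend on how a given server is configured.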