annotate README.md @ 0:f80c9e6700ba draft default tip

planemo upload commit 91a780909d1eda07d17f6aebf7f08f0c024b6a25
author yating-l
date Tue, 16 May 2017 13:18:12 -0400
parents
children
WindowMasker
------------

This is a Galaxy wrapper for WindowMasker. WindowMasker is a program that masks highly repetitive and low-complexity DNA sequences within a genome using the sequence of the genome itself.

The WinMask module works in two stages. During Stage 1, unit counts are collected and stored in a separate file. During Stage 2, that file is used to mask the input sequences. Usually the unit counts file is created once per genome and then reused for multiple masking runs.
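A minimal sketch of this two-stage workflow, assuming a `windowmasker` binary is on the `PATH`. The file names (`genome.fa`, `chr1.fa`, `chr2.fa`, `genome.counts`) and the `run` helper are hypothetical and not part of WindowMasker. By default the helper only prints each command; set `DRY_RUN=0` to execute them. Stage 1 runs once, and the resulting counts file is reused for every Stage 2 run.

```shell
#!/bin/sh
# Hypothetical file names; substitute your own.
GENOME=genome.fa        # FASTA file of the whole genome
COUNTS=genome.counts    # unit counts file written by Stage 1

# Helper (not part of WindowMasker): print each command and execute it
# only when DRY_RUN=0, so the sketch can be tried without any data.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" != "0" ] || "$@"; }

# Stage 1: collect unit counts once per genome.
run windowmasker -mk_counts -in "$GENOME" -out "$COUNTS"

# Stage 2: reuse the same counts file to mask several inputs.
for seqs in chr1.fa chr2.fa; do
    run windowmasker -ustat "$COUNTS" -in "$seqs" -out "${seqs%.fa}.masked"
done
```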

WindowMasker_mkcounts
======================
Stage 1: Generate a counts file

$ windowmasker -mk_counts [-in input_file_name] [-out output_file_name] [-checkdup check_duplicates] [-t_low T_low] [-t_high T_high] [-fa_list input_is_a_list] [-mem available_memory] [-unit unit_length] [-genome_size genome_size] [-exclude_ids exclude_id_list] [-ids id_list] [-infmt input_format] [-sformat unit_counts_format] [-smem available_memory] [-use_ba use_bit_arrays]
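For illustration, here is a Stage 1 invocation with a few of the optional flags from the synopsis filled in. The file names and values are hypothetical. Two readings are assumptions here: that `-mem` is given in megabytes, and that `-checkdup true` asks WindowMasker to check the input for duplicate sequences. The `run` helper is not part of WindowMasker and only prints the command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Hypothetical file names; substitute your own.
GENOME=genome.fa
COUNTS=genome.counts

# Helper (not part of WindowMasker): dry run by default.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" != "0" ] || "$@"; }

# Stage 1 with explicit input format, duplicate checking, and a memory
# limit (value assumed to be in megabytes).
run windowmasker -mk_counts \
    -in "$GENOME" -infmt fasta \
    -checkdup true \
    -mem 2048 \
    -out "$COUNTS"
```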


WindowMasker_ustat
===================
Stage 2: WindowMasker reads the unit counts generated in Stage 1 and a set of input DNA sequences, and outputs information about the masked subintervals. If "-dust true" is specified, the algorithm of the DUST module is applied to the input sequences in addition to window-based masking. When the DUST module is run, the results of the DUST and WinMask modules are merged in the output: a base is masked if it is masked by either DUST or WinMask.

$ windowmasker -ustat unit_counts [-in input_file_name] [-out output_file_name] [-window window_size] [-t_thres T_threshold] [-t_extend T_extend] [-t_low T_low] [-t_high T_high] [-set_t_low score] [-set_t_high score] [-infmt input_format] [-outfmt output_format] [-dust use_dust] [-exclude_ids exclude_id_list] [-ids id_list] [-text_match text_match_ids] [-use_ba use_bit_arrays]
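A sketch of a Stage 2 run with DUST enabled, assuming the counts file from Stage 1 already exists. The file names are hypothetical, and `interval` and `fasta` are used as `-outfmt` values on the assumption that the installed WindowMasker version accepts them. The `run` helper is not part of WindowMasker and only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Hypothetical file names; substitute your own.
GENOME=genome.fa
COUNTS=genome.counts    # produced earlier by "windowmasker -mk_counts"

# Helper (not part of WindowMasker): dry run by default.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" != "0" ] || "$@"; }

# Window-based masking merged with DUST, reported as masked intervals.
run windowmasker -ustat "$COUNTS" -in "$GENOME" -dust true \
    -outfmt interval -out genome.interval

# The same masking, written out as FASTA instead.
run windowmasker -ustat "$COUNTS" -in "$GENOME" -dust true \
    -outfmt fasta -out genome.masked.fa
```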

Output formats:
* Use the binary or text maskinfo ASN.1 output formats to generate the mask file for the NCBI BLAST+ makeblastdb tool
* Use the BED output format to generate a list of masked regions
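The first bullet might be put into practice as sketched below. The sketch follows the masking workflow described in the NCBI BLAST+ user manual; the `-parse_seqids` and `-mask_data` options and the `maskinfo_asn1_bin` format name come from that manual rather than from this README, and all file and database names are hypothetical. The `run` helper is not part of either tool and only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Hypothetical file and database names; substitute your own.
GENOME=genome.fa
COUNTS=genome.counts    # produced earlier by "windowmasker -mk_counts"
MASK=genome.asnb        # binary maskinfo ASN.1 file for makeblastdb

# Helper (not part of WindowMasker or BLAST+): dry run by default.
run() { echo "+ $*"; [ "${DRY_RUN:-1}" != "0" ] || "$@"; }

# Stage 2, writing the masks in binary maskinfo ASN.1 format.
run windowmasker -ustat "$COUNTS" -in "$GENOME" \
    -outfmt maskinfo_asn1_bin -parse_seqids -out "$MASK"

# Build a nucleotide BLAST database that carries the masking information.
run makeblastdb -in "$GENOME" -dbtype nucl -parse_seqids \
    -mask_data "$MASK" -out genome_db
```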

Reference
==========
[NCBI C++ Toolkit Cross Reference -- WindowMasker](https://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/app/winmasker/README)

Citation
=========

[1] Morgulis A, Gertz EM, Schaffer AA, Agarwala R. WindowMasker:
window-based masker for sequenced genomes. Bioinformatics. 2006;22(2):134-141.