dante_ltr: README.md comparison

comparison README.md @ 12:ff01d4263391 draft

"planemo upload commit 414119ad7c44562d2e956b765e97ca113bc35b2b-dirty"

author	petr-novak
date	Thu, 21 Jul 2022 08:23:15 +0000
parents	9de392f2fc02
children


-:54bd36973253
+:ff01d4263391
 ## Usage
 ### Detection of complete LTR retrotransposons
 ```shell
-Usage: ./extract_putative_ltr.R COMMAND [OPTIONS]
+Usage: ./detect_putative_ltr.R COMMAND [OPTIONS]
 Options:
 -g GFF3, --gff3=GFF3
 gff3 with dante results
 #### Example:
 ```shell
 mkdir -p tmp
-./extract_putative_ltr.R -g test_data/sample_DANTE.gff3 -s test_data/sample_genome.fasta -o tmp/ltr_annotation
+./detect_putative_ltr.R -g test_data/sample_DANTE.gff3 -s test_data/sample_genome.fasta -o tmp/ltr_annotation
 ```
 ####  Files in the output of `extract_putative_ltr.R`:
 - `prefix.gff3` - annotation of all identified elements
 - `prefix_DLTP.fasta` - elements with **d**omains, **L**TR, **T**SD and **P**BS
 - `prefix_DLP.fasta` - elements with **d**omains, **L**TR and **P**BS
 - `prefix_DLT.fasta` - elements with **d**omains, **L**TR, **T**SD
 - `prefix_statistics.csv` - number of elements in individual categories
+For large genomes, you can use `detect_putative_ltr_wrapper.py`. This script splits the input FASTA into smaller chunks and runs `detect_putative_ltr.R` on each chunk to limit memory usage. The output is merged after all chunks are processed.
+```shell
+usage: detect_putative_ltr_wrapper.py [-h] -g GFF3 -s REFERENCE_SEQUENCE -o
+OUTPUT [-c CPU] [-M MAX_MISSING_DOMAINS]
+[-L MIN_RELATIVE_LENGTH]
+[-S MAX_CHUNK_SIZE]
+detect_putative_ltr_wrapper.py is a wrapper for
+detect_putative_ltr.R
+optional arguments:
+-h, --help            show this help message and exit
+-g GFF3, --gff3 GFF3  gff3 file
+-s REFERENCE_SEQUENCE, --reference_sequence REFERENCE_SEQUENCE
+reference sequence as fasta file
+-o OUTPUT, --output OUTPUT
+output file path and prefix
+-c CPU, --cpu CPU     number of CPUs
+-M MAX_MISSING_DOMAINS, --max_missing_domains MAX_MISSING_DOMAINS
+-L MIN_RELATIVE_LENGTH, --min_relative_length MIN_RELATIVE_LENGTH
+Minimum relative length of protein domain to be considered
+for retrotransposon detection
+-S MAX_CHUNK_SIZE, --max_chunk_size MAX_CHUNK_SIZE
+If size of reference sequence is greater than this value,
+reference is analyzed in chunks of this size. This is
+just an approximate value - sequences which are longer
+are not split, default is 100000000
+```
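
The chunking behaviour described for `--max_chunk_size` can be illustrated with a minimal sketch. This is not the wrapper's actual code: the function name `plan_chunks`, the greedy grouping strategy, and the toy sequence lengths are all assumptions made for illustration. The only behaviour taken from the help text is that whole sequences are grouped into chunks of roughly `MAX_CHUNK_SIZE` bases and that a sequence longer than the limit is kept intact rather than split.

```python
from typing import Dict, List


def plan_chunks(seq_lengths: Dict[str, int],
                max_chunk_size: int = 100_000_000) -> List[List[str]]:
    """Greedily group whole sequences into chunks of about max_chunk_size bases.

    Hypothetical helper, not part of dante_ltr. A sequence is never split,
    so a chunk holding one oversized sequence may exceed max_chunk_size.
    """
    chunks: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for name, length in seq_lengths.items():
        # Close the current chunk if adding this sequence would exceed the limit.
        if current and current_size + length > max_chunk_size:
            chunks.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += length
    if current:
        chunks.append(current)
    return chunks


# Toy lengths (in bases) with a small limit so the grouping is easy to follow.
lengths = {"chr1": 40, "chr2": 50, "chr3": 150, "chr4": 30, "chr5": 60}
print(plan_chunks(lengths, max_chunk_size=100))
# prints [['chr1', 'chr2'], ['chr3'], ['chr4', 'chr5']]
```

Here `chr3` is longer than the limit and ends up alone in its own chunk, which matches the note that the chunk size is only an approximate value.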
 ### Validation of LTR retrotransposons detected in the previous step:
 ```shell
 ./clean_ltr.R --help
