annotate README.md @ 0:aef9a0c2c65e draft default tip

planemo upload commit 232ce39054ce38be27c436a4cabec2800e14f988-dirty
author itaxotools
date Sun, 29 Jan 2023 16:32:28 +0000
# DNAconvert
Convert between different file formats containing genetic information.

## Installation
Install the latest version directly using pip (requires Python 3.8 or later):
```
pip install git+https://github.com/iTaxoTools/DNAconvert.git#egg=DNAconvert
```

## Executables
Download and run the standalone executables without installing Python.<br>
[See the latest release here.](https://github.com/iTaxoTools/DNAconvert/releases/latest)

## Usage
    usage: DNAconvert [-h] [--cmd] [--allow_empty_sequences]
                      [--disable_automatic_renaming]
                      [--informat INFORMAT] [--outformat OUTFORMAT]
                      [infile] [outfile]
    DNAconvert

    positional arguments:
      infile                the input file
      outfile               the output file

    optional arguments:
      -h, --help            show this help message and exit
      --cmd                 activates the command-line interface
      --allow_empty_sequences
                            set this to keep the empty sequences in the output
                            file
      --disable_automatic_renaming
                            disables automatic renaming, may result in duplicate
                            sequence names in Phylip and Nexus files
      --informat INFORMAT   format of the input file
      --outformat OUTFORMAT
                            format of the output file

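As an illustration of the options listed in the help text, the following Python sketch assembles a command line for a single-file conversion. The helper names `build_command` and `run_conversion` and the file names are hypothetical; only the flags come from the help text. That `--cmd` is the right switch for non-interactive use is an assumption based on its description.

```python
import shutil
import subprocess
from typing import List, Optional


def build_command(infile: str, outfile: str,
                  informat: Optional[str] = None,
                  outformat: Optional[str] = None,
                  allow_empty: bool = False) -> List[str]:
    """Assemble a DNAconvert argument list (hypothetical helper)."""
    # Assumption: --cmd selects the command-line interface.
    cmd = ["DNAconvert", "--cmd"]
    if allow_empty:
        cmd.append("--allow_empty_sequences")
    if informat is not None:
        cmd += ["--informat", informat]
    if outformat is not None:
        cmd += ["--outformat", outformat]
    return cmd + [infile, outfile]


def run_conversion(command: List[str]) -> None:
    """Run the command only if the DNAconvert executable is on PATH."""
    if shutil.which(command[0]) is None:
        raise FileNotFoundError("DNAconvert is not installed or not on PATH")
    subprocess.run(command, check=True)


# Example with placeholder file names; nothing is executed here.
command = build_command("sequences.fas", "sequences.tab",
                        informat="fasta", outformat="tab")
print(" ".join(command))
```

Passing the resulting list to `run_conversion` would start the conversion, provided the input file exists.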
### Batch processing

If `infile` is a directory, all files in it will be converted. In this case, the `--informat` and `--outformat` arguments are required.

Specifying names of the output files:
* `outfile` contains a '#' character: each '#' will be replaced with the base name of the corresponding input file.
* `outfile` is a directory: the output files will be written in it, with the same names as the input files.

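The two naming rules can be sketched in Python as follows. This illustrates the behaviour described here and is not DNAconvert's actual code: the function name `resolve_output_path` is hypothetical, and the sketch assumes that "base name" means the input file name without its final extension.

```python
from pathlib import Path


def resolve_output_path(infile: str, outfile: str) -> Path:
    """Return the output path for one input file in batch mode (hypothetical helper)."""
    source = Path(infile)
    target = Path(outfile)
    if target.is_dir():
        # Directory rule: write into the directory, keeping the input file's name.
        return target / source.name
    if "#" in outfile:
        # '#' rule: substitute the base name (assumed here to be the file
        # name without its final extension).
        return Path(outfile.replace("#", source.stem))
    # The README describes only these two cases; this sketch rejects anything else.
    raise ValueError("outfile must be a directory or contain '#'")


# Placeholder paths for illustration.
print(resolve_output_path("data/sample1.fas", "converted/#.tab"))
```

With these placeholder paths, the printed result ends in `sample1.tab`.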
## Supported formats
* `tab`: [Internal tab format][1]
* `tab_noheaders`: [Internal tab format][1] without headers
* `fasta`: FASTA format
* `relaxed_phylip`: relaxed Phylip format
* `fasta_hapview`: FASTA format for Haplotype Viewer
* `phylip`: Phylip format
* `fastq`: FASTQ format
* `fasta_gbexport`: FASTA format for export into the GenBank repository
* `nexus`: NEXUS format
* `nexml`: DnaCharacterMatrix in NeXML format
* `genbank`: GenBank flat file format
* `mold_fasta`: FASTA format with sequence name matching requirements for the tool MolD

## Recognised extensions
If a format is not provided, the program can infer it from the file extension.

Currently recognised:
* `.tab`, `.txt`, `.tsv`: [Internal tab format][1]
* `.fas`, `.fasta`, `.fna`: FASTA format
* `.rel.phy`: relaxed Phylip format
* `.hapv.fas`: FASTA format for Haplotype Viewer
* `.phy`: Phylip format
* `.fastq`, `.fq`: FASTQ format
* `.fastq.gz`, `.fq.gz`: FASTQ format compressed with Gzip
* `.gb.fas`: FASTA format for export into the GenBank repository
* `.nex`: NEXUS format
* `.xml`: NeXML format
* `.gb`: GenBank flat file format

Files with the extension `.gz` are uncompressed automatically.

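Because some extensions are suffixes of others (`.rel.phy` ends in `.phy`, `.gb.fas` ends in `.fas`), inference has to prefer the longest match. The following Python sketch shows one way to do that. It is not DNAconvert's own implementation: `infer_format` is a hypothetical helper, and the pairing of extensions with format names is an assumption made by matching the descriptions in the two lists.

```python
from typing import Optional

# Extension -> format name, paired by matching the descriptions in the
# "Supported formats" and "Recognised extensions" lists (an assumption).
EXTENSIONS = {
    ".tab": "tab", ".txt": "tab", ".tsv": "tab",
    ".fas": "fasta", ".fasta": "fasta", ".fna": "fasta",
    ".rel.phy": "relaxed_phylip",
    ".hapv.fas": "fasta_hapview",
    ".phy": "phylip",
    ".fastq": "fastq", ".fq": "fastq",
    ".gb.fas": "fasta_gbexport",
    ".nex": "nexus",
    ".xml": "nexml",
    ".gb": "genbank",
}


def infer_format(filename: str) -> Optional[str]:
    """Guess the format name from a file name, or return None (hypothetical helper)."""
    name = filename
    if name.endswith(".gz"):
        # Compressed files are judged by the extension underneath ".gz".
        name = name[: -len(".gz")]
    # Try longer extensions first so ".rel.phy" wins over ".phy".
    for ext in sorted(EXTENSIONS, key=len, reverse=True):
        if name.endswith(ext):
            return EXTENSIONS[ext]
    return None


for example in ("alignment.rel.phy", "alignment.phy", "reads.fq.gz", "notes.doc"):
    print(example, "->", infer_format(example))
```

Formats without a listed extension, such as `tab_noheaders` and `mold_fasta`, cannot be inferred this way and would need an explicit `--informat` or `--outformat`.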
## Adding new formats
[Link to documentation](docs/ADDING_FORMATS.md)

[1]: docs/TAB_FORMAT.md

## Options
DNAconvert can use one of two parsers for the NEXUS format: an internal one (the default) and the one from the python-nexus package.

In the file `DNAconvert/config.json` (found in `%APPDATA%\iTaxoTools` or in `$XDG_CONFIG_HOME/`) the key-value pair
```
"nexus_parser": "(method)"
```
determines the parser. `(method)` is either `internal` or `python-nexus`.

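Instead of editing the file by hand, the setting could also be changed with a short script. The sketch below is only an illustration: `set_nexus_parser` is a hypothetical helper, and it assumes that `config.json` holds a single JSON object whose other keys should be preserved. The demonstration writes to a temporary directory rather than to the real configuration location.

```python
import json
import tempfile
from pathlib import Path


def set_nexus_parser(config_path: Path, method: str) -> None:
    """Write the `nexus_parser` key into a config.json (hypothetical helper)."""
    if method not in ("internal", "python-nexus"):
        raise ValueError(f"unknown parser: {method}")
    # Assumption: the file is one JSON object; keep any other settings in it.
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config["nexus_parser"] = method
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2))


# Demonstration on a throwaway directory; point `path` at the real
# DNAconvert/config.json to change the actual setting.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "DNAconvert" / "config.json"
    set_nexus_parser(path, "python-nexus")
    print(path.read_text())
```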
## Generating an executable
Using [PyInstaller](http://www.pyinstaller.org) is recommended. First clone the repository and install DNAconvert with all dependencies (this includes PyInstaller):
```
git clone https://github.com/iTaxoTools/DNAconvert.git
cd DNAconvert
pip install ".[dev]"
```

Running the following command creates the `dist` directory (among others), which will contain the executable:
```
pyinstaller scripts/DNAconvert.spec
```

## Dependencies
Automatically installed when using pip:
* [python-nexus](https://pypi.org/project/python-nexus/)
* [dendropy](https://pypi.org/project/DendroPy/)