annotate README.md @ 0:aef9a0c2c65e draft default tip
planemo upload commit 232ce39054ce38be27c436a4cabec2800e14f988-dirty
| author | itaxotools |
|---|---|
| date | Sun, 29 Jan 2023 16:32:28 +0000 |
| parents | |
| children |

# DNAconvert
Convert between different file formats containing genetic information.

## Installation
Install the latest version directly using pip (requires Python 3.8 or later):
```
pip install git+https://github.com/iTaxoTools/DNAconvert.git#egg=DNAconvert
```

## Executables
Download and run the standalone executables without installing Python.<br>
[See the latest release here.](https://github.com/iTaxoTools/DNAconvert/releases/latest)

## Usage
    usage: DNAconvert [-h] [--cmd] [--allow_empty_sequences]
                      [--informat INFORMAT] [--outformat OUTFORMAT]
                      [infile] [outfile]

    DNAconvert

    positional arguments:
      infile                the input file
      outfile               the output file

    optional arguments:
      -h, --help            show this help message and exit
      --cmd                 activates the command-line interface
      --allow_empty_sequences
                            set this to keep the empty sequences in the output
                            file
      --disable_automatic_renaming
                            disables automatic renaming, may result in duplicate
                            sequence names in Phylip and Nexus files
      --informat INFORMAT   format of the input file
      --outformat OUTFORMAT
                            format of the output file

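When DNAconvert is driven from a script, the options listed in the help text can be assembled into an argument list. The following is a minimal sketch and not part of DNAconvert: `build_command` and `convert` are hypothetical helpers, the file names are placeholders, and it assumes the `DNAconvert` executable is on the `PATH`. Only flags that appear in the help text are used.

```python
import subprocess
from typing import List, Optional


def build_command(
    infile: str,
    outfile: str,
    informat: Optional[str] = None,
    outformat: Optional[str] = None,
    allow_empty_sequences: bool = False,
    disable_automatic_renaming: bool = False,
) -> List[str]:
    """Assemble a DNAconvert command line (hypothetical helper)."""
    cmd = ["DNAconvert", "--cmd"]  # --cmd activates the command-line interface
    if allow_empty_sequences:
        cmd.append("--allow_empty_sequences")
    if disable_automatic_renaming:
        cmd.append("--disable_automatic_renaming")
    if informat is not None:
        cmd += ["--informat", informat]
    if outformat is not None:
        cmd += ["--outformat", outformat]
    cmd += [infile, outfile]
    return cmd


def convert(infile: str, outfile: str, **options) -> None:
    """Run the conversion; subprocess raises CalledProcessError on a non-zero exit status."""
    subprocess.run(build_command(infile, outfile, **options), check=True)
```

For example, `build_command("input.fas", "output.nex")` returns `["DNAconvert", "--cmd", "input.fas", "output.nex"]`, leaving both formats to be inferred from the file extensions.
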
### Batch processing

If `infile` is a directory, all files in it will be converted. In this case the `informat` and `outformat` arguments are required.

Specifying names of the output files:
* `outfile` contains a '#' character: '#' will be replaced with the base names of input files.
* `outfile` is a directory: the output files will be written in it, with the same names as input files.

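The two naming rules can be illustrated with a short Python sketch. It is not DNAconvert's own code: `batch_output_paths` is a hypothetical helper, and it assumes that "base name" means the file name without its last extension, which the tool itself may define differently.

```python
from pathlib import Path
from typing import List


def batch_output_paths(input_files: List[Path], outfile: str) -> List[Path]:
    """Map input files to output paths following the two naming rules.

    Assumption: "base name" is the file name without its last
    extension (Path.stem); DNAconvert may define it differently.
    """
    if "#" in outfile:
        # Rule 1: substitute '#' with each input file's base name.
        return [Path(outfile.replace("#", f.stem)) for f in input_files]
    # Rule 2: treat outfile as a directory and keep the input file names.
    return [Path(outfile) / f.name for f in input_files]
```

With the inputs `in/a.fas` and `in/b.fas`, an `outfile` of `out/#.tab` maps to `out/a.tab` and `out/b.tab`, while an `outfile` of `out` maps to `out/a.fas` and `out/b.fas`.
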
## Supported formats
* `tab`: [Internal tab format][1]
* `tab_noheaders`: [Internal tab format][1] without headers
* `fasta`: FASTA format
* `relaxed_phylip`: relaxed Phylip format
* `fasta_hapview`: FASTA format for Haplotype Viewer
* `phylip`: Phylip format
* `fastq`: FASTQ format
* `fasta_gbexport`: FASTA format for export into the GenBank repository
* `nexus`: NEXUS format
* `nexml`: DnaCharacterMatrix in NeXML format
* `genbank`: GenBank flat file format
* `mold_fasta`: FASTA format with sequence name matching requirements for the tool MolD

## Recognised extensions
If the format is not provided, the program can infer it from the file extension.

Currently recognised:
* `.tab`, `.txt`, `.tsv`: [Internal tab format][1]
* `.fas`, `.fasta`, `.fna`: FASTA format
* `.rel.phy`: relaxed Phylip format
* `.hapv.fas`: FASTA format for Haplotype Viewer
* `.phy`: Phylip format
* `.fastq`, `.fq`: FASTQ format
* `.fastq.gz`, `.fq.gz`: FASTQ format compressed with Gzip
* `.gb.fas`: FASTA format for export into the GenBank repository
* `.nex`: NEXUS format
* `.xml`: NeXML format
* `.gb`: GenBank flat file format

Files with the extension `.gz` are decompressed automatically.

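Because several extensions overlap (`.gb.fas` and `.hapv.fas` both end in `.fas`, and `.rel.phy` ends in `.phy`), inference has to prefer the longest matching suffix. The sketch below shows one way to do this. It is an illustration, not DNAconvert's implementation: `infer_format` is a hypothetical function, and each extension is paired with the format name whose description matches in the list of supported formats.

```python
from typing import Optional

# Extension -> format name, transcribed from the lists in this README.
EXTENSIONS = {
    ".tab": "tab", ".txt": "tab", ".tsv": "tab",
    ".fas": "fasta", ".fasta": "fasta", ".fna": "fasta",
    ".rel.phy": "relaxed_phylip",
    ".hapv.fas": "fasta_hapview",
    ".phy": "phylip",
    ".fastq": "fastq", ".fq": "fastq",
    ".gb.fas": "fasta_gbexport",
    ".nex": "nexus",
    ".xml": "nexml",
    ".gb": "genbank",
}


def infer_format(filename: str) -> Optional[str]:
    """Return the format name for a file name, or None if unrecognised."""
    name = filename
    # Assumption: a trailing .gz only marks compression and is ignored here.
    if name.endswith(".gz"):
        name = name[: -len(".gz")]
    # Try longer extensions first so ".gb.fas" wins over ".fas".
    for ext in sorted(EXTENSIONS, key=len, reverse=True):
        if name.endswith(ext):
            return EXTENSIONS[ext]
    return None
```

Here `infer_format("sample.gb.fas")` gives `"fasta_gbexport"` rather than `"fasta"`, and `infer_format("reads.fq.gz")` gives `"fastq"`.
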
## Adding new formats
[Link to documentation](docs/ADDING_FORMATS.md)

[1]: docs/TAB_FORMAT.md

## Options
DNAconvert provides two parsers for the NEXUS format: an internal one (default) and the one from the python-nexus package.

In the file `DNAconvert/config.json` (found in `%APPDATA%\iTaxoTools` or in `$XDG_CONFIG_HOME/`) the key-value pair
```
"nexus_parser" : "(method)"
```
determines the parser. `(method)` is either `internal` or `python-nexus`.

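The setting can also be changed programmatically. The following sketch is not part of DNAconvert: `set_nexus_parser` is a hypothetical helper, it assumes `config.json` holds a single flat JSON object, and the caller supplies the path to the file for their platform.

```python
import json
from pathlib import Path

VALID_PARSERS = ("internal", "python-nexus")


def set_nexus_parser(config_path: Path, method: str) -> None:
    """Set the "nexus_parser" key in a config.json, keeping other keys."""
    if method not in VALID_PARSERS:
        raise ValueError(f"nexus_parser must be one of {VALID_PARSERS}")
    config = {}
    if config_path.exists():
        # Assumption: the file contains one flat JSON object.
        config = json.loads(config_path.read_text(encoding="utf-8"))
    config["nexus_parser"] = method
    # Create the enclosing directory if it does not exist yet.
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
```

Calling `set_nexus_parser(path, "python-nexus")` switches to the python-nexus parser, and any other value than the two documented ones raises a `ValueError`.
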
## Generating an executable
Using [PyInstaller](http://www.pyinstaller.org) is recommended. First clone the repository and install DNAconvert with all dependencies (including PyInstaller):
```
git clone https://github.com/iTaxoTools/DNAconvert.git
cd DNAconvert
pip install ".[dev]"
```

Running the following command creates the `dist` directory (among others), with the executable inside it:
```
pyinstaller scripts/DNAconvert.spec
```

## Dependencies
Automatically installed when using pip:
* [python-nexus](https://pypi.org/project/python-nexus/)
* [dendropy](https://pypi.org/project/DendroPy/)
