Hi, I’m GECCO!
==============

🦎 Overview
---------------

GECCO (Gene Cluster prediction with Conditional Random Fields) is a fast
and scalable method for identifying putative novel Biosynthetic Gene
Clusters (BGCs) in genomic and metagenomic data using Conditional Random
Fields (CRFs).

|GitLabCI| |License| |Coverage| |Docs| |Source| |Mirror| |Changelog|
|Issues| |Preprint| |PyPI| |Bioconda| |Versions| |Wheel|

🔧 Installing GECCO
-------------------

GECCO is implemented in `Python <https://www.python.org/>`__, and
supports `all versions <https://endoflife.date/python>`__ from Python
3.6 onwards. It requires additional libraries that can be installed
directly from `PyPI <https://pypi.org>`__, the Python Package Index.

Use `pip <https://pip.pypa.io/en/stable/>`__ to install GECCO on
your machine:

.. code:: console

   $ pip install gecco-tool

If you’d rather use `Conda <https://conda.io>`__, a package is available
in the `bioconda <https://bioconda.github.io/>`__ channel. You can
install it with:

.. code:: console

   $ conda install -c bioconda gecco

This will install GECCO, its dependencies, and the data needed to run
predictions. This requires around 100 MB of data to be downloaded, so it
could take some time depending on your Internet connection. Once done,
you will have a ``gecco`` command available in your ``$PATH``.

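As a quick sanity check after installation, the short Python sketch
below looks for the ``gecco`` executable in the ``$PATH`` and prints a
warning on Windows, where GECCO cannot run because of its HMMER3
dependency. The helper name ``check_gecco_install`` is hypothetical and
not part of GECCO; it only relies on the Python standard library.

.. code:: python

   import shutil
   import sys
   from typing import Optional


   def check_gecco_install(command: str = "gecco") -> Optional[str]:
       """Return the full path of the ``gecco`` executable, or None if absent."""
       # GECCO depends on HMMER3, which needs a POSIX operating system.
       if sys.platform.startswith("win"):
           print("warning: GECCO is not expected to run on Windows")
       # shutil.which searches the directories listed in $PATH.
       return shutil.which(command)


   if __name__ == "__main__":
       path = check_gecco_install()
       print(f"gecco found at {path}" if path else "gecco not found in $PATH")
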
*Note that GECCO uses*\ `HMMER3 <http://hmmer.org/>`__\ *, which can
only run on PowerPC and recent x86-64 machines running a POSIX operating
system. Therefore, Linux and macOS are supported platforms, but GECCO will
not be able to run on Windows.*

🧬 Running GECCO
-----------------

Once ``gecco`` is installed, you can run it from the terminal by giving
it a FASTA or GenBank file with the genomic sequence you want to
analyze, as well as an output directory:

.. code:: console

   $ gecco run --genome some_genome.fna -o some_output_dir

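To call GECCO from a script rather than an interactive shell, the same
command can be assembled in Python. The sketch below is only an
illustration: the helpers ``build_gecco_command`` and ``run_gecco`` are
hypothetical, and they use nothing beyond the ``run`` subcommand and the
``--genome`` and ``-o`` options shown above. Building the argument list
separately means it can be inspected without GECCO being installed.

.. code:: python

   import subprocess
   from pathlib import Path
   from typing import List, Sequence, Union

   PathLike = Union[str, Path]


   def build_gecco_command(
       genome: PathLike,
       output_dir: PathLike,
       extra_args: Sequence[str] = (),
   ) -> List[str]:
       """Build the argument list for ``gecco run`` (hypothetical helper)."""
       command = ["gecco", "run", "--genome", str(genome), "-o", str(output_dir)]
       # Any additional command-line options are appended unchanged.
       command.extend(extra_args)
       return command


   def run_gecco(genome: PathLike, output_dir: PathLike) -> None:
       """Run GECCO and raise CalledProcessError if it exits with an error."""
       subprocess.run(build_gecco_command(genome, output_dir), check=True)

Calling ``run_gecco("some_genome.fna", "some_output_dir")`` should then
be equivalent to the console command above, provided ``gecco`` is
available in the ``$PATH``.
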
Additional parameters of interest are:

- ``--jobs``, which controls the number of threads that will be spawned
  by GECCO whenever a step can be parallelized. The default, *0*, will
  autodetect the number of CPUs on the machine using
  `os.cpu_count <https://docs.python.org/3/library/os.html#os.cpu_count>`__.
- ``--cds``, controlling the minimum number of consecutive genes a BGC
  region must have to be detected by GECCO (default is 3).
- ``--threshold``, controlling the minimum probability for a gene to be
  considered part of a BGC region. Using a lower number will increase
  the number (and possibly length) of predictions, but reduce accuracy.

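To make the effect of these options more concrete, here is a toy Python
sketch. It is not GECCO’s actual implementation: it simply assumes that
each gene already has a predicted probability, and keeps runs of
consecutive genes at or above the threshold that contain at least
``cds`` genes. The names ``resolve_jobs`` and ``find_regions``, the
example probabilities, and the 0.5 and 0.3 thresholds are all made up
for illustration.

.. code:: python

   import os
   from typing import List, Sequence, Tuple


   def resolve_jobs(jobs: int = 0) -> int:
       """Map ``jobs=0`` to the detected CPU count, as described for ``--jobs``."""
       # os.cpu_count() may return None when the count cannot be determined.
       return jobs if jobs > 0 else (os.cpu_count() or 1)


   def find_regions(
       probabilities: Sequence[float],
       threshold: float,
       cds: int = 3,
   ) -> List[Tuple[int, int]]:
       """Return (start, end) gene indices, end exclusive, of candidate regions."""
       regions: List[Tuple[int, int]] = []
       start = None
       for index, probability in enumerate(probabilities):
           if probability >= threshold:
               if start is None:
                   start = index  # a new run of high-probability genes begins
           else:
               # The run ended: keep it only if it has at least ``cds`` genes.
               if start is not None and index - start >= cds:
                   regions.append((start, index))
               start = None
       # Close a run that extends to the last gene.
       if start is not None and len(probabilities) - start >= cds:
           regions.append((start, len(probabilities)))
       return regions


   # Made-up per-gene probabilities for a contig of ten genes.
   example = [0.1, 0.9, 0.8, 0.7, 0.2, 0.6, 0.4, 0.9, 0.9, 0.1]
   print(find_regions(example, threshold=0.5))  # [(1, 4)]
   print(find_regions(example, threshold=0.3))  # [(1, 4), (5, 9)]

With these made-up values, lowering the threshold from 0.5 to 0.3 yields
a second, longer region, which illustrates why a lower ``--threshold``
tends to produce more predictions.
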
🔖 Reference
-------------

GECCO can be cited using the following preprint:

**Accurate de novo identification of biosynthetic gene clusters with
GECCO**. Laura M Carroll, Martin Larralde, Jonas Simon Fleck, Ruby
Ponnudurai, Alessio Milanese, Elisa Cappio Barazzone, Georg Zeller.
bioRxiv 2021.05.03.442509;
`doi:10.1101/2021.05.03.442509 <https://doi.org/10.1101/2021.05.03.442509>`__

💭 Feedback
------------

⚠️ Issue Tracker
~~~~~~~~~~~~~~~~

Found a bug? Have an enhancement request? Head over to the `GitHub
issue tracker <https://github.com/zellerlab/GECCO/issues>`__ if you need
to report or ask something. If you are filing a bug report, please
include as much information as you can about the issue, and try to
recreate the same bug in a simple, easily reproducible situation.

🏗️ Contributing
~~~~~~~~~~~~~~~~

Contributions are more than welcome! See
`CONTRIBUTING.md <https://github.com/althonos/pyhmmer/blob/master/CONTRIBUTING.md>`__
for more details.

⚖️ License
----------

This software is provided under the `GNU General Public License v3.0 or
later <https://choosealicense.com/licenses/gpl-3.0/>`__. GECCO is
developed by the `Zeller
Team <https://www.embl.de/research/units/scb/zeller/index.html>`__ at
the `European Molecular Biology Laboratory <https://www.embl.de/>`__ in
Heidelberg.

.. |GitLabCI| image:: https://img.shields.io/gitlab/pipeline/grp-zeller/GECCO/master?gitlab_url=https%3A%2F%2Fgit.embl.de&style=flat-square&maxAge=600
   :target: https://git.embl.de/grp-zeller/GECCO/-/pipelines/
.. |License| image:: https://img.shields.io/badge/license-GPLv3-blue.svg?style=flat-square&maxAge=2678400
   :target: https://choosealicense.com/licenses/gpl-3.0/
.. |Coverage| image:: https://img.shields.io/codecov/c/gh/zellerlab/GECCO?style=flat-square&maxAge=600
   :target: https://codecov.io/gh/zellerlab/GECCO/
.. |Docs| image:: https://img.shields.io/badge/docs-gecco.embl.de-green.svg?maxAge=2678400&style=flat-square
   :target: https://gecco.embl.de
.. |Source| image:: https://img.shields.io/badge/source-GitHub-303030.svg?maxAge=2678400&style=flat-square
   :target: https://github.com/zellerlab/GECCO/
.. |Mirror| image:: https://img.shields.io/badge/mirror-EMBL-009f4d?style=flat-square&maxAge=2678400
   :target: https://git.embl.de/grp-zeller/GECCO/
.. |Changelog| image:: https://img.shields.io/badge/keep%20a-changelog-8A0707.svg?maxAge=2678400&style=flat-square
   :target: https://github.com/zellerlab/GECCO/blob/master/CHANGELOG.md
.. |Issues| image:: https://img.shields.io/github/issues/zellerlab/GECCO.svg?style=flat-square&maxAge=600
   :target: https://github.com/zellerlab/GECCO/issues
.. |Preprint| image:: https://img.shields.io/badge/preprint-bioRxiv-darkblue?style=flat-square&maxAge=2678400
   :target: https://www.biorxiv.org/content/10.1101/2021.05.03.442509v1
.. |PyPI| image:: https://img.shields.io/pypi/v/gecco-tool.svg?style=flat-square&maxAge=3600
   :target: https://pypi.python.org/pypi/gecco-tool
.. |Bioconda| image:: https://img.shields.io/conda/vn/bioconda/gecco?style=flat-square&maxAge=3600
   :target: https://anaconda.org/bioconda/gecco
.. |Versions| image:: https://img.shields.io/pypi/pyversions/gecco-tool.svg?style=flat-square&maxAge=3600
   :target: https://pypi.org/project/gecco-tool/#files
.. |Wheel| image:: https://img.shields.io/pypi/wheel/gecco-tool?style=flat-square&maxAge=3600
   :target: https://pypi.org/project/gecco-tool/#files