Hi, I’m GECCO!
==============

.. image:: https://raw.githubusercontent.com/zellerlab/GECCO/v0.6.2/static/gecco-square.png
   :target: https://github.com/zellerlab/GECCO/

🦎 ️Overview
---------------

GECCO (Gene Cluster prediction with Conditional Random Fields) is a fast
and scalable method for identifying putative novel Biosynthetic Gene
Clusters (BGCs) in genomic and metagenomic data using Conditional Random
Fields (CRFs).

|GitLabCI| |License| |Coverage| |Docs| |Source| |Mirror| |Changelog|
|Issues| |Preprint| |PyPI| |Bioconda| |Versions| |Wheel|

🔧 Installing GECCO
-------------------

GECCO is implemented in `Python <https://www.python.org/>`__, and
supports `all versions <https://endoflife.date/python>`__ from Python
3.6. It requires additional libraries that can be installed directly
from `PyPI <https://pypi.org>`__, the Python Package Index.

Use `pip <https://pip.pypa.io/en/stable/>`__ to install GECCO on
your machine::

   $ pip install gecco-tool

If you’d rather use `Conda <https://conda.io>`__, a package is available
in the `bioconda <https://bioconda.github.io/>`__ channel. You can
install with::

   $ conda install -c bioconda gecco

This will install GECCO, its dependencies, and the data needed to run
predictions. This requires around 100MB of data to be downloaded, so it
could take some time depending on your Internet connection. Once done,
you will have a ``gecco`` command available in your $PATH.

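As a quick sanity check after installing, the sketch below looks up the
``gecco`` executable on the current ``$PATH`` from Python. It only relies on
the standard library; the helper name ``find_executable`` is an illustrative
assumption and not part of GECCO itself:

.. code-block:: python

   import shutil
   from typing import Optional


   def find_executable(name: str = "gecco") -> Optional[str]:
       """Return the full path of `name` if it is on $PATH, else None."""
       return shutil.which(name)


   if __name__ == "__main__":
       path = find_executable()
       if path is None:
           print("gecco was not found on $PATH; check your installation")
       else:
           print(f"gecco is available at {path}")
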
*Note that GECCO uses* `HMMER3 <http://hmmer.org/>`__, *which can
only run on PowerPC and recent x86-64 machines running a POSIX operating
system. Therefore, Linux and OSX are supported platforms, but GECCO will
not be able to run on Windows.*

🧬 Running GECCO
-----------------

Once ``gecco`` is installed, you can run it from the terminal by giving
it a FASTA or GenBank file with the genomic sequence you want to
analyze, as well as an output directory::

   $ gecco run --genome some_genome.fna -o some_output_dir

Additional parameters of interest are:

-  ``--jobs``, which controls the number of threads that will be spawned
   by GECCO whenever a step can be parallelized. The default, *0*, will
   autodetect the number of CPUs on the machine using
   `os.cpu_count <https://docs.python.org/3/library/os.html#os.cpu_count>`__.
-  ``--cds``, controlling the minimum number of consecutive genes a BGC
   region must have to be detected by GECCO (default is 3).
-  ``--threshold``, controlling the minimum probability for a gene to be
   considered part of a BGC region. Using a lower number will increase
   the number (and possibly length) of predictions, but reduce accuracy.

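The options described in this section can be combined in a single
invocation. As a minimal sketch, the hypothetical helper below
(``build_gecco_command`` is not part of GECCO) assembles the argument list
for ``gecco run`` using only the flags mentioned here, assuming they are
accepted after the ``run`` subcommand. ``--threshold`` is omitted unless a
value is given, since its default is not stated in this section:

.. code-block:: python

   from typing import List, Optional


   def build_gecco_command(
       genome: str,
       output_dir: str,
       jobs: int = 0,
       cds: int = 3,
       threshold: Optional[float] = None,
   ) -> List[str]:
       """Build the argument list for a `gecco run` invocation."""
       command = [
           "gecco", "run",
           "--genome", genome,
           "-o", output_dir,
           "--jobs", str(jobs),  # 0 lets GECCO autodetect the CPU count
           "--cds", str(cds),    # minimum consecutive genes per BGC region
       ]
       if threshold is not None:
           # The threshold is a probability, so it must lie in [0, 1].
           if not 0.0 <= threshold <= 1.0:
               raise ValueError("threshold must be between 0 and 1")
           command += ["--threshold", str(threshold)]
       return command


   if __name__ == "__main__":
       cmd = build_gecco_command("some_genome.fna", "some_output_dir", jobs=4)
       print(" ".join(cmd))

The resulting list can be passed to ``subprocess.run(cmd, check=True)`` on a
machine where ``gecco`` is installed. Keeping the arguments as a list rather
than a single string avoids shell quoting issues with file names.
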
🔖 Reference
-------------

GECCO can be cited using the following preprint:

**Accurate de novo identification of biosynthetic gene clusters with
GECCO**. Laura M Carroll, Martin Larralde, Jonas Simon Fleck, Ruby
Ponnudurai, Alessio Milanese, Elisa Cappio Barazzone, Georg Zeller.
bioRxiv 2021.05.03.442509;
`doi:10.1101/2021.05.03.442509 <https://doi.org/10.1101/2021.05.03.442509>`__

💭 Feedback
------------

⚠️ Issue Tracker
~~~~~~~~~~~~~~~~

Found a bug? Have an enhancement request? Head over to the `GitHub
issue tracker <https://github.com/zellerlab/GECCO/issues>`__ if you need
to report or ask something. If you are filing a bug report, please
include as much information as you can about the issue, and try to
recreate the same bug in a simple, easily reproducible situation.

🏗️ Contributing
~~~~~~~~~~~~~~~~

Contributions are more than welcome! See
`CONTRIBUTING.md <https://github.com/althonos/pyhmmer/blob/master/CONTRIBUTING.md>`__
for more details.

⚖️ License
----------

This software is provided under the `GNU General Public License v3.0 or
later <https://choosealicense.com/licenses/gpl-3.0/>`__. GECCO is
developed by the `Zeller
Team <https://www.embl.de/research/units/scb/zeller/index.html>`__ at
the `European Molecular Biology Laboratory <https://www.embl.de/>`__ in
Heidelberg.

.. |GitLabCI| image:: https://img.shields.io/gitlab/pipeline/grp-zeller/GECCO/master?gitlab_url=https%3A%2F%2Fgit.embl.de&style=flat-square&maxAge=600
   :target: https://git.embl.de/grp-zeller/GECCO/-/pipelines/
.. |License| image:: https://img.shields.io/badge/license-GPLv3-blue.svg?style=flat-square&maxAge=2678400
   :target: https://choosealicense.com/licenses/gpl-3.0/
.. |Coverage| image:: https://img.shields.io/codecov/c/gh/zellerlab/GECCO?style=flat-square&maxAge=600
   :target: https://codecov.io/gh/zellerlab/GECCO/
.. |Docs| image:: https://img.shields.io/badge/docs-gecco.embl.de-green.svg?maxAge=2678400&style=flat-square
   :target: https://gecco.embl.de
.. |Source| image:: https://img.shields.io/badge/source-GitHub-303030.svg?maxAge=2678400&style=flat-square
   :target: https://github.com/zellerlab/GECCO/
.. |Mirror| image:: https://img.shields.io/badge/mirror-EMBL-009f4d?style=flat-square&maxAge=2678400
   :target: https://git.embl.de/grp-zeller/GECCO/
.. |Changelog| image:: https://img.shields.io/badge/keep%20a-changelog-8A0707.svg?maxAge=2678400&style=flat-square
   :target: https://github.com/zellerlab/GECCO/blob/master/CHANGELOG.md
.. |Issues| image:: https://img.shields.io/github/issues/zellerlab/GECCO.svg?style=flat-square&maxAge=600
   :target: https://github.com/zellerlab/GECCO/issues
.. |Preprint| image:: https://img.shields.io/badge/preprint-bioRxiv-darkblue?style=flat-square&maxAge=2678400
   :target: https://www.biorxiv.org/content/10.1101/2021.05.03.442509v1
.. |PyPI| image:: https://img.shields.io/pypi/v/gecco-tool.svg?style=flat-square&maxAge=3600
   :target: https://pypi.python.org/pypi/gecco-tool
.. |Bioconda| image:: https://img.shields.io/conda/vn/bioconda/gecco?style=flat-square&maxAge=3600
   :target: https://anaconda.org/bioconda/gecco
.. |Versions| image:: https://img.shields.io/pypi/pyversions/gecco-tool.svg?style=flat-square&maxAge=3600
   :target: https://pypi.org/project/gecco-tool/#files
.. |Wheel| image:: https://img.shields.io/pypi/wheel/gecco-tool?style=flat-square&maxAge=3600
   :target: https://pypi.org/project/gecco-tool/#files