comparison README.rst @ 17:f0f1e5ba6fca draft
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/sklearn commit cfc9fe24b7975fc5838bb3e456646202898eb977
author | bgruening |
---|---|
date | Sat, 04 Aug 2018 17:36:28 -0400 |
parents | 29899feb4d44 |
children |
comparison
16:23f26ac9c7b3 | 17:f0f1e5ba6fca |
---|---|
1 *************** | |
2 Galaxy wrapper for scikit-learn library | 1 Galaxy wrapper for scikit-learn library |
3 *************** | 2 *************************************** |
4 | 3 |
5 Contents | 4 Contents |
6 ======== | 5 ======== |
6 | |
7 - `What is scikit-learn?`_ | 7 - `What is scikit-learn?`_ |
8 - `Scikit-learn main package groups`_ | 8 - `Scikit-learn main package groups`_ |
9 - `Tools offered by this wrapper`_ | 9 - `Tools offered by this wrapper`_ |
10 | 10 |
11 - `Machine learning workflows`_ | 11 - `Machine learning workflows`_ |
14 | 14 |
15 | 15 |
16 ____________________________ | 16 ____________________________ |
17 | 17 |
18 | 18 |
19 .. _What is scikit-learn? | 19 .. _What is scikit-learn?: |
20 | 20 |
21 What is scikit-learn? | 21 What is scikit-learn? |
22 =========================== | 22 ===================== |
23 | 23 |
24 Scikit-learn is an open-source machine learning library for the Python programming language. It offers various algorithms for performing supervised and unsupervised learning as well as data preprocessing and transformation, model selection and evaluation, and dataset utilities. It is built upon the SciPy (Scientific Python) library. | 24 Scikit-learn is an open-source machine learning library for the Python programming language. It offers various algorithms for performing supervised and unsupervised learning as well as data preprocessing and transformation, model selection and evaluation, and dataset utilities. It is built upon the SciPy (Scientific Python) library. |
25 | 25 |
26 Scikit-learn source code can be accessed at https://github.com/scikit-learn/scikit-learn. | 26 Scikit-learn source code can be accessed at https://github.com/scikit-learn/scikit-learn. |
27 Detailed installation instructions can be found at http://scikit-learn.org/stable/install.html | 27 Detailed installation instructions can be found at http://scikit-learn.org/stable/install.html |
28 | 28 |
29 | 29 |
30 .. _Scikit-learn main package groups: | 30 .. _Scikit-learn main package groups: |
31 | 31 |
32 ====== | |
33 Scikit-learn main package groups | 32 Scikit-learn main package groups |
34 ====== | 33 ================================ |
35 | 34 |
36 Scikit-learn provides the users with several main groups of related operations. | 35 Scikit-learn provides the users with several main groups of related operations. |
37 These are: | 36 These are: |
38 | 37 |
39 - Classification | 38 - Classification |
52 Each group consists of a number of well-known algorithms from the category. For example, one can find hierarchical, spectral, k-means, and other clustering methods in the sklearn.cluster package. | 51 Each group consists of a number of well-known algorithms from the category. For example, one can find hierarchical, spectral, k-means, and other clustering methods in the sklearn.cluster package. |
53 | 52 |
54 | 53 |
55 .. _Tools offered by this wrapper: | 54 .. _Tools offered by this wrapper: |
56 | 55 |
57 =================== | |
58 Available tools in the current wrapper | 56 Available tools in the current wrapper |
59 =================== | 57 ====================================== |
60 | 58 |
61 The current release of the wrapper offers a subset of the packages from scikit-learn library. You can find: | 59 The current release of the wrapper offers a subset of the packages from scikit-learn library. You can find: |
62 | 60 |
63 - A subset of classification metric functions | 61 - A subset of classification metric functions |
64 - Linear and quadratic discriminant classifiers | 62 - Linear and quadratic discriminant classifiers |
71 In addition, several tools for performing matrix operations, generating problem-specific datasets, and encoding text and extracting features have been prepared to help the user with more advanced operations. | 69 In addition, several tools for performing matrix operations, generating problem-specific datasets, and encoding text and extracting features have been prepared to help the user with more advanced operations. |
72 | 70 |
73 .. _Machine learning workflows: | 71 .. _Machine learning workflows: |
74 | 72 |
75 Machine learning workflows | 73 Machine learning workflows |
76 =============== | 74 ========================== |
77 | 75 |
78 Machine learning is about processes. No matter what machine learning algorithm we use, we can apply typical workflows and dataflows to produce more robust models and better predictions. | 76 Machine learning is about processes. No matter what machine learning algorithm we use, we can apply typical workflows and dataflows to produce more robust models and better predictions. |
79 Here we discuss supervised and unsupervised learning workflows. | 77 Here we discuss supervised and unsupervised learning workflows. |
80 | 78 |
81 .. _Supervised learning workflows: | 79 .. _Supervised learning workflows: |
82 | 80 |
83 =================== | |
84 Supervised machine learning workflows | 81 Supervised machine learning workflows |
85 =================== | 82 ===================================== |
86 | 83 |
87 **What is supervised learning?** | 84 **What is supervised learning?** |
88 | 85 |
89 In this machine learning task, given sample data which are labeled, the aim is to build a model which can predict the labels for new observations. | 86 In this machine learning task, given sample data which are labeled, the aim is to build a model which can predict the labels for new observations. |
90 In practice, there are five steps which we can go through to start from raw input data and end up getting reasonable predictions for new samples: | 87 In practice, there are five steps which we can go through to start from raw input data and end up getting reasonable predictions for new samples: |
130 This is a final evaluation in which the optimized model is used to make predictions | 127 This is a final evaluation in which the optimized model is used to make predictions |
131 on unseen (here test) samples. After this, the model is put into production. | 128 on unseen (here test) samples. After this, the model is put into production. |
132 | 129 |
133 .. _Unsupervised learning workflows: | 130 .. _Unsupervised learning workflows: |
134 | 131 |
135 ======================= | |
136 Unsupervised machine learning workflows | 132 Unsupervised machine learning workflows |
137 ======================= | 133 ======================================= |
138 | 134 |
139 **What is unsupervised learning?** | 135 **What is unsupervised learning?** |
140 | 136 |
141 Unlike supervised learning, and more likely in real life, here the initial data is not labeled. | 137 Unlike supervised learning, and more likely in real life, here the initial data is not labeled. |
142 The task is to extract the structure from the data and group the samples based on their similarities. | 138 The task is to extract the structure from the data and group the samples based on their similarities. |
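
The grouping task described in the last two rows can be sketched with scikit-learn directly. This is a minimal illustration, not part of the README or of the Galaxy wrapper: the synthetic data, the choice of `KMeans`, and the variable names are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic, unlabeled data (assumed for illustration): two well-separated
# groups of 2-D points drawn around (0, 0) and (5, 5).
rng = np.random.RandomState(0)
group_a = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(20, 2))
group_b = rng.normal(loc=[5.0, 5.0], scale=0.3, size=(20, 2))
X = np.vstack([group_a, group_b])

# No labels are passed to the model: samples are grouped by similarity alone.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = model.fit_predict(X)

# Each sample now carries a cluster index (0 or 1); which group receives
# which index is arbitrary.
print(cluster_ids)
```

The number of clusters is supplied by hand here. In practice it is usually unknown, and choosing it is part of the unsupervised workflow.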