comparison README.md @ 1:f69ed50c9768 draft
planemo upload for repository https://github.com/goeckslab/gleam commit 4dc221b2fa9717552787f0985ad3fc3df4460158
| author | goeckslab |
|---|---|
| date | Sat, 21 Jun 2025 15:06:53 +0000 |
| parents | 209b663a4f62 |
| children | |
comparison
| 0:209b663a4f62 | 1:f69ed50c9768 |
|---|---|
| 1 # Galaxy-Pycaret | 1 # Tabular Learner Tools |
| 2 A library of Galaxy machine learning tools based on PyCaret — part of the Galaxy ML2 tools, aiming to provide simple, powerful, and robust machine learning capabilities for Galaxy users. | |
| 3 | 2 |
| 4 # Install Galaxy-Pycaret into Galaxy | 3 This repository contains two machine learning tools for working with tabular data in the Gleam framework: |
| 5 | 4 |
| 6 * Update `tool_conf.xml` to include Galaxy-Pycaret tools. See [documentation](https://docs.galaxyproject.org/en/master/admin/tool_panel.html) for more details. This is an example: | 5 ## 1. Tabular Learner |
| 7 ``` | |
| 8 <section id="pycaret" name="Pycaret Applications"> | |
| 9 <tool file="galaxy-pycaret/tools/pycaret_train.xml" /> | |
| 10 </section> | |
| 11 ``` | |
| 12 | 6 |
| 13 * Configure the `job_conf.yml` under `lib/galaxy/config/sample` to enable the docker for the environment you want the Ludwig related job running in. This is an example: | 7 A comprehensive tool for training and evaluating multiple machine learning models on tabular datasets. |
| 14 ``` | |
| 15 execution: | |
| 16 default: local | |
| 17 environments: | |
| 18 local: | |
| 19 runner: local | |
| 20 docker_enabled: true | |
| 21 ``` | |
| 22 If you are using an older version of Galaxy, then `job_conf.xml` would be something you want to configure instead of `job_conf.yml`. Then you would want to configure destination instead of execution and environment. | |
| 23 See [documentation](https://docs.galaxyproject.org/en/master/admin/jobs.html#running-jobs-in-containers) for job_conf configuration. | |
| 24 * If you haven’t set `sanitize_all_html: false` in `galaxy.yml`, please set it to False to enable our HTML report functionality. | |
| 25 * Should be good to go. | |
| 26 | 8 |
| 27 # Make contributions | 9 ### Features: |
| | 10 - Supports both classification and regression tasks |
| | 11 - Automatically compares multiple algorithms to find the best model |
| | 12 - Extensive customization options: |
| | 13 - Data normalization |
| | 14 - Feature selection |
| | 15 - Cross-validation |
| | 16 - Outlier removal |
| | 17 - Multicollinearity handling |
| | 18 - Polynomial feature generation |
| | 19 - Class imbalance correction |
| | 20 - Outputs detailed HTML reports with performance metrics and visualizations |
| | 21 - Saves the best model for later use |
| 28 | 22 |
| 29 ## Getting Started | 23 ## 2. PyCaret Predictor/Evaluator |
| 30 | 24 |
| 31 To get started, you’ll need to fork the repository, clone it locally, and create a new branch for your contributions. | 25 A companion tool for making predictions and evaluating trained models on new data. |
| 32 | 26 |
| 33 1. **Fork the Repository**: Click the "Fork" button at the top right of this page. | 27 ### Features: |
| 34 2. **Clone the Fork**: | 28 - Works with models trained by Tabular Learner |
| 35 ```bash | 29 - Supports both classification and regression tasks |
| 36 git clone https://github.com/<your-username>/Galaxy-Pycaret.git | 30 - Generates predictions on new data |
| 37 cd <your-repo> | 31 - Creates evaluation reports when target values are provided |
| 38 ``` | 32 - Outputs predictions in CSV format |
| 39 3. **Create a Feature/hotfix/bugfix Branch**: | |
| 40 ```bash | |
| 41 git checkout -b feature/<feature-branch-name> | |
| 42 ``` | |
| 43 or | |
| 44 ```bash | |
| 45 git checkout -b hotfix/<hoxfix-branch-name> | |
| 46 ``` | |
| 47 or | |
| 48 ```bash | |
| 49 git checkout -b bugfix/<bugfix-branch-name> | |
| 50 ``` | |
| 51 | 33 |
| 52 ## How We Manage the Repo | 34 ## Workflow |
| 53 | 35 |
| 54 We follow a structured branching and merging strategy to ensure code quality and stability. | 36 These tools are designed to work together: |
| | 37 1. Use **Tabular Learner** to train and find the best model for your dataset |
| | 38 2. Use **PyCaret Predictor/Evaluator** to apply your trained model to new data |
| 55 | 39 |
| 56 1. **Main Branches**: | 40 Both tools are powered by [PyCaret](https://pycaret.org/), an open-source machine learning library that automates the ML workflow. |
| 57 - **`main`**: Contains production-ready code. | |
| 58 - **`dev`**: Contains code that is ready for the next release. | |
| 59 | |
| 60 2. **Supporting Branches**: | |
| 61 - **Feature Branches**: Created from `dev` for new features. | |
| 62 - **Bugfix Branches**: Created from `dev` for bug fixes. | |
| 63 - **Release Branches**: Created from `dev` when preparing a new release. | |
| 64 - **Hotfix Branches**: Created from `main` for critical fixes in production. | |
| 65 | |
| 66 ### Workflow | |
| 67 | |
| 68 - **Feature Development**: | |
| 69 - Branch from `dev`. | |
| 70 - Work on your feature. | |
| 71 - Submit a Pull Request (PR) to `dev`. | |
| 72 - **Hotfixes**: | |
| 73 - Branch from `main`. | |
| 74 - Fix the issue. | |
| 75 - Merge back into both `main` and `dev`. | |
| 76 | |
| 77 ## Contribution Guidelines | |
| 78 | |
| 79 We welcome contributions of all kinds. To make contributions easy and effective, please follow these guidelines: | |
| 80 | |
| 81 1. **Create an Issue**: Before starting work on a major change, create an issue to discuss it. | |
| 82 2. **Fork and Branch**: Fork the repo and create a feature branch. | |
| 83 3. **Write Tests**: Ensure your changes are well-tested if applicable. | |
| 84 4. **Code Style**: Follow the project’s coding conventions. | |
| 85 5. **Commit Messages**: Write clear and concise commit messages. | |
| 86 6. **Pull Request**: Submit a PR to the `dev` branch. Ensure your PR description is clear and includes the issue number. | |
| 87 | |
| 88 ### Submitting a Pull Request | |
| 89 | |
| 90 1. **Push your Branch**: | |
| 91 ```bash | |
| 92 git push origin feature/<feature-branch-name> | |
| 93 ``` | |
| 94 2. **Open a Pull Request**: | |
| 95 - Navigate to the original repository where you created your fork. | |
| 96 - Click on the "New Pull Request" button. | |
| 97 - Select `dev` as the base branch and your feature branch as the compare branch. | |
| 98 - Fill in the PR template with details about your changes. | |
| 99 | |
| 100 3. **Rebase or Merge `dev` into Your Feature Branch**: | |
| 101 - Before submitting your PR or when `dev` has been updated, rebase or merge `dev` into your feature branch to ensure your branch is up to date: | |
| 102 | |
| 103 4. **Resolve Conflicts**: | |
| 104 - If there are any conflicts during the rebase or merge, Git will pause and allow you to resolve the conflicts. | |
| 105 | |
| 106 5. **Review Process**: Your PR will be reviewed by a team member. Please address any feedback and update your PR as needed. |
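
The two-step workflow in the new README (train and pick the best model, then apply it to new data) can be sketched directly against PyCaret. This is a minimal illustration under stated assumptions, not the tools' actual implementation, which this comparison does not show. The feature keys in `FEATURE_TO_SETUP_KWARG` and the three helper functions are hypothetical names invented for the sketch. The PyCaret calls (`setup`, `compare_models`, `save_model`, `load_model`, `predict_model`) are the library's public functions, imported lazily so the option-mapping helper runs without PyCaret installed.

```python
import importlib

# Hypothetical option names (left) mapped to PyCaret setup() keyword
# arguments (right). The keys are invented for this sketch; they mirror
# the customization options listed in the README.
FEATURE_TO_SETUP_KWARG = {
    "normalization": "normalize",
    "feature_selection": "feature_selection",
    "outlier_removal": "remove_outliers",
    "multicollinearity_handling": "remove_multicollinearity",
    "polynomial_features": "polynomial_features",
    "class_imbalance_correction": "fix_imbalance",
}


def build_setup_kwargs(enabled_features, task="classification", folds=None):
    """Translate enabled feature names into keyword arguments for setup()."""
    unknown = set(enabled_features) - set(FEATURE_TO_SETUP_KWARG)
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    kwargs = {FEATURE_TO_SETUP_KWARG[name]: True for name in enabled_features}
    # Class imbalance correction only makes sense for classification.
    if task != "classification":
        kwargs.pop("fix_imbalance", None)
    # Number of cross-validation folds, if the caller wants to override it.
    if folds is not None:
        kwargs["fold"] = folds
    return kwargs


def train_best_model(train_df, target, model_path, task="classification",
                     **setup_kwargs):
    """Step 1: compare candidate models and save the best one."""
    # Lazy import: pycaret.classification or pycaret.regression.
    pc = importlib.import_module(f"pycaret.{task}")
    pc.setup(data=train_df, target=target, session_id=42, **setup_kwargs)
    best = pc.compare_models()
    pc.save_model(best, model_path)  # PyCaret appends ".pkl" to the name
    return best


def predict_new_data(new_df, model_path, output_csv, task="classification"):
    """Step 2: load the saved model, predict, and write a CSV."""
    pc = importlib.import_module(f"pycaret.{task}")
    model = pc.load_model(model_path)
    predictions = pc.predict_model(model, data=new_df)
    predictions.to_csv(output_csv, index=False)
    return predictions
```

A call such as `train_best_model(df, "label", "best_model", **build_setup_kwargs(["normalization"], folds=5))` followed by `predict_new_data(new_df, "best_model", "predictions.csv")` would mirror the Tabular Learner and Predictor/Evaluator steps. The column name `label` and the file names are placeholders.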
