Mercurial > repos > bimib > cobraxy
diff COBRAxy/docs/getting-started.md @ 492:4ed95023af20 draft
Uploaded
author | francesco_lapi |
---|---|
date | Tue, 30 Sep 2025 14:02:17 +0000 |
parents | |
children |
line wrap: on
line diff
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/COBRAxy/docs/getting-started.md Tue Sep 30 14:02:17 2025 +0000 @@ -0,0 +1,165 @@ +# Getting Started + +Welcome to COBRAxy! This guide will help you get up and running with metabolic flux analysis. + +## What is COBRAxy? + +COBRAxy is a comprehensive toolkit for metabolic flux analysis that bridges the gap between omics data and biological insights. It provides: + +- **Data Integration**: Combine gene expression and metabolite data +- **Metabolic Modeling**: Use constraint-based models for flux analysis +- **Visualization**: Generate interactive pathway maps +- **Statistical Analysis**: Perform enrichment and sensitivity analysis + +## Core Concepts + +### Reaction Activity Scores (RAS) +RAS quantify how active metabolic reactions are based on gene expression data. COBRAxy computes RAS by: +1. Mapping genes to reactions via GPR (Gene-Protein-Reaction) rules +2. Applying logical operations (AND/OR) based on enzyme complexes +3. Producing activity scores for each reaction in each sample + +### Reaction Propensity Scores (RPS) +RPS indicate metabolic preferences based on metabolite abundance: +1. Map metabolites to reactions as substrates/products +2. Weight by stoichiometry and frequency +3. Compute propensity scores using log-normalized formulas + +### Flux Sampling +Sample feasible flux distributions using: +- **CBS (Coordinate Hit-and-Run with Rounding)**: Fast, uniform sampling +- **OptGP (Optimal Growth Parallel)**: Growth-optimized sampling + +## Analysis Workflows + +COBRAxy supports two main analysis paths: + +### 1. Enrichment Analysis Workflow +```bash +# Generate activity scores +ras_generator → RAS values +rps_generator → RPS values + +# Statistical enrichment analysis +marea → Enriched pathway maps +``` + +**Use when**: You want to identify significantly altered pathways and create publication-ready maps. + +### 2. Flux Simulation Workflow +```bash +# Apply constraints to model +ras_generator → RAS values +ras_to_bounds → Constrained model + +# Sample flux distributions +flux_simulation → Flux samples +flux_to_map → Final visualizations +``` + +**Use when**: You want to predict metabolic flux distributions and study network-wide changes. + +## Your First Analysis + +Let's run a basic analysis with sample data: + +### Step 1: Prepare Your Data + +You'll need: +- **Gene expression data**: TSV file with genes (rows) × samples (columns) +- **Metabolic model**: SBML file or use built-in models (ENGRO2, Recon) +- **Metabolite data** (optional): TSV file with metabolites (rows) × samples (columns) + +### Step 2: Generate Activity Scores + +```bash +# Generate RAS from expression data +ras_generator -td $(pwd) \ + -in expression_data.tsv \ + -ra ras_output.tsv \ + -rs ENGRO2 +``` + +### Step 3: Create Pathway Maps + +```bash +# Generate enriched pathway maps +marea -td $(pwd) \ + -using_RAS true \ + -input_data ras_output.tsv \ + -choice_map ENGRO2 \ + -gs true \ + -idop pathway_maps +``` + +### Step 4: View Results + +Your analysis will generate: +- **RAS values**: `ras_output.tsv` - Activity scores for each reaction +- **Statistical maps**: `pathway_maps/` - SVG files with enrichment visualization +- **Log files**: Detailed execution logs for troubleshooting + +## Built-in Models + +COBRAxy includes ready-to-use metabolic models: + +| Model | Organism | Reactions | Genes | Description | +|-------|----------|-----------|-------|-------------| +| **ENGRO2** | Human | ~2,000 | ~500 | Focused human metabolism model | +| **Recon** | Human | ~10,000 | ~2,000 | Comprehensive human metabolism | + +Models are stored in the `local/` directory and include: +- SBML files +- GPR rules +- Gene mapping tables +- Pathway templates + +## Data Formats + +### Gene Expression Format +```tsv +Gene_ID Sample_1 Sample_2 Sample_3 +HGNC:5 12.5 8.3 15.7 +HGNC:10 3.2 4.1 2.8 +HGNC:15 7.9 11.2 6.4 +``` + +### Metabolite Format +```tsv +Metabolite_ID Sample_1 Sample_2 Sample_3 +glucose 100.5 85.3 120.7 +pyruvate 45.2 38.1 52.8 +lactate 23.9 41.2 19.4 +``` + +## Command Line vs Python API + +COBRAxy offers two usage modes: + +### Command Line (Quick Analysis) +```bash +# Simple command-line execution +ras_generator -td $(pwd) -in data.tsv -ra output.tsv -rs ENGRO2 +``` + +### Python API (Programming) +```python +import ras_generator +# Call main function with arguments +ras_generator.main(['-td', '/path', '-in', 'data.tsv', '-ra', 'output.tsv', '-rs', 'ENGRO2']) +``` + +## Next Steps + +Now that you understand the basics: + +1. **[Quick Start Guide](quickstart.md)** - Complete walkthrough with example data +2. **[Python API Tutorial](tutorials/python-api.md)** - Learn programmatic usage +3. **[Tools Reference](tools/)** - Detailed documentation for each tool +4. **[Examples](examples/)** - Real-world analysis examples + +## Need Help? + +- **[Troubleshooting](troubleshooting.md)** - Common issues and solutions +- **[GitHub Issues](https://github.com/CompBtBs/COBRAxy/issues)** - Report bugs or ask questions +- **[Contributing](contributing.md)** - Help improve COBRAxy \ No newline at end of file