annotate COBRAxy/docs/getting-started.md @ 546:01147e83f43c draft default tip

Uploaded
author luca_milaz
date Mon, 27 Oct 2025 12:33:08 +0000
parents fcdbc81feb45
children
# Getting Started

Welcome to COBRAxy! This guide will help you get up and running with metabolic flux analysis.

## What is COBRAxy?

COBRAxy is a comprehensive toolkit for metabolic flux analysis that bridges the gap between omics data and biological insights. It provides:

- **Data Integration**: Combine gene expression and metabolite data
- **Metabolic Modeling**: Use constraint-based models for flux analysis
- **Visualization**: Generate interactive pathway maps
- **Statistical Analysis**: Perform enrichment and sensitivity analysis

## Core Concepts

### Reaction Activity Scores (RAS)
RAS quantify how active metabolic reactions are based on gene expression data. COBRAxy computes RAS by:
1. Mapping genes to reactions via GPR (Gene-Protein-Reaction) rules
2. Applying logical operations (AND/OR) based on enzyme complexes
3. Producing activity scores for each reaction in each sample
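
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not COBRAxy's implementation: the nested-tuple rule representation, the function name `ras_for_reaction`, the example rule, and the convention of scoring `AND` as the minimum and `OR` as the sum of its operands are all assumptions made for this example. The convention COBRAxy actually applies may differ.

```python
from typing import Dict, Tuple, Union

# A GPR rule is either a gene ID (str) or a tuple ("and" | "or", rule, rule, ...).
# This representation is hypothetical and chosen only for the example.
Rule = Union[str, Tuple]


def ras_for_reaction(rule: Rule, expression: Dict[str, float]) -> float:
    """Score one reaction in one sample from its GPR rule.

    Assumed convention (illustration only):
      AND (enzyme complex, every subunit needed) -> minimum of the operands
      OR  (isoenzymes, any one suffices)         -> sum of the operands
    """
    if isinstance(rule, str):
        return expression[rule]
    operator, *operands = rule
    scores = [ras_for_reaction(operand, expression) for operand in operands]
    if operator == "and":
        return min(scores)
    if operator == "or":
        return sum(scores)
    raise ValueError(f"unknown operator: {operator!r}")


# Hypothetical rule: (HGNC:5 AND HGNC:10) OR HGNC:15, evaluated in one sample
rule = ("or", ("and", "HGNC:5", "HGNC:10"), "HGNC:15")
sample = {"HGNC:5": 12.5, "HGNC:10": 3.2, "HGNC:15": 7.9}
score = ras_for_reaction(rule, sample)  # min(12.5, 3.2) + 7.9
```

Applying the same function to every reaction and every sample column would yield a reactions × samples matrix of scores.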

### Reaction Propensity Scores (RPS)
RPS indicate metabolic preferences based on metabolite abundance. COBRAxy computes RPS by:
1. Mapping metabolites to reactions as substrates/products
2. Weighting by stoichiometry and frequency
3. Computing propensity scores using log-normalized formulas
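
To make the idea of a stoichiometry- and frequency-weighted, log-scaled score concrete, here is a toy sketch. The formula, the function name `rps_for_reaction`, and the input dictionaries are assumptions chosen for illustration; this guide does not state the exact expression COBRAxy uses, so treat the code as a picture of the concept rather than the tool's formula.

```python
import math
from typing import Dict


def rps_for_reaction(substrates: Dict[str, float],
                     abundance: Dict[str, float],
                     frequency: Dict[str, int]) -> float:
    """Toy propensity score for one reaction in one sample.

    substrates: metabolite -> stoichiometric coefficient in this reaction
    abundance:  metabolite -> measured abundance in the sample
    frequency:  metabolite -> number of reactions that use the metabolite

    Assumed formula (illustration only):
      sum over substrates of (stoichiometry / frequency) * log(abundance)
    """
    score = 0.0
    for metabolite, stoich in substrates.items():
        weight = stoich / frequency[metabolite]
        score += weight * math.log(abundance[metabolite])
    return score


# Hypothetical reaction consuming one glucose, where glucose feeds two reactions
rps = rps_for_reaction(
    substrates={"glucose": 1.0},
    abundance={"glucose": 100.5},
    frequency={"glucose": 2},
)
```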

### Flux Sampling
Sample feasible flux distributions using:
- **CBS (Corner-Based Sampling)**: Samples corners (vertices) of the feasible flux region
- **OptGP (Optimized General Parallel sampler)**: Parallel hit-and-run sampling of the feasible flux region
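
Hit-and-run samplers explore a feasible region by repeatedly picking a random direction and jumping to a random point along it that still satisfies every constraint. The sketch below shows that basic walk in pure Python on a toy two-flux region. It is an assumption-laden illustration, not the algorithm COBRAxy runs: real flux sampling also enforces steady-state equality constraints, and the function name `hit_and_run` and the toy bounds are hypothetical.

```python
import random
from typing import List, Sequence


def hit_and_run(A: Sequence[Sequence[float]], b: Sequence[float],
                start: Sequence[float], n_samples: int,
                seed: int = 0) -> List[List[float]]:
    """Sample points from {x : A x <= b} with a basic hit-and-run walk.

    `start` must lie strictly inside the region, and the region must be bounded.
    """
    rng = random.Random(seed)
    x = list(start)
    samples = []
    for _ in range(n_samples):
        # Gaussian components give a direction that is uniform on the sphere.
        d = [rng.gauss(0.0, 1.0) for _ in x]
        # Find the interval of t for which x + t*d satisfies every constraint.
        t_min, t_max = float("-inf"), float("inf")
        for row, bound in zip(A, b):
            slope = sum(a * di for a, di in zip(row, d))
            slack = bound - sum(a * xi for a, xi in zip(row, x))
            if slope > 1e-12:
                t_max = min(t_max, slack / slope)
            elif slope < -1e-12:
                t_min = max(t_min, slack / slope)
        # Jump to a uniformly chosen point on that segment.
        t = rng.uniform(t_min, t_max)
        x = [xi + t * di for xi, di in zip(x, d)]
        samples.append(x)
    return samples


# Toy "flux space": 0 <= v1 <= 10, 0 <= v2 <= 10, v1 + v2 <= 12
A = [[1, 0], [-1, 0], [0, 1], [0, -1], [1, 1]]
b = [10, 0, 10, 0, 12]
flux_samples = hit_and_run(A, b, start=[1.0, 1.0], n_samples=200)
```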

## Analysis Workflows

COBRAxy supports two main analysis paths:

### 1. Enrichment Analysis Workflow
```text
# Generate activity scores
ras_generator → RAS values
rps_generator → RPS values

# Statistical enrichment analysis
marea → Enriched pathway maps
```

**Use when**: You want to identify significantly altered pathways and create publication-ready maps.

### 2. Flux Simulation Workflow
```text
# Apply constraints to model
ras_generator → RAS values
ras_to_bounds → Constrained model

# Sample flux distributions
flux_simulation → Flux samples
flux_to_map → Final visualizations
```

**Use when**: You want to predict metabolic flux distributions and study network-wide changes.

## Your First Analysis

Let's run a basic analysis with sample data:

### Step 1: Prepare Your Data

You'll need:
- **Gene expression data**: TSV file with genes (rows) × samples (columns)
- **Metabolic model**: SBML file or use built-in models (ENGRO2, Recon)
- **Metabolite data** (optional): TSV file with metabolites (rows) × samples (columns)

### Step 2: Generate Activity Scores

```bash
# Generate RAS from expression data
# Note: -td is optional and auto-detected after pip install
ras_generator \
  -in expression_data.tsv \
  -ra ras_output.tsv \
  -rs ENGRO2
```

### Step 3: Create Pathway Maps

```bash
# Generate enriched pathway maps
# Note: -td is optional and auto-detected after pip install
marea \
  -using_RAS true \
  -input_data ras_output.tsv \
  -choice_map ENGRO2 \
  -gs true \
  -idop pathway_maps
```
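
If you prefer to drive Steps 2 and 3 from a script, the two commands can be chained with Python's standard `subprocess` module. The sketch below uses only the flags shown in this guide. The helper names `build_commands` and `run_pipeline` are hypothetical, and the final call is left commented out because it assumes COBRAxy is installed and `expression_data.tsv` exists.

```python
import shutil
import subprocess
from typing import List


def build_commands(expression: str, ras_out: str,
                   model: str, out_dir: str) -> List[List[str]]:
    """Return the command lines from Steps 2 and 3 as argument lists."""
    ras_cmd = ["ras_generator", "-in", expression, "-ra", ras_out, "-rs", model]
    marea_cmd = ["marea", "-using_RAS", "true", "-input_data", ras_out,
                 "-choice_map", model, "-gs", "true", "-idop", out_dir]
    return [ras_cmd, marea_cmd]


def run_pipeline(commands: List[List[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} not found on PATH; is COBRAxy installed?")
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(cmd, check=True)


commands = build_commands("expression_data.tsv", "ras_output.tsv",
                          "ENGRO2", "pathway_maps")
# run_pipeline(commands)  # uncomment once COBRAxy and the input file are available
```

Passing arguments as lists avoids shell quoting problems, and `check=True` ensures `marea` never runs on a missing or partial RAS file.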

### Step 4: View Results

Your analysis will generate:
- **RAS values**: `ras_output.tsv` - Activity scores for each reaction
- **Statistical maps**: `pathway_maps/` - SVG files with enrichment visualization
- **Log files**: Detailed execution logs for troubleshooting

## Built-in Models

COBRAxy includes ready-to-use metabolic models:

| Model | Organism | Reactions | Genes | Description |
|-------|----------|-----------|-------|-------------|
| **ENGRO2** | Human | ~2,000 | ~500 | Focused human metabolism model |
| **Recon** | Human | ~10,000 | ~2,000 | Comprehensive human metabolism |

Models are stored in the `src/local/` directory and include:
- SBML files
- GPR rules
- Gene mapping tables
- Pathway templates

## Data Formats

### Gene Expression Format
```tsv
Gene_ID	Sample_1	Sample_2	Sample_3
HGNC:5	12.5	8.3	15.7
HGNC:10	3.2	4.1	2.8
HGNC:15	7.9	11.2	6.4
```

### Metabolite Format
```tsv
Metabolite_ID	Sample_1	Sample_2	Sample_3
glucose	100.5	85.3	120.7
pyruvate	45.2	38.1	52.8
lactate	23.9	41.2	19.4
```
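
Before running the tools, it can help to confirm that an input file really has this shape. The sketch below reads a features × samples TSV with Python's standard `csv` module. The function name `read_matrix` and the specific checks (consistent column count, no duplicate IDs, numeric values) are assumptions for illustration and are not COBRAxy's own validation.

```python
import csv
import io
from typing import Dict, List, Tuple


def read_matrix(handle) -> Tuple[List[str], Dict[str, List[float]]]:
    """Parse a features x samples TSV into (sample_names, {feature_id: values})."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    sample_names = header[1:]
    rows: Dict[str, List[float]] = {}
    for line_no, row in enumerate(reader, start=2):
        if not row:
            continue  # skip blank lines
        if len(row) != len(header):
            raise ValueError(
                f"line {line_no}: expected {len(header)} columns, got {len(row)}")
        feature_id = row[0]
        if feature_id in rows:
            raise ValueError(f"line {line_no}: duplicate ID {feature_id!r}")
        # float() raises ValueError on non-numeric cells.
        rows[feature_id] = [float(value) for value in row[1:]]
    return sample_names, rows


example = (
    "Gene_ID\tSample_1\tSample_2\tSample_3\n"
    "HGNC:5\t12.5\t8.3\t15.7\n"
    "HGNC:10\t3.2\t4.1\t2.8\n"
)
sample_names, genes = read_matrix(io.StringIO(example))
```

For a file on disk, open it with `open(path, newline="")` and pass the handle to `read_matrix`. The same function works for the metabolite table, since both share the ID-column-plus-samples layout.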

## Next Steps

Now that you understand the basics:

1. **[Quick Start Guide](/quickstart.md)** - Complete walkthrough with example data
2. **[Galaxy Tutorial](/tutorials/galaxy-setup.md)** - Web-based analysis setup
3. **[Tools Reference](/tools/)** - Detailed documentation for each tool
4. **[Examples](/examples/)** - Real-world analysis examples

## Need Help?

- **[Troubleshooting](/troubleshooting.md)** - Common issues and solutions
- **[GitHub Issues](https://github.com/CompBtBs/COBRAxy/issues)** - Report bugs or ask questions
- **[Contributing](/contributing.md)** - Help improve COBRAxy