annotate COBRAxy/docs/getting-started.md @ 492:4ed95023af20 draft

Uploaded
author francesco_lapi
date Tue, 30 Sep 2025 14:02:17 +0000

# Getting Started

Welcome to COBRAxy! This guide will help you get up and running with metabolic flux analysis.

## What is COBRAxy?

COBRAxy is a comprehensive toolkit for metabolic flux analysis that bridges the gap between omics data and biological insights. It provides:

- **Data Integration**: Combine gene expression and metabolite data
- **Metabolic Modeling**: Use constraint-based models for flux analysis
- **Visualization**: Generate interactive pathway maps
- **Statistical Analysis**: Perform enrichment and sensitivity analysis

## Core Concepts

### Reaction Activity Scores (RAS)
RAS quantify how active each metabolic reaction is, based on gene expression data. COBRAxy computes RAS by:
1. Mapping genes to reactions via GPR (Gene-Protein-Reaction) rules
2. Combining gene expression values through the AND/OR operators of each rule (AND for subunits of an enzyme complex, OR for alternative enzymes)
3. Producing one activity score per reaction per sample
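
The three steps above can be illustrated with a small sketch. It assumes one common convention for GPR rules, AND as the minimum and OR as the maximum of the member values; this guide does not state which operators COBRAxy actually applies. The function name `score_rule`, the nested-tuple rule format, and the example rule are hypothetical.

```python
from typing import Mapping, Union

# A rule is either a gene ID (str) or a tuple ("and" | "or", [sub-rules]).
Rule = Union[str, tuple]


def score_rule(rule: Rule, expression: Mapping[str, float]) -> float:
    """Score one GPR rule for one sample (assumed: AND -> min, OR -> max)."""
    if isinstance(rule, str):
        return expression[rule]  # leaf: look up the gene's expression value
    operator, operands = rule
    values = [score_rule(operand, expression) for operand in operands]
    if operator == "and":
        return min(values)  # a complex is limited by its scarcest subunit
    if operator == "or":
        return max(values)  # any alternative enzyme can carry the reaction
    raise ValueError(f"unknown operator: {operator!r}")


# Hypothetical rule: (HGNC:5 AND HGNC:10) OR HGNC:15
rule = ("or", [("and", ["HGNC:5", "HGNC:10"]), "HGNC:15"])
sample_1 = {"HGNC:5": 12.5, "HGNC:10": 3.2, "HGNC:15": 7.9}
ras = score_rule(rule, sample_1)
print(ras)  # 7.9
```

Repeating the call for every reaction and every sample column would give a reactions × samples table of scores.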

### Reaction Propensity Scores (RPS)
RPS indicate metabolic preferences based on metabolite abundance. COBRAxy computes RPS by:
1. Mapping metabolites to reactions as substrates/products
2. Weighting by stoichiometry and frequency
3. Computing propensity scores using log-normalized formulas

### Flux Sampling
COBRAxy samples feasible flux distributions with one of two algorithms:
- **CBS (Corner-Based Sampling)**: Samples corners of the feasible flux space by optimizing randomly generated objective functions
- **OptGP (Optimized General Parallel sampler)**: Parallel hit-and-run-style sampler that aims at uniform coverage of the feasible flux space

## Analysis Workflows

COBRAxy supports two main analysis paths:

### 1. Enrichment Analysis Workflow
```text
# Generate activity scores
ras_generator → RAS values
rps_generator → RPS values

# Statistical enrichment analysis
marea → Enriched pathway maps
```

**Use when**: You want to identify significantly altered pathways and create publication-ready maps.

### 2. Flux Simulation Workflow
```text
# Apply constraints to model
ras_generator → RAS values
ras_to_bounds → Constrained model

# Sample flux distributions
flux_simulation → Flux samples
flux_to_map → Final visualizations
```

**Use when**: You want to predict metabolic flux distributions and study network-wide changes.

## Your First Analysis

Let's run a basic analysis with sample data:

### Step 1: Prepare Your Data

You'll need:
- **Gene expression data**: TSV file with genes (rows) × samples (columns)
- **Metabolic model**: An SBML file, or one of the built-in models (ENGRO2, Recon)
- **Metabolite data** (optional): TSV file with metabolites (rows) × samples (columns)

### Step 2: Generate Activity Scores

```bash
# Generate RAS from expression data
# (quoting "$(pwd)" keeps paths that contain spaces intact)
ras_generator -td "$(pwd)" \
  -in expression_data.tsv \
  -ra ras_output.tsv \
  -rs ENGRO2
```

### Step 3: Create Pathway Maps

```bash
# Generate enriched pathway maps from the RAS table produced in Step 2
marea -td "$(pwd)" \
  -using_RAS true \
  -input_data ras_output.tsv \
  -choice_map ENGRO2 \
  -gs true \
  -idop pathway_maps
```

### Step 4: View Results

Your analysis will generate:
- **RAS values**: `ras_output.tsv` - Activity scores for each reaction
- **Statistical maps**: `pathway_maps/` - SVG files with enrichment visualization
- **Log files**: Detailed execution logs for troubleshooting
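
To check quickly which maps were produced, the output directory can be listed from Python. This is a minimal sketch that assumes the SVG files sit directly inside `pathway_maps/` rather than in subdirectories; `list_svg_maps` is a hypothetical helper, not part of COBRAxy.

```python
from pathlib import Path


def list_svg_maps(output_dir: str) -> list[str]:
    """Return the sorted names of SVG files directly inside output_dir."""
    return sorted(path.name for path in Path(output_dir).glob("*.svg"))


if __name__ == "__main__":
    # Directory name taken from the -idop argument used in Step 3.
    for name in list_svg_maps("pathway_maps"):
        print(name)
```

The SVG files can then be opened in a web browser or a vector graphics editor.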

## Built-in Models

COBRAxy includes ready-to-use metabolic models:

| Model | Organism | Reactions | Genes | Description |
|-------|----------|-----------|-------|-------------|
| **ENGRO2** | Human | ~2,000 | ~500 | Focused human metabolism model |
| **Recon** | Human | ~10,000 | ~2,000 | Comprehensive human metabolism |

Models are stored in the `local/` directory and include:
- SBML files
- GPR rules
- Gene mapping tables
- Pathway templates

## Data Formats

### Gene Expression Format
```tsv
Gene_ID	Sample_1	Sample_2	Sample_3
HGNC:5	12.5	8.3	15.7
HGNC:10	3.2	4.1	2.8
HGNC:15	7.9	11.2	6.4
```
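
A quick sanity check before running the tools can catch malformed input. The sketch below uses Python's standard `csv` module and assumes tab-separated columns with the identifier in the first column, as in the example above. `read_expression` is a hypothetical helper, not part of COBRAxy.

```python
import csv
import io


def read_expression(handle):
    """Parse a genes x samples TSV into (sample_names, {gene_id: [values]})."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]  # first column holds the gene identifier
    table = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        gene_id, values = row[0], row[1:]
        if len(values) != len(samples):
            raise ValueError(
                f"{gene_id}: expected {len(samples)} values, got {len(values)}"
            )
        table[gene_id] = [float(value) for value in values]
    return samples, table


example = (
    "Gene_ID\tSample_1\tSample_2\tSample_3\n"
    "HGNC:5\t12.5\t8.3\t15.7\n"
    "HGNC:10\t3.2\t4.1\t2.8\n"
)
samples, table = read_expression(io.StringIO(example))
print(samples)           # ['Sample_1', 'Sample_2', 'Sample_3']
print(table["HGNC:10"])  # [3.2, 4.1, 2.8]
```

For a file on disk, pass `open("expression_data.tsv", newline="")` instead of the `StringIO` object.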

### Metabolite Format
```tsv
Metabolite_ID	Sample_1	Sample_2	Sample_3
glucose	100.5	85.3	120.7
pyruvate	45.2	38.1	52.8
lactate	23.9	41.2	19.4
```

## Command Line vs Python API

COBRAxy offers two usage modes:

### Command Line (Quick Analysis)
```bash
# Simple command-line execution
ras_generator -td "$(pwd)" -in data.tsv -ra output.tsv -rs ENGRO2
```

### Python API (Programming)
```python
import ras_generator
# Call main function with arguments
ras_generator.main(['-td', '/path', '-in', 'data.tsv', '-ra', 'output.tsv', '-rs', 'ENGRO2'])
```
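
When the tool is called from a script, the argument list is often easier to maintain if it is built from named values. A minimal sketch, assuming only the four flags shown above (`-td`, `-in`, `-ra`, `-rs`); `build_ras_args` is a hypothetical helper, and the commented-out call assumes `ras_generator.main` accepts an argument list as in the example above.

```python
def build_ras_args(tool_dir, input_tsv, output_tsv, rules="ENGRO2"):
    """Build the ras_generator argument list from named values."""
    return [
        "-td", str(tool_dir),    # tool directory
        "-in", str(input_tsv),   # gene expression table
        "-ra", str(output_tsv),  # where the RAS table is written
        "-rs", rules,            # built-in rule set / model name
    ]


args = build_ras_args("/path", "data.tsv", "output.tsv")
print(args)

# With COBRAxy importable, the list can be passed on unchanged:
# import ras_generator
# ras_generator.main(args)
```

Keeping the flags in one helper means a script that processes several datasets only varies the file names.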

## Next Steps

Now that you understand the basics:

1. **[Quick Start Guide](quickstart.md)** - Complete walkthrough with example data
2. **[Python API Tutorial](tutorials/python-api.md)** - Learn programmatic usage
3. **[Tools Reference](tools/)** - Detailed documentation for each tool
4. **[Examples](examples/)** - Real-world analysis examples

## Need Help?

- **[Troubleshooting](troubleshooting.md)** - Common issues and solutions
- **[GitHub Issues](https://github.com/CompBtBs/COBRAxy/issues)** - Report bugs or ask questions
- **[Contributing](contributing.md)** - Help improve COBRAxy