# Getting Started

Welcome to COBRAxy! This guide will help you get up and running with metabolic flux analysis.

## What is COBRAxy?

COBRAxy is a comprehensive toolkit for metabolic flux analysis that bridges the gap between omics data and biological insights. It provides:

- **Data Integration**: Combine gene expression and metabolite data
- **Metabolic Modeling**: Use constraint-based models for flux analysis
- **Visualization**: Generate interactive pathway maps
- **Statistical Analysis**: Perform enrichment and sensitivity analysis

## Core Concepts

### Reaction Activity Scores (RAS)
RAS quantify how active metabolic reactions are based on gene expression data. COBRAxy computes RAS by:
1. Mapping genes to reactions via GPR (Gene-Protein-Reaction) rules
2. Applying logical operations (AND/OR) based on enzyme complexes
3. Producing activity scores for each reaction in each sample
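
The three steps above can be pictured with a small Python sketch. This is not COBRAxy's implementation: it assumes a common convention in which AND (subunits of an enzyme complex) takes the minimum expression value and OR (isoenzymes) takes the sum. The nested-tuple rule format, the function name `reaction_activity`, and the example rule are hypothetical and exist only for illustration.

```python
from typing import Dict, Tuple, Union

# A GPR rule is either a gene ID (str) or a tuple ("and" | "or", rule, rule, ...).
# This nested-tuple format is an illustrative assumption, not COBRAxy's own format.
Rule = Union[str, Tuple]


def reaction_activity(rule: Rule, expression: Dict[str, float]) -> float:
    """Evaluate one GPR rule against one sample's expression values."""
    if isinstance(rule, str):
        # Leaf node: look up the gene's expression (KeyError if the gene is missing).
        return expression[rule]
    operator, *operands = rule
    scores = [reaction_activity(operand, expression) for operand in operands]
    if operator == "and":
        # Assumed convention: a complex is limited by its least expressed subunit.
        return min(scores)
    if operator == "or":
        # Assumed convention: isoenzymes contribute additively.
        return sum(scores)
    raise ValueError(f"unknown operator: {operator!r}")


# One sample, and a hypothetical rule "(HGNC:5 AND HGNC:10) OR HGNC:15".
sample = {"HGNC:5": 12.5, "HGNC:10": 3.2, "HGNC:15": 7.9}
rule = ("or", ("and", "HGNC:5", "HGNC:10"), "HGNC:15")
ras = reaction_activity(rule, sample)
print(round(ras, 2))
```

Repeating the call for every reaction and every sample column would give a reactions × samples score table. How genes that are missing from the expression file should be handled is left open here; the sketch simply raises `KeyError`.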

### Reaction Propensity Scores (RPS)
RPS indicate metabolic preferences based on metabolite abundance:
1. Map metabolites to reactions as substrates/products
2. Weight by stoichiometry and frequency
3. Compute propensity scores using log-normalized formulas
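
The exact formula is not spelled out in this guide, so the following Python sketch shows only one plausible reading of the steps above, under stated assumptions: each substrate's abundance is divided by the number of reactions that consume it (its "frequency"), log-transformed, and weighted by its stoichiometric coefficient. The function name `reaction_propensity`, the frequency values, and the formula itself are assumptions, not COBRAxy's documented method.

```python
import math
from typing import Dict


def reaction_propensity(
    substrates: Dict[str, float],  # metabolite -> stoichiometric coefficient
    abundance: Dict[str, float],   # metabolite -> measured abundance in one sample
    frequency: Dict[str, int],     # metabolite -> number of reactions consuming it
) -> float:
    """Assumed log-space score: sum of coefficient * ln(abundance / frequency)."""
    score = 0.0
    for metabolite, coefficient in substrates.items():
        # Share the measured pool among the reactions that compete for it.
        shared = abundance[metabolite] / frequency[metabolite]
        # Abundances must be strictly positive for the logarithm to be defined.
        score += coefficient * math.log(shared)
    return score


abundance = {"glucose": 100.5, "pyruvate": 45.2}
frequency = {"glucose": 3, "pyruvate": 5}  # hypothetical counts
rps = reaction_propensity({"glucose": 1.0}, abundance, frequency)
```

Working in log space turns the mass-action-style product into a sum, which keeps scores numerically stable when abundances span several orders of magnitude.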

### Flux Sampling
Sample feasible flux distributions using:
- **CBS (Corner-Based Sampling)**: Samples corners (vertices) of the feasible flux space
- **OptGP**: Parallel hit-and-run sampler, as implemented in COBRApy's `OptGPSampler`

## Analysis Workflows

COBRAxy supports two main analysis paths:

### 1. Enrichment Analysis Workflow
```text
# Generate activity scores
ras_generator → RAS values
rps_generator → RPS values

# Statistical enrichment analysis
marea → Enriched pathway maps
```

**Use when**: You want to identify significantly altered pathways and create publication-ready maps.

### 2. Flux Simulation Workflow
```text
# Apply constraints to model
ras_generator → RAS values
ras_to_bounds → Constrained model

# Sample flux distributions
flux_simulation → Flux samples
flux_to_map → Final visualizations
```

**Use when**: You want to predict metabolic flux distributions and study network-wide changes.
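
The "apply constraints to model" step of the flux simulation workflow can be pictured with the Python sketch below. It is an illustration under assumptions, not the behavior of `ras_to_bounds`: it supposes that each reaction's flux bounds are shrunk in proportion to its RAS relative to the largest RAS in the sample, and that reactions without a score keep their original bounds. The function name `scale_bounds` and the reaction IDs are hypothetical.

```python
from typing import Dict, Tuple

Bounds = Dict[str, Tuple[float, float]]  # reaction -> (lower, upper)


def scale_bounds(bounds: Bounds, ras: Dict[str, float]) -> Bounds:
    """Scale (lower, upper) by RAS / max(RAS); an assumed normalization."""
    max_ras = max(ras.values(), default=0.0)
    scaled: Bounds = {}
    for reaction, (lower, upper) in bounds.items():
        if reaction not in ras or max_ras <= 0:
            # No usable score: leave the original bounds untouched.
            scaled[reaction] = (lower, upper)
            continue
        factor = ras[reaction] / max_ras
        scaled[reaction] = (lower * factor, upper * factor)
    return scaled


model_bounds = {"R1": (0.0, 1000.0), "R2": (-1000.0, 1000.0), "R3": (0.0, 1000.0)}
sample_ras = {"R1": 10.0, "R2": 5.0}  # R3 has no GPR-derived score
constrained = scale_bounds(model_bounds, sample_ras)
```

Sampling the constrained model then yields flux distributions that reflect the expression profile of that sample rather than the generic network.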

## Your First Analysis

Let's run a basic analysis with sample data:

### Step 1: Prepare Your Data

You'll need:
- **Gene expression data**: TSV file with genes (rows) × samples (columns)
- **Metabolic model**: SBML file or use built-in models (ENGRO2, Recon)
- **Metabolite data** (optional): TSV file with metabolites (rows) × samples (columns)

### Step 2: Generate Activity Scores

```bash
# Generate RAS from expression data
# Note: -td is optional and auto-detected after pip install
ras_generator \
  -in expression_data.tsv \
  -ra ras_output.tsv \
  -rs ENGRO2
```

### Step 3: Create Pathway Maps

```bash
# Generate enriched pathway maps
# Note: -td is optional and auto-detected after pip install
marea \
  -using_RAS true \
  -input_data ras_output.tsv \
  -choice_map ENGRO2 \
  -gs true \
  -idop pathway_maps
```
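
If you prefer to drive Steps 2 and 3 from a script, the two commands can be wrapped in Python. The sketch below uses only the flags shown above; the helper names (`ras_generator_cmd`, `marea_cmd`, `run_pipeline`) are hypothetical, and it assumes both tools are available on `PATH` after installation.

```python
import shutil
import subprocess
from typing import List


def ras_generator_cmd(expression: str, ras_out: str, rules: str = "ENGRO2") -> List[str]:
    """Build the Step 2 command line as an argument list."""
    return ["ras_generator", "-in", expression, "-ra", ras_out, "-rs", rules]


def marea_cmd(ras_file: str, out_dir: str, map_name: str = "ENGRO2") -> List[str]:
    """Build the Step 3 command line as an argument list."""
    return [
        "marea",
        "-using_RAS", "true",
        "-input_data", ras_file,
        "-choice_map", map_name,
        "-gs", "true",
        "-idop", out_dir,
    ]


def run_pipeline(expression: str, ras_out: str, out_dir: str) -> None:
    """Run Step 2 then Step 3, stopping at the first failure."""
    for cmd in (ras_generator_cmd(expression, ras_out), marea_cmd(ras_out, out_dir)):
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} not found on PATH; is COBRAxy installed?")
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline("expression_data.tsv", "ras_output.tsv", "pathway_maps")` would then reproduce the two steps. Passing arguments as a list avoids shell quoting problems with file names that contain spaces.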

### Step 4: View Results

Your analysis will generate:
- **RAS values**: `ras_output.tsv` - Activity scores for each reaction
- **Statistical maps**: `pathway_maps/` - SVG files with enrichment visualization
- **Log files**: Detailed execution logs for troubleshooting
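
A quick sanity check of these outputs can be scripted. The Python sketch below assumes that `ras_output.tsv` has a header row, one row per reaction and one column per sample, and that the maps are plain `.svg` files directly inside the output directory; both layout details and the function name `summarize_outputs` are assumptions made for illustration.

```python
import csv
from pathlib import Path
from typing import Dict


def summarize_outputs(ras_path: str, maps_dir: str) -> Dict[str, object]:
    """Count samples and reactions in the RAS table and list the SVG maps."""
    with open(ras_path, newline="") as handle:
        rows = list(csv.reader(handle, delimiter="\t"))
    header, body = rows[0], rows[1:]
    svg_files = sorted(path.name for path in Path(maps_dir).glob("*.svg"))
    return {
        "samples": len(header) - 1,  # first column holds the row identifier
        "reactions": len(body),
        "maps": svg_files,
    }
```

For the run above, `summarize_outputs("ras_output.tsv", "pathway_maps")` returns a small dictionary; an empty `maps` list or a sample count that differs from your input is a cue to look at the log files.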

## Built-in Models

COBRAxy includes ready-to-use metabolic models:

| Model | Organism | Reactions | Genes | Description |
|-------|----------|-----------|-------|-------------|
| **ENGRO2** | Human | ~2,000 | ~500 | Focused human metabolism model |
| **Recon** | Human | ~10,000 | ~2,000 | Comprehensive human metabolism |

Models are stored in the `src/local/` directory and include:
- SBML files
- GPR rules
- Gene mapping tables
- Pathway templates

## Data Formats

### Gene Expression Format
```tsv
Gene_ID	Sample_1	Sample_2	Sample_3
HGNC:5	12.5	8.3	15.7
HGNC:10	3.2	4.1	2.8
HGNC:15	7.9	11.2	6.4
```

### Metabolite Format
```tsv
Metabolite_ID	Sample_1	Sample_2	Sample_3
glucose	100.5	85.3	120.7
pyruvate	45.2	38.1	52.8
lactate	23.9	41.2	19.4
```
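
Both formats share the same shape (an identifier column followed by one numeric column per sample), so a single check can catch most formatting problems before running the tools. The Python sketch below is a minimal validator using only the standard library; the function name `read_matrix` and its error messages are hypothetical, and it is not part of COBRAxy.

```python
import csv
import io
from typing import Dict, TextIO


def read_matrix(handle: TextIO) -> Dict[str, Dict[str, float]]:
    """Parse a rows-by-samples TSV into {row_id: {sample: value}}."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]
    matrix: Dict[str, Dict[str, float]] = {}
    for line_no, row in enumerate(reader, start=2):
        if not row:
            continue  # tolerate blank lines
        if len(row) != len(header):
            raise ValueError(
                f"line {line_no}: expected {len(header)} fields, got {len(row)}"
            )
        row_id, values = row[0], row[1:]
        if row_id in matrix:
            raise ValueError(f"line {line_no}: duplicate identifier {row_id!r}")
        # float() raises ValueError on non-numeric cells.
        matrix[row_id] = {s: float(v) for s, v in zip(samples, values)}
    return matrix


example = io.StringIO(
    "Gene_ID\tSample_1\tSample_2\tSample_3\n"
    "HGNC:5\t12.5\t8.3\t15.7\n"
    "HGNC:10\t3.2\t4.1\t2.8\n"
)
expression = read_matrix(example)
```

For a file on disk, pass `open("expression_data.tsv", newline="")` instead of the in-memory example. A common failure this catches is a file saved with spaces or commas instead of tabs, which shows up as a field-count error on the first data line.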

## Next Steps

Now that you understand the basics:

1. **[Quick Start Guide](/quickstart.md)** - Complete walkthrough with example data
2. **[Galaxy Tutorial](/tutorials/galaxy-setup.md)** - Web-based analysis setup
3. **[Tools Reference](/tools/)** - Detailed documentation for each tool
4. **[Examples](/examples/)** - Real-world analysis examples

## Need Help?

- **[Troubleshooting](/troubleshooting.md)** - Common issues and solutions
- **[GitHub Issues](https://github.com/CompBtBs/COBRAxy/issues)** - Report bugs or ask questions
- **[Contributing](/contributing.md)** - Help improve COBRAxy