# Flux to Map

Visualize metabolic flux data on pathway maps with statistical analysis and color coding.

## Overview

Flux to Map performs statistical analysis on flux distribution data and generates color-coded metabolic pathway maps. It compares flux values between sample groups and highlights significantly different reactions with appropriate colors and line weights.

## Usage

### Command Line

```bash
flux_to_map -td /path/to/COBRAxy \
  --input_data_fluxes flux_data.tsv \
  --input_class_fluxes sample_groups.tsv \
  --comparison manyvsmany \
  --test ks \
  -pv 0.05 \
  -fc 1.5 \
  --choice_map ENGRO2 \
  --generate_svg true \
  --generate_pdf true \
  -idop flux_maps/
```

### Galaxy Interface

Select "Flux to Map" from the COBRAxy tool suite and configure flux analysis and visualization parameters.

## Parameters

### Required Parameters

| Parameter | Flag | Description |
|-----------|------|-------------|
| Tool Directory | `-td, --tool_dir` | Path to COBRAxy installation directory |

### Data Input Parameters

| Parameter | Flag | Description | Default |
|-----------|------|-------------|---------|
| Flux Data | `-idf, --input_data_fluxes` | Flux values TSV file | - |
| Flux Classes | `-icf, --input_class_fluxes` | Sample group labels for fluxes | - |
| Multiple Flux Files | `-idsf, --input_datas_fluxes` | Multiple flux datasets (space-separated) | - |
| Flux Names | `-naf, --names_fluxes` | Names for multiple flux datasets | - |
| Analysis Option | `-op, --option` | Analysis mode (`datasets` or `dataset_class`) | - |

### Statistical Parameters

| Parameter | Flag | Description | Default |
|-----------|------|-------------|---------|
| Comparison Type | `-co, --comparison` | Statistical comparison mode | manyvsmany |
| Statistical Test | `-te, --test` | Statistical test method | ks |
| P-Value Threshold | `-pv, --pValue` | Significance threshold | 0.1 |
| Adjusted P-values | `-adj, --adjusted` | Apply FDR correction | false |
| Fold Change | `-fc, --fChange` | Minimum fold change threshold | 1.5 |

### Visualization Parameters

| Parameter | Flag | Description | Default |
|-----------|------|-------------|---------|
| Map Choice | `-mc, --choice_map` | Built-in metabolic map | HMRcore |
| Custom Map | `-cm, --custom_map` | Path to custom SVG map | - |
| Generate SVG | `-gs, --generate_svg` | Create SVG output | true |
| Generate PDF | `-gp, --generate_pdf` | Create PDF output | true |
| Color Map | `-colorm, --color_map` | Color scheme (jet, viridis) | - |
| Output Directory | `-idop, --output_path` | Results directory | result/ |

### Advanced Parameters

| Parameter | Flag | Description | Default |
|-----------|------|-------------|---------|
| Output Log | `-ol, --out_log` | Log file path | - |
| Control Sample | `-on, --control` | Control group identifier | - |

## Input Formats

### Flux Data File

Tab-separated format with reactions as rows and samples as columns:

```
Reaction  Sample1  Sample2  Sample3  Control1  Control2
R00001    15.23    -8.45    22.1     12.8      14.2
R00002    0.0      12.67    -5.3     8.9       7.4
R00003    45.8     38.2     51.7     42.1      39.8
R00004    -12.4    -15.8    -9.2     -11.5     -13.1
```

### Sample Class File

Group assignment for statistical comparisons:

```
Sample    Class
Sample1   Treatment
Sample2   Treatment
Sample3   Treatment
Control1  Control
Control2  Control
```

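Sample names in the class file have to line up with the column headers of the flux file. The following is a minimal sketch of a check that can be run before the analysis. It is not part of Flux to Map: the helper names `read_tsv` and `check_sample_names` are hypothetical, and the sketch assumes the class file starts with a header row, as in the example above.

```python
import csv
import io


def read_tsv(text):
    """Parse tab-separated text into a list of non-empty rows."""
    return [row for row in csv.reader(io.StringIO(text), delimiter="\t") if row]


def check_sample_names(flux_text, class_text):
    """Return (samples missing from the class file, samples missing from the flux file)."""
    # Flux file: sample names are the header cells after the reaction column.
    flux_samples = set(read_tsv(flux_text)[0][1:])
    # Class file: sample names are the first column, header row skipped (assumption).
    class_samples = {row[0] for row in read_tsv(class_text)[1:]}
    return sorted(flux_samples - class_samples), sorted(class_samples - flux_samples)


flux_text = "Reaction\tSample1\tSample2\tControl1\nR00001\t15.23\t-8.45\t12.8\n"
class_text = "Sample\tClass\nSample1\tTreatment\nSample2\tTreatment\nControl2\tControl\n"

print(check_sample_names(flux_text, class_text))  # (['Control1'], ['Control2'])
```

Two empty lists mean every sample appears in both files. In practice the two strings would be read from the files passed with `-idf` and `-icf`.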
### Multiple Dataset Format

When using multiple flux files, provide space-separated paths and corresponding names:

```bash
-idsf "dataset1_flux.tsv dataset2_flux.tsv dataset3_flux.tsv"
-naf "Condition_A Condition_B Condition_C"
```

## Statistical Analysis

### Comparison Types

#### manyvsmany
Compare all possible group pairs:
- Treatment vs Control
- Condition_A vs Condition_B
- Condition_A vs Condition_C
- Condition_B vs Condition_C

#### onevsrest
Compare each group against all others combined:
- Treatment vs (Control + Other)
- Control vs (Treatment + Other)

#### onevsmany
Compare one reference group against each other group:
- Control vs Treatment
- Control vs Condition_A
- Control vs Condition_B

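The three modes differ only in which group pairings they produce. As a rough illustration (a sketch, not the tool's own code), the pairings can be enumerated as below. The function name `comparison_pairs` is hypothetical, and the `control` argument is assumed to play the role of the group passed with `-on`.

```python
from itertools import combinations


def comparison_pairs(groups, mode, control=None):
    """List the (left, right) comparisons implied by each comparison mode."""
    if mode == "manyvsmany":
        # Every unordered pair of groups, once.
        return list(combinations(groups, 2))
    if mode == "onevsrest":
        # Each group against the pooled remainder.
        return [(g, tuple(o for o in groups if o != g)) for g in groups]
    if mode == "onevsmany":
        # One reference group against each of the others.
        if control not in groups:
            raise ValueError("onevsmany needs a control group that is present in the data")
        return [(control, g) for g in groups if g != control]
    raise ValueError(f"unknown comparison mode: {mode}")


groups = ["Condition_A", "Condition_B", "Condition_C"]
for left, right in comparison_pairs(groups, "manyvsmany"):
    print(left, "vs", right)
```

With three groups, `manyvsmany` yields three comparisons, `onevsrest` yields three (one per group), and `onevsmany` yields two, so the multiple-testing burden depends on the mode as well as on the number of reactions.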
### Statistical Tests

| Test | Description | Best For |
|------|-------------|----------|
| `ks` | Kolmogorov-Smirnov | Non-parametric, distribution-free |
| `ttest_p` | Paired t-test | Related samples, normal distributions |
| `ttest_ind` | Independent t-test | Independent samples, normal distributions |
| `wilcoxon` | Wilcoxon signed-rank | Non-parametric paired comparisons |
| `mw` | Mann-Whitney U | Non-parametric independent comparisons |

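To make the default `ks` test less of a black box, here is a small sketch of the quantity it is built on: the largest gap between the empirical cumulative distributions of the two groups. This is an illustration in plain Python with a hypothetical function name, and it computes only the statistic. A p-value would normally come from a library routine such as SciPy's `ks_2samp`, and nothing here is taken from the Flux to Map source.

```python
def ks_statistic(sample1, sample2):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum gap between empirical CDFs."""
    if not sample1 or not sample2:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The gap can only change at observed values, so checking those is enough.
    for x in list(sample1) + list(sample2):
        cdf1 = sum(v <= x for v in sample1) / len(sample1)
        cdf2 = sum(v <= x for v in sample2) / len(sample2)
        d = max(d, abs(cdf1 - cdf2))
    return d


# Flux values of one reaction in two groups (taken from the example table rows).
treatment = [15.23, -8.45, 22.1]
control = [12.8, 14.2]
print(round(ks_statistic(treatment, control), 3))  # 0.667
```

A statistic of 0 means the two empirical distributions coincide, and 1 means they do not overlap at all. With very small groups the statistic can take only a few distinct values, which is one reason adequate sample sizes matter.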
### Significance Assessment

Reactions are considered significant when all of the following hold:
1. **P-value** ≤ specified threshold (default: 0.1)
2. **Fold change** ≥ specified threshold (default: 1.5)
3. **FDR correction**: if enabled (`-adj true`), the adjusted p-value still meets the threshold

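The criteria above can be sketched in a few lines. This is an illustrative reading rather than the tool's implementation: the function names are hypothetical, Benjamini-Hochberg is assumed as the FDR method, and the fold change is assumed to be a signed value whose magnitude is compared against the threshold.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def significant(results, p_threshold=0.1, fc_threshold=1.5, adjust=False):
    """results maps reaction ID -> (p_value, fold_change); returns sorted significant IDs."""
    ids = list(results)
    p_vals = [results[r][0] for r in ids]
    if adjust:
        p_vals = benjamini_hochberg(p_vals)
    return sorted(
        r for r, p in zip(ids, p_vals)
        # Assumption: signed fold change, compared by magnitude.
        if p <= p_threshold and abs(results[r][1]) >= fc_threshold
    )


# Made-up values for illustration only.
results = {
    "R00001": (0.01, 2.0),
    "R00002": (0.04, -1.8),
    "R00003": (0.03, 1.1),
    "R00004": (0.50, 3.0),
}
print(significant(results, p_threshold=0.05))               # ['R00001', 'R00002']
print(significant(results, p_threshold=0.05, adjust=True))  # ['R00001']
```

In this made-up example R00002 passes on its raw p-value but drops out once the correction is applied, which is the situation the third criterion describes.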
## Map Visualization

### Built-in Maps

#### HMRcore (Default)
- **Scope**: Core human metabolic network
- **Reactions**: ~300 essential reactions
- **Coverage**: Central carbon, amino acid, lipid metabolism
- **Use Case**: General overview, publication figures

#### ENGRO2
- **Scope**: Extended human genome-scale reconstruction
- **Reactions**: ~2,000 reactions
- **Coverage**: Comprehensive metabolic network
- **Use Case**: Detailed analysis, specialized tissues

#### Custom Maps
User-provided SVG files with reaction elements:
```xml
<rect id="R00001" class="reaction" fill="gray" stroke="black"/>
<path id="R00002" class="reaction" fill="gray" stroke="black"/>
```

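Before running the tool on a custom map, it can help to confirm that the reaction IDs in the SVG line up with those in the flux table. The sketch below uses Python's standard `xml.etree.ElementTree` module. It assumes reaction elements carry `class="reaction"` and an `id` equal to the reaction ID, as in the snippet above; the function name `reaction_ids` is hypothetical, and other maps may follow a different ID convention.

```python
import xml.etree.ElementTree as ET


def reaction_ids(svg_text):
    """Collect the id of every element whose class attribute includes 'reaction'."""
    root = ET.fromstring(svg_text)
    return {
        el.get("id")
        for el in root.iter()
        if el.get("id") and "reaction" in el.get("class", "").split()
    }


svg_text = """<svg xmlns="http://www.w3.org/2000/svg">
  <rect id="R00001" class="reaction" fill="gray" stroke="black"/>
  <path id="R00002" class="reaction" fill="gray" stroke="black"/>
  <text id="title">Example map</text>
</svg>"""

flux_ids = {"R00001", "R00003"}  # reaction IDs from the flux table (illustrative)
map_ids = reaction_ids(svg_text)

print(sorted(flux_ids - map_ids))  # ['R00003']  data without a map element
print(sorted(map_ids - flux_ids))  # ['R00002']  map element without data
```

`ET.fromstring` raises `ParseError` on malformed XML, so the same call also serves as a quick structural check of the map file.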
### Color Coding Scheme

#### Significance Colors
- **Red Gradient**: Significantly upregulated (positive fold change)
- **Blue Gradient**: Significantly downregulated (negative fold change)
- **Gray**: Not statistically significant
- **White**: No data available

#### Visual Elements
- **Line Width**: Proportional to fold change magnitude
- **Color Intensity**: Proportional to statistical significance (-log10 p-value)
- **Transparency**: Indicates confidence level

### Color Maps

#### Jet (Default)
- High contrast color transitions
- Blue (low) → Green → Yellow → Red (high)
- Good for identifying extreme values

#### Viridis
- Perceptually uniform color scale
- Colorblind-friendly
- Purple (low) → Blue → Green → Yellow (high)

## Output Files

### Statistical Results
- `flux_statistics.tsv`: P-values, fold changes, test statistics for all reactions
- `significant_fluxes.tsv`: Only reactions meeting significance criteria
- `comparison_summary.txt`: Analysis parameters and summary statistics

### Visualizations
- `flux_map.svg`: Interactive color-coded pathway map
- `flux_map.pdf`: High-resolution PDF (if requested)
- `flux_map.png`: Raster image (if requested)
- `legend.svg`: Color scale and statistical significance legend

### Analysis Files
- `fold_changes.tsv`: Detailed fold change calculations
- `group_statistics.tsv`: Per-group summary statistics
- `comparison_matrix.tsv`: Pairwise comparison results

## Examples

### Basic Flux Comparison

```bash
# Compare treatment vs control fluxes
flux_to_map -td /opt/COBRAxy \
  -idf treatment_vs_control_fluxes.tsv \
  -icf sample_groups.tsv \
  -co manyvsmany \
  -te ks \
  -pv 0.05 \
  -fc 2.0 \
  -mc HMRcore \
  -gs true \
  -gp true \
  -idop flux_comparison/
```

### Multiple Condition Analysis

```bash
# Compare multiple experimental conditions
flux_to_map -td /opt/COBRAxy \
  -idsf "cond1_flux.tsv cond2_flux.tsv cond3_flux.tsv" \
  -naf "Control Treatment1 Treatment2" \
  -co onevsrest \
  -te wilcoxon \
  -adj true \
  -pv 0.01 \
  -fc 1.8 \
  -mc ENGRO2 \
  -colorm viridis \
  -idop multi_condition_flux/
```

### Custom Map Visualization

```bash
# Use tissue-specific custom map
flux_to_map -td /opt/COBRAxy \
  -idf liver_flux_data.tsv \
  -icf liver_conditions.tsv \
  -co manyvsmany \
  -te ttest_ind \
  -pv 0.05 \
  -fc 1.5 \
  -cm maps/liver_specific_map.svg \
  -gs true \
  -gp true \
  -idop liver_flux_analysis/ \
  -ol liver_analysis.log
```

### High-Throughput Analysis

```bash
# Process multiple datasets with stringent criteria
flux_to_map -td /opt/COBRAxy \
  -idsf "exp1.tsv exp2.tsv exp3.tsv exp4.tsv" \
  -naf "Exp1 Exp2 Exp3 Exp4" \
  -co manyvsmany \
  -te ks \
  -adj true \
  -pv 0.001 \
  -fc 3.0 \
  -mc HMRcore \
  -colorm jet \
  -gs true \
  -gp true \
  -idop high_throughput_flux/
```

## Quality Control

### Data Validation

#### Pre-analysis Checks
- Verify flux value distributions (check for outliers)
- Ensure sample names match between data and class files
- Validate reaction coverage across samples
- Check for missing values and their patterns

#### Statistical Validation
- Assess normality assumptions for parametric tests
- Verify adequate sample sizes per group (n≥3 recommended)
- Check variance homogeneity between groups
- Evaluate multiple testing burden

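The sample-size recommendation is easy to check mechanically. A minimal sketch, assuming the class file has already been read into `(sample, group)` pairs; the function name `small_groups` and the `min_size` parameter are illustrative, not part of the tool:

```python
from collections import Counter


def small_groups(class_rows, min_size=3):
    """Return {group: sample count} for every group with fewer than min_size samples."""
    counts = Counter(group for _sample, group in class_rows)
    return {group: n for group, n in counts.items() if n < min_size}


class_rows = [
    ("Sample1", "Treatment"),
    ("Sample2", "Treatment"),
    ("Sample3", "Treatment"),
    ("Control1", "Control"),
    ("Control2", "Control"),
]
print(small_groups(class_rows))  # {'Control': 2}
```

An empty dictionary means every group meets the recommended minimum.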
### Result Interpretation

#### Biological Validation
- Compare results with known pathway activities
- Check for pathway coherence (related reactions should cluster)
- Validate against literature or experimental evidence
- Assess metabolic network connectivity

#### Technical Validation
- Compare results across different statistical tests
- Check sensitivity to parameter changes
- Validate fold change calculations
- Verify map element correspondence

## Tips and Best Practices

### Data Preparation
- **Normalization**: Ensure consistent flux units across samples
- **Filtering**: Remove reactions with excessive missing values (>50%)
- **Outlier Detection**: Identify and handle extreme flux values
- **Batch Effects**: Account for technical variation between experiments

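The filtering step can be sketched as follows. This is a hypothetical helper rather than a feature of Flux to Map. It assumes the flux table has been loaded into a dictionary of reaction ID to values, with `None` standing in for a missing value.

```python
def filter_missing(flux_rows, max_missing=0.5):
    """Keep reactions whose fraction of missing (None) values is at most max_missing."""
    kept = {}
    for reaction, values in flux_rows.items():
        if not values:
            continue  # no samples at all: nothing to analyze
        missing = sum(v is None for v in values) / len(values)
        if missing <= max_missing:
            kept[reaction] = values
    return kept


flux_rows = {
    "R00001": [15.23, -8.45, 22.1, 12.8],  # complete
    "R00002": [None, None, None, 8.9],     # 75% missing, dropped
    "R00003": [45.8, None, None, 42.1],    # exactly 50% missing, kept
}
print(sorted(filter_missing(flux_rows)))  # ['R00001', 'R00003']
```

The cutoff is inclusive, so a reaction with exactly half of its values missing is kept, in line with the ">50%" guideline.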
### Statistical Considerations
- Use FDR correction for multiple comparisons (`-adj true`)
- Choose appropriate statistical tests based on data distribution
- Consider effect size (fold change) alongside significance
- Validate results with independent datasets when possible

### Visualization Optimization
- Select appropriate color maps for your audience
- Use high fold change thresholds (>2.0) for cleaner maps
- Export both SVG (editable) and PDF (publication) formats
- Include comprehensive legends and annotations

### Performance Tips
- Use HMRcore for faster processing and clearer visualizations
- Reduce dataset size for initial exploratory analysis
- Process large datasets in batches if memory constrained
- Cache intermediate results for parameter optimization

## Integration Workflow

### Upstream Tools
- [Flux Simulation](flux-simulation.md) - Generate flux distributions for comparison
- [MAREA](marea.md) - Alternative analysis pathway for RAS/RPS data

### Downstream Analysis
- Export results to statistical software (R, Python) for advanced analysis
- Integrate with pathway databases (KEGG, Reactome)
- Combine with other omics data for systems-level insights

### Typical Pipeline

```bash
# 1. Generate flux samples from constrained models
flux_simulation -td /opt/COBRAxy -ms ENGRO2 -in bounds/*.tsv \
  -ni Sample1,Sample2,Control1,Control2 -a CBS \
  -ot mean -idop fluxes/

# 2. Analyze and visualize flux differences
flux_to_map -td /opt/COBRAxy -idf fluxes/mean.csv \
  -icf sample_groups.tsv -co manyvsmany -te ks \
  -mc HMRcore -gs true -gp true -idop flux_maps/

# 3. Further analysis with custom scripts
python analyze_flux_results.py -i flux_maps/ -o final_results/
```

## Troubleshooting

### Common Issues

**No significant reactions found**
- Relax the p-value threshold (e.g. `-pv 0.2`)
- Reduce the fold change requirement (e.g. `-fc 1.2`)
- Check sample group definitions and sizes
- Verify flux data quality and normalization

**Map rendering problems**
- Check SVG map file integrity and format
- Verify reaction ID matching between data and map
- Ensure sufficient system memory for large maps
- Validate XML structure of custom maps

**Statistical test failures**
- Check data distribution assumptions
- Verify sufficient sample sizes per group
- Consider alternative non-parametric tests
- Examine variance patterns between groups

### Error Messages

| Error | Cause | Solution |
|-------|-------|----------|
| "Map file not found" | Missing/invalid map path | Check file location and format |
| "No matching reactions" | ID mismatch between data and map | Verify reaction naming consistency |
| "Insufficient data" | Too few samples per group | Increase sample sizes or merge groups |
| "Memory allocation failed" | Large dataset/map combination | Reduce data size or increase system memory |

### Performance Issues

**Slow processing**
- Use HMRcore instead of ENGRO2 for faster rendering
- Reduce dataset size for testing
- Process subsets of reactions separately
- Monitor system resource usage

**Large output files**
- Use compressed formats when possible
- Reduce map resolution for preliminary analysis
- Export only essential output formats
- Clean temporary files regularly

## Advanced Usage

### Custom Statistical Functions

Advanced users can implement custom statistical tests by modifying the analysis functions:

```python
from scipy import stats

def custom_test(group1, group2):
    """Return (statistic, pvalue) for two sequences of flux values."""
    # Placeholder: Welch's t-test; swap in any test with the same return shape
    statistic, pvalue = stats.ttest_ind(group1, group2, equal_var=False)
    return statistic, pvalue
```

### Batch Processing Script

Process multiple experiments systematically:

```bash
#!/bin/bash
experiments=("exp1" "exp2" "exp3" "exp4")
for exp in "${experiments[@]}"; do
  flux_to_map -td /opt/COBRAxy \
    -idf "data/${exp}_flux.tsv" \
    -icf "data/${exp}_classes.tsv" \
    -co manyvsmany -te ks -pv 0.05 \
    -mc HMRcore -gs true -gp true \
    -idop "results/${exp}/"
done
```

### Result Aggregation

Combine results across multiple analyses:

```bash
# Merge significant reactions across experiments
python merge_flux_results.py \
  -i results/exp*/significant_fluxes.tsv \
  -o combined_significant_reactions.tsv \
  --method intersection
```

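The merge step in the command above is a user-side script. A minimal sketch of its core logic is shown below, assuming each `significant_fluxes.tsv` has one header row and the reaction ID in its first column. That layout, the function names, the `pvalue` column in the sample data, and the `union` option are assumptions made for illustration.

```python
import csv
import io


def significant_ids(tsv_text):
    """Reaction IDs from the first column of a TSV with one header row (assumed layout)."""
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    return {row[0] for row in rows[1:] if row}


def merge_significant(tables, method="intersection"):
    """Combine the reaction IDs of several result tables into one sorted list."""
    sets = [significant_ids(text) for text in tables]
    if not sets:
        return []
    if method == "intersection":
        merged = set.intersection(*sets)  # significant in every experiment
    elif method == "union":
        merged = set.union(*sets)         # significant in at least one experiment
    else:
        raise ValueError(f"unknown merge method: {method}")
    return sorted(merged)


# Made-up table contents standing in for two result files.
exp1 = "Reaction\tpvalue\nR00001\t0.01\nR00003\t0.02\n"
exp2 = "Reaction\tpvalue\nR00003\t0.03\nR00004\t0.01\n"
print(merge_significant([exp1, exp2]))           # ['R00003']
print(merge_significant([exp1, exp2], "union"))  # ['R00001', 'R00003', 'R00004']
```

An intersection is the conservative choice: a reaction survives only if it is significant in every experiment.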
## See Also

- [Flux Simulation](flux-simulation.md) - Generate input flux distributions
- [MAREA](marea.md) - Alternative pathway analysis approach
- [Custom Map Creation Guide](../tutorials/custom-map-creation.md)
- [Statistical Methods Reference](../tutorials/statistical-methods.md)