diff COBRAxy/docs/tools/flux-to-map.md @ 547:73f2f7e2be17 draft
Uploaded
| author | francesco_lapi |
|---|---|
| date | Tue, 28 Oct 2025 10:44:07 +0000 |
| parents | fcdbc81feb45 |
| children |
--- a/COBRAxy/docs/tools/flux-to-map.md	Mon Oct 27 12:33:08 2025 +0000
+++ b/COBRAxy/docs/tools/flux-to-map.md	Tue Oct 28 10:44:07 2025 +0000
@@ -1,467 +1,125 @@
 # Flux to Map
 
-Visualize metabolic flux data on pathway maps with statistical analysis and color coding.
+Visualize flux distributions on metabolic pathway maps.
 
 ## Overview
 
-Flux to Map performs statistical analysis on flux distribution data and generates color-coded metabolic pathway maps. It compares flux values between sample groups and highlights significantly different reactions with appropriate colors and line weights.
+This tool analyzes and visualizes statistical differences in reaction fluxes of groups of samples, returned by the Flux Simulation tool. The results can be visualized on s SVG metabolic map.
+
+## Galaxy Interface
+
+In Galaxy: **COBRAxy → Metabolic Flux Enrichment Analysis**
 
-## Usage
+1. Upload flux data and sample class file
+2. Select the map and configure the comparison type
+3. Click **Run tool**
 
-### Command Line
+## Command-line console
 
 ```bash
-flux_to_map -td /path/to/COBRAxy \
-  -input_data_fluxes flux_data.tsv \
-  -input_class_fluxes sample_groups.tsv \
-  -comparison manyvsmany \
-  -test ks \
-  -pv 0.05 \
-  -fc 1.5 \
+flux_to_map -input_data fluxes.csv \
+  -input_class classes.csv \
   -choice_map ENGRO2 \
-  -generate_svg true \
-  -generate_pdf true \
-  -idop flux_maps/
+  -comparison manyvsmany \
+  -pvalue 0.05 \
+  -idop output/
 ```
 
-### Galaxy Interface
-
-Select "Flux to Map" from the COBRAxy tool suite and configure flux analysis and visualization parameters.
-
 ## Parameters
 
-### Required Parameters
-
-| Parameter | Flag | Description |
-|-----------|------|-------------|
-| Tool Directory | `-td, --tool_dir` | Path to COBRAxy installation directory |
-
-### Data Input Parameters
-
-| Parameter | Flag | Description | Default |
-|-----------|------|-------------|---------|
-| Flux Data | `-idf, --input_data_fluxes` | Flux values TSV file | - |
-| Flux Classes | `-icf, --input_class_fluxes` | Sample group labels for fluxes | - |
-| Multiple Flux Files | `-idsf, --input_datas_fluxes` | Multiple flux datasets (space-separated) | - |
-| Flux Names | `-naf, --names_fluxes` | Names for multiple flux datasets | - |
-| Analysis Option | `-op, --option` | Analysis mode (datasets or dataset_class) | - |
-
-### Statistical Parameters
-
 | Parameter | Flag | Description | Default |
 |-----------|------|-------------|---------|
-| Comparison Type | `-co, --comparison` | Statistical comparison mode | manyvsmany |
-| Statistical Test | `-te, --test` | Statistical test method | ks |
-| P-Value Threshold | `-pv, --pValue` | Significance threshold | 0.1 |
-| Adjusted P-values | `-adj, --adjusted` | Apply FDR correction | false |
-| Fold Change | `-fc, --fChange` | Minimum fold change threshold | 1.5 |
-
-### Visualization Parameters
-
-| Parameter | Flag | Description | Default |
-|-----------|------|-------------|---------|
-| Map Choice | `-mc, --choice_map` | Built-in metabolic map | HMRcore |
-| Custom Map | `-cm, --custom_map` | Path to custom SVG map | - |
-| Generate SVG | `-gs, --generate_svg` | Create SVG output | true |
-| Generate PDF | `-gp, --generate_pdf` | Create PDF output | true |
-| Color Map | `-colorm, --color_map` | Color scheme (jet, viridis) | - |
-| Output Directory | `-idop, --output_path` | Results directory | result/ |
-
-### Advanced Parameters
-
-| Parameter | Flag | Description | Default |
-|-----------|------|-------------|---------|
-| Output Log | `-ol, --out_log` | Log file path | - |
-| Control Sample | `-on, --control` | Control group identifier | - |
+| Input Data | `-input_data` | Flux data file | - |
+| Input Class | `-input_class` | Sample class definitions | - |
+| Map Choice | `-choice_map` | ENGRO2, Recon, or Custom | ENGRO2 |
+| Custom Map | `-custom_map` | Path to custom SVG map | - |
+| Comparison | `-comparison` | manyvsmany, onevsrest, onevsmany | manyvsmany |
+| P-value | `-pvalue` | Significance threshold | 0.05 |
+| FDR Correction | `-fdr` | Apply FDR correction | true |
+| Test Type | `-test_type` | t, wilcoxon, ks | t |
+| Color Map | `--color_map` | Color scheme: viridis or jet | viridis |
+| Output Path | `-idop` | Output directory | flux_to_map/ |
 
 ## Input Formats
 
-### Flux Data File
-
-Tab-separated format with reactions as rows and samples as columns:
+### Flux Data
 
 ```
-Reaction Sample1 Sample2 Sample3 Control1 Control2
-R00001 15.23 -8.45 22.1 12.8 14.2
-R00002 0.0 12.67 -5.3 8.9 7.4
-R00003 45.8 38.2 51.7 42.1 39.8
-R00004 -12.4 -15.8 -9.2 -11.5 -13.1
+Reaction Sample1 Sample2 Sample3
+R00001 12.5 8.5 14.2
+R00002 -6.5 13.5 7.2
 ```
 
-### Sample Class File
-
-Group assignment for statistical comparisons:
+### Sample Classes
 
 ```
-Sample Class
-Sample1 Treatment
-Sample2 Treatment
+SampleID Class
+Sample1 Control
+Sample2 Treatment
 Sample3 Treatment
-Control1 Control
-Control2 Control
-```
-
-### Multiple Dataset Format
-
-When using multiple flux files, provide space-separated paths and corresponding names:
-
-```bash
--idsf "dataset1_flux.tsv dataset2_flux.tsv dataset3_flux.tsv"
--naf "Condition_A Condition_B Condition_C"
 ```
 
-## Statistical Analysis
-
-### Comparison Types
-
-#### manyvsmany
-Compare all possible group pairs:
-- Treatment vs Control
-- Condition_A vs Condition_B
-- Condition_A vs Condition_C
-- Condition_B vs Condition_C
-
-#### onevsrest
-Compare each group against all others combined:
-- Treatment vs (Control + Other)
-- Control vs (Treatment + Other)
-
-#### onevsmany
-Compare one reference group against each other group:
-- Control vs Treatment
-- Control vs Condition_A
-- Control vs Condition_B
-
-### Statistical Tests
+**Note on Metabolic Map**
+We provide a default SVG map for the ENGRO2 model. If another model is used, we suggest uploading a custom SVG map.
 
-| Test | Description | Best For |
-|------|-------------|----------|
-| `ks` | Kolmogorov-Smirnov | Non-parametric, distribution-free |
-| `ttest_p` | Paired t-test | Related samples, normal distributions |
-| `ttest_ind` | Independent t-test | Independent samples, normal distributions |
-| `wilcoxon` | Wilcoxon signed-rank | Non-parametric paired comparisons |
-| `mw` | Mann-Whitney U | Non-parametric independent comparisons |
-
-### Significance Assessment
+**File Format Notes:**
+- Use **tab-separated** values (TSV) or **comma-separated** (CSV)
+- First row must contain column headers
+- Sample names must match between flux data and class file
+- Class names should not contain spaces
 
-Reactions are considered significant when:
-1. **P-value** ≤ specified threshold (default: 0.1)
-2. **Fold change** ≥ specified threshold (default: 1.5)
-3. **FDR correction** (if enabled) maintains significance
-
-## Map Visualization
-
-### Built-in Maps
-
-#### HMRcore (Default)
-- **Scope**: Core human metabolic network
-- **Reactions**: ~300 essential reactions
-- **Coverage**: Central carbon, amino acid, lipid metabolism
-- **Use Case**: General overview, publication figures
+## Statistical Tests
 
-#### ENGRO2
-- **Scope**: Extended human genome-scale reconstruction
-- **Reactions**: ~2,000 reactions
-- **Coverage**: Comprehensive metabolic network
-- **Use Case**: Detailed analysis, specialized tissues
-
-#### Custom Maps
-User-provided SVG files with reaction elements:
-```xml
-<rect id="R00001" class="reaction" fill="gray" stroke="black"/>
-<path id="R00002" class="reaction" fill="gray" stroke="black"/>
-```
+- **t**: Student's t-test (parametric, assumes normality)
+- **wilcoxon**: Wilcoxon/Mann-Whitney (non-parametric)
+- **ks**: Kolmogorov-Smirnov (distribution-free)
 
-### Color Coding Scheme
-
-#### Significance Colors
-- **Red Gradient**: Significantly upregulated (positive fold change)
-- **Blue Gradient**: Significantly downregulated (negative fold change)
-- **Gray**: Not statistically significant
-- **White**: No data available
-
-#### Visual Elements
-- **Line Width**: Proportional to fold change magnitude
-- **Color Intensity**: Proportional to statistical significance (-log10 p-value)
-- **Transparency**: Indicates confidence level
-
-### Color Maps
+## Comparison Types
 
-#### Jet (Default)
-- High contrast color transitions
-- Blue (low) → Green → Yellow → Red (high)
-- Good for identifying extreme values
-
-#### Viridis
-- Perceptually uniform color scale
-- Colorblind-friendly
-- Purple (low) → Blue → Green → Yellow (high)
-
-## Output Files
+- **manyvsmany**: All pairwise class comparisons
+- **onevsrest**: Each class vs all others
+- **onevsmany**: One reference vs multiple classes
 
-### Statistical Results
-- `flux_statistics.tsv`: P-values, fold changes, test statistics for all reactions
-- `significant_fluxes.tsv`: Only reactions meeting significance criteria
-- `comparison_summary.txt`: Analysis parameters and summary statistics
+## Output
 
-### Visualizations
-- `flux_map.svg`: Interactive color-coded pathway map
-- `flux_map.pdf`: High-resolution PDF (if requested)
-- `flux_map.png`: Raster image (if requested)
-- `legend.svg`: Color scale and statistical significance legend
-
-### Analysis Files
-- `fold_changes.tsv`: Detailed fold change calculations
-- `group_statistics.tsv`: Per-group summary statistics
-- `comparison_matrix.tsv`: Pairwise comparison results
+- `*_map.svg`: Annotated pathway maps
+- `comparison_results.tsv`: Statistical results
+- `*.log`: Processing log
 
 ## Examples
 
-### Basic Flux Comparison
-
-```bash
-# Compare treatment vs control fluxes
-flux_to_map -td /opt/COBRAxy \
-  -idf treatment_vs_control_fluxes.tsv \
-  -icf sample_groups.tsv \
-  -co manyvsmany \
-  -te ks \
-  -pv 0.05 \
-  -fc 2.0 \
-  -mc HMRcore \
-  -gs true \
-  -gp true \
-  -idop flux_comparison/
-```
-
-### Multiple Condition Analysis
+### Basic Comparison
 
 ```bash
-# Compare multiple experimental conditions
-flux_to_map -td /opt/COBRAxy \
-  -idsf "cond1_flux.tsv cond2_flux.tsv cond3_flux.tsv" \
-  -naf "Control Treatment1 Treatment2" \
-  -co onevsrest \
-  -te wilcoxon \
-  -adj true \
-  -pv 0.01 \
-  -fc 1.8 \
-  -mc ENGRO2 \
-  -colorm viridis \
-  -idop multi_condition_flux/
-```
-
-### Custom Map Visualization
-
-```bash
-# Use tissue-specific custom map
-flux_to_map -td /opt/COBRAxy \
-  -idf liver_flux_data.tsv \
-  -icf liver_conditions.tsv \
-  -co manyvsmany \
-  -te ttest_ind \
-  -pv 0.05 \
-  -fc 1.5 \
-  -cm maps/liver_specific_map.svg \
-  -gs true \
-  -gp true \
-  -idop liver_flux_analysis/ \
-  -ol liver_analysis.log
-```
-
-### High-Throughput Analysis
-
-```bash
-# Process multiple datasets with stringent criteria
-flux_to_map -td /opt/COBRAxy \
-  -idsf "exp1.tsv exp2.tsv exp3.tsv exp4.tsv" \
-  -naf "Exp1 Exp2 Exp3 Exp4" \
-  -co manyvsmany \
-  -te ks \
-  -adj true \
-  -pv 0.001 \
-  -fc 3.0 \
-  -mc HMRcore \
-  -colorm jet \
-  -gs true \
-  -gp true \
-  -idop high_throughput_flux/
+flux_to_map -input_data fluxes.csv \
+  -input_class classes.csv \
+  -choice_map ENGRO2 \
+  -comparison manyvsmany \
+  -pvalue 0.05 \
+  -idop results/
 ```
 
-## Quality Control
-
-### Data Validation
-
-#### Pre-analysis Checks
-- Verify flux value distributions (check for outliers)
-- Ensure sample names match between data and class files
-- Validate reaction coverage across samples
-- Check for missing values and their patterns
-
-#### Statistical Validation
-- Assess normality assumptions for parametric tests
-- Verify adequate sample sizes per group (n≥3 recommended)
-- Check variance homogeneity between groups
-- Evaluate multiple testing burden
-
-### Result Interpretation
-
-#### Biological Validation
-- Compare results with known pathway activities
-- Check for pathway coherence (related reactions should cluster)
-- Validate against literature or experimental evidence
-- Assess metabolic network connectivity
-
-#### Technical Validation
-- Compare results across different statistical tests
-- Check sensitivity to parameter changes
-- Validate fold change calculations
-- Verify map element correspondence
-
-## Tips and Best Practices
-
-### Data Preparation
-- **Normalization**: Ensure consistent flux units across samples
-- **Filtering**: Remove reactions with excessive missing values (>50%)
-- **Outlier Detection**: Identify and handle extreme flux values
-- **Batch Effects**: Account for technical variation between experiments
-
-### Statistical Considerations
-- Use FDR correction for multiple comparisons (`-adj true`)
-- Choose appropriate statistical tests based on data distribution
-- Consider effect size (fold change) alongside significance
-- Validate results with independent datasets when possible
-
-### Visualization Optimization
-- Select appropriate color maps for your audience
-- Use high fold change thresholds (>2.0) for cleaner maps
-- Export both SVG (editable) and PDF (publication) formats
-- Include comprehensive legends and annotations
-
-### Performance Tips
-- Use HMRcore for faster processing and clearer visualizations
-- Reduce dataset size for initial exploratory analysis
-- Process large datasets in batches if memory constrained
-- Cache intermediate results for parameter optimization
-
-## Integration Workflow
-
-### Upstream Tools
-- [Flux Simulation](flux-simulation.md) - Generate flux distributions for comparison
-- [MAREA](marea.md) - Alternative analysis pathway for RAS/RPS data
-
-### Downstream Analysis
-- Export results to statistical software (R, Python) for advanced analysis
-- Integrate with pathway databases (KEGG, Reactome)
-- Combine with other omics data for systems-level insights
-
-### Typical Pipeline
+### With Custom Map
 
 ```bash
-# 1. Generate flux samples from constrained models
-flux_simulation -td /opt/COBRAxy -ms ENGRO2 -in bounds/*.tsv \
-  -ni Sample1,Sample2,Control1,Control2 -a CBS \
-  -ot mean -idop fluxes/
-
-# 2. Analyze and visualize flux differences
-flux_to_map -td /opt/COBRAxy -idf fluxes/mean.csv \
-  -icf sample_groups.tsv -co manyvsmany -te ks \
-  -mc HMRcore -gs true -gp true -idop flux_maps/
-
-# 3. Further analysis with custom scripts
-python analyze_flux_results.py -i flux_maps/ -o final_results/
+flux_to_map -input_data fluxes.csv \
+  -input_class classes.csv \
+  -choice_map Custom \
+  -custom_map pathway.svg \
+  -comparison onevsrest \
+  -test_type wilcoxon \
+  -idop results/
 ```
 
 ## Troubleshooting
 
-### Common Issues
-
-**No significant reactions found**
-- Lower p-value threshold (`-pv 0.2`)
-- Reduce fold change requirement (`-fc 1.2`)
-- Check sample group definitions and sizes
-- Verify flux data quality and normalization
-
-**Map rendering problems**
-- Check SVG map file integrity and format
-- Verify reaction ID matching between data and map
-- Ensure sufficient system memory for large maps
-- Validate XML structure of custom maps
-
-**Statistical test failures**
-- Check data distribution assumptions
-- Verify sufficient sample sizes per group
-- Consider alternative non-parametric tests
-- Examine variance patterns between groups
-
-### Error Messages
-
-| Error | Cause | Solution |
-|-------|-------|----------|
-| "Map file not found" | Missing/invalid map path | Check file location and format |
-| "No matching reactions" | ID mismatch between data and map | Verify reaction naming consistency |
-| "Insufficient data" | Too few samples per group | Increase sample sizes or merge groups |
-| "Memory allocation failed" | Large dataset/map combination | Reduce data size or increase system memory |
-
-### Performance Issues
-
-**Slow processing**
-- Use HMRcore instead of ENGRO2 for faster rendering
-- Reduce dataset size for testing
-- Process subsets of reactions separately
-- Monitor system resource usage
-
-**Large output files**
-- Use compressed formats when possible
-- Reduce map resolution for preliminary analysis
-- Export only essential output formats
-- Clean temporary files regularly
-
-## Advanced Usage
-
-### Custom Statistical Functions
-
-Advanced users can implement custom statistical tests by modifying the analysis functions:
-
-```python
-def custom_test(group1, group2):
-    # Custom statistical test implementation
-    statistic, pvalue = your_test_function(group1, group2)
-    return statistic, pvalue
-```
-
-### Batch Processing Script
-
-Process multiple experiments systematically:
-
-```bash
-#!/bin/bash
-experiments=("exp1" "exp2" "exp3" "exp4")
-for exp in "${experiments[@]}"; do
-    flux_to_map -td /opt/COBRAxy \
-        -idf "data/${exp}_flux.tsv" \
-        -icf "data/${exp}_classes.tsv" \
-        -co manyvsmany -te ks -pv 0.05 \
-        -mc HMRcore -gs true -gp true \
-        -idop "results/${exp}/"
-done
-```
-
-### Result Aggregation
-
-Combine results across multiple analyses:
-
-```bash
-# Merge significant reactions across experiments
-python merge_flux_results.py \
-  -i results/exp*/significant_fluxes.tsv \
-  -o combined_significant_reactions.tsv \
-  --method intersection
-```
+| Error | Solution |
+|-------|----------|
+| "No matching reactions" | Verify reaction ID consistency |
+| "Insufficient data" | Increase sample sizes |
 
 ## See Also
 
-- [Flux Simulation](flux-simulation.md) - Generate input flux distributions
-- [MAREA](marea.md) - Alternative pathway analysis approach
-- [Custom Map Creation Guide](/tutorials/custom-map-creation.md)
-- [Statistical Methods Reference](/tutorials/statistical-methods.md)
\ No newline at end of file
+- [MAREA](tools/marea)
+- [Flux Simulation](tools/flux-simulation)
+- [Built-in Models](reference/built-in-models)
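
Reviewer note: the "File Format Notes" added in this revision (sample names must match between the flux data and the class file, class names should not contain spaces) could be checked before running the tool. The sketch below is only an illustration and is not part of COBRAxy. `read_table` and `check_inputs` are hypothetical helper names. It assumes that samples are the columns after the first one in the flux file, and that the class file has a header row followed by two columns (sample ID, class).

```python
import csv
import io


def read_table(text):
    """Parse TSV or CSV text into rows; the delimiter is guessed from the header line."""
    header = text.splitlines()[0]
    delimiter = "\t" if "\t" in header else ","
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


def check_inputs(flux_text, class_text):
    """Return a list of problems found; an empty list means all checks passed."""
    flux_rows = read_table(flux_text)
    class_rows = read_table(class_text)

    # Assumption: the first flux column holds reaction IDs, the rest are samples.
    flux_samples = set(flux_rows[0][1:])
    # Assumption: the class file is "sample ID, class" with one header row.
    class_of = {row[0]: row[1] for row in class_rows[1:] if len(row) >= 2}
    class_samples = set(class_of)

    problems = []
    for sample in sorted(flux_samples - class_samples):
        problems.append(f"sample {sample!r} has no class assigned")
    for sample in sorted(class_samples - flux_samples):
        problems.append(f"sample {sample!r} is missing from the flux data")
    for name in sorted(set(class_of.values())):
        if " " in name:
            problems.append(f"class name {name!r} contains a space")
    return problems


if __name__ == "__main__":
    # Inputs mirror the examples in the new documentation, written as TSV.
    flux = "Reaction\tSample1\tSample2\tSample3\nR00001\t12.5\t8.5\t14.2\n"
    classes = "SampleID\tClass\nSample1\tControl\nSample2\tTreatment\nSample3\tTreatment\n"
    print(check_inputs(flux, classes))
```

Returning a list of messages rather than raising on the first mismatch lets every problem in a pair of files be reported in one pass.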
