diff COBRAxy/docs/tools/marea.md @ 547:73f2f7e2be17 draft
Uploaded
| author | francesco_lapi |
|---|---|
| date | Tue, 28 Oct 2025 10:44:07 +0000 |
| parents | fcdbc81feb45 |
| children |
--- a/COBRAxy/docs/tools/marea.md	Mon Oct 27 12:33:08 2025 +0000
+++ b/COBRAxy/docs/tools/marea.md	Tue Oct 28 10:44:07 2025 +0000
@@ -1,279 +1,151 @@
 # MAREA
 
-Metabolic Reaction Enrichment Analysis for statistical comparison and map visualization.
+Metabolic Enrichment Analysis and Visualization.
 
 ## Overview
 
-MAREA performs statistical enrichment analysis on RAS/RPS data to identify significantly different metabolic reactions between sample groups. It generates enriched pathway maps with color-coded reactions showing statistical significance and fold changes.
+MAREA performs statistical comparison of metabolic scores (RAS/RPS) and visualizes results on pathway maps.
+
+## Galaxy Interface
+
+In Galaxy: **COBRAxy → Metabolic Reaction Enrichment Analysis**
 
-## Usage
+1. Upload RAS/RPS scores and sample class file
+2. Select map and configure statistical parameters
+3. Click **Run tool**
 
-### Command Line
+## Command-line console
 
 ```bash
-marea -td /path/to/COBRAxy \
-      -using_RAS true \
-      -input_data ras_data.tsv \
-      -input_class class_labels.tsv \
+marea -input_data scores.tsv \
+      -input_class classes.csv \
+      -choice_map ENGRO2 \
       -comparison manyvsmany \
-      -test ks \
-      -pv 0.05 \
-      -fc 1.5 \
-      -choice_map ENGRO2 \
-      -generate_svg true \
-      -idop results/
+      -pvalue 0.05 \
+      -idop output/
 ```
 
-### Galaxy Interface
-
-Select "MAREA" from the COBRAxy tool suite and configure analysis parameters through the web interface.
-
 ## Parameters
 
-### Required Parameters
-
-| Parameter | Flag | Description |
-|-----------|------|-------------|
-| Tool Directory | `-td, --tool_dir` | Path to COBRAxy installation directory |
-
-### Data Input Parameters
-
-| Parameter | Flag | Description | Default |
-|-----------|------|-------------|---------|
-| Use RAS | `-using_RAS, --using_RAS` | Include RAS analysis | true |
-| RAS Data | `-input_data, --input_data` | RAS scores TSV file | - |
-| RAS Classes | `-input_class, --input_class` | Sample group labels | - |
-| Multiple RAS | `-input_datas, --input_datas` | Multiple RAS files (space-separated) | - |
-| RAS Names | `-names, --names` | Names for multiple datasets | - |
-| Use RPS | `-using_RPS, --using_RPS` | Include RPS analysis | false |
-| RPS Data | `-input_data_rps, --input_data_rps` | RPS scores TSV file | - |
-| RPS Classes | `-input_class_rps, --input_class_rps` | RPS sample groups | - |
-
-### Statistical Parameters
-
 | Parameter | Flag | Description | Default |
 |-----------|------|-------------|---------|
-| Comparison Type | `-co, --comparison` | Statistical comparison mode | manyvsmany |
-| Statistical Test | `-te, --test` | Statistical test method | ks |
-| P-Value Threshold | `-pv, --pValue` | Significance threshold | 0.1 |
-| Adjusted P-values | `-adj, --adjusted` | Apply FDR correction | false |
-| Fold Change | `-fc, --fChange` | Minimum fold change | 1.5 |
-| Net Enrichment | `-ne, --net` | Use net enrichment for RPS | false |
-| Analysis Option | `-op, --option` | Analysis mode | datasets |
-
-### Visualization Parameters
-
-| Parameter | Flag | Description | Default |
-|-----------|------|-------------|---------|
-| Map Choice | `-choice_map, --choice_map` | Built-in metabolic map | - |
-| Custom Map | `-custom_map, --custom_map` | Path to custom SVG map | - |
-| Generate SVG | `-gs, --generate_svg` | Create SVG output | true |
-| Generate PDF | `-gp, --generate_pdf` | Create PDF output | false |
-| Generate PNG | `-gpng, --generate_png` | Create PNG output | false |
-| Color Map | `-colorm, --color_map` | Color scheme (jet/viridis) | jet |
-| Output Directory | `-idop, --output_path` | Results directory | result/ |
-
-### Advanced Parameters
-
-| Parameter | Flag | Description | Default |
-|-----------|------|-------------|---------|
-| Output Log | `-ol, --out_log` | Log file path | - |
-| Control Sample | `-on, --control` | Control group identifier | - |
+| Input Data | `-input_data` | RAS/RPS scores file | - |
+| Input Class | `-input_class` | Sample class definitions | - |
+| Map Choice | `-choice_map` | ENGRO2, Recon, or Custom | ENGRO2 |
+| Custom Map | `-custom_map` | Path to custom SVG map | - |
+| Comparison | `-comparison` | manyvsmany, onevsrest, onevsmany | manyvsmany |
+| P-value | `-pvalue` | Significance threshold | 0.05 |
+| FDR Correction | `-fdr` | Apply FDR correction | true |
+| Test Type | `-test_type` | t, wilcoxon, ks, DESeq | t |
+| Net RPS | `--net` | Use net contribution for reversible reactions (RPS only) | false |
+| Output Path | `-idop` | Output directory | marea/ |
 
 ## Input Formats
 
-### RAS/RPS Data File
-
-Tab-separated format with reactions as rows and samples as columns:
+### Metabolic Scores
 
 ```
-Reaction Sample1 Sample2 Sample3 Sample4
-R00001 1.25 0.85 1.42 0.78
-R00002 0.65 1.35 0.72 1.28
-R00003 2.15 2.05 0.45 0.52
+Reaction Sample1 Sample2 Sample3
+R00001 1.25 0.85 1.42
+R00002 0.65 1.35 0.72
 ```
 
-### Class Labels File
-
-Sample grouping information:
+### Sample Classes
 
 ```
-Sample Class
+SampleID Class
 Sample1 Control
 Sample2 Treatment
-Sample3 Control
-Sample4 Treatment
+Sample3 Treatment
 ```
 
-## Comparison Types
-
-### manyvsmany
-Compare all possible pairs of groups:
-- Group A vs Group B
-- Group A vs Group C
-- Group B vs Group C
-
-### onevsrest
-Compare each group against all others combined:
-- Group A vs (Group B + Group C)
-- Group B vs (Group A + Group C)
-
-### onevsmany
-Compare one specific group against each other group separately:
-- Control vs Treatment1
-- Control vs Treatment2
+**File Format Notes:**
+- Use **tab-separated** values (TSV) or **comma-separated** (CSV)
+- First row must contain column headers
+- Sample names must match between scores and class file
+- Class names should not contain spaces
 
 ## Statistical Tests
 
-| Test | Description | Use Case |
-|------|-------------|----------|
-| `ks` | Kolmogorov-Smirnov | Non-parametric, distribution-free |
-| `ttest_p` | Paired t-test | Related samples |
-| `ttest_ind` | Independent t-test | Unrelated samples |
-| `wilcoxon` | Wilcoxon signed-rank | Non-parametric paired |
-| `mw` | Mann-Whitney U | Non-parametric independent |
-| `DESeq` | DESeq2-style analysis | Count-like data with dispersion |
+- **t**: Student's t-test (parametric, assumes normality)
+- **wilcoxon**: Wilcoxon/Mann-Whitney (non-parametric)
+- **ks**: Kolmogorov-Smirnov (distribution-free)
+- **DESeq**: DESeq2-like test (**RAS only**, requires ≥2 replicates per sample)
+
+## Comparison Types
 
-## Output Files
+- **manyvsmany**: All pairwise comparisons
+- **onevsrest**: Each class vs all others
+- **onevsmany**: One reference vs multiple classes
+
+## Advanced Options
+
+### Net RPS Values
 
-### Statistical Results
-- `comparison_stats.tsv`: P-values, fold changes, and test statistics
-- `enriched_reactions.tsv`: Significantly enriched reactions only
-- `comparison_summary.txt`: Analysis summary and parameters
+When analyzing RPS data with reversible reactions, the `--net` parameter controls arrow coloring:
+
+**When `--net false` (default):**
+- Each direction of a reversible reaction colored independently
+- Forward and backward contributions shown separately
 
-### Visualization
-- `pathway_map.svg`: Color-coded metabolic map
-- `pathway_map.pdf`: PDF version (if requested)
-- `pathway_map.png`: PNG version (if requested)
-- `legend.svg`: Color scale and significance indicators
+**When `--net true` (RPS only):**
+- Arrow tips colored with net contribution of both directions
+- Useful for visualizing overall metabolite flow direction
+
+**Note**: This option only applies to RPS analysis and affects visualization of reversible reactions on metabolic maps.
+
+## Output
+
+- `*_map.svg`: Annotated pathway maps
+- `comparison_results.tsv`: Statistical results
+- `*.log`: Processing log
 
 ## Examples
 
-### Basic RAS Analysis
+### Basic Analysis
 
 ```bash
-# Simple two-group comparison
-marea -td /opt/COBRAxy \
-      -using_RAS true \
-      -input_data ras_scores.tsv \
-      -input_class sample_groups.tsv \
+marea -input_data ras_scores.tsv \
+      -input_class classes.csv \
+      -choice_map ENGRO2 \
       -comparison manyvsmany \
-      -test ks \
-      -pv 0.05 \
-      -choice_map ENGRO2 \
+      -pvalue 0.05 \
       -idop results/
 ```
 
-### Combined RAS + RPS Analysis
+### Custom Map
 
 ```bash
-# Multi-modal analysis
-marea -td /opt/COBRAxy \
-      -using_RAS true \
-      -input_data ras_scores.tsv \
-      -input_class ras_groups.tsv \
-      -using_RPS true \
-      -input_data_rps rps_scores.tsv \
-      -input_class_rps rps_groups.tsv \
+marea -input_data rps_scores.tsv \
+      -input_class classes.csv \
+      -choice_map Custom \
+      -custom_map pathway.svg \
       -comparison onevsrest \
-      -test DESeq \
-      -adj true \
-      -fc 2.0 \
-      -choice_map HMRcore \
-      -generate_pdf true \
-      -idop multimodal_results/
-```
-
-### Multiple Dataset Analysis
-
-```bash
-# Compare multiple experiments
-marea -td /opt/COBRAxy \
-      -using_RAS true \
-      -input_datas exp1_ras.tsv exp2_ras.tsv exp3_ras.tsv \
-      -names "Experiment1" "Experiment2" "Experiment3" \
-      -comparison onevsmany \
-      -test wilcoxon \
-      -pv 0.01 \
-      -custom_map custom_pathway.svg \
-      -idop multi_experiment/
+      -idop results/
 ```
 
-## Map Visualization
-
-### Built-in Maps
-- **ENGRO2**: Human genome-scale reconstruction
-- **HMRcore**: Core human metabolic network
-- **Recon**: Recon3D human model
-
-### Color Coding
-- **Red**: Significantly upregulated (high activity)
-- **Blue**: Significantly downregulated (low activity)
-- **Gray**: Not significant
-- **Line Width**: Proportional to fold change magnitude
-
-### Custom Maps
-SVG files with reaction elements having IDs matching your data:
-```xml
-<rect id="R00001" class="reaction" .../>
-<path id="R00002" class="reaction" .../>
-```
-
-## Quality Control
+### Non-parametric Test
 
-### Pre-analysis Checks
-- Verify sample names match between data and class files
-- Check for missing values and outliers
-- Ensure adequate sample sizes per group (n ≥ 3 recommended)
-
-### Post-analysis Validation
-- Review statistical distribution assumptions
-- Check multiple testing correction effects
-- Validate biological relevance of enriched pathways
-
-## Tips and Best Practices
-
-### Statistical Considerations
-- Use FDR correction (`-adj true`) for multiple comparisons
-- Choose appropriate tests based on data distribution
-- Consider effect size alongside significance
-
-### Visualization Optimization
-- Use high fold change thresholds (>2.0) for cleaner maps
-- Export both SVG (editable) and PDF (publication-ready) formats
-- Adjust color schemes for colorblind accessibility
+```bash
+marea -input_data scores.tsv \
+      -input_class classes.csv \
+      -choice_map ENGRO2 \
+      -test_type wilcoxon \
+      -pvalue 0.01 \
+      -fdr true \
+      -idop results/
+```
 
 ## Troubleshooting
 
-### Common Issues
-
-**No significant reactions found**
-- Lower p-value threshold (`-pv 0.1`)
-- Reduce fold change requirement (`-fc 1.2`)
-- Check sample group definitions
-
-**Map rendering errors**
-- Verify SVG map file integrity
-- Check reaction ID matching between data and map
-- Ensure sufficient system memory for large maps
-
-**Statistical test failures**
-- Verify data normality for parametric tests
-- Check for sufficient sample sizes
-- Consider alternative test methods
-
-## Integration
-
-### Upstream Tools
-- [RAS Generator](ras-generator.md) - Generate RAS scores
-- [RPS Generator](rps-generator.md) - Generate RPS scores
-
-### Downstream Analysis
-- [Flux to Map](flux-to-map.md) - Additional visualization options
-- [MAREA Cluster](marea-cluster.md) - Sample clustering analysis
+| Error | Solution |
+|-------|----------|
+| "No matching reactions" | Verify reaction IDs |
+| "Insufficient samples" | Increase sample sizes per group |
 
 ## See Also
 
-- [Statistical Tests Documentation](/tutorials/statistical-tests.md)
-- [Map Customization Guide](/tutorials/custom-maps.md)
-- [Multi-modal Analysis Tutorial](/tutorials/multimodal-analysis.md)
\ No newline at end of file
+- [RAS Generator](tools/ras-generator)
+- [RPS Generator](tools/rps-generator)
+- [Flux to Map](tools/flux-to-map)
+- [Built-in Models](reference/built-in-models)
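
Note on the new File Format Notes: the added text requires that sample names match between the scores file and the class file. A pre-flight check for that rule could look like the Python sketch below. It is not part of COBRAxy. The function name `find_sample_mismatches` is hypothetical, and the sketch assumes tab-separated input with sample names in the header row of the scores file and in the first column of the class file, as in the examples in the diff.

```python
import csv
import io


def find_sample_mismatches(scores_text, classes_text, delimiter="\t"):
    """Return (samples only in scores, samples only in classes), both sorted.

    Hypothetical helper, not part of COBRAxy. Assumes the scores header is
    'Reaction<TAB>Sample1<TAB>...' and that the class file has one header
    row followed by 'SampleID<TAB>Class' rows.
    """
    scores_reader = csv.reader(io.StringIO(scores_text), delimiter=delimiter)
    header = next(scores_reader)
    score_samples = set(header[1:])  # skip the reaction-ID column label

    class_reader = csv.reader(io.StringIO(classes_text), delimiter=delimiter)
    next(class_reader)  # skip the header row of the class file
    class_samples = {row[0] for row in class_reader if row}  # ignore blank lines

    return (
        sorted(score_samples - class_samples),
        sorted(class_samples - score_samples),
    )


if __name__ == "__main__":
    # Example data mirroring the Metabolic Scores and Sample Classes blocks.
    scores = "Reaction\tSample1\tSample2\tSample3\nR00001\t1.25\t0.85\t1.42\n"
    classes = (
        "SampleID\tClass\n"
        "Sample1\tControl\n"
        "Sample2\tTreatment\n"
        "Sample3\tTreatment\n"
    )
    only_in_scores, only_in_classes = find_sample_mismatches(scores, classes)
    print("Only in scores:", only_in_scores)
    print("Only in classes:", only_in_classes)
```

Two empty lists mean the files agree. For comma-separated files, pass `delimiter=","`.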
