# Getting Started

Welcome to COBRAxy! This guide will help you get up and running with metabolic flux analysis.

## What is COBRAxy?

COBRAxy is a comprehensive toolkit for metabolic flux analysis that bridges the gap between omics data and biological insights. It provides:

- **Data Integration**: Combine gene expression and metabolite data
- **Metabolic Modeling**: Use constraint-based models for flux analysis
- **Visualization**: Generate interactive pathway maps
- **Statistical Analysis**: Perform enrichment and sensitivity analysis

## Core Concepts

### Reaction Activity Scores (RAS)
RAS quantify how active metabolic reactions are based on gene expression data. COBRAxy computes RAS by:
1. Mapping genes to reactions via GPR (Gene-Protein-Reaction) rules
2. Evaluating each rule's logical operators: AND joins the subunits of an enzyme complex, OR joins alternative isoenzymes
3. Producing activity scores for each reaction in each sample
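
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not COBRAxy's implementation: it assumes the common convention that AND takes the minimum expression of its operands and OR takes their sum, and the function name `evaluate_gpr` is hypothetical.

```python
import re

def evaluate_gpr(rule, expression):
    """Score one GPR rule. Assumed convention: AND -> min, OR -> sum."""
    # Split the rule into parentheses, operators, and gene identifiers.
    tokens = re.findall(r"\(|\)|[^\s()]+", rule)
    pos = 0

    def parse_or():
        nonlocal pos
        value = parse_and()
        while pos < len(tokens) and tokens[pos].lower() == "or":
            pos += 1
            value += parse_and()  # isoenzymes add up
        return value

    def parse_and():
        nonlocal pos
        value = parse_atom()
        while pos < len(tokens) and tokens[pos].lower() == "and":
            pos += 1
            value = min(value, parse_atom())  # a complex is limited by its scarcest subunit
        return value

    def parse_atom():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = parse_or()
            pos += 1  # skip the closing ")"
            return value
        return expression[token]  # raises KeyError for an unmeasured gene

    return parse_or()

# Expression values for one sample, keyed by gene identifier.
expression = {"HGNC:5": 12.5, "HGNC:10": 3.2, "HGNC:15": 7.9}
score = evaluate_gpr("(HGNC:5 and HGNC:10) or HGNC:15", expression)
print(round(score, 3))
```

Here the complex `HGNC:5 and HGNC:10` is limited by its least expressed subunit, and the isoenzyme `HGNC:15` adds its own contribution. Repeating the call for every reaction and every sample yields a reactions × samples score table.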

### Reaction Propensity Scores (RPS)
RPS estimate how strongly measured metabolite abundances favor each reaction. COBRAxy computes RPS by:
1. Mapping metabolites to reactions as substrates/products
2. Weighting by stoichiometry and frequency
3. Computing propensity scores using log-normalized formulas
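
To make the weighting concrete, here is one possible reading of these steps in Python. The exact formula COBRAxy uses is not given in this guide, so everything below is an assumption for illustration: each substrate contributes the log of its abundance divided by its frequency (the number of reactions that consume it), scaled by its stoichiometric coefficient. The function name `reaction_propensity` and the frequency values are hypothetical.

```python
import math

def reaction_propensity(substrates, abundance, frequency):
    """Illustrative propensity score (assumed formula, not COBRAxy's).

    substrates: {metabolite: stoichiometric coefficient}
    abundance:  {metabolite: measured abundance in one sample}
    frequency:  {metabolite: number of reactions consuming it}
    """
    score = 0.0
    for metabolite, stoich in substrates.items():
        # Abundant substrates raise the score; widely shared ones are discounted.
        score += stoich * math.log(abundance[metabolite] / frequency[metabolite])
    return score

abundance = {"glucose": 100.5, "pyruvate": 45.2}
frequency = {"glucose": 2, "pyruvate": 4}  # hypothetical network counts

rps_single = reaction_propensity({"pyruvate": 1}, abundance, frequency)
rps_pair = reaction_propensity({"glucose": 1, "pyruvate": 2}, abundance, frequency)
print(rps_single, rps_pair)
```

Working in log space keeps the score additive across substrates and stops a single very abundant metabolite from dominating.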

### Flux Sampling
Sample feasible flux distributions using:
- **CBS (Corner-Based Sampling)**: Samples corners (vertices) of the feasible flux region
- **OptGP (Optimized General Parallel sampler)**: Parallel hit-and-run sampling across the feasible flux region

## Analysis Workflows

COBRAxy supports two main analysis paths:

### 1. Enrichment Analysis Workflow
```text
# Generate activity scores
ras_generator → RAS values
rps_generator → RPS values

# Statistical enrichment analysis
marea → Enriched pathway maps
```

**Use when**: You want to identify significantly altered pathways and create publication-ready maps.

### 2. Flux Simulation Workflow
```text
# Apply constraints to model
ras_generator → RAS values
ras_to_bounds → Constrained model

# Sample flux distributions
flux_simulation → Flux samples
flux_to_map → Final visualizations
```

**Use when**: You want to predict metabolic flux distributions and study network-wide changes.

## Your First Analysis

Let's run a basic analysis with sample data:

### Step 1: Prepare Your Data

You'll need:
- **Gene expression data**: TSV file with genes (rows) × samples (columns)
- **Metabolic model**: SBML file or use built-in models (ENGRO2, Recon)
- **Metabolite data** (optional): TSV file with metabolites (rows) × samples (columns)

### Step 2: Generate Activity Scores

```bash
# Generate RAS from expression data
# (quoting "$(pwd)" keeps paths with spaces intact)
ras_generator -td "$(pwd)" \
  -in expression_data.tsv \
  -ra ras_output.tsv \
  -rs ENGRO2
```

### Step 3: Create Pathway Maps

```bash
# Generate enriched pathway maps from the RAS values produced in Step 2
marea -td "$(pwd)" \
  -using_RAS true \
  -input_data ras_output.tsv \
  -choice_map ENGRO2 \
  -gs true \
  -idop pathway_maps
```

### Step 4: View Results

Your analysis will generate:
- **RAS values**: `ras_output.tsv` - Activity scores for each reaction
- **Statistical maps**: `pathway_maps/` - SVG files with enrichment visualization
- **Log files**: Detailed execution logs for troubleshooting

## Built-in Models

COBRAxy includes ready-to-use metabolic models:

| Model | Organism | Reactions | Genes | Description |
|-------|----------|-----------|-------|-------------|
| **ENGRO2** | Human | ~2,000 | ~500 | Focused human metabolism model |
| **Recon** | Human | ~10,000 | ~2,000 | Comprehensive human metabolism |

Models are stored in the `local/` directory and include:
- SBML files
- GPR rules
- Gene mapping tables
- Pathway templates

## Data Formats

### Gene Expression Format
```tsv
Gene_ID	Sample_1	Sample_2	Sample_3
HGNC:5	12.5	8.3	15.7
HGNC:10	3.2	4.1	2.8
HGNC:15	7.9	11.2	6.4
```
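
Before running the tools, it can help to check that a file really follows this layout. The sketch below uses only the Python standard library; the helper name `read_expression` is hypothetical and is not part of COBRAxy.

```python
import csv
import io

def read_expression(handle):
    """Parse a genes x samples TSV into (sample names, {gene: {sample: value}})."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]  # first column holds the gene identifiers
    table = {}
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        gene, values = row[0], row[1:]
        if len(values) != len(samples):
            raise ValueError(f"{gene}: expected {len(samples)} values, got {len(values)}")
        # float() raises ValueError on non-numeric cells, flagging malformed input.
        table[gene] = dict(zip(samples, map(float, values)))
    return samples, table

# The example file from above, held in memory for demonstration.
example = (
    "Gene_ID\tSample_1\tSample_2\tSample_3\n"
    "HGNC:5\t12.5\t8.3\t15.7\n"
    "HGNC:10\t3.2\t4.1\t2.8\n"
    "HGNC:15\t7.9\t11.2\t6.4\n"
)
samples, table = read_expression(io.StringIO(example))
print(samples, len(table))
```

For a file on disk, pass `open("expression_data.tsv", newline="")` instead of the `io.StringIO` object.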

### Metabolite Format
```tsv
Metabolite_ID	Sample_1	Sample_2	Sample_3
glucose	100.5	85.3	120.7
pyruvate	45.2	38.1	52.8
lactate	23.9	41.2	19.4
```

## Command Line vs Python API

COBRAxy offers two usage modes:

### Command Line (Quick Analysis)
```bash
# Simple command-line execution
ras_generator -td "$(pwd)" -in data.tsv -ra output.tsv -rs ENGRO2
```

### Python API (Programming)
```python
import ras_generator

# Call the main function with the same arguments as the command line,
# passed as a list of strings
ras_generator.main(['-td', '/path', '-in', 'data.tsv', '-ra', 'output.tsv', '-rs', 'ENGRO2'])
```
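
A third option is to drive the command-line tools from a script with the standard `subprocess` module. This is a sketch under assumptions: the flags are copied from the commands shown in this guide, the helper names `build_commands` and `run_pipeline` are hypothetical, and `run_pipeline` only works if `ras_generator` and `marea` are on your `PATH`.

```python
import shlex
import subprocess

def build_commands(workdir, expression, ras_out, maps_dir, model="ENGRO2"):
    """Build the argument lists for the RAS and pathway-map steps."""
    ras_cmd = ["ras_generator", "-td", workdir,
               "-in", expression, "-ra", ras_out, "-rs", model]
    marea_cmd = ["marea", "-td", workdir, "-using_RAS", "true",
                 "-input_data", ras_out, "-choice_map", model,
                 "-gs", "true", "-idop", maps_dir]
    return [ras_cmd, marea_cmd]

def run_pipeline(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)  # raises CalledProcessError on a non-zero exit

commands = build_commands("/path", "expression_data.tsv", "ras_output.tsv", "pathway_maps")
for cmd in commands:
    print(shlex.join(cmd))  # preview the exact shell command for each step
```

Passing the arguments as a list avoids shell quoting problems, and `check=True` keeps the map step from running on the output of a failed RAS step.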
 | 
| 
 | 
   151 
 | 
| 
 | 
   152 ## Next Steps
 | 
| 
 | 
   153 
 | 
| 
 | 
   154 Now that you understand the basics:
 | 
| 
 | 
   155 
 | 
| 
 | 
   156 1. **[Quick Start Guide](quickstart.md)** - Complete walkthrough with example data
 | 
| 
 | 
   157 2. **[Python API Tutorial](tutorials/python-api.md)** - Learn programmatic usage
 | 
| 
 | 
   158 3. **[Tools Reference](tools/)** - Detailed documentation for each tool
 | 
| 
 | 
   159 4. **[Examples](examples/)** - Real-world analysis examples
 | 
| 
 | 
   160 
 | 
| 
 | 
   161 ## Need Help?
 | 
| 
 | 
   162 
 | 
| 
 | 
   163 - **[Troubleshooting](troubleshooting.md)** - Common issues and solutions
 | 
| 
 | 
   164 - **[GitHub Issues](https://github.com/CompBtBs/COBRAxy/issues)** - Report bugs or ask questions
 | 
| 
 | 
   165 - **[Contributing](contributing.md)** - Help improve COBRAxy |