comparison COBRAxy/docs/troubleshooting.md @ 550:4cf00f21f609 draft default tip
Uploaded
| author | francesco_lapi |
|---|---|
| date | Mon, 03 Nov 2025 14:49:49 +0000 |
| parents | 73f2f7e2be17 |
| children |
| 549:4c5fdcefce8e | 550:4cf00f21f609 |
|---|---|
| 64 | 64 |
| 65 # Windows (using conda) | 65 # Windows (using conda) |
| 66 conda install -c conda-forge glpk swiglpk | 66 conda install -c conda-forge glpk swiglpk |
| 67 ``` | 67 ``` |
| 68 | 68 |
| 69 **Problem**: SVG processing errors | 69 |
| 70 ## Galaxy Tool Issues | |
| 71 | |
| 72 ### Import Metabolic Model | |
| 73 | |
| 74 **Error message**: | |
| 70 ```bash | 75 ```bash |
| 71 # Install libvips for image processing | 76 Traceback (most recent call last): |
| 72 # Ubuntu/Debian: sudo apt-get install libvips | 77 File "/export/tool_deps/_conda/envs/mulled-v1-d3fef6bda7daedb89425f527672b54ab0a4be6cfe3c8725b7f8c0948e0c80773/lib/python3.11/site-packages/cobra/io/sbml.py", line 458, in read_sbml_model |
| 73 # macOS: brew install vips | 78 return _sbml_to_model(doc, number=number, f_replace=f_replace, **kwargs) |
| 79 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | |
| 80 File "/export/tool_deps/_conda/envs/mulled-v1-d3fef6bda7daedb89425f527672b54ab0a4be6cfe3c8725b7f8c0948e0c80773/lib/python3.11/site-packages/cobra/io/sbml.py", line 563, in _sbml_to_model | |
| 81 raise CobraSBMLError("No SBML model detected in file.") | |
| 82 cobra.io.sbml.CobraSBMLError: No SBML model detected in file. | |
| 74 ``` | 83 ``` |
| 75 | 84 |
| 76 ## Data Format Issues | 85 **Meaning:** |
| 86 The Import Metabolic Model tool cannot read the input file as a valid SBML model with FBC annotations. | |
| 77 | 87 |
| 78 ### Gene Expression Problems | 88 **Suggested Action:** |
| 89 Verify that the input XML file is in proper SBML format and includes all necessary FBC annotations. | |
| 79 | 90 |
| 80 **Problem**: "No computable scores" error | 91 |
| 81 ``` | 92 ### Flux simulation |
| 82 Cause: Gene IDs don't match between data and model | 93 |
| 83 Solution: | 94 **Error message**: |
| 84 1. Check gene ID format (HGNC vs symbols vs Ensembl) | 95 ```bash |
| 85 2. Verify first column contains gene identifiers | 96 Execution aborted: wrong format of bounds dataset |
| 86 3. Ensure tab-separated format | |
| 87 4. Try different built-in model | |
| 88 ``` | 97 ``` |
| 89 | 98 |
| 90 **Problem**: Many "gene not found" warnings | 99 **Meaning:** |
| 91 ```python | 100 Flux simulation cannot read the bounds of the metabolic model for the constrained simulation problem (optimization or sampling). |
| 92 # Check gene overlap with model | 101 This usually happens if the input “Bound file(s): *” is incorrect. For example, it occurs when the **RasToBounds - Cell Class** file is passed instead of the collection of bound files named **"RAS to bounds"**. |
| 93 import pickle | |
| 94 genes_dict = pickle.load(open('src/local/pickle files/ENGRO2_genes.p', 'rb')) | |
| 95 model_genes = set(genes_dict['hugo_id'].keys()) | |
| 96 | 102 |
| 97 import pandas as pd | 103 **Suggested Action:** |
| 98 data_genes = set(pd.read_csv('expression.tsv', sep='\t').iloc[:, 0]) | 104 Check the input files and ensure the correct bounds collection is used. |
| 99 | |
| 100 overlap = len(model_genes.intersection(data_genes)) | |
| 101 print(f"Gene overlap: {overlap}/{len(data_genes)} ({overlap/len(data_genes)*100:.1f}%)") | |
| 102 ``` | |
| 103 | |
| 104 **Problem**: File format not recognized | |
| 105 ```tsv | |
| 106 # Correct format - tab-separated: | |
| 107 Gene_ID Sample_1 Sample_2 | |
| 108 HGNC:5 10.5 11.2 | |
| 109 HGNC:10 3.2 4.1 | |
| 110 | |
| 111 # Wrong - comma-separated or spaces will fail | |
| 112 ``` | |
| 113 | |
| 114 ### Model Issues | |
| 115 | |
| 116 **Problem**: Custom model not loading | |
| 117 ``` | |
| 118 Solution: | |
| 119 1. Check TSV format with "GPR" column header | |
| 120 2. Verify reaction IDs are unique | |
| 121 3. Test GPR syntax (use 'and'/'or', proper parentheses) | |
| 122 4. Check file permissions and encoding (UTF-8) | |
| 123 ``` | |
| 124 | |
| 125 ## Tool Execution Errors | |
| 126 | |
| 127 | |
| 128 | |
| 129 ### File Path Problems | |
| 130 | |
| 131 **Problem**: "File not found" errors | |
| 132 ```python | |
| 133 # Use absolute paths | |
| 134 from pathlib import Path | |
| 135 | |
| 136 input_file = str(Path('expression.tsv').absolute()) | |
| 137 | |
| 138 args = ['-in', input_file, ...] | |
| 139 ``` | |
| 140 | |
| 141 **Problem**: Permission denied | |
| 142 ```bash | |
| 143 # Check write permissions | |
| 144 ls -la output_directory/ | |
| 145 | |
| 146 # Fix permissions | |
| 147 chmod 755 output_directory/ | |
| 148 chmod 644 input_files/* | |
| 149 ``` | |
| 150 | |
| 151 ### Galaxy Integration Issues | |
| 152 | |
| 153 **Problem**: COBRAxy tools not appearing in Galaxy | |
| 154 ```xml | |
| 155 <!-- Check tool_conf.xml syntax --> | |
| 156 <section id="cobraxy" name="COBRAxy"> | |
| 157 <tool file="cobraxy/ras_generator.xml" /> | |
| 158 </section> | |
| 159 | |
| 160 <!-- Verify file paths are correct --> | |
| 161 ls tools/cobraxy/ras_generator.xml | |
| 162 ``` | |
| 163 | |
| 164 **Problem**: Tool execution fails in Galaxy | |
| 165 ``` | |
| 166 Check Galaxy logs: | |
| 167 - main.log: General Galaxy issues | |
| 168 - handler.log: Job execution problems | |
| 169 - uwsgi.log: Web server issues | |
| 170 | |
| 171 Common fixes: | |
| 172 1. Restart Galaxy after adding tools | |
| 173 2. Check Python environment has COBRApy installed | |
| 174 3. Verify file permissions on tool files | |
| 175 ``` | |
| 176 | |
| 177 | |
| 178 | |
| 179 **Problem**: Flux sampling hangs | |
| 180 ```bash | |
| 181 # Check solver availability | |
| 182 python -c "import cobra; print(cobra.Configuration().solver)" | |
| 183 | |
| 184 # Should show: glpk, cplex, or gurobi | |
| 185 # Install GLPK if missing: | |
| 186 pip install swiglpk | |
| 187 ``` | |
| 188 | |
| 189 ### Large Dataset Handling | |
| 190 | |
| 191 **Problem**: Cannot process large expression matrices | |
| 192 ```python | |
| 193 # Process in chunks | |
| 194 def process_large_dataset(expression_file, chunk_size=1000): | |
| 195 df = pd.read_csv(expression_file, sep='\t') | |
| 196 | |
| 197 for i in range(0, len(df), chunk_size): | |
| 198 chunk = df.iloc[i:i+chunk_size] | |
| 199 chunk_file = f'chunk_{i}.tsv' | |
| 200 chunk.to_csv(chunk_file, sep='\t', index=False) | |
| 201 | |
| 202 # Process chunk | |
| 203 ras_generator.main(['-in', chunk_file, ...]) | |
| 204 ``` | |
| 205 | |
| 206 ## Output Validation | |
| 207 | |
| 208 ### Unexpected Results | |
| 209 | |
| 210 **Problem**: All RAS values are zero or null | |
| 211 ```python | |
| 212 # Debug gene mapping | |
| 213 import pandas as pd | |
| 214 ras_df = pd.read_csv('ras_output.tsv', sep='\t', index_col=0) | |
| 215 | |
| 216 # Check data quality | |
| 217 print(f"Null percentage: {ras_df.isnull().sum().sum() / ras_df.size * 100:.1f}%") | |
| 218 print(f"Zero percentage: {(ras_df == 0).sum().sum() / ras_df.size * 100:.1f}%") | |
| 219 | |
| 220 # Check expression data preprocessing | |
| 221 expr_df = pd.read_csv('expression.tsv', sep='\t', index_col=0) | |
| 222 print(f"Expression range: {expr_df.min().min():.2f} to {expr_df.max().max():.2f}") | |
| 223 ``` | |
| 224 | |
| 225 **Problem**: RAS values seem too high/low | |
| 226 ``` | |
| 227 Possible causes: | |
| 228 1. Expression data not log-transformed | |
| 229 2. Wrong normalization method | |
| 230 3. Incorrect gene ID mapping | |
| 231 4. GPR rule interpretation issues | |
| 232 | |
| 233 Solutions: | |
| 234 1. Check expression data preprocessing | |
| 235 2. Validate against known control genes | |
| 236 3. Compare with published metabolic activity patterns | |
| 237 ``` | |
| 238 | |
| 239 ### Missing Pathway Maps | |
| 240 | |
| 241 **Problem**: MAREA generates no output maps | |
| 242 ``` | |
| 243 Debug steps: | |
| 244 1. Check RAS input has non-null values | |
| 245 2. Verify model choice matches RAS generation | |
| 246 3. Check statistical significance thresholds | |
| 247 4. Look at log files for specific errors | |
| 248 ``` | |
| 249 | |
| 250 ## Environment Issues | |
| 251 | |
| 252 ### Conda/Virtual Environment Problems | |
| 253 | |
| 254 **Problem**: Tool import fails in virtual environment | |
| 255 ```bash | |
| 256 # Activate environment properly | |
| 257 source venv/bin/activate # Linux/macOS | |
| 258 # or | |
| 259 venv\Scripts\activate # Windows | |
| 260 | |
| 261 # Verify COBRAxy installation | |
| 262 pip list | grep cobra | |
| 263 python -c "import cobra; print('COBRApy version:', cobra.__version__)" | |
| 264 ``` | |
| 265 | |
| 266 **Problem**: Version conflicts | |
| 267 ```bash | |
| 268 # Create clean environment | |
| 269 conda create -n cobraxy python=3.9 | |
| 270 conda activate cobraxy | |
| 271 | |
| 272 # Install COBRAxy fresh | |
| 273 cd COBRAxy/src | |
| 274 pip install -e . | |
| 275 ``` | |
| 276 | |
| 277 ### Cross-Platform Issues | |
| 278 | |
| 279 **Problem**: Windows path separator issues | |
| 280 ```python | |
| 281 # Use pathlib for cross-platform paths | |
| 282 from pathlib import Path | |
| 283 | |
| 284 # Instead of: '/path/to/file' | |
| 285 # Use: str(Path('path') / 'to' / 'file') | |
| 286 ``` | |
| 287 | |
| 288 **Problem**: Line ending issues (Windows/Unix) | |
| 289 ```bash | |
| 290 # Convert line endings if needed | |
| 291 dos2unix input_file.tsv # Unix | |
| 292 unix2dos input_file.tsv # Windows | |
| 293 ``` | |
| 294 | |
| 295 ## Debugging Strategies | |
| 296 | |
| 297 ### Enable Detailed Logging | |
| 298 | |
| 299 ```python | |
| 300 import logging | |
| 301 logging.basicConfig(level=logging.DEBUG) | |
| 302 | |
| 303 # Many tools accept log file parameter | |
| 304 args = [..., '--out_log', 'detailed.log'] | |
| 305 ``` | |
| 306 | |
| 307 ### Test with Small Datasets | |
| 308 | |
| 309 ```python | |
| 310 # Create minimal test case | |
| 311 test_data = """Gene_ID Sample1 Sample2 | |
| 312 HGNC:5 10.0 15.0 | |
| 313 HGNC:10 5.0 8.0""" | |
| 314 | |
| 315 with open('test_input.tsv', 'w') as f: | |
| 316 f.write(test_data) | |
| 317 | |
| 318 # Test basic functionality | |
| 319 ras_generator.main(['-in', 'test_input.tsv', | |
| 320 '-ra', 'test_output.tsv', '-rs', 'ENGRO2']) | |
| 321 ``` | |
| 322 | |
| 323 ### Check Dependencies | |
| 324 | |
| 325 ```python | |
| 326 # Verify all required packages | |
| 327 required_packages = ['cobra', 'pandas', 'numpy', 'scipy'] | |
| 328 | |
| 329 for package in required_packages: | |
| 330 try: | |
| 331 __import__(package) | |
| 332 print(f"✓ {package}") | |
| 333 except ImportError: | |
| 334 print(f"✗ {package} - MISSING") | |
| 335 ``` | |
| 336 | 105 |
| 337 ## Getting Help | 106 ## Getting Help |
| 338 | 107 |
| 339 ### Information to Include in Bug Reports | 108 ### Information to Include in Bug Reports |
| 340 | 109 |
| 366 - Tested with built-in example data | 135 - Tested with built-in example data |
| 367 - Searched existing GitHub issues | 136 - Searched existing GitHub issues |
| 368 - Tried alternative models/parameters | 137 - Tried alternative models/parameters |
| 369 - Checked file formats and permissions | 138 - Checked file formats and permissions |
| 370 | 139 |
| 371 ## Prevention Tips | |
| 372 | |
| 373 ### Best Practices | |
| 374 | |
| 375 1. **Use virtual environments** to avoid conflicts | |
| 376 2. **Validate input data** before processing | |
| 377 3. **Start with small datasets** for testing | |
| 378 4. **Keep backups** of working configurations | |
| 379 5. **Document successful workflows** for reuse | |
| 380 6. **Test after updates** to catch regressions | |
| 381 | |
| 382 ### Data Quality Checks | |
| 383 | |
| 384 ```python | |
| 385 def validate_expression_data(filename): | |
| 386 """Validate gene expression file format.""" | |
| 387 df = pd.read_csv(filename, sep='\t') | |
| 388 | |
| 389 # Check basic format | |
| 390 assert df.shape[0] > 0, "Empty file" | |
| 391 assert df.shape[1] > 1, "Need at least 2 columns" | |
| 392 | |
| 393 # Check numeric data | |
| 394 numeric_cols = df.select_dtypes(include=[np.number]).columns | |
| 395 assert len(numeric_cols) > 0, "No numeric expression data" | |
| 396 | |
| 397 # Check for missing values | |
| 398 null_pct = df.isnull().sum().sum() / df.size * 100 | |
| 399 if null_pct > 50: | |
| 400 print(f"Warning: {null_pct:.1f}% missing values") | |
| 401 | |
| 402 print(f"✓ File valid: {df.shape[0]} genes × {df.shape[1]-1} samples") | |
| 403 ``` | |
| 404 | 140 |
| 405 This troubleshooting guide covers the most common issues. For tool-specific problems, check the individual tool documentation pages. | 141 This troubleshooting guide covers the most common issues. For tool-specific problems, check the individual tool documentation pages. |
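
For the `CobraSBMLError: No SBML model detected in file.` entry in the revised guide, the suggested action is to verify that the input is proper SBML with FBC annotations. A minimal sketch of such a pre-check is shown here, using only the Python standard library. The function name `check_sbml_fbc` and the `/fbc/` namespace heuristic are assumptions made for illustration and are not part of COBRAxy or COBRApy; passing this check does not guarantee that the Import Metabolic Model tool will accept the file.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags and attributes."""
    return tag.rsplit("}", 1)[-1]


def check_sbml_fbc(source):
    """Return a list of problems found; an empty list means the quick check passed.

    `source` is a file path or binary file-like object accepted by ET.parse.
    This is a hypothetical helper, not a COBRAxy or COBRApy function.
    """
    try:
        root = ET.parse(source).getroot()
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    if local_name(root.tag) != "sbml":
        problems.append(f"root element is <{local_name(root.tag)}>, expected <sbml>")
    if not any(local_name(child.tag) == "model" for child in root):
        problems.append("no <model> element directly under the root")

    # Heuristic (assumption): FBC content appears as tags or attributes whose
    # namespace URI contains '/fbc/', e.g. fbc:strict or fbc:listOfObjectives.
    names = (name for elem in root.iter() for name in (elem.tag, *elem.attrib))
    if not any("/fbc/" in name for name in names):
        problems.append("no FBC-namespaced elements or attributes found")
    return problems
```

Calling `check_sbml_fbc("model.xml")` (the file name is a placeholder) before uploading to Galaxy gives a short list of reasons the file is unlikely to load. It only inspects the XML structure and does not validate the model itself.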
