annotate flexynesis_plot.py @ 8:461b8e2d79d5 draft default tip
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit bcee55468e70cd025cc5359bef5e9991e893eca9
| author | bgruening | 
|---|---|
| date | Thu, 16 Oct 2025 20:15:31 +0000 | 
| parents | 693011647a67 | 
| children | |
| rev | changeset | galaxytools commit |
|---|---|---|
| 2 | ee7e61ab554d | b2463fb68d0ae54864d87718ee72f5e063aa4587 |
| 7 | 693011647a67 | 6b520305ec30e6dc37eba92c67a5368cea0fc5ad |
| 8 | 461b8e2d79d5 | bcee55468e70cd025cc5359bef5e9991e893eca9 |

Trailing `# rev N` comments mark lines that the annotate view attributes to a revision later than rev 2.

#!/usr/bin/env python
"""Generate plots using flexynesis
This script generates dimensionality reduction plots, Kaplan-Meier survival curves,
and Cox proportional hazards models from data processed by flexynesis."""

import argparse
import os
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from flexynesis.utils import (  # rev 8
    build_cox_model,
    plot_dim_reduced,
    plot_hazard_ratios,
    plot_kaplan_meier_curves,
    plot_pr_curves,
    plot_roc_curves,
    plot_scatter
)
from scipy.stats import kruskal, mannwhitneyu


def load_embeddings(embeddings_path):
    """Load embeddings from a file"""
    try:
        # Determine file extension
        file_ext = Path(embeddings_path).suffix.lower()

        if file_ext == '.csv':
            df = pd.read_csv(embeddings_path, index_col=0)
        elif file_ext in ['.tsv', '.txt', '.tab', '.tabular']:
            df = pd.read_csv(embeddings_path, sep='\t', index_col=0)
        else:
            raise ValueError(f"Unsupported file extension: {file_ext}")

        return df, df.index.tolist()

    except Exception as e:
        raise ValueError(f"Error loading embeddings from {embeddings_path}: {e}") from e


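The loaders in this file choose a delimiter from the file extension before calling `pd.read_csv`. The sketch below isolates that dispatch so it can be tried on its own. The helper `read_table`, the `SEPARATORS` mapping, and the temporary files are hypothetical names made up for illustration; they are not part of `flexynesis_plot.py`.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical mapping (not in the tool): extension -> delimiter.
SEPARATORS = {'.csv': ',', '.tsv': '\t', '.txt': '\t', '.tab': '\t', '.tabular': '\t'}


def read_table(path, index_col=0):
    """Read a delimited file, choosing the separator from its extension."""
    ext = Path(path).suffix.lower()
    if ext not in SEPARATORS:
        raise ValueError(f"Unsupported file extension: {ext}")
    return pd.read_csv(path, sep=SEPARATORS[ext], index_col=index_col)


def demo():
    """Write the same tiny table as CSV and as tab-separated text, then read both back."""
    with tempfile.TemporaryDirectory() as tmp:
        csv_path = Path(tmp) / "emb.csv"
        tab_path = Path(tmp) / "emb.tabular"
        csv_path.write_text("sample,E1,E2\ns1,0.1,0.2\ns2,0.3,0.4\n")
        tab_path.write_text("sample\tE1\tE2\ns1\t0.1\t0.2\ns2\t0.3\t0.4\n")
        return read_table(csv_path), read_table(tab_path)
```

Both calls in `demo()` should yield the same frame, indexed by the first column. Keeping the mapping in one place is only one possible design; the file itself repeats the `if`/`elif` chain in each loader.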
def load_labels(labels_input):
    """Load predicted labels from flexynesis"""
    try:
        # Determine file extension
        file_ext = Path(labels_input).suffix.lower()

        if file_ext == '.csv':
            df = pd.read_csv(labels_input)
        elif file_ext in ['.tsv', '.txt', '.tab', '.tabular']:
            df = pd.read_csv(labels_input, sep='\t')
        else:
            # Fail early on unknown extensions so df is never left unbound
            raise ValueError(f"Unsupported file extension: {file_ext}")

        print(f"available columns: {df.columns.tolist()}")  # rev 7
        return df  # rev 7

    except Exception as e:
        raise ValueError(f"Error loading labels from {labels_input}: {e}") from e


def load_omics(omics_path):
    """Load omics data from a file. First column should be features"""
    try:
        # Determine file extension
        file_ext = Path(omics_path).suffix.lower()

        if file_ext == '.csv':
            df = pd.read_csv(omics_path, index_col=0)
        elif file_ext in ['.tsv', '.txt', '.tab', '.tabular']:
            df = pd.read_csv(omics_path, sep='\t', index_col=0)
        else:
            raise ValueError(f"Unsupported file extension: {file_ext}")
        return df

    except Exception as e:
        raise ValueError(f"Error loading omics data from {omics_path}: {e}") from e


# rev 7 (693011647a67): every non-blank line of this function except the final return
def match_samples_to_embeddings(sample_names, labels):
    """Filter label data to match sample names in the embeddings"""
    # Create a DataFrame from sample_names to preserve order
    sample_df = pd.DataFrame({'sample_names': sample_names})

    # left_join
    first_column = labels.columns[0]
    df_matched = sample_df.merge(labels, left_on='sample_names', right_on=first_column, how='left')

    # remove sample_names to keep the initial structure
    df_matched = df_matched.drop('sample_names', axis=1)
    return df_matched


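`match_samples_to_embeddings` relies on a left join to keep the label rows in the order of the embedding samples. The toy example below shows what that join produces; the sample IDs, the `sample_id` and `subtype` column names, and the values are invented for illustration.

```python
import pandas as pd

# Toy inputs (made up): the embedding order, and a label table in a different order
sample_names = ['s3', 's1', 's9']
labels = pd.DataFrame({
    'sample_id': ['s1', 's2', 's3'],
    'subtype': ['A', 'B', 'C'],
})

# Same steps as the function: wrap the names, left-join on the first label column,
# then drop the helper column so the result has the label table's columns.
sample_df = pd.DataFrame({'sample_names': sample_names})
matched = sample_df.merge(
    labels, left_on='sample_names', right_on=labels.columns[0], how='left'
).drop('sample_names', axis=1)
```

Here `matched` has one row per embedding sample, in embedding order, and `s9`, which has no label, comes back as a row of missing values. Note that a left join emits one row per matching label row, so duplicate sample IDs in the label table would add rows and break the one-to-one alignment with the embeddings.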
def detect_color_type(labels_series):
    """Auto-detect whether target variables should be treated as categorical or numerical"""
    # Remove NaN
    clean_labels = labels_series.dropna()

    if clean_labels.empty:
        return 'categorical'  # default output if no labels

    # Check if all values can be converted to numbers
    try:
        numeric_labels = pd.to_numeric(clean_labels, errors='coerce')

        # If conversion failed -> categorical
        if numeric_labels.isna().any():
            return 'categorical'

        # Check number of unique values
        unique_count = len(clean_labels.unique())
        total_count = len(clean_labels)

        # If few unique values relative to total -> categorical
        # Threshold: if unique values < 10 OR unique/total < 0.1
        if unique_count < 10 or (unique_count / total_count) < 0.1:
            return 'categorical'
        else:
            return 'numerical'

    except Exception:
        return 'categorical'


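The thresholds in `detect_color_type` are easier to judge on concrete inputs. The sketch below restates the same rule in a compact, self-contained form and applies it to a few invented series; the name `classify`, its parameter names, and the example data are illustrative and not part of the tool.

```python
import pandas as pd


def classify(series, max_unique=10, max_ratio=0.1):
    """Compact restatement of the rule; names and defaults mirror the thresholds above."""
    clean = series.dropna()
    if clean.empty:
        return 'categorical'
    # Any value that cannot be parsed as a number makes the column categorical
    if pd.to_numeric(clean, errors='coerce').isna().any():
        return 'categorical'
    n_unique = clean.nunique()
    if n_unique < max_unique or n_unique / len(clean) < max_ratio:
        return 'categorical'
    return 'numerical'


# Invented example columns
examples = {
    'strings': pd.Series(['LumA', 'LumB', 'Basal']),
    'binary_codes': pd.Series([0, 1] * 50),
    'continuous': pd.Series([i * 0.5 for i in range(50)]),
    'many_repeats': pd.Series(list(range(20)) * 15),
    'small_cohort': pd.Series([1.2, 3.4, 5.6, 7.8]),
}
results = {name: classify(s) for name, s in examples.items()}
```

Only `continuous` comes out as `'numerical'`. The `small_cohort` case shows a consequence of the `unique < 10` test: a genuinely continuous variable measured on fewer than ten samples is still treated as categorical.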
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 126 def plot_label_concordance_heatmap(labels1, labels2, figsize=(12, 10)): | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 127 """ | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 128 Plot a heatmap reflecting the concordance between two sets of labels using pandas crosstab. | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 129 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 130 Parameters: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 131 - labels1: The first set of labels. | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 132 - labels2: The second set of labels. | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 133 """ | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 134 # Compute the cross-tabulation | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 135 ct = pd.crosstab(pd.Series(labels1, name='Labels Set 1'), pd.Series(labels2, name='Labels Set 2')) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 136 # Normalize the cross-tabulation matrix row-wise (each row sums to 1) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 137 ct_normalized = ct.div(ct.sum(axis=1), axis=0) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 138 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 139 # Plot the heatmap | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 140 plt.figure(figsize=figsize) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 141 sns.heatmap(ct_normalized, annot=True, cmap='viridis', linewidths=.5) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 142 plt.title('Concordance between label groups') | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 143 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 144 return plt.gcf() | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 145 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 146 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 147 def plot_boxplot(categorical_x, numerical_y, title_x='Categories', title_y='Values', figsize=(10, 6), jittersize=4): | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 148 """ | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 149 Create a boxplot with jittered points to visualize the distribution of predicted probabilities across different categories. | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 150 The x-axis represents the true labels, and the y-axis represents the predicted probabilities for specific categories. | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 151 """ | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 152 df = pd.DataFrame({title_x: categorical_x, title_y: numerical_y}) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 153 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 154 # Compute p-value | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 155 groups = df[title_x].unique() | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 156 if len(groups) == 2: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 157 group1 = df[df[title_x] == groups[0]][title_y] | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 158 group2 = df[df[title_x] == groups[1]][title_y] | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 159 stat, p = mannwhitneyu(group1, group2, alternative='two-sided') | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 160 test_name = "Mann-Whitney U" | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 161 else: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 162 group_data = [df[df[title_x] == group][title_y] for group in groups] | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 163 stat, p = kruskal(*group_data) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 164 test_name = "Kruskal-Wallis" | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 165 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 166 # Create a boxplot with jittered points | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 167 plt.figure(figsize=figsize) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 168 sns.boxplot(x=title_x, y=title_y, hue=title_x, data=df, palette='Set2', legend=False, fill=False) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 169 sns.stripplot(x=title_x, y=title_y, data=df, color='black', size=jittersize, jitter=True, dodge=True, alpha=0.4) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 170 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 171 # Labels and p-value annotation | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 172 plt.xlabel(title_x) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 173 plt.ylabel(title_y) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 174 plt.text( | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 175 x=-0.4, | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 176 y=plt.ylim()[1], | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 177 s=f'{test_name} p = {p:.3e}', | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 178 verticalalignment='top', | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 179 horizontalalignment='left', | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 180 fontsize=12, | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 181 bbox=dict(boxstyle='round,pad=0.3', facecolor='white', edgecolor='gray') | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 182 ) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 183 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 184 plt.tight_layout() | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 185 return plt.gcf() | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 186 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 187 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 188 def generate_dimred_plots(embeddings, matched_labels, args, output_dir, output_name_base): | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 189 """Generate dimensionality reduction plots""" | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 190 | 
| 7 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 191 # Check if this is the specific format with sample_id, known_label, predicted_label | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 192 required_cols = ['sample_id', 'variable', 'class_label', 'probability', 'known_label', 'predicted_label'] | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 193 is_flexynesis_format = all(col in matched_labels.columns for col in required_cols) | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 194 | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 195 if not args.color: | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 196 if is_flexynesis_format: | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 197 print("Detected flexynesis labels format") | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 198 print(f"Generating {args.method.upper()} plots for known and predicted labels...") | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 199 else: | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 200 print("Labels are not in flexynesis format (Custom labels), please specify a color variable with --color") | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 201 | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 202 # Parse target values from comma-separated string | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 203 if args.target_value: | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 204 target_values = [val.strip() for val in args.target_value.split(',')] | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 205 else: | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 206 # If no target values specified, use all unique variables | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 207 target_values = matched_labels['variable'].unique().tolist() | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 208 | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 209 print(f"Generating {args.method.upper()} plots for {len(target_values)} target variable(s): {', '.join(target_values)}") | 
| 2 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 210 | 
| 7 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 211 # Check variables | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 212 available_vars = matched_labels['variable'].unique() | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 213 missing_vars = [var for var in target_values if var not in available_vars] | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 214 | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 215 if missing_vars: | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 216 print(f"Warning: The following target variables were not found in the data: {', '.join(missing_vars)}") | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 217 print(f"Available variables: {', '.join(available_vars)}") | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 218 | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 219 # Filter to only process available variables | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 220 valid_vars = [var for var in target_values if var in available_vars] | 
| 2 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 221 | 
| 7 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 222 if not valid_vars: | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 223 raise ValueError(f"None of the specified target variables were found in the data. Available: {', '.join(available_vars)}") | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 224 | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 225 # Generate plots for each valid target variable | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 226 for var in valid_vars: | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 227 print(f"\nPlotting variable: {var}") | 
| 2 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 228 | 
| 7 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 229 # Filter matched labels for current variable | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 230 var_labels = matched_labels[matched_labels['variable'] == var].copy() | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 231 var_labels = var_labels.drop_duplicates(subset='sample_id') | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 232 | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 233 if var_labels.empty: | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 234 print(f"Warning: No data found for variable '{var}', skipping...") | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 235 continue | 
| 2 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 236 | 
| 7 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 237 # Auto-detect color type | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 238 known_color_type = detect_color_type(var_labels['known_label']) | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 239 predicted_color_type = detect_color_type(var_labels['predicted_label']) | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 240 | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 241 print(f" Auto-detected color types - Known: {known_color_type}, Predicted: {predicted_color_type}") | 
| 2 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 242 | 
| 7 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 243 try: | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 244 # Plot 1: Known labels | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 245 print(f" Creating known labels plot for {var}...") | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 246 fig_known = plot_dim_reduced( | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 247 matrix=embeddings, | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 248 labels=var_labels['known_label'], | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 249 method=args.method, | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 250 color_type=known_color_type | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 251 ) | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 252 | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 253 output_path_known = output_dir / f"{output_name_base}_{var}_known.{args.format}" | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 254 print(f" Saving known labels plot to: {output_path_known.name}") | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 255 fig_known.save(output_path_known, dpi=args.dpi, bbox_inches='tight') | 
| 2 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 256 | 
| 7 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 257 # Plot 2: Predicted labels | 
| | 258 print(f" Creating predicted labels plot for {var}...") |
| | 259 fig_predicted = plot_dim_reduced( |
| | 260 matrix=embeddings, |
| | 261 labels=var_labels['predicted_label'], |
| | 262 method=args.method, |
| | 263 color_type=predicted_color_type |
| | 264 ) |
| | 265 |
| | 266 output_path_predicted = output_dir / f"{output_name_base}_{var}_predicted.{args.format}" |
| | 267 print(f" Saving predicted labels plot to: {output_path_predicted.name}") |
| | 268 fig_predicted.save(output_path_predicted, dpi=args.dpi, bbox_inches='tight') |
| 2 | 269 |
| 7 | 270 print(f" ✓ Successfully created plots for variable '{var}'") |
| | 271 |
| | 272 except Exception as e: |
| | 273 print(f" ✗ Error creating plots for variable '{var}': {e}") |
| | 274 continue |
| 2 | 275 |
| 7 | 276 print(f"\nDimensionality reduction plots completed for {len(valid_vars)} variable(s)!") |
| | 277 |
| | 278 else: |
| | 279 # check if the color variable exists in matched_labels |
| | 280 if args.color not in matched_labels.columns: |
| | 281 raise ValueError(f"Color variable '{args.color}' not found in matched labels. Available columns: {matched_labels.columns.tolist()}") |
| 2 | 282 |
| | 283 # Auto-detect color type |
| 7 | 284 color_type = detect_color_type(matched_labels[args.color]) |
| 2 | 285 |
| 7 | 286 print(f" Auto-detected color type: {color_type}") |
| 2 | 287 |
| 7 | 288 # Plot: Specified color column |
| | 289 print(f" Creating plot for {args.color}...") |
| | 290 fig = plot_dim_reduced( |
| | 291 matrix=embeddings, |
| | 292 labels=matched_labels[args.color], |
| | 293 method=args.method, |
| | 294 color_type=color_type |
| | 295 ) |
| 2 | 296 |
| 7 | 297 output_path = output_dir / f"{output_name_base}_{args.color}.{args.format}" |
| | 298 print(f" Saving plot to: {output_path.name}") |
| | 299 fig.save(output_path, dpi=args.dpi, bbox_inches='tight') |
| 2 | 300 |
| 7 | 301 print(f" ✓ Successfully created plot for variable '{args.color}'") |
| 2 | 302 |
| | 303 |
| 7 | 304 def generate_km_plots(survival_data, labels, args, output_dir, output_name_base): |
| 2 | 305 """Generate Kaplan-Meier plots""" |
| 7 | 306 |
| | 307 # Check if this is the specific format with sample_id, known_label, predicted_label |
| | 308 required_cols = ['sample_id', 'variable', 'class_label', 'probability', 'known_label', 'predicted_label'] |
| | 309 is_flexynesis_format = all(col in labels.columns for col in required_cols) |
| | 310 |
| | 311 if not is_flexynesis_format: |
| | 312 raise ValueError(f"Labels are not in flexynesis format (Custom labels). Please provide a valid label file with the required columns, {required_cols}.") |
| | 313 |
| 2 | 314 print("Generating Kaplan-Meier curves of risk subtypes...") |
| | 315 |
| | 316 if survival_data.columns[0] != 'sample_id': |
| | 317 survival_data = survival_data.rename(columns={survival_data.columns[0]: 'sample_id'}) |
| | 318 |
| | 319 # Check if the event column exists |
| | 320 if args.surv_event_var not in survival_data.columns: |
| | 321 raise ValueError(f"Column '{args.surv_event_var}' not found in survival data") |
| | 322 |
| 7 | 323 labels = labels[(labels['variable'] == args.surv_event_var)] |
| 2 | 324 |
| | 325 # Merge survival data with labels |
| 7 | 326 df_deceased = pd.merge(survival_data, labels, on='sample_id', how='inner') |
| | 327 df_deceased = df_deceased.dropna(subset=[args.surv_time_var, args.surv_event_var]) |
| 2 | 328 |
| | 329 if df_deceased.empty: |
| | 330 raise ValueError("No matching samples found after merging survival and label data.") |
| | 331 |
| | 332 # Get risk scores |
| 7 | 333 risk_scores = df_deceased['predicted_label'].values |
| 2 | 334 |
| | 335 # Compute groups (e.g., median split) |
| | 336 quantiles = np.quantile(risk_scores, [0.5]) |
| | 337 groups = np.digitize(risk_scores, quantiles) |
| | 338 group_labels = ['low_risk' if g == 0 else 'high_risk' for g in groups] |
| | 339 |
| | 340 fig_known = plot_kaplan_meier_curves( |
| | 341 durations=df_deceased[args.surv_time_var], |
| 7 | 342 events=df_deceased[args.surv_event_var], |
| 2 | 343 categorical_variable=group_labels |
| | 344 ) |
| | 345 |
| | 346 output_path_known = output_dir / f"{output_name_base}_km_risk_subtypes.{args.format}" |
| | 347 print(f"Saving Kaplan-Meier plot to: {output_path_known.absolute()}") |
| | 348 fig_known.save(output_path_known, dpi=args.dpi, bbox_inches='tight') |
| | 349 |
| | 350 print("Kaplan-Meier plot saved successfully!") |
| | 351 |
| | 352 |
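
The median split used in `generate_km_plots` (source lines 335 to 338) can be tried in isolation. The snippet below is a minimal sketch, not part of the annotated file: the `risk_scores` values are invented for illustration, and only the `np.quantile` / `np.digitize` calls mirror the source.

```python
import numpy as np

# Hypothetical risk scores, invented for this illustration only.
risk_scores = np.array([0.1, 0.4, 0.5, 0.9, 0.5])

# Single cut point at the median, as in np.quantile(risk_scores, [0.5]).
quantiles = np.quantile(risk_scores, [0.5])

# With the default right=False, np.digitize returns 0 for scores below
# the cut point and 1 for scores equal to or above it.
groups = np.digitize(risk_scores, quantiles)
group_labels = ['low_risk' if g == 0 else 'high_risk' for g in groups]

print(group_labels)
```

Scores exactly equal to the median fall into `high_risk`, so the two groups need not be the same size when ties occur at the median.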
| 7 | 353 def generate_cox_plots(important_features, clinical_train, clinical_test, omics_train, omics_test, args, output_dir, output_name_base): |
| 2 | 354 """Generate Cox proportional hazards plots""" |
| | 355 print("Generating Cox proportional hazards analysis...") |
| | 356 |
| 7 | 357 # Check that important_features is in flexynesis format (target_variable, layer, importance) |
| | 358 required_cols = ['target_variable', 'layer', 'importance'] |
| | 359 is_flexynesis_format = all(col in important_features.columns for col in required_cols) |
| | 360 |
| | 361 if not is_flexynesis_format: |
| | 362 raise ValueError(f"Important features are not in flexynesis format. Please provide a valid important_features file with the required columns: {required_cols}.") |
| | 363 |
| 2 | 364 # Parse clinical variables |
| 7 | 365 clinical_vars = [] |
| | 366 if args.clinical_variables: |
| | 367 clinical_vars = [var.strip() for var in args.clinical_variables.split(',')] |
| 2 | 368 |
| | 369 # Ensure that the survival variables are included |
| | 370 required_vars = [args.surv_time_var, args.surv_event_var] |
| | 371 for var in required_vars: |
| | 372 if var not in clinical_vars: |
| | 373 clinical_vars.append(var) |
| | 374 |
| | 375 print(f"Using clinical variables: {', '.join(clinical_vars)}") |
| | 376 |
| | 377 # filter datasets for clinical variables |
| | 378 if all(var in clinical_train.columns and var in clinical_test.columns for var in clinical_vars): |
| | 379 df_clin_train = clinical_train[clinical_vars] |
| | 380 df_clin_test = clinical_test[clinical_vars] |
| | 381 # Drop rows with NaN in clinical variables |
| | 382 df_clin_train = df_clin_train.dropna(subset=clinical_vars) |
| | 383 df_clin_test = df_clin_test.dropna(subset=clinical_vars) |
| | 384 else: |
| | 385 raise ValueError(f"Not all clinical variables found in datasets. Available in train dataset: {clinical_train.columns.tolist()}, Available in test dataset: {clinical_test.columns.tolist()}") |
| | 386 |
| | 387 # Combine |
| | 388 df_clin = pd.concat([df_clin_train, df_clin_test], axis=0) |
| | 389 |
| | 390 # Get top survival markers |
| | 391 print(f"Extracting top {args.top_features} important features for {args.surv_event_var}...") |
| | 392 try: |
| 7 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 393 print(f"Loading {args.top_features} important features from: {args.important_features}") | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 394 imp_features = load_labels(args.important_features) | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 395 imp_features = imp_features[imp_features['target_variable'] == args.surv_event_var] | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 396 if args.layer not in imp_features['layer'].unique(): | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 397 print(f"Available layers: {imp_features['layer'].unique()}") | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 398 raise ValueError(f"Layer '{args.layer}' not found in important features data: {args.important_features}") | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 399 imp_features = imp_features[imp_features['layer'] == args.layer] | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 400 if imp_features.empty: | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 401 raise ValueError(f"No important features found for target variable '{args.surv_event_var}' in {args.important_features}") | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 402 imp_features = imp_features.sort_values(by='importance', ascending=False) | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 403 | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 404 if len(imp_features) < args.top_features: | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 405 raise ValueError(f"Requested top {args.top_features} features, but only {len(imp_features)} available in {args.important_features}") | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 406 | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 407 imp = imp_features['name'].unique().tolist()[0:args.top_features] | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 408 | 
| 2 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 409 print(f"Top features: {', '.join(imp)}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 410 except Exception as e: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 411 raise ValueError(f"Error getting important features: {e}") from e | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 412 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 413 # Extract feature data from omics datasets | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 414 try: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 415 omics_test = omics_test.loc[omics_test.index.isin(imp)] | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 416 omics_train = omics_train.loc[omics_train.index.isin(imp)] | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 417 # Drop features (rows) that contain NaN values in the omics datasets | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 418 omics_test = omics_test.dropna(subset=omics_test.columns) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 419 omics_train = omics_train.dropna(subset=omics_train.columns) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 420 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 421 df_imp = pd.concat([omics_train, omics_test], axis=1) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 422 df_imp = df_imp.T # Transpose to have samples as rows | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 423 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 424 print(f"Feature data shape: {df_imp.shape}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 425 except Exception as e: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 426 raise ValueError(f"Error extracting feature subset: {e}") from e | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 427 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 428 # Combine markers with clinical variables | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 429 df = pd.merge(df_imp, df_clin, left_index=True, right_index=True) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 430 print(f"Combined data shape: {df.shape}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 431 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 432 # Remove samples without survival endpoints | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 433 initial_samples = len(df) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 434 df = df.dropna(subset=[args.surv_time_var, args.surv_event_var]) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 435 final_samples = len(df) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 436 print(f"Removed {initial_samples - final_samples} samples without survival data") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 437 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 438 if df.empty: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 439 raise ValueError("No samples remain after filtering for survival data") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 440 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 441 # Build Cox model | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 442 print(f"Building Cox model with time variable: {args.surv_time_var}, event variable: {args.surv_event_var}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 443 try: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 444 coxm = build_cox_model(df, | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 445 duration_col=args.surv_time_var, | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 446 event_col=args.surv_event_var, | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 447 crossval=args.crossval, | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 448 n_splits=args.n_splits, | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 449 random_state=args.random_state) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 450 print("Cox model built successfully") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 451 except Exception as e: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 452 raise ValueError(f"Error building Cox model: {e}") from e | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 453 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 454 # Generate hazard ratios plot | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 455 try: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 456 print("Generating hazard ratios plot...") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 457 fig = plot_hazard_ratios(coxm) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 458 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 459 output_path = output_dir / f"{output_name_base}_hazard_ratios.{args.format}" | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 460 print(f"Saving hazard ratios plot to: {output_path.absolute()}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 461 fig.save(output_path, dpi=args.dpi, bbox_inches='tight') | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 462 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 463 print("Cox proportional hazards analysis completed successfully!") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 464 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 465 except Exception as e: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 466 raise ValueError(f"Error generating hazard ratios plot: {e}") from e | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 467 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 468 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 469 def generate_plot_scatter(labels, args, output_dir, output_name_base): | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 470 """Generate scatter plot of known vs predicted labels""" | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 471 print("Generating scatter plots of known vs predicted labels...") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 472 | 
| 7 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 473 # Check for the flexynesis predicted-labels format (all columns in required_cols must be present) | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 474 required_cols = ['sample_id', 'variable', 'class_label', 'probability', 'known_label', 'predicted_label'] | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 475 is_flexynesis_format = all(col in labels.columns for col in required_cols) | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 476 | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 477 if is_flexynesis_format: | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 478 # Parse target values from comma-separated string | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 479 if args.target_value: | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 480 target_values = [val.strip() for val in args.target_value.split(',')] | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 481 else: | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 482 # If no target values specified, use all unique variables | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 483 target_values = labels['variable'].unique().tolist() | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 484 | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 485 print(f"Processing target values: {target_values}") | 
| 2 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 486 | 
| 7 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 487 successful_plots = 0 | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 488 skipped_plots = 0 | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 489 | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 490 for target_value in target_values: | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 491 print(f"\nProcessing target value: '{target_value}'") | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 492 | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 493 # Filter labels for the current target value | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 494 target_labels = labels[labels['variable'] == target_value] | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 495 | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 496 if target_labels.empty: | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 497 print(f" Warning: No data found for target value '{target_value}' - skipping") | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 498 skipped_plots += 1 | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 499 continue | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 500 | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 501 # Convert labels to numeric; non-numeric entries become NaN | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 502 true_values = pd.to_numeric(target_labels['known_label'], errors='coerce') | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 503 predicted_values = pd.to_numeric(target_labels['predicted_label'], errors='coerce') | 
| 2 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 504 | 
| 7 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 505 if true_values.isna().all() or predicted_values.isna().all(): | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 506 print(f" Warning: No valid numeric values found for known or predicted labels in '{target_value}' - skipping") | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 507 skipped_plots += 1 | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 508 continue | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 509 | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 510 try: | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 511 print(f" Generating scatter plot for '{target_value}'...") | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 512 fig = plot_scatter(true_values, predicted_values) | 
| 2 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 513 | 
| 7 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 514 # Create output filename with target value | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 515 safe_target_name = str(target_value).replace('/', '_').replace('\\', '_').replace(' ', '_') | 
| | 516 output_filename = f"{output_name_base}_{safe_target_name}.{args.format}" |
| | 517 |
| | 518 output_path = output_dir / output_filename |
| | 519 print(f" Saving scatter plot to: {output_path.absolute()}") |
| | 520 fig.save(output_path, dpi=args.dpi, bbox_inches='tight') |
| | 521 |
| | 522 successful_plots += 1 |
| | 523 print(f" Scatter plot for '{target_value}' generated successfully!") |
| 2 | 524 |
| 7 | 525 except Exception as e: |
| | 526 print(f" Error generating plot for '{target_value}': {str(e)}") |
| | 527 skipped_plots += 1 |
| | 528 |
| | 529 # Summary |
| | 530 print(" Summary:") |
| | 531 print(f" Successfully generated: {successful_plots} plots") |
| | 532 print(f" Skipped: {skipped_plots} plots") |
| 2 | 533 |
| 7 | 534 if successful_plots == 0: |
| | 535 raise ValueError("No scatter plots could be generated. Check your data and target values.") |
| | 536 |
| | 537 print("Scatter plot generation completed!") |
| | 538 |
| | 539 if not is_flexynesis_format: |
| | 540 print("Labels are not in flexynesis format (Custom labels)") |
| | 541 |
| | 542 if not args.true_label or not args.predicted_label: |
| | 543 raise ValueError("For custom labels, please specify --true_label and --predicted_label arguments.") |
| 2 | 544 |
| | 545 # Check if labels are numeric and convert |
| 7 | 546 true_values = pd.to_numeric(labels[args.true_label], errors='coerce') |
| | 547 predicted_values = pd.to_numeric(labels[args.predicted_label], errors='coerce') |
| 2 | 548 |
| | 549 if true_values.isna().all() or predicted_values.isna().all(): |
| 7 | 550 print("No valid numeric values found for known or predicted labels") |
| 2 | 551 |
| | 552 try: |
| 7 | 553 print(" Generating scatter plot...") |
| 2 | 554 fig = plot_scatter(true_values, predicted_values) |
| | 555 |
| | 556 # Create output filename (custom labels: no target value suffix) |
| 7 | 557 output_filename = f"{output_name_base}.{args.format}" |
| 2 | 558 |
| | 559 output_path = output_dir / output_filename |
| | 560 print(f" Saving scatter plot to: {output_path.absolute()}") |
| | 561 fig.save(output_path, dpi=args.dpi, bbox_inches='tight') |
| | 562 |
| | 563 except Exception as e: |
| 7 | 564 print(f" Error generating plot: {str(e)}") |
| 2 | 565 |
| 7 | 566 print("Scatter plot generation completed!") |
| 2 | 567 |
| | 568 |
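The scatter-plot branch above builds its output path in two ways: with a sanitized target-value suffix for flexynesis-format labels, and without a suffix for custom labels. A minimal sketch of that logic as a single helper follows. `build_output_path` is a hypothetical name that does not exist in flexynesis_plot.py; the sketch mirrors the three `replace` calls and the `output_dir / output_filename` join from the listing, and additionally wraps the target in `str()`, which is an assumption made here so that non-string targets do not raise.

```python
from pathlib import Path


def build_output_path(output_dir, output_name_base, fmt, target_value=None):
    """Hypothetical helper (not part of flexynesis_plot.py).

    Mirrors the filename construction shown in the listing: characters that
    are awkward in filenames ('/', '\\', ' ') are replaced with underscores.
    """
    if target_value is None:
        # Custom-label branch: no target suffix.
        filename = f"{output_name_base}.{fmt}"
    else:
        # Flexynesis-format branch: append the sanitized target value.
        safe = str(target_value).replace('/', '_').replace('\\', '_').replace(' ', '_')
        filename = f"{output_name_base}_{safe}.{fmt}"
    return Path(output_dir) / filename


if __name__ == "__main__":
    print(build_output_path("plots", "scatter", "png", "tumor/normal ratio"))
```

Only the three characters handled in the original are replaced; other characters that some filesystems reject (for example `:` or `*`) pass through unchanged, as they do in the script.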
| | 569 def generate_label_concordance_heatmap(labels, args, output_dir, output_name_base): |
| | 570 """Generate label concordance heatmap""" |
| | 571 print("Generating label concordance heatmaps...") |
| | 572 |
| 7 | 573 # Check if this is the specific format with sample_id, known_label, predicted_label |
| | 574 required_cols = ['sample_id', 'variable', 'class_label', 'probability', 'known_label', 'predicted_label'] |
| | 575 is_flexynesis_format = all(col in labels.columns for col in required_cols) |
| | 576 |
| | 577 if is_flexynesis_format: |
| | 578 # Parse target values from comma-separated string |
| | 579 if args.target_value: |
| | 580 target_values = [val.strip() for val in args.target_value.split(',')] |
| | 581 else: |
| | 582 # If no target values specified, use all unique variables |
| | 583 target_values = labels['variable'].unique().tolist() |
| 2 | 584 |
| 7 | 585 print(f"Processing target values: {target_values}") |
| | 586 |
| | 587 for target_value in target_values: |
| | 588 print(f"\nProcessing target value: '{target_value}'") |
| | 589 |
| | 590 # Filter labels for the current target value |
| | 591 target_labels = labels[labels['variable'] == target_value] |
| | 592 |
| | 593 if target_labels.empty: |
| | 594 print(f" Warning: No data found for target value '{target_value}' - skipping") |
| | 595 continue |
| | 596 |
| | 597 true_values = target_labels['known_label'].tolist() |
| | 598 predicted_values = target_labels['predicted_label'].tolist() |
| 2 | 599 |
| 7 | 600 try: |
| | 601 print(f" Generating heatmap for '{target_value}'...") |
| | 602 fig = plot_label_concordance_heatmap(true_values, predicted_values) |
| | 603 plt.close(fig) |
| 2 | 604 |
| 7 | 605 # Create output filename with target value |
| | 606 safe_target_name = target_value.replace('/', '_').replace('\\', '_').replace(' ', '_') |
| | 607 output_filename = f"{output_name_base}_{safe_target_name}.{args.format}" |
| | 608 |
| | 609 output_path = output_dir / output_filename |
| | 610 print(f" Saving heatmap to: {output_path.absolute()}") |
| | 611 fig.savefig(output_path, dpi=args.dpi, bbox_inches='tight') |
| 2 | 612 |
| 7 | 613 except Exception as e: |
| | 614 print(f" Error generating heatmap for '{target_value}': {str(e)}") |
| | 615 continue |
| | 616 |
| | 617 print("Label concordance heatmap generated successfully!") |
| 2 | 618 |
| 7 | 619 if not is_flexynesis_format: |
| | 620 print("Labels are not in flexynesis format (Custom labels)") |
| | 621 |
| | 622 if not args.true_label or not args.predicted_label: |
| | 623 raise ValueError("For custom labels, please specify --true_label and --predicted_label arguments.") |
| | 624 |
| | 625 true_values = labels[args.true_label].tolist() |
| | 626 predicted_values = labels[args.predicted_label].tolist() |
| 2 | 627 |
| | 628 try: |
| 7 | 629 print(" Generating heatmap...") |
| 2 | 630 fig = plot_label_concordance_heatmap(true_values, predicted_values) |
| | 631 plt.close(fig) |
| | 632 |
| | 633 # Create output filename (custom labels: no target value suffix) |
| 7 | 634 output_filename = f"{output_name_base}.{args.format}" |
| 2 | 635 |
| | 636 output_path = output_dir / output_filename |
| | 637 print(f" Saving heatmap to: {output_path.absolute()}") |
| | 638 fig.savefig(output_path, dpi=args.dpi, bbox_inches='tight') |
| | 639 |
| | 640 except Exception as e: |
| 7 | 641 print(f" Error generating heatmap: {str(e)}") |
| 2 | 642 |
| 7 | 643 print("Label concordance heatmap generated successfully!") |
| 2 | 644 |
| | 645 |
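Both the scatter-plot and heatmap generators repeat the same two preliminary steps: detecting whether the label table is in flexynesis format, and resolving which target variables to plot. The sketch below isolates those steps using plain Python sequences instead of a pandas DataFrame. `is_flexynesis_format` and `resolve_target_values` are hypothetical function names introduced for illustration, and `dict.fromkeys` stands in for `labels['variable'].unique().tolist()` on the assumption that first-seen order is what matters.

```python
# Column names copied from the required_cols list in the listing.
REQUIRED_COLS = ['sample_id', 'variable', 'class_label', 'probability',
                 'known_label', 'predicted_label']


def is_flexynesis_format(columns):
    """Return True when every required column is present (hypothetical helper)."""
    return all(col in columns for col in REQUIRED_COLS)


def resolve_target_values(target_value_arg, variables):
    """Resolve the targets to plot (hypothetical helper).

    A comma-separated argument wins; otherwise fall back to the unique
    values of the 'variable' column, kept in first-seen order.
    """
    if target_value_arg:
        return [val.strip() for val in target_value_arg.split(',')]
    return list(dict.fromkeys(variables))


if __name__ == "__main__":
    print(is_flexynesis_format(REQUIRED_COLS + ['split']))
    print(resolve_target_values("Erlotinib, Crizotinib", []))
    print(resolve_target_values(None, ["OS", "PFS", "OS"]))
```

Factoring the checks out this way would let all three generators share one code path, though the original script keeps them inline in each function.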
| | 646 def generate_pr_curves(labels, args, output_dir, output_name_base): |
| | 647 """Generate precision-recall curves""" |
| | 648 print("Generating precision-recall curves...") |
| | 649 |
| 7 | 650 # Check if this is the specific format with sample_id, known_label, predicted_label |
| | 651 required_cols = ['sample_id', 'variable', 'class_label', 'probability', 'known_label', 'predicted_label'] |
| | 652 is_flexynesis_format = all(col in labels.columns for col in required_cols) |
| | 653 |
| | 654 if not is_flexynesis_format: |
| | 655 raise ValueError(f"Labels are not in flexynesis format (Custom labels). Please provide a valid label file with the required columns, {required_cols}.") |
| | 656 |
    # Parse target values from comma-separated string
    if args.target_value:
        target_values = [val.strip() for val in args.target_value.split(',')]
    else:
        # If no target values specified, use all unique variables
        target_values = labels['variable'].unique().tolist()

    print(f"Processing target values: {target_values}")

    for target_value in target_values:
        print(f"\nProcessing target value: '{target_value}'")

        # Filter labels for the current target value
        target_labels = labels[labels['variable'] == target_value]

        # Check if this is a regression problem (no class probabilities)
        prob_columns = target_labels['class_label'].unique()
        non_na_probs = target_labels['probability'].notna().sum()

        print(f" Class labels found: {list(prob_columns)}")
        print(f" Non-NaN probabilities: {non_na_probs}/{len(target_labels)}")

        # If most probabilities are NaN, this is likely a regression problem
        if non_na_probs < len(target_labels) * 0.1:  # Less than 10% valid probabilities
            print(" Detected regression problem - precision-recall curves not applicable")
            print(f" Skipping '{target_value}' (use regression evaluation metrics instead)")
            continue

        # Debug: Check data quality
        total_rows = len(target_labels)
        missing_labels = target_labels['known_label'].isna().sum()
        missing_probs = target_labels['probability'].isna().sum()
        unique_samples = target_labels['sample_id'].nunique()
        unique_classes = target_labels['class_label'].nunique()

        print(f" Data summary: {total_rows} total rows, {unique_samples} unique samples, {unique_classes} unique classes")
        print(f" Missing data: {missing_labels} missing known_label, {missing_probs} missing probability")

        if missing_labels > 0:
            print(f" Warning: Found {missing_labels} missing known_label values")
            missing_samples = target_labels[target_labels['known_label'].isna()]['sample_id'].unique()[:5]
            print(f" Sample IDs with missing known_label: {list(missing_samples)}")

            # Remove rows with missing known_label
            target_labels = target_labels.dropna(subset=['known_label'])
            if target_labels.empty:
                print(f" Error: No valid known_label data remaining for '{target_value}' - skipping")
                continue

        # 1. Pivot to wide format
        prob_df = target_labels.pivot(index='sample_id', columns='class_label', values='probability')

        print(f" After pivot: {prob_df.shape[0]} samples x {prob_df.shape[1]} classes")
        print(f" Class columns: {list(prob_df.columns)}")

        # Check for NaN values in probability data
        nan_counts = prob_df.isna().sum()
        if nan_counts.any():
            print(f" NaN counts per class: {dict(nan_counts)}")
            print(f" Samples with any NaN: {prob_df.isna().any(axis=1).sum()}/{len(prob_df)}")

            # Drop only rows where ALL probabilities are NaN
            all_nan_rows = prob_df.isna().all(axis=1)
            if all_nan_rows.any():
                print(f" Dropping {all_nan_rows.sum()} samples with all NaN probabilities")
                prob_df = prob_df[~all_nan_rows]

            remaining_nans = prob_df.isna().sum().sum()
            if remaining_nans > 0:
                print(f" Warning: {remaining_nans} individual NaN values remain - filling with 0")
                prob_df = prob_df.fillna(0)

        if prob_df.empty:
            print(f" Error: No valid probability data remaining for '{target_value}' - skipping")
            continue

        # 2. Get true labels
        true_labels_df = target_labels.drop_duplicates('sample_id')[['sample_id', 'known_label']].set_index('sample_id')

        # 3. Align indices - only keep samples that exist in both datasets
        common_indices = prob_df.index.intersection(true_labels_df.index)
        if len(common_indices) == 0:
            print(f" Error: No common sample_ids between probability and true label data for '{target_value}' - skipping")
            continue

        print(f" Found {len(common_indices)} samples with both probability and true label data")

        # Filter both datasets to common indices
        prob_df_aligned = prob_df.loc[common_indices]
        y_true = true_labels_df.loc[common_indices, 'known_label']

        # 4. Final check for NaN values
        if y_true.isna().any():
            print(f" Error: True labels still contain NaN after alignment for '{target_value}' - skipping")
            continue

        if prob_df_aligned.isna().any().any():
            print(f" Error: Probability data still contains NaN after alignment for '{target_value}' - skipping")
            continue

        # 5. Convert categorical labels to integer labels
        # Create a mapping from class names to integers
        class_names = list(prob_df_aligned.columns)
        class_to_int = {class_name: i for i, class_name in enumerate(class_names)}

        print(f" Class mapping: {class_to_int}")

        # Convert true labels to integers
        y_true_np = y_true.map(class_to_int).to_numpy()
        y_probs_np = prob_df_aligned.to_numpy()

        print(f" Data shape: y_true={y_true_np.shape}, y_probs={y_probs_np.shape}")
        print(f" Unique true labels (integers): {set(y_true_np)}")
        print(f" Class labels (columns): {class_names}")
        print(f" Label distribution: {dict(zip(*np.unique(y_true_np, return_counts=True)))}")

        # Check for any unmapped labels (will be NaN)
        if pd.isna(y_true_np).any():
            print(" Error: Some true labels could not be mapped to class columns")
            unmapped_labels = set(y_true[y_true.map(class_to_int).isna()])
            print(f" Unmapped labels: {unmapped_labels}")
            print(f" Available classes: {class_names}")
            continue

        try:
            print(f" Generating precision-recall curve for '{target_value}'...")
            fig = plot_pr_curves(y_true_np, y_probs_np)

            # Create output filename with target value
            # (str() guards against non-string variable names, e.g. numeric ones)
            safe_target_name = str(target_value).replace('/', '_').replace('\\', '_').replace(' ', '_')
            output_filename = f"{output_name_base}_{safe_target_name}.{args.format}"

            output_path = output_dir / output_filename
            print(f" Saving precision-recall curve to: {output_path.absolute()}")
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 791 fig.save(output_path, dpi=args.dpi, bbox_inches='tight') | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 792 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 793 except Exception as e: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 794 print(f" Error generating precision-recall curve for '{target_value}': {str(e)}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 795 print(f" Debug info - y_true type: {type(y_true_np)}, contains NaN: {pd.isna(y_true_np).any()}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 796 print(f" Debug info - y_probs type: {type(y_probs_np)}, contains NaN: {pd.isna(y_probs_np).any()}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 797 continue | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 798 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 799 print("Precision-recall curves generated successfully!") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 800 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 801 | 

def generate_roc_curves(labels, args, output_dir, output_name_base):
    """Generate ROC curves"""
    print("Generating ROC curves...")

    # Check that the labels are in flexynesis format (all required columns present)
    required_cols = ['sample_id', 'variable', 'class_label', 'probability', 'known_label', 'predicted_label']
    is_flexynesis_format = all(col in labels.columns for col in required_cols)

    if not is_flexynesis_format:
        raise ValueError(f"Labels are not in flexynesis format (Custom labels). Please provide a valid label file with the required columns, {required_cols}.")

    # Parse target values from comma-separated string
    if args.target_value:
        target_values = [val.strip() for val in args.target_value.split(',')]
    else:
        # If no target values specified, use all unique variables
        target_values = labels['variable'].unique().tolist()

    print(f"Processing target values: {target_values}")

    for target_value in target_values:
        print(f"\nProcessing target value: '{target_value}'")

        # Filter labels for the current target value
        target_labels = labels[labels['variable'] == target_value]

        # Check if this is a regression problem (no class probabilities)
        prob_columns = target_labels['class_label'].unique()
        non_na_probs = target_labels['probability'].notna().sum()

        print(f" Class labels found: {list(prob_columns)}")
        print(f" Non-NaN probabilities: {non_na_probs}/{len(target_labels)}")

        # If most probabilities are NaN, this is likely a regression problem
        if non_na_probs < len(target_labels) * 0.1:  # Less than 10% valid probabilities
            print(" Detected regression problem - ROC curves not applicable")
            print(f" Skipping '{target_value}' (use regression evaluation metrics instead)")
            continue

        # Debug: Check data quality
        total_rows = len(target_labels)
        missing_labels = target_labels['known_label'].isna().sum()
        missing_probs = target_labels['probability'].isna().sum()
        unique_samples = target_labels['sample_id'].nunique()
        unique_classes = target_labels['class_label'].nunique()

        print(f" Data summary: {total_rows} total rows, {unique_samples} unique samples, {unique_classes} unique classes")
        print(f" Missing data: {missing_labels} missing known_label, {missing_probs} missing probability")

        if missing_labels > 0:
            print(f" Warning: Found {missing_labels} missing known_label values")
            missing_samples = target_labels[target_labels['known_label'].isna()]['sample_id'].unique()[:5]
            print(f" Sample IDs with missing known_label: {list(missing_samples)}")

            # Remove rows with missing known_label
            target_labels = target_labels.dropna(subset=['known_label'])
            if target_labels.empty:
                print(f" Error: No valid known_label data remaining for '{target_value}' - skipping")
                continue

        # 1. Pivot to wide format
        prob_df = target_labels.pivot(index='sample_id', columns='class_label', values='probability')

        print(f" After pivot: {prob_df.shape[0]} samples x {prob_df.shape[1]} classes")
        print(f" Class columns: {list(prob_df.columns)}")

        # Check for NaN values in probability data
        nan_counts = prob_df.isna().sum()
        if nan_counts.any():
            print(f" NaN counts per class: {dict(nan_counts)}")
            print(f" Samples with any NaN: {prob_df.isna().any(axis=1).sum()}/{len(prob_df)}")

            # Drop only rows where ALL probabilities are NaN
            all_nan_rows = prob_df.isna().all(axis=1)
            if all_nan_rows.any():
                print(f" Dropping {all_nan_rows.sum()} samples with all NaN probabilities")
                prob_df = prob_df[~all_nan_rows]

            remaining_nans = prob_df.isna().sum().sum()
            if remaining_nans > 0:
                print(f" Warning: {remaining_nans} individual NaN values remain - filling with 0")
                prob_df = prob_df.fillna(0)

        if prob_df.empty:
            print(f" Error: No valid probability data remaining for '{target_value}' - skipping")
            continue

        # 2. Get true labels
        true_labels_df = target_labels.drop_duplicates('sample_id')[['sample_id', 'known_label']].set_index('sample_id')

        # 3. Align indices - only keep samples that exist in both datasets
        common_indices = prob_df.index.intersection(true_labels_df.index)
        if len(common_indices) == 0:
            print(f" Error: No common sample_ids between probability and true label data for '{target_value}' - skipping")
            continue

        print(f" Found {len(common_indices)} samples with both probability and true label data")

        # Filter both datasets to common indices
        prob_df_aligned = prob_df.loc[common_indices]
        y_true = true_labels_df.loc[common_indices]['known_label']

        # 4. Final check for NaN values
        if y_true.isna().any():
            print(f" Error: True labels still contain NaN after alignment for '{target_value}' - skipping")
            continue

        if prob_df_aligned.isna().any().any():
            print(f" Error: Probability data still contains NaN after alignment for '{target_value}' - skipping")
            continue

        # 5. Convert categorical labels to integer labels
        # Create a mapping from class names to integers
        class_names = list(prob_df_aligned.columns)
        class_to_int = {class_name: i for i, class_name in enumerate(class_names)}

        print(f" Class mapping: {class_to_int}")

| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 920 # Convert true labels to integers | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 921 y_true_np = y_true.map(class_to_int).to_numpy() | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 922 y_probs_np = prob_df_aligned.to_numpy() | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 923 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 924 print(f" Data shape: y_true={y_true_np.shape}, y_probs={y_probs_np.shape}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 925 print(f" Unique true labels (integers): {set(y_true_np)}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 926 print(f" Class labels (columns): {class_names}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 927 print(f" Label distribution: {dict(zip(*np.unique(y_true_np, return_counts=True)))}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 928 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 929 # Check for any unmapped labels (will be NaN) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 930 if pd.isna(y_true_np).any(): | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 931 print(" Error: Some true labels could not be mapped to class columns") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 932 unmapped_labels = set(y_true[y_true.map(class_to_int).isna()]) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 933 print(f" Unmapped labels: {unmapped_labels}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 934 print(f" Available classes: {class_names}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 935 continue | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 936 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 937 try: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 938 print(f" Generating ROC curve for '{target_value}'...") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 939 fig = plot_roc_curves(y_true_np, y_probs_np) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 940 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 941 # Create output filename with target value | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 942 safe_target_name = target_value.replace('/', '_').replace('\\', '_').replace(' ', '_') | 
| 7 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 6 diff
changeset | 943 output_filename = f"{output_name_base}_{safe_target_name}.{args.format}" | 
| 2 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 944 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 945 output_path = output_dir / output_filename | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 946 print(f" Saving ROC curve to: {output_path.absolute()}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 947 fig.save(output_path, dpi=args.dpi, bbox_inches='tight') | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 948 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 949 except Exception as e: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 950 print(f" Error generating ROC curve for '{target_value}': {e}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 951 print(f" Debug info - y_true type: {type(y_true_np)}, contains NaN: {pd.isna(y_true_np).any()}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 952 print(f" Debug info - y_probs type: {type(y_probs_np)}, contains NaN: {pd.isna(y_probs_np).any()}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 953 continue | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 954 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 955 print("ROC curves generated successfully!") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 956 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 957 | 
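The label-encoding step of the ROC routine above (step 5 and the unmapped-label check) can be tried in isolation. This is a minimal sketch, not code from the script: `encode_labels` is a hypothetical helper name, and the sample labels are made up for illustration. It assumes, as the routine does, that class names come from the columns of the probability table and that any true label without a matching column should be reported rather than silently dropped.

```python
import numpy as np
import pandas as pd


def encode_labels(y_true: pd.Series, class_names: list):
    """Hypothetical helper: map labels to the position of their class column.

    Returns the encoded array and the set of labels that had no column.
    """
    class_to_int = {name: i for i, name in enumerate(class_names)}
    mapped = y_true.map(class_to_int)      # labels without a column become NaN
    unmapped = set(y_true[mapped.isna()])  # original values that failed to map
    return mapped.to_numpy(), unmapped


# Illustrative data: label "C" has no probability column.
y_true = pd.Series(["A", "B", "C", "A"], index=["s1", "s2", "s3", "s4"])
codes, unmapped = encode_labels(y_true, ["A", "B"])

if unmapped:
    print(f"Unmapped labels: {unmapped}")
```

Because `Series.map` yields NaN for keys missing from the dictionary, the encoded array is floating point whenever a label is unmapped, which is why a NaN check on the array is enough to detect the problem before plotting.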
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 958 def generate_box_plots(labels, args, output_dir, output_name_base): | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 959 """Generate box plots for model predictions""" | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 960 | 
| 7 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 6 diff
changeset | 961 # Check that the labels table has every column of the flexynesis prediction format | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 6 diff
changeset | 962 required_cols = ['sample_id', 'variable', 'class_label', 'probability', 'known_label', 'predicted_label'] | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 6 diff
changeset | 963 is_flexynesis_format = all(col in labels.columns for col in required_cols) | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 6 diff
changeset | 964 | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 6 diff
changeset | 965 if not is_flexynesis_format: | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 6 diff
changeset | 966 raise ValueError(f"Labels are not in flexynesis format (custom labels detected). Please provide a label file with the required columns: {required_cols}.") | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 6 diff
changeset | 967 | 
| 2 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 968 print("Generating box plots...") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 969 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 970 # Parse target values from comma-separated string | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 971 if args.target_value: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 972 target_values = [val.strip() for val in args.target_value.split(',')] | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 973 else: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 974 # If no target values specified, use all unique variables | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 975 target_values = labels['variable'].unique().tolist() | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 976 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 977 print(f"Processing target values: {target_values}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 978 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 979 for target_value in target_values: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 980 print(f"\nProcessing target value: '{target_value}'") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 981 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 982 # Filter labels for the current target value | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 983 target_labels = labels[labels['variable'] == target_value] | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 984 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 985 if target_labels.empty: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 986 print(f" Warning: No data found for target value '{target_value}' - skipping") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 987 continue | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 988 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 989 # Check if this is a classification problem (has probabilities) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 990 prob_columns = target_labels['class_label'].unique() | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 991 non_na_probs = target_labels['probability'].notna().sum() | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 992 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 993 print(f" Class labels found: {list(prob_columns)}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 994 print(f" Non-NaN probabilities: {non_na_probs}/{len(target_labels)}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 995 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 996 # If most probabilities are NaN, this is likely a regression problem | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 997 if non_na_probs < len(target_labels) * 0.1: # Less than 10% valid probabilities | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 998 print(" Detected regression problem - box plots of class probabilities not applicable") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 999 print(f" Skipping '{target_value}' (use regression evaluation metrics instead)") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1000 continue | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1001 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1002 # Debug: Check data quality | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1003 total_rows = len(target_labels) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1004 missing_labels = target_labels['known_label'].isna().sum() | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1005 missing_probs = target_labels['probability'].isna().sum() | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1006 unique_samples = target_labels['sample_id'].nunique() | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1007 unique_classes = target_labels['class_label'].nunique() | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1008 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1009 print(f" Data summary: {total_rows} total rows, {unique_samples} unique samples, {unique_classes} unique classes") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1010 print(f" Missing data: {missing_labels} missing known_label, {missing_probs} missing probability") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1011 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1012 if missing_labels > 0: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1013 print(f" Warning: Found {missing_labels} missing known_label values") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1014 missing_samples = target_labels[target_labels['known_label'].isna()]['sample_id'].unique()[:5] | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1015 print(f" Sample IDs with missing known_label: {list(missing_samples)}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1016 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1017 # Remove rows with missing known_label | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1018 target_labels = target_labels.dropna(subset=['known_label']) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1019 if target_labels.empty: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1020 print(f" Error: No valid known_label data remaining for '{target_value}' - skipping") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1021 continue | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1022 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1023 # Remove rows with missing data | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1024 clean_data = target_labels.dropna(subset=['known_label', 'probability']) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1025 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1026 if clean_data.empty: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1027 print(" No valid data after cleaning - skipping") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1028 continue | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1029 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1030 # Get unique classes | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1031 classes = clean_data['class_label'].unique() | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1032 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1033 for class_label in classes: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1034 print(f" Generating box plot for class: {class_label}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1035 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1036 # Filter for current class | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1037 class_data = clean_data[clean_data['class_label'] == class_label] | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1038 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1039 try: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1040 # Create the box plot | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1041 fig = plot_boxplot( | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1042 categorical_x=class_data['known_label'], | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1043 numerical_y=class_data['probability'], | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1044 title_x='True Label', | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1045 title_y=f'Predicted Probability ({class_label})', | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1046 ) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1047 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1048 # Save the plot | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1049 safe_class_name = str(class_label).replace('/', '_').replace('\\', '_').replace(' ', '_').replace(':', '_') | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1050 safe_target_name = target_value.replace('/', '_').replace('\\', '_').replace(' ', '_') | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1051 output_filename = f"{output_name_base}_{safe_target_name}_{safe_class_name}.{args.format}" | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1052 output_path = output_dir / output_filename | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1053 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1054 print(f" Saving box plot to: {output_path.absolute()}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1055 fig.savefig(output_path, dpi=args.dpi, bbox_inches='tight') | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1056 plt.close(fig) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1057 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1058 except Exception as e: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1059 print(f" Error generating box plot for class '{class_label}': {e}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1060 continue | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1061 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1062 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1063 def main(): | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1064 """Main function to parse arguments and generate plots""" | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1065 parser = argparse.ArgumentParser(description="Generate plots using flexynesis") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1066 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1067 # Labels file (optional at the parser level; checked per plot type below) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1068 parser.add_argument("--labels", type=str, required=False, | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1069 help="Path to labels file generated by flexynesis") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1070 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1071 # Plot type | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1072 parser.add_argument("--plot_type", type=str, required=True, | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1073 choices=['dimred', 'kaplan_meier', 'cox', 'scatter', 'concordance_heatmap', 'pr_curve', 'roc_curve', 'box_plot'], | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1074 help="Type of plot to generate: 'dimred' for dimensionality reduction, 'kaplan_meier' for survival analysis, 'cox' for Cox proportional hazards analysis, 'scatter' for scatter plots, 'concordance_heatmap' for label concordance heatmaps, 'pr_curve' for precision-recall curves, 'roc_curve' for ROC curves, or 'box_plot' for box plots.") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1075 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1076 # Arguments for dimensionality reduction | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1077 parser.add_argument("--embeddings", type=str, | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1078 help="Path to input data embeddings file (CSV or tabular format). Required for dimred plots.") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1079 parser.add_argument("--method", type=str, default='pca', choices=['pca', 'umap'], | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1080 help="Transformation method ('pca' or 'umap'). Default is 'pca'. Used for dimred plots.") | 
| 7 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 1081 parser.add_argument("--color", type=str, default=None, | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 1082 help="User-defined color for the plot.") | 
| 2 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1083 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1084 # Arguments for Kaplan-Meier | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1085 parser.add_argument("--survival_data", type=str, | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1086 help="Path to survival data file with columns: duration and event. Required for kaplan_meier plots.") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1087 parser.add_argument("--surv_time_var", type=str, required=False, | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1088 help="Column name for survival time") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1089 parser.add_argument("--surv_event_var", type=str, required=False, | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1090 help="Column name for survival event") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1091 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1092 # Arguments for Cox analysis | 
| 7 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 1093 parser.add_argument("--important_features", type=str, | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 1094 help="Path to calculated feature importance file. Required for cox plots.") | 
| 2 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1095 parser.add_argument("--clinical_train", type=str, | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1096 help="Path to training dataset (pickle file). Required for cox plots.") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1097 parser.add_argument("--clinical_test", type=str, | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1098 help="Path to test dataset (pickle file). Required for cox plots.") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1099 parser.add_argument("--omics_train", type=str, default=None, | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1100 help="Path to training omics dataset. Optional for cox plots.") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1101 parser.add_argument("--omics_test", type=str, default=None, | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1102 help="Path to test omics dataset. Optional for cox plots.") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1103 parser.add_argument("--clinical_variables", type=str, | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1104 help="Comma-separated list of clinical variables to include in Cox model (e.g., 'AGE,SEX,HISTOLOGICAL_DIAGNOSIS,STUDY')") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1105 parser.add_argument("--top_features", type=int, default=20, | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1106 help="Number of top important features to include in Cox model. Default is 20") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1107 parser.add_argument("--crossval", action='store_true', | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1108 help="If set, performs K-fold cross-validation and returns the average C-index. Disabled by default") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1109 parser.add_argument("--n_splits", type=int, default=5, | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1110 help="Number of folds for cross-validation. Default is 5") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1111 parser.add_argument("--random_state", type=int, default=42, | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1112 help="Random seed for reproducibility. Default is 42") | 
| 7 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 1113 parser.add_argument("--layer", type=str, default=None, | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 1114 help="Class label for filtering important features.") | 
| 2 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1115 | 
| 6 
3f7481deaff1
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 1afbaf45449e25238935e222f983da62392c067a
 bgruening parents: 
2diff
changeset | 1116 # Arguments for dimred, scatter plot, heatmap, PR curves, ROC curves, and box plots | 
| 2 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1117 parser.add_argument("--target_value", type=str, default=None, | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1118 help="Target value to plot. Used for dimred, scatter, concordance heatmap, PR curve, ROC curve, and box plots.") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1119 | 
| 7 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 1120 # Arguments for scatter plots and concordance heatmaps | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 1121 parser.add_argument("--true_label", type=str, default=None, | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 1122 help="Column name for true labels in scatter plots and concordance heatmaps.") | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 1123 parser.add_argument("--predicted_label", type=str, default=None, | 
| 
693011647a67
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit 6b520305ec30e6dc37eba92c67a5368cea0fc5ad
 bgruening parents: 
6diff
changeset | 1124 help="Column name for predicted labels in scatter plots and concordance heatmaps.") | 
| 2 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1125 # Common arguments | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1126 parser.add_argument("--output_dir", type=str, default='output', | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1127 help="Output directory. Default is 'output'") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1128 parser.add_argument("--output_name", type=str, default=None, | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1129 help="Output filename base") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1130 parser.add_argument("--format", type=str, default='jpg', choices=['png', 'pdf', 'svg', 'jpg'], | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1131 help="Output format for the plot. Default is 'jpg'") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1132 parser.add_argument("--dpi", type=int, default=300, | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1133 help="DPI for the output image. Default is 300") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1134 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1135 args = parser.parse_args() | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1136 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1137 try: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1138 # Validate plot type | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1139 if not args.plot_type: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1140 raise ValueError("Please specify a plot type using --plot_type") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1141 if args.plot_type not in ['dimred', 'kaplan_meier', 'cox', 'scatter', 'concordance_heatmap', 'pr_curve', 'roc_curve', 'box_plot']: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1142 raise ValueError(f"Invalid plot type: {args.plot_type}. Must be one of: 'dimred', 'kaplan_meier', 'cox', 'scatter', 'concordance_heatmap', 'pr_curve', 'roc_curve', 'box_plot'") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1143 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1144 # Validate plot type requirements | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1145 if args.plot_type in ['dimred']: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1146 if not args.embeddings: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1147 raise ValueError("--embeddings is required when plot_type is 'dimred'") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1148 if not os.path.isfile(args.embeddings): | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1149 raise FileNotFoundError(f"Embeddings file not found: {args.embeddings}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1150 if not args.labels: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1151 raise ValueError("--labels is required for dimensionality reduction plots") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1152 if not args.method: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1153 raise ValueError("--method is required for dimensionality reduction plots") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1154 | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1155 if args.plot_type in ['kaplan_meier']: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1156 if not args.survival_data: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1157 raise ValueError("--survival_data is required when plot_type is 'kaplan_meier'") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1158 if not os.path.isfile(args.survival_data): | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1159 raise FileNotFoundError(f"Survival data file not found: {args.survival_data}") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1160 if not args.labels: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1161 raise ValueError("--labels is required for Kaplan-Meier plots") | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1162 if not args.method: | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1163 raise ValueError("--method is required for Kaplan-Meier plots") | 
        if not args.surv_time_var:
            raise ValueError("--surv_time_var is required for Kaplan-Meier plots")
        if not args.surv_event_var:
            raise ValueError("--surv_event_var is required for Kaplan-Meier plots")

    if args.plot_type in ['cox']:
        if not args.important_features:
            raise ValueError("--important_features is required when plot_type is 'cox'")
        if not os.path.isfile(args.important_features):
            raise FileNotFoundError(f"Important features file not found: {args.important_features}")
        if not args.clinical_train:
            raise ValueError("--clinical_train is required when plot_type is 'cox'")
        if not os.path.isfile(args.clinical_train):
            raise FileNotFoundError(f"Training dataset file not found: {args.clinical_train}")
        if not args.clinical_test:
            raise ValueError("--clinical_test is required when plot_type is 'cox'")
        if not os.path.isfile(args.clinical_test):
            raise FileNotFoundError(f"Test dataset file not found: {args.clinical_test}")
        if not args.omics_train:
            raise ValueError("--omics_train is required when plot_type is 'cox'")
        if not os.path.isfile(args.omics_train):
            raise FileNotFoundError(f"Training omics dataset file not found: {args.omics_train}")
        if not args.omics_test:
            raise ValueError("--omics_test is required when plot_type is 'cox'")
        if not os.path.isfile(args.omics_test):
            raise FileNotFoundError(f"Test omics dataset file not found: {args.omics_test}")
        if not args.surv_time_var:
            raise ValueError("--surv_time_var is required for Cox plots")
        if not args.surv_event_var:
            raise ValueError("--surv_event_var is required for Cox plots")
        if not args.clinical_variables:
            print("--clinical_variables is not set for Cox plots")
        if not isinstance(args.top_features, int) or args.top_features <= 0:
            raise ValueError("--top_features must be a positive integer")
        if not args.crossval:
            args.crossval = False
        if not isinstance(args.n_splits, int) or args.n_splits <= 0:
            raise ValueError("--n_splits must be a positive integer")
        if not isinstance(args.random_state, int):
            raise ValueError("--random_state must be an integer")
        if not args.layer:
            print("--layer is not specified, using all classes from labels")

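The `--top_features` and `--n_splits` checks above run after parsing and only in the Cox branch. As a sketch of an alternative (not the script's actual code), the same constraint can be attached to the argument itself through an argparse `type` callable. The names `positive_int` and `build_parser`, and the default values, are hypothetical and chosen only for illustration.

```python
import argparse


def positive_int(value: str) -> int:
    """Convert a command-line string to an int and reject values <= 0."""
    try:
        number = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError(f"{value!r} is not an integer") from None
    if number <= 0:
        raise argparse.ArgumentTypeError(f"{value!r} must be a positive integer")
    return number


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: only the two numeric options are shown, and the
    # defaults are placeholders rather than the tool's real defaults.
    parser = argparse.ArgumentParser()
    parser.add_argument("--top_features", type=positive_int, default=20)
    parser.add_argument("--n_splits", type=positive_int, default=5)
    return parser
```

With this approach argparse reports an invalid value as a usage error at parse time, so the later `isinstance` checks would become redundant for these two options.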
    if args.plot_type in ['scatter']:
        if not args.labels:
            raise ValueError("--labels is required for scatter plots")
        if not args.target_value:
            print("--target_value is not specified, using all unique variables from labels")
        if not os.path.isfile(args.labels):
            raise FileNotFoundError(f"Labels file not found: {args.labels}")

    if args.plot_type in ['concordance_heatmap']:
        if not args.labels:
            raise ValueError("--labels is required for concordance heatmap")
        if not args.target_value:
            print("--target_value is not specified, using all unique variables from labels")
        if not os.path.isfile(args.labels):
            raise FileNotFoundError(f"Labels file not found: {args.labels}")

    if args.plot_type in ['pr_curve']:
        if not args.labels:
            raise ValueError("--labels is required for precision-recall curves")
        if not args.target_value:
            print("--target_value is not specified, using all unique variables from labels")
        if not os.path.isfile(args.labels):
            raise FileNotFoundError(f"Labels file not found: {args.labels}")

    if args.plot_type in ['roc_curve']:
        if not args.labels:
            raise ValueError("--labels is required for ROC curves")
        if not args.target_value:
            print("--target_value is not specified, using all unique variables from labels")
        if not os.path.isfile(args.labels):
            raise FileNotFoundError(f"Labels file not found: {args.labels}")

    if args.plot_type in ['box_plot']:
        if not args.labels:
            raise ValueError("--labels is required for box plots")
        if not args.target_value:
            print("--target_value is not specified, using all unique variables from labels")
        if not os.path.isfile(args.labels):
            raise FileNotFoundError(f"Labels file not found: {args.labels}")

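The scatter, concordance heatmap, precision-recall, ROC and box plot branches above apply the same three checks to `--labels` and `--target_value`. One possible way to state them once is a small helper. This is a sketch under assumptions: `LABEL_PLOTS` and `validate_label_plot` are hypothetical names that do not appear in the script.

```python
import os
from argparse import Namespace

# Wording used in the error message, keyed by plot type.
LABEL_PLOTS = {
    "scatter": "scatter plots",
    "concordance_heatmap": "concordance heatmap",
    "pr_curve": "precision-recall curves",
    "roc_curve": "ROC curves",
    "box_plot": "box plots",
}


def validate_label_plot(args: Namespace) -> None:
    """Run the shared --labels / --target_value checks for label-based plots."""
    description = LABEL_PLOTS.get(args.plot_type)
    if description is None:
        return  # not a label-based plot type, nothing to check here
    if not args.labels:
        raise ValueError(f"--labels is required for {description}")
    if not args.target_value:
        print("--target_value is not specified, using all unique variables from labels")
    if not os.path.isfile(args.labels):
        raise FileNotFoundError(f"Labels file not found: {args.labels}")
```

The helper keeps the order of the original checks: a missing `--labels` raises first, the `--target_value` notice is printed next, and the file existence test comes last.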
    # Validate other arguments
    if args.method not in ['pca', 'umap']:
        raise ValueError("Method must be 'pca' or 'umap'")

    # Create output directory
    output_dir = Path(args.output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    print(f"Output directory: {output_dir.absolute()}")

    # Generate output filename base
    if args.output_name:
        output_name_base = args.output_name
    else:
        if args.plot_type == 'dimred':
            embeddings_name = Path(args.embeddings).stem
            output_name_base = f"{embeddings_name}_{args.method}"
        elif args.plot_type == 'kaplan_meier':
            survival_name = Path(args.survival_data).stem
            output_name_base = f"{survival_name}_km"
        elif args.plot_type == 'cox':
            model_name = Path(args.important_features).stem
            output_name_base = f"{model_name}_cox"
        elif args.plot_type == 'scatter':
            labels_name = Path(args.labels).stem
            output_name_base = f"{labels_name}_scatter"
        elif args.plot_type == 'concordance_heatmap':
            labels_name = Path(args.labels).stem
            output_name_base = f"{labels_name}_concordance"
        elif args.plot_type == 'pr_curve':
            labels_name = Path(args.labels).stem
            output_name_base = f"{labels_name}_pr_curves"
        elif args.plot_type == 'roc_curve':
            labels_name = Path(args.labels).stem
            output_name_base = f"{labels_name}_roc_curves"
        elif args.plot_type == 'box_plot':
            labels_name = Path(args.labels).stem
            output_name_base = f"{labels_name}_box_plot"

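The `elif` chain above derives the default base name from the stem of one input file plus a suffix fixed by the plot type. A lookup table is one possible way to express the same mapping. The sketch below uses the hypothetical names `NAME_SOURCES` and `default_output_name` and mirrors only the cases shown above.

```python
from argparse import Namespace
from pathlib import Path

# plot type -> (attribute holding the input path, suffix template)
NAME_SOURCES = {
    "dimred": ("embeddings", "{method}"),
    "kaplan_meier": ("survival_data", "km"),
    "cox": ("important_features", "cox"),
    "scatter": ("labels", "scatter"),
    "concordance_heatmap": ("labels", "concordance"),
    "pr_curve": ("labels", "pr_curves"),
    "roc_curve": ("labels", "roc_curves"),
    "box_plot": ("labels", "box_plot"),
}


def default_output_name(args: Namespace) -> str:
    """Return args.output_name if set, else '<input stem>_<suffix>'."""
    if args.output_name:
        return args.output_name
    attribute, template = NAME_SOURCES[args.plot_type]
    stem = Path(getattr(args, attribute)).stem
    # Only the 'dimred' template contains {method}; the others pass through unchanged.
    suffix = template.format(method=getattr(args, "method", ""))
    return f"{stem}_{suffix}"
```

One behavioral difference from the chain above: an unknown plot type raises `KeyError` here, whereas the original leaves `output_name_base` unassigned in that case.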
    # Generate plots based on type
    if args.plot_type in ['dimred']:
        # Load labels
        print(f"Loading labels from: {args.labels}")
        labels = load_labels(args.labels)
        # Load embeddings data
        print(f"Loading embeddings from: {args.embeddings}")
        embeddings, sample_names = load_embeddings(args.embeddings)
        print(f"embeddings shape: {embeddings.shape}")

        # Match samples to embeddings
        matched_labels = match_samples_to_embeddings(sample_names, labels)
        print(f"Successfully matched {len(matched_labels)} samples for dimensionality reduction")
        print(f"Matched labels shape: {matched_labels.shape}")
        print(f"Columns in matched labels: {matched_labels.columns.tolist()}")
| 2 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1300 generate_dimred_plots(embeddings, matched_labels, args, output_dir, output_name_base) | 
| 
ee7e61ab554d
planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/flexynesis commit b2463fb68d0ae54864d87718ee72f5e063aa4587
 bgruening parents: diff
changeset | 1301 | 
        elif args.plot_type in ['kaplan_meier']:
            # Load labels
            print(f"Loading labels from: {args.labels}")
            labels = load_labels(args.labels)
            # Load survival data
            print(f"Loading survival data from: {args.survival_data}")
            survival_data = load_labels(args.survival_data)
            print(f"Survival data shape: {survival_data.shape}")

            generate_km_plots(survival_data, labels, args, output_dir, output_name_base)

        elif args.plot_type in ['cox']:
            # Load important_features and datasets
            print(f"Loading important features from: {args.important_features}")
            important_features = load_labels(args.important_features)
            print(f"Loading training dataset from: {args.clinical_train}")
            clinical_train = load_omics(args.clinical_train)
            print(f"Loading test dataset from: {args.clinical_test}")
            clinical_test = load_omics(args.clinical_test)
            print(f"Loading training omics dataset from: {args.omics_train}")
            omics_train = load_omics(args.omics_train)
            print(f"Loading test omics dataset from: {args.omics_test}")
            omics_test = load_omics(args.omics_test)

            generate_cox_plots(important_features, clinical_train, clinical_test, omics_test, omics_train, args, output_dir, output_name_base)

        elif args.plot_type in ['scatter']:
            # Load labels
            print(f"Loading labels from: {args.labels}")
            labels = load_labels(args.labels)

            generate_plot_scatter(labels, args, output_dir, output_name_base)

        elif args.plot_type in ['concordance_heatmap']:
            # Load labels
            print(f"Loading labels from: {args.labels}")
            labels = load_labels(args.labels)

            generate_label_concordance_heatmap(labels, args, output_dir, output_name_base)

        elif args.plot_type in ['pr_curve']:
            # Load labels
            print(f"Loading labels from: {args.labels}")
            labels = load_labels(args.labels)

            generate_pr_curves(labels, args, output_dir, output_name_base)

        elif args.plot_type in ['roc_curve']:
            # Load labels
            print(f"Loading labels from: {args.labels}")
            labels = load_labels(args.labels)

            generate_roc_curves(labels, args, output_dir, output_name_base)

        elif args.plot_type in ['box_plot']:
            # Load labels
            print(f"Loading labels from: {args.labels}")
            labels = load_labels(args.labels)

            generate_box_plots(labels, args, output_dir, output_name_base)

        print("All plots generated successfully!")

    except (FileNotFoundError, ValueError, pd.errors.ParserError) as e:
        print(f"Error: {e}")
        return 1

    return 0


if __name__ == "__main__":
    exit(main())
