annotate src/breadcrumbs/README.md @ 0:0de566f21448 draft default tip

v2
author sagun98
date Thu, 03 Jun 2021 18:13:32 +0000
# BreadCrumbs #

BreadCrumbs is an unofficial collection of scripts and code intended to consolidate functions for tool development and to provide command-line access to commonly used functions. BreadCrumbs mostly includes functionality associated with metagenomics analysis, but you never know what you will find!

## Dependencies: ##

1. Cogent https://pypi.python.org/pypi/cogent
2. MatplotLib http://matplotlib.org/downloads.html
3. Mercurial http://mercurial.selenic.com/ (optional for downloading)
4. Numpy http://www.numpy.org/
5. Python 2.x http://www.python.org/download/
6. SciPy http://www.scipy.org/install.html
7. biom support http://biom-format.org/
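
Before running any scripts, it can help to confirm that the Python dependencies listed above are importable. The following is a minimal sketch, not part of BreadCrumbs: the function name `check_modules` is hypothetical, and the import names (`cogent`, `matplotlib`, `numpy`, `scipy`, `biom`) are assumptions based on how these packages are usually imported.

```python
import importlib

# Import names are assumptions based on how each package is usually imported;
# they are not taken from the BreadCrumbs source.
REQUIRED_MODULES = ["cogent", "matplotlib", "numpy", "scipy", "biom"]


def check_modules(module_names):
    """Return a dict mapping each module name to True if it can be imported."""
    status = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            status[name] = True
        except Exception:
            # Any failure during import (missing package, incompatible
            # Python version, broken install) is reported as unavailable.
            status[name] = False
    return status


if __name__ == "__main__":
    for name, available in sorted(check_modules(REQUIRED_MODULES).items()):
        print("%-12s %s" % (name, "found" if available else "MISSING"))
```

Mercurial is not checked here because it is a command line tool and is only needed for downloading.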

## How to download ##

To download BreadCrumbs from Bitbucket, use the command:

> hg clone https://bitbucket.org/timothyltickle/breadcrumbs

To update BreadCrumbs, run the following two commands in sequence from the BreadCrumbs directory:

> hg pull
> hg update

## Scripts: ##

Scripts are included to expose core functionality through the command line. Currently these scripts center on manipulating and visualizing abundance tables.
Brief descriptions of the scripts follow:

* *Hclust.py* Flexible script to create a visualization of hierarchical clustering of abundance tables (or other matrices).

* *scriptBiplotTSV.R* Allows one to plot a TSV file as a biplot using nonmetric multidimensional scaling.

* *scriptPlotFeature.py* Allows one to plot a histogram, boxplot, or scatter plot of a bug or metadata in an abundance table. Will work on any row in a matrix.

* *scriptManipulateTable.py* Allows one to perform common functions on an abundance table, including summing, normalizing, filtering, and stratifying tables.

* *scriptPcoa.py* Allows one to plot a principal coordinates analysis (PCoA) plot of an abundance table.

* *scriptConvertBetweenBIOMAndPCL.py* Allows one to convert between BIOM and PCL file formats.
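
To make the table operations mentioned above concrete, here is a minimal sketch of normalizing and filtering a small abundance table in plain Python. It is illustrative only and does not reproduce the behavior or options of *scriptManipulateTable.py*. The function names, the layout (rows are features, columns are samples), and the mean-abundance filter are all assumptions.

```python
def normalize_columns(table):
    """Scale each sample (column) so that its values sum to 1.

    `table` is a list of rows (features); each row is a list of numeric
    abundances with one entry per sample. Columns that sum to zero are
    returned as all zeros.
    """
    if not table:
        return []
    column_sums = [sum(column) for column in zip(*table)]
    return [
        [value / float(total) if total else 0.0
         for value, total in zip(row, column_sums)]
        for row in table
    ]


def filter_rows_by_mean(table, min_mean):
    """Keep only rows whose mean value across samples is at least `min_mean`."""
    return [row for row in table
            if row and sum(row) / float(len(row)) >= min_mean]


if __name__ == "__main__":
    # Hypothetical counts: 3 features (rows) measured in 3 samples (columns).
    counts = [
        [10, 0, 5],
        [30, 20, 5],
        [60, 0, 10],
    ]
    relative = normalize_columns(counts)
    for row in filter_rows_by_mean(relative, 0.2):
        print(row)
```

Normalizing before filtering means the threshold is expressed as a relative abundance rather than a raw count, which keeps it comparable across samples with different sequencing depth.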

## Programming Classes: ##

Brief descriptions of classes are as follows. More detailed descriptions are given in the classes themselves.

* *AbundanceTable* Data structure to contain and perform operations on an abundance table.

* *BoxPlot* Wrapper to plot box plots.

* *CClade* Helper object used in hierarchical summing and normalization.

* *Cladogram* Object that manipulated an early dendrogram visualization. Deprecated; use the GraPhlAn visualization tool on Bitbucket instead.

* *CommandLine* Collection of code to work with the command line. Deprecated; use sfle calls instead.

* *ConstantsBreadCrumbs* Contains generic constants.

* *ConstantsFiguresBreadCrumbs* Contains constants associated with formatting figures.

* *KMedoids* Code from MLPY which performs KMedoids sample selection.

* *MLPYDistanceAdaptor* Used to allow custom distance matrices to be used by KMedoids.

* *Metric* Difference functions associated with distance and diversity metrics.

* *PCoA* Functionality surrounding the plotting of a PCoA.

* *PlotMatrix* Allows one to plot a matrix of numbers.

* *SVM* Support Vector Machine associated scripts.

* *Utility* Generic functions.

* *UtilityMath* Generic math related functions.

* *ValidateData* Collection of functions to validate data types when needed.
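
As an illustration of the kind of distance and diversity metrics the *Metric* entry refers to, the sketch below computes Bray-Curtis dissimilarity between two samples and the Shannon diversity of one sample. These are standard textbook definitions written in plain Python; the function names are hypothetical and this is not the *Metric* class API.

```python
import math


def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity: sum(|a_i - b_i|) / sum(a_i + b_i).

    Returns 0.0 for identical samples and 1.0 for samples that share no
    features. Two all-zero samples are treated as identical.
    """
    if len(sample_a) != len(sample_b):
        raise ValueError("samples must have the same number of features")
    total = sum(sample_a) + sum(sample_b)
    if total == 0:
        return 0.0
    difference = sum(abs(a - b) for a, b in zip(sample_a, sample_b))
    return difference / float(total)


def shannon_diversity(sample):
    """Shannon index H = -sum(p_i * ln(p_i)) over features with p_i > 0."""
    total = float(sum(sample))
    if total == 0:
        return 0.0
    proportions = [count / total for count in sample if count > 0]
    return -sum(p * math.log(p) for p in proportions)


if __name__ == "__main__":
    # Hypothetical abundance counts for two samples over four features.
    gut = [12, 0, 7, 1]
    skin = [3, 9, 0, 8]
    print("Bray-Curtis: %.3f" % bray_curtis(gut, skin))
    print("Shannon (gut): %.3f" % shannon_diversity(gut))
```

A pairwise matrix of such dissimilarities is the usual input to ordination methods such as PCoA.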

## Demo input files: ##

* *fastunifrac_Ley_et_al_NRM_2_sample_id_map.txt* Example Unifrac Id mapping file (source http://bmf2.colorado.edu/fastunifrac/tutorial.psp)

* *GreenGenesCore-May09.ref.tre* Example Greengenes core set reference for Unifrac demo (source http://bmf2.colorado.edu/fastunifrac/tutorial.psp)

* *Test.pcl* Example file / Test PCL file to run scripts on.

* *Test.biom* Example file / Test BIOM file to run scripts on.

* *Test_no_metadata.pcl* Example file / Test PCL file to run scripts on which does not have metadata.

* *Test_no_metadata.biom* Example file / Test BIOM file to run scripts on which does not have metadata.

* *Test-biplot.tsv* Example file / Test file for the scriptBiplotTSV.R

## Contributing Authors: ##
Timothy Tickle, George Weingart, Nicola Segata, Curtis Huttenhower

## Contact: ##
Please feel free to contact ttickle@hsph.harvard.edu with questions.