# HG changeset patch
# User antmarge
# Date 1490752579 14400
# Node ID 3ed885628c9f06060d33074a8a1f3ae5f01680ba
# Parent b66f4a551e2596fbc6e64158b7fc33da9a625516
Uploaded

diff -r b66f4a551e25 -r 3ed885628c9f dataOverview.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/dataOverview.xml Tue Mar 28 21:56:19 2017 -0400
@@ -0,0 +1,63 @@
+
+
+
+
+ summarize Tn-Seq libraries and genome
+
+
+ perl
+ perl_getopt_long
+ bioperl
+
+
+
+ dataOverview.pl
+ -f $fastaFile
+ -r $ref
+ -w $weight_ceiling
+ -c $cutoff
+ -o $outfile
+ $input
+ #for $a in $additionalcsv
+ ${a.input2}
+ #end for
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ **What it does**
+
+ This tool summarizes the Tn-Seq single insertion libraries and the organism's genome.
+
+ **The options explained**
+
+ The csv fitness file(s): These are the CSV (comma-separated values) files containing the fitness values that will be used in downstream analyses. Since they should have been produced by the "Calculate Fitness" tool, each line besides the header should hold the following fields for one insertion location: position,strand,count_1,count_2,ratio,mt_freq_t1,mt_freq_t2,pop_freq_t1,pop_freq_t2,gene,D,W,nW
+
+ GenBank reference genome: The reference genome of the organism you are working with, which must be in standard GenBank format. For more on that format, see the GenBank website.
+
+ Weight ceiling: This value caps the weights applied to fitness values. It is only relevant if you are using weighted algorithms.
+
+ Cutoff: Insertion locations with an average count (the sum of the counts from t1 and t2, divided by 2) below this value are ignored.
+
+ The name of your output file: Self-explanatory. Remember to have it end in ".csv".
+
+
+
+
+
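
The cutoff rule described in the help text can be sketched in a few lines of Python. This is only an illustration, not the code in dataOverview.pl, which this patch does not show. It assumes that count_1 and count_2 are the t1 and t2 counts, that a location whose average equals the cutoff is kept, and the function name filter_by_cutoff and the sample rows are made up for the example.

```python
import csv
import io

# Made-up sample rows in the column layout listed in the help text.
SAMPLE = """position,strand,count_1,count_2,ratio,mt_freq_t1,mt_freq_t2,pop_freq_t1,pop_freq_t2,gene,D,W,nW
100,+,4,2,0.5,0.1,0.05,0.9,0.95,geneA,1,0.8,0.8
250,-,30,20,0.67,0.2,0.15,0.8,0.85,geneB,1,0.9,0.9
400,+,10,10,1.0,0.1,0.1,0.9,0.9,geneC,1,1.0,1.0
"""


def filter_by_cutoff(rows, cutoff):
    """Keep rows whose average of count_1 and count_2 is not below cutoff.

    Assumption: count_1 and count_2 hold the t1 and t2 counts.
    """
    kept = []
    for row in rows:
        average = (float(row["count_1"]) + float(row["count_2"])) / 2
        if average >= cutoff:
            kept.append(row)
    return kept


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
kept = filter_by_cutoff(rows, cutoff=10)
print([row["position"] for row in kept])
```

With a cutoff of 10, the first sample row (average count 3) is dropped and the other two are kept.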