# HG changeset patch
# User mkhan1980
# Date 1362397133 18000
# Node ID 5f235b95619fbacd5ecb8c5e3c5ede0890b90198
# Parent 2cceb9398d33bb8c32c4892ee5d9edc1da9f2918
Uploaded

diff -r 2cceb9398d33 -r 5f235b95619f check2.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/check2.xml Mon Mar 04 06:38:53 2013 -0500
@@ -0,0 +1,43 @@
+
+
+ check2.pl $input $input2 $output
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Background:
+This tool computationally predicts CTCF sites for a nucleotide sequence located on the reverse (-) strand. The user must provide two input files. The first is the nucleotide sequence of interest on the - strand in FASTA format (this can be obtained from the UCSC Genome Browser or Ensembl). The second must be a FASTA-formatted file containing the chromosome number and the genomic position of the last nucleotide of the sequence, separated by a tab. For example, if the sequence of interest is located on chromosome 3 and ends at genomic position 1870000, the first line of the second input file must be a FASTA header line (beginning with >), and the second line must be chr3 and 1870000 separated by a tab.
+
+Details of Algorithm:
+CTCF sites are predicted by applying the following equation:
+w(α,j) = log2(((f(α,j) + sqrt(N) x b(α)) / (N + sqrt(N))) / b(α))
+
+Where w(α,j) is the weight of nucleotide α at position j, f(α,j) is the number of times nucleotide α occurs at position j, N is the total number of binding sites (that is, the sum of all nucleotide occurrences in the column), and b(α) is the prior background frequency of nucleotide α.
+
+For any sequence of length m, the sum of the weights of its nucleotides at the corresponding columns of the matrix estimates how likely that sequence is to be an instance of a CTCF binding site. The score takes into account the GC content of the genomic region being scanned.
+
+
+Citation and further help: For further details of the algorithm, please refer to:
+
+Khan MA, Soto-Jimenez LM, Howe T, Streit A, Sosinsky A, Stern CD (2013). Computational tools and resources for prediction and analysis of gene regulatory regions in the chick genome. Genesis. doi:10.1002/dvg.22375
+
+
+For queries or questions, email ucbtmaf@ucl.ac.uk
+
+
+
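
The weight equation in the help text above can be illustrated with a short Python sketch. This is not the code in check2.pl, which the patch does not show; the function names, the toy count matrix, and the uniform background frequencies are all assumptions chosen for illustration. The sketch builds the per-column weights from nucleotide counts and sums them to score a candidate sequence of length m.

```python
import math

NUCLEOTIDES = "ACGT"


def pwm_weights(counts, background):
    """Turn per-column nucleotide counts into log2 weights.

    counts: dict mapping each nucleotide to a list of counts, one per column.
    background: dict mapping each nucleotide to its prior frequency b.
    Implements w = log2(((f + sqrt(N) * b) / (N + sqrt(N))) / b) per column.
    """
    m = len(counts["A"])
    weights = []
    for j in range(m):
        # N is the sum of all nucleotide occurrences in column j.
        n_sites = sum(counts[a][j] for a in NUCLEOTIDES)
        root = math.sqrt(n_sites)
        column = {}
        for a in NUCLEOTIDES:
            corrected = (counts[a][j] + root * background[a]) / (n_sites + root)
            column[a] = math.log2(corrected / background[a])
        weights.append(column)
    return weights


def score(sequence, weights):
    """Sum the weights of the nucleotides of one sequence of length m."""
    if len(sequence) != len(weights):
        raise ValueError("sequence length must equal the matrix width")
    # Assumes the sequence contains only A, C, G and T.
    return sum(weights[j][a] for j, a in enumerate(sequence.upper()))


def scan(sequence, weights):
    """Score every window of length m; returns (offset, score) pairs."""
    m = len(weights)
    return [(i, score(sequence[i:i + m], weights))
            for i in range(len(sequence) - m + 1)]


# Hypothetical toy matrix: 4 binding sites, 3 columns (not real CTCF data).
toy_counts = {
    "A": [4, 0, 1],
    "C": [0, 4, 1],
    "G": [0, 0, 1],
    "T": [0, 0, 1],
}
# Assumed uniform background; a GC-aware scan would estimate these
# frequencies from the genomic region being scanned.
uniform_background = {a: 0.25 for a in NUCLEOTIDES}

toy_weights = pwm_weights(toy_counts, uniform_background)
hits = scan("TTACGA", toy_weights)
```

In this toy matrix the third column has equal counts for all four nucleotides, so every weight in that column is zero and the column does not influence the score. Replacing the uniform background with frequencies estimated from the scanned region is one way the GC content could enter the calculation.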