view ETn_example/DESCRIPTION.txt @ 18:565118df598a draft

Uploaded 20170531
author fabio
date Wed, 31 May 2017 11:07:22 -0400
parents 4c8a31fb202a

This example contains two region datasets, "ETn fixed" and "Control", and one feature, "Recombination hotspots content".
In particular, the region dataset "ETn fixed" contains 1296 genomic regions of 64 kb surrounding
fixed ETn elements (32 kb of flanking sequence upstream and 32 kb of flanking sequence downstream
of each element). The region dataset "Control" contains 1142 regions of 64 kb without elements,
used as controls in the test. The regions are aligned around their center (i.e. around the ETn integration
site for the "ETn fixed" regions).
Recombination hotspot measurements are associated with each "ETn fixed" and "Control" region. In
particular, this feature is measured in 1-kb windows, so that each region is associated with a recombination
hotspots curve made of 64 values. The measurement used is the feature content, i.e. the
fraction of each 1-kb window that is covered by recombination hotspots.
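
As an illustration of the feature content described above, the following Python sketch computes a
64-value curve for one region. It is not the original pre-processing code: the function name
window_content, the half-open (start, end) coordinates, the assumption that hotspot intervals do not
overlap each other, and the example positions are all hypothetical.

```python
def window_content(region_start, hotspots, n_windows=64, window_size=1000):
    """Return the fraction of each window covered by hotspot intervals.

    region_start: genomic coordinate where the region begins.
    hotspots: list of (start, end) half-open intervals, assumed
              non-overlapping (an assumption of this sketch).
    """
    curve = []
    for i in range(n_windows):
        w_start = region_start + i * window_size
        w_end = w_start + window_size
        covered = 0
        for h_start, h_end in hotspots:
            # Length of the intersection between window and hotspot.
            overlap = min(w_end, h_end) - max(w_start, h_start)
            if overlap > 0:
                covered += overlap
        curve.append(covered / window_size)
    return curve


# Hypothetical region: 32 kb upstream and 32 kb downstream of the center.
center = 100_000
region_start = center - 32_000

# Hypothetical hotspot intervals (made-up positions, for illustration only).
hotspots = [(68_500, 70_250), (99_900, 100_100)]

curve = window_content(region_start, hotspots)
print(len(curve))   # one value per 1-kb window
print(curve[:3])    # content of the first three windows
```

Each value lies between 0 (no hotspot in the window) and 1 (window entirely covered), and the center
of the region falls on the boundary between the 32nd and 33rd windows.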

Data have been collected and pre-processed by: R Campos-Sanchez, MA Cremona, A Pini, F
Chiaromonte and KD Makova (2016). Integration and fixation preferences of human and mouse
endogenous retroviruses uncovered with Functional Data Analysis. PLoS Computational Biology.
12(6): 1-41.
Fixed ETn positions come from: Y Zhang, IA Maksakova, L Gagnier, LN van de Lagemaat, DL
Mager (2008). Genome-wide assessments reveal extremely high levels of polymorphism of two
active families of mouse endogenous retroviral elements. PLoS Genetics. 4: e1000007.
Recombination hotspots data come from: H Brunschwig, L Levi, E Ben-David, RW Williams,
B Yakir, S Shifman (2012). Fine-scale maps of recombination rates and hotspots in the mouse
genome. Genetics. 191: 757-764.