comparison README @ 0:cec60c540546

Uploaded
author galaxyp
date Wed, 26 Jun 2013 15:56:16 -0400
1 Inputs:
2
3 - A tabular file that contains a column with a peptide sequence and a column with an identifier for a reference sequence
4 - fasta files for the reference sequences
5 - gff or gtf for mapping the reference sequences to a genome
6 - reference genome fasta
7
8 Ensembl transcript_id files: Homo_sapiens.GRCh37.71.gtf,GRCh37.fa
9 transcript gtf+reference
10 map peptide to 3-frame translation of transcript
11 map to reference genome with ensembl gtf
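
A minimal sketch of how exon records could be collected from an Ensembl-style GTF before mapping, assuming tab-separated GTF lines whose ninth column holds attributes of the form `transcript_id "..."`. The function name `exons_by_transcript` is hypothetical and is not taken from the tool itself.

```python
import re
from collections import defaultdict

# Matches GTF attribute pairs such as: transcript_id "ENST00000000001"
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def exons_by_transcript(gtf_lines):
    """Group exon features by transcript_id (hypothetical helper).

    Returns {transcript_id: (seqname, strand, [(start, end), ...])} where
    coordinates are 1-based inclusive and sorted by genomic start.
    """
    exons = defaultdict(list)
    location = {}
    for line in gtf_lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Only exon features with a full nine-column record are used.
        if len(fields) < 9 or fields[2] != "exon":
            continue
        attrs = dict(ATTR_RE.findall(fields[8]))
        tid = attrs.get("transcript_id")
        if tid is None:
            continue
        exons[tid].append((int(fields[3]), int(fields[4])))
        location[tid] = (fields[0], fields[6])
    return {tid: (*location[tid], sorted(spans)) for tid, spans in exons.items()}
```

With a file on disk this could be called as `exons_by_transcript(open(path))`, since the function only iterates over lines.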
12
13 ECGene ec_id files: ECgene_hg18_b1_low.fa,GRCh37.fa
14 transcript from ecgene.fa
15 map peptide to 3-frame translation of transcript
16 map transcript to reference genome with blat
17
18 Augustus id files: ssc10.2.RNA.hints.augustus.fa, ssc10.2.RNA.hints.augustus.gff
19 map peptide to augustus protein fasta
20 map to reference genome with GFF3
21
22 EEJ files: Homo_sapiens.GRCh37.71.gtf,eej_sus_scrofa_core_70_102.fa
23 map peptide to eej fasta
24 parse id to find exon names and junc_pos
25 map to reference genome with exon_id in ensembl GTF
26
27
28 Output:
29 a GFF3 file that specifies the position of the peptide in a reference genome
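
A minimal sketch of writing one such GFF3 record, assuming the standard nine tab-separated GFF3 columns with 1-based inclusive coordinates. The function name `gff3_line` and the default `source` and `feature` values are placeholders chosen for illustration, not values the tool is known to emit.

```python
from urllib.parse import quote


def gff3_line(seqid, start, end, strand, peptide_id, peptide_seq,
              source="peptide_map", feature="match"):
    """Format one GFF3 feature line for a mapped peptide (hypothetical helper).

    start and end are 1-based inclusive genomic coordinates.
    """
    if start > end:
        raise ValueError("GFF3 requires start <= end")
    if strand not in ("+", "-", "."):
        raise ValueError("strand must be '+', '-' or '.'")
    # Percent-encode attribute values so ';', '=' and ',' cannot break parsing.
    attrs = "ID={};Name={}".format(quote(peptide_id, safe=""),
                                   quote(peptide_seq, safe=""))
    # Score and phase are left undefined ('.') in this sketch.
    columns = [seqid, source, feature, str(start), str(end), ".", strand, ".", attrs]
    return "\t".join(columns)
```

A complete file would start with the `##gff-version 3` directive, followed by one such line per mapped segment.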
30
31
32 Mapping:
33 find transcript in cDNA fasta:
34 find transcript in translated fasta:
35
36
37 peptide to transcript:
38 translate transcript to amino acid sequence and search for peptide
39 tblastn
40 Biopython
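
The translate-and-search option can be sketched in plain Python as follows. This is an illustration only: it assumes the standard genetic code and forward-strand frames, and the names `translate` and `find_peptide` are hypothetical. Biopython or tblastn, listed above, could be used instead.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(seq, frame):
    """Translate a nucleotide sequence in forward frame 0, 1 or 2.

    Codons containing ambiguous bases become 'X'; stop codons become '*'.
    """
    seq = seq.upper().replace("U", "T")
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(frame, len(seq) - 2, 3))


def find_peptide(transcript, peptide):
    """Return every (frame, nt_start, nt_end) where the peptide occurs.

    Coordinates are 0-based, half-open offsets into the transcript.
    """
    peptide = peptide.upper()
    if not peptide:
        return []
    hits = []
    for frame in range(3):
        protein = translate(transcript, frame)
        pos = protein.find(peptide)
        while pos != -1:
            nt_start = frame + 3 * pos
            hits.append((frame, nt_start, nt_start + 3 * len(peptide)))
            pos = protein.find(peptide, pos + 1)
    return hits
```

Returning nucleotide offsets rather than amino acid positions keeps the result directly usable for the transcript-to-genome step.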
41
42 transcript to genome:
43 If the fasta id lines contain the genomic mapping, use that
44 Map transcript to reference genome with BLAT
45 check whether the peptide crosses exon boundaries
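
A minimal sketch of the exon-boundary check, assuming the transcript's exons are already known as 1-based inclusive genomic intervals sorted by start (for example from a GTF or a BLAT alignment). The function name `transcript_to_genome` and its argument layout are hypothetical.

```python
def transcript_to_genome(exons, strand, t_start, t_end):
    """Map a transcript interval onto genomic segments (hypothetical helper).

    exons: list of (start, end), 1-based inclusive, sorted by genomic start.
    t_start, t_end: 0-based, half-open interval on the spliced transcript.
    Returns 1-based inclusive (start, end) segments in transcript order.
    """
    # On the minus strand the transcript begins at the last genomic exon.
    ordered = exons if strand == "+" else list(reversed(exons))
    segments = []
    offset = 0  # transcript offset of the first base of the current exon
    for g_start, g_end in ordered:
        length = g_end - g_start + 1
        # Clip the requested interval to the part covered by this exon.
        lo = max(t_start, offset)
        hi = min(t_end, offset + length)
        if lo < hi:
            if strand == "+":
                segments.append((g_start + lo - offset,
                                 g_start + hi - offset - 1))
            else:
                segments.append((g_end - (hi - offset) + 1,
                                 g_end - (lo - offset)))
        offset += length
    return segments


def crosses_exon_boundary(segments):
    """A peptide spans an exon junction when it maps to more than one segment."""
    return len(segments) > 1
```

Each returned segment would become its own feature line in the output, so a junction-spanning peptide yields two or more records.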
46