diff shm_first.htm @ 39:a24f8c93583a draft

Uploaded
author davidvanzessen
date Thu, 22 Dec 2016 09:39:27 -0500
parents
children ba33b94637ca
line wrap: on
line diff
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/shm_first.htm	Thu Dec 22 09:39:27 2016 -0500
@@ -0,0 +1,127 @@
+<html>
+
+<head>
+<meta http-equiv=Content-Type content="text/html; charset=windows-1252">
+<meta name=Generator content="Microsoft Word 14 (filtered)">
+<style>
+<!--
+ /* Font Definitions */
+ @font-face
+	{font-family:Calibri;
+	panose-1:2 15 5 2 2 2 4 3 2 4;}
+ /* Style Definitions */
+ p.MsoNormal, li.MsoNormal, div.MsoNormal
+	{margin-top:0in;
+	margin-right:0in;
+	margin-bottom:10.0pt;
+	margin-left:0in;
+	line-height:115%;
+	font-size:11.0pt;
+	font-family:"Calibri","sans-serif";}
+.MsoChpDefault
+	{font-family:"Calibri","sans-serif";}
+.MsoPapDefault
+	{margin-bottom:10.0pt;
+	line-height:115%;}
+@page WordSection1
+	{size:8.5in 11.0in;
+	margin:1.0in 1.0in 1.0in 1.0in;}
+div.WordSection1
+	{page:WordSection1;}
+-->
+</style>
+
+</head>
+
+<body lang=EN-US>
+
+<div class=WordSection1>
+
+<p class=MsoNormalCxSpFirst style='margin-bottom:0in;margin-bottom:.0001pt;
+text-align:justify;line-height:normal'><span lang=EN-GB style='font-size:12.0pt;
+font-family:"Times New Roman","serif"'>Table showing the order of each
+filtering step and the number and percentage of sequences after each filtering
+step. </span></p>
+
+<p class=MsoNormalCxSpMiddle style='margin-bottom:0in;margin-bottom:.0001pt;
+text-align:justify;line-height:normal'><u><span lang=EN-GB style='font-size:
+12.0pt;font-family:"Times New Roman","serif"'>Input:</span></u><span
+lang=EN-GB style='font-size:12.0pt;font-family:"Times New Roman","serif"'> The
+number of sequences in the original IMGT file. This is always 100% of the
+sequences.</span></p>
+
+<p class=MsoNormalCxSpMiddle style='margin-bottom:0in;margin-bottom:.0001pt;
+text-align:justify;line-height:normal'><u><span lang=EN-GB style='font-size:
+12.0pt;font-family:"Times New Roman","serif"'>After &quot;no results&quot; filter: </span></u><span
+lang=EN-GB style='font-size:12.0pt;font-family:"Times New Roman","serif"'>IMGT
+classifies sequences either as &quot;productive&quot;, &quot;unproductive&quot;, &quot;unknown&quot;, or &quot;no
+results&quot;. Here, the number and percentages of sequences that are not classified
+as &quot;no results&quot; are reported.</span></p>
+
+<p class=MsoNormalCxSpMiddle style='margin-bottom:0in;margin-bottom:.0001pt;
+text-align:justify;line-height:normal'><u><span lang=EN-GB style='font-size:
+12.0pt;font-family:"Times New Roman","serif"'>After functionality filter:</span></u><span
+lang=EN-GB style='font-size:12.0pt;font-family:"Times New Roman","serif"'> The
+number and percentages of sequences that have passed the functionality filter. The
+filtering performed is dependent on the settings of the functionality filter.
+Details on the functionality filter <a name="OLE_LINK12"></a><a
+name="OLE_LINK11"></a><a name="OLE_LINK10">can be found on the start page of
+the SHM&amp;CSR pipeline</a>.</span></p>
+
+<p class=MsoNormalCxSpMiddle style='text-align:justify'><u><span lang=EN-GB
+style='font-size:12.0pt;line-height:115%;font-family:"Times New Roman","serif"'>After
+removal sequences that are missing a gene region:</span></u><span lang=EN-GB
+style='font-size:12.0pt;line-height:115%;font-family:"Times New Roman","serif"'>
+In this step all sequences that are missing a gene region (FR1, CDR1, FR2,
+CDR2, FR3) that should be present are removed from analysis. The sequence
+regions that should be present are dependent on the settings of the sequence
+starts at filter. <a name="OLE_LINK9"></a><a name="OLE_LINK8">The number and
+percentage of sequences that pass this filter step are reported.</a> </span></p>
+
+<p class=MsoNormalCxSpMiddle style='text-align:justify'><u><span lang=EN-GB
+style='font-size:12.0pt;line-height:115%;font-family:"Times New Roman","serif"'>After
+N filter:</span></u><span lang=EN-GB style='font-size:12.0pt;line-height:115%;
+font-family:"Times New Roman","serif"'> In this step all sequences that contain
+an ambiguous base (n) in the analysed region or the CDR3 are removed from the
+analysis. The analysed region is determined by the setting of the sequence
+starts at filter. The number and percentage of sequences that pass this filter
+step are reported.</span></p>
+
+<p class=MsoNormalCxSpMiddle style='text-align:justify'><u><span lang=EN-GB
+style='font-size:12.0pt;line-height:115%;font-family:"Times New Roman","serif"'>After
+filter unique sequences</span></u><span lang=EN-GB style='font-size:12.0pt;
+line-height:115%;font-family:"Times New Roman","serif"'>: The number and
+percentage of sequences that pass the &quot;filter unique sequences&quot; filter. Details
+on this filter </span><span lang=EN-GB style='font-size:12.0pt;line-height:
+115%;font-family:"Times New Roman","serif"'>can be found on the start page of
+the SHM&amp;CSR pipeline</span></p>
+
+<p class=MsoNormalCxSpMiddle style='text-align:justify'><u><span lang=EN-GB
+style='font-size:12.0pt;line-height:115%;font-family:"Times New Roman","serif"'>After
+remove duplicate based on filter:</span></u><span lang=EN-GB style='font-size:
+12.0pt;line-height:115%;font-family:"Times New Roman","serif"'> The number and
+percentage of sequences that passed the remove duplicate filter. Details on the
+&quot;remove duplicate filter based on filter&quot; can be found on the start page of the
+SHM&amp;CSR pipeline.</span></p>
+
+<p class=MsoNormalCxSpMiddle style='text-align:justify'><a name="OLE_LINK17"></a><a
+name="OLE_LINK16"><u><span lang=EN-GB style='font-size:12.0pt;line-height:115%;
+font-family:"Times New Roman","serif"'>Number of matches sequences:</span></u></a><span
+lang=EN-GB style='font-size:12.0pt;line-height:115%;font-family:"Times New Roman","serif"'>
+The number and percentage of sequences that passed all the filters described
+above and have a (sub)class assigned.</span></p>
+
+<p class=MsoNormalCxSpMiddle style='text-align:justify'><u><span lang=EN-GB
+style='font-size:12.0pt;line-height:115%;font-family:"Times New Roman","serif"'>Number
+of unmatched sequences</span></u><span lang=EN-GB style='font-size:12.0pt;
+line-height:115%;font-family:"Times New Roman","serif"'>: The number and percentage
+of sequences that passed all the filters described above and do not have
+subclass assigned.</span></p>
+
+<p class=MsoNormal><span lang=EN-GB>&nbsp;</span></p>
+
+</div>
+
+</body>
+
+</html>