SAM

namespace: SMRUCC.genomics.SequenceModel.SAM

The Sequence Alignment/Map (SAM) format is a generic nucleotide alignment format that describes the alignment of query sequences or sequencing reads to a reference sequence or assembly.
Importantly:

It is flexible enough to store all the alignment information generated by various alignment programs;
It is simple enough to be easily generated by alignment programs or converted from existing alignment formats;
It is compact in file size;
It allows most of the operations on the alignment to work on a stream without loading the whole alignment into memory;
It allows the file to be indexed by genomic position to efficiently retrieve all reads aligning to a locus.

SAM is a tab-delimited text format. Because text is relatively slow to parse, SAM has a binary equivalent called BAM.

SAM allows optional fields to be stored. Each alignment must contain a fixed number of mandatory fields that describe
the key information about the alignment (such as coordinates, detailed alignment, and sequences) and may contain a variable
number of optional fields that are less important or aligner-specific.

SAM is able to store clipped alignments, spliced alignments, multi-part alignments, padded alignments and alignments in colour space.
The extended CIGAR string is the key to describing these types of alignments.
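
To make the role of the extended CIGAR string concrete, here is a minimal Python sketch that splits a CIGAR string into its operations and computes how many reference bases the alignment covers. It follows the operation letters defined by the SAM specification (M, I, D, N, S, H, P, =, X); the function names `parse_cigar` and `reference_span` are hypothetical and are not part of SMRUCC.genomics.

```python
import re

# One CIGAR element: a run length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

# Operations that advance along the reference sequence.
CONSUMES_REFERENCE = set("MDN=X")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    if cigar == "*":  # '*' means the CIGAR is unavailable
        return []
    ops = CIGAR_OP.findall(cigar)
    # Reject strings containing anything the pattern did not cover.
    if "".join(n + op for n, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return [(int(n), op) for n, op in ops]


def reference_span(cigar):
    """Number of reference bases covered by the alignment."""
    return sum(n for n, op in parse_cigar(cigar) if op in CONSUMES_REFERENCE)
```

Clipping (S, H), splicing (N) and padding (P) are all expressed as ordinary operations here, which is why a single parser can handle every alignment type listed above.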

SAM stands for Sequence Alignment/Map format. It is a TAB-delimited text format consisting of an
optional header section and an alignment section. If present, the header must precede
the alignments. Header lines start with '@', while alignment lines do not. Each alignment line has 11
mandatory fields for essential alignment information such as mapping position, and a variable number of
optional fields for flexible or aligner-specific information.

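In short: a SAM file is a TAB-delimited sequence alignment file with an optional header section. Header lines start with '@', while lines in the alignment section do not.
Each alignment line has 11 fields that store alignment information, such as the mapping position.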
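
The layout described above (optional '@' header lines first, then alignment lines with 11 mandatory TAB-separated fields plus optional ones) can be sketched in Python as follows. This is an illustration of the file format only, not the implementation used by this namespace; `Alignment`, `parse_alignment_line` and `read_sam` are hypothetical names, and the field names follow the SAM specification.

```python
from typing import NamedTuple


class Alignment(NamedTuple):
    """The 11 mandatory SAM fields plus any optional TAG:TYPE:VALUE fields."""
    qname: str
    flag: int
    rname: str
    pos: int
    mapq: int
    cigar: str
    rnext: str
    pnext: int
    tlen: int
    seq: str
    qual: str
    tags: tuple


def parse_alignment_line(line):
    cols = line.rstrip("\r\n").split("\t")
    if len(cols) < 11:
        raise ValueError(f"expected at least 11 fields, got {len(cols)}")
    return Alignment(
        cols[0], int(cols[1]), cols[2], int(cols[3]), int(cols[4]),
        cols[5], cols[6], int(cols[7]), int(cols[8]), cols[9], cols[10],
        tuple(cols[11:]),  # optional fields are kept as raw strings
    )


def read_sam(lines):
    """Split an iterable of SAM lines into header lines and alignments."""
    headers, alignments = [], []
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines
        if line.startswith("@"):
            if alignments:
                raise ValueError("header line found after alignment lines")
            headers.append(line.rstrip("\r\n"))
        else:
            alignments.append(parse_alignment_line(line))
    return headers, alignments
```

Because `read_sam` accepts any iterable of lines, it can be fed an open file object and processes the input one line at a time.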

Methods

Assembling

SMRUCC.genomics.SequenceModel.SAM.SAM.Assembling(System.Collections.Generic.Dictionary{System.Int32,Microsoft.VisualBasic.List{SMRUCC.genomics.SequenceModel.SAM.AlignmentReads}},System.Boolean)
Parameter Name Remarks
Alignment Make sure the reads are sorted by direction first.
Reversed -

Load

SMRUCC.genomics.SequenceModel.SAM.SAM.Load(System.String,System.Text.Encoding)

Loads SAM-format data from a text file.

Parameter Name Remarks
Path -
encoding Defaults to UTF-8 encoding.

TrimUnmappedReads

SMRUCC.genomics.SequenceModel.SAM.SAM.TrimUnmappedReads

Removes reads that were not mapped to the genome.
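
This page does not say how TrimUnmappedReads decides that a read is unmapped. In the SAM specification, an unmapped read is marked by bit 0x4 of the FLAG field (the second column), so a filter of this kind could be sketched in Python as below. The function names are hypothetical and the logic is an assumption based on the specification, not a description of this method's internals.

```python
UNMAPPED = 0x4  # FLAG bit defined by the SAM specification: segment unmapped


def is_unmapped(flag):
    """True when the FLAG value has the 'segment unmapped' bit set."""
    return bool(flag & UNMAPPED)


def trim_unmapped(lines):
    """Yield header lines and mapped alignment lines, dropping unmapped reads."""
    for line in lines:
        if line.startswith("@"):
            yield line  # header lines pass through unchanged
            continue
        # FLAG is the second TAB-separated column of an alignment line.
        flag = int(line.split("\t", 2)[1])
        if not is_unmapped(flag):
            yield line
```

Since `trim_unmapped` is a generator, it filters a stream of lines without holding the whole file in memory.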

Properties

AlignmentsReads

This value holds the detailed alignment data.

If present, the header must precede the alignments. Header lines start with '@', while alignment lines do not.