The Sequence Alignment/Map (SAM) format is a generic nucleotide alignment format that describes the alignment of query sequences or sequencing reads to a reference sequence or assembly.
It is flexible enough to store all the alignment information generated by various alignment programs.
It is simple enough to be easily generated by alignment programs or converted from existing alignment formats.
It is compact in file size.
It allows most operations on the alignment to work on a stream without loading the whole alignment into memory.
It allows the file to be indexed by genomic position to efficiently retrieve all reads aligning to a locus.
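The stream-friendly property can be illustrated with a minimal Python sketch. It assumes the input is any iterable of text lines (an open file, a pipe, or `io.StringIO`), that header lines start with `@`, and that the reference name is the third tab-separated column. The function name `count_reads_per_reference` and the sample records are hypothetical, made up for illustration.

```python
import io
from collections import Counter


def count_reads_per_reference(lines):
    """Count alignment lines per reference name, reading one line at a time.

    Only the running counts are kept in memory, never the whole file.
    """
    counts = Counter()
    for line in lines:
        # Skip blank lines and header lines (which start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Assumption: column 3 (index 2) holds the reference name.
        counts[fields[2]] += 1
    return counts


# Hypothetical sample data, invented for this example.
sam_text = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:1000\n"
    "r001\t0\tchr1\t7\t30\t8M\t*\t0\t0\tTTAGATAA\t*\n"
    "r002\t0\tchr1\t9\t30\t6M\t*\t0\t0\tAGATAA\t*\n"
    "r003\t4\t*\t0\t0\t*\t*\t0\t0\tGCCTAAGC\t*\n"
)
counts = count_reads_per_reference(io.StringIO(sam_text))
```

Because the function only iterates, the same code would work unchanged on a file handle or on standard input.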
SAM is a tab-delimited text format. Because text is relatively slow to parse, there is a binary equivalent to SAM, called BAM.
SAM allows optional fields to be stored. In SAM, each alignment must contain a fixed number of mandatory fields that describe
the key information about the alignment (such as coordinates, detailed alignment, and sequences) and may contain a variable
number of optional fields, which are less important or aligner-specific.
SAM is able to store clipped alignments, spliced alignments, multi-part alignments, padded alignments and alignments in colour space.
The extended CIGAR string is the key to describing these types of alignments.
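As a rough illustration of how an extended CIGAR string encodes these alignment types, the Python sketch below splits a CIGAR into (length, operation) pairs and derives how many reference and query bases the alignment spans. The operation letters (`M`, `I`, `D`, `N`, `S`, `H`, `P`, `=`, `X`) and which of them consume reference or query bases follow the SAM specification; the function names and the sample CIGAR are hypothetical.

```python
import re

# One CIGAR element: a run length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

# Operations that advance along the reference and along the query.
REF_CONSUMING = set("MDN=X")
QUERY_CONSUMING = set("MIS=X")


def parse_cigar(cigar):
    """Return a list of (length, op) pairs; '*' means no CIGAR is available."""
    if cigar == "*":
        return []
    ops = CIGAR_OP.findall(cigar)
    # Reject strings containing anything the pattern did not account for.
    if "".join(n + op for n, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return [(int(n), op) for n, op in ops]


def cigar_lengths(cigar):
    """Return (reference bases spanned, query bases in SEQ) for a CIGAR."""
    ops = parse_cigar(cigar)
    ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
    query_len = sum(n for n, op in ops if op in QUERY_CONSUMING)
    return ref_len, query_len


# Soft clip, match, deletion, insertion, skipped region (e.g. intron), match.
ref_len, query_len = cigar_lengths("5S10M2D3I20N15M")
```

Here the soft clip (`S`) and insertion (`I`) count toward the query only, while the deletion (`D`) and skipped region (`N`) count toward the reference only, which is how one string can describe both clipped and spliced alignments.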
SAM stands for Sequence Alignment/Map format. It is a TAB-delimited text format consisting of an
optional header section and an alignment section. If present, the header must precede
the alignments. Header lines start with '@', while alignment lines do not. Each alignment line has 11
mandatory fields for essential alignment information such as mapping position, and a variable number of
optional fields for flexible or aligner-specific information.
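The line layout described above can be sketched in Python as follows. The 11 mandatory field names (QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN, SEQ, QUAL) and the `TAG:TYPE:VALUE` shape of optional fields come from the SAM specification rather than from the text here. The `SamRecord` class, the `parse_alignment_line` function, and the sample line are hypothetical, and only the `i` and `f` optional types are converted; everything else is kept as a string.

```python
from typing import NamedTuple


class SamRecord(NamedTuple):
    qname: str   # query name
    flag: int    # bitwise flag
    rname: str   # reference name
    pos: int     # 1-based leftmost mapping position
    mapq: int    # mapping quality
    cigar: str   # extended CIGAR string
    rnext: str   # reference name of the mate/next read
    pnext: int   # position of the mate/next read
    tlen: int    # observed template length
    seq: str     # segment sequence
    qual: str    # base qualities
    tags: dict   # optional fields keyed by two-character tag


def parse_alignment_line(line):
    """Parse one alignment line (not a header line) into a SamRecord."""
    if line.startswith("@"):
        raise ValueError("header line, not an alignment line")
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 11:
        raise ValueError(f"expected at least 11 fields, got {len(fields)}")
    qname, flag, rname, pos, mapq, cigar, rnext, pnext, tlen, seq, qual = fields[:11]

    tags = {}
    for field in fields[11:]:
        # Split on the first two colons only; the value may contain ':'.
        tag, type_code, value = field.split(":", 2)
        if type_code == "i":
            value = int(value)
        elif type_code == "f":
            value = float(value)
        tags[tag] = value

    return SamRecord(qname, int(flag), rname, int(pos), int(mapq), cigar,
                     rnext, int(pnext), int(tlen), seq, qual, tags)


# Hypothetical alignment line with two optional fields.
line = ("r001\t99\tchr1\t7\t30\t8M2I4M1D3M\t=\t37\t39\t"
        "TTAGATAAAGGATACTG\t*\tNM:i:1\tXA:Z:a:b\n")
record = parse_alignment_line(line)
```

Keeping the optional fields in a dictionary mirrors the split between a fixed set of mandatory columns and a variable, aligner-dependent tail.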
|encoding||Default value is utf8 encoding.|
This value is an object containing the detailed alignment data.