10bp DNA for SFS

Sample DNA data set coded as binary data

The top panel shows a typical set of 10 bp DNA sequences from six individuals in a population. DNA sequence variants, called Single Nucleotide Polymorphisms (SNPs) or Segregating Sites, occur at all 10 positions. The middle panel re-codes the DNA sequences in binary form, in each case taking the state in Sequence I as 0 and any SNP as 1. We will assume for the moment that all SNP changes are from 0 → 1, and we call any 0 the ancestral state and any 1 the derived state. The bottom panel shows only the derived sites, as shaded "1"s, for ease of comparison. Each of the six individuals has a distinct haplotype, which may be written, for example, as #III = 0101010010. [Note that the three SNPs all involve transversions (alternative purine / pyrimidine bases): t/g, c/a, & a/t].
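The binary re-coding described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name recode_binary and the four toy sequences are hypothetical and are NOT the sequences shown in the figure. It assumes the sequences are aligned and of equal length, and that the first sequence carries the ancestral (0) state at every site.

```python
def recode_binary(seqs):
    """Recode aligned sequences as 0/1 strings relative to the first sequence.

    A base identical to the reference (first) sequence is coded 0 (ancestral);
    any differing base is coded 1 (derived).
    """
    reference = seqs[0]
    if any(len(s) != len(reference) for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    return [
        "".join("0" if base == ref else "1" for base, ref in zip(s, reference))
        for s in seqs
    ]


# Hypothetical toy data for illustration (not the data in the figure).
toy = [
    "acgtacgtac",  # reference sequence: all sites coded 0
    "acgtacgtaa",
    "aagtacgtac",
    "aagtccgtaa",
]

binary = recode_binary(toy)

# 1-based positions of the derived ("1") sites in each haplotype.
derived_positions = [
    [i + 1 for i, code in enumerate(b) if code == "1"] for b in binary
]

for label, b in zip(["I", "II", "III", "IV"], binary):
    print(f"#{label} = {b}")
```

Each printed line is a haplotype in the same notation as the text, and derived_positions lists where the shaded "1"s of the bottom panel would fall for the toy data.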


Figure & Text material © 2022 by Steven M. Carr