The Minimum Letter Flip Problem for Haplotyping a Single Individual
When haplotyping a single individual, DNA is replicated, broken into smaller fragments, and then sequenced by a machine. The minimum letter flip problem is one approach to correcting errors that arise in shotgun sequencing. We refer to the minimum letter flip problem as stated in other texts as CGMLF. CGMLF changes a minimal number of single nucleotide polymorphisms (SNPs) to create a feasible SNP matrix. Since finding partitions that have this property is especially difficult, we relate CGMLF to several 2-median problem formulations. Our major result is that the CGMLF is equivalent to a non-polynomial 2-median problem formulation. We develop algorithms used to solve the 2-median problems and discuss their complexity. In conclusion, we develop inequality relations for our problem formulations based on minimum number of flips and prove that CGMLF can be bounded from above in polynomial time.