Code Tinkering: How does Speed Congenics work

I operate a speed congenics core facility, and in this post I will explain what that is. I'm in the process of automating a lot of the mundane tasks associated with this facility, and I'm going to share the code on this blog.

Let's talk about congenics first.

In genetics, a congenic mouse is a mouse that differs from its inbred background strain at only one locus in the genome. This difference is normally introduced by a molecular genetic technique, such as a knock-in, knock-out, or transgene.

Knock-in - a targeted insertion of a protein-coding cDNA sequence into a particular locus in the mouse genome
Knock-out - a targeted deletion of a particular gene in the mouse
Transgene - an insertion of a protein-coding cDNA sequence at a random site in the genome

Inbred mouse strains are strains in which every mouse is genetically the same at every locus. A very common inbred mouse strain is C57BL/6 (I will call this mouse B6). This strain is favored because it was the first mouse strain to have its genome fully sequenced. Therefore, just by inertia, it became the most widely used and studied inbred mouse strain.

However, not every genetically modified mouse is made on the B6 background, and not every researcher uses the B6 inbred mouse. Therefore the modification needs to be transferred, via breeding, onto the B6 background or the background of the researcher's choice.

Now keep the binary search algorithm in mind.

Example: a guess-the-number game between 0 and 100, played with binary search.

Each time we guess, we pick the number right in the middle of the range that is still possible.

Guess (0-100)   Feedback   # of guesses
50              Less       1
25              Less       2
12              Greater    3
18              Greater    4
21              Greater    5
23              Less       6
22              Correct    7
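The guessing game can be sketched in a few lines of Python. This is only an illustration of textbook binary search, not code from the facility; the function name guess_number and its parameters are made up for this example. Because it rounds the midpoint down and never re-guesses a number it has already ruled out, the individual guesses can differ by one from the hand-worked game above, but the worst case over 0-100 is still 7 guesses.

```python
def guess_number(target, low=0, high=100):
    """Play the guessing game with binary search.

    Returns the list of guesses made, ending with the target.
    `guess_number`, `low` and `high` are illustrative names only.
    """
    if not low <= target <= high:
        raise ValueError("target must lie between low and high")
    guesses = []
    while low <= high:
        mid = (low + high) // 2      # guess the middle of the remaining range
        guesses.append(mid)
        if mid == target:
            return guesses
        if target < mid:             # feedback: "Less"
            high = mid - 1
        else:                        # feedback: "Greater"
            low = mid + 1
    raise RuntimeError("unreachable for an integer target inside the range")
```

For example, guess_number(22) makes the guesses 50, 24, 11, 17, 20, 22, and no target between 0 and 100 needs more than 7 guesses.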

However, in the case of speed congenics the number we are aiming for is always 100, because we are trying to go from 0% B6 (I will use B6 as the example recipient background) to ~99% B6. The speediest way to get there takes about as many steps as the worst case of binary search.

Speed congenics reduces the time spent backcrossing by choosing the best mouse of each generation, meaning the one with the highest percentage of recipient genome. Traditional backcrossing can take 12 generations (3-4 years). When the best mouse is chosen from each generation, it can take 5-6 generations (1.5-2 years).
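To put rough numbers on the halving idea, here is a small sketch of the textbook expectation for backcrossing without any selection: on average each backcross generation halves the remaining donor genome, so after n generations (counting the F1 as generation 1) the expected recipient fraction is 1 - 0.5^n. The function names are hypothetical, and the sketch deliberately does not model the marker-based selection that makes speed congenics faster than this baseline.

```python
def expected_recipient_fraction(generation):
    """Expected recipient-genome fraction after `generation` backcross
    generations with no selection (F1 counted as generation 1)."""
    if generation < 1:
        raise ValueError("generation must be 1 or greater")
    return 1.0 - 0.5 ** generation


def generations_to_reach(target):
    """Smallest generation whose expected recipient fraction is >= target."""
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    n = 1
    while expected_recipient_fraction(n) < target:
        n += 1
    return n
```

Under this simple model, generations_to_reach(0.99) gives 7 and generations_to_reach(0.999) gives 10. These are averages only; individual mice fall above and below the expectation, which is exactly what selecting the best mouse takes advantage of.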

To make everything as efficient as possible, a backcross protocol is followed that maximizes the number of mice and cleans the sex chromosomes very quickly. The mice with the highest percent B6 are chosen by microsatellite analysis, using markers that cover the full genome about every 10 cM. The primers are fluorescently labeled and multiplexed to reduce time. The pictures below show one backcross from the CBA background to the B6 background. These are actual results for each generation.
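Choosing the highest percent B6 mouse can be sketched as follows. This is a minimal illustration, not the facility's actual pipeline. It assumes each marker has already been called as "R" (homozygous recipient), "H" (heterozygous) or "D" (homozygous donor), and it estimates percent recipient by simply counting recipient alleles, ignoring marker spacing and the sex chromosomes. The mouse IDs and calls in the usage note are invented.

```python
# Recipient alleles contributed by each (hypothetical) genotype code.
RECIPIENT_ALLELES = {"R": 2, "H": 1, "D": 0}


def percent_recipient(calls):
    """Percent of marker alleles that come from the recipient strain."""
    if not calls:
        raise ValueError("no marker calls given")
    recipient = sum(RECIPIENT_ALLELES[c] for c in calls)
    return 100.0 * recipient / (2 * len(calls))


def best_mouse(mice):
    """Return (mouse_id, percent) for the mouse with the highest percent.

    `mice` maps a mouse ID to its list of marker calls.
    """
    scored = {mouse_id: percent_recipient(calls) for mouse_id, calls in mice.items()}
    winner = max(scored, key=scored.get)
    return winner, scored[winner]
```

With made-up data such as {"m1": ["H", "R", "R", "H"], "m2": ["R", "R", "R", "H"]}, best_mouse picks m2 at 87.5%, ahead of m1 at 75%.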

So what are microsatellites? Microsatellites are short tandem repeats of DNA. Here is an example: GCT ACACACAC CAACAA ACGT. This example has dinucleotide (AC) and trinucleotide (CAA) repeats. Microsatellites are useful for speed congenics because they are polymorphic between mouse strains. Here's an example of peaks seen in the GeneMapper software. You can see here that each of three different mouse strains has a different size for this particular microsatellite marker.
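As a small illustration of what "short tandem repeat" means, the sketch below scans a sequence for dinucleotide and trinucleotide units repeated back to back, using a regular-expression backreference. The function name and its parameters are made up for this example; it is a toy scanner, not a marker design tool.

```python
import re


def find_tandem_repeats(sequence, unit_lengths=(2, 3), min_repeats=2):
    """Find tandem repeats of short units in a DNA sequence.

    Returns a list of (unit, repeat_count, start_index) tuples sorted by
    position. Spaces are ignored so grouped sequences can be pasted in.
    """
    sequence = sequence.replace(" ", "").upper()
    found = []
    for k in unit_lengths:
        # ([ACGT]{k}) captures one unit; \1{n,} requires further copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(sequence):
            unit = match.group(1)
            if len(set(unit)) == 1:
                continue  # skip runs of a single base such as AAAA
            found.append((unit, len(match.group(0)) // k, match.start()))
    return sorted(found, key=lambda hit: hit[2])
```

Run on the example sequence above, it reports AC repeated 4 times starting at index 3 and CAA repeated 2 times starting at index 11.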

Here is an example of a heterozygous marker vs a homozygous marker. The analysis process is very time consuming. Some automatic calling is done by the GeneMapper software, but it takes a lot of manual time to transfer the data into a "picture" so the customers can easily understand the data. I'm still thinking about the best way to automate this.
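One possible starting point for automating this step is sketched below. It is only an assumption about how it could be done, not a finished design. It assumes the peak sizes (in base pairs) for a marker have already been exported, that the expected allele sizes for the recipient and donor strains are known, and that the two sizes differ by more than the tolerance. The function names, the 1 bp tolerance and the symbols used in the text "picture" are all hypothetical.

```python
def call_marker(peaks, recipient_size, donor_size, tolerance=1.0):
    """Call one marker from its observed peak sizes (in bp).

    Returns "R" (homozygous recipient), "H" (heterozygous),
    "D" (homozygous donor) or "?" when no expected peak is seen.
    """
    has_recipient = any(abs(p - recipient_size) <= tolerance for p in peaks)
    has_donor = any(abs(p - donor_size) <= tolerance for p in peaks)
    if has_recipient and has_donor:
        return "H"
    if has_recipient:
        return "R"
    if has_donor:
        return "D"
    return "?"


# Hypothetical symbols for a one-line text "picture" of a mouse's genome.
SYMBOLS = {"R": "#", "H": "+", "D": ".", "?": "?"}


def render_row(mouse_id, calls):
    """Render a mouse's marker calls as a fixed-width text row."""
    return f"{mouse_id:<8}" + "".join(SYMBOLS[c] for c in calls)
```

For instance, a marker with peaks at 150.1 and 157.8 bp, checked against a recipient allele of 150 bp and a donor allele of 158 bp, is called "H", and a row of calls such as R, H, D renders as "#+." after the mouse ID.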

Here is a little outline of the entire process. I will be writing and posting programs for the Design of panel and Analysis steps of this process.

Friday, April 19, 2013
