One way of looking at the data is as a mathematical graph: a collection of nodes and interconnecting undirected edges. This analysis doesn't use any known or suspected family tree information, only the DNA results (so far). Here, each node represents one distinct set of DNA marker results, and is labelled with the kit number(s) of the DNA tests that produced it and the group from the project results page. The group information is shown for reference only. Nodes connected by a line are a genetic distance of 1 apart, and each line is labelled with the name of the marker that changed. Fast-changing markers are in red, slower-changing markers are in black.
For a description of how these graphs are created, and how you could do it manually, see the howto page.
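The idea of nodes and distance-1 edges can be sketched in a few lines of Python. This is not the program used for this page; it is a minimal illustration that assumes genetic distance is the sum of the step differences across markers. The kit numbers and repeat counts are made up, and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Marker order used to turn each result into a tuple.
MARKERS = ["DYS393", "DYS390", "DYS19"]

# Hypothetical kits and repeat counts, for illustration only.
results = {
    "K1": {"DYS393": 13, "DYS390": 24, "DYS19": 14},
    "K2": {"DYS393": 13, "DYS390": 24, "DYS19": 14},
    "K3": {"DYS393": 13, "DYS390": 25, "DYS19": 14},
    "K4": {"DYS393": 13, "DYS390": 25, "DYS19": 15},
}

def build_nodes(results):
    """Group kits with identical marker values into a single node."""
    nodes = defaultdict(list)
    for kit, values in results.items():
        key = tuple(values[m] for m in MARKERS)
        nodes[key].append(kit)
    return dict(nodes)

def genetic_distance(a, b):
    """Assumed definition: total number of single steps between two results."""
    return sum(abs(x - y) for x, y in zip(a, b))

def build_edges(nodes):
    """Connect nodes at distance 1 and label the edge with the changed marker."""
    edges = []
    for a, b in combinations(nodes, 2):
        if genetic_distance(a, b) == 1:
            changed = next(m for m, x, y in zip(MARKERS, a, b) if x != y)
            edges.append((nodes[a], nodes[b], changed))
    return edges

nodes = build_nodes(results)
edges = build_edges(nodes)
```

With this sample data, K1 and K2 share one node, and the edges link that node to K3 (via DYS390) and K3 to K4 (via DYS19).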
Sometimes nodes that are further apart than 1 are still related closely enough to be interesting. In this case the program creates nodes with intermediate DNA results, and labels them with 't' followed by a number so you can refer to them. When the distance between two nodes is greater than one, usually more than one marker has changed, and you don't know which marker changed first. Every possible order has to be shown, so there are usually 2 intermediate nodes between nodes of distance 2, and 6 intermediate nodes between nodes of distance 3. If you look cockeyed at a graph of two nodes of distance 3 with their 6 intermediate nodes, it will look like a cube (perhaps bent out of shape a bit). This is not a coincidence. There are normally 3 different markers involved, and each is effectively a different dimension of difference.
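The counts of 2 and 6 come from trying every combination of "changed" and "not yet changed" for the markers that differ, then dropping the two endpoints. Here is a rough sketch of that enumeration, assuming each differing marker is exactly one step apart (the usual case described above). The function name and the sample values are hypothetical.

```python
from itertools import product

def intermediates(a, b):
    """List every result strictly between a and b.

    Assumes each marker that differs does so by a single step, so each
    differing marker is either still at a's value or already at b's.
    """
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    found = []
    for choice in product((False, True), repeat=len(diff)):
        if not any(choice) or all(choice):
            continue  # these two combinations are the endpoints a and b
        node = list(a)
        for i, take_b in zip(diff, choice):
            if take_b:
                node[i] = b[i]
        found.append(tuple(node))
    return found

two_apart = intermediates((13, 24, 14), (13, 25, 15))
three_apart = intermediates((13, 24, 14), (14, 25, 15))
```

For d differing markers this gives 2^d - 2 intermediate nodes: 2 for distance 2 (the other two corners of a square) and 6 for distance 3 (the other six corners of a cube).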
This example graph uses 37 markers and creates intermediate nodes for
nodes of distance 2. Some things to note:
It takes at least 3 tests to make this interesting. For one test, you can
predict that all of the ancestors match exactly, in the absence of any
other information.
For two tests, you can predict that any differences between them
occurred somewhere along one of the lines back to the common ancestor,
but you don't know where.
For three or more tests, you can start predicting where the change occurred.
If three people descended from a common ancestor, and two have one marker value
and the other has a different one, you can assume the mutation happened in
the line of the odd man out.
On the descendant charts:
Group 1 is so far the only interesting group.
Here, there are 3 related
subgroups but the exact relationship is unknown.
Group 3 has the potential to be interesting, but it
contains a disconnect somewhere, as two supposedly-related people end up
being unrelated according to DNA.
The graph is divided into "segments". A segment is part of a line of descent
that stops at either a person with a test result or someone with more than
one descendant. The segment is marked on the graph as "[X]", and a mutation
can be located to somewhere between a person on a specific segment and
their father. Exactly where can't be determined without more tests of
different people. If a segment has more than one mutation, there is no way
to determine which happened first.
Matching DNA to Family Trees
Now I'm going to try to use the difference graphs and known relationships
to figure out where all the mutations are. I'm assuming that mutations
are relatively rare. In any line of descent, it's possible for marker X to
mutate in one generation and mutate back in the next, leaving no net change,
but I'll assume that doesn't happen.