Finding the patterns in genetic genealogy research is really a fundamental thing we do when looking for the clues to our roots. Even in traditional genealogy, looking at a pedigree chart reveals patterns in geographical locations, dates and names which help our research. Looking for patterns is not new, even if we don’t realize we do it.
How do we find the patterns we are looking for? Spreadsheets?
Betty Jean’s raw data file is loaded to GEDmatch. If we run a One to Many matches analysis, we can capture the entire results list via cut and paste and insert it into a spreadsheet. We can grab similar results lists from most of the DNA testing or results companies. Getting all the formats similar in a spreadsheet takes some juggling and tweaking, but it’s worth it. It is made easier by learning how to create things like separate first, middle, maiden and last name entries by converting text to columns, and by using filters to clean up the columns. Sorting is a breeze once you have all the information in a sortable state.
Sort your spreadsheets by Chromosome, Segment Location and Last Name and you have a pretty clear view of the people on the sheets who share your DNA and where they share it.
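For anyone who would rather script the juggling than do it by hand, here is a minimal sketch of the same idea in Python: split a full name into columns, then sort by chromosome, segment start and last name. The column names and the sample rows are made up for illustration and are not any company's actual export format. It also assumes numbered chromosomes only, so X matches would need separate handling.

```python
import csv
import io

# Hypothetical pasted rows: column names and values are invented for illustration.
PASTED = """name,chromosome,start,end,cm
Bubba Carter,7,1200000,9800000,12.4
Modine Carter Bush,5,400000,7300000,9.1
Gerald Ford,1,52000000,61000000,8.7
Bubba Carter,2,15000000,24000000,10.2
"""


def split_name(full_name):
    """Split a full name into (first, middle, last), like text to columns."""
    parts = full_name.split()
    first = parts[0]
    last = parts[-1] if len(parts) > 1 else ""
    middle = " ".join(parts[1:-1])
    return first, middle, last


def load_and_sort(text):
    """Read the pasted rows and sort by chromosome, segment start, last name."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        first, middle, last = split_name(row["name"])
        rows.append({
            "first": first,
            "middle": middle,
            "last": last,
            "chromosome": int(row["chromosome"]),  # numbered chromosomes only
            "start": int(row["start"]),
            "end": int(row["end"]),
            "cm": float(row["cm"]),
        })
    rows.sort(key=lambda r: (r["chromosome"], r["start"], r["last"]))
    return rows


sorted_rows = load_and_sort(PASTED)
for r in sorted_rows:
    print(r["chromosome"], r["start"], r["last"], r["first"])
```

The naive name split treats everything between the first and last word as a middle name, so maiden names and suffixes would still need a manual pass.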
Who do your Matches Belong to? What Familial Surname?
Don’t get that? Who is a match’s MRCA (Most Recent Common Ancestor) in common with you? Some of the files you download from the various companies give you common surnames. Well, that helps, doesn’t it? Sometimes? They also show how many generations back you might share one of those surnames (some people have 50 or more surnames). That’s it. That is the answer!
Try adding each of those surnames to a sheet individually or using the text to columns conversion and…
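Splitting a surname list can also be done outside the spreadsheet. The sketch below assumes each match comes with one free-text surname field separated by commas, semicolons or slashes. The match names and lists are hypothetical, and real exports may use other separators.

```python
from collections import defaultdict

# Hypothetical matches, each with a free-text ancestral surname list.
matches = {
    "Match A": "Carter, Howard; Brotherton",
    "Match B": "Smith, Carter",
    "Match C": "howard / Smith",
}


def split_surnames(raw):
    """Break a free-text surname list into clean, capitalized names."""
    for sep in (";", "/"):
        raw = raw.replace(sep, ",")
    return [s.strip().title() for s in raw.split(",") if s.strip()]


def index_by_surname(matches):
    """Map each surname to the list of matches that mention it."""
    index = defaultdict(list)
    for name, raw in matches.items():
        for surname in split_surnames(raw):
            index[surname].append(name)
    return dict(index)


surname_index = index_by_surname(matches)
for surname in sorted(surname_index):
    print(surname, "->", ", ".join(surname_index[surname]))
```

One caveat: `str.title()` will turn a name like McDonald into Mcdonald, so names with internal capitals need their own cleanup.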
Family Trees and Pedigree Charts
Aside from adding columns upon columns of surnames to your spreadsheet, there really isn’t any way (that I know of) to add a pedigree or family tree to a spreadsheet – it might be doable, but…I don’t have enough fingertips or time for that.
There are great places where you can upload your GEDCOM to a DNA testing or analysis site, but the DNA isn’t in any way correlated with the tree. It’s just there and you have to use your brain and knowledge of what you are working on to make any sense of it. ESPECIALLY if it’s further back than a generation or so.
But I have something…
I was at a conference in the spring of 2016 where some of the current icons of Genetic Genealogy were a part of a Panel Discussion on the future of Genetic Genealogy. Something brought up by one of the Panelists was that we don’t really have anywhere to make the connection between a World Family Tree and DNA.
I was a bit shocked and dismayed that not one of these Genetic Genealogy Icons brought up WikiTree. WikiTree, where genealogists collaborate on a true, single, world family tree. WikiTree, where I, you, them, anyone can add all current and future DNA tests and have that test information auto-populate every single ancestor. For auDNA tests, back to at least our 64 4th-great-grandparents. For Y and mtDNA tests, back into the depths of our shared pedigree. WikiTree even maps XDNA for its DNA tested members! WikiTree, where if something happens (in so many different scenarios), it will carry on with nothing more than a hiccup forever and ever – really. Not ONE mention.
What I use daily in my work is the WikiTree DNA Sandbox.
This is where I start looking for patterns that aren’t obvious or easy to correlate anywhere else.
Take GEDmatch and its GEDCOM + DNA tool. I can scroll down a list of people to see the pedigrees of my matches. Once I find a pedigree that matches, I run a One to One comparison. Then I cut and paste the One to One comparison information into a section for that match in the DNA Sandbox.
The section titles show the match name and shared chromosome numbers. If I continue this process over time it will start to reveal patterns:
- 4.4 (Familial Surname) Carter
- 4.4.1 Match Name – First Cousin once removed Chromosomes 2, 3, 4, 5, 7, 8, 10, 11, 15
- 4.4.2 Gerald Ford – Chromosomes 1, 3, 6, 8, 12, 13
- 4.4.3 Bubba Carter – Chromosomes 2, 5, 7, 15, 18, 20
- 4.4.4 Modine Carter Bush – Chromosomes 5, 8, 12, 16
- 4.4.5 Mozette Carter Obama – Chromosomes 2, 4, 15
- 4.4.6 Ron Reagan – Chromosomes 7, 18, 22
- 4.4.7 George Washington – Chromosomes 15, 18, 22
With this view I can start to make connections between specific Chromosomes and Familial Surnames. It will also show outliers – matches who probably don’t belong to a specific familial surname group even though at first blush they may appear to belong. Try working on the Smith family of NY and see how many matches with the last name Smith are outliers to YOUR Smith family.
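The same tally can be run by script. Here is a small sketch that counts how many matches in the Carter group above list each chromosome. The dictionary is just the section titles typed in by hand, not anything WikiTree exports, and the function names are my own.

```python
from collections import Counter

# Section titles from the Carter group above, typed in by hand.
carter_group = {
    "First cousin once removed": [2, 3, 4, 5, 7, 8, 10, 11, 15],
    "Gerald Ford": [1, 3, 6, 8, 12, 13],
    "Bubba Carter": [2, 5, 7, 15, 18, 20],
    "Modine Carter Bush": [5, 8, 12, 16],
    "Mozette Carter Obama": [2, 4, 15],
    "Ron Reagan": [7, 18, 22],
    "George Washington": [15, 18, 22],
}


def chromosome_counts(group):
    """Count how many matches in the group list each chromosome."""
    counts = Counter()
    for chromosomes in group.values():
        counts.update(set(chromosomes))  # count each match once per chromosome
    return counts


def matches_on(group, chromosome):
    """List the matches in the group that share the given chromosome."""
    return [name for name, chromosomes in group.items() if chromosome in chromosomes]


counts = chromosome_counts(carter_group)
# Most frequently shared chromosomes first, ties broken by chromosome number.
for chromosome, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"Chromosome {chromosome}: {n} matches")

print(matches_on(carter_group, 15))
```

A chromosome number in common is only a hint. Two matches can both land on Chromosome 15 without sharing the same stretch of it, so the segment positions still have to be checked.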
In the table of contents for the DNA Sandbox you can get a peek into those patterns. Take the mtDNA Matches. Obviously, matches 3.1.3 to 3.1.5 need some further looking into, as do the paternal haplogroup matches 4.1.3-5.1.5. Since these DNA matches posted their mtDNA and YDNA haplogroup information in their auDNA results on GEDmatch, we are able to see right off the bat, from the section titles, that they share Chromosome 15. Do they share overlapping segments? A quick look at the meat of the information and…
Yes, two do. Granted, they are distant connections – 5.1 generations to 6.1 generations – but they do overlap. If I can figure out the MRCA and add a familial surname to this grouping? It’s a HUGE step toward finding more matches who share Chromosome 15 with you and who also are in this Familial Surname grouping.
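Checking for overlap is simple arithmetic on the start and end positions from a One to One comparison. A minimal sketch follows, with made-up positions and a (chromosome, start, end) layout that I chose for illustration.

```python
def overlap(seg_a, seg_b):
    """Return the shared (start, end) of two segments, or None if they don't overlap.

    Each segment is a (chromosome, start, end) tuple.
    """
    chrom_a, start_a, end_a = seg_a
    chrom_b, start_b, end_b = seg_b
    if chrom_a != chrom_b:
        return None
    start = max(start_a, start_b)
    end = min(end_a, end_b)
    return (start, end) if start < end else None


# Hypothetical segments on Chromosome 15; the positions are invented.
match_a = (15, 25_000_000, 40_000_000)
match_b = (15, 35_000_000, 52_000_000)
match_c = (15, 60_000_000, 70_000_000)

print(overlap(match_a, match_b))  # the stretch both matches share with you
print(overlap(match_a, match_c))  # None, same chromosome but no shared stretch
```

Two matches overlapping with you on the same stretch is still not proof of a common ancestor. One could match on your maternal copy of the chromosome and the other on your paternal copy, so it is worth confirming that the two matches also match each other.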
Betty Jean’s DNA Sandbox
Betty Jean? Where is she in all this? Well, back in the Spring when we started her search for her birth family, I started her WikiTree DNA Sandbox.
Bits and Pieces Become Patterns
Working steadily with small bits and pieces of data from different testing companies, I pasted data into her sandbox. It started with her highest DNA match on 23andMe, her first cousin once removed, who is also an adoptee. We’ll call her Pat (she is very much still in the midst of figuring out her own identity and dealing with the emotional roller coaster that comes with finding one’s birth family).
Pat’s information wasn’t sitting alone in the sandbox for long. Twelve of Betty Jean’s top 15 matches belonged to the same family – the Howards and the Brothertons. All these people had either had their DNA tested on their own or were prodded by Jane to get their tests done to help in finding Pat’s birth family. Lucky Betty Jean again, having Jane in her corner.
So within a few weeks of adding Betty Jean’s one to one matches, researching the pedigrees, and using the number of cM (centimorgans) and the generations estimate, her sandbox began to show patterns. Surnames began to have specific chromosomes connected to them.
For reference, Wikipedia defines the unit this way: “In genetics, a centimorgan (abbreviated cM) or map unit (m.u.) is a unit for measuring genetic linkage. It is defined as the distance between chromosome positions (also termed loci or markers) for which the expected average number of intervening chromosomal crossovers in a single generation is 0.01.”
As Jane and I worked, I compiled a list of Surnames for Betty Jean. Surnames to use and Surnames to discount. If a pedigree or tree leads to one of the discounted Surnames? Then attention can be focused elsewhere. The list, added to the Sandbox, includes links within WikiTree to the MRCA for a specific Surname in the line. With the sandbox filling up, jumping around the great big ole shared tree with ease, working WikiTree’s relationship tools as well as the DNA tools, I was finding answers in a flash.
Fully Customize the sandbox to your level of expertise and knowledge
And there is no hard and fast rule about what goes into the sandbox. Some have graphics with triangulated groups. Some have Haplogroup information by Surname. All have the ability to make finding answers in the DNA connected world tree that is WikiTree an easier thing to do.
Now, after this VERY long post, I need to go find some patterns in nature for a while. Does Blueberry pie get cold FAST in -2 degree weather? An experiment I must try.