After working with DNA Painter and GEDmatch matches I discovered that 15% of my DNA matches are actively collaborating in genealogy.
Yesterday Roberta Estes wrote a blog post about DNA Painter (she actually has a series on DNA Painter – see below). Reading her latest sent me into a distracted-by-DNA-Painter day. Thank you, Roberta.
I like DNA Painter and have used it to help me work out information for my work, but today I decided to paint a bit of my own lines:
I opened GEDmatch and went to my one-to-many matches list. Over on the left-hand side of my matches is a column with links to GEDCOMs uploaded to GEDmatch or to a WikiTree 8-generation pedigree.
I have used these links many times when doing quick look-ups on how a DNA match might be related to me or to clients: are there common surnames, or common ancestors? It’s a great way to use what other people have shared to see who you are.
I followed the information in the GEDCOM File or WikiTree Pedigree and connected 12 new DNA matches to 5 of my ancestor couples using DNA Painter. Nice!
I made some observations
Of the first 222 matches on my list, 37 had GEDCOM or WikiTree links, but three of the listed GEDCOM links actually had no GEDCOM behind them. That leaves 34 shared family files to go along with the DNA.
From this we can estimate that 15% (34 of 222) of the people in my lines are sharing their genealogy. It’s a rough estimate for sure. Is this a good rough estimate for the number of people who are willing to share their genealogy? It is a very low number.
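For anyone who wants to repeat the count on their own match list, here is a minimal sketch of the arithmetic. The three counts come from this post; the variable names are only illustrative.

```python
# Counts taken from the post; the variable names are illustrative only.
matches_reviewed = 222   # first 222 matches on the one-to-many list
listed_links = 37        # matches showing a GEDCOM or WikiTree link
empty_gedcoms = 3        # links that turned out to have no GEDCOM behind them

usable_trees = listed_links - empty_gedcoms
share_rate = usable_trees / matches_reviewed

print(f"{usable_trees} of {matches_reviewed} matches share a tree: {share_rate:.1%}")
```

Swap in your own counts to see how your lines compare.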
Email Tennis Example
I have been working with a client to help identify her mother’s birth family. It’s a hard one because her mother was born in 1916, and because the information on the original birth certificate appears to be made up. The first clue here was that the delivery doctor’s surname was given as the child’s middle name.
I have sent out many runs of emails to groups who match this lady (there is a second cousin match with no identifying information who has not answered many attempts to contact them via the testing company’s messaging system – oh, if they would!). Yesterday I sent another run to 10 matches asking if they would share a tree or pedigree with me. One person answered by asking me to give him her parents’ names.
I gave him the adoptee’s story and explained why I don’t have that information. I pointed him to the research for this adoptee listed on her WikiTree profile. He said he would do his own research into her parents if I could only give him their names; then he could see if she matched anyone in his tree.
We sent several volleys of emails in this vein, including his suggestion that uploading the DNA to other sites might help me find an aunt or uncle…no, no aunt or uncle would be alive… Frustration would be a good word to describe the volley. The last email I sent was very polite and specific about how sharing genealogy with someone can, literally, help that person find out who they are.
The Little Exercise
I walked through 10% of my total matches on GEDmatch to find shared genealogies and found how many were collaborative Genealogists. The percentage I got was 15%. Is this indicative of Genealogy as a whole?
WikiTree boasts 554,626 collaborative Genealogists. What percentage of all Genealogists (from hobbyists to professionals) is this number?
How do we get the word out to all the DNA testers that there is more to their DNA test than just “What geographic region do their ancestors come from”?
One thing we haven’t had until now is easy X-chromosome comparison links. X comparisons can be especially powerful for genealogy because there is a more limited inheritance pattern on the X than the autosome and almost everyone who has taken an autosomal DNA test (all 10 or 12 million of us!) has X chromosome test results too. There is a lot of untapped potential for DNA confirmation using X matches.
Here’s an example of how you might use this. Look on your DNA Ancestors page — this is the “DNA” link on the pull-down menu that starts with your WikiTree ID — and scroll to the X Chromosome section. These are the ancestors from whom you inherited your X DNA. Choose one of the distant ones and click the DNA Descendants icon next to their name.
On your ancestor’s DNA Descendants page scroll to the X Chromosome section. These are the descendants — yourself and your cousins — who are likely to match each other on the X. If more than one of you are on GEDmatch you can click the “[compare]” links to see whether you match as you would expect.
Here are a couple of examples of DNA Descendants pages where you can see the new GEDmatch comparison links:
Maybe a more informed genetic genealogist will follow up here with advice on doing the actual DNA confirmations, or with other ideas for using this new feature.
Onward and upward,
P.S. A big thank you to John Olson, Curtis Rogers, and our other friends at GEDmatch for enabling us to create these links. Thank you to Blaine Bettinger for his early and ongoing evangelism for X chromosome usage. (We used Blaine’s charts to create our XDNA ancestor and descendant pages.) And thank you to Mags Gaulden, Kay Wilson and the other DNA Project members for their leadership on these subjects, most especially — especially — thank you to Peter Roberts, who suggested this feature and helped it all come together, as he has with many of our DNA features.”
This is just great Chris (and Peter),
X-DNA is often overlooked, but can be a powerful tool because its inheritance is very specific. Click on your DNA link as Chris suggested and look at how this sex chromosome is inherited.
For a female:
From your Dad and his Mother.
From your Mother and her parents
For a Male:
From your Mother and her parents
It’s so specific. The Confirmation Citation is really informative too:
* Maternal relationship is confirmed by a 108.0 cM X chromosome match between John Kingman GEDmatch T782948 and his second cousin once removed Kelly Miller GEDmatch A721343. Their MRCA is Charles Cyrus Babst.
Take some time to look at some of those X-Matches WikiTree has posted for you. You might get a pleasant surprise.
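The inheritance rules listed above (a male gets his X only from his mother; a female gets one X from each parent) can be followed back generation by generation. Here is a small sketch of that rule in code. The function name and the "M"/"F" labels are my own for illustration, not anything WikiTree or GEDmatch provides.

```python
def x_ancestor_paths(sex, generations):
    """Return relationship paths to ancestors who could have passed down X DNA.

    sex is "M" or "F" for the person at the start of the path.
    Each path is a list such as ["mother", "father"] (the mother's father).
    """
    # A male inherits his X only from his mother;
    # a female inherits one X from each parent.
    if sex == "M":
        parents = [("mother", "F")]
    else:
        parents = [("father", "M"), ("mother", "F")]

    paths = []
    for label, parent_sex in parents:
        paths.append([label])
        if generations > 1:
            # Apply the same rule to the parent to go one generation further back.
            for deeper in x_ancestor_paths(parent_sex, generations - 1):
                paths.append([label] + deeper)
    return paths


for sex in ("F", "M"):
    print("Female:" if sex == "F" else "Male:")
    for path in x_ancestor_paths(sex, 2):
        print("  " + " > ".join(path))
```

Raise the `generations` value to see how quickly the paternal side of the list thins out compared with an autosomal pedigree.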
I noticed a post today about uploading an auDNA raw data file to GEDmatch. The comment that struck me was the idea that people, in general, are nervous, overwhelmed, or uncomfortable with the process of downloading their raw DNA data from their testing company and uploading it to GEDmatch.
Well, to calm those nerves – we aren’t talking about brain surgery. We aren’t talking about a 120-story tightrope walk. We are not talking about a trip to Mars.
It’s just downloading a file to your computer, then uploading the file to GEDmatch. It is exciting, there is no denying that. The first time working with DNA results is incredibly exciting. You do all the file transfers and in 8 to 24 hours you are connected to people from ALL the Genealogy Testing Companies – not just the company you tested with.
Get your DNA Tested for Genealogy
No, you can’t upload a DNA paternity test to a Genealogical Testing Site or to GEDmatch. Get a DNA test from one of the Genealogical DNA Testing Companies:
This one is easy AND you can protect your privacy by providing an Alias. Though I am not all that fond of Aliases. One of the first things I do when searching for matches is scan the one-to-many result for a kit to see if any of the known surnames appear in the list (this is easy using your browser’s “find” feature). An initial (any initial) and LNAB (last name at birth) can be enough to protect privacy (in my opinion).
Download your Raw Data File to Your Computer
Here are the links to directions for downloading your Raw Data File:
You can download your raw data from other companies and upload it into GEDmatch Genesis – Google it – “Download my raw data from _____.”
Make sure you know where the file ends up on your computer. When you download the file make sure it goes to your desktop or downloads folder. If you download it and have no idea how to find the downloaded file, then the anxiety can kick in. If you can’t find it, go back to your browser and click on Downloads in the browser to see where the file might have ended up.
Upload your Raw Data file from your computer to GEDmatch.
GEDmatch posts pertinent information about its site for users at the top of your profile page. Note the information about the 23andMe chipset working in Genesis.
Once you are on your Profile page you will see the above box on the right of your page. Click on the Generic upload and it will take you to:
On Saturday, May 27th at 3:00 PM EDT, please join me (Mags), WikiTree Leader Peter Roberts, DNA Project Coordinator Emma MacBeath, and Julie Ricketts for a live chat on “WikiTree and DNA – Third Party DNA Sites.”
This is the sixth in the WikiTree and DNA Series and goes along with some of the changes going on with the DNA Project. Join the chat to ask us DNA Features questions.
Pull up a chair to watch or ask questions in the LiveCast chat, either way we promise an hour of WikiTree fun! If you want to see a complete list of past and future LiveCasts click the graphic below or follow this link.
P.S. Do you have someone you would like us to interview? Post some answers with your picks for LiveCast Guests – it can even be yourself!
Finding the patterns in genetic genealogy research is really a fundamental thing we do when looking for the clues to our roots. Even in traditional genealogy, looking at a pedigree chart reveals patterns in geographical locations, dates and names which help our research. Looking for patterns is not new, even if we don’t realize we do it.
How do we find the patterns we are looking for? Spreadsheets?
Betty Jean’s raw data file is loaded to GEDmatch. If we do a One to Many matches analysis we can capture the entire results list via cut and paste, and insert it into a spreadsheet. We can grab similar results lists from most of the DNA testing or results companies. Getting all the formats similar in a spreadsheet takes some juggling and tweaking but it’s worth it. It is made easier by learning how to create things like separate first, middle, maiden and last name entries by converting text to columns and using filters to clean up the columns. Sorting is a whiz, once you have all the information in a sortable state.
Sort your spreadsheets by Chromosome, Segment Locations and Last Name and you have a pretty clear view to the people on the sheets who share your DNA and where they share it.
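If you would rather script the sort than do it in a spreadsheet, the same three-level sort can be sketched as below. The column names and the rows are made-up placeholders (not real match data and not GEDmatch's actual export headings), so adjust them to whatever your own sheet uses.

```python
# Made-up placeholder rows; column names are assumptions, not GEDmatch's headings.
rows = [
    {"last_name": "Howard", "chromosome": "15", "start": 30_000_000, "end": 45_000_000},
    {"last_name": "Brotherton", "chromosome": "2", "start": 5_000_000, "end": 12_000_000},
    {"last_name": "Smith", "chromosome": "X", "start": 1_000_000, "end": 9_000_000},
    {"last_name": "Brotherton", "chromosome": "15", "start": 30_000_000, "end": 41_000_000},
]


def chromosome_order(label):
    """Sort chromosomes 1-22 numerically and place X after them."""
    return int(label) if label.isdigit() else 23


def sort_key(row):
    # Chromosome first, then where the segment starts, then surname.
    return (chromosome_order(row["chromosome"]), row["start"], row["last_name"])


sorted_rows = sorted(rows, key=sort_key)

for row in sorted_rows:
    print(row["chromosome"], row["start"], row["end"], row["last_name"])
```

Sorting the chromosome as a number rather than as text keeps chromosome 2 ahead of chromosome 15, which a plain alphabetical sort would get backwards.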
Who do your Matches Belong to? What Familial Surname?
Don’t get that? Who is the MRCA (Most Recent Common Ancestor) a match has in common with you? Some of the files you download from the various companies give you common surnames. Well that helps, doesn’t it? Sometimes? They also show how many generations back you might share one of those surnames (some people have 50 or more surnames). That’s it. That is the answer!
Try adding each of those surnames to a sheet individually or using the text to columns conversion and…
Family Trees and Pedigree Charts
Aside from adding columns upon columns of surnames to your spreadsheet there really isn’t any way (that I know of) to add a pedigree or family tree to a spreadsheet – it might be doable, but…I don’t have enough fingertips or time for that.
There are great places where you can upload your GEDCOM to a DNA testing or analysis site, but the DNA isn’t in any way correlated with the tree. It’s just there and you have to use your brain and knowledge of what you are working on to make any sense of it. ESPECIALLY if it’s further back than a generation or so.
But I have something…
I was at a conference in the spring of 2016 where some of the current icons of Genetic Genealogy were a part of a Panel Discussion on the future of Genetic Genealogy. Something brought up by one of the Panelists was that we don’t really have anywhere to make the connection between a World Family Tree and DNA.
I was a bit shocked and dismayed that not one of these Genetic Genealogy Icons brought up WikiTree. WikiTree, where genealogists collaborate on a true, single, world family tree. WikiTree, where I, you, they, anyone can add all current and future DNA tests and have the test information auto-populate every single ancestor with that test information. For auDNA tests, back to at least our 64 4th-great-grandparents. For Y and mtDNA tests, back into the depths of our shared pedigree. WikiTree even maps XDNA for its DNA-tested members! WikiTree, where if something happens (in so many different scenarios), it will carry on with nothing more than a hiccup for ever and ever – really. Not ONE mention.
This is where I start looking for patterns that aren’t obvious or easy to correlate anywhere else.
Take GEDmatch and its GEDCOM + DNA tool. I can scroll down a list of people to see the pedigrees of my matches. Once I find a Pedigree that matches, I run a One to One comparison. Then I cut and paste the One to One comparison information into a section for that match in the DNA Sandbox.
The section titles show the match name and shared chromosome numbers. If I continue this process over time it will start to reveal patterns:
With this view I can start to make connections between specific Chromosomes and Familial Surnames. It will also show outliers – matches who probably don’t belong to a specific familial surname group even though at first blush they may appear to belong. Try working on the Smith family of NY and see how many matches with the last name Smith are outliers to YOUR Smith family.
In the table of contents for the DNA Sandbox you can get a peek into those patterns. Take the mtDNA Matches. Obviously matches 3.1.3 to 3.1.5 need some further looking into as do the paternal haplogroup matches 4.1.3-5.1.5. Since these DNA matches posted their mtDNA and YDNA haplogroup information in their auDNA results on GEDmatch we are able to see right off the bat, from the section titles, that they share Chromosome 15. Do they share overlapping segments? A quick look at the meat of the information and…
Yes, two do. Granted, they are distant connections – 5.1 generations to 6.1 generations – but they do overlap. If I can figure out the MRCA and add a familial surname to this grouping? It’s a HUGE step toward finding more matches that share Chromosome 15 with you who also are in this Familial Surname grouping.
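Checking whether two matches overlap is simple enough to sketch in a few lines. This is only an illustration: the segment positions below are made-up placeholders, the field names are my own, and a real comparison would use the start and end locations from each One to One result.

```python
def overlap(seg_a, seg_b):
    """Return the shared (start, end) range of two segments, or None if they don't overlap."""
    # Segments on different chromosomes can never overlap.
    if seg_a["chromosome"] != seg_b["chromosome"]:
        return None
    start = max(seg_a["start"], seg_b["start"])
    end = min(seg_a["end"], seg_b["end"])
    return (start, end) if start < end else None


# Made-up placeholder segments, all on Chromosome 15.
match_one = {"chromosome": "15", "start": 30_000_000, "end": 45_000_000}
match_two = {"chromosome": "15", "start": 40_000_000, "end": 52_000_000}
match_three = {"chromosome": "15", "start": 60_000_000, "end": 70_000_000}

print(overlap(match_one, match_two))    # the stretch both matches share with you
print(overlap(match_one, match_three))  # None: same chromosome, no shared stretch
```

Sharing a chromosome number in a section title is only the first hint; it is the overlapping stretch that makes a grouping worth chasing.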
Betty Jean’s DNA Sandbox
Betty Jean? Where is she in all this? Well, back in the Spring when we started her search for her birth family, I started her WikiTree DNA Sandbox.
Bits and Pieces Become Patterns
Working steadily with small bits and pieces of data from different testing companies, I pasted data into her sandbox. It started with her highest DNA match on 23andMe, her first cousin once removed, who is also an adoptee. We’ll call her Pat (she is very much still in the midst of figuring out her own identity and dealing with the emotional roller coaster that comes with finding one’s birth family).
Pat’s information wasn’t sitting alone in the sandbox for long. 12 of Betty Jean’s top 15 matches belonged to the same family – the Howards and the Brothertons. All these people had either had their DNA tested on their own or were prodded by Jane to get their tests done to help in finding Pat’s birth family. Lucky Betty Jean again, having Jane in her corner.
So within a few weeks, as I added Betty Jean’s one to one matches, researched the pedigrees and used the number of cM (centimorgans) and the generations estimate, her sandbox began to show patterns. Surnames began to have specific chromosomes connected to them. (Wikipedia: “In genetics, a centimorgan (abbreviated cM) or map unit (m.u.) is a unit for measuring genetic linkage. It is defined as the distance between chromosome positions (also termed loci or markers) for which the expected average number of intervening chromosomal crossovers in a single generation is 0.01.”)
As Jane and I worked, I compiled a list of Surnames for Betty. Surnames to use and Surnames to discount. If a pedigree or tree leads to one of the discounted Surnames? Then attention can be focused elsewhere. The list, added to the Sandbox, includes links within WikiTree to the MRCA for a specific Surname in the line. With the sandbox filling up, jumping around the great big ole shared tree with ease, working WikiTree’s relationship tools as well as the DNA tools, I was finding answers in a flash.
Fully Customize the sandbox to your level of expertise and knowledge
And there is no hard and fast rule about what goes into the sandbox. Some have graphics with triangulated groups. Some have Haplogroup information by Surname. All have the ability to make finding answers in the DNA connected world tree that is WikiTree an easier thing to do.
Now, after this VERY long post, I need to go find some patterns in nature for a while. Does Blueberry pie get cold FAST in -2 degree weather? An experiment I must try.
Betty Jean had her DNA tested with 23andMe in an attempt to find out if she had any medical issues which she may have passed along to her children. Along with her health test, she was also placed in 23andMe’s gene pool. Having her genes in DNA gene pools will help us in her adoption search.
On first look, Betty Jean’s information included some fairly close cousins. The closest was a predicted 2nd cousin sharing 1.76% of their DNA. There were 12 second-to-fourth cousin matches. I sent notes to all of them via 23andMe’s internal messaging system.
I also took some time to look to see if there were any common surnames in these matches. There were – Brotherton and Howard. At the time 23andMe had no DNA analytical tools, so I immediately downloaded Betty Jean’s raw DNA Data file (to download a DNA Data File from 23andMe see this help information) and uploaded it to GEDmatch (you must register for GEDmatch to be able to upload) via the Generic Upload Fast New, Beta.
NOTE: 23andMe has recently added DNA analysis tools which let its users do chromosome mapping and comparisons to other matches. This is great news for anyone who has had their DNA tested with 23andMe. It does not preclude a tester from uploading data to GEDmatch; a tester would still want to have their DNA in GEDmatch’s large gene pool along with people from all the testing companies (anyone who uploaded their raw data to GEDmatch).
Betty Jean’s GEDmatch Matches
After uploading Betty Jean’s Raw data file from 23andMe we found Betty Jean’s genes swimming in the pool with many of her close cousins – the big one was a 1st cousin once removed at 342.8 total cM.
What do we know from this list? Not much for this search, since we can’t identify any family line in the matches at face value without being able to correlate the information with her matches’ family trees. GEDmatch does have a GEDCOM upload function, but not many of Betty Jean’s matches had their family trees on GEDmatch.
Gathering Family Trees
Again, using the emails for the matches on GEDmatch, I sent emails explaining that Betty Jean matched them and asking if they had family trees online or available for access in some other way. I also contacted Jane to discuss the matches. Jane and I spent a bit of time exchanging emails and connecting the dots of Betty Jean’s matches to Jane’s tree.
Remember the 20 foot tree I printed of Jane’s Ancestry Tree? At first I started trying to jump around that monster to mark where the matches landed in the tree. It was cumbersome and frustrating and I had to come up with a better way to be able to see ALL of it at once, and…
There was one more thing about Jane’s tree that needed some space to work out. It seemed from a quick scan that the Howard and Brotherton lines, as well as other lines that married into them, were a product of Endogamy.
Endogamy was not uncommon in the US colonies, as our social spheres were limited by small communities and the vast distances between them. This occurred in Appalachia to an extent that one often hears jokes about “my cousin is my wife”. Jokes aside, the area of North Carolina where the Howards and Brothertons lived is on the outside edge of Appalachia.
Why should Endogamy be something we need to look into carefully and closely? Simply put, it skews the numbers. If cousins marry, then the DNA mix is a mix from one family rather than two. So there is a double infusion of Genes.
My Map of Betty Jean’s Family
It started with one single 8 1/2 x 11 sheet of paper. In the middle of that first sheet of paper I wrote a list of Betty Jean’s top matches, the 12 Jane had in her tree and a few more. Then I started adding lines back, creating pedigree charts from the DNA matches for each of the family lines identified by the DNA and Jane’s tree. As I went I added more blank sheets to fill in as I added family. At this point the Howards and Brothertons were extending to the right, radially from the DNA matches circle in the middle. I added papers to the map so that it was 3 sheets long and 3 sheets wide. Thankfully Jane’s tree was now easy to see, even with all the complicated connections within the Howard and Brotherton families.
For each person added to my map, I added or connected them to WikiTree to create Betty Jean’s birth/mirror tree. It was a great help having WikiTree’s relationship tools at the ready to help me define how these people might be connected to Betty Jean. It also helped me when trying to decipher Jane’s voluminous emails on family connections.