When you purchase a DNA test from 23andMe, your results will include a breakdown of your ancestors' locations. This is also known as an ethnicity or admixture report. But how exactly is the composition calculated?
This article takes an in-depth look at how 23andMe calculates your percentages and ancestor locations. Then we look at accuracy and other common questions customers ask.
How to Read Your 23andMe Ancestry Composition (Ethnicity Estimates)
Ethnicity estimates can be found on 23andMe using the "Ancestry Composition" link in the main ancestry dropdown.
You'll see a percentage breakdown on the left and a map on the right.
The breakdown on the left represents ancestral regions that can go back many hundreds of years. The map on the right attempts to show the location of your most recent ancestors.
The following section describes how to interpret the hierarchy in the left pane.
The following image shows part of my breakdown. I've labeled the different levels of the region hierarchy.
Top level: continental
The top tier of your breakdown is at the continent level.
23andMe originally started with three continental regions in 2008. Their breakdown at the time showed percentages of European, Asian and African heritage.
The top level has now been expanded to six categories:
- European
- Sub-Saharan African
- Western Asian and North African
- Central and South Asian
- East Asian and Native American
- Melanesian
I don't think there are many complaints about 23andMe's continental estimates. They are correct for me, with a 50/50 split between my maternal Irish heritage and my paternal African heritage.
23andMe's broad regions cover a wide geography, e.g. "Northwestern European". This is the current list of their regions within the continental categories:
|European|Sub-Saharan African|Western Asian and North African|East Asian and Native American|Central and South Asian|
|Northwestern European|Northern East African|Arab, Egyptian and Levantine|Chinese and Southeast Asian|Central Asian, Northern Indian and Pakistani|
|Eastern European|Congolese and Southern East African|North African|Native American|Southern Indian|
|Southern European|West African|Northern West Asian|Japanese and Korean|Southern South Asian|
|Ashkenazi Jewish|African Hunter-Gatherer||North Asian||
The Melanesian category has no further breakdown.
The next level is where it becomes more interesting for customers. You will often see one or two countries as part of the label. Each label represents a 23andMe reference population, which I'll explain in more detail later.
For now, think of it as a more specific layer below the broad region.
My European ancestry fully matched the British and Irish reference population.
Why are these two countries combined? It means that 23andMe's analysis cannot distinguish between the DNA of the two neighboring islands.
However, their analysis can draw a line between "British and Irish" and another reference population called "French and German".
I think it would be helpful for customers if 23andMe showed all reference populations in the breakdown, with zero percent next to those that don't apply.
If you want to know which reference populations do not apply to you, you will find the list here.
If you had looked at your 23andMe Ancestry Composition report prior to 2018, your results would have stopped at this level.
However, they released a more detailed level of estimates in April 2018.
Recent ancestor locations
I have no recent ancestor locations under my African breakdown. Others might see places like "Burundians" or "Nigerians".
The lack of locations simply means that my DNA doesn't match 23andMe's currently very limited African groups.
However, under my British and Irish breakdown I have many recent ancestor locations.
What is the difference between reference populations and recent ancestral locations? Good question. I'll explain this in a later section.
The Ancestry Composition map
Most of the screen is taken up by an interactive map.
When you open the report, the map is color-coded at the broad region level.
Unlike other sites, you don't zoom in on the map to see lower levels.
If you want to dive into recent ancestral locations, you can browse using the drill down in the left pane. You cannot actually click on an ancestor's location in the drilldown.
Instead, click on the reference population above your target location. This will cause the screen to jump to the level of recent ancestor locations. The map zooms in on the relevant countries.
To be honest, I find navigation on this map a bit confusing.
What is the 23andMe Reference Panel?
When 23andMe processes your DNA kit, it compares your results to a large collection of reference samples that it hopes represent different ethnicities from around the world.
In theory, 23andMe would be more accurate if it could use historical DNA samples that go back centuries. This may be possible in the future with technological developments, but it is currently not possible.
Therefore, the composition analysis from any DNA testing company is a compromise based on modern samples.
The goal is to limit these reference DNA samples to individuals belonging to a single ethnicity. Given the number of increasingly mixed Western populations, this is not an easy task.
I'll use the infographic below to describe the process of creating reference panels. Let's work from left to right.
Academic DNA Projects
23andMe draws on DNA samples collected by three major academic projects. By the way, Ancestry.com uses the same projects. Both companies are confident that these samples are accurate once they have identified the ones with a single ethnicity.
Another set of DNA samples came from 23andMe customers who agreed to participate in the research project.
Customers must also indicate that all four of their grandparents were born in the same country.
23andMe Private Collections
23andMe's white paper on its ethnicity algorithm states that the DNA samples come only from academic projects and 23andMe's customers. But that paper was published in 2014.
Since then, 23andMe has greatly expanded its non-European heritage regions.
I've seen the company refer to "private collections" on their reference panels. My guess is that the company was actively distributing DNA kits to volunteers in countries that weren't well represented in their database.
Next step: remove relatives
23andMe's ethnicity algorithm is based on statistics. Therefore, they must be careful not to let multiple family members among their customers skew the sample.
After collecting as many DNA samples as possible, they screen them for close relatives. It's simply a matter of finding samples that share a large number of centimorgans.
The algorithm ensures that only one sample from a group of close relatives can proceed to the next step.
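To make the relative filter concrete, here is a minimal sketch in Python. It is not 23andMe's code: the function name, the greedy approach and the 700 cM cut-off are all my own assumptions for illustration.

```python
def remove_close_relatives(sample_ids, shared_cm, threshold_cm=700.0):
    """Greedy filter: keep a sample only if it is not closely related
    to any sample that has already been kept.

    shared_cm maps frozenset({id_a, id_b}) to the centimorgans the pair shares.
    threshold_cm is a hypothetical cut-off, not a value published by 23andMe.
    """
    kept = []
    for sample_id in sample_ids:
        # Pairs missing from shared_cm are treated as sharing no DNA.
        is_unrelated = all(
            shared_cm.get(frozenset((sample_id, other)), 0.0) < threshold_cm
            for other in kept
        )
        if is_unrelated:
            kept.append(sample_id)
    return kept


# Hypothetical data: A and B are parent and child, C is unrelated to both.
shared = {frozenset(("A", "B")): 3400.0, frozenset(("A", "C")): 12.0}
print(remove_close_relatives(["A", "B", "C"], shared))  # ['A', 'C']
```

Sample B is dropped because it shares far more than the cut-off with sample A, which was already kept.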
Next step: remove outliers
After removing family members, the next step is to clean up statistical outliers.
Suppose a tester has indicated that all four of their grandparents are from Scotland, but their DNA mostly matches samples from Poland. This sample is set aside.
How did this situation come about? It may be an adoption where the tester is unaware of their biological heritage.
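The outlier check can be sketched in a few lines of Python. This is only an illustration under my own assumptions: I pretend each sample already carries a similarity score per population, and the field names are hypothetical.

```python
def remove_outliers(samples):
    """Drop samples whose best genetic match disagrees with their stated origin.

    Each sample is a dict with an "id", a "stated_origin" string and a
    "similarity" dict mapping population name to a score (higher is closer).
    """
    kept = []
    for sample in samples:
        scores = sample["similarity"]
        best_match = max(scores, key=scores.get)
        if best_match == sample["stated_origin"]:
            kept.append(sample)
    return kept


# Hypothetical scores: the second tester reports Scotland but looks Polish.
testers = [
    {"id": 1, "stated_origin": "Scotland", "similarity": {"Scotland": 0.9, "Poland": 0.1}},
    {"id": 2, "stated_origin": "Scotland", "similarity": {"Scotland": 0.2, "Poland": 0.8}},
]
print([t["id"] for t in remove_outliers(testers)])  # [1]
```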
Box 4: Reference panel
The result of the step-by-step process is a panel representing different pools of DNA.
It is with this panel that your DNA will be compared.
How does 23andMe calculate your ethnic makeup?
I'll give you the basics here so your percentages don't come out of a black box.
Let's say your four grandparents are from Scotland, Italy, Uganda and northern China.
We assume that the ancestral lines have remained in these regions for many generations. Your father's ethnicity is primarily Scottish/Italian and your mother's is primarily Ugandan/Northern Chinese.
But 23andMe's algorithms don't know your ethnicity when you submit your DNA test. How do they calculate your composition?
23andMe cuts your DNA into many small, contiguous pieces. The size is chosen to be small enough that each piece was probably inherited from a single ancestor many generations back.
In our example, a piece of DNA came from Scotland, Italy, Uganda or northern China. A larger piece of DNA can be inherited from a mixture of two or three ancestors. That's why smaller is better!
The next big challenge for consumer DNA testing companies is that today's technology doesn't tell them which part is your mother's and which is your father's.
23andMe uses complex statistical analysis to infer whether a piece of DNA is maternal or paternal. This is called phasing.
But that's a compromise. As better DNA sampling technology becomes more accessible, ethnicity estimates should become more accurate.
Comparison with reference populations
Let's take one of those parts of your DNA that represents a single ancestor of an ethnicity.
Now the reference panel becomes the key. 23andMe compares the small piece of DNA to all the different reference populations and looks for the most similar one.
Assume that the first piece is most similar to the British and Irish reference population. In our example, you inherited this piece from your father's Scottish side.
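Here is a toy version of "find the most similar reference population" in Python. To be clear, this is a simplified stand-in and not 23andMe's classifier: I assume each piece of DNA is a short list of 0/1 marker values, each reference population is summarized by made-up marker frequencies, and similarity is a simple log-likelihood.

```python
import math


def best_matching_population(piece, reference_freqs):
    """Return the population whose marker frequencies best explain the piece.

    piece is a list of 0/1 marker values. reference_freqs maps a population
    name to the frequency of the "1" value at each marker. Frequencies must
    be strictly between 0 and 1 so that the logarithm is defined.
    """
    def log_likelihood(freqs):
        return sum(
            math.log(f if value == 1 else 1.0 - f)
            for value, f in zip(piece, freqs)
        )

    return max(reference_freqs, key=lambda pop: log_likelihood(reference_freqs[pop]))


# Hypothetical frequencies for four markers.
panel = {
    "British and Irish": [0.9, 0.8, 0.2, 0.7],
    "Italian": [0.3, 0.4, 0.6, 0.5],
}
print(best_matching_population([1, 1, 0, 1], panel))  # British and Irish
```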
23andMe introduced a major change to composition estimates in 2020. Many customers noticed that their percentages increased or decreased. Some customers lost certain regions entirely.
This change occurred due to a new mathematical step in the process. 23andMe calls this "smoothing".
One of the purposes of smoothing is to correct phasing errors, i.e. cases where DNA is assigned to the wrong parent.
Another purpose is to remove unusual patterns. Our example DNA was inherited from four regions, so about 25% of the DNA pieces should match reference samples from each of these four.
If a few pieces out of many thousands match a fifth region, then earlier estimates may have given you "trace" ancestry from that region. Smoothing removed many trace amounts and small percentages of ancestry.
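The effect on trace results can be illustrated with a simple majority vote over neighboring pieces. I am using this as a stand-in because it is easy to follow. It is not the smoothing method 23andMe uses, and the `radius` parameter is my own invention.

```python
from collections import Counter


def smooth_labels(labels, radius=2):
    """Relabel each piece with the most common label among its neighbors.

    labels is the list of population labels along one stretch of DNA.
    radius is how many pieces on each side take part in the vote.
    """
    smoothed = []
    for i in range(len(labels)):
        neighborhood = labels[max(0, i - radius): i + radius + 1]
        smoothed.append(Counter(neighborhood).most_common(1)[0][0])
    return smoothed


# A lone stray piece in the middle of a long Scottish stretch is voted away.
pieces = ["Scottish"] * 4 + ["Stray region"] + ["Scottish"] * 4
print(set(smooth_labels(pieces)))  # {'Scottish'}
```

A genuine block of ancestry covers many neighboring pieces, so it survives the vote. An isolated piece does not.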
We worked with an example of a single piece of DNA. 23andMe's algorithms sift through tens of thousands of these tiny segments to create a complete picture.
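Once every piece has a label, the percentages are just a tally. A minimal sketch, assuming all pieces are the same length (my simplification):

```python
from collections import Counter


def composition_percentages(piece_labels):
    """Turn a list of per-piece population labels into percentages."""
    counts = Counter(piece_labels)
    total = len(piece_labels)
    return {pop: round(100.0 * n / total, 1) for pop, n in counts.items()}


# Eight hypothetical pieces, two from each grandparent's region.
labels = (
    ["British and Irish"] * 2
    + ["Italian"] * 2
    + ["Ugandan"] * 2
    + ["Northern Chinese"] * 2
)
print(composition_percentages(labels))
```

With two pieces per region out of eight, each region comes out at 25.0.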
What are Recent Ancestor Locations?
23andMe introduced Recent Ancestor Locations in 2018. They are one level below reference populations in the composition. You may not have any in some of your regions.
23andMe says its recent ancestor locations are based on more than 400,000 23andMe customers "of known ancestry". I put it in quotes because I'm not sure how 23andMe "knows" the ancestry.
In the white paper on the reference populations, they cite a very detailed survey they conducted among nearly 9,000 customers who agreed to participate as reference samples.
Perhaps they used the same method with those 400,000 customers.
The other possibility is that they used the family background information that customers can provide in their account profiles. There, 23andMe lets you enter the birthplaces of all four of your grandparents.
23andMe isn't as upfront about these calculations as it is about the regional breakdown.
However, it appears that the estimates are based on the people in that pool of 400,000 customers who are your DNA relatives.
Because they rely on DNA matching, the locations should represent more recent generations than the broader reference populations. This should be the heritage of your ancestors within the past two hundred years.
The use of statistics and probabilities means that these estimates are subject to some degree of uncertainty.
23andMe assigns a confidence level to its estimates of recent ancestor locations. In the image below you can see two different levels.
These are the different levels:
- Very likely match: 80% confidence or higher
- Likely match: 50-79.9% confidence
- Possible match: 30-49.9% confidence
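Expressed as code, the tiers above are a simple threshold lookup. This is my own sketch. The function name is hypothetical, and returning `None` below 30% is my assumption about what happens to weaker matches.

```python
def confidence_tier(confidence_percent):
    """Map a confidence percentage to the tier listed above."""
    if confidence_percent >= 80:
        return "very likely"
    if confidence_percent >= 50:
        return "likely"
    if confidence_percent >= 30:
        return "possible"
    return None  # assumption: anything weaker is not shown


for value in (92.0, 79.9, 30.0, 12.5):
    print(value, confidence_tier(value))
```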
How accurate is the 23andMe Ancestry Composition?
Two words have come up repeatedly over the course of this article: statistics and estimates. Your composition is a collection of estimates based on statistical comparisons with a reference database.
Several factors can reduce accuracy. Let's look at some of these challenges.
Reference panel accuracy
I mentioned that the reference panel is already a compromise because it cannot use centuries-old DNA.
In addition, 23andMe relies on the genealogy reported by testers being accurate. But family trees can be wrong.
Less developed regions
Many countries or communities do not have a tradition of registering birth, marriage and death events as part of civil administration.
Therefore, 23andMe must rely on verbal statements of grandparents' place of birth when selecting samples for its reference panel.
Now consider countries and regions that have suffered civil war or economic hardship over the last hundred years. Both lead to population migration and sometimes generational loss of knowledge.
The problem is that this leaves 23andMe with regions where the number of testers is not statistically significant.
The challenge of statistical analysis
I just want to mention here that 23andMe chose a combination of statistical modeling techniques over other options.
Did they get it right? And if it's right for one customer, does the same model work for the next?
Judging by reactions on genealogy forums, no single DNA testing company has the correct ethnic estimate for everyone.
My experience with 23andMe Ancestry Composition
I'm happy with the accuracy of my composition at the regional level.
However, the breakdown goes astray at the lowest level.
Both Ancestry.com and MyHeritage have a feature similar to 23andMe's recent ancestor locations. These two competitors have narrowed my recent heritage down to a small county in Ireland.
On the other hand, 23andMe places my "very likely" ancestor location in Scotland. They can't all be right.
23andMe may be the one that is correct. But that doesn't match my genealogy research.
How often does 23andMe change the Ancestry Composition?
Customers who tested with the V5 chip (i.e. after the end of 2017) have seen changes in their ancestry composition approximately once a year.
These are the dates on which the most important changes for customers were introduced:
- May 24, 2019
We expect the next update in late 2021 or 2022.
Does 23andMe test for Native American ethnicity?
This is a common question from new testers who are surprised that their ethnicity does not match their oral family history.
I listed 23andMe's regions in an earlier section. They include a Native American reference population.
Why doesn't 23andMe show my Native American heritage?
If you don't see a Native American percentage in your composition, that doesn't mean you don't have Native American ancestry.
Due to the random nature of inheritance, you don't inherit equal amounts of DNA from each of your ancestors. Go back enough generations and you may inherit none at all from some of them.
An ethnic region may have been diluted to levels that cannot be detected by a DNA test. Of course, the same logic applies to any ethnic region you think you're missing.
Why does my parent have populations that I don't?
Suppose your mother has tested and her composition shows 1% Italian ancestry. But Italy does not appear in your breakdown. What is happening?
Since your mother is at the top of your DNA relatives list, there was no baby swap at the hospital.
Does this prove that ancestry compositions are meaningless? No, not with such small percentages.
This is where the random nature of inheritance comes into play. You don't get a complete representation of your mother's DNA in the half you inherit.
The small percentage that 23andMe identified as Italian may not be in the DNA you inherited.
However, it may be present in your brother's DNA! Which brings us to the next section.
Why does my sibling have ancestry that I don't?
You and your siblings don't inherit exactly the same pieces of DNA from your parents.
A small percentage, such as 1% Italian, may be present in your brother's composition and absent from yours.
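A little arithmetic shows how often this happens. The sketch below uses a deliberately simple model of my own: the parent's small percentage sits in a handful of short segments, each on only one of the parent's two chromosome copies, and each is passed on independently with a 50% chance. Real inheritance is messier because nearby segments tend to travel together.

```python
def chance_child_misses(segment_count):
    """Probability that a child inherits none of the parent's segments."""
    return 0.5 ** segment_count


def chance_siblings_differ(segment_count):
    """Probability that exactly one of two siblings inherits at least one segment."""
    miss = chance_child_misses(segment_count)
    return 2 * miss * (1 - miss)


for segments in (1, 2, 3):
    print(segments, chance_child_misses(segments), chance_siblings_differ(segments))
```

With a single segment, there is a 50% chance that a given child misses it and a 50% chance that two siblings end up with different results for that region.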
Upload your DNA for alternative ethnicity estimates
Other major companies also offer ethnicity estimates. You don't necessarily have to pay to view additional reports.
You can download your raw DNA from 23andMe and upload it for free to other trusted sites.
You can find all the options in our article on where you can upload your DNA. We have tutorials for most DNA sites.
When you upload your DNA to GEDmatch, your ethnicity estimates are available for free. There is a rather bewildering array of different estimates.
Get an overview with our guide to the best ethnicity calculators on GEDmatch.
When you upload your DNA to MyHeritage, ethnicity estimates usually require a fee to unlock. I'm not convinced that MyHeritage's estimates are better than 23andMe's.
However, the company added a new feature in 2020 that significantly improves its ethnicity results. You can look at our article about MyHeritage Genetic Groups.