Supplementary Methods (Analyses and parameters)
Maximum Likelihood tree
A maximum likelihood tree of the unique haplotypes was constructed using PHYML. The HKY substitution model, with gamma-distributed rates and invariable sites, received the best likelihood score in likelihood ratio tests performed with MODELTEST v.3.7 in conjunction with PAUP v.4.0b10, and was implemented in the maximum likelihood analysis. The tree topology search used nearest neighbour interchange (NNI). An approximate likelihood ratio test (aLRT) was computed to determine branch support. Trees were visualized in MEGA4.
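Likelihood ratio tests of this kind compare nested substitution models. As a minimal illustration, assuming two nested models that differ by a single free parameter and using hypothetical log-likelihood values (neither is taken from this study), the test statistic and its chi-square p-value could be computed as follows. The sketch does not reproduce the hierarchical testing scheme of MODELTEST.

```python
import math


def lrt_one_df(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns the statistic 2 * (lnL_alt - lnL_null) and its p-value under a
    chi-square distribution with 1 degree of freedom.
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods for a simpler and a richer model.
stat, p_value = lrt_one_df(lnl_null=-1000.0, lnl_alt=-998.0)
print(f"LRT statistic = {stat:.2f}, p = {p_value:.4f}")
```

A small p-value favours the richer model; comparisons involving more than one additional parameter would need the chi-square distribution with the corresponding degrees of freedom.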
Median Joining Networks
Networks of the sequences were constructed using the Median Joining algorithm of NETWORK v.126.96.36.199. Networks were subjected to maximum parsimony post-analysis using the Steiner maximum parsimony algorithm within NETWORK v.188.8.131.52. For network analysis the epsilon parameter was set to 2 and transversions were given three times the weight of transitions. Furthermore, the weight of position 16189 was reduced tenfold and the weight of each of the CA repeats at position 523 was reduced fivefold per nucleotide in the repeat.
Time to most recent common ancestor (TMRCA) of the L0d haplogroup and the L0d sub-haplogroups was calculated from the median joining network using the Rho statistic. A mutation rate of 2.5 × 10⁻⁶ per nucleotide per generation was assumed. This rate is similar to the widely used rate of Soares et al., which is given as 9.883 × 10⁻⁸ mutations per nucleotide per year for the whole of the control region and therefore amounts to 2.47 × 10⁻⁶ per nucleotide per generation (compared with 2.5 × 10⁻⁶ per nucleotide per generation for Ward et al.). Time estimates were also calculated using other published mutation rates (i.e. 1.75 × 10⁻⁶, 4.5 × 10⁻⁶ and 2.1 × 10⁻⁶ per nucleotide per generation), but because of its intermediate value the mutation rate of Ward et al. (which is similar to the Soares et al. rate) was used in subsequent discussions and analyses. A generation time of 25 years was used throughout.
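The rate conversion and the dating step can be sketched as follows. This is an illustrative sketch rather than the NETWORK implementation: the function names and the example mutation counts are hypothetical, Rho is taken as the mean number of mutations separating the sampled sequences from the root haplotype, and the sequence length of 1124 sites is assumed to be the same as that used for the mismatch analysis.

```python
GENERATION_YEARS = 25
SITES = 1124  # assumption: same sequence length as in the mismatch analysis


def per_generation_rate(rate_per_site_per_year, generation_years=GENERATION_YEARS):
    """Convert a per-year rate to a per-generation rate."""
    return rate_per_site_per_year * generation_years


def rho(mutation_counts):
    """Mean number of mutations from the root haplotype to each sequence."""
    return sum(mutation_counts) / len(mutation_counts)


def tmrca_years(rho_value, rate_per_site_per_generation,
                sites=SITES, generation_years=GENERATION_YEARS):
    """Convert Rho to years: generations = rho / (rate per sequence)."""
    rate_per_sequence = rate_per_site_per_generation * sites
    return rho_value / rate_per_sequence * generation_years


# Rate of Soares et al. expressed per generation (about 2.47e-6).
print(per_generation_rate(9.883e-8))

# Hypothetical mutation counts for four sequences in one clade.
example_rho = rho([2, 3, 1, 4])
print(example_rho, tmrca_years(example_rho, 2.5e-6))
```

Substituting one of the alternative published rates for `2.5e-6` shows how strongly the time estimate depends on the assumed mutation rate.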
Mismatch distributions of populations and haplogroups were calculated in ARLEQUIN v.3.11. From these, the validity of demographic expansions and the dates of the expansions were inferred. The demographic expansion scenario is tested by simulating a population going through an expansion and testing whether the observed data differ significantly from the simulated expansion scenario. A non-significant sum of squared deviations (SSD) p-value therefore indicates a population or group of sequences that is consistent with having gone through an expansion. The parameters calculated are θ1, θ0 and τ; τ gives an indication of the time of the expansion. The mutation rate of 2.5 × 10⁻⁶ per nucleotide per generation and a generation time of 25 years were used to convert τ (tau) to T (time BP when the expansion took place) using the equation T = (τ/2μ) × generation time. In the equation, μ is the mutation rate per gene per generation (2.5 × 10⁻⁶ per nucleotide per generation × 1124 sites gives μ = 2.81 × 10⁻³).
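The conversion from τ to calendar time can be written as a short calculation. The sketch below uses the mutation rate, sequence length and generation time stated above; the function name and the example value of τ are hypothetical and are not results from this study.

```python
def expansion_time_years(tau, rate_per_site=2.5e-6, sites=1124,
                         generation_years=25):
    """Apply T = (tau / (2 * mu)) * generation time.

    mu is the mutation rate per sequence per generation,
    i.e. the per-site rate multiplied by the number of sites.
    """
    mu = rate_per_site * sites
    return tau / (2 * mu) * generation_years


mu = 2.5e-6 * 1124
print(f"mu per sequence per generation = {mu:.5f}")

# Hypothetical tau value, for illustration only.
print(f"T = {expansion_time_years(3.0):.0f} years BP")
```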
Haplogroup isofrequency maps
Haplogroup isofrequency maps were generated by applying the Kriging method [15, 16] incorporated in SURFER v.8.06.39. Mitochondrial contour plots were based on frequencies of the L0d/k subgroups on the background of the L0d/k group as a whole. This was done to eliminate the effects that admixture from Bantu-speakers and non-Africans would have on the distribution of the L0d/k subgroups. When frequencies were calculated, sample size effects were corrected by adjusting the total sample sizes in all groups to the same value.
Bayesian Skyline Plots
To visually represent changes in effective population size through time, Bayesian Skyline Plots (BSPs) were constructed. For each of the haplogroups, BSPs of effective population size through time were constructed using a Markov Chain Monte Carlo (MCMC) sampling algorithm, as implemented in BEAST v.1.4.8. The population size function of the BSP can be implemented using either a piecewise constant or a piecewise linear function of population size change. In the present study, a piecewise linear model made up of 10 control points was used. The general time-reversible (GTR) substitution model with estimated base frequencies and a Gamma + Invariant Sites heterogeneity model was used to infer the ancestral gene trees for each haplogroup. The mean substitution rate was fixed to the rate of Ward et al. and a relaxed molecular clock (Uncorrelated Lognormal) was employed. Each MCMC chain was run for 40 000 000 generations and sampled every 4 000 generations, with the first 4 000 000 generations discarded as burn-in. All runs had an effective sample size of at least 1 000 for the parameters of interest. Each independent run was repeated at least twice and results were combined using the LOGCOMBINER v.1.4.8 tool included in the BEAST package. BSPs were visualized in TRACER v.1.4.
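The sampling scheme implies the following number of retained posterior samples. This is simple bookkeeping under the stated chain settings; the function name is hypothetical, and the count ignores any initial state that the software may additionally log.

```python
def retained_samples(chain_length, sample_every, burn_in):
    """Logged samples remaining after the burn-in portion is discarded."""
    total = chain_length // sample_every
    discarded = burn_in // sample_every
    return total - discarded


per_run = retained_samples(40_000_000, 4_000, 4_000_000)
combined = 2 * per_run  # at least two independent runs were combined
print(per_run, combined)
```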
Summary statistics and neutrality tests
The summary statistics (number of sequences, haplotype number, gene diversity and nucleotide diversity) for haplogroups were calculated in DNASP v.4.10. The population mutation parameter (θ) was estimated from the number of segregating sites (θs per nucleotide site) as well as with the Watterson estimator (W-θs per sequence). The selective neutrality tests of Tajima’s D, Fu’s Fs statistic and the R2 statistic were also calculated using DNASP v.4.10.
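For orientation, several of these statistics can be computed directly from an alignment. The sketch below is not the DNASP implementation: it assumes equal-length aligned sequences without gaps or missing data, uses the standard textbook formulas for gene diversity, nucleotide diversity, Watterson's estimator and Tajima's D, and runs on a hypothetical toy alignment.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def summary_statistics(seqs):
    """Diversity statistics for equal-length, gap-free aligned sequences."""
    n, length = len(seqs), len(seqs[0])

    # Gene (haplotype) diversity: n/(n-1) * (1 - sum of squared frequencies).
    haplotypes = Counter(seqs)
    gene_div = n / (n - 1) * (1 - sum((c / n) ** 2 for c in haplotypes.values()))

    # Mean number of pairwise differences (k); nucleotide diversity is k / length.
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs) / len(pairs)

    # Segregating sites (S) and Watterson's estimator per sequence (S / a1).
    s_sites = sum(len(set(column)) > 1 for column in zip(*seqs))
    a1 = sum(1 / i for i in range(1, n))
    theta_w = s_sites / a1

    # Tajima's D: (k - S/a1) divided by its estimated standard deviation.
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1, e2 = c1 / a1, c2 / (a1 ** 2 + a2)
    variance = e1 * s_sites + e2 * s_sites * (s_sites - 1)
    tajima_d = (k - theta_w) / sqrt(variance) if s_sites else float("nan")

    return {
        "n": n,
        "haplotypes": len(haplotypes),
        "gene_diversity": gene_div,
        "nucleotide_diversity": k / length,
        "segregating_sites": s_sites,
        "theta_w_per_sequence": theta_w,
        "theta_w_per_site": theta_w / length,
        "tajima_d": tajima_d,
    }


# Hypothetical toy alignment, for illustration only.
stats = summary_statistics(["AAAA", "AAAT", "AATT", "AAAA"])
for name, value in stats.items():
    print(name, value)
```

Fu's Fs and the R2 statistic are omitted from the sketch because they require additional machinery.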
Population pairwise Fst
Population pairwise differences were calculated with ARLEQUIN v.3.11 using Fst, incorporating the nucleotide correction model of Tamura and Nei and a gamma correction of 0.532. Groups with N<10 were excluded from the population analyses (Table 1). The distance matrices were visualized as UPGMA trees in PAST v.1.54.
Genetic vs. geographic distance comparison
The relationships between geographic and genetic distances for different population groups were investigated by linear regression in R. The regression was applied to a scatter plot resulting from pairwise comparisons of distance matrices based on geographic and genetic distances (Fst). The linear regression model involved fitting a straight line to the scatter plot and recording significance values for the model and its variables (such as the gradient). Additionally, a Mantel test implemented in ARLEQUIN v.3.11 was performed to test the correlation between the geographic and genetic distance matrices.
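Both steps can be illustrated in a few lines of Python. This is a sketch of the general technique and not the R or ARLEQUIN code used here: the function names, the number of permutations, the random seed and the two small distance matrices are all hypothetical.

```python
import random
from itertools import combinations


def upper_triangle(matrix, order=None):
    """Off-diagonal pairwise values, optionally after relabelling populations."""
    n = len(matrix)
    order = list(range(n)) if order is None else order
    return [matrix[order[i]][order[j]] for i, j in combinations(range(n), 2)]


def linear_fit(x, y):
    """Least-squares slope, intercept and Pearson correlation of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / (sxx * syy) ** 0.5


def mantel_test(geo, gen, permutations=999, seed=0):
    """One-sided Mantel test: permute population labels of one matrix."""
    rng = random.Random(seed)
    x = upper_triangle(geo)
    observed = linear_fit(x, upper_triangle(gen))[2]
    hits = 0
    for _ in range(permutations):
        order = list(range(len(gen)))
        rng.shuffle(order)
        if linear_fit(x, upper_triangle(gen, order))[2] >= observed:
            hits += 1
    # The observed arrangement is counted as one of the permutations.
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical symmetric matrices: geographic distance (km) and Fst.
geo = [[0, 100, 250, 400],
       [100, 0, 180, 330],
       [250, 180, 0, 160],
       [400, 330, 160, 0]]
gen = [[d * 0.0005 for d in row] for row in geo]

slope, intercept, r = linear_fit(upper_triangle(geo), upper_triangle(gen))
r_obs, p_value = mantel_test(geo, gen)
print(slope, intercept, r, p_value)
```

Permuting rows and columns together preserves the dependence among pairwise distances that share a population, which is why the Mantel test is used alongside the ordinary regression p-value.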
The geographic distance matrix was constructed by obtaining the latitude and longitude of the different sampling locations from the website “Google Maps Latitude, Longitude Popup” and calculating the great circle distance (in km) between the points using the “Latitude/Longitude Distance Calculation” tool.
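A great circle distance matrix of this kind can be reproduced with the haversine formula. The sketch below is an assumption about the calculation, since the online tool's exact method is not described here; it treats the Earth as a sphere with a mean radius of 6371 km, and the site names and coordinates are hypothetical.

```python
import math


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine great circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # min() guards against rounding slightly above 1 for antipodal points.
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(a)))


def distance_matrix(sites):
    """Square matrix of pairwise distances, in the order the sites are given."""
    coords = list(sites.values())
    return [[great_circle_km(*p, *q) for q in coords] for p in coords]


# Hypothetical sampling locations as (latitude, longitude) in degrees.
sites = {
    "site_A": (-26.2, 28.0),
    "site_B": (-22.6, 17.1),
    "site_C": (-19.9, 23.4),
}
matrix = distance_matrix(sites)
for name, row in zip(sites, matrix):
    print(name, [round(d) for d in row])
```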
Supplementary Methods References
1. Guindon S, Lethiec F, Duroux P, Gascuel O: PHYML Online--a web server for fast maximum likelihood-based phylogenetic inference. Nucleic Acids Res 2005, 33(Web Server issue):W557-559.
2. Posada D, Crandall KA: MODELTEST: testing the model of DNA substitution. Bioinformatics 1998, 14(9):817-818.
3. Swofford DL: PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4. Sunderland, Massachusetts: Sinauer Associates; 1998.
4. Anisimova M, Gascuel O: Approximate likelihood-ratio test for branches: A fast, accurate, and powerful alternative. Syst Biol 2006, 55(4):539-552.
5. Tamura K, Dudley J, Nei M, Kumar S: MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol Biol Evol 2007, 24(8):1596-1599.
6. Bandelt HJ, Forster P, Rohl A: Median-joining networks for inferring intraspecific phylogenies. Mol Biol Evol 1999, 16(1):37-48.
8. Polzin T, Daneschmand SV: On Steiner trees and minimum spanning trees in hypergraphs. Operations Res Lett 2003, 31:12-20.
9. Forster P, Harding R, Torroni A, Bandelt HJ: Origin and evolution of Native American mtDNA variation: a reappraisal. Am J Hum Genet 1996, 59(4):935-945.
10. Ward RH, Frazier BL, Dew-Jager K, Paabo S: Extensive mitochondrial diversity within a single Amerindian tribe. Proc Natl Acad Sci U S A 1991, 88(19):8720-8724.
11. Soares P, Ermini L, Thomson N, Mormina M, Rito T, Rohl A, Salas A, Oppenheimer S, Macaulay V, Richards MB: Correcting for purifying selection: an improved human mitochondrial molecular clock. Am J Hum Genet 2009, 84(6):740-759.
12. Horai S, Hayasaka K, Kondo R, Tsugane K, Takahata N: Recent African origin of modern humans revealed by complete sequences of hominoid mitochondrial DNAs. Proc Natl Acad Sci U S A 1995, 92(2):532-536.
13. Soodyall H, Vigilant L, Hill AV, Stoneking M, Jenkins T: mtDNA control-region sequence variation suggests multiple independent origins of an "Asian-specific" 9-bp deletion in sub-Saharan Africans. Am J Hum Genet 1996, 58(3):595-608.
14. Excoffier L, Laval G, Schneider S: Arlequin ver. 3.0: An integrated software package for population genetics data analysis. Evol Bioinfor Online 2005, 1:47-50.
15. Oliver MA, Webster R: Kriging: a method of interpolation for geographical information systems. International Journal of Geographical Information Systems 1990, 4(3):313.
16. Xue FZ, Wang JZ, Hu P, Li GR: The "Kriging" model of spatial genetic structure in human population genetics. Yi Chuan Xue Bao 2005, 32(3):219-233.
17. Surfer Demo [
18. Drummond AJ, Rambaut A, Shapiro B, Pybus OG: Bayesian coalescent inference of past population dynamics from molecular sequences. Mol Biol Evol 2005, 22(5):1185-1192.
19. Drummond AJ, Rambaut A: BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol 2007, 7:214.
20. Tracer v1.4. [
21. Nei M: Molecular Evolutionary Genetics. New York, USA: Columbia University Press; 1987.
22. Rozas J, Sanchez-DelBarrio JC, Messeguer X, Rozas R: DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics 2003, 19(18):2496-2497.
23. Tajima F: The amount of DNA polymorphism maintained in a finite population when the neutral mutation rate varies among sites. Genetics 1996, 143(3):1457-1465.
24. Tajima F: Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 1989, 123(3):585-595.
25. Fu YX: Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 1997, 147(2):915-925.
26. Ramos-Onsins SE, Rozas J: Statistical properties of new neutrality tests against population growth. Mol Biol Evol 2002, 19(12):2092-2100.
27. Reynolds J, Weir BS, Cockerham CC: Estimation of the Coancestry Coefficient: Basis for a Short-Term Genetic Distance. Genetics 1983, 105(3):767-779.
28. Tamura K, Nei M: Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol Biol Evol 1993, 10(3):512-526.
29. Hammer O, Harper DAT, Ryan PD: PAST: Palaeontological Statistics software package for education and data analysis. Palaeontologia Electronica 2001, 4(1):9.
30. Google Maps Latitude, Longitude Popup [
31. Latitude/Longitude Distance Calculation [