Putative positive selection signals in THB horses
We used the XP-EHH method [9] to find genes under positive selection in THB. XP-EHH calculates haplotype decay separately for each population using the EHH statistic [23] and then compares the two populations. In addition, we calculated the XP-CLR statistic between the THB and JH breeds. This statistic searches for selective sweeps at SNPs in the vicinity of the selected allele, modelling genetic drift under neutrality as Brownian motion and detecting sweeps through allele frequency differentiation between populations [6]. Manhattan plots of the −log10-transformed p-values of the XP-EHH and XP-CLR scores are presented in Figure 2a and 2b, respectively. Using the top 1% outlier regions, a total of 288 genes were detected with the XP-EHH (98 genes) and XP-CLR (200 genes) statistics (Supplementary Tables S3 and S4).
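For orientation, the quantities behind XP-EHH can be written out as follows. This is a sketch of the commonly used formulation, not necessarily the exact computation performed here. The symbols (n_P for the number of haplotypes sampled from population P, n_h for the number of haplotypes carrying extended haplotype h, I_P for the integrated EHH) are notational assumptions introduced for illustration, as is the choice of THB as the numerator population.

```latex
% EHH in population P at distance x from the core SNP:
% the probability that two randomly chosen haplotypes are identical
% over the whole interval from the core SNP to x.
\mathrm{EHH}_P(x) = \frac{\sum_{h \in H_P(x)} \binom{n_h}{2}}{\binom{n_P}{2}}

% Integrated EHH over the region around the core SNP:
I_P = \int \mathrm{EHH}_P(x)\, dx

% Unstandardized XP-EHH (assumed orientation: THB over JH):
\mathrm{XP\text{-}EHH} = \ln\!\left(\frac{I_{\mathrm{THB}}}{I_{\mathrm{JH}}}\right)
```

Under this orientation, a large positive value indicates that haplotype homozygosity extends further in THB than in JH, which is the pattern expected after a sweep in THB.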
A comparison between THB and JH was appropriate because these populations have been bred in different environments for a long time. We divided the whole horse genome into non-overlapping 50 kb segments and calculated XP-EHH and XP-CLR scores as window statistics for a total of 44,826 and 44,844 genomic regions, respectively, which allowed us to compare genomic regions across populations. Because whole-genome sequencing data were available, empirical distributions could be constructed from all regions. The XP-EHH scores of the 44,826 regions and the XP-CLR scores of the 44,844 regions were normally distributed, as expected (Supplementary Figures S3 and S4). In this analysis, we used an outlier approach on these distributions to detect significant selective regions [24]. We defined the top 1% of the XP-EHH and XP-CLR scores as significant selective regions and identified 448 significant regions for each statistic that were under selection in THB compared to JH. Of these 448 regions, 116 (XP-EHH) and 164 (XP-CLR) were annotated and contained 98 and 200 genes, respectively.
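The outlier approach described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in this study: the function name `top_outlier_windows` is hypothetical, the scores are simulated rather than real, and rounding the 1% cutoff down is an assumption (it reproduces the 448 regions reported for both 44,826 and 44,844 windows).

```python
import math
import random
from typing import List, Sequence


def top_outlier_windows(scores: Sequence[float], fraction: float = 0.01) -> List[int]:
    """Return the indices of the windows whose scores fall in the top `fraction`."""
    # Rounding down is an assumption; it gives 448 for both window counts in the text.
    k = math.floor(len(scores) * fraction)
    # Rank window indices from highest to lowest score and keep the first k.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Simulated standard-normal window scores (illustrative only, not real data).
random.seed(0)
simulated = [random.gauss(0.0, 1.0) for _ in range(44_826)]
outliers = top_outlier_windows(simulated)
print(len(outliers))  # 448
```

Selecting by rank rather than by a fixed score threshold keeps the number of significant regions the same for both statistics, which matches the 448 regions reported for each.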
We considered that regions with outlier XP-EHH and XP-CLR scores provide evidence of domestication and selection in THB. We constructed a biological network using Gene Ontology analysis, which yielded 72 Gene Ontology biological process (BP) terms (Figure 3). The BP terms were then grouped into 20 categories based on the genes involved, among which we focused on those supporting the characteristics of THB (Supplementary Table S5). We hypothesize that the enriched BP terms related to immune function, ocular size and visual function, and energy metabolism might contribute to the superior racing performance of THB (Table 1).
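To make concrete what it means for a BP term to be enriched, the sketch below computes a one-sided hypergeometric over-representation p-value. This is a generic illustration and an assumption on our part: the text does not state which enrichment test was used, the function name `enrichment_p_value` is hypothetical, and the example counts other than the 288 detected genes are invented for demonstration.

```python
from math import comb


def enrichment_p_value(n_background: int, n_term: int, n_selected: int, n_overlap: int) -> float:
    """P(X >= n_overlap) when n_selected genes are drawn without replacement
    from n_background genes, of which n_term are annotated with the term."""
    total = comb(n_background, n_selected)
    upper = min(n_term, n_selected)
    # Sum the hypergeometric probabilities for overlaps of n_overlap or more.
    tail = sum(
        comb(n_term, i) * comb(n_background - n_term, n_selected - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 background genes, 50 annotated with a term,
# 288 genes in selective regions, 5 of which carry the annotation.
p = enrichment_p_value(20_000, 50, 288, 5)
print(f"{p:.3g}")
```

In practice such p-values would be corrected for the number of terms tested; that step is omitted here.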
The BP term category “Negative regulation of intracellular transport of viral material” is related to immunity. This term refers to any process that stops, prevents, or reduces the frequency, rate, or extent of intracellular transport of viral material. The genes involved in this term are bone marrow stromal cell antigen 2 (BST2) and tripartite motif containing 5 (TRIM5).
BST2, also known as tetherin, is associated with the growth and development of B cells. It is an interferon-inducible transmembrane protein that provides innate immune response activity by inhibiting members of the retrovirus, filovirus, arenavirus, and herpesvirus families [25]. Equine tetherin orthologues lacking the dual tyrosine motif could potently activate nuclear factor kappa B subunit 1 (NF-κB) signaling. NF-κB plays a key role in regulating the immune response to infection [26].
The TRIM5 gene encodes a member of the tripartite motif (TRIM) family, which is characterized by three zinc-binding domains (a RING, a B-box type 1, and a B-box type 2) and a coiled-coil region. The protein forms homo-oligomers via the coiled-coil region and localizes to cytoplasmic bodies. It appears to function as an E3 ubiquitin ligase and ubiquitinates itself to regulate its subcellular localization. It may play a role in retroviral restriction. Multiple alternatively spliced transcript variants encoding different isoforms have been described for this gene [27]. Another immune-related gene, RIN3, which plays a role in the maturation of phagosomes that engulf pathogens, has previously been found to be under positive selection in THB in association with racing performance [28].
THB are the fastest runners among the horse breeds used in the racing industry, and several studies have investigated why they run faster [1,5,28]. THB have been selected for structural and functional adaptations that contribute to their fast running performance [28]. Here, we report genes and BP terms that potentially contribute to the superior racing performance of THB. The BP terms photoreceptor cell development and otic vesicle morphogenesis were enriched in relation to eye and ear development, respectively (Table 1). Genes related to eye photoreceptor cell differentiation include centrosomal protein 290; G protein subunit gamma transducin 1; crumbs 1, cell polarity complex component (CRB1); olfactomedin 3; and neurotrophic receptor tyrosine kinase 2 (NTRK2). We inferred that, from the viewpoint of evolution at the intra-species level, strong selection on eye photoreceptor cell differentiation could directly increase ocular size, which in turn would improve eyesight.
In vertebrates, ocular characteristics are influenced by many factors, including body or head size, diet, and activity pattern. Heard-Booth and colleague stressed that maximum locomotor speed plays a key role in determining ocular shape in mammals [5]. Leuckart’s Law describes the relationship between a measure of axial eye diameter and maximum speed [5]: animals capable of achieving fast running speeds require large eyes to enhance visual acuity and avoid collisions with environmental obstacles. Consistent with this, absolute ocular diameter has been reported to be significantly correlated with maximum running speed in mammals [2], and the relationship between maximum running speed and eye size in a diverse sample of mammals supported this law [5].
Additionally, two more GO terms in this study directly or indirectly supported positive selection on ocular size and function in THB: dendrite development and regulation of synapse assembly. Several genes that trigger dendrite development were identified [29–31]: protein kinase, cGMP-dependent, type I (PRKG1); rap guanine nucleotide exchange factor 2 (RAPGEF2); RAB17, member RAS oncogene family (RAB17); neural EGFL like 1 (NELL1); DCC netrin 1 receptor (DCC); NCK adaptor protein 2 (NCK2); ghrelin and obestatin prepropeptide (GHRL); adhesion G protein-coupled receptor B3 (ADGRB3); and NTRK2. When light reaches the retina after traveling through the cornea and lens, ganglion cells receive electrical signals through their dendrites and send these signals down the optic nerve. EPH receptor A5, RAB17, SH3 and multiple ankyrin repeat domains 2, and GHRL are related to regulation of synapse assembly [32–34]. Based on this knowledge of the optic nerve, we reasoned that eyesight is closely related to synapse function, because the retina has several neuronal layers and communication among them is very important for eye function. In addition, adenylate cyclase 1, which is involved in regulatory processes in the central nervous system that play a role in memory and learning, has been found to be under selection in racehorse populations [28].
The BP term brown fat cell differentiation is defined as the process in which a relatively unspecialized cell acquires the specialized features of a brown adipocyte, an animal connective tissue cell involved in adaptive thermogenesis [35]. Brown adipose tissue differs from white adipose tissue in the way it expends energy [1]. The type, intensity, and duration of exercise determine the amount and form of fuel used (carbohydrate vs. free fatty acid): aerobic activities (long duration, low intensity) use more free fatty acids as fuel than anaerobic activities (short duration, high intensity), which use more glucose. However, the horse almost always uses both fuel types to some degree at the same time. As activity level (e.g., running speed) increases, oxygen consumption rises to meet the increased demand for ATP production. When the body needs energy, it uses ATP, which is mainly produced in the mitochondria, and brown fat cells have more mitochondria than other cells. When brown fat is activated, it produces uncoupling protein 1, which prevents ATP production in the mitochondria. Instead of generating ATP, the mitochondria generate heat, which increases body temperature. The effect of dietary fat supplementation on horse performance has also been reported. Genes including peroxisomal biogenesis factor 11 alpha, laminin subunit alpha 4, zinc finger protein 516, and PR/SET domain 16 are related to brown fat cell differentiation. Positive selection of genes involved in brown adipose tissue differentiation has previously been identified in THB [1].
The insulin receptor signaling pathway is another enriched pathway; it controls critical energy functions such as glucose and lipid metabolism. It has previously been found to be enriched in THB [1] in relation to racing performance. It also has a role in the differentiation of brown adipocytes [36].
Through QTL analysis, we identified six QTL regions that overlapped with genes in selective regions of THB (Table 2). Ceramide synthase 6 (QTL chr18:48212639_48319679) is a well-known gene associated with racing distance, and ADGRB3 (QTL chr20:60009473_60987311) is closely related to recurrent uveitis of the eye. Recurrent uveitis is an acute, non-granulomatous inflammation of the uveal tract of the eye that occurs commonly in horses of all breeds [37].
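The overlap between QTL intervals and 50 kb selective windows can be illustrated with a short Python sketch. It assumes that windows are numbered from 0 starting at position 0 of each chromosome and that intervals are closed; both conventions, the function names, and the example selective window are hypothetical and are not taken from the analysis itself. Only the two QTL coordinates come from the text.

```python
from typing import Dict, List, Tuple

WINDOW_SIZE = 50_000  # 50 kb non-overlapping windows, as in the text


def windows_spanned(start: int, end: int, window_size: int = WINDOW_SIZE) -> List[int]:
    """Indices of the fixed-size windows touched by the closed interval [start, end]."""
    return list(range(start // window_size, end // window_size + 1))


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if the closed intervals [a_start, a_end] and [b_start, b_end] share a base."""
    return a_start <= b_end and b_start <= a_end


# QTL coordinates quoted in the text.
qtl_regions: Dict[str, Tuple[int, int]] = {
    "chr18": (48212639, 48319679),
    "chr20": (60009473, 60987311),
}

print(windows_spanned(*qtl_regions["chr18"]))  # [964, 965, 966]

# Hypothetical selective window on chr18 (window 965), for illustration only.
selective_window = (48_250_000, 48_299_999)
print(overlaps(*qtl_regions["chr18"], *selective_window))  # True
```

Under these assumptions the chr18 QTL touches three windows and the chr20 QTL touches twenty, so a QTL is flagged whenever any of its windows is among the top 1% outliers.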
THB are an epitome of variation under domestication, yet many of the evolutionary processes underlying the genetics of this diversity are poorly understood. We therefore tried to detect novel selective regions that had not been reported previously. We obtained novel selective regions using XP-CLR analysis, which allowed us to examine the relationship between THB and JH from a different angle. These results can be used to characterize functional variants and explore the specificity of the Thoroughbred breed.