Estimation of Genetic Parameters for Somatic Cell Scores of Holsteins Using Multi-trait Lactation Models in Korea

The study was conducted to analyze the genetic parameters of somatic cell score (SCS) of Holstein cows, which is an important indicator to udder health. Test-day records of somatic cell counts (SCC) of 305-day lactation design from first to fifth lactations were collected on Holsteins in Korea during 2000 to 2012. Records of animals within 18 to 42 months, 30 to 54 months, 42 to 66 months, 54 to 78 months, and 66 to 90 months of age at the first, second, third, fourth and fifth parities were analyzed, respectively. Somatic cell scores were calculated, and adjusted for lactation production stages by Wilmink’s function. Lactation averages of SCS (LSCS1 through LSCS5) were derived by further adjustments of each test-day SCS for five age groups in particular lactations. Two datasets were prepared through restrictions on number of sires/herd and dams/herd, progenies/sire, and number of parities/cow to reduce data size and attain better relationships among animals. All LSCS traits were treated as individual trait and, analyzed through multiple-trait sire models and single trait animal models via VCE 6.0 software package. Herd-year was fitted as a random effect. Age at calving was regressed as a fixed covariate. The mean LSCS of five lactations were between 3.507 and 4.322 that corresponded to a SCC range between 71,000 and 125,000 cells/mL; with coefficient of variation from 28.2% to 29.9%. Heritability estimates from sire models were within the range of 0.10 to 0.16 for all LSCS. Heritability was the highest at lactation 2 from both datasets (0.14/0.16) and lowest at lactation 5 (0.11/0.10) using sire model. Heritabilities from single trait animal model analyses were slightly higher than sire models. Genetic correlations between LSCS traits were strong (0.62 to 0.99). Very strong associations (0.96 to 0.99) were present between successive records of later lactations. Phenotypic correlations were relatively weaker (<0.55). All correlations became weaker at distant lactations. The estimated breeding values (EBVs) of LSCS traits were somewhat similar over the years for a particular lactation, but increased with lactation number increment. The lowest EBV in first lactation indicated that selection for SCS (mastitis resistance) might be better with later lactation records. It is expected that results obtained from these multi-trait lactation model analyses, being the first large scale SCS data analysis in Korea, would create a good starting step for application of advanced statistical tools for future genomic studies focusing on selection for mastitis resistance in Holsteins of Korea.


INTRODUCTION
Most dairy breeding programs aim at the improvement in milk production by selection which consequently lead to decreased udder health, reproduction performances of cows (Dematawewa and Berger, 1998;Negussie et al., 2008) and impose increasing production costs on milk. Mastitis, as an udder disease, always captures the most interest being the highly frequent and costly dairy disease and, eventually leads to an involuntary and premature culling of milking cows for defective udder characteristics and decreased milk yield.
Given the economic importance of mastitis, selection for udder health or resistance to mastitis becomes very important consideration in dairy breeding programs. But genetic evaluation of mastitis is particularly difficult as it is a low heritable trait and categorical in nature (Rupp and Boichard, 1999;Carlen et al., 2004;Negussie et al., 2008;Bloemhof at al., 2009). Instead, the somatic cell count (SCC) is considered to be the most suitable indicator trait for mastitis resistance in view of its medium to high genetic correlation with mastitis (0.36 to 0.98) and its greater heritability than mastitis (Mrode and Swanson, 1996;Poso and Mantysaari, 1996;Rupp and Boichard, 1999;Carlen et al., 2004;Negussie et al., 2008;Bloemhof et al., 2009). The SCC is also easy to record. The selection for lower SCC in milk has a positive effect against the incidence of mastitis (Rupp and Boichard, 1999;Nash et al., 2000). Selection against high SCC also does not deteriorate the immune system of cattle and decreases a risk of infection at the same time (Rupp et al., 2000;Boettcher et al., 2002;Odegard et al., 2003;Madsen et al., 2008). However, the genetic evaluation of this trait is mostly based on somatic cell score (SCS), a logarithmic transformation of SCC to achieve normality of distribution (Ali and Shook, 1980).
In Korea, Holsteins are the major source of domestic milk production. Recently, an active participation of Korea in the INTERBULL dairy evaluation programs for Holstein cattle greatly emphasizes the analysis of large scale lactation data. Local breeders and producers are also little aware about the genetic contribution for the deterioration in udder health of Holsteins. Except some studies for milk production traits, no study has reported a large scale evaluation on SCS yet. For this reason, as a priori, the present study was aimed to estimate the genetic parameters for SCS of Holstein cows using multi-trait lactation models.

Animals, data recording and data transformation
A total of 1,409,052 test-day lactation SCC raw records were collected from first five parities of Holstein cows in South Korea from 2000 and 2012. Records of SCC less than 6 thousand or greater than 3 million cells/mL of milk were discarded from the original dataset. Also, the SCC with test dates recorded before 5 days in milk (DIM) and after 305 DIM, and less than 5 SCC records/lactation of a cow were removed. Initial edits on the data also included SCC records from cows that have calved from 18 to 42 mo, 30 to 54 mo, 42 to 66 mo, 54 to 78 mo, and 66 to 90 mo of age in the first, second, third, fourth, and fifth calving, respectively.
As SCC data usually have a non-normal distribution, a logarithmic scaling of SCC was carried out, as suggested by earlier studies, to achieve a data with more normal distribution, increased heritability, and reduced heterogeneity of within-herd and within-sire variances (Shook, 2006). Each of the test-day SCC was transformed to SCS as SCS = Log 2 (SCC/100)+4. Instead of a constant 3 which most studies used, a constant 4 was used to avoid the problem of negative value in the data (Martins et al., 2011). Each of the test-day SCS was adjusted by Wilmink's (1987) function for milking stage of the cow due to the changing pattern of SCS within its production trajectory. The adjustment function used was y ij =a 0 +a 1 t ij +a 2 exp(-0.05t ij ) +e ij , where y ij is a record of SCS for i th animal at j th test-day, e ij is the residual errors, t ij is the difference in days between j th test-date and calving-date of the i th animal, and a 0 , a 1 , and a 2 are the regression coefficients of SCS (y ij ) on t ij and exp(-0.05t ij ) estimated within each lactation stage. An average lactation SCS (LSCS) was obtained from all testday SCSs which were finally adjusted for lactation age groups (5 groups/ parity; at 5 months interval each).
Finally, two datasets were created applying restrictions for herd, number of sires and dams, and number of LSCS records/animal. For both datasets, record of a herd that had at least 3 selected sires and 3 dams were included. And a sire was selected if it contributed minimum 10 progeny records from 5 herds. Also, an animal was included if it had at least two (dataset 1) or three consecutive parity records (dataset 2). Above restrictions were implemented repeatedly through an iterative process until a dataset was stabilized. Above basic edits were performed on original dataset to obtain a computational feasibility and a better linked data structure for genetic variance components.

Animal pedigree
Animal pedigrees were prepared separately for animal and sire models according to the datasets. A total of 3,641 and 3,152 animals were found for the sire model analyses of dataset 1 and 2, respectively. Respective pedigrees were traced back to 16 and 15 generations as well. For animal models, pedigrees sizes for dataset 1 and 2 were 252,917 and 151,589 animals, respectively, with a pedigree depth of 20 generations for each. About 32% to 36% inbred were found among pedigree files. Most of the inbred (86% to 95%), which generally found to be in the parent generations, had inbreeding coefficient less than 0.05 among pedigree sets. Average inbreeding coefficients among pedigree datasets varied from 0.007 to 0.009.

Data analysis
The datasets for analyses contained a total of 264,716 and 81,955 animal records, respectively. Sire model and animal model analyses were carried out using both datasets. In the analysis each LSCS record was considered as individual traits. The herd-year (HY) classes were created by merging respective herd number and calving year, and fitted as a random effect in various models. Age at calving of animals was regressed with the response variable as a fixed covariate. All factors were fitted parity-wise to their respective LSCS using either multivariate sire models or univariate animal models that employed the restricted maximum likelihood approach, to estimates the components of variances, covariances and genetic parameters of each trait. Genetic variances and covariances were estimated using the VCE 6.0 software package (Neumaier and Groeneveld, 1998). The genetic model applied to calculate additive genetic and residual variances, in matrix notation, was as follows: where y is the vector of observations, b is the vector of fixed effects and covariates, u is the vector of random additive effects, q is the vector of random HY effects, e is the vector of random residual effects, and X, Z, and W are known design matrices relating observations to the fixed and random effects b, u and q respectively. The assumptions for distribution of y, u, q and e can be described as where G is the variance-covariance matirx of the random effects vector u, H is the variance-covariance matrix of the random effects vector q, and R is the matrix of residual variances and covariances. The G, H, and R matrices are described as G = A⊗G 0 , where A is the animal relationship matrix, G 0 is the additive genetic variance-covariance matrix between traits, and ⊗ is the direct product operator, H = I⊗H 0 and, R = I⊗R 0 , where I is the identity matrix of the same order as y, H 0 is the matrix of HY variance-covariance between traits, R 0 is the matrix of residual variance-covariance between traits, 0 is the null vector, and Φ is the null matrix.
Next, we calculated heritabilities (h 2 ) for traits using (co)variance estimates of sire and animal models. Genetic and phenotypic correlations were calculated from the (co)variance estimates of the multivariate sire models only. To obtain h 2 from sire model, the estimated sire variance for a trait was multiplied by 4, as it is known that sire model can obtain only 1/4 th of the total genetic variances. Genetic trends for LSCS in dataset 1 were obtained from estimated breeding values (EBV) for all animals using (co)variances obtained by sire-models, which were derived by PEST software package (Groeneveld et al., 1990).

Descriptive statistics of the datasets
Summary data on lactation average SCSs in dataset 1 and 2 are shown in Table 1 and 2. Number of records from fifth parity was comparatively lower than earlier lactations. In dataset 1, the average LSCS among parities varied from 3.507±1.09 to 4.322±1.48, corresponding to a range between 71,000 to 125,000 cells/mL. Dataset 2 also derived similar average LSCS in Holsteins. Average LSCS were  increased with the increment of lactations number in the cows, although differences in means were little (Figure 1). We obtained generally low to moderate coefficient of variation (CV) for LSCS traits which ranged from 28.2% to 29.9%. Despite the difference in use of a constant 4 to obtain a log-transformed SCS where many studies used a value of 3 or that related to differences in measurement of mean, our obtained average LSCSs were somewhat in agreement with earlier reports. Similar averages in Holsteins for logSCC by Ivkić et al. (2012;3.31), for first parity SCS by Ptak et al. (2009a;3.51), and for lactation SCS and daily SCS by Ptak et al. (2009b;3.71, 3.49). Our means were comparable to means found in French Holsteins by Biochard and Rupp (1997), and Rupp and Biochard (1999). However, slightly greater SCS observed in Finnish Holstein by Luttinen and Juga (1997) and Dube et al. (2008). A study of Portuguese Holstein-Friesians by Martins et al. (2011) also showed a similar increasing pattern of SCS as the number of parity increased. This increment of SCS over lactations generally occurs in a dairy breed with faster milk flow, in which the increased milk flow causes an animal more prone to mastitis. The trend of fairly stable and moderate LSCSs found in the study (Figure 2) indicates that mastitis infection in Korean Holstein population was found often; generally in cows with more consecutive parity records.

Heritability estimates
The variance components, heritability estimates of traits derived via multi-trait sire and animal model analyses are shown in Table 3 and 4, respectively. A highest heritability estimate of 0.14 was found in lactation 2 and 4 using sire models. Heritabilities (h 2 ) among parities obtained through sire model analyses were more or less similar among datasets, especially for h 2 of LSCS 2 and LSCS 3 . In contrast, the univariate animal model heritabilities were higher than those from sire model estimates and, the observed differences in h 2 estimates among datasets were negligible too. However, the analyses using all datasets yielded lowest heritabilities for LSCS 5 trait. This is probably because of the bias caused by noticeably fewer records from 5th parity cows as compared to other parities, and resulted underestimated genetic variances for the respective trait; whereas, gradually increased genetic variances were noticed from parity 1 to 4 traits. Another source of bias in the 5th lactation estimates (LSCS 5 ) using animal model might be associated with the presence of noticeably fewer dams than other parities, which was less influential to sire model estimates as expected. With the parity number increases, the proportional increases of variance for herd-year and residual effects deemed relatively higher than the additive genetic variance, indicating more environmental influences on SCS increase. A higher SCS also indicates the consequences of more infection of mastitis in the udder. Environmental variances were found to be relatively lower in the animal model analyses than those found in the sire models, as an opposite of the genetic variances, and those are possibly due to the presence of complete pedigree relationships in the animal model analyses.
Generally, a low to medium range of heritability was reviewed in previous reports, such as 0.05 to 0.29 (Schutz et al., 1990) or 0.06 to 0.13 (Ptak et al., 2009a). However, most of the authors have published h 2 estimates for LSCS between 0.10 and 0.27 (Monardes et al., 1990;Mrode and Swanson, 1996;Rupp and Boichard, 1999;Mark and Sullivan, 2005). Rupp and Boichard (2003) reviewed that LSCS trait, obtained by averaging the individual test day records, shows consistently higher h 2 estimate around 0.15. Estimates of heritabilities in this study deemed very consistent with most of these earlier studies. Further agreement is found in Austrian Fleckvieh cows which showed a mean h 2 of 0.09 to 0.13 for SCS traits in first five lactations (Koeck et al., 2010). Heritabilities obtained by Kadarmideen (2004) and Charfeddine et al. (1997) are also comparable (0.13 to 0.14). Our estimates in first three lactations are concordant with Swedish Holstein (Carlen et al., 2004), and Polish Holstein (Rzewuska et al., 2011), even though traits were defined differently. Similar h 2 estimates were obtained from the repeated measures testday SCS model for first three lactations (Reents et al., 1995). However, some disagreements were observed from several reports (Haile-Mariam et al., 2001;Mrode and Swanson, 2003;Heringstad et al., 2008;Dube et al., 2008;Martins et al., 2011;Ivkić et al., 2012), where h 2 estimates tend to be higher or lower than those estimates found in this study. Increase of genetic variances and heritability estimates along with the increment in parity numbers have also been reported in earlier studies (Schutz et al., 1990;Samore et al., 2002;Muir et al., 2007;Ptak et al., 2007;Martins et al., 2011).

Genetic and phenotypic correlations
Genetic correlations among LSCS traits derived via sire models are shown in Table 5. The estimates of genetic correlation deemed strong among lactation SCSs. The highest genetic correlation was observed between LSCS 4 and LSCS 5 (0.99) whereas, the lowest genetic correlation was found between LSCS 1 and LSCS 5 (0.62). Genetic correlations were generally strong between successive lactations (0.82 to 0.99), and tended to decrease between distant lactations. Sire model analysis using dataset 2 also derived similar estimates (Table 5). The phenotypic correlations among LSCS traits, however, were expectedly low to moderate. The highest phenotypic association of 0.54 was observed between LSCS 4 and LSCS 5 in the study. Phenotypic correlations between other lactations ranged between 0.22 and 0.51. Genetic trends of LSCS traits were shown in Figure 2. Even though it was found that  LSCS, lactation averages of somatic cell score; G, additive genetic variance; HY, herd-year class; E, residual variance. 1 LSCS1 to 5, lactation average somatic cell score traits for 1st to 5th parity.
Variance estimates corresponded to single-trait and multiple-trait analyses using animal and sire models, respectively.
heritabilities increased slightly with more parity records per cow (dataset 2), their genetic correlations rather deemed less influenced. In fact, more animals with at least two successive parity records were sufficient to obtain similar correlations as estimated from animals having more than 2 successive parity records. For correlations between traits, there was a harmony between the present study and previous reports. The strong genetic correlations among traits in the present study are found to be similar with estimates by Dube et al. (2008), Reents et al. (1995), Carlen et al. (2004), Mostert et al. (2004), and Mrode and Swanson (1996). A review by Rupp and Boichard (2003) also depicted similar estimates. A greater agreement for strong genetic correlation between adjacent LSCS was also observed from studies (Haile-Mariam et al., 2001;Koivula et al., 2002Koivula et al., , 2004Odegard et al., 2003;Negussie et al., 2006). These high estimates of genetic correlation (greater than 0.8) indicate that genetic determinism of SCC is close within and across lactations. Phenotypic correlations were also are concordant with earlier studies.
Phenotypically, the yearly trends of LSCS ( Figure 1) were consistently lower and higher LSCS at lactation 1 and 5, respectively. Interestingly, the genetic trend of LSCS 1 using EBV of animals from dataset 1 (Figure 2) showed fairly consistent and very low EBVs in animals across the years for the same trait. Estimated breeding value of LSCS 2 , LSCS 3 and LSCS 4 , although higher than EBV of LSCS 1 , showed somewhat similar trends, indicating less degradations over time. The observed EBVs indicate that genetic merit for high SCS (susceptibility to mastitis) in the first lactation of Holsteins is negligible but increases as lactation number progresses. Evidently, these results seem to be in line with the claim that first lactation trait probably is a different trait from second and later lactations (Reents et al., 1995). They found more incidence of mastitis in later lactations which is also similar to our study. Reents et al. (1995) further suggested that SCS from later lactations might be a better indicator for mastitis susceptibility than only first lactation SCS. Shook and Schutz (1994) also suggested use of later lactation records, ideally in a multiple-trait model.
Mastitis resistance is a complex trait that depends on genetic component as well as physiological and environmental factors including infection pressure (Rupp and Biochard, 2003). It is difficult to improve this trait by udder health management only when its genetic trend shows an increment. So, a selection for reducing milk SCC through breeding approaches should be considered. Otherwise, if SCC is kept unnoticed the SCC could increase due to correlated response when intense selection for high milk yield is performed. The selection of rear udder height and udder cleft type traits can also result the same. Contrarily, if SCS is the only criteria for resistance against mastitis, the response will be slow because of its lower heritability, which perhaps will even be slowed down by the lack of perfect genetic correlation between SCS and mastitis (Koivula et al., 2005). So, the best way to SCS evaluations might be through an economic index with a lesser emphasis on SCS relative to the most desired high milk yield traits; otherwise, a greater emphasis would decrease genetic gain in yield traits which are economically more important (Schutz, 1994).
Statistical methods for SCS evaluation vary widely among reports such as, lactation average model, test-day model, or random regression models. Each of them provides particular advantages than the other. Lactation average model is the simplest model of evaluation where a LSCS, the mean (or a weighted mean) of the test-day SCS are used. Considering high correlations among lactations test-day models using monthly repeated records of SCC were also developed (Reents et al., 1995). Random regression model, which allows more accurate modelling of the each test-day records, supports more complex statistical methods, but are computationally limited in many countries computation infrastructure. As a result, the preferred statistical model in earlier reports commonly reflected their available data structure, and proper computational tools to analyze them. In this study, we considered the lactation average model, based on the current limitation with the data structure (i.e. missing lactation records/animal), and computation platform. The choice of LSCS is justified by the common genetic determinism of SCS all along the lactation (Rupp and Biochard, 2003). In addition to data pruning and transformation, we fitted SCS with test-day (DIM) via Wilmink's function, which performed an adjustment for the test-days as well as for the time at minimum SCS occurs (50 days post-partum approximately), to better fit the lactation curve of animals and to obtain less biased estimates. Further, the consideration of multi-trait lactation model, due to high correlation between lactation data, also exploited the existing (co)variance between lactation records. To conclude, it can be noted that genetic parameters in this study are consistent with many previous reports. Heritability estimates and (co)variances components indicate the possibility of selection for SCS or mastitis resistance. Genetic correlations are favorable for improvement of one lactation SCS trait by selection of another LSCS trait. Although the evidence for genetic merit degradation of SCS traits is not strong, it is not to be ignored either. And, the EBV estimate indicates that it will be better if later lactation records are given priority for selection than only first lactation SCS. Strongly correlated successive LSCS traits in this study also suggest that multiple-trait evaluation may yield better result against selection of mastitis. It is expected that these results will be useful in setting up breeding targets, and selection criteria for dairy development in Korea and, as a valuable resource while working alongside the research goals of INTERBULL. We hope that the estimated variances and covariances will provide a better step to start with and compare estimates when random regression models will be implemented for the genomic evaluation of Korean Holsteins.