Strategies for whole-exome sequencing analysis in a case series study of familial male infertility


Background: Infertility is a common health issue worldwide. The prevalence of male factor infertility among infertile couples is approximately 30%-35%, of which genetic factors account for 15%. The family-based whole-exome sequencing (WES) approach can accurately detect novel variants. However, selecting appropriate samples for WES data generation has proven challenging in familial male infertility studies. The aim of this study was to identify types of pathogenic male infertility in cases of familial asthenozoospermia.

Case: Two families with multiple affected cases were recruited for WES. The study population included two affected cases in pedigree I and three affected cases in pedigree II. Two variant callers (SAMtools and GATK) were each applied with a single-sample calling strategy (SSCS) and a multiple-sample calling strategy (MSCS) to identify variant sites.

Conclusion: In this study, we present the results of variant prioritization of WES data, performed without sequencing fertile siblings in the same pedigree, using two different pipelines (a homozygosity-based strategy and a linkage-based strategy). Using these strategies, we prioritized annotated variants and generated a logical shortlist of private variants for each pedigree.
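As an illustration only, a homozygosity-based prioritization step of the kind named above could be sketched as follows. This is a minimal sketch under stated assumptions, not the pipeline used in the study: the `Variant` record, the `shared_homozygous` function, the 1% allele-frequency cutoff, the sample IDs, and the toy variants are all hypothetical, and the input is assumed to be already called and annotated.

```python
from typing import Dict, List, NamedTuple


class Variant(NamedTuple):
    """Hypothetical record for one annotated variant (not from the study)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    genotypes: Dict[str, str]  # sample ID -> VCF-style genotype, e.g. "1/1"
    pop_af: float              # assumed population allele frequency annotation


def is_hom_alt(gt: str) -> bool:
    """True if the genotype is homozygous for a non-reference allele."""
    alleles = gt.replace("|", "/").split("/")
    return (
        len(alleles) == 2
        and alleles[0] == alleles[1]
        and alleles[0] not in ("0", ".")
    )


def shared_homozygous(
    variants: List[Variant], affected: List[str], max_af: float = 0.01
) -> List[Variant]:
    """Keep rare variants that are homozygous-alternate in every affected case.

    The max_af cutoff is an illustrative assumption, not a value from the study.
    A sample missing from a variant's genotypes is treated as uncalled ("./.").
    """
    return [
        v
        for v in variants
        if v.pop_af <= max_af
        and all(is_hom_alt(v.genotypes.get(s, "./.")) for s in affected)
    ]


# Toy data for two affected cases in one pedigree (made up for illustration).
variants = [
    Variant("chr1", 1000, "A", "G", {"I-1": "1/1", "I-2": "1/1"}, 0.001),
    Variant("chr1", 2000, "C", "T", {"I-1": "0/1", "I-2": "1/1"}, 0.001),
    Variant("chr2", 3000, "G", "A", {"I-1": "1/1", "I-2": "1/1"}, 0.20),
]
shortlist = shared_homozygous(variants, ["I-1", "I-2"])
print([(v.chrom, v.pos) for v in shortlist])  # [('chr1', 1000)]
```

In this toy run the second variant is dropped because one affected case is heterozygous, and the third because it is common in the assumed population data. A linkage-based strategy would require a different filter and is not shown here.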

Key words: Male infertility, Whole-exome sequencing, GATK, SAMtools.
