Identification of Single Nucleotide Polymorphisms (SNPs) in the Transcriptomes of American Holstein and Pakistani Cholistani Cattle

Ghasemi Siab, Mojgan; Varkoohi, Sheida; Banabazi, Mohammad Hossein

doi:10.22067/ijasr.v13i2.82194

Article type: Research article - Genetics and Breeding of Livestock and Poultry

Authors

¹ Department of Animal Science, College of Agriculture and Natural Resources, Razi University, Kermanshah, Iran.

² Department of Animal Science, College of Agriculture and Natural Resources, Razi University, Kermanshah, Iran.

³ Biotechnology Research Department, Animal Science Research Institute of Iran, Agricultural Research, Education and Extension Organization, Karaj, Iran.

Abstract

Transcriptome-level studies can bridge the gap between genotype and phenotype and help clarify the mechanisms by which sequence is converted into function. In this study, RNA-Seq data from two cattle populations, American Holstein (Bos taurus) and Pakistani Cholistani (Bos indicus), were used to discover single nucleotide polymorphisms (SNPs). Quality control and trimming of the data were performed with FastQC and Trimmomatic. The transcriptome was assembled by aligning and mapping the RNA-Seq reads to the bovine reference genome with TopHat2, and SNP discovery was then carried out on the transcriptome with the Samtools package. This identified 50,183 and 137,954 SNP sites in Holstein and Cholistani cattle, respectively, of which 15,308 were shared. No direct relationship was observed between the number of SNPs found and chromosome length. Twelve SNP types were identified: four transition types and eight transversion types. Transitions were the most common SNPs, accounting for 70.6% of SNPs in the Cholistani breed and 69.6% in the Holstein breed. The transition-to-transversion ratio (Ts/Tv) of the SNPs was 2.4 in the Cholistani breed and 2.3 in the Holstein breed.

Keywords

20.1001.1.20083106.1400.13.2.12.8

Article title [English]

Single Nucleotide Polymorphism (SNP) Discovery on Transcriptomes of American Holstein and Pakistani Cholistani Cows

Authors [English]

Mojgan Ghasemi Siab ¹
Sheida Varkoohi ²
Mohammad Hossein Banabazi ³

¹ Department of Animal Science, College of Agriculture & Natural Resources, Razi University, Kermanshah, Iran.

² Department of Animal Science, College of Agriculture & Natural Resources, Razi University, Kermanshah, Iran.

³ Department of Biotechnology, Animal Science Research Institute of IRAN (ASRI), Agricultural Research, Education & Extension Organization (AREEO), Karaj, Iran.

Abstract [English]

Introduction Single nucleotide polymorphisms (SNPs) are single-base variations, caused by transitions (C/T or G/A) or transversions (C/G, C/A, T/A, or T/G), at the same position between individual genomic DNA sequences. SNPs are widely used as molecular markers in genetics and breeding studies. About 40% of the SNPs in genes cause an amino acid change. The rapid advance of next-generation sequencing provides a high-throughput means of SNP discovery. Transcriptome studies can fill the gap between genotype and phenotype and help explain the mechanisms that lead from sequence to function. RNA sequencing (RNA-Seq) is a next-generation sequencing technology for studying the whole transcriptome and gene expression. It enables the simultaneous study of transcript sequences and highly accurate quantitative gene expression (digital expression). These data are therefore well suited to high-throughput study of the expression levels of all transcribed genes and of their SNPs. Recently, RNA-Seq has also been used as an efficient and cost-effective method to systematically identify SNPs in transcribed regions in different species. A transcriptome-based sequencing approach offers a cheaper way to identify a large number of polymorphisms and possibly to discover causative variants.
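The transition/transversion distinction described above can be made concrete with a short sketch. The function name `classify_substitution` and the table `SUBSTITUTION_TYPES` are hypothetical and are not taken from the study; the code only encodes the standard rule that a transition stays within purines (A, G) or within pyrimidines (C, T), while a transversion crosses between the two classes.

```python
from itertools import permutations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    # Same chemical class (purine<->purine or pyrimidine<->pyrimidine) is a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# All directed substitutions (reference base > alternative base) among A, C, G, T.
SUBSTITUTION_TYPES = {
    f"{ref}>{alt}": classify_substitution(ref, alt)
    for ref, alt in permutations("ACGT", 2)
}
```

Counting directed substitutions in this way gives 12 SNP types in total, of which 4 are transitions and 8 are transversions.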
Materials and Methods In this study, RNA-Seq data were used for SNP discovery in American Holstein (Bos taurus) and Pakistani Cholistani (Bos indicus) cows. The data comprised 21,078,477 and 20,940,063 paired-end reads of 75 bp, obtained from pooled whole blood samples of 40 Holstein cows at the University of Wisconsin Dairy Cattle Center, USA, and 45 Cholistani cows at Gujait Peer Farm, Bahawalpur, Punjab, Pakistan, respectively. The data were downloaded from the SRA database at NCBI for the Holstein cows (http://www.ncbi.nlm.nih.gov/sra/SRX317197) and the Cholistani cows (http://www.ncbi.nlm.nih.gov/sra/SRS454433). mRNA sequencing had been run on an Illumina Genome Analyzer IIx (Illumina Inc., San Diego, CA). Data were converted from SRA format to FASTQ format with the fastq-dump command from the Ubuntu Linux version of Sratoolkit 2.5.4-1. Data quality was checked with FastQC (v0.11.3), and adaptors and low-quality reads were trimmed with Trimmomatic 0.33. The adaptor file was the default for the sequencing instrument (TruSeq2-PE.fa), and the minimum read length was set to 50 bp. Trimmed reads were aligned to the UMD3.1 reference genome (release 81), using its annotation data, with TopHat2, which uses Bowtie2 as the aligner. The transcriptome of each of the two cattle populations was assembled with TopHat2 by aligning and mapping the RNA-Seq reads to the bovine reference genome. SNPs were discovered with Samtools.
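The workflow above can be sketched as an ordered list of command lines. This is a minimal illustration under stated assumptions, not the authors' actual commands: the abstract names the tools and versions but not the invocations. The function `build_pipeline_commands`, all file names, the run accession placeholder, the `ILLUMINACLIP` thresholds `2:30:10`, and the use of `bcftools call` after `samtools mpileup` are assumptions made for this example. The function only builds the commands; it does not run them.

```python
from typing import List


def build_pipeline_commands(
    run_accession: str,
    reference_fasta: str,
    bowtie2_index: str,
    annotation_gtf: str,
    out_dir: str = "out",
) -> List[List[str]]:
    """Build (but do not execute) one command per step of the described workflow."""
    # fastq-dump --split-files writes <accession>_1.fastq and <accession>_2.fastq.
    r1, r2 = f"{run_accession}_1.fastq", f"{run_accession}_2.fastq"
    # Hypothetical names for Trimmomatic's paired/unpaired outputs.
    p1, u1 = f"{out_dir}/r1.paired.fastq", f"{out_dir}/r1.unpaired.fastq"
    p2, u2 = f"{out_dir}/r2.paired.fastq", f"{out_dir}/r2.unpaired.fastq"
    tophat_dir = f"{out_dir}/tophat"
    bam = f"{tophat_dir}/accepted_hits.bam"  # TopHat2's default alignment output
    bcf, vcf = f"{out_dir}/raw.bcf", f"{out_dir}/variants.vcf"
    return [
        # 1. Convert the SRA archive to paired FASTQ files.
        ["fastq-dump", "--split-files", run_accession],
        # 2. Quality-control report.
        ["fastqc", r1, r2],
        # 3. Adaptor and length trimming (TruSeq2-PE.fa and MINLEN:50 are from
        #    the text; the 2:30:10 clipping thresholds are assumed).
        ["java", "-jar", "trimmomatic-0.33.jar", "PE", r1, r2, p1, u1, p2, u2,
         "ILLUMINACLIP:TruSeq2-PE.fa:2:30:10", "MINLEN:50"],
        # 4. Annotation-guided spliced alignment to the reference genome.
        ["tophat2", "-G", annotation_gtf, "-o", tophat_dir, bowtie2_index, p1, p2],
        # 5. Pileup and variant calling (assumed samtools/bcftools usage).
        ["samtools", "mpileup", "-u", "-f", reference_fasta, "-o", bcf, bam],
        ["bcftools", "call", "-m", "-v", "-o", vcf, bcf],
    ]
```

Keeping each step as an argument list makes the plan easy to inspect or log before anything is executed, for example by passing each entry to `subprocess.run(cmd, check=True)` once the paths have been verified.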
Results and Discussion After data editing, the numbers of removed and low-quality reads were almost equal in the two breeds and relatively low. The length of the assembled transcriptome, for example 52,798,651 bases in Holstein, indicates that around 2% of the whole genome (around 2.6 Gbp) is expressed as mRNA. In Cholistani cows, the read mapping rates for forward and reverse reads were 81.3% and 79.9%, respectively, and the multiple-alignment rate was about 9.4%. The overall read mapping rate was 80.6% and the concordant pair alignment rate was 70.1%. In Holstein cows, the read mapping rates for forward and reverse reads were 66.3% and 55.4%, respectively, and the multiple-alignment rate was about 7.2%. The overall read mapping rate was 60.8% and the concordant pair alignment rate was 51.3%. In total, 50,183 and 137,954 SNPs were discovered on the assembled transcriptomes of the Holstein and Cholistani samples, respectively, and 15,308 SNPs were common to both breeds. No direct relationship was found between the number of discovered SNPs and chromosome length. Twelve SNP types were identified, comprising 4 transition and 8 transversion types. Transitions were the most common SNPs, accounting for 70.6% of SNPs in Cholistani and 69.6% in Holstein cows. The ratio of transition to transversion SNPs (Ts/Tv) was 2.4 and 2.3 in Cholistani and Holstein cows, respectively. The number of SNPs discovered in Cholistani cows was approximately three times that in Holstein cows. This is probably because reads from both subspecies were aligned to the same reference genome, which is of Hereford (Bos taurus) origin.
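The summary figures in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers reported in the abstract; the function names are hypothetical, and the genome size is taken as roughly 2.6 Gbp (2.6 billion base pairs), which is the scale at which a 52.8 Mb transcriptome corresponds to about 2% of the genome.

```python
def ts_tv_ratio(transition_fraction: float) -> float:
    """Ts/Tv ratio implied by the fraction of SNPs that are transitions."""
    if not 0.0 < transition_fraction < 1.0:
        raise ValueError("transition_fraction must be strictly between 0 and 1")
    return transition_fraction / (1.0 - transition_fraction)


def fraction(part: float, whole: float) -> float:
    """Plain ratio with a guard against a zero denominator."""
    if whole == 0:
        raise ValueError("whole must be non-zero")
    return part / whole


# Values reported in the abstract.
HOLSTEIN_SNPS = 50_183
CHOLISTANI_SNPS = 137_954
SHARED_SNPS = 15_308
HOLSTEIN_TRANSCRIPTOME_BASES = 52_798_651
GENOME_BASES = 2.6e9  # assumed approximate bovine genome size (2.6 Gbp)

ts_tv_cholistani = ts_tv_ratio(0.706)
ts_tv_holstein = ts_tv_ratio(0.696)
expressed_fraction = fraction(HOLSTEIN_TRANSCRIPTOME_BASES, GENOME_BASES)
snp_fold_difference = fraction(CHOLISTANI_SNPS, HOLSTEIN_SNPS)
shared_of_holstein = fraction(SHARED_SNPS, HOLSTEIN_SNPS)
shared_of_cholistani = fraction(SHARED_SNPS, CHOLISTANI_SNPS)
# Distinct SNP sites across both breeds (inclusion-exclusion).
total_distinct_sites = HOLSTEIN_SNPS + CHOLISTANI_SNPS - SHARED_SNPS
```

With these inputs, 0.706 / 0.294 is about 2.40 and 0.696 / 0.304 is about 2.29, which agrees with the reported Ts/Tv values of 2.4 and 2.3. The Cholistani-to-Holstein SNP count ratio comes out near 2.75, and the shared sites make up roughly 31% of the Holstein SNPs but only about 11% of the Cholistani SNPs.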
Conclusion Differences in expression between the two alleles at a single-nucleotide position cause phenotypic diversity and probably explain a large part of the variance between these two bovine subspecies, especially in susceptibility to diseases and parasites and in tolerance of biotic and abiotic environmental stresses under different environmental conditions. In contrast, differential gene expression analysis, or even allele-specific expression at the gene level, may not be able to explain phenotypic diversity.

Keywords [English]

Chromosome length
Nucleotide replacement
RNA-Seq data
