ROUA DATABASE

Authors

  • K. Shchubelka Oakland University
  • W. Wolfsberger Oakland University
  • O.T. Oleksyk A. Novak Transcarpathian Regional Clinical Hospital
  • Ya. Hasynets Uzhhorod National University
  • S. Patskun Uzhhorod National University
  • M. Vakerych Uzhhorod National University
  • R. Kish Uzhhorod National University
  • V. Mirutenko Uzhhorod National University
  • Vl. Mirutenko Uzhhorod National University
  • C.A. Cotoraci “Vasile Goldiș” Western University of Arad
  • C. Pop “Vasile Goldiș” Western University of Arad
  • O. Neagu “Vasile Goldiș” Western University of Arad
  • C. Baltă “Vasile Goldiș” Western University of Arad
  • H. Herman “Vasile Goldiș” Western University of Arad
  • P. Mare “Vasile Goldiș” Western University of Arad
  • S. Dumitra “Vasile Goldiș” Western University of Arad
  • H. Papiu “Vasile Goldiș” Western University of Arad
  • A. Hermenean “Vasile Goldiș” Western University of Arad
  • T. Oleksyk Oakland University

DOI:

https://doi.org/10.32782/1998-6475.2023.55.89-94

Keywords:

Whole Genome Sequencing, Carpathians, Ukraine, Romania, bioinformatics, genomes

Abstract

We present a multi-layered data source providing the results of Whole Genome Sequencing of two human populations in the Carpathian Mountains region, specifically Ukraine’s Transcarpathia and Romania’s Satu Mare and Baia Mare provinces, areas previously underexplored in population genomics. The database contains raw and annotated whole genome sequence files from 300 individuals from these regions, including annotations of common and unique genetic variants. Samples were collected following a protocol designed to capture the genetic diversity of Ukrainians and Romanians, including minority groups such as Wallachians and Roma. The data are hosted on a dedicated web resource. We provide information on how to access the results of primary and secondary analysis of the data, including comparative analysis with previously published populations from Ukraine and with populations from the International Genome Sample Resource and the Human Genome Diversity Project. Free research access to this database contributes to a growing understanding of human genetic diversity in Central Europe. This effort emphasizes the potential for reuse of the generated data and advocates for open access to support future research in genomics, bioinformatics, and personalized medicine.

References

ALEXANDER, D.H., NOVEMBRE, J., LANGE, K. (2009) Fast model-based estimation of ancestry in unrelated individuals. Genome Research, 19(9), 1655–1664. DOI: 10.1101/gr.094052.109

BERGSTRÖM, A., MCCARTHY, S.A., HUI, R., ALMARRI, M.A., AYUB, Q., DANECEK, P., CHEN, Y., FELKEL, S., HALLAST, P., KAMM, J., BLANCHÉ, H., DELEUZE, J.F., CANN, H., MALLICK, S., REICH, D., SANDHU, M.S., SKOGLUND, P., SCALLY, A., XUE, Y., DURBIN, R., TYLER-SMITH, C. (2020) Insights into human genetic variation and population history from 929 diverse genomes. Science, 367(6484), eaay5012. DOI: 10.1126/science.aay5012

CINGOLANI, P., PLATTS, A., WANG, L.L., COON, M., NGUYEN, T., WANG, L., LAND, S.J., LU, X., RUDEN, D.M. (2012) A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly, 6(2), 80. DOI: 10.4161/FLY.19695

DEPRISTO, M.A., BANKS, E., POPLIN, R., GARIMELLA, K.V., MAGUIRE, J.R., HARTL, C., PHILIPPAKIS, A.A., DEL ANGEL, G., RIVAS, M.A., HANNA, M., MCKENNA, A., FENNELL, T.J., KERNYTSKY, A.M., SIVACHENKO, A.Y., CIBULSKIS, K., GABRIEL, S.B., ALTSHULER, D., DALY, M.J. (2011) A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nature Genetics, 43(5), 491–498. DOI: 10.1038/ng.806

FAIRLEY, S., LOWY-GALLEGO, E., PERRY, E., FLICEK, P. (2020) The International Genome Sample Resource (IGSR) collection of open human genomic variation resources. Nucleic Acids Research, 48(D1), D941–D947. DOI: 10.1093/NAR/GKZ836

KÖSTER, J., MÖLDER, F., JABLONSKI, K.P., LETCHER, B., HALL, M.B., TOMKINS-TINCH, C.H., SOCHAT, V., FORSTER, J., LEE, S., TWARDZIOK, S.O., KANITZ, A., WILM, A., HOLTGREWE, M., RAHMANN, S., NAHNSEN, S. (2021) Sustainable data analysis with Snakemake. F1000Research, 10, 33. DOI: 10.12688/f1000research.29032.2

LANDRUM, M.J., LEE, J.M., BENSON, M., BROWN, G., CHAO, C., CHITIPIRALLA, S., GU, B., HART, J., HOFFMAN, D., HOOVER, J., JANG, W., KATZ, K., OVETSKY, M., RILEY, G., SETHI, A., TULLY, R., VILLAMARIN-SALOMON, R., RUBINSTEIN, W., MAGLOTT, D.R. (2016) ClinVar: public archive of interpretations of clinically relevant variants. Nucleic Acids Research, 44(D1), D862–D868. DOI: 10.1093/nar/gkv1222

LANDRUM, M.J., LEE, J.M., BENSON, M., BROWN, G.R., CHAO, C., CHITIPIRALLA, S., GU, B., HART, J., HOFFMAN, D., JANG, W., KARAPETYAN, K., KATZ, K., LIU, C., MADDIPATLA, Z., MALHEIRO, A., MCDANIEL, K., OVETSKY, M., RILEY, G., ZHOU, G., HOLMES, J.B., KATTMAN, B.L., MAGLOTT, D.R. (2018) ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Research, 46(D1), D1062–D1067. DOI: 10.1093/nar/gkx1153

OLEKSYK, T.K., WOLFSBERGER, W.W., SCHUBELKA, K., MANGUL, S., O’BRIEN, S.J. (2022) The Pioneer Advantage: Filling the blank spots on the map of genome diversity in Europe. GigaScience, 11, 1–7. DOI: 10.1093/GIGASCIENCE/GIAC081

OLEKSYK, T.K., WOLFSBERGER, W.W., WEBER, A.M., SHCHUBELKA, K., OLEKSYK, O.T., LEVCHUK, O., PATRUS, A., LAZAR, N., CASTRO-MARQUEZ, S.O., HASYNETS, Y., BOLDYZHAR, P., NEYMET, M., URBANOVYCH, A., STAKHOVSKA, V., MALYAR, K., CHERVYAKOVA, S., PODOROHA, O., KOVALCHUK, N., RODRIGUEZ-FLORES, J.L., ZHOU, W., MEDLEY, S., BATTISTUZZI, F., LIU, R., HOU, Y., CHEN, S., YANG, H., YEAGER, M., DEAN, M., MILLS, R.E., SMOLANKA, V. (2021) Genome diversity in Ukraine. GigaScience, 10(1), 1–14. DOI: 10.1093/GIGASCIENCE/GIAA159

PRICE, A.L., PATTERSON, N.J., PLENGE, R.M., WEINBLATT, M.E., SHADICK, N.A., REICH, D. (2006) Principal components analysis corrects for stratification in genome-wide association studies. Nature Genetics, 38(8), 904–909. DOI: 10.1038/ng1847

SOLLIS, E., MOSAKU, A., ABID, A., BUNIELLO, A., CEREZO, M., GIL, L., GROZA, T., GÜNEŞ, O., HALL, P., HAYHURST, J., IBRAHIM, A., JI, Y., JOHN, S., LEWIS, E., MACARTHUR, J.A.L., MCMAHON, A., OSUMI-SUTHERLAND, D., PANOUTSOPOULOU, K., PENDLINGTON, Z., RAMACHANDRAN, S., STEFANCSIK, R., STEWART, J., WHETZEL, P., WILSON, R., HINDORFF, L., CUNNINGHAM, F., LAMBERT, S.A., INOUYE, M., PARKINSON, H., HARRIS, L.W. (2023) The NHGRI-EBI GWAS Catalog: knowledgebase and deposition resource. Nucleic Acids Research, 51(D1), D977–D985. DOI: 10.1093/NAR/GKAC1010

WOLFSBERGER, W.W. (2023) PopGen Playground (0.1). Available from: https://github.com/wwolfsberger/OU_popgen_playground (accessed 10.11.2023).

Published

2024-09-30