Accurate long-read sequencing identified GBA variants as a major genetic risk factor in the Luxembourg Parkinson’s study

Sinthuja Pachchek, Zied Landoulsi, Lukas Pavelka, Claudia Schulte, Elena Buena-Atienza, Caspar Gross, Ann-Kathrin Hauser, Dheeraj Reddy Bobbili, Nicolas Casadei, Patrick May, Rejko Krüger on behalf of the NCER-PD Consortium

Heterozygous variants in the glucocerebrosidase gene (GBA) are an increasingly recognized risk factor for Parkinson’s disease (PD). Because the pseudogene GBAP1 shares 96% sequence homology with the GBA coding region, accurate variant calling by array-based or short-read sequencing methods remains a major challenge in understanding the genetic landscape of GBA-related PD. We established a novel long-read sequencing technology for assessing the full length of the GBA gene and subsequently applied regression models for genotype-phenotype analyses. We sequenced 752 patients with parkinsonism and 806 healthy controls from the Luxembourg Parkinson’s study. All identified GBA variants were confirmed by Sanger validation (100% true positive rate). We found that 12% of unrelated PD patients carried GBA variants. Three novel variants of unknown significance (VUS) were identified. Using a structure-based approach, we defined a potential risk prediction method for VUS. This study describes the full landscape of GBA-related parkinsonism in Luxembourg, showing a high prevalence of GBA variants as the major genetic risk factor for PD. Our approach provides an important advancement for highly accurate GBA variant calling, which is essential for providing access to emerging causative therapies for GBA carriers.

Raw and derived data

The dataset for this manuscript is not publicly available as it is linked to the Luxembourg Parkinson’s Study and its internal regulations. Any requests for accessing the dataset can be directed to

Source code

The scripts used to analyse the data are available under the Apache-2.0 license from the institutional GitLab here.