Signature literature review reveals AHCY, DPYSL3, and NME1 as the most recurrent prognostic genes for neuroblastoma

Chicco, Davide; Sanavia, Tiziana; Jurman, Giuseppe

doi:10.1186/s13040-023-00325-1

Research
Open access
Published: 04 March 2023

Signature literature review reveals AHCY, DPYSL3, and NME1 as the most recurrent prognostic genes for neuroblastoma

BioData Mining volume 16, Article number: 7 (2023) Cite this article

2264 Accesses
3 Citations
4 Altmetric
Metrics details

Abstract

Neuroblastoma is a childhood neurological tumor which affects hundreds of thousands of children worldwide, and information about its prognosis can be pivotal for patients, their families, and clinicians. One of the main goals in the related bioinformatics analyses is to provide stable genetic signatures able to include genes whose expression levels can be effective to predict the prognosis of the patients. In this study, we collected the prognostic signatures for neuroblastoma published in the biomedical literature, and noticed that the most frequent genes present among them were three: AHCY, DPYLS3, and NME1. We therefore investigated the prognostic power of these three genes by performing a survival analysis and a binary classification on multiple gene expression datasets of different groups of patients diagnosed with neuroblastoma. Finally, we discussed the main studies in the literature associating these three genes with neuroblastoma. Our results, in each of these three steps of validation, confirm the prognostic capability of AHCY, DPYLS3, and NME1, and highlight their key role in neuroblastoma prognosis. Our results can have an impact on neuroblastoma genetics research: biologists and medical researchers can pay more attention to the regulation and expression of these three genes in patients having neuroblastoma, and therefore can develop better cures and treatments which can save patients’ lives.

Peer Review reports

Introduction

Neuroblastoma is a type of brain tumor that affects thousands of newborns and young children worldwide [1]. Neuroblastoma almost always develops before age 5 and rarely develops in children over age 10, and it can be of low-, middle- or high-risk. The therapy for patients with high-risk neuroblastoma includes combinations of chemotherapy, surgery, high-dose chemotherapy with stem cell rescue (also known as autologous stem cell transplantation), radiation and immunotherapy [1].

Most of the times, neuroblastoma happens as a result of gene expression changes in neuroblasts that happen during the child’s development, sometimes even before birth, and the causes of these changes are still unknown [2]. To this end, discovering which genes have a prognostic role in neuroblastoma is fundamental to understand the pathogenesis and the disease progression, but also for treatment decisions on potential targeted therapies.

Genetic signatures are lists of genes that have an important prognostic or diagnostic role in patients with a specific disease: a strong prognostic signature for neuroblastoma, for example, contains a list of genes whose change of expression levels can be used in a model able to predict the survival risk of new patients.

Genetic signatures can be derived from microarray gene expression [3], RNA-Seq gene expression [4], but also from other regulatory gene expression elements like microRNA, DNA methylation [5] and other biomolecular data.

Despite their usefulness, genetic signatures of the same diseases can have low overlap between each other: signatures in fact might contain false positive genes, that are genes having a strong signal in an obsolete or noisy technology and therefore included in the signature, but actually are unrelated to that particular disease [6]. Integrating and comparing signatures of the same disease published in the literature can be a way to address this problem.

In this study, we queried the scientific literature and found ten neuroblastoma prognostic signatures [7,8,9,10,11,12,13,14,15,16], we decided to extrapolate the most common genes and investigate their role in neuroblastoma. Only three genes resulted as the most common and shared in a limited number of signatures: AHCY, DPYSL3, and NME1. This result confirms the lack of reproducibility among the prognostic genetic signatures for neuroblastoma. We then performed a survival analysis considering the largest available neuroblastoma gene expression datasets in Gene Expression Omnibus (GEO) [17] and ArrayExpress [18] in order to understand the behavior of these three genes in relation to time-to-event data. Moreover, we tested the predictive power of these three genes by employing a machine learning algorithm on a binary dataset containing data of survived and deceased patients. Finally, we performed a thorough literature validation considering all the articles that show a correlation between these three genes and neuroblastoma treatments.

To the best of our knowledge, no previous studies have used the scientific literature to detect the most recurrent prognostic genes for neuroblastoma.

We organize the rest of this article as follows. After this Introduction, we explained the methods for genes and datasets retrieval, the methods for survival analysis, the methods for the machine learning analysis, and the methods for literature validation in section Methods, and we then report the results of these phases in section Results. Eventually, we discuss the impact and the consequences of our results and report some limitations (section Discussion).

Methods

Prognostic genes discovery and dataset search

We retrieved the most recurrent neuroblastoma prognostic genes by comparing the prognostic signatures currently available in the scientific literature (Subsection S1.1). We included the neuroblastoma signatures indicated for survival and prognosis, and we excluded the diagnostic ones. We then converted each gene symbol into Ensembl ID’s through g:Convert of g:Profiler [19]. We immediately noticed both the limited number of studies providing a curated signature list of prognostic genes and the low overlap between the these lists, that motivated us for the selection of the the most recurrent genes.

For the machine learning validation phase on gene expression, we searched for neuroblastoma gene expression datasets on Gene Expression Omnibus (GEO) [17] through geoCancerPrognosticDatasetsRetriever [20] having the survived-deceased binary label for each patient.

We found and downloaded one gene expression dataset through GEOquery [21] and BioMart [22]: the Hiyama2010 dataset coded as GSE16237 [23] on GEO. We performed the gene probeset-symbol annotation done through geneExpressionFromGEO [24] and Jetset [25]. On the dataset found, we performed the batch correction [26] through limma [27]. We used this Hiyama2010 dataset for the machine learning validation phase and not for the survival analysis phase (next subsection) because it does not contain the temporal component and because we prefered to keep these two validation phases independent from each other.

Validation of risk and hazard ratio in multi-studies overall survival data

To validate the robustness of the most recurrent genes resulting from the literature search (Subsection Prognostic genes discovery and dataset search) as potential prognostic biomarkers, we first investigated the pooled risk ratio (RR) based on the number of deaths in neuroblastoma associated with high/low expression of each gene. Specifically, for this survival analysis phase we manually searched for gene expression datasets on Gene Expression Omnibus [17], ArrayExpress [18] and R2 database [28] with the survived status and the survival time feature for each feature.

Only large datasets, with at least 80 samples, providing pre-processed and normalized gene expression values with associated overall survival data were included. A \(RR>1\) represents poorer prognosis for the higher expression group, whereas a \(RR<1\) indicates poorer prognosis for the lower expression group. Each gene was first analyzed separately, and its expression was classified in a binary way, considering the expression values below the 25\(^{th}\) or above the 75\(^{th}\) percentile to identify the samples at low or high expression of that gene, respectively.

Then, a combination of the genes extracted by the literature search was also evaluated: a consensus score was generated considering the average of the ranking positions obtained for each gene expression (increasingly ordered), assigning the same ranking position to ties and inverting the ranking of the genes showing \(RRs<1\). In this case, the scores belonging to the first and last quartiles were classified as “low” and “high”, respectively. Both fixed- and random-effects models using Mantel-Haenszel [29] and restricted maximum-likelihood [30] estimators were considered, respectively. Heterogeneity across studies was assessed by the Cochrane \(I^2\) metric [31] and chi-squares statistics. Chi-squared p-values \(<0.1\) and an \(I^2\) values \(>50\)% were associated to statistically significant heterogeneity. Potential publication bias was evaluated using Begg and Mazumdar’s test [32] and Begg’s funnel plot [33], considering p-values \(<0.05\) to indicate statistical bias.

Afterwards, we considered the number of deaths in relation with the follow-up time and evaluated the expression values of the genes and their consensus score through Cox proportional hazard model [34], adjusted by the age of the patients. This confounding factor was discretized into two classes, using a threshold of 18 months. The hazard ratios (HRs) with 95% confidential intervals (CIs) for overall survival observed along a follow-up time were estimated by the model. An HR \(>1\) implied poorer survival for the higher expression group. In contrast, an HR\(<1\) implied poorer survival for the lower expression group. Both Wald test applied to the coefficient of the model used to estimate the HR and Log-rank test were considered for the evaluation. p-values less than 0.05 were considered statistically significant. All statistical analyses were performed using R programming language version 4.1.2. The R packages meta (version 5.2), dmetar (version 0.9) and survival (version 3.2-13) were used for the meta-analysis of the RRs and the survival analysis through Cox proportional hazard model, respectively.

Binary classification method

To further verify the predictive power of our three proposed genes, we decided to use them for a prognostic binary classification based on Random Forests [35] machine learning method, on an alternative dataset not employed for the previous phases of the analysis. We downloaded the Hiyama2010 dataset from GEO (GSE16237 [23]), which contains microarray gene expression of patients diagnosed with neuroblastoma. In this cohort there are 12 deceased patients and 39 survived patients, that means 23.53% negative data instances and 76.47% data instances. We decided to employ this external, alternative dataset we did not use for the other validation steps, to make our analysis as robust as possible [36].

After downloading this dataset, we applied a batch correction method through the limma [27] package of Bioconductor [37] to remove the noise of the batch effects from the microarray data [26]. We then selected only the three gene profiles of the probesets of our three proposed prognostic genes in the HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array (GPL570) microarray platform: 200903_s_at (AHCY), 201431_s_at (DPYSL3), and 201577_at (NME1). Afterwards, we applied Random Forests [35] by splitting the dataset into training (80% randomly selected patients’ profiles) and test (20% remaining patients’ profiles) sets [38]. We repeated the execution 100 times and generated confusion matrices on the tests sets, that we assessed through traditional rates such as the Matthews correlation coefficient (MCC) [39], ROC AUC and others (Table 5), recorded as average values and standard deviations. We highlighted the result measured through the MCC [40,41,42].

Validation on the literature

We investigated the role of AHCY, DPYSL3, and NME1 genes in the scientific literature by looking for studies involving at least one of these three genes and neuroblastoma in Google Scholar [43] on 20th February 2022.

We retrieved the aliases of the names of these three genes on GeneCards.org [44,45,46]:

AHCY aliases: SAHH, S-Adenosyl-L-Homocysteine Hydrolase, S-Adenosylhomocysteine Hydrolase.
DPYSL3 aliases: CRMP4, CRMP-4, ULIP, DRP-3, DRP3, ULIP1.
NME1 aliases: NM23-H1, NM23-H1, NDPKA, NM23.

For each of this term, we made a search on Google Scholar [43] including the neuroblastoma keyword (for example, “AHCY neuroblastoma”, “SAHH neuroblastoma”, etc.) and recorded all the scientific studies describing an active role of the gene and neuroblastoma survival or prognosis.

Results

Prognostic genes found

In our literature search, we found ten neuroblastoma prognostic signatures, whose quantitative characteristics and references are reported in Table 1 and in Supplementary Subsection S1.2.

Three genes resulted being more present than the others in three signatures (Table 2): AHCY, DPYSL3, and NME1.

The probability that a three-genes triple occur in three different signatures can be experimentally estimated as \(P\approx 5\cdot 10^{-8}\). In fact, randomly rearranging the 300 genes included in the 10 signatures in the same setup of the current situation, and repeating such experiments for N runs, three genes occur in three different signatures in \(\alpha \cdot N\) runs, with \(\alpha \approx 0.24\) in average. Since there are \(T=\left( {\begin{array}{c}300\\ 3\end{array}}\right) =4455100\) distinct sets of three genes, the probability P can be evaluated as \(P=\frac{\alpha }{T}\approx 5\cdot 10^{-8}\). This yields that random effects are quite unlikely responsible for the selection of the investigated triple AHCY, DPYSL3, and NME1.

This aspect confirms the low overlap between gene lists of the neuroblastoma prognostic signatures found: among 10 signatures, only 3 gene lists share 3 common genes, which are the most frequent elements among the lists (Subsection S1.2). Surprisingly, the famous neuroblastoma-related MYCN gene is not among the most frequent genes [47]. It is present only in two signatures [10, 11] out of ten.

Table 1 List of neuroblastoma prognostic signatures found in the literature. Article: reference. # genes: number of genes. We reported the lists of the genes of these signatures in Supplementary Subsection S1.2

Full size table

Table 2 Most recurrent prognostic genes for neuroblastoma. For each gene, we report its symbol, its Ensembl ID, and its complete gene name on GeneCards.org. The signatures column indicates the references of the neuroblastoma signatures which contain each gene

Full size table

Computational evaluation of the impact of the most recurrent prognostic genes across the largest neuroblastoma studies

Both array-based and sequencing-based data were considered. A description of the selected datasets, together with the information related to the features considered for the three genes of interest (AHCY, DPYSL3, NME1) is reported in Table 3. Some of these datasets were employed (as training set or test set) in the studies identifying the ten neuroblastoma signatures we found in the literature (Supplementary Table 1), but these studies leverage other datasets, too, to verify the effectiveness of their proposed signatures. The overlap between the set of datasets used by these ten signature studies and the set of dataset we used for our survival analysis and binary classification phases is always lower than 50% (Supplementary Table 1).

The expression of all the three genes resulted significantly associated with the death rate, as displayed in Fig. 1. Specifically, at higher expression of both AHCY and NME1 higher death rates were observed (considering median percentages across all the studies, 55.9% at high versus 5.8% at low AHCY expression with estimated RR 6.4 and \(p<0.0001\) from Mantel-Haenszel test, while 49.7% at high versus 12.2% at low NME1 expression with estimated RR equal to 3.85 for the random-effects model and to 4.31 for fixed(common)-effect model, obtaining \(p<0.0001\) in both models). On the other hand, lower death rates were observed at higher expression of the gene DPYSL3 (considering median percentages across all the studies, 6.4% at high versus 46.4% at low expression with estimated RR 0.2 and \(p<0.0001\) from Mantel-Haenszel test).

Combining all the genes, where the ranking of the DPYSL3 expression was inverted to be integrated with the rankings from AHCY and NME1, it is possible to observe an increased pooled RR (8.58 for the random-effects model and 8.65 for fixed(common)-effect model, obtaining \(p<0.0001\) in both models with Mantel-Haenszel test), with a median death rate equal to 42.9% at high and to 5.4% at low consensus score. The heterogeneity across the studies ranges between 41% (observed for DPYSL3, \(p=0.12\)) and 80% (observed for AHCY, \(p<0.01\)). However, funnel plots inspection (Supplementary Fig. S1) and Begg and Mazumdar’s test results indicate the presence of funnel plot asymmetry, showing that the publication bias was unlikely (all p-values \(>0.05\)).

Afterwards, we investigated the time-dependent risk of death in terms of hazard ratios (adjusted for the age). Results are reported in Table 4. All the Cox models which include both expression and age as confounding factors showed signicant association with overall survival. In 6 over 7 studies for AHCY and DPYSL3, in 4 over 6 studies for NME1 and in 5 over 6 studies for the consensus score the related HR was found significant.

Table 3 Datasets used in the meta-analysis. Description of the 7 datasets used in the meta-analysis with the related reference to the study, specifying the size, the array/sequencing platform, the pre-processing approach used to normalize the data and the type of feature (for example, probeset ID, Refseq ID, Ensembl transcript/gene ID) considered to identified each gene analyzed

Full size table

Table 4 Hazard ratios from Cox proportional hazard models testing the prognostic role for all-cause mortality. Results are reported for the three genes considered separately (AHCY, DPYSL3, NME1) and for the score which combines the expression-based ranks of the three genes, considering the reversed rank for DPYSL3 since it showed RRs \(<1\). Abbreviations: HR, Hazard Ratio; CI, Confidence Interval; LR, Log-Rank

Full size table

Moreover, our Random Forests classifier on the Hiyama2010 obtained ROC AUC = 0.877 and PR AUC = 0.952 in the [0; 1] range, with MCC = +0.517 in the \([-1;+1]\) interval, that means excellent binary classification (Table 5).

Table 5 Binary classification results on the Hiyama2010 dataset. We consider only the three genes AHCY, DPYSL3 and NME1. We report the mean scores of 100 executions with corresponding standard deviations, obtained through the Random Forests [35] algorithm. Positives: data of survived patients. Negatives: data of deceased patients. MCC: Matthews correlation coefficient. MCC: worst and minimum value \(= -1\) and best and maximum value \(= +1\). TP rate: true positive rate, sensitivity, recall. TN rate: true negative rate, specificity. PR: precision-recall curve. PPV: positive predictive value, precision. NPV: negative predictive value. ROC: receiver operating characteristic curve. AUC: area under the curve. F\(_1\) score, accuracy, TP rate, TN rate, PPV, NPV, PR AUC, ROC AUC: worst and minimum value \(= 0\) and best and maximum value \(= 1\). Confusion matrix threshold for TP rate, TN rate, PPV, and NPV: 0.5. Hiyama2010 dataset: GSE16237 [23]

Full size table

AHCY, DPYSL3, and NME1 in the neuroblastoma literature

In the following text, we report a description of the relevant role of AHCY, DPYSL3, and NME1 in neuroblastoma development highlighted by the studies considered by the state-of-the-art literature, as described in subsection Validation on the literature.

AHCY and neuroblastoma

The metabolic enzyme adenosyl-homocysteinase (AHCY) is one of the most conserved enzymes in living organisms. In mammals, AHCY mediates the reversible catalysis of S-adenosylhomocysteine (SAH) to adenosine and L-homocysteine, and it plays a key role in DNA methylation maintenance, thus resulting fundamental during epigenomic reprogramming throughout embryo development and/or in disease progression. In cancer, AHCY seems to have a dual role both as tumor suppressor [55] or tumor promoter, based on the cancer type. Its inhibition for example was linked to anti-migratory and anti-invasive activity of breast cancer cells [51, 56]. In neuroblastoma, literature evidence paint a picture where AHCY is constitutively active in cancer cells, but further enhanced by MYCN. Indeed, AHCY is directly regulated by MYC proteins and like these, associated with poor prognosis of neuroblastoma patients. As a matter of fact, the depletion of MYCN cause the down-regulation of AHCY. Moreover, AHCY inhibition reduces colony formation capacity and glutathione synthesis especially in neuroblastoma cell lines with high MYCN expression. The specific synthetic lethality through genetic or pharmacological AHCY inhibition is also corroborated by the increase in apoptosis of MYCN-amplified neuroblast cells [57]. Of note, the first evidence on the role of this enzyme as a key molecule in the progression of MYCN-amplified neuroblastoma dates back to the early 1990s when several studies in different NB mouse models were employed [58,59,60,61,62] More recently, considering the potential implication of this gene on the clinical management of NB, Novak and colleagues [63] hypothesized that the identification of its genetic variations may have significant impact during development of the recurrent or progressive disease. Non-synonymous variants in AHCY gene have been found for example to contribute to the slow progression of the disease, even in more aggressive cases. It affects the maintenance of the catalytic capacity of AHCY, leading to the consequent functional effects in NB patients. Thus, also the potential use of AHCY variants may constitute a molecular biomarker. Finally, it is known that MYCN-amplification alters key metabolic pathways as glycolysis and gluconeogenesis. Oliynyk et al. [64] found that the stressed phenotype of MYCN amplified NB cells was characterized by a shift in the metabolic balance toward robustly increased oxidative phosphorylation as well as enhanced aerobic glycolysis. Interestingly, AHCY has been recently included in the glycosyl compound metabolic process gene set, suggesting that AHCY might link glucose with adenosine or homocysteine [65] and alter this metabolic process. This data suggest that, beyond their correlation, these genes are effectively functionally interconnected to each other.

DPYSL3 and neuroblastoma

DPYSL3 (also referred to as collapsing response mediator protein 4, CRMP4) is a member of the DPYSL gene family, highly expressed in developing and adult nervous systems. It functions in a variety of cellular processes, including cell migration, differentiation, neurite extension, and axonal regeneration. It has been reported to be involved also in the metastatic process of tumor cells [66]. Some authors debrided DPYSL3 as a metastasis suppressor, while others [67] reported it facilitates pancreatic cancer cell dissemination via a strong interaction with other cell adhesion factors, including Ezrin, focal adhesion kinase and c-SRC. In breast cancer, DPYSL3 knockdown determined a reduced proliferation, but a still enhanced motility and increased expression of epithelial-to-mesenchymal transition markers, suggesting that DPYSL3 is a multifunctional signaling modulator. In neuroblastoma cells, it has been shown that DPYSL3 regulates the actin cytoskeleton, whose dynamic reorganization is known to be fundamental for cell migration. DPYSL3 was found abundant in the cytosol of B35 neuroblastoma cells and to co-localizes with F-actin in regular rib-like structures within lamellipodia of these cells, with which physically interacts. The critical functional equilibrium between DPYSL3 and F-actin is demonstrated by the fact that DPYSL3 overexpression inhibited the migration of B35 neuroblastoma cells, while its knockdown enhanced cell migration and disturbed rib-like actin-structures in lamellipodia [68]. Interestingly, studies using genetic approaches showed that DPYSL3 levels were inversely altered with changes in MYCN expression, thus suggesting a MYCN negative regulation of DPYSL3 in NB cells, probably via EZH2. This negative regulation may be also mediated by GSK3b [69]. The regulation of DPYSLs by GSK-3b in neuronal polarity or axon outgrowth has already been reported. Moreover, it is known that Akt in NB cells phosphorylate and thus inactivate GSK-3b [70] and that inactivation of GSK-3b would lead to an increase of MYCN protein expression. Thus increased MYCN levels, via amplification or as the result of Akt-mediated GSK-3b inactivation would lead to DPYSL3 suppression in NB cells. This mechanistic evidence also support an important prognostic role for DPYSL3 expression. Indeed, tumors of advanced-stage NB patients have a good prognosis if characterized by elevated levels of DPYSL3, and even in high-risk NB patients the levels of DPYSL3 mark those patients who may have a better overall survival [71].

NME1 and neuroblastoma

The NME1 gene is located in the 17q21.3 region, whose gain is a common evaluated factor predicting adverse clinical outcome in neuroblastoma patients. NME1 has been shown to be involved in multiple critical cellular behaviors, including cell proliferation, differentiation, and neural development. In cancer, its overexpression has a dichotomous role as both a suppressor and a promoter of tumor metastasis, based on the cancer type. In breast and prostate carcinomas for example, high levels of this protein is associated with good survival and low-risk features. In contrast, in pediatric cancer such as neuroblastoma, it correlates with aggressive neuroblastoma tumor features while increased NME1 expression has been identified as a component of gene expression, signatures most significantly associated with poor neuroblastoma patient outcomes. Some authors attribute this evidence to the histidine kinase capacity of NME1 that catalyze transfer of the activated phosphate from the autophosphorylated histidine 118 residue (H118) onto target proteins. It is plausible to assume that this results in an increased activity of proteins involved in cell migration and differentiation. Indeed, NME1 shRNA knock-down disrupts differentiation of neuroblastoma cells induced by 13-cis-retinoic acid (CRA) treatment [72]. Carotenuto et al. [73] demonstrated that NME1 form a protein complex with h-Prune, trough which could act as a pro-metastatic gene. As a matter of fact, the overexpression of NME1 and h-Prune enhances the aggressiveness of NBL cells both in vitro and in vivo. Thus, the disruption of this interaction might constitute a potential therapeutic intervention for neuroblastoma patients [73]. Moreover, A Negroni et al. [74], discovered that patients bearing S120G mutation in NME1 gene have worst prognosis respect to wild type ones. It has been demonstrated that NME1-S120G is more effective in promoting cell invasiveness and metastasis of neuroblastoma in vitro and in vivo. S120G may be defined as a gain-of-function mutation, since it increases the invasiveness not only of neuroblastoma, but also of breast and prostate carcinoma cells. An apparent gain-of-function of the S120G mutation of NME1 is likely caused by a protein-folding defect, which affects its protein-protein interactions. However, the molecular mechanism(s) by which NME1 promotes neuroblastoma metastasis still remains elusive [74]. Finally, Okabe-Kado et al [75] examined serum NME1 protein levels in 217 patients with neuroblastoma, demonstrating that (i) the serum NME1 protein level was higher in neuroblastoma patients than in control children; (ii) patients with MYCN amplification had higher serum NME1 levels than those with a single copy of MYCN. Thus, the detection of serum NME1 protein levels, contributing to predictions of clinical outcome in patients with neuroblastoma, may be also proposed as not invasive prognostic markers [75].

Discussion

The prognostic role of AHCY, DPYSL3, and NME1 The evaluation of the impact of AHCY, DPYSL3 and NME1 on the prognosis of patients affected by Neuroblastoma showed the importance of these genes in terms of association with the death rate (Fig. 1) and prediction of the survival probabilities across time (Table 4). Despite the limited number of studies able to provide a high number of patients with available survival data, both observed risk and hazard ratios suggested that low levels of DPYSL3 expression and high levels of AHCY and NME1 expressions shorten overall survival, even taking in consideration potential confounders like the age of the patients. The combination of these genes showed that, except for one dataset, in all the other studies the combined effect of these genes showed significant hazard ratios (p-values \(\le 0.001\)). Only for TARGET data the p-value was not significant, but \(<0.01\). The evaluation of the pooled risk ratios showed high risk ratios for the combined score (\(>8.5\)). Despite the heterogeneity was high (\(I^{2}\)=74%), no significant biases across the studies were observed from the funnel plots (Begg and Mazumdar’s test p-value \(>0.05\), Fig. S1). Similar results were observed also when the three genes were analyzed independently. Considering the potential overlap between the datasets used in our analysis (Table 3) and those used in the studies of the prognostic signatures reported in Table 1, we observed that the number of samples in each study does not exceed the 40% of overlap with respect to the total number of samples considered in our analysis, i.e. 1,648, as reported in the Supplementary Table S1. In addition, except for AHCY which was detect in [8], showing an overlap equal to 32.3% with our samples, the other studies where we found the three genes show an overlap below 15.5%.

Moreover, we verified the predictive power of the three genes employed with machine learning to an independent gene expression dataset. Our three proposed methods and Random Forests were able to correctly classify survived patients with neuroblastoma and deceased patients with the same disease with a high accuracy, reaching MCC = +0.517 and ROC AUC = 0.877 on average. This binary classification result confirms the prognostic ability of these three genes.

We then performed a thorough literature search where we retrieved tens of peer-reviewed published studies associating each of our proposed prognostic genes with neuroblastoma survival. We did not only look for the ACHY, DPYSL3, and NME1 gene names, but also their aliases found on GeneCards.org. Our brief review of this biomedical literature confirmed a strong association between the three prognostic genes and neuroblastoma development. This confirmation comes with no surprise, since we selected our three prognostic genes from ten signatures already available in the biomedical literature.

These results confirm the robustness of the three proposed prognostic genes, that we validated in this study in three different ways (statistical analysis, machine learning analysis, and literature review). Since the prognostic signatures for neuroblastoma have minimal overlap, indicating that most genes are prognostically relevant mainly in one single study, our outcomes result being particularly relevant and interesting in oncologic research.

It is also relevant to notice that the NME1 is present as prognostic gene in three signatures’ studies [10, 11, 14] which employ the GSE3960 and E-TABM-38 datasets, among others.

Conclusions To the best of our knowledge, there are no studies on neuroblastoma reviewing the current status of the associated genetic signatures available in the literature. Here, we showed the low overlap among the prognostic genetic signatures provided so far and highlighted the most recurrent genes detected by the neuroblastoma studies, We then validated these genes using the largest gene expression datasets from the public repositories in order to provide an evaluation of the prognostic impact at high accuracy on the highest number of samples available. Our results pointed out the relevant impact of the genes AHCY, DPYSL3 and NME1 on neuroblastoma prognosis as future targets for future neuroblastoma genetics studies and the development of novel therapies.

Availability of data and materials

The gene expression datasets employed in this study can be freely and openly found at the following URLs:

\(\bullet\) https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE16237

\(\bullet\) https://www.nature.com/articles/s41467-021-21247-8#data-availability

\(\bullet\) https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE16476

\(\bullet\) https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE62564

\(\bullet\) https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-8248

\(\bullet\) https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-38

\(\bullet\) https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE85047

\(\bullet\) https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-179

\(\bullet\) https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE16716

Abbreviations

AHCY:: Adenosylhomocysteinase
CI:: Confidence Interval
CRA:: Cis-retinoic acid
DPYSL3:: Dihydropyrimidinase Like 3
GEO:: Gene Expression Omnibus
HR:: Hazard Ratio
LR:: Log-Rank
MCC:: Matthews correlation coefficient
NME1:: NME/NM23 Nucleoside Diphosphate Kinase 1
NB:: Neuroblastoma
p-value:: Probability value
RR:: Risk ratio

References

Cleveland Clinic. Neuroblastoma. https://my.clevelandclinic.org/health/diseases/14390-neuroblastoma. Accessed 5 Aug 2022.
American Cancer Society. What causes neuroblastoma? https://www.cancer.org/cancer/neuroblastoma/causes-risks-prevention/what-causes.html. Accessed 5 Aug 2022.
Sanz-Pamplona R, Berenguer A, Cordero D, Riccadonna S, Solé X, Crous-Bou M, et al. Clinical value of prognosis gene expression signatures in colorectal cancer: a systematic review. PLoS ONE. 2012;7(11): e48877.
Article CAS PubMed PubMed Central Google Scholar
Chung W, Eum HH, Lee HO, Lee KM, Lee HB, Kim KT, et al. Single-cell RNA-seq enables comprehensive tumour and immune cell profiling in primary breast cancer. Nat Commun. 2017;8(1):1–12.
Article Google Scholar
Szyf M. DNA methylation signatures for breast cancer classification and prognosis. Genome Med. 2012;4(3):1–12.
Article Google Scholar
Sanavia T, Aiolli F, Da San Martino G, Bisognin A, Di Camillo B. Improving biomarker list stability by integration of biological knowledge in the learning process. BMC Bioinformatics. 2012;13(4):1–11.
Google Scholar
Cangelosi D, Morini M, Zanardi N, Sementa AR, Muselli M, Conte M, et al. Hypoxia Predicts Poor Prognosis in Neuroblastoma Patients and Associates with Biological Mechanisms Involved in Telomerase Activation and Tumor Microenvironment Reprogramming. Cancers. 2020;12(9):2343.
Article CAS PubMed PubMed Central Google Scholar
Zhong X, Tao Y, Chang J, Zhang Y, Zhang H, Wang L, et al. Prognostic Signature of Immune Genes and Immune-Related lncRNAs in Neuroblastoma: a Study Based on GEO and TARGET Datasets. Front Oncol. 2021;11:452.
Article Google Scholar
Jin W, Zhang Y, Liu Z, Che Z, Gao M, Peng H. Exploration of the molecular characteristics of the tumor-immune interaction and the development of an individualized immune prognostic signature for neuroblastoma. J Cell Physiol. 2021;236(1):294–308.
Article CAS PubMed Google Scholar
Vermeulen J, Preter KD, Naranjo A, Vercruysse L, Roy NV, Hellemans J, et al. Predicting outcomes for children with neuroblastoma using a multigene-expression signature: a retrospective SIOPEN/COG/GPOH study. Lancet Oncol. 2009;10(7):663–71.
Article CAS PubMed PubMed Central Google Scholar
Preter KD, Vermeulen J, Brors B, Delattre O, Eggert A, Fischer M, et al. Accurate Outcome Prediction in Neuroblastoma across Independent Data Sets Using a Multigene Signature. Clin Cancer Res. 2010;16(5):1532–41.
Article PubMed Google Scholar
Valentijn LJ, Koster J, Haneveld F, Aissa RA, van Sluis P, Broekmans ME, et al. Functional MYCN signature predicts outcome of neuroblastoma irrespective of MYCN amplification. Proc Natl Acad Sci. 2012;109(47):19190–5.
Article CAS PubMed PubMed Central Google Scholar
Zhong X, Liu Y, Liu H, Zhang Y, Wang L, Zhang H. Identification of potential prognostic genes for neuroblastoma. Front Genet. 2018;9:589.
Article CAS PubMed PubMed Central Google Scholar
Garcia I, Mayol G, Ríos J, Domenech G, Cheung NKV, Oberthuer A, et al. A Three-Gene Expression Signature Model for Risk Stratification of Patients with Neuroblastoma. Clin Cancer Res. 2012;18(7):2012–23.
Article CAS PubMed PubMed Central Google Scholar
Frumm SM, Fan ZP, Ross KN, Duvall JR, Gupta S, VerPlank L, et al. Selective HDAC1/HDAC2 Inhibitors Induce Neuroblastoma Differentiation. Chem Biol. 2013;20(5):713–25.
Article CAS PubMed PubMed Central Google Scholar
Wang Z, Cheng H, Xu H, Yu X, Sui D. A five-gene signature derived from m6A regulators to improve prognosis prediction of neuroblastoma. Cancer Biomark. 2020;28(3):275–84.
Article CAS PubMed Google Scholar
Barrett T, Wilhite SE, Ledoux P, Evangelista C, Kim IF, Tomashevsky M, et al. NCBI GEO: archive for functional genomics data sets—update. Nucleic Acids Res. 2012;41(D1):D991–5.
Article PubMed PubMed Central Google Scholar
Parkinson H, Kapushesky M, Shojatalab M, Abeygunawardena N, Coulson R, Farne A, et al. ArrayExpress—a public database of microarray experiments and gene expression profiles. Nucleic Acids Res. 2007;35(suppl_1):D747–D750.
Reimand J, Arak T, Adler P, Kolberg L, Reisberg S, Peterson H, et al. g:Profiler—a web server for functional interpretation of gene lists (2016 update). Nucleic Acids Res. 2016;44(W1):W83–9.
Article CAS PubMed PubMed Central Google Scholar
Alameer A, Chicco D. geoCancerPrognosticDatasetsRetriever: a bioinformatics tool to easily identify cancer prognostic datasets on Gene Expression Omnibus (GEO). Bioinformatics. 2022;38(6):1761–3.
Article CAS PubMed Google Scholar
Davis S, Meltzer PS. GEOquery: a bridge between the Gene Expression Omnibus (GEO) and Bioconductor. Bioinformatics. 2007;23(14):1846–7.
Article PubMed Google Scholar
Durinck S, Moreau Y, Kasprzyk A, Davis S, De Moor B, Brazma A, et al. BioMart and Bioconductor: a powerful link between biological databases and microarray data analysis. Bioinformatics. 2005;21(16):3439–40.
Article CAS PubMed Google Scholar
Ohtaki M, Otani K, Hiyama K, Kamei N, Satoh K, Hiyama E. A robust method for estimating gene expression states using Affymetrix microarray probe level data. BMC Bioinformatics. 2010;11(1):1–14.
Article Google Scholar
Chicco D. geneExpressionFromGEO: an R package to facilitate data reading from Gene Expression Omnibus (GEO). In: Agapito G, editor. Microarray Data Analysis. vol. 2401 of Methods in Molecular Biology. New York City: Springer; 2021. p. 187–194.
Li Q, Birkbak NJ, Győrffy B, Szallasi Z, Eklund AC. Jetset: selecting the optimal microarray probe set to represent a gene. BMC Bioinformatics. 2011;12(1):1–7.
Article Google Scholar
Chen C, Grennan K, Badner J, Zhang D, Gershon E, Jin L, et al. Removing batch effects in analysis of expression microarray data: an evaluation of six batch adjustment methods. PLoS ONE. 2011;6(2): e17238.
Article CAS PubMed PubMed Central Google Scholar
Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43(7):e47–e47.
Article PubMed PubMed Central Google Scholar
Koster J, Volckmann R, Zwijnenburg D, Molenaar P, Prins RW, Hoyng L, et al. R2: genomics analysis and visualization platform. 2022. http://r2.amc.nl/. Accessed 16 June 2022.
Mantel N, Haenszel W. Statistical aspects of the analysis of data from retrospective studies of disease. J Natl Cancer Inst. 1959;22(4):719–48.
CAS PubMed Google Scholar
Raudenbush SW. Analyzing effect sizes: Random-effects models. In: Cooper H, Hedges LV, Valentine JC, editors. The handbook of research synthesis and meta-analysis. New York City: Russell Sage Foundation; 2009. p. 295–315.
Google Scholar
Higgins JP, Thompson SG. Quantifying heterogeneity in a meta-analysis. Stat Med. 2002;21(11):1539–58.
Article PubMed Google Scholar
Begg CB, Mazumdar M. Operating characteristics of a rank correlation test for publication bias. Biometrics. 1994;50(4):1088–101.
Article CAS PubMed Google Scholar
Sterne JA, Harbord RM. Funnel Plots in Meta-analysis. Stata J. 2004;4(2):127–41.
Article Google Scholar
Kleinbaum DG, Klein M. The Cox proportional hazards model and its characteristics. In: Analysis Survival, editor. New York City. New York, USA: Springer; 2012. p. 97–159.
Google Scholar
Breiman L. Random Forests. Mach Learn. 2001;45(1):5–32.
Article Google Scholar
Chicco D, Jurman G. The ABC recommendations for validation of supervised machine learning results in biomedical sciences. Front Big Data. 2022;5: 979465. https://doi.org/10.3389/fdata.2022.979465.
Article PubMed PubMed Central Google Scholar
Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 2004;5(10):1–16.
Article Google Scholar
Chicco D. Ten quick tips for machine learning in computational biology. BioData Min. 2017;10(35):1–17.
Google Scholar
Matthews BW. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim Biophys Acta Protein Struct. 1975;405(2):442–51.
Article CAS Google Scholar
Chicco D, Tötsch N, Jurman G. The Matthews correlation coefficient (MCC) is more reliable than balanced accuracy, bookmaker informedness, and markedness in two-class confusion matrix evaluation. BioData Min. 2021;14(1):1–22.
Article Google Scholar
Chicco D, Starovoitov V, Jurman G. The Benefits of the Matthews correlation coefficient (MCC) Over the Diagnostic Odds Ratio (DOR) in Binary Classification Assessment. IEEE Access. 2021;9:47112–24.
Article Google Scholar
Chicco D, Warrens MJ, Jurman G. The Matthews correlation coefficient (MCC) is more informative than Cohen’s Kappa and Brier score in binary classification assessment. IEEE Access. 2021;9:78368–81.
Article Google Scholar
Google. Google Scholar. 2022. http://scholar.google.com. Accessed 5 July 2022.
Rebhan M, Chalifa-Caspi V, Prilusky J, Lancet D. GeneCards: a novel functional genomics compendium with automated data mining and query reformulation support. Bioinformatics. 1998;14(8):656–64.
Article CAS PubMed Google Scholar
Safran M, Dalah I, Alexander J, Rosen N, Iny Stein T, Shmoish M, et al. GeneCards version 3: the human gene integrator. Database. 2010;2010:1–16.
Stelzer G, Rosen N, Plaschkes I, Zimmerman S, Twik M, Fishilevich S, et al. The GeneCards suite: from gene data mining to disease genome sequence analyses. Curr Protoc Bioinforma. 2016;54(1):1–30.
Article Google Scholar
Otte J, Dyberg C, Pepich A, Johnsen JI. MYCN function in neuroblastoma development. Front Oncol. 2021;10: 624079.
Article PubMed PubMed Central Google Scholar
Molenaar JJ, Koster J, Zwijnenburg DA, van Sluis P, Valentijn LJ, van der Ploeg I, et al. Sequencing of neuroblastoma identifies chromothripsis and defects in neuritogenesis genes. Nature. 2012;483(7391):589–93.
Article CAS PubMed Google Scholar
Wang C, Gong B, Bushel PR, Thierry-Mieg J, Thierry-Mieg D, Xu J, et al. The concordance between RNA-seq and microarray data depends on chemical treatment and transcript abundance. Nat Biotechnol. 2014;32(9):926–32.
Article CAS PubMed PubMed Central Google Scholar
Roderwieser A, Sand F, Walter E, Fischer J, Gecht J, Bartenhagen C, et al. Telomerase Is a Prognostic Marker of Poor Outcome and a Therapeutic Target in Neuroblastoma. JCO Precis Oncol. 2019;3:1–20.
PubMed Google Scholar
Pugh TJ, Morozova O, Attiyeh EF, Asgharzadeh S, Wei JS, Auclair D, et al. The genetic landscape of high-risk neuroblastoma. Nat Genet. 2013;45(3):279–84.
Article CAS PubMed PubMed Central Google Scholar
Oberthuer A, Berthold F, Warnat P, Hero B, Kahlert Y, Spitz R, et al. Customized oligonucleotide microarray gene expression-based classification of neuroblastoma patients outperforms current clinical risk stratification. J Clin Oncol. 2006;24(31):5070–8.
Article CAS PubMed Google Scholar
Rajbhandari P, Lopez G, Capdevila C, Salvatori B, Yu J, Rodriguez-Barrueco R, et al. Cross-Cohort Analysis Identifies a TEAD4-MYCN Positive Feedback Loop as the Core Regulatory Element of High-Risk Neuroblastoma. Cancer Discov. 2018;8(5):582–99.
Article CAS PubMed PubMed Central Google Scholar
Hartlieb SA, Sieverling L, Nadler-Holly M, Ziehm M, Toprak UH, Herrmann C, et al. Alternative lengthening of telomeres in childhood neuroblastoma from genome to proteome. Nat Commun. 2021;12(1):1269.
Article CAS PubMed PubMed Central Google Scholar
Posser T, de Paula MT, Franco JL, Leal RB, da Rocha JBT. Diphenyl diselenide induces apoptotic cell death and modulates ERK1/2 phosphorylation in human neuroblastoma SH-SY5Y cells. Arch Toxicol. 2011;85(6):645–51.
Article CAS PubMed Google Scholar
Pinto NR, Applebaum MA, Volchenboum SL, Matthay KK, London WB, Ambros PF, et al. Advances in Risk Classification and Treatment Strategies for Neuroblastoma. J Clin Oncol. 2015;33(27):3008–17.
Article CAS PubMed PubMed Central Google Scholar
Chayka O, D’Acunto CW, Middleton O, Arab M, Sala A. Identification and Pharmacological Inactivation of the MYCN Gene Network as a Therapeutic Strategy for Neuroblastic Tumor Cells. J Biol Chem. 2015;290(4):2198–212.
Article CAS PubMed Google Scholar
Hamre MR, Clark SH, Mirkin BL. Resistance to inhibitors of S-adenosyl-l-homocysteine hydrolase in C1300 murine neuroblastoma tumor cells is associated with increased methionine adenosyltransferase activity. Oncol Res Featuring Preclinical Clin Cancer Ther. 1995;7(10–11):487–92.
CAS Google Scholar
Zhang C, Bowlin T, Mirkin BL. Suppression of C-1300 murine neuroblastoma cell proliferation in tissue culture and tumor growth in vivo by (Z) 5’-Fluoro-4’,-5’-didehydro-5’-deoxyadenosine (MDL 28,842), an irreversible inhibitor of S-Adenosyl-L-homocysteine Hydrolase. Oncol Res Featuring Preclinical Clin Cancer Ther. 1993;5(10–11):433–9.
CAS Google Scholar
O’Dea RF, Mirkin BL, Hogenkamp HP, Barten DM. Effect of adenosine analogues on protein carboxylmethyltransferase, S-adenosylhomocysteine hydrolase, and ribonucleotide reductase activity in murine neuroblastoma cells. Cancer Res. 1987;47(14):3656–61.
PubMed Google Scholar
Ramakrishnan V, Borchardt RT. Adenosine dialdehyde and neplanocin A: potent inhibitors of S-adenosylhomocysteine hydrolase in neuroblastoma N2a cells. Neurochem Int. 1987;10(4):423–31. https://doi.org/10.1016/0197-0186(87)90068-4.
Article CAS PubMed Google Scholar
Dwivedi RS, Wang LJ, Mirkin BL. S-adenosylmethionine synthetase is overexpressed in murine neuroblastoma cells resistant to nucleoside analogue inhibitors of S-adenosylhomocysteine hydrolase: a novel mechanism of drug resistance. Cancer Res. 1999;59(8):1852–6.
CAS PubMed Google Scholar
Novak EM, Halley NS, Gimenez TM, Rangel-Santos A, Azambuja AMP, Brumatti M, et al. BLM germline and somatic PKMYT1 and AHCY mutations: Genetic variations beyond MYCN and prognosis in neuroblastoma. Med Hypotheses. 2016;97:22–5.
Article CAS PubMed Google Scholar
Oliynyk G, Ruiz-Pérez MV, Sainero-Alcolado L, Dzieran J, Zirath H, Gallart-Ayala H, et al. MYCN-enhanced Oxidative and Glycolytic Metabolism Reveals Vulnerabilities for Targeting Neuroblastoma. Iscience. 2019;21:188–204.
Article CAS PubMed PubMed Central Google Scholar
Tan K, Wu W, Zhu K, Lu L, Lv Z. Identification and Characterization of a Glucometabolic Prognostic Gene Signature in Neuroblastoma based on N6-methyladenosine Eraser ALKBH5. J Cancer. 2022;13(7):2105–25.
Article CAS PubMed PubMed Central Google Scholar
Kanda M, Nomoto S, Oya H, Shimizu D, Takami H, Hibino S, et al. Dihydropyrimidinase-like 3 facilitates malignant behavior of gastric cancer. J Exp Clin Cancer Res. 2014;33(1):1–8.
Article Google Scholar
Kawahara T, Hotta N, Ozawa Y, Kato S, Kano K, Yokoyama Y, et al. Quantitative proteomic profiling identifies DPYSL3 as pancreatic ductal adenocarcinoma-associated molecule that regulates cell adhesion and migration by stabilization of focal adhesion complex. PLoS ONE. 2013;8(12): e79654. https://doi.org/10.1371/journal.pone.0079654.
Article CAS PubMed PubMed Central Google Scholar
Rosslenbroich V, Dai L, Baader SL, Noegel AA, Gieselmann V, Kappler J. Collapsin response mediator protein-4 regulates F-actin bundling. Exp Cell Res. 2005;310(2):434–44. https://doi.org/10.1016/j.yexcr.2005.08.005.
Article CAS PubMed Google Scholar
Alabed YZ, Pool M, Tone SO, Sutherland C, Fournier AE. GSK3β regulates myelin-dependent axon outgrowth inhibition through CRMP4. J Neurosci. 2010;30(16):5635–43. https://doi.org/10.1523/JNEUROSCI.6154-09.2010.
Li Z, Tan F, Thiele CJ. Inactivation of glycogen synthase kinase-3 contributes to brain-derived neutrophic factor/TrkB-induced resistance to chemotherapy in neuroblastoma cells. Mol Cancer Ther. 2007;6(12):3113–21.
Article CAS PubMed Google Scholar
Tan F, Wahdan-Alaswad R, Yan S, Thiele CJ, Li Z. Dihydropyrimidinase-like protein 3 expression is negatively regulated by MYCN and associated with clinical outcome in neuroblastoma. Cancer Sci. 2013;104(12):1586–92. https://doi.org/10.1111/cas.12278.
Article CAS PubMed PubMed Central Google Scholar
Adam K, Lesperance J, Hunter T, Zage PE. The potential functional roles of NME1 histidine kinase activity in neuroblastoma pathogenesis. Int J Mol Sci. 2020;21(9):3319. https://doi.org/10.3390/ijms21093319.
Article CAS PubMed PubMed Central Google Scholar
Carotenuto M, Pedone E, Diana D, de Antonellis P, Džeroski S, Marino N, et al. Neuroblastoma tumorigenesis is regulated through the Nm23-H1/h-Prune C-terminal interaction. Sci Rep. 2013;3(1):1–11. https://doi.org/10.1038/srep01351.
Article CAS Google Scholar
Negroni A, Venturelli D, Tanno B, Amendola R, Ransac S, Cesi V, et al. Neuroblastoma specific effects of DR-nm23 and its mutant forms on differentiation and apoptosis. Cell Death Differ. 2000;7(9):843–50. https://doi.org/10.1038/sj.cdd.4400720.
Article CAS PubMed Google Scholar
Okabe-Kado J, Kasukabe T, Honma Y, Hanada R, Nakagawara A, Kaneko Y. Clinical significance of serum NM23-H1 protein in neuroblastoma. Cancer Sci. 2005;96(10):653–60. https://doi.org/10.1111/j.1349-7006.2005.00091.x.
Article CAS PubMed Google Scholar

Download references

Acknowledgements

The authors thank Davide Cangelosi (Istituto Giannina Gaslini) for his feedback.

Funding

Not applicable.

Author information

Authors and Affiliations

Institute of Health Policy Management and Evaluation, University of Toronto, 155 College Street, M5T 3M7, Toronto, Ontario, Canada
Davide Chicco
Dipartimento di Scienze Mediche, Università di Torino, Via Verdi 8, 10124, Turin, Italy
Tiziana Sanavia
Data Science for Health Unit, Fondazione Bruno Kessler, Via Sommarive 18, 38123, Povo (Trento), Italy
Giuseppe Jurman

Authors

Davide Chicco
View author publications
You can also search for this author in PubMed Google Scholar
Tiziana Sanavia
View author publications
You can also search for this author in PubMed Google Scholar
Giuseppe Jurman
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

D.C. conceived the study, made the signature literature search, discovered the three most recurrent genes, performed the machine learning analysis, co-supervised this study, and wrote multiple parts of the article. T.S. found the survival datasets, performed the survival analysis, co-supervised this study, and wrote multiple parts of the article. G.J. performed the literature review, performed the probabilistic calculations on the presence of the three proposed genes, co-supervised the study, and wrote multiple parts of the article. All the authors read and approved the final manuscript version.

Corresponding author

Correspondence to Davide Chicco.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

The authorizations for the usage of patients’ data have been obtained by the original curators of the studies which released the datasets publicly.

Competing interests

The authors declare they have no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Supplementary information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Cite this article

Chicco, D., Sanavia, T. & Jurman, G. Signature literature review reveals AHCY, DPYSL3, and NME1 as the most recurrent prognostic genes for neuroblastoma. BioData Mining 16, 7 (2023). https://doi.org/10.1186/s13040-023-00325-1

Download citation

Received: 19 September 2022
Accepted: 17 February 2023
Published: 04 March 2023
DOI: https://doi.org/10.1186/s13040-023-00325-1

Signature literature review reveals AHCY, DPYSL3, and NME1 as the most recurrent prognostic genes for neuroblastoma

Abstract

Introduction

Methods

Prognostic genes discovery and dataset search

Validation of risk and hazard ratio in multi-studies overall survival data

Binary classification method

Validation on the literature

Results

Prognostic genes found

Computational evaluation of the impact of the most recurrent prognostic genes across the largest neuroblastoma studies

AHCY, DPYSL3, and NME1 in the neuroblastoma literature

Discussion

Availability of data and materials

Abbreviations

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Ethics approval and consent to participate

Consent for publication

Competing interests

Additional information

Publisher’s Note

Supplementary Information

Additional file 1.

Rights and permissions

About this article

Cite this article

Keywords

BioData Mining

Contact us

Signature literature review reveals AHCY, DPYSL3, and NME1 as the most recurrent prognostic genes for neuroblastoma

Abstract

Introduction

Methods

Prognostic genes discovery and dataset search

Validation of risk and hazard ratio in multi-studies overall survival data

Binary classification method

Validation on the literature

Results

Prognostic genes found

Computational evaluation of the impact of the most recurrent prognostic genes across the largest neuroblastoma studies

AHCY, DPYSL3, and NME1 in the neuroblastoma literature

Discussion

Availability of data and materials

Abbreviations

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Ethics approval and consent to participate

Consent for publication

Competing interests

Additional information

Publisher’s Note

Supplementary Information

Additional file 1.

Rights and permissions

About this article

Cite this article

Share this article

Keywords

BioData Mining

Contact us