Skip to main content

A prognostic model based on seven immune-related genes predicts the overall survival of patients with hepatocellular carcinoma

A Correction to this article was published on 26 October 2023

This article has been updated



Hepatocellular carcinoma (HCC) is a disease with a high incidence and a poor prognosis. Growing amounts of evidence have shown that the immune system plays a critical role in the biological processes of HCC such as progression, recurrence, and metastasis, and some have discussed using it as a weapon against a variety of cancers. However, the impact of immune-related genes (IRGs) on the prognosis of HCC remains unclear.


Based on The Cancer Gene Atlas (TCGA) and Immunology Database and Analysis Portal (ImmPort) datasets, we integrated the ribonucleic acid (RNA) sequencing profiles of 424 HCC patients with IRGs to calculate immune-related differentially expressed genes (DEGs). Survival analysis was used to establish a prognostic model of survival- and immune-related DEGs. Based on genomic and clinicopathological data, we constructed a nomogram to predict the prognosis of HCC patients. Gene set enrichment analysis further clarified the signalling pathways of the high-risk and low-risk groups constructed based on the IRGs in HCC. Next, we evaluated the correlation between the risk score and the infiltration of immune cells, and finally, we validated the prognostic performance of this model in the GSE14520 dataset.


A total of 100 immune-related DEGs were significantly associated with the clinical outcomes of patients with HCC. We performed univariate and multivariate least absolute shrinkage and selection operator (Lasso) regression analyses on these genes to construct a prognostic model of seven IRGs (Fatty Acid Binding Protein 6 (FABP6), Microtubule-Associated Protein Tau (MAPT), Baculoviral IAP Repeat Containing 5 (BIRC5), Plexin-A1 (PLXNA1), Secreted Phosphoprotein 1 (SPP1), Stanniocalcin 2 (STC2) and Chondroitin Sulfate Proteoglycan 5 (CSPG5)), which showed better prognostic performance than the tumour/node/metastasis (TNM) staging system. Moreover, we constructed a regulatory network related to transcription factors (TFs) that further unravelled the regulatory mechanisms of these genes. According to the median value of the risk score, the entire TCGA cohort was divided into high-risk and low-risk groups, and the low-risk group had a better overall survival (OS) rate. To predict the OS rate of HCC, we established a gene- and clinical factor-related nomogram. The receiver operating characteristic (ROC) curve, concordance index (C-index) and calibration curve showed that this model had moderate accuracy. The correlation analysis between the risk score and the infiltration of six common types of immune cells showed that the model could reflect the state of the immune microenvironment in HCC tumours.


Our IRG prognostic model was shown to have value in the monitoring, treatment, and prognostic assessment of HCC patients and could be used as a survival prediction tool in the near future.

Peer Review reports


Ranking sixth in worldwide incidence, primary liver cancer (PLC) is the fourth-leading cause of cancer-related mortality [1]. Hepatocellular carcinoma (HCC), the most common pathological type of PLC, accounts for approximately 90% of reported cases [2,3,4,5]. Hepatitis B and C viruses are the biggest risk factors for HCC [6]. Application of the hepatitis B virus vaccine has caused the incidence of HCC to decline [7]. Leaving aside patients who are diagnosed at an early stage or eligible for potentially curative therapies, treatment for advanced HCC is limited due to its heterogeneity, and the overall prognosis of HCC patients is still unsatisfactory [8, 9].

Cancer immunotherapy has contributed to personalized medicine, with substantial clinical benefit against advanced disease [10,11,12,13,14,15]. Current immune checkpoint inhibitors show surprising potential effectiveness against HCC [16, 17]. Indeed, the liver is a central immunological organ with a high density of myeloid and lymphoid immune cells [17, 18]. Immune cells are widespread in the tumor microenvironment (TME) [19, 20], wherein interaction between tumor cells and immune cells is extremely important to maintaining the dynamic balance of normal tissues and tumor growth; this process is closely related to the occurrence, progression, and prognosis of cancer [21]. Meanwhile, inflammatory reaction plays a decisive role at different stages of tumor development. It also affects immune monitoring and response to treatment and promotes the occurrence and development of tumours to varying degrees [22]. Since HCC often arises in the setting of chronic liver inflammation [5, 23] and might be responsive to novel immunotherapies, people infected with hepatitis B or C viruses are at high risk of HCC [24]. While several studies have supported the importance of immunology in HCC, the exact molecular mechanisms still remain unknown, particularly for combinations of immune cells forming a TME [25] and for immunogenomic effects [26]. With the advent of multi-dimensional, large-scale high-throughput analyses, cancer researchers have been able to identify culpable biomarkers for tumour prognosis and prediction [27,28,29,30]. Long et al. explored the prognostic value of immune-related genes (IRGs) linked to TP53 status in order to improve the prognoses of HCC patients [31]. Moeinia et al. analysed the expression profiles of 392 early-stage non-tumour liver tissues from HCC patients and liver tissues from HCC-free cirrhosis patients, identified possible regulatory changes in the expression of IRGs in HCC, and further verified the accuracy of this conclusion through experiments. This gene expression pattern is related to the risk of PLC in cirrhosis patients [32]. Liang S et al. proposed that after liver injury, the molecular pattern related to the release of hepatocytes would activate liver tumour-associated macrophages (TAMs), thus producing cytokines to promote tumour development [33]. However, the clinical relevance and prognostic significance of IRGs in HCC have yet to be comprehensively explored.

Our study aimed to better appreciate the potential clinical utility of IRGs prognostic stratification and develop a new IRG-based immune prognostic model (IPM). We systematically investigated the expression status from The Cancer Gene Atlas (TCGA, database and prognostic landscape of IRGs, constructed a genomic–clinicopathological model for these patients and validate it in Gene Expression Omnibus (GEO, Moreover, underlying regulatory mechanisms have been explored by bioinformatics analysis. The results of this study could help provide a more complete understanding and more-precise immunotherapy for HCC.

Materials and methods

HCC datasets and preprocessing

As TCGA and GEO databases both are landmark cancer genome projects that are publicly available to any researcher, our research did not require the approval of an ethics committee. After downloading data from transcriptome messenger ribonucleic-acid (mRNA) expression profiles and the clinical information of HCC patients from the TCGA and GEO website, we ultimately obtained a dataset of 374 HCC and 50 para-tumor samples [34] as a training dataset, 225 HCC tissues and 220 adjacent non-tumour samples (GPL3921) in GSE14520 dataset as a test dataset [35].

Also, we obtained a list of IRGs from the Immunology Database and Analysis Portal (ImmPort, This is one of the largest open source repositories of human immunological data at the subject level, providing data on clinical and mechanistic studies of human subjects and immunological studies of model organisms [36]. The integrated analysis of these databases, which reveals new insights into the comprehensive analysis yielded by the combination of mass spectrometry staining and tumour molecular profiling, could become a useful resource on the regulation of tumour-related genes.

Identification, normalization, and elucidation of differentially expressed genes (DEGs) and immune-related genes (IRGs)

We used the limma package in R software (version 3.5.3; R Foundation for Statistical Computing) to calculate genes in common between HCC and para-tumour tissue [37]. The absolute value of log fold change (FC) was ≥2, and adjusted P < 0.05 was the cutoff value. We screened DEGs between the two groups and depicted the results in a heatmap and volcano plot. Then, we use the combat function in the sva package in R software to remove batch effects and batch corrections on the gene expression data between the training and test group [38]. By combining DEGs and IRGs, we obtained the intersection of IRGs involved in HCC pathogenesis, and all of the IRGs were listed in GSE14520 dataset, too. To explore the potential functions and possible pathways of these IRGs, we further analysed the differentially expressed IRGs via gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, enabled by the clusterProfiler package in R software [39].

Screening of prognosis-specific IRGs

We combined and analysed the patients’ clinical information and the gene expression of IRGs, using OS as the outcome index. Samples with an OS time of less than 30 days and incomplete clinical information were omitted, and we finally retained 343 samples in the TCGA dataset and 221 samples in the GSE14520 dataset to construct the model. Detailed epidemiological information of the two cohorts is displayed in Table 1. The significance level of univariate Cox regression analysis was set to P < 0.05 and displayed in the form of a forest plot.

Table 1 Clinical information in training and validation groups

Transcription factor (TF) regulatory network

TF protein are critical regulators of gene switches [40]. The Cistrome Cancer database ( combines the cancer genomics data in TCGA with the chromatin analysis data in the Cistrome Data Browser, enabling cancer researchers to explore how TFs regulate the degree of gene expression [41]. To explore the regulatory mechanisms of prognosis-related IRGs, we built a regulatory network covering differentially expressed TFs and IRGs using Cytoscape software version 3.7.1 (Cytoscape Consortium; [42]. We also conducted protein–protein interaction (PPI) analysis using the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING; STRING Consortium; to evaluate interactions among all of the TFs. Using the cytoHubba package in Cytoscape, we also performed topological analysis of these key TFs and ranked the top 10 by the “degree” criterion [43].

Construction of IPMs and validation model

The glmnet package was utilized to build a multivariate least absolute shrinkage and selection operator (Lasso) Cox proportional hazards regression model, and the cv.glmnet function was used to create 1000 random iterations. We obtained the best modelling parameters through 10-fold cross-validation and the default “deviance”, hence constructing an IPM of the IRGs [44]. The calculation formula was as follows:

$$ risk\kern0.5em score=\sum \limits_{n=1}^{\infty}\left({\beta}_n\times {\in}_n\right) $$

where β represents the weight of each gene, and is the standardized expression value of each gene. According to the median value of the risk score, the entire TCGA dataset was divided into two groups. We also divided the GSE14520 data set into high- and low- risk groups according to the median in the training set. We applied Kaplan-Meier (K-M) survival analyses curves to see if there were any differences between these two groups. At the same time, we displayed the risk scores, survival status, and gene expression levels of patients in the high-risk and low-risk groups.

Construction and validation of the prognosis-related nomogram

We built 1-, 3-, and 5-year nomograms of key genes in the IPM using the rms packages in R software. To evaluate the sensitivity and specificity of our IPM, we drew time-dependent receiver operating characteristic (ROCs) curves and calibration curves, and calculated a concordance index (C-index) using the survivalROC installation package in R software [45]. When the C-index is between 0.5–0.7, it proves that the prognostic performance of the model is statistically acceptable; and when C-index > 0.7, we considered the predictive power of our model has a high degree of discrimination [46].

Correlations between risk score and clinical features

Similarly, we analysed the significance of risk score correlated with clinical factors in multivariate and univariate analyses, and constructed a nomogram to evaluate practical-application value of the nomogram. The clinical factors in the training set include age, gender, TNM staging and grade; the clinical information in the testing set include gender, age, alanine transaminase (ALT) (>/<=50 U/L), main tumour size (>/<=5 cm), multinodular, cirrhosis, tumour node metastasis (TNM) staging, Barcelona Clinic Liver Cancer (BCLC) staging, Cancer of the Liver Italian Program (CLIP) staging and alpha fetoprotein (AFP) (>/<=300 ng/ml). In addition, the time-independent ROC curve and C-index value were used to assess its prognostic performance, too. We further analysed the correlation of various clinical factors with gene expression levels and risk scores in the IPM.

Gene set enrichment analysis

GSEA v4.0.1 software was used to further identify different biological processes between the low-risk and high-risk groups constructed by the seven IRGs in HCC. We carried out gene set enrichment analysis (GSEA, to explore the enriched items of the two groups [47] and “c2.all.v7.4.symbols.gmt” was chosen as the reference gene set. P < 0.05 and false discovery rate < 0.25 were used as the screening criteria.

Relationship between risk sore and immune cell infiltration

The Tumour Immune Estimation Resource online database (TIMER, can estimate the infiltration abundance of six common types of immune cells-B cells, Cluster of Differentiation 4-positive (CD4+) T cells, Cluster of Differentiation 8-positive (CD8+) T cells, neutrophils, TAMs, and dendritic cells (DCs)-and provide a comprehensive resource on immune infiltration of various cancer types [17]. Hence, we performed Pearson correlation analysis between risk score and the content of six types of immune cells.

Verification of immune-related signatures

We analysed genetic alterations in seven IRGs associated with prognosis. The data were obtained from the cBio Cancer Genomics Portal (cBioPortal,, which is of great utility in exploring multidimensional genomic information [48]. The human protein atlas project (HPA, is used to evaluate the protein level differences of each IRGs [49]. To obtain the effect on HCC survival of high and low expression of these genes in HCC, we input them into the Kaplan Meier Plotter (K-M,, a website providing gene chips and RNA sequencing data sources from the GEO and TCGA for several cancers [48, 50]. P < 0.05 was considered to be statistically significant. We calculated OS, disease-free survival (DFS), progression-free survival (PFS), and relapse-free survival (RFS) rates for HCC.

Statistical analysis

Most of the statistical analyses was performed using R software and online databases. PPI network analysis was completed and the diagram of mechanism regulation between TFs and IRGs was created using Cytoscape. Pearson correlation analysis was used to analyse the correlation between risk score and clinical factors and the degree of immune cell infiltration. In addition, we used the cBioPortal and K-M Plotter to analyse the genetic changes and survival differences of genes, respectively.


Differentially expressed OS-related DEGs in HCC

The flowchart in Fig. 1 clearly illustrates our analytic process. According to our screening criteria (|log FC| > 2, adjusted P < 0.05), the limma package identified 2068 DEGs in common between HCC and normal liver tissue. These DEGs included 1991 upregulated and 77 downregulated genes (Fig. 2a, d). From this group of genes, we extracted 116 differentially expressed IRGs, including 96 upregulated and 20 downregulated genes (Fig. 2b, e). Finally, we obtained 100 IRGs that exist both in the TCGA and GSE14520 dataset for model construction.

Fig. 1
figure 1

Flowchart presenting the process of establishing the seven-gene signature and prognostic nomogram for HCC. Abbreviations: HCC: hepatocellular carcinoma; TCGA-LIHC: The Cancer Genome Atlas, Liver Hepatocellular Carcinoma; GEO: Gene Expression Omnibus; IMMPORT: Immunology Database and Analysis Portal; DEG: differentially expressed gene; TF: transcription factor; ROC: Receiver operating characteristic; IRG: immune-related gene; LASSO: Least Absolute Shrinkage and Selection Operator; GSEA: Gene Set Enrichment Analysis

Fig. 2
figure 2

The filter results of differentially expressed immune related genes (IRGs) and transcription factors (TFs) between 374 hepatocellular carcinoma (HCC) and 50 para-tumor samples. a Heatmap and Volcano plot (c) of differentially expressed IRGs; (b) Heatmap and volcano plot (D) of differentially expressed TFs. Green and red dots separately represent low and high expression of IRGs and TFs in HCC, and black dots represent genes that are not differentially expressed

The results of IRGs enrichment analysis were more common in the inflammatory pathway, including “positive regulation of secretion by cell,” “positive regulation of secretion,” “antimicrobial humoral response” and “defense response to bacterium” in biological processes. In the meantime, these genes participated in “secretory granule lumen,” “Cytoplasmic vesicle lumen,” and “vesicle lumen” in cell components; and played a main role in the regulation of various receptor ligands, cytokines, cytokine receptors, hormones, or chemokines in molecular functions. Also, these IRGs could be involved in the composition of signal pathways such as “Cytokine-cytokine receptor interaction,” “Axon guidance,” “TGF-beta signaling pathway,” “Viral protein interaction with cytokine and cytokine receptor,” and “Hippo signaling pathway” (Fig. 3). The above enriched items are all related to immunity or tumour, indicating that these 100 IRGs may play a role in regulating HCC by regulating some immunological process.

Fig. 3
figure 3

Functional-enrichment analysis of the 100 common differentially expressed genes. a Biological processes analysis, (b): Cell components analysis; (c): Molecular functions analysis; (d) Pathway analysis of the top 30 most important entries in the Kyoto Encyclopedia of Genes and Genomes. Significance gradually increases from blue to red; the thickness of the line represents the degree of correlation between the two points, and the size of the circle represents the number of genes enriched on the entry

Establishment and validation of a seven-gene prognostic signature based on the prognosis of HCC

We included a total of 343 patient cases (OS > 30 days) from TCGA in the survival analysis; OS was selected as the primary endpoint for this study. Applying the univariate Cox regression model (P < 0.05), we used the 100 IRGs to identify the DEGs associated with OS in HCC. We identified 30 OS-related DEGs, which were considered to be significant genes associated with HCC (Fig. 4).

Fig. 4
figure 4

Forest plot of significant genes in univariate cox regression analysis (The red squares on the right side of the forest map indicate risk factors, and the green squares indicate protective factors)

Using Lasso Cox multivariate analysis, we then developed an IPM based on seven genes: FABP6, MAPT, BIRC5, PLXNA1, CSPG5, SPP1 and STC2. The hazard ratios of all these DEGs were > 1, meaning that all were considered oncogenes. According to the following formula:

$$ {\displaystyle \begin{array}{c} risk\ score=\left[(0.103)\times FABP6\ standardized\ expression\ value\right]+\\ {}\left[(0.0214)\times MAPT\ standardized\ expression\ value\right]+\\ {}\begin{array}{c}\left[(0.161)\times BICR5\ standardized\ expression\ value\right]\\ {}\left[(0.0421)\times PLXNA1\ standardized\ expression\ value\right]+\\ {}\begin{array}{c}\left[(0.244)\times CSPG5\ standardized\ expression\ value\right]+\\ {}\left[(0.0497)\times SPP1\ standardized\ expression\ value\right]+\\ {}\left[(0.174)\times STG2\ standardized\ expression\ value\right]\end{array}\end{array}\end{array}} $$

we calculated the risk score of each sample and then automatically divided all of the patients in TCGA into high- and low-risk groups according to median risk value. The K-M survival curve showed a significantly worse prognosis in the high-risk group (P = 8.135e− 07; Fig. 5a). The heatmap shows that as the risk score increases, the expression of IRGs gradually increases, thus indicating that high expression of these genes is a risk factor for HCC prognosis (Fig. 5b). In addition, the survival risks of these patients gradually increased as risk scores increased, and the number of survivors decreased significantly (Fig. 5c, d). Finally, we get the same results in the GSE14520 dataset, indicating that our model has a high degree of credibility (Fig. 5e-h).

Fig. 5
figure 5

Construction of seven immune-related prognostic signatures for HCC. a: Kaplan-Meier curve for low- and high-risk populations in training group; (b): The distribution of risk score in patients in training group; (c): Survival status of patients with HCC in training group; (d): Heatmap of the expression levels of seven immune-related genes (IRGs) of patients in training group; (e): Kaplan-Meier curve for low- and high-risk populations in testing group; (F): The distribution of risk score in patients in testing group; (g): Survival status of patients with HCC in training group; (H): Heatmap of the expression levels of seven IRGs of patients in testing group

Meanwhile, we constructed a nomogram of the seven IRGs and evaluated the prognostic value of the seven-gene model based on the time-dependent ROC curve and the C-index value. The 1-, 3-, and 5-year risk prediction area under ROC curves (AUCs) for OS were 0.780, 0.699, and 0.685, respectively, and the C-index is 0.72, 95% [confidence interval (CI): 0.68, 0.77] and 0.62, 95% [CI: 0.57, 0.68], respectively. The above results indicated that the seven signatures performed well in predicting the OS of HCC, and we obtained the same results in testing set, too (Fig. 6).

Fig. 6
figure 6

The establishment and verification of gene-related nomograms in the training (a, c) and verification (b, d) groups for predicting 1-year, 3-year, and 5-year survival rates of HCC patients

TF regulatory network

To explore the clinical significance of pivotal IRGs and the corresponding underlying molecular mechanisms, we examined the expression profiles of 318 TFs and found that 31 genes were differentially expressed between HCC and non-tumour HCC samples (|log FC| > 2, P < 0.05), and they were related to OS in HCC patients. Then, we established a regulatory network based on these 31 TFs and 9 IRGs that had proven significant in univariate analysis. Correlation coefficients > 0.4 and P-values ≤0.001 were set as screening criteria. The TF-based regulatory diagram in Fig. 7 clearly illustrates the regulatory relationship between these IRGs (Fig. 7).

Fig. 7
figure 7

Protein–protein interaction network based on prognosis-related transcription factors (TFs) (a) and the main regulatory network between TFs and prognostic immune-related genes (IRGs) (b). In (a), center, the top 10 genes are sorted by the Degree criterion; the darker the color, the higher the ranking. In (B), the blue triangles represent TFs, the red circles represent differentially expressed prognostic IRGs, and the red connecting lines represent positive regulation

Evaluation of prognostic factors associated with OS in HCC

We included 219 patients with complete clinical information in the TCGA-LIHC dataset. As important clinical indicators, gender, age, grade, and TNM staging were included in our study to identify prognostic factors. We used univariate and multivariate Cox regression analysis to determine prognostic factors associated with OS in HCC. Univariate analysis showed that risk score, TNM staging, T stage, and M stage were significantly correlated with OS (P < 0.05). Based on univariate-analysis results with P < 0.669, we further included these parameters in multivariate Cox regression analysis for analysis. Multivariate analysis showed that risk score (P < 0.001) was an independent risk factor (Fig. 8a, b), further demonstrating that our IPM’s impact on the patient’s prognosis is not disturbed by other clinical factors, and it is an independent prognostic factor of OS in HCC patients.

Fig. 8
figure 8

(a, c) Univariate and (b, d) multivariate Cox regression analysis of the correlation between risk score and clinical factors in training (a, b) and testing (c, d) group

The clinical information of 242 HCC patients who meet the criteria in the GSE14520 dataset includes age, ALT (>/<=50 U/L), main tumour size (>/<=5 cm), multinodular cirrhosis, TNM staging, BCLC staging, CLIP staging and AFP (>/<=300 ng/ml) were included in the analysis. Univariate analysis showed that risk score, main tumour size, cirrhosis, TNM staging, BCLC staging, CLIP staging and AFP were related to OS; while multinodular, cirrhosis, BCLC staging, CLIP staging and risk score were independent prognostic risk factors in multivariate analysis (Fig. 8c, d).

Construction and validation of a prognostic nomogram

We used a stepwise Cox regression model to establish a prognostic nomogram based on the 219 eligible HCC patients with complete clinical information in the TCGA–LIHC dataset for predicting survival at 1, 3 and 5 years. Risk score, age, sex, TNM stage, T stage, N stage, and M stage were all nomogram parameters. The AUCs of OS at 1, 3 and 5 years were 0.791, 0.760 and 0.793, respectively. The C-index was values were 0.78 (95% CI: 0.72, 0.84) and 0.73 and (95% CI: 0.68, 0.78) in the training and testing groups, respectively. The results of the clinical factors showed that the AUC values of T stage, TNM stage, and risk score were the highest at 0.757, 0.750, and 0.791, respectively, which suggested that the IPM had moderate prognostic performance (Fig. 9). The calibration curve further showed that the nomogram performed well in predicting the OS of HCC patients in the training group. However, the difference between the predicted survival rate and the actual survival rate in the calibration curve of the testing group was large, suggesting that the performance of the prognostic model may need to be further verified (Fig. 10).

Fig. 9
figure 9

The establishment and verification of clinical-related nomograms in the training (a-c) and verification (d-f) groups for predicting 1-year, 3-year, and 5-year survival rates of HCC patients

Fig. 10
figure 10

Calibration curve of nomogram in the training set and testing set. The X-axis is the predicted survival rate, and the Y-axis is the actual survival rate. a, b, c: 1-year, 3-years, and 5-year calibration curves in the training set; (d, e, f):1- year, 3- year and 5 -year calibration curves in the testing set

Gene set enrichment analysis

We performed GSEA in the training group to identify the differences between the high-risk and low-risk groups. Among them, the high-risk group had 41 significantly enriched pathways, and the low-risk group had 12 significantly enriched pathways. The enrichment pathways in high-risk groups are mostly related to tumours (bladder cancer, small cell lung cancer, non-small cell lung cancer, pancreatic cancer and colorectal cancer) or tumour-related pathways (“p53 signaling pathway”, “nucleotide-binding oligomerization domain (NOD) -like receptor signaling pathway”, “Notch signaling pathway”, “VEGF signaling pathway” and “Pathways in cancer”), and metabolic or metabolic disease-related pathways (pyrimidine metabolism, purine metabolism and N-Glycan biosynthesis); the enrichment pathways in the low-risk group are mostly related to metabolism (fatty acid metabolism, valine leucine and isoleucine degradation, drug metabolism cytochrome P450 and tryptophan metabolism), complement and coagulation cascades and the PPAR string pathway (Fig. 11).

Fig. 11
figure 11

Representative pathways of significant enrichment in the model by Gene set enrichment analysis (high-risk group above, low-risk group below)

The correlation between clinical factors and gene signature in IPM

We analysed the relationship between genetic characteristics and clinical parameters (Table 2) in training group. Compared with patients with Stage I/II, G1/G2, and T1/T2 in HCC, patients with Stage III/IV, G3/G4, and T3/T4 have higher levels of BICR5 gene expression and risk score. Female has higher expression level of PLXNA1 than male, while G3/G4 and stage III/IV have higher expression level of PLXNA1 gene. Those of patients in Stage III and IV were higher in SPP1 expression than those of patients in Stage I and II, a difference that was statistically significant. Patients in stage N0 had higher expression of FABP6 and MAPT than patients in stage N1, probably due to the large difference in the number of samples between the two groups. In terms of survival time, patients with HCC in T1/T2 and Stage I/II were significantly higher than those with the disease in T3/T4 and stage III/IV.

Table 2 The relationship between clinical factors and risk scores or the expression of seven prognostic related immune genes in hepatocellular carcinoma

Association between the degree of immune infiltration and risk score

To further study whether risk score in this IPM could affect the abundance of immune cells in the TME, we performed a correlation analysis between six types of common immune cells and risk score. In Fig. 12 we can see that the correlation coefficients of risk score and neutrophils, TAMs and dendritic are all above 0.2; B cells, CD4+ T cells and CD8+ T cells and risk scores is less than 0.2, but it is relatively close to 0.2. The results showed that all immune cells were positively correlated with risk score to a statistically significant degree (Fig. 12; P < 0.05), which implying that the higher the degree of immune infiltration, the worse the prognosis of the patient.

Fig. 12
figure 12

Pearson correlation analysis between risk score and infiltration abundances of six types of immune cells. a B cells, (b) Cluster of Differentiation 4–positive (CD4+) T cells, (c) CD8+ T cells, (d) neutrophils, (e) macrophages, and (f) dendritic cells

Verification of hub genes using online website

Based on the above analysis results, we can see that the model has a high clinical application value, so we conducted a study of the molecular characteristics of the IRGs in the IPM. We analysed the genetic-variation results of the genes FABP6, MAPT, BIRC5, PLXNA1, CSPG5, SPP1 and STC2. Of the 349 patients included in the cBioPortal, 120 (34.38%) showed genetic changes in these seven genes. With mRNA high (21.49%) was the most common genetic variation, amplification (6.5%) and missense mutation (1.69%) being the next most common (Fig. 13a). We further analysed the differences in the expression of these genes at the protein level. As shown in Fig. 11b, except for FABP6, the other genes were all highly expressed in HCC tissues.

Fig. 13
figure 13

Genetic alterations landscape (a) and expression in the translational level (b) of the seven-prognostic immune-related genes in hepatocellular carcinoma

To confirm that multi-gene signatures have better prognostic performance than a single-gene signature, we performed a K-M survival analysis of the seven IRGs in this IPM (Fig. 14). Almost all of the P-values were < 0.05, and the c-indexes in ROC curves were lower than in IPM, further proving the importance of these seven IRGs. Since these seven IRGs were found to be oncogenes, the high expression of genes is often associated with poor prognosis, which is basically consistent with the conclusions we have obtained before. However, patients with high expression of FABP6 had longer PFS and RFS. Our conclusions regarding FABP6 was unclear, and further research might be needed to verify them. In short, the abnormal stomatic mutations, expression and survival differences of these seven IRGs in HCC may help explain their important application value.

Fig. 14
figure 14

Overall survival, progression-free survival, disease-free survival, and relapse-free survival Kaplan-Meier curves of seven prognosis-related IRGs (From top to bottom: BIRC5, CSPG5, FABP6, MAPT, PLXNA1, SPP1, STC2). Black curve represents low expression; red curve represents high expression


Components of the TME, the environment in which tumour cells grow, include inflammatory cells, fibroblasts, myofibroblasts, neuroendocrine cells, adipocytes, and extracellular matrix [51]. The TME is inseparable from the growth, invasion, metastasis, and prognosis of tumour cells [45]. Unlike cancer cell genes and epigenetic mechanisms, the matrix population in the TME is relatively stable genetically; therefore, having potential therapeutic value. A growing number of studies have begun to focus on the regulatory mechanism of the TME on HCC [45]. Although major results have been achieved, identifying suitable immunotherapeutic targets in complex TME components requires the joint efforts of multiple research teams. The application of molecular prognostic models and the identification of target genes can be described as molecular therapeutics that may provide effective approaches in the future [3]. Fortunately, the open source TCGA and GEO databases have accumulated an abundance of genomic information, providing a simpler and more reliable way to predict prognosis in cancer. Our study intended to identify IRGs from the TCGA and ImmPort databases that were significant to HCC prognosis, and further verified the results in the GSE14520 dataset. The identification of these IRGs and the construction of the IPM might provide new ideas for immunotherapy in HCC. Finding and developing novel targets with potential value for immunotherapy is very important; after experimental verification and clinical trials, it can strengthen current immunotherapeutic regimes.

In the current study, we obtained 2068 HCC-related DEGs from TCGA and finally retained 100 IRGs that also existed in the GEO database for model construction. After identifying the common DEGs in the training set and testing set, we used only the IRGs to construct a Cox model for predicting OS. Incorporating clinical factors and genome information into Lasso regression analysis is indeed a better modelling method, and it is also the current trend of machine learning. However, doing this makes it difficult for us to externally verify the model because the clinical information contained in each data set is different, and there are certain limitations in selecting limited clinical factors. In addition, each type of tumour has different susceptibility factors and important indicators. Taking lung cancer as an example, attention needs to be paid to the influence of radon, asbestos, second-hand smoke and other factors, while in HCC, attention needs to be paid to hepatitis history, liver cirrhosis and AFP. Therefore, these models may not be generalizable to other types of cancer. Konstantina Kourou [52] summarized and analysed a number of studies that incorporated clinical factors and genomic information into models, but their research lacks external verification or testing of the predictive performance of the models.

In this case, other clinical information was omitted since we sought to create a simple but powerful model that could be externally verified by various data sets. Then, we evaluated the model, including the AUC curve and C-index, as well as the calibration plot, and made comparisons with the prognostic model constructed based on the risk score and other clinical factors to prove the advantages of the proposed modelling scheme. With the development of high-throughput sequencing technology and the generalization of clinical genetic testing, the use of genomic information to construct a predictive model for the simple prognostic analysis of patients will bring certain convenience to the clinic. Perhaps with the continuous improvement of various public databases, the joint modelling and analysis of clinical information and multiomics data will also become a trend.

Further gene function analysis results showed that all of the IRGs were mainly enriched in the positive regulation of secretion by cells, secretory granule lumen, receptor ligand activity, and cytokine-cytokine receptor interaction (Fig. 3). Specifically, these IRGs are mainly involved in various immune regulation processes (such as positive regulation of secretion by cells, positive regulation of secretion, antimicrobial humoral response, defence response to bacteria and humoral immune response) and take part in the composition of the secreted granule lumen, cytoplasmic vesicle lumen, and vesicle lumen; these IRGs regulate various receptors, ligands, growth factors, cytokines, and chemokines. In addition, they were mainly enriched in cytokine-cytokine receptor interactions, axon guidance, the TGF-beta signalling pathway, viral protein interactions with cytokines and cytokine receptors, and the Hippo signaling pathway. Most of the above items are related to immunity and inflammation, and the rest are classic signalling pathways in tumours. Jian Chen et al. reported that dysregulation of the TGF-beta signalling pathway plays a key role in immune regulation, inflammation and fibrogenesis in HCC [53]. Disorders of the Hippo signalling pathway are present in various tumours, including liver cancer [54], breast cancer [55] and lung cancer [56]. At present, immune checkpoint inhibitors can greatly improve the prognosis of HCC. However, the specific mechanism of the immune system affecting HCC are unclear, and further experiments is needed to confirm our conclusion.

In our research, we reached a nearly consistent conclusion with other researcher’s prognostic models: there were significant differences in OS between the high-risk and low-risk groups, the prognosis of the high-risk group was worse (Fig. 5a, p = 8.135 × 10− 7), and the same conclusion was reached in the testing set (Fig. 5a, p = 1.2535 × 10− 3). The patient’s risk score for HCC progressively increases as the expression levels of the genes in the risk signature increase, and the prognosis of HCC worsens as the risk score increases. More importantly, we constructed a nomogram based on these seven IRGs to quantitatively analyse the prognosis of HCC patients. The AUCs for 1-, 3-, and 5-year OS were 0.780, 0.699 and 0.685, respectively, and the C-index was 0.72 (95% CI: 0.68–0.77). HCC is a highly heterogeneous disease, and its prognosis is affected by many factors. We only included and analysed genes related to immunity and ignored the influence of other factors on HCC. Hence, our model does not show a high prognostic performance in predicting the long-term survival rate of patients, which is also one of the inherent defects of the model. We further analysed the risk score and clinically related factors in univariate and multivariate analyses and found that the risk score was associated with higher grade (P = 0.003) and stage III/IV disease (P = 0.004), which indicated that our prognostic model was more significant in advanced HCC patients. We believe that genetic detection should not be considered independently of individual characteristics. Therefore, we also constructed a nomogram combining the risk score and clinical factors, which can easily predict the 1-year, 3-year and 5-year OS of patients. It should be noted that the AUC values were all higher than 0.7. Compared with other clinical factors, the AUC value of the nomogram corresponding to risk score was the highest (AUC = 0.791), and the C-index was 0.78 (95% CI: 0.72–0.84). In addition, when we analysed the risk score combined with clinical factors, the C-index of the test dataset was 0.73 (95% CI: 0.67–0.78), indicating that our IPM has a modest prognostic performance in the test dataset. In the GSE14520 dataset, a series of test results were basically consistent with those in the TCGA dataset. Although the AUC values reached above 0.5 (Fig. 6), the same effect as that in the training set was not achieved, which may be because the samples in the GSE14520 dataset were from China. Generally, the model constructed in this study has certain advantages in the quantitative prediction of patient prognosis and adjustment of the treatment plan.

Our research results showed that the risk score is the only meaningful indicator in multiple analyses, which indicates that the risk score may have a better predictive ability for the OS of HCC. However, the standard error of an estimate does not tell us about the estimate’s contribution to a prediction model. Nonsignificant coefficients can still have very high predictive power, and vice versa. In addition, a significant covariate doesn’t imply a reliable estimation of survival; thus, we still need to assess the model. Next, we performed the same analysis in the test dataset and obtained the same conclusion. In addition, we evaluated our model with the AUC curve, C-index and calibration curve, which suggested that our model has good prognostic performance. To further verify the prognostic performance of the risk score, we compared the AUC value of the model constructed by the risk score with that of the model constructed by other clinical factors (Fig. 9b, e), and the results indicated that the risk score may be a good predictor of HCC survival. In addition, compared with a single gene, prognostic models based on multiple genes can better analyse the prognosis of patients. To develop a simple and effective method for evaluating the prognosis of HCC patients and find potential immunotherapy targets, we established a prognostic model based on the seven IRGs. Of course, ours is not the first IPM for HCC. Wen-jie Wang et al. constructed a prognostic model of 16 IRGs and a ceRNA network to predict the prognosis of HCC [57]; Junyu Long developed a HCC immune prognostic model related to TP53 [28]; and Dengchuan Wang et al. reported a four-gene signature prognostic model related to immune infiltration through coexpression analysis [57]. Recently, an increasing number of researchers have begun to recognize the significance of the TME in HCC, and IPMs have also received extensive attention. Compared with other prognostic models, our IPM has the following advantages. (1) We have not only established a seven-gene prognostic model of IRGs but also showed that the model can be independent of other clinical factors and is positively correlated with the degree of immune infiltration, which can provide valuable prognostic information for optimizing the individual treatment of HCC patients. Additionally, we constructed a gene nomogram and clinically related nomogram to quantitatively evaluate the 1-, 3-, and 5-year OS of patients. (2) We constructed a TF regulatory network, performed GSEA and analysed the possible mechanisms of the IRGs in the IPM related to HCC tumour infiltration, which can contribute to exploring the immunotherapy mechanism of HCC. (3) We performed gene mutation analysis and protein expression level analysis on the genes in this IPM, and also analysed the survival differences between patients with high and low expression levels of the IRGs. The conclusions obtained further confirmed the potential of IRGs in the model as a prognostic marker of HCC.

The signatures in this IPM have good prognosis performance, which could be potential prognosis and therapeutic targets for HCC. BIRC5, commonly known as Survivin, is the most effective molecule in inhibitor-of-apoptosis [58]. Experimental investigation showed that BIRC5 can promote the expression of VEGF, which in turn promotes angiogenesis in the tumour stromal [59]. PLXNA1 (Plexin-A1) is expressed in DC and participates in the interaction between T cells and DC, and may be involved in regulating the rearrangement of the cytoskeleton during the interaction between T cells and DC [60]. CSPG5 is only expressed in the human brain, and a study showed that it has a new function that binds to ERBB3 tyrosine kinase [61], and the ERBB3 somatic mutation is a potential tumour driver [62]. However, few studies have focused on its relevance to HCC immunotherapy. Ying Zhu et al. found that SPP1 can activate the CSF1-CSF1R pathway in tumour-associated TAMs and promote the expression of PD-L1 in HCC, and there is a positive correlation between SPP1 and PD-L1, TAM expression. On the other hand, SPP1 can induce endothelial cells and upregulate VEGF-induced migration of endothelial cells, having a synergistic effect with VEGF in tumour angiogenesis [63]. MAPT is mainly expressed in nerve cells, and more commonly studied in geriatric diseases such as various neurodegenerative diseases including Alzheimer’s disease [64]. Previous investigation identified that MAPT is overexpressed in certain cancers, and participate in the resistance of various tumours to taxane drugs [65], and its specific mechanism of action still needs further study. FABP6 is involved in the bile acid metabolic process and is related to the bile acid intestinal circulation [66]. STC plays a vital role in tumour growth, invasion, apoptosis and metastasis, and promotes local angiogenesis through the VEGF/VEGFR2 signalling pathway [67]. Hongwei Cheng et al. showed that the high expression of STC2 is related to the poor prognosis of HCC [68]. To further confirm the application value of these IRGs in HCC, we analysed the survival rates of groups with high- and low- expression levels of these seven genes and evaluated whether there was significance in patients’ OS, DSS, PFS, and RFS rates. The results showed that there were some contradictions in the survival analysis of FABP6. Also, the results of immunohistochemistry in the HPA database showed that except FABP6, the protein levels of other IRGs were significantly different between HCC tissues and normal liver tissues, and they were highly expressed in HCC, which was consistent with our conclusion. At present, FABP6 has not been reported in immunotherapy of HCC, which may be potential therapeutic targets for HCC. We then used the seven IRGs from the cBioPortal to obtain information about genetic mutations. The mutation rates of these seven IRGs are more than 4%, which may be useful for clinical research in the future. Collectively, these findings indicated that the seven IRGs have the potential to predict the prognosis of HCC.

To explore the potential molecular mechanisms associated with these IRGs, we constructed a TF-mediated regulatory network to screen out important TFs that might regulate identified the hub IRGs. BIRC5, PLXNA1 and CSPG5 were the core IRGs in this network; all three were positively regulated by 13 core TFs, of which EZH2 could positively regulate the expression of BIRC5, PLXNA1 and CSPG5. EZH2, the hub TF in our PPI analysis (Fig. 6a), is shown in the network diagram (based on the degree ranking criterion). An accumulating number of studies show that EZH2 is closely associated with of epigenetics [69], immunity [70], metastasis [71], angiogenesis [72] and apoptosis [73]. In addition, our GSEA results showed that these seven IRGs play an important role in tumour regulation in the high-risk and low-risk groups through immune metabolic pathways. Gaia Giannone demonstrated that immune metabolic disorders play an important role in acquired resistance to the TME and immune checkpoint inhibitors [74]. At present, the VEGF signalling pathway is widely used in the immunotherapy of HCC [75] and may be involved in cell proliferation, growth and apoptosis processes as well as the regulation of the PPAR and TP53 signalling pathways [76]. NOD-like receptor X1 can induce HCC cell apoptosis by regulating the PI3K-AKT signaling pathway [77]. The inhibition or promotion of the Notch signalling pathway in different tumours depends on the TME. The cross-talk between the Notch signalling pathway and p53 gene plays an important role in HCC and may be a potential target for HCC treatment [78]. Of particular note, based on the above studies, we found that EZH2 and BIRC5 can inhibit HCC cell apoptosis and are closely related to VEGF-mediated angiogenesis. Interestingly, in the regulatory network of TFs, EZH2 positively regulated BIRC5, with a correlation coefficient of 0.72 (p = 3.76 × 10− 57). STG and SPP1 are associated with the VEGF signalling pathway, PLXNA1 and SPP1 are associated with DCs or TAMs; CSPG5 is associated with common somatic mutation sites. The application values of MAPT and FABP6 in HCC need further experimental confirmation. In this case, we boldly speculate that EZH2 may mediate the angiogenesis of the VEGF signalling pathway through regulating the expression of the seven IRGs, which may be the possible mechanism of this predictive model related to immune infiltration in high-risk patients. In low-risk patients, we found that the mechanism of these seven IRGs related to the immune infiltration of HCC is related to metabolism. However, the specific mechanism remains to be further explored. The combination of antiangiogenic drugs and tumour immunotherapy will show great prospects in the near future. However, further insights by validation with immunohistochemistry analysis are needed to understand whether the VEGF signaling pathway is linked to high-risk groups.

To further assess the immune microenvironment of HCC, we also analysed the correlation between risk score and the following six types of immune cells: B cells, CD4+ T cells, CD8+ T cells, neutrophils, TAMs, and DCs. The results showed that for these six cell types, the degree of immune infiltration was positively correlated with the risk score, and the correlations between all immune cells and the risk score were statistically significant (P < 0.05). These results indicated that these cells have a high level of immune infiltration in high-risk patients. TAMs are phagocytes, which are the body’s first line of defence against external threats; they can produce proinflammatory responses to pathogens and repair damaged tissues. However, cytokines and chemokines expressed by TAMs can inhibit antitumour immunity and promote tumour progression [79]. The expression of M1 macrophages in HCC can promote tumour formation by promoting the expression of PD-L1, and their infiltration degree is positively correlated with the expression of PD-L1. On the other hand, Ying Zhu et al. found that there was a positive correlation between the expression of SPP1 and PD-L1 and the infiltration of TAMs in HCC tissues, which played an important role in the immune microenvironment of HCC [80]. All these results suggested that our high-risk patients may benefit from PD-L1 treatment. Li Li et al. [81] illustrated that the CXCR2-CXCL1 axis can regulate neutrophil infiltration in HCC; this axis is an independent prognostic factor for HCC and may be a potential target for anti-HCC therapy. Overexpression of CXCL5 is associated with neutrophil infiltration and poor prognosis of HCC [82]. Wei Y et al. showed that the depletion of B cells can prevent the production of TAMs and increase the antitumour T cell response to inhibit the growth of HCC [83]. Several studies indicated that high infiltration levels of immunosuppressive TAMs and regulatory T cells are associated with reduced OS and can increase the aggressiveness of HCC [84, 85]. Research by Zhou ZJ et al. [86] illustrated that the high infiltration level of plasmacytoid DCs is related to the poor prognosis of HCC, and plasmacytoid DC infiltration in HCC can promote tumour progression by promoting the immunosuppression of CD4+ type 1 T regulatory (Tr1) cells [87], which is basically consistent with our research conclusions. The occurrence and development of HCC involve interactions between various immune cells, and participates in the regulation of HCC immunotherapy through a complex mechanism. Our IPM may be a predictor of increased infiltration of HCC immune cells. However, the correlation coefficient between the infiltration abundance of CD4+ T cells and CD8+ T cells and the risk score was less than 0.4. In addition, studies have shown that CD4+ T cells and CD8+ T cells can suppress the occurrence and proliferation of HCC due to their antitumour immune response [88]; however, another study indicated that the proportion and absolute number of CD4+ T cells in the area surrounding HCC tumour tissue increases significantly and could promote the progression of HCC [89]. Shinji Itoh et al. showed that the presence of CD8+T cells is associated with longer OS [90]. Although our conclusion is basically consistent with that of previous studies in terms of the immune infiltration of HCC, the differences we found were not was not related to any biologically relevant levels that should be interpreted outside this analysis in the future. Overall, this study can provide direction and guidance for the mechanism of immune cells in HCC, but the specific mechanism remains to be elucidated by further basic research.

We systematically and comprehensively analysed the application value of our IPM for HCC, which could provide new insights for the treatment of this disease. However, the current study still has certain limitations. The IPM constructed based on the TCGA database has the best predictive performance in the training set. In the testing set, the risk score is not the best indicator to predict the OS of HCC. Limited by clinical factors in the testing group, our model still needs to be further validated in other datasets. Although we increased the reliability of our conclusions by combining multiple datasets, evaluating various aspects of the IPM, and verifying our results using the GSE14520 database. Most of the studies are plagued by a lack of validation in vitro and in vivo validation experiments, and further evidence provided by a well-designed clinical study is needed. It is also essential to use relevant basic experiments to further explain the mechanism of HCC immunotherapy. On the other hand, we have studied six major immune cells, the correlation between immune cells and the risk score was weak, and the association between more immune cell subtypes and HCC is still unclear. Therefore, we suggest that further individual studies and discussions should be conducted in the future. We believe that the era of HCC immunotherapy will soon be realized, and we look forward to it.


To sum up, we constructed an IPM with seven prognostic IRGs by combining different data types from multiple databases. Individuals with HCC were automatically classified into high-risk and low-risk groups based on their risk scores, with gene expression as an independent variable. In addition, we established gene-related and clinical factor-related nomograms to facilitate more-comprehensive prognostic assessments of HCC patients. Finally, the results of the association between infiltration abundance of common immune cells in the TME and risk score showed that our IPM could predict the TME to a certain extent. This model will be a reliable tool for predicting prognosis in HCC by combining genomic characteristics, immune infiltration abundance, and clinical factors.

Availability of data and materials

The datasets for this study can be found in TCGA [] and GEO databases [].

Change history


  1. Villanueva A. Hepatocellular Carcinoma. N Engl J Med. 2019;380(15):1450–62.

    Article  CAS  PubMed  Google Scholar 

  2. El-Serag HB, Rudolph KL. Hepatocellular carcinoma: epidemiology and molecular carcinogenesis. Gastroenterology. 2007;132(7):2557–76.

    Article  CAS  PubMed  Google Scholar 

  3. Forner A, Reig M, Bruix J. Hepatocellular carcinoma. Lancet. 2018;391(10127):1301–14.

    Article  PubMed  Google Scholar 

  4. Khemlina G, Ikeda S, Kurzrock R. The biology of hepatocellular carcinoma: implications for genomic and immune therapies. Mol Cancer. 2017;16(1):149.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  5. Llovet JM, Zucman-Rossi J, Pikarsky E, Sangro B, Schwartz M, Sherman M, et al. Hepatocellular carcinoma. Nat Rev Dis Primers. 2016;2(1):16018.

    Article  PubMed  Google Scholar 

  6. Shlomai A, de Jong YP, Rice CM. Virus associated malignancies: the role of viral hepatitis in hepatocellular carcinoma. Semin Cancer Biol. 2014;26:78–88.

    Article  CAS  PubMed  Google Scholar 

  7. Park NH, Chung YH, Lee HS. Impacts of vaccination on hepatitis B viral infections in Korea over a 25-year period. Intervirology. 2010;53(1):20–8.

    Article  PubMed  Google Scholar 

  8. Kanwal F, Singal AG. Surveillance for hepatocellular carcinoma: current best practice and future direction. Gastroenterology. 2019;157(1):54–64.

    Article  PubMed  Google Scholar 

  9. Yarchoan M, Agarwal P, Villanueva A, Rao S, Dawson LA, Karasic T, et al. Recent developments and therapeutic strategies against hepatocellular carcinoma. Cancer Res. 2019;79(17):4326–30.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  10. Bellmunt J, de Wit R, Vaughn DJ, Fradet Y, Lee JL, Fong L, et al. Pembrolizumab as second-line therapy for advanced Urothelial carcinoma. N Engl J Med. 2017;376(11):1015–26.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  11. Ferris RL, Blumenschein G Jr, Fayette J, Guigay J, Colevas AD, Licitra L, et al. Nivolumab for recurrent squamous-cell carcinoma of the head and neck. N Engl J Med. 2016;375(19):1856–67.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  12. Larkin J, Chiarion-Sileni V, Gonzalez R, Grob JJ, Cowey CL, Lao CD, et al. Combined Nivolumab and Ipilimumab or Monotherapy in untreated melanoma. N Engl J Med. 2015;373(1):23–34.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  13. Mandal R, Chan TA. Personalized oncology meets immunology: the path toward precision immunotherapy. Cancer Discov. 2016;6(7):703–13.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  14. Motzer RJ, Escudier B, McDermott DF, George S, Hammers HJ, Srinivas S, et al. Nivolumab versus Everolimus in advanced renal-cell carcinoma. N Engl J Med. 2015;373(19):1803–13.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  15. Reck M, Rodriguez-Abreu D, Robinson AG, et al. Pembrolizumab versus chemotherapy for PD-L1-positive non-small-cell lung Cancer. N Engl J Med. 2016;375(19):1823–33.

    Article  CAS  PubMed  Google Scholar 

  16. Inarrairaegui M, Melero I, Sangro B. Immunotherapy of hepatocellular carcinoma: facts and hopes. Clin Cancer Res. 2018;24(7):1518–24.

    Article  CAS  PubMed  Google Scholar 

  17. Zongyi Y, Xiaowu L. Immunotherapy for hepatocellular carcinoma. Cancer Lett. 2020;470:8–17.

    Article  CAS  PubMed  Google Scholar 

  18. Heymann F, Tacke F. Immunology in the liver--from homeostasis to disease. Nat Rev Gastroenterol Hepatol. 2016;13(2):88–110.

    Article  CAS  PubMed  Google Scholar 

  19. Pitt JM, Marabelle A, Eggermont A, Soria JC, Kroemer G, Zitvogel L. Targeting the tumor microenvironment: removing obstruction to anticancer immune responses and immunotherapy. Ann Oncol. 2016;27(8):1482–92.

    Article  CAS  PubMed  Google Scholar 

  20. Chew V, Lai L, Pan L, Lim CJ, Li J, Ong R, et al. Delineation of an immunosuppressive gradient in hepatocellular carcinoma using high-dimensional proteomic and transcriptomic analyses. Proc Natl Acad Sci U S A. 2017;114(29):E5900–e5909.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  21. Joyce JA, Pollard JW. Microenvironmental regulation of metastasis. Nat Rev Cancer. 2009;9(4):239–52.

    Article  CAS  PubMed  Google Scholar 

  22. Grivennikov SI, Greten FR, Karin M. Immunity, inflammation, and cancer. Cell. 2010;140(6):883–99.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  23. Kulik L, El-Serag HB. Epidemiology and Management of Hepatocellular Carcinoma. Gastroenterology. 2019;156(2):477–491.e471.

    Article  PubMed  Google Scholar 

  24. Yarchoan M, Xing D, Luan L, Xu H, Sharma RB, Popovic A, et al. Characterization of the immune microenvironment in hepatocellular carcinoma. Clin Cancer Res. 2017;23(23):7333–9.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  25. Kurebayashi Y, Ojima H, Tsujikawa H, et al. Landscape of immune microenvironment in hepatocellular carcinoma and its additional impact on histological and molecular classification. Hepatology. 2018;68(3):1025–41.

    Article  CAS  PubMed  Google Scholar 

  26. Gnjatic S, Bronte V, Brunet LR, Butler MO, Disis ML, Galon J, et al. Identifying baseline immune-related biomarkers to predict clinical outcome of immunotherapy. J Immunother Cancer. 2017;5(1):44.

    Article  PubMed  PubMed Central  Google Scholar 

  27. Li X, Xu W, Kang W, Wong SH, Wang M, Zhou Y, et al. Genomic analysis of liver cancer unveils novel driver genes and distinct prognostic features. Theranostics. 2018;8(6):1740–51.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  28. Shen S, Wang G, Zhang R, Zhao Y, Yu H, Wei Y, et al. Development and validation of an immune gene-set based prognostic signature in ovarian cancer. EBioMedicine. 2019;40:318–26.

    Article  PubMed  Google Scholar 

  29. Wan B, Liu B, Huang Y, Yu G, Lv C. Prognostic value of immune-related genes in clear cell renal cell carcinoma. Aging. 2019;11(23):11474–89.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  30. Yang S, Wu Y, Deng Y, Zhou L, Yang P, Zheng Y, et al. Identification of a prognostic immune signature for cervical cancer to predict survival and response to immune checkpoint inhibitors. Oncoimmunology. 2019;8(12):e1659094.

    Article  PubMed  PubMed Central  Google Scholar 

  31. Long J, Wang A, Bai Y, Lin J, Yang X, Wang D, et al. Development and validation of a TP53-associated immune prognostic model for hepatocellular carcinoma. EBioMedicine. 2019;42:363–74.

    Article  PubMed  PubMed Central  Google Scholar 

  32. Moeini A, Torrecilla S, Tovar V, Montironi C, Andreu-Oller C, Peix J, et al. An immune gene expression signature associated with development of human hepatocellular carcinoma identifies mice that respond to Chemopreventive agents. Gastroenterology. 2019;157(5):1383–97 e1311.

    Article  CAS  PubMed  Google Scholar 

  33. Liang S, Ma HY, Zhong Z, Dhar D, Liu X, Xu J, et al. NADPH oxidase 1 in liver macrophages promotes inflammation and tumor development in mice. Gastroenterology. 2019;156(4):1156–72 e1156.

    Article  CAS  PubMed  Google Scholar 

  34. International Cancer Genome C, Hudson TJ, Anderson W, et al. International network of cancer genome projects. Nature. 2010;464(7291):993–8.

    Article  CAS  Google Scholar 

  35. Barrett T, Troup DB, Wilhite SE, Ledoux P, Rudnev D, Evangelista C, et al. NCBI GEO: archive for high-throughput functional genomic data. Nucleic Acids Res. 2009;37(Database issue):D885–90.

    Article  CAS  PubMed  Google Scholar 

  36. Zalocusky KA, Kan MJ, Hu Z, Dunn P, Thomson E, Wiser J, et al. The 10,000 Immunomes project: building a resource for human immunology. Cell Rep. 2018;25(7):1995.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  37. Ritchie ME, Phipson B, Wu D, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43(7):e47.

    Article  PubMed  PubMed Central  Google Scholar 

  38. Muller C, Schillert A, Rothemeier C, et al. Removing batch effects from longitudinal gene expression - Quantile normalization plus ComBat as best approach for microarray Transcriptome data. PLoS One. 2016;11(6):e0156594.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  39. Yu G, Wang LG, Han Y, He QY. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012;16(5):284–7.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  40. Imran QM, Hussain A, Lee S-U, Mun BG, Falak N, Loake GJ, et al. Transcriptome profile of NO-induced Arabidopsis transcription factor genes suggests their putative regulatory role in multiple biological processes. Sci Rep. 2018;8(1):771.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  41. Mei S, Meyer CA, Zheng R, Qin Q, Wu Q, Jiang P, et al. Cistrome Cancer: a web resource for integrative gene regulation modeling in Cancer. Cancer Res. 2017;77(21):e19–22.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  42. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13(11):2498–504.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  43. Chin CH, Chen SH, Wu HH, Ho CW, Ko MT, Lin CY. CytoHubba: identifying hub objects and sub-networks from complex interactome. BMC Syst Biol. 2014;8(Suppl 4):S11.

    Article  PubMed  PubMed Central  Google Scholar 

  44. Zhao Y, Simon R. Development and validation of predictive indices for a continuous outcome using gene expression profiles. Cancer Informat. 2010;9:105–14.

    Article  CAS  Google Scholar 

  45. Farazi PA, DePinho RA. Hepatocellular carcinoma pathogenesis: from genes to environment. Nat Rev Cancer. 2006;6(9):674–87.

    Article  CAS  PubMed  Google Scholar 

  46. Vickers AJ, Cronin AM, Elkin EB, Gonen M. Extensions to decision curve analysis, a novel method for evaluating diagnostic tests, prediction models and molecular markers. BMC Med Inform Decis Mak. 2008;8(1):53.

    Article  PubMed  PubMed Central  Google Scholar 

  47. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A. 2005;102(43):15545–50.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  48. Cerami E, Gao J, Dogrusoz U, Gross BE, Sumer SO, Aksoy BA, et al. The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov. 2012;2(5):401–4.

    Article  PubMed  Google Scholar 

  49. Ponten F, Schwenk JM, Asplund A, Edqvist PH. The human protein atlas as a proteomic resource for biomarker discovery. J Intern Med. 2011;270(5):428–46.

    Article  CAS  PubMed  Google Scholar 

  50. Nagy A, Lanczky A, Menyhart O, Gyorffy B. Validation of miRNA prognostic power in hepatocellular carcinoma using expression data of independent datasets. Sci Rep. 2018;8(1):9227.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  51. Chen F, Zhuang X, Lin L, Yu P, Wang Y, Shi Y, et al. New horizons in tumor microenvironment biology: challenges and opportunities. BMC Med. 2015;13(1):45.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  52. Kourou K, Exarchos TP, Exarchos KP, Karamouzis MV, Fotiadis DI. Machine learning applications in cancer prognosis and prediction. Comput Struct Biotechnol J. 2015;13:8–17.

    Article  CAS  PubMed  Google Scholar 

  53. Chen J, Gingold JA, Su X. Immunomodulatory TGF-beta signaling in hepatocellular carcinoma. Trends Mol Med. 2019;25(11):1010–23.

    Article  CAS  PubMed  Google Scholar 

  54. Lee KP, Lee JH, Kim TS, Kim TH, Park HD, Byun JS, et al. The hippo-Salvador pathway restrains hepatic oval cell proliferation, liver size, and liver tumorigenesis. Proc Natl Acad Sci U S A. 2010;107(18):8248–53.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  55. Cordenonsi M, Zanconato F, Azzolin L, Forcato M, Rosato A, Frasson C, et al. The hippo transducer TAZ confers cancer stem cell-related traits on breast cancer cells. Cell. 2011;147(4):759–72.

    Article  CAS  PubMed  Google Scholar 

  56. Lau AN, Curtis SJ, Fillmore CM, Rowbotham SP, Mohseni M, Wagner DE, et al. Tumor-propagating cells and yap/Taz activity contribute to lung tumor progression and metastasis. EMBO J. 2014;33(5):468–81.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  57. Wang D, Liu J, Liu S, Li W. Identification of crucial genes associated with immune cell infiltration in hepatocellular carcinoma by weighted gene co-expression network analysis. Front Genet. 2020;11:342.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  58. Su C. Survivin in survival of hepatocellular carcinoma. Cancer Lett. 2016;379(2):184–90.

    Article  CAS  PubMed  Google Scholar 

  59. Fernandez JG, Rodriguez DA, Valenzuela M, et al. Survivin expression promotes VEGF-induced tumor angiogenesis via PI3K/Akt enhanced beta-catenin/Tcf-Lef dependent transcription. Mol Cancer. 2014;13(1):209.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  60. Takegahara N, Kumanogoh A, Kikutani H. Semaphorins: a new class of immunoregulatory molecules. Philos Trans R Soc Lond Ser B Biol Sci. 2005;360(1461):1673–80.

    Article  CAS  Google Scholar 

  61. Kinugasa Y, Ishiguro H, Tokita Y, Oohira A, Ohmoto H, Higashiyama S. Neuroglycan C, a novel member of the neuregulin family. Biochem Biophys Res Commun. 2004;321(4):1045–9.

    Article  CAS  PubMed  Google Scholar 

  62. Kiavue N, Cabel L, Melaabi S, Bataillon G, Callens C, Lerebours F, et al. ERBB3 mutations in cancer: biological aspects, prevalence and therapeutics. Oncogene. 2020;39(3):487–502.

    Article  CAS  PubMed  Google Scholar 

  63. Zhu Y, Yang J, Xu D, Gao XM, Zhang Z, Hsu JL, et al. Disruption of tumour-associated macrophage trafficking by the osteopontin-induced colony-stimulating factor-1 signalling sensitises hepatocellular carcinoma to anti-PD-L1 blockade. Gut. 2019;68(9):1653–66.

    Article  CAS  PubMed  Google Scholar 

  64. Derry PJ, Hegde ML, Jackson GR, Kayed R, Tour JM, Tsai AL, et al. Revisiting the intersection of amyloid, pathologically modified tau and iron in Alzheimer's disease from a ferroptosis perspective. Prog Neurobiol. 2020;184:101716.

    Article  CAS  PubMed  Google Scholar 

  65. Gargini R, Segura-Collar B, Sanchez-Gomez P. Novel functions of the neurodegenerative-related gene tau in Cancer. Front Aging Neurosci. 2019;11:231.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  66. Zwicker BL, Agellon LB. Transport and biological activities of bile acids. Int J Biochem Cell Biol. 2013;45(7):1389–98.

    Article  CAS  PubMed  Google Scholar 

  67. Chu SJ, Zhang J, Zhang R, Lu WW, Zhu JS. Evolution and functions of stanniocalcins in cancer. Int J Immunopathol Pharmacol. 2015;28(1):14–20.

    Article  CAS  PubMed  Google Scholar 

  68. Cheng H, Wu Z, Wu C, Wang X, Liow SS, Li Z, et al. Overcoming STC2 mediated drug resistance through drug and gene co-delivery by PHB-PDMAEMA cationic polyester in liver cancer cells. Mater Sci Eng C Mater Biol Appl. 2018;83:210–7.

    Article  CAS  PubMed  Google Scholar 

  69. Xiao G, Jin LL, Liu CQ, Wang YC, Meng YM, Zhou ZG, et al. EZH2 negatively regulates PD-L1 expression in hepatocellular carcinoma. J Immunother Cancer. 2019;7(1):300.

    Article  PubMed  PubMed Central  Google Scholar 

  70. Bugide S, Green MR, Wajapeyee N. Inhibition of enhancer of zeste homolog 2 (EZH2) induces natural killer cell-mediated eradication of hepatocellular carcinoma cells. Proc Natl Acad Sci U S A. 2018;115(15):E3509–18.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  71. Sun S, Wang W, Luo X, Li Y, Liu B, Li X, et al. Circular RNA circ-ADD3 inhibits hepatocellular carcinoma metastasis through facilitating EZH2 degradation via CDK1-mediated ubiquitination. Am J Cancer Res. 2019;9(8):1695–707.

    CAS  PubMed  PubMed Central  Google Scholar 

  72. Lu C, Han HD, Mangala LS, Ali-Fehmi R, Newton CS, Ozbun L, et al. Regulation of tumor angiogenesis by EZH2. Cancer Cell. 2010;18(2):185–97.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  73. Guo J, Hao C, Wang C, Li L. Long noncoding RNA PVT1 modulates hepatocellular carcinoma cell proliferation and apoptosis by recruiting EZH2. Cancer Cell Int. 2018;18(1):98.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  74. Giannone G, Ghisoni E, Genta S, et al. Immuno-metabolism and microenvironment in cancer: key players for immunotherapy. Int J Mol Sci. 2020;21(12):4414.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  75. Bupathi M, Kaseb A, Meric-Bernstam F, Naing A. Hepatocellular carcinoma: where there is unmet need. Mol Oncol. 2015;9(8):1501–9.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  76. Zhu GZ, Liao XW, Wang XK, Gong YZ, Liu XG, Yu L, et al. Comprehensive investigation of p53, p21, nm23, and VEGF expression in hepatitis B virus-related hepatocellular carcinoma overall survival after hepatectomy. J Cancer. 2020;11(4):906–18.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  77. Hu B, Ding GY, Fu PY, Zhu XD, Ji Y, Shi GM, et al. NOD-like receptor X1 functions as a tumor suppressor by inhibiting epithelial-mesenchymal transition and inducing aging in hepatocellular carcinoma cells. J Hematol Oncol. 2018;11(1):28.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  78. Giovannini C, Bolondi L, Gramantieri L. Targeting notch3 in hepatocellular carcinoma: molecular mechanisms and therapeutic perspectives. Int J Mol Sci. 2016;18(1):56.

    Article  PubMed  PubMed Central  Google Scholar 

  79. Pathria P, Louis TL, Varner JA. Targeting tumor-associated macrophages in Cancer. Trends Immunol. 2019;40(4):310–27.

    Article  CAS  PubMed  Google Scholar 

  80. Zong Z, Zou J, Mao R, Ma C, Li N, Wang J, et al. M1 macrophages induce PD-L1 expression in hepatocellular carcinoma cells through IL-1beta signaling. Front Immunol. 2019;10:1643.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  81. Li L, Xu L, Yan J, et al. CXCR2-CXCL1 axis is correlated with neutrophil infiltration and predicts a poor prognosis in hepatocellular carcinoma. J Exp Clin Cancer Res. 2015;34:129.

    Article  PubMed  PubMed Central  Google Scholar 

  82. Zhou SL, Dai Z, Zhou ZJ, et al. Overexpression of CXCL5 mediates neutrophil infiltration and indicates poor prognosis for hepatocellular carcinoma. Hepatology. 2012;56(6):2242–54.

    Article  CAS  PubMed  Google Scholar 

  83. Wei Y, Lao XM, Xiao X, Wang XY, Wu ZJ, Zeng QH, et al. Plasma cell polarization to the immunoglobulin G phenotype in hepatocellular carcinomas involves epigenetic alterations and promotes Hepatoma progression in mice. Gastroenterology. 2019;156(6):1890–904 e1816.

    Article  CAS  PubMed  Google Scholar 

  84. Zhou J, Ding T, Pan W, Zhu LY, Li L, Zheng L. Increased intratumoral regulatory T cells are related to intratumoral macrophages and poor prognosis in hepatocellular carcinoma patients. Int J Cancer. 2009;125(7):1640–8.

    Article  CAS  PubMed  Google Scholar 

  85. Kuang DM, Wu Y, Chen N, Cheng J, Zhuang SM, Zheng L. Tumor-derived hyaluronan induces formation of immunosuppressive macrophages through transient early activation of monocytes. Blood. 2007;110(2):587–95.

    Article  CAS  PubMed  Google Scholar 

  86. Zhou ZJ, Xin HY, Li J, Hu ZQ, Luo CB, Zhou SL. Intratumoral plasmacytoid dendritic cells as a poor prognostic factor for hepatocellular carcinoma following curative resection. Cancer Immunol Immunother. 2019;68(8):1223–33.

    Article  PubMed  Google Scholar 

  87. Pedroza-Gonzalez A, Zhou G, Vargas-Mendez E, et al. Tumor-infiltrating plasmacytoid dendritic cells promote immunosuppression by Tr1 cells in human liver tumors. Oncoimmunology. 2015;4(6):e1008355.

    Article  PubMed  PubMed Central  Google Scholar 

  88. Mossanen JC, Tacke F. Role of lymphocytes in liver cancer. Oncoimmunology. 2013;2(11):e26468.

    Article  PubMed  PubMed Central  Google Scholar 

  89. Yang XH, Yamagiwa S, Ichida T, Matsuda Y, Sugahara S, Watanabe H, et al. Increase of CD4+ CD25+ regulatory T-cells in the liver of patients with hepatocellular carcinoma. J Hepatol. 2006;45(2):254–62.

    Article  CAS  PubMed  Google Scholar 

  90. Itoh S, Yoshizumi T, Yugawa K, et al. Impact of immune response on outcomes in hepatocellular carcinoma: association with vascular formation. Hepatology. 2020;72(6):1987.

    Article  CAS  PubMed  Google Scholar 

Download references


We thank LetPub ( and Nature Research Editing Service for its linguistic assistance during the preparation of this manuscript.


This work was supported by R & D projects in key areas of Guangdong Province, Construction of high-level university in Guangzhou University of Chinese Medicine (Grant number: A1-AFD018181A29), Guangzhou University of Chinese Medicine National University Student Innovation and Entrepreneurship Training Project (Project Leader: Xinqian Yang; grant number: 201810572038) and the First Affiliated Hospital of Guangzhou University of Chinese Medicine Innovation and Student Training Team Incubation Project (Project leader: Wenjiang Zheng; grant number: 2018XXTD003), and 2020 National College Student Innovation and Entrepreneurship Training Program of Guangzhou University of Chinese Medicine (Project leader: Ping Zhang; grant number: S202010572123).

Author information

Authors and Affiliations



QY and WJZ were responsible for research design and writing, and BQY were responsible for data and bioinformatics analysis. In the meantime, BQW made a great contribution to the revision process of our research. HYL was responsible for checking full-text grammatical errors, XWW guided research ideas, design, research methods, and manuscript revision. The author(s) read and approved the final manuscript.

Corresponding author

Correspondence to Xiongwen Wang.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential or actual conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Yan, Q., Zheng, W., Wang, B. et al. A prognostic model based on seven immune-related genes predicts the overall survival of patients with hepatocellular carcinoma. BioData Mining 14, 29 (2021).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: