Original research

Network-based hub biomarker discovery for glaucoma

Abstract

Objective Glaucoma is an optic neuropathy and the leading cause of irreversible blindness worldwide. However, the early detection of glaucoma remains challenging, as chronic forms of glaucoma remain largely asymptomatic until considerable irreversible visual field deficits have ensued. Thus, biomarkers that facilitate early diagnosis and treatment for glaucoma patients with a high risk of progression are pressing.

Methods and analysis Human disease–biomarker interactions network and human disease–target–drug interactions network were first constructed based on multiomics data. The greedy search algorithm was used to search for the hub biomarkers and drug targets for glaucoma. Genome-wide association studies and epidemiological data from the UK Biobank were used to verify our results. Biological network and functional analysis was conducted to find common network features and pathways.

Results We identified 10 hub biomarkers/drug targets for the diagnosis, treatment and prognosis for glaucoma. These results were verified by text mining and genomic/epidemiology data. We also predicted the new application of BMP1 and MMP9 to diagnose glaucoma and confirm the theory of hub biomarkers with multiple clinical applications. Further, relevant pivotal pathways for these hub biomolecules were discovered, which may serve as foundations for future biomarker and drug target prediction for glaucoma.

Conclusion We have used a network-based approach to identify hub diagnostic and therapeutic biomarkers for glaucoma and detected relationships between glaucoma and associated diseases. Several hub biomarkers were identified and verified, which may play more important roles in the diagnosis and treatment of glaucoma.

What is already known on this topic

  • Glaucoma is the leading cause of irreversible blindness worldwide. Early detection remains challenging because chronic forms of glaucoma are largely asymptomatic until significant visual field loss occurs. Biomarkers that enable early diagnosis and treatment for patients at high risk of progression are urgently needed.

What this study adds

  • This study identified 10 hub biomarkers and drug targets for the diagnosis, treatment and prognosis of glaucoma using disease–biomarker and disease–target–drug interaction networks. BMP1 and MMP9 were proposed as novel diagnostic biomarkers, and key pathways relevant to these hub biomolecules were discovered.

How this study might affect research, practice or policy

  • The identified biomarkers and pathways provide a foundation for future glaucoma biomarker and drug target prediction. This research may guide early diagnostic efforts, influence clinical treatment strategies and inspire further studies in the field of glaucoma.

Introduction

Glaucoma is one of the leading causes of irreversible blindness globally, characterised by retinal ganglion cell loss, retinal nerve fibre layer thinning and optic disc cupping.1 2 Glaucoma affects 80 million people and is undiagnosed in 9 of 10 affected people worldwide.1 Despite the considerable health burden attributable to glaucoma, the accurate diagnosis and treatment of glaucoma remain far from optimal3–5; indeed, elevated intraocular pressure is the only risk factor that can be treated with medications, laser procedures or glaucoma surgery at present.6 Further research to identify biomarkers critical to the early diagnosis and appropriate therapy for glaucoma is imperative in alleviating its associated burden of disease.7

The pathogenesis of glaucoma is mediated by multiple genes, while the encoded biomolecules have cross-interactions that are best studied as networks.8 A wide body of literature has attempted to elucidate the clinical correlations among genetic information, biomarkers and the pathogenesis of glaucoma. Several genes (eg, CDKN2B-AS1, CAV1 and CAV2, TMCO1, ABCA1, AFAP1, GAS7, TXNRD2, ATXN2, SIX1 and SIX6) associated with quantitative glaucoma-related traits (eg, intraocular pressure, central corneal thickness and optic disc size) have been identified in genome-wide association studies (GWAS).9–19 However, there is yet to be one clear pathway that best characterises this. It is likely the multifactorial pathogenesis of glaucoma best lends itself to network studies.

Besides intraocular pressure, oxidative stress, systemic and ocular vascular factors, elevated glutamate concentration, nitric oxide levels and an autoimmune process have also been implicated in glaucoma pathogenesis.1 20–25 Many biomarkers for glaucoma have been identified, including crystallins, heat shock protein 60 (HSP 60) and HSP 90, myotrophin, apolipoprotein B and apolipoprotein E, endothelial leucocyte adhesion molecule-1, myoblast determination protein 1, myogenin, vasodilator-stimulated phosphoprotein, ankyrin-2 and transthyretin.26–29 Additionally, a growing body of evidence supports targets including oxidative stress, and neuroinflammation might hold great potential for the treatment of glaucoma.30 31 However, given the heterogeneity of experiment samples and environments, it is hard to compare the statistical power and clinical effect for different biomarkers from different studies.

With the advent of the big data era, massive amounts of biomedical data are accumulated and stored on databases. For example, the STRING Database includes the interaction relationships of 19 257 human proteins,32 the Therapeutic Target Database (TTD) contains the interaction information between 1512 human diseases to 37316/3419 drugs/targets and 1313 biomarkers.33 This interaction information has helped many studies to detect the relationships for specific biomolecules and explain their biological function. Also, integrated with the interaction information stored in STRING and TTD, new interactions and functions for biomolecules have been predicted and applied. Hence, it is feasible to integrate the human disease–biomarker information and disease–target–drug information as networks to study them systematically. Our study aimed to fill the gap in analysing glaucoma biomarkers and drug targets based on complex biological interaction networks and found several hub (important points on networks) biomarkers and drug targets. We further identified hub pathways (important pathways) based on these hub biomarkers and drug targets, guiding future biomarkers and drug targets discovery for glaucoma. In addition, the multimorbidity network associated with glaucoma was fully assessed to provide possible underlying biomarkers and pathways.

Materials and methods

Figure 1 outlines the analysis pipeline for this study.

Figure 1
Figure 1

Study pipeline. This study contained two parts: the biomarker analysis and the drug–target analysis. The biomarker analysis came from downloading and integrating biomarkers–diseases interaction information to construct the human disease–biomarker interactions network (HDBIN). Then the greedy search was implemented to find the glaucoma-related hub diseases-biomarkers interaction network (GHDBIN). Several close diseases were found on the GHDBIN, and we used genome-wide association studies (GWAS) data from GWAS Catalog, text data from PubMed and epidemiology data from UK Biobank to verify the result. Based on the GHDBIN, we extracted the glaucoma hub biomarker network (GHBN). Further, the application information collected from PubMed and protein–protein interaction (PPI) information from STRING were added to the hub biomarker network. Gene expression data from GEO was used to verify the diagnosis value of hub biomarkers. Further, biological functional analysis was conducted to find key pathways for glaucoma hub biomarkers. The drug–target analysis started with constructing the human disease–target–drug interactions network (HDTDIN), and the glaucoma hub targets network (GHTN) was selected by the greedy form HDTDIN. PPI information was also annotated to GHTN and critical pathways for glaucoma drug targets were identified (cGMP-PKG signalling pathway, KEGG Permission ID: 231472). TTD, Therapeutic Target Database.

Data acquisition

Biomarker–disease interaction information, drug–target interaction information and target–disease interaction information were downloaded from the TTD (http://idrblab.net/ttd/).

Protein–protein interaction (PPI) information and pathway information were collected from the STRING Database (https://string-db.org/), the KEGG Database (https://www.genome.jp/kegg/),34 the Reactom Database (https://reactome.org/)35 and the Gene Ontology Database (http://geneontology.org/).36

Disease-related GWAS data were downloaded from the GWAS Catalog Database (https://www.ebi.ac.uk/gwas/).37

The UK Biobank (UKB) Database (https://www.ukbiobank.ac.uk/) provided epidemiological data regarding the patients’ situation.38

Gene expression data were obtained from the GEO Database (https://www.ncbi.nlm.nih.gov/geo/, GSE2378, GPL8300 platform).39 40 The ‘GEOquery’ package in R language was used to download the gene expression data. GSE2378 contained four pairs of normal human eyes from MidAmerica Transplant Services (St. Louis, Missouri, USA), and two pairs of glaucoma from the Glaucoma Research Foundation (San Francisco, California, USA). All donors were Caucasians. Gene expression data were generated by cDNA microarray.

Construction of complex networks

Human disease–biomarker interactions network (HDBIN)

We integrated and normalised the human diseases and biomarkers information as standard interaction format in the graph, using the Human Disease Ontology Database (https://www.ebi.ac.uk/ols/ontologies/doid)41 and NCBI Protein (https://www.ncbi.nlm.nih.gov/protein/)42 and Gene (https://www.ncbi.nlm.nih.gov/gene/) databases.43 Based on their interactions, we created a network, which we designated as the HDBIN. Using Cytoscape V.3.8.2 software, diseases and biomarkers were marked as points and their relationships as edges.44

Human disease–target–drug interactions network (HDTDIN)

Meanwhile, the interaction data of biomarker–disease, drug–target and target–disease downloaded from TTD were also systematically reorganised to construct the HDTDIN. Using Cytoscape, diseases, drugs and drug targets were represented as distinct nodes, with edges illustrating the interactions between them. This network allowed for a comprehensive visualisation of how drugs and their molecular targets are connected to various diseases, providing a clearer understanding of potential therapeutic pathways.

Searching for hub networks

The ‘greedy search algorithm’, a clustering method that takes the best or optimal choice in the current state in each step of the selection, on the jActiveModules plugin of Cytoscape was used to find the most active network models (hub networks: the most active small network clustered from the big network) for glaucoma based on HDBIN and HDTDIN. Seven network topology features were selected in searching, and they were: topological coefficient, average shortest path length, closeness centrality, clustering coefficient, radiality, betweenness centrality and neighbourhood connectivity. In greedy search cluster, the number of submodels was set as 1000, and the overlap threshold was 0.8. The jActiveModules plugin calculated the active levels of clustered networks, and we selected the glaucoma belonged active model as the glaucoma-related hub diseases-biomarkers interaction network (GHDBIN).

The Pesca plugin on Cytoscape was then used to calculate the shortest paths from other diseases to glaucoma on the GHDBIN, to find the diseases close to glaucoma on the network. We have hereafter termed these diseases as glaucoma network neighbour diseases (GNND).

Glaucoma-related diseases relationships verification test on GWAS data

In order to identify the gene-level relationships for glaucoma and its closest diseases on HNDBIN (network distance very close to glaucoma) found on GHDBIN, GWAS data was used to find the genes that were corelated to glaucoma and other diseases (designated hereafter as inter-genes). Biological network and functional analysis was subsequently conducted for these cospecific genes to explore their biological relationships and identify key pathways.

GNND verification test on epidemiological data

The GNND on HNDBIN were verified using the UKB on epidemiological level. The UKB is a population-based cohort of more than 500 000 participants aged 40–73 years who attended one of 22 assessment centres throughout the UK between 2006 and 2010. ORs and 95% CIs for diseases associated with glaucoma were evaluated using logistic regression model. Data analyses were conducted using SAS V.9.4 software for Windows (SAS Institute).

Hub biomarker analysis

We selected biomarkers from the GHDBIN to test their relationships on the PPI network. Biological functional analysis was conducted to identify pivotal pathways for glaucoma hub biomarkers.

Identification of biomarkers

In order to verify their diagnostic value on biological samples, the Wilcoxon test and receiver operating characteristic curve (ROC) test were conducted using gene expression data by R language, using the gene expression data of patients and controls downloaded from GSE2378. A p value<0.05 and ROC>0.7 were considered as the cut-off for biomarker candidates.45

Glaucoma hub drugs-targets network (GHTN) analysis

The GHTN generated by application of the greedy search algorithm on HDTDIN was further analysed by biological network and functional analysis to find the critical pathways for the prediction of glaucoma drug targets.

Biological functional analysis

The STRING Database and clusterProfiler package on the R language conducted KEGG pathway enrichment analysis and Gene Ontology annotation.

The STRING Database also generated UniProt annotated keywords analysis and Reactome pathway enrichment analysis.

Results

HDBIN and GHDBIN identified network-based close diseases to glaucoma

Human diseases and biomarkers interaction information was integrated to construct the HDBIN. HDBIN contained 1480 nodes and 2512 edges, including 167 Human diseases and 1313 of their related biomarkers. (figure 2A and online supplemental table S1) The GHDBIN identified by the greedy search was illustrated in figure 2B, which contained 68 glaucoma-related diseases and 123 of their related biomarkers. We calculated the shortest paths between other diseases to glaucoma on GHDBIN (online supplemental table S2). The 11 GNND (shortest path=2) was shown on figure 2C. We searched on PubMed to find the research related to glaucoma and these diseases and found that nine diseases have been reported together with glaucoma (figure 2D).

Figure 2
Figure 2

Human disease–biomarker interactions network (HDBIN) and glaucoma-related hub diseases-biomarkers interaction network (GHDBIN) identified network-based close diseases to glaucoma. (A) HDBIN. Red points represented diseases and green points represented biomarkers. (B) GHDBIN. The purple point was glaucoma. BDNF and MMP9 were key glaucoma biomarkers connected with other diseases. (C) Glaucoma core network on GHDBIN. The shape of points represented a significant level of OR from UK Biobank data: diamond meant significant and rectangle meant not significant. Point size reflected patients number with glaucoma: the bigger the more. (D) Closest diseases to glaucoma on GHDBIN. This histogram reflected the sample size, related paper numbers, number of overlapping genes and pathways for glaucoma and these diseases.

Verification test for GNND on GHDBIN

To facilitate further validation of our findings at a gene level, we downloaded GWAS data for each of the 11 GNND and mapped out the overlap of their corresponding inter-genes. GWAS data for glaucoma and eight other diseases were found, and three diseases (asthma, hepatitis B (HB) and myocardial infarction (MI)) had overlapped specific genes with glaucoma. Online supplemental figure S1 include the LocusZoom plot, which showed the single-nucleotide polymorphism distribution of GWAS data.

For asthma and glaucoma, 27 inter-genes were identified (figure 3A), among which only PTHLH-ETS1, ATXN2-RBFOX1 and ATXN2-HERC2 showed significant interaction in the PPI network (figure 3B). Biological functional analysis demonstrated that these genes were mapped on five cellular component pathways: cytoplasmic ribonucleoprotein granule, ribonucleoprotein granule, trans-Golgi network, cytoplasmic stress granule and histone deacetylase complex (figure 3C).

Figure 3
Figure 3

Genomic verification test results for glaucoma closest diseases on glaucoma-related hub diseases-biomarkers interaction network. (A) There were 27 overlapping genes for glaucoma and asthma from GWAS data. (B) Protein–protein interaction (PPI) network for glaucoma–asthma genes. Only PTHLH-ETS1 and ATXN2-RBFO1 showed significant connections. (C) Pathway enrichment analysis results for glaucoma–asthma genes. These genes were mapped mainly on cytoplasmic ribonucleoprotein granule, ribonucleoprotein granule, trans-Golgi network, cytoplasmic stress granule and histone deacetylase complex. (D) Glaucoma shared three disease-specific genes with hepatitis B. (E) No connection was identified for the three glaucoma–hepatitis B genes. (F) Pathway enrichment results for glaucoma–hepatitis B genes. These genes were mapped on the Metabolism pathway. (G) Nine genes were identified as glaucoma–myocardial infarction (MI) shared disease-specific genes. (H) PPI network for glaucoma–MI genes. (I) GMDS and FNDC3B showed a significant relationship on PPI network.

There were three overlapping genes (DERA, PLCL1 and GLIS3) for HB and glaucoma (figure 3D,E), which were not demonstrated to connect on the PPI level (figure 3E). Gene Ontology analysis showed that these genes were associated with trans-Golgi network, cytoplasmic stress granule, cytoplasmic ribonucleoprotein granule, ribonucleoprotein granule and histone deacetylase complex (figure 3F). Reactome pathway enrichment analysis showed that these three genes were all related to the Metabolism pathway (figure 3G).

Nine genes (CDKN2B-AS1, AC003084.1, AC007568.1, GMDS, ATXN2, ZFPM2, FNDC3B, ABO and SMG6) overlapped by MI-specific and glaucoma-specific genes in GWAS data (figure 3H) and only GMDS and FNDC3B showed significant connections on the PPI network (figure 3I). No significant enriched pathways were found for these genes.

Online supplemental table S3 presents the verification results by the UKB data. There were five closest diseases for glaucoma from GHDBIN found on the UKB, and asthma (OR: 1.10, p value: 0.028) and dry eye disease (OR: 2.16, p value: 0.00171) showed significant relationships with glaucoma.

Glaucoma hub biomarkers network (GHBN) presented essential biomarkers for glaucoma

From GHDBIN, we extracted the hub biomarkers (BNDF, CS, MMP9, SAA1, TUBA1A, DARC, BMP, CTAM, NOS3 and GFAP) for glaucoma and integrated them with PPI information, application information (diagnosis, treatment, prognosis) from published papers and area under the curve (AUC)s on diagnostic tests to construct the GHBN (figure 4A).

Figure 4
Figure 4

Analysis of glaucoma hub biomarkers, including their network, expression levels and diagnostic performance. (A) Glaucoma hub biomarkers network. Different colours represented different applications for these hub biomarkers reported in published papers : diagnosis/treatment (blue), diagnosis/treatment/prognosis (purple), treatment (red), treatment/prognosis (orange) and prognosis (yellow). (B) Expression of hub biomarkers in glaucoma and controls. (C) Diagnosis receiver operating characteristic curves for hub biomarkers.

Meanwhile, in order to find common pathways for hub biomarkers, pathway enrichment analysis was conducted. Online supplemental table S4 presents the significant mapped pathways. On the biological process level, seven biomarkers (BMPER, NOS3, ACKR1, MMP9, SAA1, BDNF and GFAP) were mapped on the regulation of the multicellular organismal process and six biomarkers (BMPER, NOS3, MMP9, SAA1, BDNF and GFAP) were enriched on positive regulation of the multicellular organismal process and regulation of localisation. At the cellular component level, six biomarkers were mapped on the cytoplasmic vesicle and four were annotated on methylation by UniProt.

Diagnostic test results for glaucoma hub biomarkers

The GSE2378 (platform: GPL8300) gene expression dataset from the GEO Database was selected to conduct the diagnostic test for hub glaucoma biomarkers, which included seven glaucoma cases and six controls. Figure 4B presented the gene expression comparison between glaucoma and controls in different biomarkers. BMP1 (p value=0.035 in Wilcoxon text) showed significant differences between patients and controls.

Diagnostic ROC tests were conducted to verify the diagnostic or prognostic values for these hub biomarkers, of which GFAP (AUC=0.786), MMP9 (AUC=0.619), TUBA1A (AUC=0.833) and BMP1 (AUC=0.857) showed significant diagnostic effects (figure 4C). Earlier literature has previously reported the utility of GFAP6 and MMP946 as diagnostic biomarkers for glaucoma. BMP1 has been reported as both therapeutic47 and prognostic45 biomarkers, while TUBA1A has not been previously reported as any glaucoma biomarker. Previous studies have proven that specific biomarkers are likely to perform multiple functions in diagnosing, treating and prognosis of complex diseases.48 Therefore, we posit that BMP1 and TUBA1A may have future value as diagnostic biomarkers for glaucoma.

HDTDIN identified essential drug targets for glaucoma

HDTDIN contains nodes and 33 385 edges, including 1481 human diseases, 2833 drug targets and 29 071 drugs (figure 5A and online supplemental table S5).

Figure 5
Figure 5

Network analysis and biological functional analysis of disease–target–drug interactions and glaucoma hub drug targets. (A) Human disease–target–drug interactions network. (B). Glaucoma hub drug targets network. (C) Protein–protein interaction network for glaucoma hub drug targets. (D) Biological functional analysis results. All five protein hub targets were mapped on signal transduction and development process.

The greedy search identified six hub glaucoma drug targets: ADRB3, REN, AKR1B1, CaC, ADR and ROCK, and we constructed the glaucoma hub drugs-target networks (GHTN) (figure 5B). We put these targets on the PPI network and found that ADRB3, REN and ADRB2 have been proven with strong relationships (figure 5C). Also, biological functional analysis found that signal transduction and developmental process on biological process were enriched by all hub protein targets (figure 5D).

Discussion

Along with the development of both traditional and computational experiments, many biomarkers have been reported for glaucoma.49 Our study identified hub glaucoma biomarkers based on HDBIN, which collected all the human biomarkers–diseases interaction information. Then relevant biological functional analysis was conducted, and we found several pivotal pathways for these hub biomarkers. These pathways may be foundations for future hub biomarker discovery for glaucoma. Meanwhile, diseases relationships for glaucoma were explored on HDBIN. Further, some hub drug targets and their essential pathways for glaucoma have been identified on HDTDIN. In summary, our study provides key pathways for hub biomarkers/drug targets which are important for clinical-based biomarker/drug discovery in glaucoma.

Analysing diseases and biomarkers in system networks could give a broader view for complex disease research: different from traditional methods, network analysis could investigate disease situation as a connected system.8 To the best of our knowledge, this is the first study to identify hub biomarkers and related pathways in glaucoma using a network medicine analytical approach. Our study finds 10 glaucoma hub biomarkers, compared with PubMed documentations, five have been used in a single application: GTAM,50 DARC51 and SAA152 have been used as treatment biomarkers, and CS53 and NOS354 have been used as prognosis biomarker. Except TUBA1A not found in PubMed, other four hub biomarkers have been used in multiple ways: GFAP6 55 and MMP946 56 have been applied as diagnosis and treatment biomarkers, BMP47 48 has been reported as treatment and prognosis biomarkers and BDNF57 58 has been reported as diagnosis and treatment and prognosis biomarker. In addition to the biomarkers that have previously been identified in earlier research, we predict that BMP1 and TUBA1A could add their diagnosis ability in the future, which also conforms to our theory that hub biomarkers always conduct multiple uses in complex diseases.59

Several critical pathways for glaucoma hub biomarkers were identified. No previous studies have reported the definite relationships between glaucoma biomarker or positive regulation of the multicellular organismal process. Hence, we suggest that multicellular organismal process may play a key role in glaucoma and could be a new direction for glaucoma biomarker discovery. Many studies have proven the relationships between glaucoma biomarker with regulation of localisation60 and cytoplasmic vesicle.61 Four hub biomarkers were enriched on methylation, which add the evidence for DNA methylation is a universal biomarker for complex diseases.62

We also investigated the disease–disease association relationships on HDBIN in this study. We discovered 11 glaucoma-closest diseases on GHDBIN and verified in genomic and epidemiology data results based on network topology analysis for the first time. Network analysis adds a new feature for detecting disease associations and could widely exclude the heterogeneity of experiments. Further, network relationships could reflect inside disease associations which may not be found by traditional statistical analysis.

Many drugs have been used in the treatment of glaucoma. Based on drug–targets networks, finding the hub drug targets to predict new drugs may be a new strategy for glaucoma drug discovery. This study is the first to identify hub drug targets for glaucoma and found six hub targets: ADRB3, REN, AKR1B1, ADR, ROCK and CaC. These hub targets play important role in biological process: ADRB3 belongs to the beta-adrenergic receptors’ family, involved in the regulation of fat and thermogenesis. REN participates in regulating blood pressure and electrolyte balance. AKR1B1 is involved in the development of diabetes complications by catalysing the reduction of glucose to sorbitol. ADR acts as a calcium release channel in the sarcoplasmic reticulum. ROCK plays an important role in cytokinesis. Pathway enrichment analysis found that signal transduction and developmental process were mapped by all five protein hub targets, which may be critical pathways for future glaucoma drug targets prediction. Signal transduction pathway is used to transmit the information of the ligand to the change in the biological activity of the target cell.63 Our study adds evidence for signal transduction as an important platform for drug discovery.64

Conclusions

In conclusion, we have used a network-based approach to identify hub diagnostic and therapeutic biomarkers for glaucoma and detected relationships between glaucoma and associated diseases. Several hub biomarkers were identified and verified, which may play more important roles in the diagnosis and treatment of glaucoma. Our approach opened a new door for the biomarker comparison of glaucoma.