Introduction to metabolomics

Attempts to understand the contribution of individual genes and environmental triggers to different inflammatory disease groups have driven many hypothesis-based research studies. Inherent in this scientific process, however, is the issue that the selection of a particular hypothesis to investigate also runs the risk of missing other, perhaps more complex, interactions which may be of greater importance.

A novel approach which may allow integration of these multiple interacting factors which impact on disease is metabolomics [1]. The general aim of metabolomics is to identify, measure and interpret the complex time-related concentration, activity and flux of endogenous metabolites in cells, tissues and other biosamples such as blood, urine and saliva. Metabolites include not only small molecules that are the products and intermediates of metabolism but also carbohydrates, peptides and lipids, many of which may also derive from the diet or may be altered in disease. The approach is highly applicable to human studies and takes into account the whole spectrum of variables, including genetic background, diet, drug treatment and environment, which act together to influence metabolism. Just as genomics involves the study of gene expression and proteomics the expression of proteins, metabolomics investigates the consequences of the activity of these genes and proteins.

Many diseases, for example rheumatoid arthritis (RA), are complex polygenic conditions. In RA there is a strong contribution of a number of genes, in particular major histocompatibility complex class II and protein tyrosine phosphatase PTPN22 [2], but in spite of this, studies on monozygotic twins indicate that the genetic component does not account for all of the risk of disease [3]. Other, non-genetic factors must therefore contribute, for example age and infections or the post-translational modification of arginine to produce citrulline, which promotes autoantibodies in RA patients [4]. One of the other major risk factors for RA is smoking, and the level of the increased risk was shown in a study of monozygotic twins [5]. Smoking also plays a significant part in increasing the risk for age-related macular degeneration [6]. Smoking has profound effects on the oxidative metabolic environment in vivo: plasma levels of glutathione (GSH) have been reported to be 25% lower in smokers than in controls [7], which could promote oxidative damage to tissues and lead to widespread changes in metabolism. Diet may also play a role. A recent study found a relationship between dietary ascorbate and susceptibility to inflammatory polyarthritis, suggesting that antioxidant deficiency may predispose to disease [8], while alcohol consumption may be protective [9, 10]. It may be most appropriate, however, to view genes and metabolism as inseparably linked when trying to understand human diseases. This was highlighted recently by the observation that metabolic profiles are highly heritable, at least in the context of cardiovascular disease [11].

Acquisition of metabolomic data

The importance of the complex array of metabolites in human physiology has long been recognised, and as early as 1971 Linus Pauling developed methods based on gas–liquid chromatography to separate 280 metabolites in human urine [12]. Recent technical advances in nuclear magnetic resonance (NMR) and mass spectrometry have allowed extremely high-density data sets to be constructed from individuals by examining the changes in hundreds or thousands of low-molecular-weight metabolites in intact tissues or biofluids. NMR-based metabolomics offers several distinct advantages in a clinical setting since it is relatively quick and can be carried out on standard preparations of blood cells, serum, plasma, urine or other body fluids. To maximise sensitivity and minimise the size of the samples needed, most NMR-based metabolomic analyses make use of moderately high-field NMR instruments (500 to 800 MHz), often equipped with a cryo-probe, which minimises the electronic noise in the detection system and thus improves the signal-to-noise ratio. Higher volumes of sample are required for an instrument which requires the sample to be in a glass tube; however, it is possible to derive good-quality spectra from 100 μL of vitreous fluid [13] with such an instrument. Some systems are now available that inject a sample directly into a sample chamber in the magnet, and these can require as little as 30 μL of sample, which makes possible the analysis of less readily available samples. Proteins in serum can interfere with the quality of the spectrum derived from low-molecular-weight metabolites, and so removal of these by filtration can significantly enhance the quality of the NMR spectrum derived [14]. Separation of samples into hydrophilic, water-soluble metabolites and those which are hydrophobic or bound to proteins [14] also allows a greater depth of information to be derived from a particular sample.

From the information contained within the data set, it is possible to establish a relationship between metabolite levels and cellular responses, which provides a powerful means of exploring the biochemical consequences of disease [15]. NMR spectra of biofluids are highly complex, containing signals from hundreds of metabolites that represent many key biochemical pathways. To make the spectra tractable for analysis it is usual to segment the spectra into small regions [16] to allow processing using a number of statistical approaches. Pattern recognition methods (principal component and partial least-squares analysis, see below) allow the complex biofluid/tissue NMR data to be reduced and analysed quantitatively to provide pattern recognition maps that can assist in disease classification. Metabolomic diagnostics can be extremely sensitive for the detection of low-level damage in a variety of organ systems and are potentially a powerful new adjunct to conventional pathological procedures and an aid in functional genomics problems [17]. The multiplexed analysis inherent in this approach, which takes into account all metabolite signals regardless of whether they have been specifically identified, is able to provide information not available by other means.
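The segmentation of a spectrum into small regions can be illustrated with a minimal Python sketch. The function name `bin_spectrum`, the 0 to 10 ppm window, the 0.5 ppm bin width and the normalisation to total intensity are all assumptions chosen for illustration; they are not the settings used in the studies cited here, and the toy peak list is invented.

```python
def bin_spectrum(ppm, intensity, lo=0.0, hi=10.0, width=0.5):
    """Sum intensities into fixed-width chemical-shift bins and scale to unit total.

    All parameter defaults are illustrative assumptions, not published settings.
    """
    n_bins = int(round((hi - lo) / width))
    bins = [0.0] * n_bins
    for shift, value in zip(ppm, intensity):
        if lo <= shift < hi:
            # The clamp guards against floating-point rounding at the upper edge.
            idx = min(int((shift - lo) / width), n_bins - 1)
            bins[idx] += value
    total = sum(bins)
    # Scaling to total intensity is one common (assumed) way to compare samples.
    return [b / total for b in bins] if total else bins


# Invented toy spectrum: four peaks as (chemical shift in ppm, intensity).
ppm = [0.1, 0.3, 1.2, 9.9]
intensity = [1.0, 1.0, 2.0, 4.0]
binned = bin_spectrum(ppm, intensity)
```

Each spectrum is reduced to one row of bin values, so a set of samples becomes a samples-by-bins table that the pattern recognition methods can process.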

As stated, while genomic and proteomic techniques have been valuable in generating data and novel hypotheses, they are ultimately dependent on a candidate gene/protein approach, which is costly and time consuming for the study of significant numbers of patients. In contrast, metabolomics is low cost, reproducible and, with bioinformatic analysis of the data, easily translated into a clinical test that could inform future therapy. Furthermore, because a metabolic profile is summative of all the biochemical processes occurring in the body at a given time, it makes no presumption about the relative importance of these processes and so is more likely than a candidate approach to be able to highlight differences between disease groups and to identify changes occurring during therapy. In a recent SWOT (strengths, weaknesses, opportunities and threats) analysis of metabolomics, strengths included the ability to identify molecules as the ultimate manifestation of a biological process, and opportunities included the identification of predictive biomarkers in disease states [18]. Weaknesses included the lack of comprehensive databases for metabolite identification, although such resources are now available, including the Human Metabolome Database [19, 20]. A basic point is the ability of the methodology to discern differences between normal metabolic homeostasis and changes induced by the disease state. This has been demonstrated in metabolomic analysis coupled with mathematical modelling and offers a systems biology approach that will provide diagnostic biomarkers in inflammatory conditions [21]. Reflecting this potential, metabolomics has been highlighted for funding in the recently published roadmap of the US National Institutes of Health [22].

Analysis of metabolomic data

Principal components analysis (PCA) is a particularly useful statistical technique for the analysis of complex datasets such as metabolite or transcript profiles. Although it was first developed many years ago [23, 24], its application has become more widespread [25–29] with the ready availability of personal computers since it provides a means of data simplification while retaining the main features of the data. It involves a linear transformation that chooses a coordinate system such that the greatest variance by any projection of the data lies on the first axis (principal component 1 (PC1)), the second greatest variance on the second axis (PC2), and so on. PCA reduces the dimensionality in a dataset while retaining those characteristics that contribute most to variance. Often the first stage is to standardise the data by subtracting the mean and dividing by the standard deviation. This sets the centroid of the data at zero. PCA chooses PC1 as the line that goes through the centroid and minimises the sum of the squared distances of the points from that line, so the line goes through the maximum variation of the data. Each subsequent PC must also go through the centroid, but must be uncorrelated with PC1, i.e. at right angles or orthogonal to the PC1 axis. PC values can then be compared with each other.
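The steps described above (standardise, find the direction of greatest variance, then an orthogonal second direction) can be sketched in plain Python. This is only an illustration under stated assumptions: power iteration with deflation is one simple way to obtain the leading components and is not necessarily the algorithm used in the cited studies, the function names are hypothetical, and the six-sample, three-metabolite table is invented.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def standardise(rows):
    """Subtract each column mean and divide by its standard deviation."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    # Assumes no column is constant (a zero standard deviation would divide by zero).
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]


def covariance(z):
    """Covariance matrix of already-centred data."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_component(cov, iters=500):
    """Power iteration: unit vector of greatest variance and that variance."""
    p = len(cov)
    v = [float(i + 1) for i in range(p)]  # arbitrary asymmetric start vector
    norm = math.sqrt(dot(v, v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [dot(cov[i], v) for i in range(p)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:  # no variance left in this matrix
            return v, 0.0
        v = [x / norm for x in w]
    return v, dot(v, [dot(cov[i], v) for i in range(p)])


def first_two_components(rows):
    z = standardise(rows)
    cov = covariance(z)
    p = len(cov)
    pc1, var1 = leading_component(cov)
    # Deflation removes the PC1 direction so the next component is orthogonal to it.
    deflated = [[cov[i][j] - var1 * pc1[i] * pc1[j] for j in range(p)]
                for i in range(p)]
    pc2, var2 = leading_component(deflated)
    scores = [(dot(r, pc1), dot(r, pc2)) for r in z]
    return pc1, pc2, var1, var2, scores


# Invented data: 6 samples (rows) by 3 metabolite levels (columns).
samples = [
    [1.0, 2.1, 0.5],
    [2.0, 3.9, 0.7],
    [3.0, 6.2, 0.4],
    [4.0, 7.8, 0.9],
    [5.0, 10.1, 0.6],
    [6.0, 12.0, 0.8],
]
pc1, pc2, var1, var2, scores = first_two_components(samples)
```

Plotting the two values in each entry of `scores` against each other gives the familiar PC1 versus PC2 scores plot, in which samples with similar profiles lie close together.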

Partial least-squares discriminant analysis (PLS-DA) is a method based on partial least-squares regression (PLS regression), which bears some relation to PCA; however, instead of finding the hyperplanes of maximum variance, it finds a linear model describing some predicted variables in terms of other observable variables. It is based on a linear transformation from a large number of original descriptors to a small number of orthogonal factors (latent variables), providing the optimal linear model in terms of predictivity. A PLS-DA model will try to find the multidimensional direction in the X space (the measured variables) that explains the direction of maximum multidimensional variance in the Y space (the class membership). Such modelling techniques should be used with caution, since over-fitting of the data can lead to incorrect interpretation [30].
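A minimal sketch of a PLS-DA-style classifier, with a guard against over-fitting, is given below. It makes several assumptions that are not taken from the cited work: only a single latent variable is extracted, classes are coded 0 and 1 with a 0.5 decision threshold, and leave-one-out cross-validation is used as the over-fitting check. The function names and the toy data are hypothetical.

```python
import math


def fit_pls1(X, y):
    """Fit a one-latent-variable PLS regression of y (coded 0/1) on X."""
    n, p = len(X), len(X[0])
    x_mean = [sum(r[j] for r in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[r[j] - x_mean[j] for j in range(p)] for r in X]
    yc = [v - y_mean for v in y]
    # The weight vector is the direction in X most covariant with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(a * a for a in w))
    if norm == 0.0:  # X carries no information about y
        return x_mean, y_mean, [0.0] * p, 0.0
    w = [a / norm for a in w]
    t = [sum(a * b for a, b in zip(row, w)) for row in Xc]  # latent scores
    q = sum(a * b for a, b in zip(t, yc)) / sum(a * a for a in t)
    return x_mean, y_mean, w, q


def predict_class(model, x):
    x_mean, y_mean, w, q = model
    score = sum((a - m) * b for a, m, b in zip(x, x_mean, w))
    return 1 if y_mean + q * score > 0.5 else 0


def loo_accuracy(X, y):
    """Leave-one-out: each sample is predicted by a model that never saw it."""
    hits = 0
    for i in range(len(X)):
        model = fit_pls1(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        hits += predict_class(model, X[i]) == y[i]
    return hits / len(X)


# Invented data: the first variable separates the classes, the second is noise.
X = [[1.0, 5.0], [1.2, 4.8], [0.9, 5.2], [3.0, 5.1], [3.2, 4.9], [2.9, 5.0]]
y = [0, 0, 0, 1, 1, 1]
full_model = fit_pls1(X, y)
cv_accuracy = loo_accuracy(X, y)
```

A large gap between the accuracy on the training data and the cross-validated accuracy is one warning sign of the over-fitting described above.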

Metabolomics in healthy individuals

In a longitudinal study of metabolites in the blood and urine of healthy controls, there was relatively little variability between subjects and study days. This provides reassurance that metabolomic data have acceptable variability and may highlight biomarkers of disease [31]. Collection and storage of samples for metabolomic studies must also be considered. Blood or other biofluids, such as those from the eye, need to be collected in a systematic and consistent manner to minimise variation in the metabolites between samples and then stored at a temperature that minimises deterioration of the metabolites prior to analysis. The antibacterial preservative borate is commonly added to clinical urine samples; however, this had little effect on metabolomic profiles compared to inter-person variation [32].

The largest study of metabolomic profiles in healthy individuals analysed urine samples from 4,600 people of different ethnic groups. PCA showed that North Chinese, South Chinese, Japanese and UK/American subjects split into four distinct groups and that Japanese living in Japan differed from Japanese-Americans. It was suggested that in part this was due to diet, gut microbial composition and metabolic profile. Formate was shown to be linked to blood pressure, while alanine and hippurate were linked to gut microbes. Formate is a by-product of carbon metabolism via the activities of mitochondrial and cytosolic serine hydroxymethyltransferases and the tetrahydrofolate reductase pathway. It can also be a product of fermentation of dietary fibre by the gut microbiome. Alanine was higher in people consuming a predominantly animal diet than in those consuming a predominantly vegetable diet. The data suggest that diet and metabolites are linked to blood pressure and coronary heart disease in different ethnic groups [31, 33].

Age and gender of the patients must also be taken into account, with creatinine levels rising in growing children, higher lipid synthesis in young women and higher protein synthesis in young men [34, 35]. NMR analysis of sera from children has been used, but such studies have largely been limited to known metabolic defects or to the assessment of drug therapy. For example, in vivo brain spectroscopy was able to identify metabolic changes in children with hypothyroidism [36], and mercaptopurine metabolites have been identified in children with inflammatory bowel disease [37]. A recent study of muscle metabolites in the blood of children with juvenile idiopathic inflammatory myopathy [38] showed that NMR-based analysis of blood and urine creatinine and creatine might have potential in assessing disease damage in this and other diseases such as juvenile chronic arthritis.

Levels of antioxidants are low in the elderly [39], which might predispose to the establishment of chronic inflammation, following an initiating inflammatory event such as an infection. An example of the mechanism through which this could work was recently shown in protein-deficient mice, which were as a result low in another antioxidant, GSH, which when challenged with LPS gave a much elevated response, resulting in enhanced activation of NFkB and excess TNF production [40]. Boosting the GSH in these animals, using NAC, normalised their responses.

Thus complex interactions between lifestyle, diet and genes may affect the metabolic and therefore the functional status of individuals with consequences for immune function and susceptibility to disease.

Metabolomics in animal models and human disease

Metabolomics has been used in several animal models of human disease. Apolipoprotein E (ApoE)-deficient mice are a commonly used model of atherosclerosis, and NMR spectroscopy identified vascular oxidative stress, inflammatory response and changes in energy metabolism in atherogenesis in ApoE−/− mice [41]. ApoE−/− mice fed a high-cholesterol diet developed early atherosclerosis accompanied by metabolic changes linked to inflammation and lipid metabolism [42]. Both studies support the inflammatory basis for atherosclerosis, but metabolomics also defines other important pathways linked to the disease. In lung disease, metabolomics was investigated to provide a non-invasive method of monitoring asthma exacerbation in guinea pigs. Urine metabolomic profiles discriminated between ovalbumin-challenged animals that developed disease and sensitised or control animals with 90% accuracy [43]. Metabolomic analysis of bronchoalveolar lavage fluid distinguished patients with cystic fibrosis who had high inflammation at the time of sampling, characterised by increased amino acid and lactate levels, from patients with low inflammation [44].

At a practical level metabolomics has been successfully applied to the study of coronary heart disease, where analysis of serum NMR spectra was able to distinguish between different degrees of coronary artery stenosis. The features in the NMR spectra contributing to this discrimination were largely lipid components and yet classical biochemical analysis of the usual lipid profiles was unable to discriminate between these patients [26]. However, while metabolomics had a higher predictive power, another multivariate analysis of risk factors for coronary artery disease showed confounding effects of factors such as gender and statin treatment [27]. The effects could be accentuated by the use of plasma compared to serum and other methodological changes in the latter study, but the results add a note of caution that identifying the optimal system of analysis for each condition is important. With regard to inflammatory disease, a recent study demonstrated that metabolic profiles in faecal extracts from patients with Crohn’s disease and ulcerative colitis differed from those of normal controls. More importantly, patients with Crohn’s disease could be discriminated from those with ulcerative colitis using metabolomics, with levels of glycerol being particularly relevant [45]. In a recent study of identical twins, one healthy and one with Crohn’s disease, metabolomic analysis identified distinct profiles between twins and between individuals with primarily ileal or primarily colonic Crohn’s disease. The results showed that metabolites produced by gut microbes correlated with disease [46]. Identification of inflammatory bowel disease depends on a clinical diagnosis, and metabolomics offers a novel approach to aid diagnosis. The effect of the gut microbe profile on blood metabolites was suggested in a recent study on germ-free mice. Comparison with wild-type animals showed that amino acid and antioxidant levels were dependent on the presence of gut microbes, suggesting a complex link between bacterial and host metabolism [47].

Recently, metabolomic analysis has been reported to discriminate in the prognosis and diagnosis of other human diseases, including diabetes, blood pressure and cancer [33, 48, 49]. Thus it seems that the multiplexed analysis inherent in this approach, which takes into account all metabolite signals regardless of whether they have been specifically identified, is able to provide information not available by other means. In a group of 56 children, metabolomic profiles showed reduced succinic acid, phosphatidylcholine and antioxidants, with increased pro-inflammatory lysophosphatidylcholines in those individuals who developed diabetes. These changes were not related to known HLA-associated genetic markers of diabetes and were detectable several months before seroconversion to autoantibody was detected [50]. Metabolomic analysis has proved useful in monitoring the effects of drug therapy. Serum from diabetic patients treated with metformin hydrochloride showed lower levels of lactate, oxaloacetate and other metabolites, compared to sera from untreated patients, suggesting a reduction in inflammation [51]. In support of the response to treatment a recent study described a new pharmaco-metabolomic approach to personalise drug treatment using a combination of pre-dose metabolomics and chemometrics to model and predict responses in individual animals [52].

Metabolomic analysis of samples from patients with clinically localised and metastatic prostate cancer and healthy individuals with benign prostate produced separate profiles for each condition. One metabolite, sarcosine, was significantly increased in prostate cancer and correlated with progression to metastatic disease. In vitro treatment of prostate epithelial cells with sarcosine led to increased invasiveness, a response that was attenuated by inhibition of glycine N-methyl transferase, the enzyme responsible for sarcosine conversion from glycine [53]. Similarly, metabolomic analysis was reported to define poor prognosis in neuroendocrine cancers [49].

Two recent studies analysed metabolomic profiles in patients with hepatocellular carcinoma. While both showed changes in metabolites that distinguished patients with disease from healthy controls, the pattern was different as one group was assayed with urine and one with sera. These data support the use of metabolomics for diagnosis, but caution must be used in extrapolating from one body fluid to another [54, 55]. Given such caveats, NMR-based metabolomics is now considered a reliable analytical tool to identify biomarkers in human cancer [56].

Metabolomics in ocular disease

In a recent study we used NMR to analyse vitreous fluid samples from patients undergoing retinal surgery and investigated the metabolic profiles by several techniques. The results showed clear separation by both PCA and PLS-DA based on clinical diagnosis. The two largest groups of samples were from patients with lens-induced uveitis (LIU) and patients with chronic uveitis (CU). PCA and PLS-DA could clearly separate the metabolomic profiles, while a genetic algorithm applied to the data gave a sensitivity and specificity of over 90% for this separation. Individual metabolites identified from regions of the spectra showed significant differences, with urea, oxaloacetate and glucose all being raised in LIU samples compared to CU samples. Both oxaloacetate and urea are involved in the urea cycle, and urea is produced in the conversion of arginine to ornithine, a process prominent in activated macrophages and endothelial cells at the expense of nitric oxide, suggesting more active inflammation in the LIU patients. Lactate levels were high in vitreous samples from both conditions [13]. These results are most probably influenced by the inflammatory nature of the conditions. Glucose is metabolised to lactate under anaerobic conditions, and glycolysis is enhanced 6-fold in T cells in the first 2 h following stimulation and 15-fold by 48 h. This shift from mostly aerobic to anaerobic lactate production occurs rapidly after activation of the cell cycle [57]. Peripheral T cells encounter a rapid decrease in oxygen tension when they enter an inflammatory site, and CD3 engagement is prolonged under hypoxic conditions, with hypoxia-inducible factor (HIF-1) and its target gene product adrenomedullin being critical. Hypoxia alone is not enough, and T cell antigen receptor engagement is required for increased HIF-1 accumulation. Signalling may go through mTOR (mammalian target of rapamycin), as expression of HIF-1 and its target gene is blocked by rapamycin [58].
Moreover, lymphocyte activation initiates a programme of cell growth and proliferation that increases metabolic demand. CD28 stimulation acting through phosphoinositide 3-kinase and the kinase Akt is required for T cells to increase their glycolytic rate, allowing T cells to anticipate biosynthetic needs [59, 60]. CD28 and increased stimulation have been suggested to supply the bioenergetic requirements of the cell, and increased glucose may protect the cell from apoptosis. Conversion of naive CD8 cells to effector cells induced an increase in eight genes encoding glycolytic enzymes, and these cells display a greater glucose uptake, higher glycolytic rate and increased lactate production compared to naive cells. Glucose deprivation strongly inhibited IFNγ gene expression, although IL-2 was unaffected. These results imply that surrounding metabolic conditions may affect CD8 function [61]. This effect of environment is supported by studies showing that lactate derived from tumour cells suppressed proliferation and activity of cytotoxic T cells (CTL). A recovery period in lactic-acid-free medium restored CTL function. It was suggested that the high lactic acid environment blocked T cell lactic acid export, thereby disturbing metabolism [62]. Hypoxia and high lactate levels stimulate macrophages to perform similar pro-angiogenic functions in both tissues and wounds. The resolution of wounds results in the restoration of tissue integrity and perfusion, and macrophage numbers are reduced to pre-injury levels [63]. Lactate at the sites of wounds also induces collagen synthesis by fibroblasts and production of vascular endothelial growth factor (VEGF) by macrophages [64]. VEGF in turn has been shown to induce endothelial cell migration [65]. Therefore the inflammatory milieu affects many cell types, the outcome of which can be detected by metabolomics.

NMR analysis of ocular tissue has been reported in several other studies. Early work by Greiner used phosphorus [31P] NMR to analyse ocular metabolism. The results defined phosphate-containing metabolites in aqueous and vitreous fluids from pig eyes [66, 67]. Perchloric acid extracts of rabbit cornea and lens analysed by 1H NMR showed expression of taurine and glutathione suggesting a robust antioxidant environment in ocular tissues [68, 69]. A role for oxidant status was shown in a study of human and rat lenses exposed to hyperglycaemic or oxidative stress. Uptake of ascorbic acid was only minimally affected by hyperglycaemia unless glutathione levels were significantly reduced [70]. A series of studies by the Midelfart group in Norway investigated the effects of UVB irradiation by NMR spectroscopy. UVB treatment of rat lens caused a significant depletion of several amino acids, lactate levels and other water-soluble metabolites, although metabolite levels were restored in the days after treatment [71]. Treatment of rabbit lens with UVB irradiation following dexamethasone treatment daily for 36 days induced significant reductions in taurine, glutathione and lactate, while levels of glucose rose. Therefore treatment with dexamethasone exacerbates potential oxidative stress in the mammalian lens. Similar results were shown for dexamethasone treatment of rabbit cornea [72]. UVB irradiation caused significant metabolic changes in rabbit aqueous humor, while UVA irradiation had no effect [73, 74]. A stronger response was seen in aqueous humor from animals exposed to repeated UVB exposure [75]. These data were elegantly reviewed in a recent publication [76].

Metabolomic analysis of neurological conditions has also been informative. Cerebrospinal fluid (CSF) and serum samples from patients with a variety of neurological conditions, including multiple sclerosis, were analysed by NMR spectroscopy, and the metabolomic profile was used to diagnose a second cohort of patients. The results showed that for different disease conditions metabolomic analysis had a sensitivity and specificity between 65% and 75% in diagnosing disease [28]. As many patients with intermediate uveitis go on to develop multiple sclerosis it will be of interest to compare these patient groups using metabolomics. CSF from patients with bacterial or fungal meningitis could be separated from viral meningitis and healthy controls. Metabolites of both bacterial and host origin contributed to the separation. Metabolomic data correlated with onset and course of infection in one patient with two episodes and with response to therapy in another [77]. Together these data support the continued analysis of CSF for diagnostic purposes, including conditions that affect the eye.
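The sensitivity and specificity figures quoted in this review follow the usual definitions: sensitivity is the fraction of true cases correctly classified, and specificity is the fraction of non-cases correctly classified. The short Python sketch below shows the calculation; the function name and the 20 labels are invented for illustration and are not data from any cited study.

```python
def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for labels coded 1 = disease, 0 = control."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cases called as cases
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cases that were missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls called as controls
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls called as cases
    # Assumes at least one case and one control are present.
    return tp / (tp + fn), tn / (tn + fp)


# Invented example: 10 patients and 10 controls, one error in each group.
truth = [1] * 10 + [0] * 10
predicted = [1] * 9 + [0] + [0] * 9 + [1]
sens, spec = sensitivity_specificity(truth, predicted)
```

Both values should be computed on samples that were not used to build the classifier, as in the second cohort described above.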

Future prospects

Metabolomics measures the metabolite profile in body fluids or tissues and as such provides a profile of pathways and processes that have been activated in those samples. The results presented in this review strongly support further investigation of metabolomic analysis of ocular disease, including glaucoma, Graves' ophthalmopathy, age-related macular degeneration, and inflammatory and infectious conditions. The growing consensus that metabolites associate with particular processes (e.g. lactate and antioxidant changes with inflammation) and with particular conditions (e.g. sarcosine and prostate cancer) emphasises the potential of this methodology. Recent studies linking metabolomics to genomic profiling suggest that there may be even greater opportunities for a systems biology approach that will lead to a greater understanding of disease conditions and the underlying mechanisms involved.