OPEN ACCESS DOCUMENT
•Anal. Methods, 2017, 9, 774
•DOI: 10.1039/c6ay02976k
Chemometric evaluation of hydrophilic interaction liquid chromatography stationary phases: resolving complex mixtures of metabolites
Elena Ortiz-Villanueva, Meritxell Navarro-Reig, Joaquim Jaumot, Romà Tauler*
Department of Environmental Chemistry, IDAEA-CSIC, Jordi Girona 18-26, 08034 Barcelona, Spain.
Corresponding author: Prof. Dr. Romà Tauler
Postal address: Department of Environmental Chemistry, IDAEA-CSIC, Jordi Girona 18-26, 08034 Barcelona, Spain.
Telephone: +34934006140
E-mail:
Keywords: Hydrophilic interaction liquid chromatography (HILIC), Liquid chromatography-diode array detector (LC-DAD), Multivariate curve resolution, Metabolomics, Berridge chromatographic response function.
Abbreviations: ASCA: ANOVA-simultaneous component analysis; CRF: Chromatographic response function; HILIC: Hydrophilic interaction liquid chromatography; IPC: Ion-pair chromatography; MCR-ALS: Multivariate curve resolution alternating least squares; NP: Normal phase; PCA: Principal component analysis.
Abstract
Different hydrophilic interaction liquid chromatography (HILIC) stationary phases have been evaluated using different chemometric methods with the aim of applying them in metabolomics studies. Experimental factors, such as the type of HILIC stationary phase (i.e. amide, amine, zwitterionic and diol) and the mobile phase conditions (organic co-solvent, pH and ionic strength), were assessed using a full factorial experimental design. A test sample mixture of metabolites with diverse physicochemical properties (amino acids, nucleotides, nucleosides, and sugars among others) was analyzed by liquid chromatography with a diode array detector (LC-DAD) using five different HILIC columns. Application of the multivariate curve resolution alternating least squares (MCR-ALS) method allowed the full chromatographic peak resolution of all mixture constituents. This approach was particularly helpful in the case of methanol samples, where the quality of the chromatographic separation (resolution) was lower because the co-solvent perturbs the formation of the water layer at the surface of the stationary phase. Then, the Berridge chromatographic response function (CRF), based on peak resolution, retention times and number of peaks, was used to investigate the best HILIC column configuration for future metabolomics studies. The best chromatographic configuration was the amide and zwitterionic HILIC stationary phases in combination with acetonitrile as the organic co-solvent of the mobile phase.
1. Introduction
Metabolomics studies aim to characterize the complete set of endogenous low-molecular-weight compounds (metabolites) present in biological systems [1, 2]. Metabolites have diverse physicochemical properties and are usually found over a broad range of concentrations in living organisms [3, 4]. In recent years, new approaches that increase the coverage of these compounds have gained importance.
Liquid chromatography (LC) is an attractive platform for -omic analyses because of its versatility, precision and high concentration sensitivity. In the metabolomics field, reversed-phase liquid chromatography (RPLC) with C18 or C8 stationary phases provides very low retention for the highly polar and hydrophilic compounds that are usually found in metabolomic samples. The development of analytical alternatives and technologies, such as ion-pair chromatography (IPC) [5] or the relatively novel hydrophilic interaction liquid chromatography (HILIC) [6], has overcome this major drawback of more traditional approaches. Nowadays, HILIC is widely recognized as an alternative for carrying out metabolomics studies [3, 7], solving the polar selectivity issues associated with the use of RPLC columns. HILIC can be considered a combination of a normal-phase (NP) stationary phase and an RP mobile phase containing a high percentage of organic solvent. Under such conditions, HILIC stationary phases provide enhanced retention for strongly polar compounds on their surface active groups.
In the last decade, advances in HILIC stationary phase supports and surface chemistry have provided solutions to specific separation problems such as those related to the determination of short chain carboxylic acids [8], carbohydrates [9, 10], amino acids [11], nucleosides and nucleotides [12], and peptides [13]. Consequently, several types of HILIC columns have been described, including plain silica, neutral polar chemically bonded, ion-exchange and zwitterionic stationary phases [14-17]. However, according to Alpert [18, 19], the retention mechanism of HILIC chromatographic systems is more complex than that of RPLC systems [20], because retention patterns differ depending on the specific stationary phase considered. HILIC retention is a function of various molecular interactions, such as adsorption, ion exchange, and analyte partitioning between the mobile phase and the water-rich layer at the surface of the stationary phase [21]. Accordingly, the chemistry occurring at the stationary phase and the composition of the mobile phase (pH, ionic strength, and organic co-solvent composition) play a crucial role in HILIC selectivity and retention mechanisms [22, 23]. Therefore, the selection of the most suitable pair of stationary and mobile phases is a critical decision when a comprehensive analysis of metabolites is attempted.
Until now, few studies have used statistical experimental design to optimize separation conditions with HILIC stationary phases, with the goal of reducing experimental effort and increasing the reliability of the results [19]. In most previous studies of the behavior of HILIC columns, data analysis involved multivariate exploratory and classification methods, such as principal component analysis (PCA) [19, 24, 25] or hierarchical cluster analysis (HCA) [19]. Some previously published articles have also reported the use of desirability functions for the automated comparison of HILIC stationary phases [7, 13]. However, these approaches could miss relevant information because peaks are not completely resolved in chromatographic separations of a large number of analytes. At this point, the application of chemometric tools to these complex datasets is highly recommended. These methods allow the resolution of peaks (overlapped, embedded) that could not be resolved using a single chromatographic separation. Several chemometric methods can be used to resolve such chromatographic data, as detailed in the reviews of de Juan [26] and Amigo [27]. Among these families of methods, multivariate curve resolution alternating least squares (MCR-ALS) is a powerful method for examining HILIC chromatographic data in detail [28]. To the best of our knowledge, most works have focused only on the comparison of a smaller number of chromatographic conditions [19], or have used a narrow range of compounds or only a specific family of them [13, 25], without using the advantages of advanced chemometric methods, such as MCR-ALS, to solve complex chromatographic separations.
In this work, the capability of different HILIC stationary phases (amine, amide, zwitterionic and diol) for metabolomics studies was evaluated by combining two strategies. On the one hand, a full factorial experimental design was applied in which the nature of the HILIC stationary phase and of the mobile phase (organic co-solvent, pH and ionic strength) were considered as experimental factors. This experimental design allowed screening of the influence of these factors on the chromatographic behaviour of a mixture of preselected metabolites from different families analyzed by LC-DAD. On the other hand, the MCR-ALS approach was used to explore the effects of these chromatographic parameters on the total resolution of the elution profiles and pure UV spectra of all targeted metabolites. Finally, the Berridge chromatographic response function (CRF) [29] is proposed for evaluating the different chromatographic conditions and identifying the best configuration for the analysis of the considered metabolite mixture after MCR-ALS based chemometric metabolite resolution.
2. Materials and methods
2.1. Chemicals and reagents
All chemicals used in the preparation of solutions were of analytical reagent grade. Acetic acid (glacial), ammonia (25%), methanol (HPLC grade) and acetonitrile (HPLC grade) were purchased from Merck (Darmstadt, Germany). Ammonium acetate was provided by Sigma-Aldrich (St. Louis, MO, USA). Water with conductivity lower than 0.05 μS·cm⁻¹ was obtained using a Milli-Q water purification system (Millipore, Molsheim, France).
2.2. Standard mixture and working solutions
A test mixture of 12 chemical compounds, including reduced glutathione, D-fructose 1,6-bisphosphate sodium salt hydrate (F2,6BP), β-nicotinamide adenine dinucleotide (NADH), adenosine 5’-monophosphate disodium salt (AMP), hypoxanthine, uridine, inosine, cytidine, L-phenylalanine, L-tryptophan, L-tyrosine and L-citrulline, was used as a model metabolite mixture. All standards were purchased from Sigma-Aldrich (St. Louis, MO, USA).
For each metabolite, one standard solution (1000 µg·mL⁻¹) was prepared and stored in the freezer at -20 °C until use. Working standard solutions were obtained by dilution of the metabolite stock solutions with water. A sample mixture of 12 metabolites from different families (i.e. amino acids, nucleotides, nucleosides and sugars, among others (purine, tripeptide)) and with various structures and physicochemical properties was prepared and used as a test sample. Diluted standard solutions (concentrations of 20 µg·mL⁻¹) were also employed to evaluate the distinct chromatographic conditions used in this work.
2.3. Instrumentation and procedures
2.3.1. Stationary phases
In this work, five different HILIC stationary phases (BEH amide, amide, amine, zwitterionic and diol; see Table 1 for their main specifications) were tested to evaluate their suitability and performance.
Table 1 near here
2.3.2. LC-DAD
LC-DAD experiments were performed on an Agilent Infinity 1200 series system coupled to a diode array detector (DAD) (Agilent Technologies, Waldbronn, Germany). LC control and separation data acquisition were performed using ChemStation Software (Agilent Technologies).
Chromatographic separations were carried out using the chromatographic conditions detailed in Table 1 (solvents, flow rate, and elution gradient). Ammonium acetate was used to prepare all aqueous mobile phases. Sample injection was performed with an autosampler at 4 °C, and the injection volume was 5 μL. Mobile phases were degassed for 15 min by sonication before use. DAD detection covered a spectral range from 190 to 500 nm.
The effect of four different factors on the chromatographic behavior was statistically assessed using a full factorial experimental design. The experimental factors were the HILIC stationary phase and three factors related to the mobile phase composition: organic co-solvent, pH, and ionic strength. Five HILIC stationary phases (columns) were evaluated (BEH amide, amide, amine, zwitterionic and diol, see Table 1). In the case of the mobile phase, two commonly used organic co-solvents were tested: methanol and acetonitrile; three pH values: acidic (pH of the buffer was 3.0), moderately acidic (pH of the buffer was 5.5) and neutral (pH of the buffer was 7.0); and three ionic strengths in the aqueous phase: low (5.0 mM), medium (25.0 mM), and high (50.0 mM). Preliminary studies showed that in all the considered cases (different pH values and organic co-solvent contents [30]) the pH at the highest organic content (95%) varied by up to approximately 2 units in the case of acetonitrile, and by up to 1.5 units in the case of methanol. However, this shift was not considered large enough to mask the effect of pH on the separation of the investigated metabolites.
To sum up, the experimental design considering all the factor levels (full factorial design) gave a total of 90 experiments, which were injected twice in random order (see Supplementary Material Table S1). In addition, pure metabolite samples were also individually injected.
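As an illustration of how the 90 conditions arise from the factor levels listed above (5 columns × 2 co-solvents × 3 pH values × 3 ionic strengths), the design can be enumerated in a few lines of Python. This is only a sketch: the variable names and the random seed are hypothetical, and the enumeration order is not meant to reproduce the run numbering of Table S1.

```python
import itertools
import random

# Factor levels as listed in the text (Section 2.3.2).
columns = ["BEH amide", "amide", "amine", "zwitterionic", "diol"]
cosolvents = ["methanol", "acetonitrile"]
ph_values = [3.0, 5.5, 7.0]
ionic_strengths_mM = [5.0, 25.0, 50.0]

# Full factorial design: every combination of every factor level.
design = list(itertools.product(columns, cosolvents, ph_values, ionic_strengths_mM))

# Each condition is injected twice; the injection order is randomized.
# The seed is an arbitrary choice made only to keep the sketch reproducible.
rng = random.Random(0)
injections = design * 2
rng.shuffle(injections)

print(len(design), "conditions,", len(injections), "injections")
```

Grouping the design by mobile phase alone (co-solvent, pH, ionic strength) gives 2 × 3 × 3 = 18 combinations, which matches the 18 augmented data sets analyzed later by MCR-ALS.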
2.4. Data analysis
LC-DAD data from the set of metabolite mixture samples previously described were analyzed by chemometric methods with the goal of evaluating the separation capacity of the five HILIC stationary phases tested under the full factorial experimental conditions previously detailed. MCR-ALS was applied to resolve the chromatographic peaks of all metabolites in the analyzed test mixture. This complete resolution would not be possible without the help of MCR-ALS, owing to the strong coelutions among the 12 metabolites of the mixture under the chromatographic conditions used. This problem would be even more challenging in the analysis of biological samples, where hundreds of metabolites would be simultaneously present. From the MCR-ALS resolved elution profiles, individual peak parameters for every metabolite (such as peak width and retention time) were obtained and used to investigate the performance of the different chromatographic combinations of stationary and mobile phases by means of the Berridge chromatographic response function (CRF) defined for this purpose.
2.4.1. Data preparation and preprocessing
ChemStation .csv files were imported into the MATLAB® environment (The Mathworks Inc., Natick, MA, USA). In the case of LC-DAD, each chromatographic run provided a single data matrix D (I×J) in which the rows were the spectra recorded at every retention time (i = 1, …, I), and the columns were the elution profiles obtained at every wavelength (j = 1, …, J). For instance, the number of rows ranged from 4800 to 9000 depending on the chromatographic run, whereas the number of measured wavelengths was 156.
Since the experimental design considered 90 different experimental conditions, a total of 90 data matrices were finally obtained, each one corresponding to the analysis of the metabolite mixture test sample under a particular chromatographic condition. Chromatograms from every data matrix were baseline/background corrected using the AsLS (asymmetric least squares) algorithm [31]. Additionally, in order to reduce the size of the data matrices to be analyzed, retention times (rows) and wavelengths (columns) where no signal was detected were removed. The final number of considered wavelengths (columns) was 49, ranging from 215 to 280 nm for all chromatographic experiments, and the number of retention times (rows) varied between 300 and 3000 seconds.
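The baseline correction step can be sketched as follows. This is a minimal illustration of an asymmetric least squares baseline in the commonly described Eilers-type form (a second-difference smoothness penalty with asymmetric weights), not the authors' MATLAB implementation; the function name and the default values of `lam`, `p` and `n_iter` are assumptions chosen for the example. Dense matrices are used for brevity, so for chromatograms with thousands of points a sparse solver would be preferable.

```python
import numpy as np

def asls_baseline(y, lam=1e5, p=0.01, n_iter=10):
    """Estimate a smooth baseline under the signal y (1-D array).

    lam controls smoothness, p is the weight given to points lying above
    the current baseline estimate (peaks); both are illustrative defaults.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    # Second-order difference operator, shape (n - 2, n).
    D = np.diff(np.eye(n), n=2, axis=0)
    penalty = lam * (D.T @ D)
    w = np.ones(n)
    for _ in range(n_iter):
        # Weighted, penalized least squares fit of the baseline.
        z = np.linalg.solve(np.diag(w) + penalty, w * y)
        # Points above the baseline (peaks) get a small weight p,
        # points below it get a large weight 1 - p.
        w = np.where(y > z, p, 1.0 - p)
    return z
```

In an LC-DAD data matrix, such a function would be applied to the elution profile at each wavelength (each column of D) and the estimated baseline subtracted.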
2.4.2. MCR-ALS resolution of chromatographic peaks
MCR-ALS is a chemometric method particularly useful for analyzing complex multicomponent mixture systems, in particular chromatographic systems with strongly coeluted contributions [32, 33]. In the case of LC-DAD data, MCR-ALS decomposes the original data matrix D (I×J) according to a bilinear model into two factor matrices, C and S^T, as shown in Eq. 1:
$$\mathbf{D} = \mathbf{C}\mathbf{S}^{\mathrm{T}} + \mathbf{E} \qquad (1)$$
where the matrix C (I×M) contains the elution profiles of the M resolved contributions in the considered sample, S^T (M×J) contains their pure UV spectra, and E (I×J) is the residuals matrix with the background absorption unexplained by the model. Pure spectra allow the identification of the metabolites in each resolved component, whereas the corresponding resolved elution profiles permit retrieving their chromatographic peak parameters [34, 35].
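The alternating scheme behind Eq. 1 can be sketched in a few lines. The following is a deliberately simplified illustration, not the MCR-ALS toolbox used in this work: non-negativity is imposed by clipping negative values after each unconstrained least squares step (a crude stand-in for a proper non-negative least squares solver), spectra are rescaled to equal height, and unimodality is omitted. The function name, the initial estimate `S0` and the fixed number of iterations are assumptions made for the example.

```python
import numpy as np

def mcr_als(D, S0, n_iter=200):
    """Simplified bilinear resolution D ≈ C @ St with non-negative C and St.

    D  : data matrix, shape (I, J) (retention times x wavelengths)
    S0 : initial estimate of the spectra, shape (M, J)
    Returns C (I, M), St (M, J) and the lack of fit in percent.
    """
    D = np.asarray(D, dtype=float)
    St = np.array(S0, dtype=float)
    for _ in range(n_iter):
        # Least squares update of the elution profiles, then clip negatives.
        C = np.clip(D @ np.linalg.pinv(St), 0.0, None)
        # Least squares update of the spectra, then clip negatives.
        St = np.clip(np.linalg.pinv(C) @ D, 0.0, None)
        # Equal-height normalization of the spectra; the scale is moved
        # into C so that the product C @ St is unchanged.
        scale = np.maximum(St.max(axis=1, keepdims=True), 1e-12)
        St = St / scale
        C = C * scale.T
    residual = D - C @ St
    lack_of_fit = 100.0 * np.sqrt((residual ** 2).sum() / (D ** 2).sum())
    return C, St, lack_of_fit
```

In practice the initial estimate would come from a purest-variable type selection or from measured standards, and convergence would be monitored through the change in lack of fit rather than a fixed iteration count.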
Data sets obtained in the chromatographic analyses of the metabolite test mixture using different stationary phases under the same mobile phase conditions (pH, ionic strength, and organic co-solvent) were analyzed simultaneously by MCR-ALS using the column-wise augmented data matrix strategy shown in Eq. 2 (18 column-wise augmented data matrices). D_BEHamide, D_amide, D_amine, D_zwitterionic and D_diol are the data matrices obtained in the different chromatographic runs using every HILIC stationary phase in all combinations of mobile phase conditions (combination of pH, ionic strength, and organic co-solvent). In addition, to facilitate the resolution and identification of all metabolites, chromatographic data matrices obtained in the analysis of each of the 12 metabolites, D_standard,1 to D_standard,n (n = 1, …, 12), using the HILIC amide stationary phase and mobile phase conditions of pH 5.5, ionic strength 5 mM and acetonitrile as organic co-solvent, were also added to each of the 18 augmented data matrices.
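$$\mathbf{D}_{\mathrm{aug}} = \begin{bmatrix} \mathbf{D}_{\text{BEH amide}} \\ \mathbf{D}_{\text{amide}} \\ \mathbf{D}_{\text{amine}} \\ \mathbf{D}_{\text{zwitterionic}} \\ \mathbf{D}_{\text{diol}} \\ \mathbf{D}_{\text{standard},1} \\ \vdots \\ \mathbf{D}_{\text{standard},n} \end{bmatrix} = \begin{bmatrix} \mathbf{C}_{\text{BEH amide}} \\ \mathbf{C}_{\text{amide}} \\ \mathbf{C}_{\text{amine}} \\ \mathbf{C}_{\text{zwitterionic}} \\ \mathbf{C}_{\text{diol}} \\ \mathbf{C}_{\text{standard},1} \\ \vdots \\ \mathbf{C}_{\text{standard},n} \end{bmatrix} \mathbf{S}^{\mathrm{T}} + \mathbf{E}_{\mathrm{aug}} = \mathbf{C}_{\mathrm{aug}}\mathbf{S}^{\mathrm{T}} + \mathbf{E}_{\mathrm{aug}} \qquad (2)$$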
Eq. 2 summarizes the extension of the bilinear model used by MCR-ALS when it is applied to the multirun chromatographic analyses of the metabolite mixture with the different tested HILIC columns at one particular experimental condition. C_aug in Eq. 2 gives the resolved elution profiles of the 12 metabolites for every chromatographic run on the different HILIC stationary phases, C_BEH amide, C_amide, C_amine, C_zwitterionic and C_diol, and C_standard,1 to C_standard,n contain the chromatographic profiles of the pure metabolites. S^T contains the MCR-ALS resolved pure spectra of all the components (12 metabolites) in all simultaneously analyzed datasets [34]. The constrained ALS optimization iteratively solved Eq. 2 for the elution and spectra profiles [35-37] under non-negativity constraints for both elution and spectra profiles, unimodality for elution profiles, and equal height normalization for spectra profiles [35, 37, 38]. The incorporation of the data matrices from the LC-DAD analysis of the pure metabolites (standards) favored the resolution of the complete system, since unique spectra were obtained for them, and facilitated the proper resolution of some of the metabolites with more highly overlapped elution profiles. Pure metabolites were only analyzed under three different conditions (numbers 19, 25 and 31 in Table S1), covering each studied pH with 5.0 mM ionic strength and acetonitrile as co-solvent, using the TSK Gel Amide-80 column. The correspondence between standards and species in the mixtures was also implemented during the resolution. Correct application of the proposed model and resolution of the 12 metabolites was finally checked by comparing the 12 MCR-ALS resolved spectra profiles with the spectra of the known metabolites. In all cases (except for F2,6BP) a perfect agreement was achieved.
More detailed explanations of the MCR-ALS procedure and constraints are given in previous publications [33, 38]. In addition, details of the strategy used for the resolution of the 18 augmented chromatographic systems, one for each combination of the three pH values, three ionic strength values, and two organic co-solvents, are given in the Supplementary Material, Table S2.
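The column-wise augmentation of Eq. 2 amounts to stacking the individual runs on top of each other (they share the wavelength axis but may differ in the number of retention times) and, after resolution, cutting the augmented elution profile matrix back into one block per run. A minimal sketch of this bookkeeping is given below; the function names and the dictionary-based interface are hypothetical and only illustrate the data layout.

```python
import numpy as np

def augment_columnwise(runs):
    """Stack runs sharing the same wavelengths into one augmented matrix.

    runs : dict mapping a run label to an array of shape (I_k, J)
    Returns the augmented matrix, the labels and the row count of each run.
    """
    names = list(runs)
    n_rows = [runs[name].shape[0] for name in names]
    D_aug = np.vstack([runs[name] for name in names])
    return D_aug, names, n_rows

def split_profiles(C_aug, names, n_rows):
    """Cut the augmented elution profile matrix into one block per run."""
    boundaries = np.cumsum(n_rows)[:-1]
    blocks = np.split(C_aug, boundaries, axis=0)
    return dict(zip(names, blocks))
```

Because all blocks share a single S^T, a component whose spectrum is well defined in one run (for instance a pure standard) helps to resolve the same component in runs where it is heavily overlapped.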
2.4.3. Evaluation of chromatographic performance
The resolved elution profiles obtained by MCR-ALS allowed the retrieval of peak retention times and peak widths for the 11 completely resolved metabolites on each HILIC stationary phase and for each chromatographic run at different experimental conditions. Using this information, the performance of the different stationary phases and the behavior of the various chromatographic conditions can be investigated. The use of the Berridge chromatographic response function (CRF) to summarize these results is proposed (see below).
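Retrieving peak parameters from a resolved elution profile can be sketched as follows. The example assumes a single, unimodal profile (as enforced by the unimodality constraint) and takes the retention time at the apex and the width at half height with linear interpolation; the function name and the choice of half-height width are assumptions for illustration, since the exact peak width definition used in this work is not stated here.

```python
import numpy as np

def peak_parameters(time_s, profile):
    """Return (retention time at apex, full width at half height)."""
    time_s = np.asarray(time_s, dtype=float)
    profile = np.asarray(profile, dtype=float)
    apex = int(np.argmax(profile))
    if profile[apex] <= 0:
        raise ValueError("profile has no positive signal")
    half = profile[apex] / 2.0

    def crossing(i_low, i_high):
        # Linear interpolation of the time at which the profile equals
        # half height, between a point at or below it and one above it.
        frac = (half - profile[i_low]) / (profile[i_high] - profile[i_low])
        return time_s[i_low] + frac * (time_s[i_high] - time_s[i_low])

    # Walk outwards from the apex until the profile drops to half height.
    left = apex
    while left > 0 and profile[left] > half:
        left -= 1
    right = apex
    while right < profile.size - 1 and profile[right] > half:
        right += 1

    # If the profile never drops to half height inside the window,
    # fall back to the edge of the window.
    t_left = crossing(left, left + 1) if profile[left] <= half else time_s[left]
    t_right = crossing(right, right - 1) if profile[right] <= half else time_s[right]
    return time_s[apex], t_right - t_left
```

Applied to each column of a resolved C block, this yields one retention time and one width per metabolite and run, which is the information required to build the retention time matrix described in the next section.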
2.4.3.1. Investigation of HILIC stationary phases behavior in LC-DAD
Differences in the retention time values obtained for every metabolite in each chromatographic run were investigated. The data matrix, D_tr, was built from the retention times of each metabolite at each chromatographic condition. This matrix had a number of rows equal to the number of chromatographic conditions (90 runs) and a number of columns equal to the number of resolved metabolites (11 compounds). Before the analysis, all retention times were normalized using the median retention time of all compounds in a particular chromatographic run as a reference. This retention time data matrix was subjected to statistical evaluation using two different methods: ANOVA-simultaneous component analysis (ASCA) [39] and principal component analysis (PCA) [40]. The ASCA method combines the advantages of ANOVA and simultaneous component analysis (SCA, a method similar to principal component analysis). In ASCA, the retention time data matrix (D_tr) was split into the different effect data matrices, containing the level averages for each factor, and the interaction data matrices, describing the possible interactions between the considered factors. In this work, an ANOVA model with four factors and their two-way interactions was considered, as shown in Eq. 3: