Believe it or not: with few exceptions, univariate calibration is not straightforward at all! This is an important observation, since a thorough understanding of the intricacies of univariate calibration is key to investigating the properties of multivariate/multiway extensions. A closer examination of univariate calibration can therefore be seen as a logical step towards getting more out of data in general.
Giving a systematic, complete overview of the theory is clearly beyond the scope of this web site. Instead, we aim to make the relevant literature accessible by summarizing important contributions. We pay considerable attention to aspects that also play an important role when moving to higher-complexity predictor data. Moreover, we point out directions for further research.
Official literature
Univariate calibration has extensive coverage in the official literature. Guidelines have, for example, been issued by the International Union of Pure and Applied Chemistry (IUPAC), see:
 K. Danzer and L.A. Currie
Guidelines for calibration in analytical chemistry
Part 1. Fundamentals and single component calibration
Pure & Applied Chemistry, 70 (1998) 993-1014
Potential multivariate and multiway extensions of generally accepted univariate methodology are listed in:
 A. Olivieri, N.M. Faber, J. Ferré, R. Boqué, J.H. Kalivas and H. Mark
Guidelines for calibration in analytical chemistry
Part 3. Uncertainty estimation and figures of merit for multivariate calibration
Pure & Applied Chemistry, 78 (2006) 633-661

Noise in the predictand only: classical vs. inverse model


Without loss of generality, we will, unless otherwise mentioned, assume in the remainder that the data pairs to be modelled consist of analyte content and instrumental signal for chemical samples. The basic statistics literature is mainly concerned with the following model:

y = α + β·x + e    [1]

where y is the noisy signal, x is the errorless content, α is the true intercept, β is the true slope and e consists of random error. For what follows, it is necessary to make the (standard) assumption that e is iid normal.
The classical least squares (CLS) fit, i.e., 'forward' calibration through the regression of y onto x, leads to estimates a and b (for α and β) from which the content for an unknown sample is obtained by 'inverse' prediction as

x̂_{u,cl} = (ȳ_{u} − a)/b    [2]

where the subscripts 'u' and 'cl' refer to the unknown sample and the classical model, respectively, ȳ_{u} is the mean signal measured for the unknown sample, and the 'hat' (ˆ) symbolizes prediction (of a random variable) or estimation (of a parameter).
This two-stage process is illustrated in Figure UVC 1.
With noise in the signal (y) only, the CLS fit is unbiased. Moreover, it yields parameter estimates with minimum variance. In statistics jargon, the CLS estimator is BLUE: best linear unbiased estimator. However, it has been known for some time that the classical approach is not efficient for prediction! Instead of regressing the (noisy) signal (y) onto the (errorless) content (x), one should regress content (x) onto signal (y). Least squares regression of x onto y is known as inverse least squares (ILS). It leads to parameter estimates a′ and b′ from which the 'forward' prediction follows as

x̂_{u,inv} = a′ + b′·ȳ_{u}    [3]

where the subscript 'inv' refers to the inverse model.
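As a numerical illustration, the two routes can be sketched in a few lines of NumPy; all numbers (line parameters, noise level, contents) are toy choices for this sketch, not taken from any data set on this page:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data for the model y = alpha + beta*x + e, with noise in y only.
alpha, beta = 2.0, 5.0
x = np.linspace(0.0, 1.0, 10)                        # errorless contents
y = alpha + beta * x + rng.normal(0.0, 0.1, x.size)  # noisy signals

# Classical route: regress y onto x, then invert the fitted line.
b, a = np.polyfit(x, y, 1)                 # slope, intercept
y_u = alpha + beta * 0.5                   # signal of an 'unknown' with content 0.5
x_cl = (y_u - a) / b                       # 'inverse' prediction

# Inverse route: regress x onto y, then predict directly.
b_inv, a_inv = np.polyfit(y, x, 1)
x_inv = a_inv + b_inv * y_u                # 'forward' prediction

print(round(x_cl, 3), round(x_inv, 3))     # both close to the true content 0.5
```

With such mild noise the two routes agree closely; the discussion below concerns the regimes where they genuinely differ.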
An excellent overview of the relevant literature is given in:
 J. Tellinghuisen
Inverse vs. classical calibration for small data sets
Fresenius Journal of Analytical Chemistry, 368 (2000) 585-588
The relative predictive ability of the classical and inverse model can be explained from the following approximate expressions for prediction bias and variance (square of the standard error):

Δ(x̂_{u,cl}) ≈ (x_{u} − x̄)·s^{2}/(β^{2}·S_{xx})    [4]
Δ(x̂_{u,inv}) ≈ −(x_{u} − x̄)·(1 − ρ^{2})    [5]
σ^{2}(x̂_{u,cl}) ≈ (s^{2}/β^{2})·[1/m + 1/n + (x_{u} − x̄)^{2}/S_{xx}]    [6]
σ^{2}(x̂_{u,inv}) ≈ ρ^{4}·(s^{2}/β^{2})·[1/m + 1/n + (x_{u} − x̄)^{2}/(ρ^{4}·S_{xx})]    [7]

where:
Δ(·) and σ(·) denote bias and standard error of the associated quantity,
s^{2} is the variance of e,
x̄ is the mean x-value for the training set of n samples,
S_{xx} = (n − 1)·s_{x}^{2}, in which s_{x}^{2} is the variance of the x-values in the training set,
ρ^{2} = β^{2}·S_{xx}/(β^{2}·S_{xx} + (n − 1)·s^{2}), in which ρ is the correlation between x and y, and
m is the number of replicates for the prediction sample.
Extensive Monte Carlo simulations have demonstrated the adequacy of approximations [4]-[7] for large as well as small n.
Prediction bias and variance can be combined into a mean squared error (MSE) of prediction using the well-known expression

MSE = Δ^{2} + σ^{2}

It is important to note that the MSE criterion is meaningful only if systematic (bias) and random (standard error) deviations are equally harmful. This is often the case. An important exception is legal work, where bias that overly incriminates the subject would violate the principle 'in dubio pro reo'.
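The decomposition is easy to verify numerically. The following sketch (toy numbers of our own choosing) checks the identity for an estimator with a deliberate systematic offset:

```python
import numpy as np

rng = np.random.default_rng(1)

# An estimator of true_x with a deliberate systematic offset (bias) and
# random error: the mean of m noisy replicates, shifted by 0.5 units.
true_x, offset, m = 10.0, 0.5, 4
estimates = rng.normal(true_x + offset, 2.0, (100_000, m)).mean(axis=1)

bias = estimates.mean() - true_x              # systematic part
variance = estimates.var()                    # random part
mse = np.mean((estimates - true_x) ** 2)      # overall squared deviation

print(np.isclose(mse, bias ** 2 + variance))  # the decomposition holds exactly
```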
The following detailed remarks are adapted from the paper by Tellinghuisen:
 For finite n, the classical prediction has infinite variance, since the denominator (b) in Equation [2] is normally distributed, hence a division by zero occurs with non-zero probability. However, the expectation and variance of x̂_{u,cl} can be defined in an asymptotic sense (through approximations [4] and [6]), which will normally be adequate and meaningful.
 The inverse prediction has finite variance and mean squared error for n ≥ 4.
 Although the classical prediction is unbiased in the limit n → ∞ (hence is consistent), it is biased at finite n, with a bias magnitude comparable to that for the inverse prediction for small n (~5).
 The biases in both predictions vanish when x_{u} = x̄.
 The range of x over which the inverse prediction is more efficient than the classical prediction is greater for small n than for large n.
 The distinction is relevant only for calibration data that are inherently very noisy, as the results for the two predictions differ insignificantly for sufficiently small s.
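The remarks above are easy to probe by simulation. A minimal Monte Carlo sketch (toy settings of our own, not Tellinghuisen's) compares the two predictions for a very noisy, small-n calibration with the unknown at the centre of the range:

```python
import numpy as np

rng = np.random.default_rng(2)

# Classical vs inverse prediction for a very noisy, small-n calibration.
# The 'unknown' sits at the centre of the range, where both predictions
# are unbiased and the inverse model's variance advantage is largest.
n, s, n_sim = 5, 1.0, 20_000
x = np.linspace(0.0, 4.0, n)
x_u = 2.0                                  # true content (= mean of x)
cl = np.empty(n_sim)
inv = np.empty(n_sim)
for i in range(n_sim):
    y = 1.0 + 1.0 * x + rng.normal(0.0, s, n)
    y_u = 1.0 + 1.0 * x_u + rng.normal(0.0, s)
    b, a = np.polyfit(x, y, 1)
    cl[i] = (y_u - a) / b                  # classical: invert the fitted line
    b2, a2 = np.polyfit(y, x, 1)
    inv[i] = a2 + b2 * y_u                 # inverse: predict directly

mse_cl = np.mean((cl - x_u) ** 2)
mse_inv = np.mean((inv - x_u) ** 2)
print(mse_inv < mse_cl)                    # inverse wins near the centre
```

The occasional near-zero slope estimate inflates the classical MSE dramatically, which is the finite-n heavy-tail behaviour described in the first remark.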
The improved predictive ability of the inverse model has the interpretation of a favorable bias-variance trade-off: the increase of bias is more than offset by the decrease of variance. Ignoring the final terms inside the brackets of approximations [6] and [7] yields an approximate decrease by a factor ρ^{4}, ρ^{2} being the squared correlation between content and signal. In fact, this decrease is mainly the result of the (favorable) negative (proportional) bias in the slope estimate:


Figure UVC 2: Linear calibration function without intercept, the simplest inverse model. The slope (b′) is directly proportional to the amount of error propagation when predicting the true content (x) from the noisy signal (y). Consequently, from a variance perspective a small b′ is preferable.

The motivation for inverse calibration is even stronger in the multivariate/multiway domain. The reason is that inverse (multivariate/multiway) calibration enables one to predict for individual analytes without explicitly accounting for all interfering species in the unknown mixture. Interfering species are adequately compensated for implicitly by the model if their contribution to the training set predictors (spectra) is representative of future unknown samples. This is especially important for applications in, for example, the food, environmental, petrochemical and life sciences, where the number and nature of the interfering species are usually unknown. By contrast, the classical (multivariate/multiway) model requires the pure-component predictors (spectra) to be known for all species, which is often not practical.
A complication arises if excessive predictor noise leads to severe bias in the model parameter estimates (regression vector/matrix/array coefficients), hence severe prediction bias. An illustrative example is given in the classic textbook:
 H. Martens and T. Næs
Multivariate calibration, Wiley, Chichester (1989)
Martens and Næs performed Monte Carlo simulations, resulting in plots like:


Figure UVC 3: NIR prediction versus reference value for 109 test objects. Artificial noise was added to the predictors (spectra) of the calibration set (X_{cal}). The calibration formula was estimated by partial least squares (PLS) regression based on the data from 30 calibration objects. The solid diagonal indicates the ideal results. For the other simulation settings, see Figure 4.2 in Martens and Næs (pp. 242-243).

It is seen that the low-valued predictions are severely biased high: the lowest 18(!) reference values are predicted above the target line, while the converse is true for the high values. As detailed above for the single-predictor case, this prediction bias must be the effect of bias in the model parameter estimates. Consequently, prediction bias can be reduced to some extent by applying a bias correction to the model. A straightforward bias correction is possible when an appropriate bias expression is available: a bias-corrected model then simply follows by subtracting the bias estimate.
Approximate bias expressions are derived for the ILS model (multiple predictors) in:
 S.D. Hodges and P.G. Moore
Data uncertainties and least squares regression
Applied Statistics, 21 (1972) 185-195
 R.B. Davies and B. Hutton
The effect of errors in the independent variables in linear regression
Biometrika, 62 (1975) 383-391
For the PLS model, approximations are derived under various distributional assumptions in:
 C.H. Spiegelman, M.J. McShane, M.J. Goetz, M. Motamedi, Q.L. Yue and G.L. Coté
Theoretical justification of wavelength selection in PLS calibration: development of a new algorithm
Analytical Chemistry, 70 (1998) 35-44
 A.J. Burnham, J.F. MacGregor and R. Viveros
Interpretation of regression coefficients under a latent variable regression model
Journal of Chemometrics, 15 (2001) 265-284
 B. Nadler and R.R. Coifman
The prediction error in CLS and PLS: the importance of feature selection prior to multivariate calibration
Journal of Chemometrics, 19 (2005) 107-118
It is noted that the first two contributions deal with the one-factor model.
Here it is suggested that a distribution-free approach could lead to a more generally applicable result. This approach has worked well for a multiway calibration method, see:
 N.M. Faber, J. Ferré and R. Boqué
Iteratively reweighted generalized rank annihilation method. 1. Improved handling of prediction bias
Chemometrics and Intelligent Laboratory Systems, 55 (2001) 67-90
Approximate bias as well as variance expressions have in common that they are usually obtained by working out a truncated Taylor expansion (truncated after the first- and second-order term to obtain the approximate variance and bias, respectively). Resampling methods are generally more accurate because they do not depend on this kind of approximation. An appropriate resampling method for bias estimation is simulation extrapolation (SIMEX), see:
 R.J. Carroll, D. Ruppert and L.A. Stefanski
Measurement error in nonlinear models, Chapman and Hall, London (1995)
It is recommended to always test the adequacy of approximate bias and variance expressions using an appropriate resampling method.
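To make the SIMEX idea concrete, here is a minimal sketch for the simplest case of a slope attenuated by predictor noise. All numbers (noise levels, the lambda grid, the number of re-noised data sets, the quadratic extrapolant) are illustrative choices for this sketch, not prescriptions from the references above:

```python
import numpy as np

rng = np.random.default_rng(3)

# Slope estimation with noisy predictors: the naive fit is attenuated.
n, s_x = 200, 0.5
x_true = rng.uniform(0.0, 4.0, n)
x_obs = x_true + rng.normal(0.0, s_x, n)           # noisy predictors
y = 1.0 + 1.0 * x_true + rng.normal(0.0, 0.1, n)   # true slope is 1

# SIMEX: add *extra* predictor noise at levels lam, track the slope,
# then extrapolate the trend back to lam = -1 (i.e. no noise at all).
lams = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
slopes = []
for lam in lams:
    b_sum = 0.0
    for _ in range(200):                           # average over re-noised sets
        x_sim = x_obs + rng.normal(0.0, np.sqrt(lam) * s_x, n)
        b_sum += np.polyfit(x_sim, y, 1)[0]
    slopes.append(b_sum / 200)

coef = np.polyfit(lams, slopes, 2)                 # quadratic extrapolant
b_simex = np.polyval(coef, -1.0)
b_naive = np.polyfit(x_obs, y, 1)[0]
print(round(b_naive, 2), round(b_simex, 2))        # attenuated vs. corrected slope
```

The corrected slope lands much closer to the true value than the naive one, at the price of extra variance introduced by the extrapolation, in line with the recommendation above to cross-check approximate expressions against a resampling method.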
A final caveat seems to be in order. Bias correction leads to larger model parameter estimates (in absolute value), hence to increased propagation of predictor noise in the prediction stage, cf. Figure UVC 2. Consequently, it seems best to work with two models when bias plays a dominant role, namely:
 the original model for samples close to the center, where bias is relatively unimportant, and
 the biascorrected model for extreme samples for which bias is unacceptably large, e.g. near the limit of detection.

Noise in the predictor too: estimation vs. prediction


Non-negligible noise in the predictor variables is of course the common situation in (inverse) multivariate and multiway calibration: just think of instrument noise. It therefore makes sense to discuss this case in some detail for univariate calibration too. As explained in the preceding section, estimation and prediction are essentially distinct tasks with conflicting requirements owing to the associated uncertainty. These conflicting requirements were described as a bias-variance trade-off. Clearly, with 'errors in both axes' the same trade-off principle holds, i.e., noise in the predictors leads to deflated slope estimates. This bias is an undesirable complication when model interpretation is a major goal of the analysis. After all, how should one interpret an estimate that is systematically too small? For a thorough discussion of this complication in connection with PLS, see:
 A.J. Burnham, J.F. MacGregor and R. Viveros
Interpretation of regression coefficients under a latent variable regression model
Journal of Chemometrics, 15 (2001) 265-284
However, a negative slope bias leads to decreased propagation of predictor noise in the prediction phase, i.e., a decreased variance contribution to root mean squared error of prediction (RMSEP). The resulting decrease in prediction variance may actually more than outweigh the increase in prediction bias. The following illustrative example is adapted from:
 C.D. Brown
Discordance between net analyte signal theory and practical multivariate calibration
Analytical Chemistry, 76 (2004) 4364-4373
Brown emphasizes that the optimum RMSEP is achieved when accepting a negative bias in the slope estimate (Figure UVC 4).

We have performed Monte Carlo simulations, resulting in 10,000 data sets generated according to the noise setting depicted in the left panel of Figure UVC 4. Models were estimated using ordinary least squares (OLS), total least squares (TLS) and corrected least squares (CLS; not to be confused with classical least squares, abbreviated identically above). OLS simply regresses y onto x. This leads to biased slope estimates because OLS models the plain (observed) variances: since the predictor variance contains a spurious noise contribution, x gets too much weight in the regression. This explains why the OLS slope estimate is biased low. TLS and CLS, on the other hand, are methods that yield slope estimates that are asymptotically unbiased. This is achieved by compensating for the spurious contribution of the predictor noise in the regression. Both methods require an estimate of the predictor noise variance to remove the spurious contribution. Whereas TLS estimates the predictor noise from the data, CLS utilizes an independent noise estimate. For more details about TLS and CLS, see:
 S. Van Huffel and J. Vandewalle
The total least squares problem. Computational aspects and analysis, SIAM (1991)
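The three estimators can be sketched for the simplest case of a straight line. The closed-form TLS solution below assumes equal error variances on both axes, and all numbers are illustrative choices rather than the settings behind the simulations reported here:

```python
import numpy as np

rng = np.random.default_rng(4)

# Straight line with noise on both axes; true slope is 1.
n, s_noise = 500, 0.5
x_true = rng.normal(0.0, 2.0, n)
x = x_true + rng.normal(0.0, s_noise, n)           # noisy predictor
y = 1.0 + 1.0 * x_true + rng.normal(0.0, s_noise, n)

sxx = np.var(x)                                    # observed variances/covariance
syy = np.var(y)
sxy = np.cov(x, y, bias=True)[0, 1]

# OLS: models the plain variances, hence an attenuated (low) slope.
b_ols = sxy / sxx

# TLS: closed form for equal error variances on both axes.
d = syy - sxx
b_tls = (d + np.sqrt(d ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)

# Corrected LS: subtract the independently known noise variance from sxx.
b_cls = sxy / (sxx - s_noise ** 2)

print(round(b_ols, 2), round(b_tls, 2), round(b_cls, 2))
```

The OLS slope comes out below the true value of 1, while the TLS and corrected-LS slopes are close to it, mirroring the behaviour discussed in the text.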
The following table gives an overview of the uncertainties associated with estimation and prediction, where the root mean squared error (RMSE) is further broken down into the standard error (SE; square root of the variance) and bias:

Table UVC 1: Summary statistics obtained for the xnoise setting: s = 50 for both estimation and prediction.


It is observed that the theoretical expectations are realized for OLS, CLS and TLS. OLS has an inferior RMSE for estimation in column 2 due to the relatively large bias in column 4, but yields a smaller RMSE for prediction in column 5 because the standard error is reduced. (Note that SE and bias are added in quadrature to obtain the RMSE.) CLS and TLS are almost unbiased for estimation and prediction. These methods are preferable if focus is on the model, e.g. for interpretation. It is noted that for this particular example the differences in RMSE are much smaller for prediction than for estimation. One might therefore opt for TLS or CLS to ensure a small bias at the expense of a slightly increased prediction RMSE. Finally, the standard error of estimation is slightly larger for CLS and TLS because a bias correction always introduces uncertainty.
The situation becomes more complicated if the predictor noise has different magnitude during estimation and prediction. The following results are obtained by setting the predictor noise variance to zero during prediction. Although the opposite scenario is more typical in applied work, i.e., to have relatively noisy predictors in the prediction phase (see below), the results are nevertheless illustrative of the actions that can be taken. The estimation results in columns 2-5 are almost identical to the ones presented above; the small differences that can be observed are caused by a different initialization of the pseudo-random number generator:

Table UVC 2: Summary statistics obtained for the xnoise setting: s = 50 for estimation, whereas s = 0 for prediction.


Since the predictors are noisefree during prediction, there is obviously no biasvariance tradeoff that might favor the use of OLS, hence TLS and CLS are to be preferred.
As mentioned above, it is more natural to have noisier predictor variables during prediction, especially in online applications, see:
 C.M. Andersen, R. Bro and P.B. Brockhoff
Quantifying and handling errors in instrumental measurements using the measurement error theory
Journal of Chemometrics, 17 (2003) 621-629
With nonnegligible predictor noise during prediction only, one obtains the following results:

Table UVC 3: Summary statistics obtained for the xnoise setting: s = 0 for estimation, whereas s = 50 for prediction.


The OLS slope estimate is obviously unbiased because the predictors are error-free during estimation. TLS cannot be used because the predictor noise (which occurs during prediction only) cannot be estimated from the training data. 'CLS' is the opposite of CLS in the sense that it deliberately introduces the bias that CLS would correct if the training data were corrupted by the noise now encountered during prediction only. Some afterthought shows that OLS should behave similarly to CLS and TLS in Table UVC 1, and likewise 'CLS' to OLS: OLS is clearly superior for estimation, hence interpretation, whereas 'CLS' is slightly better for prediction.
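A toy sketch of the reasoning: with noise-free training predictors and noisy prediction-stage predictors, deliberately attenuating the OLS slope (the role played by 'CLS' here) trades a little bias for a large variance reduction. The attenuation factor below is the population value implied by these toy settings; all numbers are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)

# Estimation with noise-free predictors, prediction with noisy ones.
n, s_pred = 100, 1.0
x = rng.normal(0.0, 1.0, n)                          # noise-free training predictors
y = 1.0 + 1.0 * x + rng.normal(0.0, 0.1, n)
b, a = np.polyfit(x, y, 1)                           # essentially unbiased slope

# Attenuation factor a 'reverse-corrected' model would apply:
# var(x) / (var(x) + prediction-stage noise variance).
k = np.var(x) / (np.var(x) + s_pred ** 2)

x_new = rng.normal(0.0, 1.0, 50_000)                 # new samples
x_obs = x_new + rng.normal(0.0, s_pred, x_new.size)  # noisy at prediction time
y_true = 1.0 + 1.0 * x_new

pred_ols = a + b * x_obs                             # unbiased slope, full noise propagation
centre = a + b * np.mean(x)
pred_att = centre + k * b * (x_obs - np.mean(x))     # shrunken slope

rmsep_ols = np.sqrt(np.mean((pred_ols - y_true) ** 2))
rmsep_att = np.sqrt(np.mean((pred_att - y_true) ** 2))
print(rmsep_att < rmsep_ols)                         # shrinkage wins on RMSEP
```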

Extrapolation in prediction: limit of detection


The novice is usually taught that one should use calibration models to predict at interpolating positions only, because it is not safe to take an empirical model outside the calibrated range. However, some of the most interesting applications arise when extrapolating, e.g.:
 prediction of future events;
 development of a product with higher consumer appreciation using preference mapping;
 search for a molecule with higher biological activity using a quantitative structure-activity relationship (QSAR) model;
 determination of analyte concentration using the method of standard additions; and
 detection of lower analyte concentrations in trace analysis.
The remainder of this section is concerned with the last of these applications, namely limit of detection (LOD) estimation. The following example is taken from:
 F.J. del Río Bocio, J. Riu, R. Boqué and F.X. Rius
Limits of detection in linear regression with errors in the concentration
Journal of Chemometrics, 17 (2003) 413-421
These detection limits are based on the prediction intervals developed in:
 F.J. del Río Bocio, J. Riu and F.X. Rius
Prediction intervals in linear regression taking into account errors on both axes
Journal of Chemometrics, 15 (2001) 773-788
It is noted that this error analysis can be further refined using the results derived in:
 M. GaleaRojas, M.V. de Castilho, H. Bolfarine and M. de Castro
Detection of analytical bias
Analyst, 128 (2003) 1073-1081
 M. de Castro, M. GaleaRojas, H. Bolfarine and M.V. de Castilho
Detection of analytical bias when comparing two or more measuring methods
Journal of Chemometrics, 18 (2004) 431-440
The data sets under study were characterized by heteroscedastic errors in both axes. Three methods were considered to fit straight lines through the (x,y) data points:
 ordinary least squares (OLS), which accommodates homoscedastic errors in the y-axis only;
 weighted least squares (WLS), which generalizes OLS to heteroscedastic errors in the y-axis; and
 bivariate least squares (BLS), which generalizes total least squares (TLS) to heteroscedastic errors in both axes.
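A minimal sketch of the OLS/WLS distinction (BLS requires an iterative fit and is not reproduced here); the noise model and weights are illustrative assumptions, not those of the data sets below:

```python
import numpy as np

rng = np.random.default_rng(6)

# Heteroscedastic y-errors: the noise grows with concentration.
x = np.linspace(1.0, 10.0, 10)
sd = 0.1 * x                                   # known per-point error sd
y = 0.5 + 2.0 * x + rng.normal(0.0, sd)        # true line: 0.5 + 2x

b_ols, a_ols = np.polyfit(x, y, 1)             # equal weight for every point
# WLS: weight each point by 1/sd (np.polyfit's w multiplies the residuals).
b_wls, a_wls = np.polyfit(x, y, 1, w=1.0 / sd)

print(round(b_ols, 2), round(b_wls, 2))        # both near the true slope 2
```

Both fits recover the slope here, but the WLS estimate carries a smaller uncertainty because the noisy high-concentration points are down-weighted, which is what matters for the detection limits discussed below.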
Results are briefly discussed for:
 Xray fluorescence; and
 capillary electrophoresis.
1. Xray fluorescence (XRF)
The calibration samples for the XRF determination are 15 geological certified reference materials (CRMs). The errors in the CRMs (x-axis error) were calculated from a worldwide interlaboratory certification trial, while the error in the instrumental response (y-axis error) was obtained from 7 replicated measurements of each CRM on different days. Interferences were taken into account and possible matrix effects were corrected using the incoherent (Compton) radiation of the sample.
Owing to the relatively large error in the xaxis, the models are quite different, as shown here for Na_{2}O:

Not only do the model estimates themselves differ, but so do the uncertainties associated with these models. The combination of these two effects leads to large differences among the estimates for the limit of detection:

Table UVC 4: Detection limits for the nine analytes studied by XRF when the α and β errors (false positive and false negative probabilities) are set to 5%. All results are expressed in ppm.


^{1} See Figure UVC 5 ^{2} See Figure UVC 6

With few exceptions, the detection limit estimates are lowest for BLS.
2. Capillary electrophoresis (CE)
Similar observations are made for this data set:

Table UVC 5: Detection limits for the three anions analyzed by CE when the α and β errors (false positive and false negative probabilities) are set to 5%. All results are expressed in ppm.


Conclusions


Whether to prefer the classical or the inverse model depends on the goal of the analysis. Biased parameter estimates can be unacceptable if the focus is on interpretability. Total least squares (TLS) and corrected least squares (CLS) are examples of methods that reduce the bias caused by random errors. By contrast, a biased calibration model can be optimal in terms of RMSEP when prediction is the goal. The situation is further complicated when (1) the predictor noise differs between estimation and prediction and (2) extreme extrapolation is attempted. The same considerations play a role when moving to higher-complexity predictors. For example, the generalization of bivariate least squares (BLS) is maximum likelihood calibration.

References & further information


Open a list of references. These references should supplement the ones that are well known from basic statistics.
For further information, please contact Jordi Riu.

