Analytical Methods Committee, AMCTB No. 73
First published on 6th June 2016
The meaning of the term ‘representative sampling’ is unclear and often leads to undue optimism about both the quality of sampling and the reliability of the resultant measurement results and regulatory decisions. The term ‘appropriate sampling’ is preferable to describe sampling that gives rise to measurement values with uncertainties that are fit-for-purpose.
Regulations in many sectors (e.g., environment, food, health) often set a level of compliance as a limit value (e.g., as a maximum, minimum or average value). Demonstrating compliance against this limit requires a sampling and analytical plan (SAP) that often specifies the need for ‘representative’ samples and chemical analysis by an accredited laboratory. However, the SAP seldom requires the investigation and reporting of the uncertainty of the measurements, the variability of the analyte concentration in the material over space or time, or the evidence that the samples were truly representative. One way that is sometimes supposed to demonstrate that sampling is representative is to duplicate it. If the difference between the results is sufficiently small, this goes some way towards demonstrating representativeness, but it ignores the possibility of a common sampling bias affecting both results (see Fig. 1). It is relevant, therefore, that ISO 3534-4 (ref. 1) states that ‘the notion of a representative sample is fraught with controversy with some survey practitioners rejecting the term altogether’.
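As a minimal illustration of why a duplicate check alone is weak evidence, the sketch below (Python, with hypothetical values not taken from this Brief) compares two duplicate results: close agreement limits the visible random component of sampling, but says nothing about a bias shared by both duplicates.

```python
# Minimal sketch (hypothetical data): checking agreement between duplicate
# samples taken from the same sampling target. A small relative difference is
# often taken as evidence of 'representative' sampling, but a bias common to
# both duplicates (e.g. from a flawed protocol) would pass this check undetected.

def duplicate_relative_difference(x1, x2):
    """Relative difference between two duplicate measurement results."""
    mean = (x1 + x2) / 2.0
    return abs(x1 - x2) / mean

# Hypothetical duplicate results (mg/kg) for one sampling target
x1, x2 = 52.0, 48.0
print(f"Relative difference: {duplicate_relative_difference(x1, x2):.1%}")
# A difference of ~8% might look acceptable, yet both values could still share
# an unrecognised sampling bias of, say, -20% relative to the true mean.
```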
The new approach outlined below demands a quantitative procedure for answering the question of whether sampling is adequate. It indicates that the concept of the mythical ‘representative’ sample should be replaced by that of a more pragmatic but transparent ‘appropriate’ sample, for which fitness for purpose (FFP) can be demonstrated and justified on the financial basis of minimum overall cost.
The more explicit term ‘representative sample’ has been defined in survey statistics as a ‘sample for which the observed values have the same distribution as that in the population’.1 Some of the ambiguity in this term is revealed by the change in its meaning in the definitions used for analytical chemistry2 and physical sampling3, namely a ‘sample resulting from a sampling plan that can be expected to reflect adequately the properties of interest in the parent population’. This latter definition suggests that a sample will not represent the population perfectly, but only to a degree that is considered acceptable, although what counts as acceptable is not made explicit.
An important issue is therefore to specify when an analytical sample can be considered to ‘reflect adequately the properties of interest in the parent population’. One approach has been simply to state that if a physical sample is taken by the ‘correct’ implementation of a ‘correct’ sampling protocol, then the sample is acceptable by definition.5 A more transparent approach is to describe a sample as ‘appropriate’6 if it enables us to make measurements that are fit-for-purpose.
Fitness for purpose has been defined as ‘the degree to which data produced by a measurement process enables a user to make technically and administratively correct decisions for a stated purpose’.7 One way to identify when measurement results are FFP is to express their uncertainty in terms of costs: both the cost of the measurement (including sampling) and the expected consequential costs of incorrect decisions caused by excessive levels of uncertainty. When the sum of these two costs is at a minimum, fitness for purpose and appropriate sampling have been achieved at an optimal level of measurement uncertainty.8 The sampling process often contributes the dominant proportion of the measurement uncertainty; in that case fitness can be achieved most cost-effectively by adjusting the uncertainty arising from sampling. A minimal numerical sketch of this cost trade-off follows.
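The sketch below uses assumed numbers and an assumed cost model (neither drawn from this Brief nor from ref. 8) to show how a minimum in total expected cost identifies an optimal, fit-for-purpose level of uncertainty: measurement cost is taken to rise as uncertainty is reduced, while the expected cost of incorrect compliance decisions falls.

```python
# Minimal sketch of the fitness-for-purpose idea: total expected cost as a
# function of the standard measurement uncertainty u. Assumptions (not from
# the Brief): measurement cost scales as 1/u**2 (halving u needs ~4x the
# sampling effort), and the consequence cost is a loss L weighted by the
# probability of a wrong compliance decision for a true concentration near
# the limit T. All values are hypothetical.

import math

def misclassification_probability(c_true, limit, u):
    """P(measured value falls on the wrong side of the limit),
    assuming normally distributed measurement error with uncertainty u."""
    p_below = 0.5 * (1.0 + math.erf((limit - c_true) / (u * math.sqrt(2.0))))
    return p_below if c_true > limit else 1.0 - p_below

def expected_total_cost(u, c_true=105.0, limit=100.0,
                        unit_measurement_cost=50.0, consequence_loss=10000.0):
    measurement_cost = unit_measurement_cost / u**2          # assumed scaling
    consequence_cost = consequence_loss * misclassification_probability(c_true, limit, u)
    return measurement_cost + consequence_cost

# Scan candidate uncertainties to find the minimum-cost (fit-for-purpose) level
candidates = [0.5 + 0.1 * i for i in range(60)]
u_opt = min(candidates, key=expected_total_cost)
print(f"Optimal standard uncertainty ≈ {u_opt:.1f} (same units as the limit)")
```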
There are at least two ways in which sampling can be made appropriate. The mass of the sample can be changed, typically by altering the number of increments that are collected within the sampling target to make a composite sample. Alternatively, the number of samples (n) taken from the sampling site, and analysed individually, can be changed; the uncertainty of the calculated mean value, expressed as the standard error of the mean (s/√n), can thereby be reduced. This quantitative approach can be used to decide how many samples (or increments) are needed for a particular site (or target) to make the sampling appropriate, as in the simple sketch below. Refinements of these broad calculations are needed where the frequency distributions are not normal, and for low values of n.
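A minimal sketch of the sample-number calculation, assuming a pilot estimate of the standard deviation and a target standard error that has been judged fit for purpose (both values hypothetical):

```python
# Minimal sketch (assumed values, not from the Brief): choosing the number of
# samples n so that the standard error of the mean, s/sqrt(n), meets a target
# uncertainty. Assumes roughly normal data and that s from a pilot survey is
# a reasonable estimate; refinements are needed for skewed data or small n.

import math

def samples_needed(s_pilot, target_standard_error):
    """Smallest n such that s_pilot / sqrt(n) <= target_standard_error."""
    return math.ceil((s_pilot / target_standard_error) ** 2)

s_pilot = 12.0    # standard deviation from a pilot survey (mg/kg)
target_se = 3.0   # standard error of the mean judged fit for purpose (mg/kg)
n = samples_needed(s_pilot, target_se)
print(f"Take at least {n} samples: standard error ≈ {s_pilot / math.sqrt(n):.2f} mg/kg")
```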
• The most reliable action is not to believe that a sample is representative, but to seek specific, rigorous evidence from validation. A sample can never be perfectly representative, because its composition is never identical to the average composition of the sampling target (i.e., the parent population): there will always be residual random and systematic differences. These effects need to be acceptably small and the resulting uncertainty explicitly stated.
• A better way to achieve the wider goal of reliable measurements, and of the regulatory decisions that are based upon them, is to move away from ‘representative’ towards ‘appropriate’ sampling. An ‘appropriate’ sample is one for which the resultant measurement value has an uncertainty that is fit for its intended purpose. Evidence that sampling can be judged ‘appropriate’ could be the results of a validation procedure in which the measurement uncertainty arising from sampling according to a given protocol was shown to be fit for purpose. Samples derived from subsequent applications of this validated protocol to other sampling targets could be considered appropriate if sampling and analytical quality control procedures show no significant deviation from the values found at validation (a minimal sketch of such a check is given below).
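As a hedged illustration of such routine quality control, the sketch below compares the difference between duplicate samples from a routine target with the sampling standard deviation estimated at validation; the control limits used (multiples of √2 times the validated standard deviation) are illustrative assumptions, not prescribed values.

```python
# Minimal sketch (hypothetical limits and data): routine sampling quality
# control after validation. The difference between duplicate samples from a
# routine target is compared with the sampling standard deviation s_val
# estimated at validation. If each duplicate has standard deviation s_val,
# their difference has standard deviation s_val*sqrt(2), so ~2.83*s_val and
# ~4.24*s_val are used here as illustrative warning/action limits.

import math

def check_routine_duplicate(x1, x2, s_val):
    diff = abs(x1 - x2)
    warning = 2.0 * math.sqrt(2.0) * s_val
    action = 3.0 * math.sqrt(2.0) * s_val
    if diff > action:
        return "action: sampling uncertainty exceeds the validated level"
    if diff > warning:
        return "warning: investigate if repeated"
    return "in control: consistent with the validated uncertainty"

s_val = 4.0   # sampling standard deviation from validation (mg/kg), assumed
print(check_routine_duplicate(61.0, 54.0, s_val))   # difference of 7.0 mg/kg
```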
M. H. Ramsey (University of Sussex) and B. Barnes (Environment Agency)
This Technical Brief was drafted on behalf of the Subcommittee for Uncertainty from Sampling and approved by the Analytical Methods Committee on 29/04/16.