To p or not to p: the use of p-values in analytical science
Abstract
A significance test can be performed by calculating a test statistic, such as Student’s t or chi-squared, and comparing it with a critical value for the corresponding distribution. If the test statistic exceeds the critical value, the test is considered “significant”. The critical value is chosen so that there is a low probability, often 5% (for “95% confidence”), of obtaining a significant test result by chance alone. Routine use of computers has changed this situation: software still presents critical values at traditional probabilities, but now also calculates a probability, the “p-value”, for the calculated value of the test statistic. A low p-value (say, under 0.05) can be taken as a significant result in the same way as a test statistic passing the 95% critical value. This applies to a wide variety of statistical tests, so p-values now pop up routinely in statistical software. However, their real meaning is not as simple as it seems, and the widespread use of p-values in science has recently been challenged, and in some cases even banned. What does this mean for p-values in analytical science?
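The equivalence between the two decision rules described above can be illustrated with a minimal Python sketch. It is not taken from the brief. To stay within the standard library it assumes a two-sided z-test with a known standard deviation rather than Student’s t, and the replicate results, the reference value `mu0`, the value of `sigma` and the function name `z_test` are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_test(values, mu0, sigma, alpha=0.05):
    """Two-sided z-test of the sample mean against mu0.

    Assumes the population standard deviation sigma is known
    (an illustrative simplification; a t-test would estimate it).
    Returns the test statistic, the critical value and the p-value.
    """
    n = len(values)
    # Test statistic: distance of the mean from mu0 in standard errors.
    z = (mean(values) - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    # Critical value for a two-sided test at significance level alpha.
    critical = std_normal.inv_cdf(1 - alpha / 2)
    # p-value: probability of a statistic at least this extreme
    # in either tail if the true mean were mu0.
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return z, critical, p_value


# Hypothetical replicate results for a reference material certified at 10.0.
results = [10.2, 10.4, 10.1, 10.5, 10.3, 10.6]
z, critical, p_value = z_test(results, mu0=10.0, sigma=0.3)

# The two decision rules always agree for the same alpha.
significant_by_critical_value = abs(z) > critical
significant_by_p_value = p_value < 0.05
print(z, critical, p_value)
```

For any given significance level, comparing the statistic with the critical value and comparing the p-value with that level give the same verdict. The p-value simply reports where the statistic falls on a continuous scale rather than on one side of a fixed threshold.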
- This article is part of the themed collection: Analytical Methods Committee Technical Briefs