Frequentist vs. Bayesian statistics: The basics
When conducting statistical inference and analyses, researchers often face a choice between two approaches: frequentist and Bayesian statistics. Both methods aim to draw conclusions from data, but they rest on different assumptions. This article explores the key differences between these approaches and helps you figure out which one to use and when.
What is inferential statistics?
Inferential statistics is the process of using data from a small group (sample) to make conclusions or predictions about a larger group (population). It helps researchers test ideas, estimate unknown values, and make decisions based on data, even when they can’t study everyone in the population.
There are two main approaches to inferential statistics: Bayesian and frequentist. Frequentist statistics draws conclusions from how often outcomes occur over repeated experiments, treating the underlying population parameters as fixed but unknown. In contrast, Bayesian statistics adds prior knowledge or beliefs into the mix, updating those beliefs as new data come in and expressing uncertainty directly as probability.
Frequentist statistics was developed in the early 20th century by Ronald Fisher, Jerzy Neyman, and Egon Pearson, and it is the standard approach used in most scientific research today. Fisher introduced analysis of variance (ANOVA) in 1919, and Neyman and Pearson later formalized hypothesis testing in 1933. The frequentist framework relies on hypothesis testing, using p-values and confidence intervals (CIs) to determine the significance of results.
Bayesian statistics, on the other hand, dates back to the 18th century and was pioneered by Thomas Bayes. Though largely overlooked for nearly two centuries, Bayesian statistics gained traction in the mid-20th century as advanced computational tools became available that made its complex calculations feasible. Today, it is widely used in machine learning and meta-analyses.
The frequentist vs. Bayesian perspective
The Frequentist POV
Frequentist statistics determines the probability of an event, such as the outcome tested by a hypothesis, based on how often it occurs. It treats probability as the frequency of an event happening over an infinite number of trials. Experiments are repeated many times, like flipping a coin 2,000 times to see how many heads or tails you get, to gather sufficient data for analysis.
In frequentist inference, the parameter you're trying to estimate from a population (like the mean) is considered fixed, meaning there's one true value you're trying to find. Data points collected from an experiment are used to estimate these population parameters, and then probability calculations are made based on the data.
For example, if you flip a coin 100 times and get 48 heads, you estimate the probability of heads as 48/100 = 0.48. The true probability is assumed to be fixed, and with more flips, your estimate should get closer to it.
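This long-run-frequency view is easy to simulate. The sketch below (plain Python; the function name is invented for illustration) estimates P(heads) as the observed frequency of simulated flips, and the estimate settles toward the fixed true value as the number of flips grows:

```python
import random

random.seed(42)

def estimate_heads(n_flips, true_p=0.5):
    """Estimate P(heads) as the observed frequency over n_flips."""
    heads = sum(random.random() < true_p for _ in range(n_flips))
    return heads / n_flips

# The estimate should drift toward the fixed true value as n grows.
for n in (100, 1_000, 100_000):
    print(n, estimate_heads(n))
```

With only 100 flips the estimate can easily land at 0.48 or 0.53; by 100,000 flips it sits very close to 0.5, which is the law of large numbers in action.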
Hypothesis testing
Frequentist statistics use hypothesis testing, which is a way to check if there’s enough evidence in data to support a claim or idea. You start with two possibilities: the null hypothesis, which says there’s no effect or difference, and the alternative hypothesis, which suggests there is an effect.
After collecting data, you calculate a p-value, which shows how likely it is to see your results if the null hypothesis were true. If the p-value is small (usually less than 0.05), you reject the null hypothesis and say there’s enough evidence for the alternative. If it’s large, you don’t reject the null hypothesis and say there’s not enough evidence to support the claim.
For example, imagine you’re testing whether eating a protein-dense diet improves memory. You assign 100 participants to either a high-protein or a low-protein diet for eight weeks. After the intervention, both groups take a memory test, and their memory scores are compared. Using a frequentist approach, you might perform a t-test, which gives you a p-value of, say, 0.07. Since this is above the standard 0.05 threshold, you fail to reject the null hypothesis and conclude that there is insufficient evidence of an effect. However, this approach only determines statistical significance, not the probability that the effect is true.
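As a rough sketch of that workflow, the snippet below runs an independent two-sample t-test on simulated memory scores using SciPy; the group sizes, means, and spreads are invented for illustration, not taken from a real study:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical memory scores (0-100) for the two diet groups.
high_protein = rng.normal(72, 10, 50)  # 50 participants per group
low_protein = rng.normal(69, 10, 50)

# Independent two-sample t-test: H0 says the group means are equal.
t_stat, p_value = stats.ttest_ind(high_protein, low_protein)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")

# Frequentist decision rule at the conventional 0.05 threshold.
if p_value < 0.05:
    print("Reject the null hypothesis")
else:
    print("Fail to reject the null hypothesis")
```

Note that whatever the decision, the p-value describes the data under the null hypothesis; it is not the probability that the diet works.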
A few caveats on p-values:
P-values only measure the likelihood of a result under the null hypothesis and carry no information about the alternative hypothesis. This means a result that is unlikely under the null hypothesis may be almost as unlikely under the alternative.
P-values are usually reported alongside a CI, which represents a range of values within which you can be confident that the true population parameter lies, based on your sample data. For example, a 95% confidence interval means that if you repeated the experiment many times, 95% of the intervals you calculate would contain the true value. It doesn't give you a probability about a specific result, but rather about the method used to estimate it.
The Bayesian POV
Bayesian statistics defines probability as a measure of belief or certainty about an event, and assumes that parameters have probability distributions rather than fixed points as in frequentist statistics. This belief may be updated as new information (aka data) becomes available.
The belief held before the study is conducted is known as the prior probability, which Bayes' theorem converts into a posterior probability: the probability of the parameter given the results of the study.
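In symbols, Bayes' theorem ties these pieces together as follows, where P(θ) is the prior, P(data | θ) is the likelihood of the observed data, and P(θ | data) is the posterior:

```latex
P(\theta \mid \text{data}) = \frac{P(\text{data} \mid \theta)\, P(\theta)}{P(\text{data})}
```

The denominator, P(data), is the overall probability of the data averaged over all parameter values, and it simply rescales the posterior so that it sums to one.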
In Bayesian inference, if you flip a coin 100 times and get 48 heads, you start with a prior belief about the probability of heads (e.g., 50% for a fair coin). As you collect data, you update this belief using Bayes' theorem, resulting in a new probability distribution that reflects both your prior knowledge and the observed results. This allows you to express uncertainty about the true probability and refine your estimate as more data comes in.
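For the coin example, this update has a simple closed form: with a Beta prior over P(heads) and binomial data, the posterior is again a Beta distribution. The sketch below uses an assumed Beta(50, 50) prior (centered on a fair coin) purely for illustration:

```python
# Conjugate Beta-Binomial update for the coin example:
# a Beta(a, b) prior over P(heads), updated with observed flips.
a, b = 50, 50          # assumed prior, centered on 0.5 (fair coin)
heads, tails = 48, 52  # observed data: 48 heads in 100 flips

# Conjugacy: the posterior is Beta(a + heads, b + tails).
a_post, b_post = a + heads, b + tails

posterior_mean = a_post / (a_post + b_post)
print(f"Posterior mean P(heads) = {posterior_mean:.3f}")  # → 0.490
```

The posterior mean (0.490) sits between the prior belief (0.500) and the raw sample frequency (0.480), and the whole posterior distribution, not just this point, quantifies the remaining uncertainty.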
Bayes Factor
Bayesian analysis can produce a Bayes factor (BF), which measures how strongly the data support one hypothesis over another. By a common convention, a BF above 10 indicates strong evidence for a hypothesis, a BF near 1 indicates little evidence either way, and a BF below 1 indicates evidence favoring the competing hypothesis. For example, a BF of 30 shows strong support for a hypothesis, while a BF of 3 shows only modest support.
Unlike frequentist methods, which give a point estimate, the BF compares the evidence for two different hypotheses. Bayesian statistics also gives credible intervals, which quantify how certain the result is. For instance, a 95% credible interval means there is a 95% probability that the true parameter lies within that interval, given the data and the prior.
In our diet and memory study, a Bayesian approach would begin with a prior belief about the effect of protein on memory, based on previous research or expert opinion. After collecting data from both groups, Bayes' theorem updates this prior with the observed results to calculate a posterior probability. Instead of a p-value, the analysis might yield a BF of 55, indicating how strongly the data support an effect of protein on memory. This approach provides a measure of certainty about the effect rather than just a binary decision.
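The same machinery can be shown concretely on the earlier coin example, where the Bayes factor has a closed form. The sketch below compares a fair-coin hypothesis against an alternative with an assumed uniform prior on P(heads); both models and priors are illustrative choices, not the only possible ones:

```python
from scipy import stats

n, k = 100, 48  # observed data: 48 heads in 100 flips

# H0: the coin is fair (p fixed at 0.5).
m0 = stats.binom.pmf(k, n, 0.5)

# H1: p unknown, with an assumed uniform Beta(1, 1) prior.
# Under that prior, the marginal likelihood of any k is 1 / (n + 1).
m1 = 1 / (n + 1)

bf01 = m0 / m1  # Bayes factor for H0 over H1
print(f"BF01 = {bf01:.2f}")  # > 1 means the data favor the fair coin
```

Here the BF comes out around 7, meaning the data are about seven times more probable under the fair-coin model than under the vague alternative, which matches the intuition that 48 heads in 100 flips is unremarkable.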
Advantages and disadvantages
[Table: advantages and disadvantages of Bayesian and frequentist statistics]
Choosing the right approach
There is no right or wrong answer when it comes to choosing between frequentist and Bayesian methods, but factors such as the type of data you have and whether you have prior information can influence which method is the better fit. Consider these before your next analysis:
Prior knowledge. If you have strong prior information about your research question or topic that you want to incorporate, Bayesian methods may be better as they let you factor in your prior beliefs.
Type of data. If your experiment collects data over time or in stages, Bayesian methods are ideal since they allow you to update your conclusions as new information arrives. Meanwhile, frequentist approaches are well suited to large, one-time experiments where you can select the appropriate statistical test up front.
Sample size. The Bayesian approach accounts for prior knowledge and can handle extreme values better, making it work better if you have small datasets. In contrast, frequentist methods are more suitable for large datasets where statistical significance is easier to assess.
Computational complexity. Frequentist analyses are simpler and widely available, whereas Bayesian analyses require more computational power and advanced statistical tools. If you have the training and resources to conduct Bayesian analysis, go ahead and try it out.
Interpretation of results. Frequentist statistics provide binary conclusions (reject or fail to reject the null hypothesis), whereas Bayesian statistics offer a probability-based understanding of uncertainty. Consider which one would be more informative for your research question.
Takeaways
Frequentist methods are popular because they are simple and widely used, while Bayesian methods are more flexible and allow you to include prior knowledge and update beliefs with new data. The best choice depends on the research question, the type of data, and how you want to interpret the results. Researchers should consider the pros and cons of each approach to find the best fit for their analysis or apply a sensitivity analysis and compare the results using both approaches.