A Probabilistic Framework for Pairwise Comparisons
The Bradley-Terry model is a statistical framework for modelling pairwise comparison data, introduced by Bradley and Terry (1952) and anticipated by Zermelo (1929). One application of the model is to take a series of pairwise comparisons and convert them into a ranking. People are good at A-vs-B judgments because binary choices carry a reduced cognitive load[1][4], but they struggle to rank many items directly because of working memory limitations[4][3].
Given a set of items \( \{1, \ldots, n\} \), each with an associated positive parameter \( \pi_i \), the probability that item \( i \) is preferred over item \( j \) is given by:
\[ P(i > j) = \frac{\pi_i}{\pi_i + \pi_j} \]
where \( \pi_i \) represents the "strength" or "utility" of item \( i \).
Bradley & Terry (1952), Zermelo (1929)
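As a minimal sketch of the preference probability just described, the function below divides the strength of item \( i \) by the sum of the two strengths. The function name `bt_prob` and the example strengths are illustrative assumptions, not part of the original model description.

```python
def bt_prob(pi_i: float, pi_j: float) -> float:
    """Probability that item i is preferred over item j under Bradley-Terry."""
    if pi_i <= 0 or pi_j <= 0:
        raise ValueError("strengths must be positive")
    return pi_i / (pi_i + pi_j)


# Hypothetical strengths for three items (illustrative values only).
strengths = {"A": 3.0, "B": 1.0, "C": 1.0}

p_ab = bt_prob(strengths["A"], strengths["B"])  # 3 / (3 + 1)
p_bc = bt_prob(strengths["B"], strengths["C"])  # equal strengths
p_ba = bt_prob(strengths["B"], strengths["A"])  # complement of p_ab
```

Here `p_ab` is 0.75 and `p_bc` is 0.5, and the two directions of any comparison sum to one because the model has no ties.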
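The model is invariant to positive scaling of the parameters: for any constant \( c > 0 \),
\[ \frac{c\,\pi_i}{c\,\pi_i + c\,\pi_j} = \frac{\pi_i}{\pi_i + \pi_j} \]
This means the parameters are determined only up to a common scale, so they are typically constrained (e.g., \( \sum \pi_i = 1 \) or \( \pi_1 = 1 \)) for identifiability.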
Bradley & Terry (1952)
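The scale invariance can be checked numerically. The sketch below multiplies a set of hypothetical strengths by an arbitrary constant, then applies the sum-to-one constraint; the helper names `bt_prob` and `normalize` are assumptions made for illustration.

```python
def bt_prob(pi_i: float, pi_j: float) -> float:
    """Probability that item i is preferred over item j."""
    return pi_i / (pi_i + pi_j)


def normalize(pi: list[float]) -> list[float]:
    """Rescale strengths so they sum to 1 (one common identifiability constraint)."""
    total = sum(pi)
    return [p / total for p in pi]


pi = [3.0, 1.0, 1.0]                # hypothetical strengths
pi_scaled = [10.0 * p for p in pi]  # same model, different scale
pi_norm = normalize(pi_scaled)      # constrained representative

p_original = bt_prob(pi[0], pi[1])
p_scaled = bt_prob(pi_scaled[0], pi_scaled[1])
p_normalized = bt_prob(pi_norm[0], pi_norm[1])
```

All three probabilities agree up to floating-point error, which is why a constraint is needed before individual parameter values can be compared across fits.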
The model can be expressed in log-odds form:
\[ \operatorname{logit} P(i > j) = \log \frac{P(i > j)}{1 - P(i > j)} = \log \pi_i - \log \pi_j \]
This reveals the model as a special case of logistic regression where the linear predictor is \( \log \pi_i - \log \pi_j \).
Bradley & Terry (1952), Springall (1973)
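To illustrate the log-odds view, the sketch below computes the preference probability in two ways: directly from the strengths, and by passing the difference of log-strengths through the logistic function. The names `sigmoid` and `bt_prob_logit`, and the example strengths, are assumptions for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def bt_prob_logit(lambda_i: float, lambda_j: float) -> float:
    """P(i > j) written in terms of log-strengths lambda = log(pi)."""
    return sigmoid(lambda_i - lambda_j)


pi_i, pi_j = 3.0, 1.0  # hypothetical strengths

p_direct = pi_i / (pi_i + pi_j)
p_logit = bt_prob_logit(math.log(pi_i), math.log(pi_j))

# The log-odds of the preference equal the difference of log-strengths.
log_odds = math.log(p_direct / (1.0 - p_direct))
```

Working on the log scale removes the positivity constraint on the parameters, which is one reason logistic-regression software can be reused to fit the model.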
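For any three items \( i, j, k \), if \( P(i > j) > \frac{1}{2} \) and \( P(j > k) > \frac{1}{2} \), then:
\[ P(i > k) > \frac{1}{2} \]
This stochastic transitivity property follows because the two premises imply \( \pi_i > \pi_j > \pi_k \), and it makes the model particularly suitable for ranking problems.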
Davidson (1970)
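The transitivity property can be verified by brute force on a small example. The sketch below enumerates every ordered triple of items and records any case where \( i \) beats \( j \) and \( j \) beats \( k \) with probability above one half, yet \( i \) does not beat \( k \) with probability above one half. The strengths are hypothetical values chosen for illustration.

```python
import itertools


def bt_prob(pi_i: float, pi_j: float) -> float:
    """Probability that item i is preferred over item j."""
    return pi_i / (pi_i + pi_j)


pi = [0.5, 1.2, 2.0, 3.5]  # hypothetical strengths

violations = []
for i, j, k in itertools.permutations(range(len(pi)), 3):
    premises_hold = bt_prob(pi[i], pi[j]) > 0.5 and bt_prob(pi[j], pi[k]) > 0.5
    if premises_hold and not bt_prob(pi[i], pi[k]) > 0.5:
        violations.append((i, j, k))
```

The list `violations` stays empty, since a preference probability above one half is the same statement as one strength exceeding the other.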
The standard approach estimates parameters via maximum likelihood. Given observed comparisons \( y_{ij} \) (number of times \( i \) was preferred over \( j \)), the log-likelihood is:
\[ \ell(\pi) = \sum_{i \neq j} y_{ij} \left[ \log \pi_i - \log(\pi_i + \pi_j) \right] \]
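The maximum likelihood estimates can be obtained via the following iterative scheme:
\[ \pi_i^{(t+1)} = \frac{w_i}{\sum_{j \neq i} \dfrac{n_{ij}}{\pi_i^{(t)} + \pi_j^{(t)}}} \]
where \( w_i \) is the total number of wins for item \( i \), and \( n_{ij} = y_{ij} + y_{ji} \). After each sweep the parameters are rescaled to satisfy the chosen identifiability constraint.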
Bradley & Terry (1952), Zermelo (1929)
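The iterative scheme can be sketched in a few lines of plain Python. This is a minimal illustration, not a production fitter: the function name `fit_bradley_terry`, the tolerance, the iteration cap, and the win counts are all assumptions, and the code assumes every item has at least one win and one loss so that the estimates exist and no denominator is zero.

```python
def fit_bradley_terry(wins, n_iter=1000, tol=1e-10):
    """Estimate strengths from wins[i][j] = number of times i beat j.

    Returns strengths normalized to sum to 1.
    Assumes every item has at least one win and one loss.
    """
    n = len(wins)
    pi = [1.0 / n] * n
    # w[i]: total number of wins for item i.
    w = [sum(wins[i][j] for j in range(n) if j != i) for i in range(n)]
    for _ in range(n_iter):
        new_pi = []
        for i in range(n):
            # Sum over opponents of n_ij / (pi_i + pi_j), using current values.
            denom = sum(
                (wins[i][j] + wins[j][i]) / (pi[i] + pi[j])
                for j in range(n)
                if j != i
            )
            new_pi.append(w[i] / denom)
        # Rescale so the strengths sum to 1 (identifiability constraint).
        total = sum(new_pi)
        new_pi = [p / total for p in new_pi]
        converged = max(abs(a - b) for a, b in zip(new_pi, pi)) < tol
        pi = new_pi
        if converged:
            break
    return pi


# Hypothetical data: three items, ten comparisons per pair.
wins = [
    [0, 7, 8],
    [3, 0, 6],
    [2, 4, 0],
]
pi_hat = fit_bradley_terry(wins)
ranking = sorted(range(len(pi_hat)), key=lambda i: pi_hat[i], reverse=True)
```

Sorting the fitted strengths gives the ranking. For this balanced design the order matches the total win counts, so item 0 ranks first and item 2 last.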
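With a base-10 parameterization and a scale factor of 400, that is \( \pi_i = 10^{R_i/400} \), the Bradley-Terry preference probability coincides with the expected score of the Elo rating system:
\[ P(i > j) = \frac{1}{1 + 10^{(R_j - R_i)/400}} \]
where \( R_i \) and \( R_j \) are Elo ratings.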
Bradley & Terry (1952)
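The correspondence with Elo can be checked numerically by mapping each rating to a strength of 10 raised to the rating divided by 400. The sketch below compares the Elo expected score with the Bradley-Terry probability computed from those strengths; the function names and the example ratings are assumptions for illustration.

```python
def elo_expected(r_i: float, r_j: float, scale: float = 400.0) -> float:
    """Elo expected score of a player rated r_i against a player rated r_j."""
    return 1.0 / (1.0 + 10 ** ((r_j - r_i) / scale))


def elo_to_strength(r: float, scale: float = 400.0) -> float:
    """Map an Elo rating to a Bradley-Terry strength."""
    return 10 ** (r / scale)


r_i, r_j = 1600.0, 1400.0  # hypothetical ratings

p_elo = elo_expected(r_i, r_j)

pi_i, pi_j = elo_to_strength(r_i), elo_to_strength(r_j)
p_bt = pi_i / (pi_i + pi_j)
```

The two probabilities agree up to floating-point error. Adding a constant to every rating multiplies every strength by the same factor, which is the scale invariance of the model expressed on the rating scale.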
Extension | Description | Reference |
---|---|---|
Plackett-Luce | Generalizes to rankings of multiple items | Plackett (1975), Luce (1959) |
Davidson Model | Incorporates ties in comparisons | Davidson (1970) |
Covariate BT | Includes item features as predictors | Springall (1973) |
BT Regression Trunk | Tree-based modeling of subject covariates | D'Ambrosio et al. (2023) |