Estimating mass sentiment from noisy, biased signals:
recent Australian elections and the Voice referendum

25 August 2023

Polls are like 1940s radar

  • noisy sensor (sampling error)
  • likely a biased sensor (“house effects”)
  • snapshots of dynamic target (discrete field period)
  • target’s law of motion is unknown (not ballistic)
  • limited resolution (coarse reporting of published polls)
  • dependencies among multiple targets (vote shares sum to 100%)

“House” effects: biases specific to a polling company

  • sampling methodology (e.g., RDD, landline/mobile mix; quotas from web panel)
  • weighting procedures and selection of weighting variables (post-stratification via raking; propensity score matching)
  • survey mode (live interviewer, IVR, web self-complete)
  • question wording
  • response options (are minor parties or DK offered, or only recorded if volunteered? are DKs pushed?)
  • field operations (time of day, day of week)
  • reporting conventions (DKs reported or not)
  • compounded in low or uncertain voter turnout environments

Goals of model

  • combine information from multiple noisy/biased signals
  • recover trajectory, learn about campaigns and changes in public opinion
  • learn about pollster biases
  • forecasts for outcomes

Model for poll averaging: setup & notation, scalar target

  • Let \(t\) index campaign days.
  • Poll \(p\), fielded on day \(t\) by polling company \(j\), yields an estimated voting intention: a proportion \(\color{cyan}{y_p} \in [0,1]\), with sample size \(\color{cyan}{n_p}\). The variance of this estimate is approximately \(\color{cyan}{V_p = y_p (1-y_p)/n_p}\).
  • True, latent voting intentions on day \(t\) are \(\color{orange}{\xi_t} \in [0,1]\). These are observed exactly on election days, at \(t = 1\) and \(t = T\): \(\color{orange}{\xi_1}\) and \(\color{orange}{\xi_T}\), respectively.
  • Polling company \(j\) has a time-invariant “house effect” \(\color{orange}{\delta_j}\), such that \(E(\color{cyan}{y_p}) = \color{orange}{\xi_{t(p)}} + \color{orange}{\delta_{j(p)}}\).
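For concreteness, the variance approximation above can be checked numerically (the poll figures here are illustrative, not from any real poll):

```python
# Approximate sampling variance of a published poll estimate:
# V_p = y_p * (1 - y_p) / n_p
def poll_variance(y_p: float, n_p: int) -> float:
    """Binomial approximation to the variance of a poll proportion."""
    return y_p * (1.0 - y_p) / n_p

# Illustrative poll: 52% voting intention from a sample of 1,000
y_p, n_p = 0.52, 1000
V_p = poll_variance(y_p, n_p)
se = V_p ** 0.5
print(f"variance = {V_p:.6f}, standard error = {se:.4f}")
```

The standard error of roughly 0.016 is the familiar "plus or minus 1.6 points" margin attached to a poll of this size.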

State-space model for poll averaging: locally constant latent state

  • Measurement model: \(\color{cyan}{y_{p}} \sim N(\color{orange}{\xi_{t(p)}} + \color{orange}{\delta_{j(p)}} \, , \, \color{cyan}{V_p})\)

  • Dynamic model: \(\color{orange}{\xi_t} \sim N(\color{orange}{\xi_{t-1}}, \color{orange}{\omega^2})\)

  • Given published polls, \(\color{cyan}{\boldsymbol{Y}}\), sample sizes, field dates and identity of polling companies — and the model — we seek

  1. trajectory of latent voting intentions \(\color{orange}{\boldsymbol{\xi}} = (\color{orange}{\xi_1}, \ldots, \color{orange}{\xi_T})'\)
  2. house effects: \(\color{orange}{\boldsymbol{\delta}} = (\color{orange}{\delta_1}, \ldots, \color{orange}{\delta_J})'\)
  3. “pace of change” parameter (innovation variance), \(\color{orange}{\omega^2}\).
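A minimal Python sketch of this generative model, with made-up parameter values (the actual estimation uses the R/Stan tooling discussed next):

```python
import numpy as np

rng = np.random.default_rng(1)

T, J = 100, 3                 # campaign days, polling companies
omega = 0.002                 # daily innovation s.d. ("pace of change")
delta = np.array([0.01, -0.005, 0.0])   # house effects, one per pollster

# Dynamic model: xi_t ~ N(xi_{t-1}, omega^2), starting at 0.50
xi = 0.50 + np.cumsum(rng.normal(0, omega, T))

# Measurement model: y_p ~ N(xi_{t(p)} + delta_{j(p)}, V_p)
polls = []
for _ in range(30):
    t = rng.integers(0, T)          # field day
    j = rng.integers(0, J)          # pollster
    n = rng.choice([800, 1000, 1500])
    mu = xi[t] + delta[j]
    V = mu * (1 - mu) / n
    y = rng.normal(mu, np.sqrt(V))
    polls.append((t, j, n, y))
```

Each simulated poll carries exactly the metadata the model conditions on: field day, pollster identity, sample size, and the published proportion.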

Estimation and inference

  • Gaussian law of motion: Kalman filter.
  • in Bayesian statistics: dynamic linear model (West & Harrison).
  • house effects and partially observed polling data make the model slightly non-standard for off-the-shelf Kalman filtering (many packages in R)
  • EM or MCMC via R and C/C++
  • JAGS via rjags (Plummer 2019).
  • Stan via RStan (Stan Development Team 2020)
  • nimble (de Valpine et al. 2017)
  • pomp (King et al. 2016)
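For intuition, a hand-rolled local-level Kalman filter in Python; this is a sketch that treats house effects as known and subtracts them from each poll, not the production implementations listed above:

```python
import numpy as np

def kalman_filter(polls, T, delta, omega2, m0=0.5, C0=0.01):
    """Filtered means/variances of xi_t for the local-level model.

    polls: list of (t, j, n, y) tuples; delta: known house effects.
    Days with no poll are pure prediction steps.
    """
    by_day = {}
    for t, j, n, y in polls:
        by_day.setdefault(t, []).append((j, n, y))

    m, C = m0, C0
    means, variances = [], []
    for t in range(T):
        # predict: xi_t | y_{1:t-1} ~ N(m, C + omega2)
        C = C + omega2
        # update with each poll fielded on day t
        for j, n, y in by_day.get(t, []):
            V = y * (1 - y) / n               # measurement variance
            K = C / (C + V)                   # Kalman gain
            m = m + K * ((y - delta[j]) - m)  # bias-corrected innovation
            C = (1 - K) * C
        means.append(m)
        variances.append(C)
    return np.array(means), np.array(variances)

# toy usage: two polls from a single unbiased pollster
polls = [(2, 0, 1000, 0.52), (5, 0, 1000, 0.54)]
m, C = kalman_filter(polls, T=7, delta=[0.0], omega2=1e-5)
```

In the full model the \(\delta_j\) are unknown, which is why EM or MCMC is needed rather than a single filtering pass.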

Elaborations

  • optionally, use endpoint constraints from election results: \(\color{orange}{\xi_1}\) and (ex post) \(\color{orange}{\xi_T}\) are observed exactly.

  • Augment model with unknown step/discontinuities \(\color{orange}{\gamma}\) in \(\color{orange}{\boldsymbol{\xi}}\) trajectory, for known “event” days (e.g., leadership changes).

  • add trend component to model: \[ \begin{aligned} \begin{pmatrix} \xi_{t+1} \\ \zeta_{t+1} \end{pmatrix} & = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{pmatrix} \xi_t \\ \zeta_t \end{pmatrix} + \begin{pmatrix} v_t \\ w_t \end{pmatrix} \\[12pt] v_t & \sim N(0, \sigma^2_v) \\ w_t & \sim N(0, \sigma^2_w) \end{aligned} \]
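The trend component can be simulated directly from this state equation (innovation variances here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
T = 60
F = np.array([[1.0, 1.0],
              [0.0, 1.0]])         # local linear trend transition matrix
sigma_v, sigma_w = 0.002, 0.0001   # level and slope innovation s.d.

state = np.array([0.50, 0.0])      # (xi_0, zeta_0): level and slope
levels = []
for _ in range(T):
    noise = np.array([rng.normal(0, sigma_v), rng.normal(0, sigma_w)])
    state = F @ state + noise      # (xi_{t+1}, zeta_{t+1})
    levels.append(state[0])
```

The slope \(\zeta_t\) lets the latent voting intention drift persistently in one direction, rather than only via the level's random walk.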

Elaborations, continued

  • volatility regimes: “campaign period” vs bulk of the electoral cycle.

  • multivariate targets: US presidential politics, tracking 11 “battleground” states.

    • reasonable number of state-specific polls
    • interesting and subtle choices about covariance structure of latent state vector

Identification of model parameters

  • as initially presented, the model is over-parameterised
  • \(E(\color{cyan}{y_p}) = \color{orange}{\xi_{t(p)}} + \color{orange}{\delta_{j(p)}}\).
  • Invariant to translation: indistinguishable from \(E(\color{cyan}{y_p}) = [\color{orange}{\xi_{t(p)}} + \color{red}{c}] + [\color{orange}{\delta_{j(p)}} - \color{red}{c}], \quad \forall\ \color{red}{c} \neq 0\).
  • Post-election, end-point constraints: anchor \(\xi_T\) to known election result, and/or \(\xi_1\) to past election result as may be appropriate.
  • “Sum-to-zero” normalisation of house effects \(\color{orange}{\delta}\); i.e., set \(\color{red}{c} = \color{orange}{\bar{\delta}}\), such that \(\xi_t\) are identified up to a translation equal to the average bias of all pollsters.
  • With \(\color{orange}{\xi_1}\) or \(\color{orange}{\xi_T}\) known we pin down the \(\{ \color{orange}{\mathbf{\xi}} \}\) trajectory and can relax normalising restriction on house effects, revealing absolute (vs relative) pollster biases.
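Both the translation invariance and the sum-to-zero fix are easy to verify numerically (all values made up):

```python
import numpy as np

xi = np.array([0.50, 0.51, 0.52])      # latent path (days 0..2)
delta = np.array([0.01, -0.02, 0.03])  # house effects (pollsters 0..2)

# expectation of a poll by pollster j fielded on day t
def poll_mean(xi, delta, t, j):
    return xi[t] + delta[j]

# shifting xi by c and delta by -c leaves every expectation unchanged
c = 0.015
xi_shift, delta_shift = xi + c, delta - c
assert np.allclose(poll_mean(xi, delta, 1, 2),
                   poll_mean(xi_shift, delta_shift, 1, 2))

# sum-to-zero normalisation: absorb the average bias into the latent path
c_bar = delta.mean()
xi_id, delta_id = xi + c_bar, delta - c_bar
```

After the normalisation the \(\delta_j\) sum to zero, so they are interpretable only as biases relative to the industry average, not absolute biases.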

Examples

  • Australian federal elections 2007-2022

  • Pollster biases

  • Voice referendum

Example, 2019 Australian federal election

  • Widely considered one of the biggest “misses” for the polling industry.

  • prompted an international review and formation of the Australian Polling Council

2019 miss on 2PP was large

Example, 2019 Australian federal election

Election Day error of poll averages, 2007 to 2022

Positive/negative errors = polls are over/under estimates of party support (percentage points).

Year      ALP    GRN    LNP   LNP 2PP    OTH
2007     1.74   0.38  -1.00    -1.58  -1.37
2010     0.64   1.26  -1.48    -1.75  -0.27
2013     0.56   1.52  -1.19    -0.69  -0.77
2016    -1.56   1.29   0.06     0.59  -0.06
2019     1.60   0.10  -3.57    -2.46   1.97
2022     3.18  -0.30  -0.36    -1.18      -
Average  1.03   0.71  -1.26    -1.18  -0.10
MAE      1.58   0.82   1.10     1.38   0.77

Errors in poll averages, last 60 days of election campaigns 2007-2022

Correcting for the known 2007-2019 error in poll averages improves 2022 performance for ALP & 2PP poll averages

Corrected poll averages suggest LNP usually out-campaigns Labor

Voice referendum

  • 36 polls

  • 8 pollsters

    • two of them contributing just one poll each
    • two of them contributing just two polls each
    • Resolve 11 polls, Essential 9 polls, YouGov 7 polls.
  • for simplicity, we compute \(y\) = Yes/(Yes + No)

  • post-2019, many pollsters report effective sample sizes
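The Yes-share calculation is a simple renormalisation over decided respondents (raw figures here are illustrative, not from any published Voice poll):

```python
# illustrative raw poll figures (percent): Yes 43, No 47, undecided 10
yes, no = 43.0, 47.0
y = yes / (yes + no)   # Yes share of decided respondents
print(round(y, 3))     # 0.478
```

Dropping the undecideds this way sidesteps pollster-specific conventions about whether and how DKs are reported.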

“Yes” has shed nearly 20 percentage points in 12 months

Voice, Yes trend, weekly ∆

Voice, pollster biases

Summary

  • model can encompass different electoral & campaign settings

  • interesting professional journey from poll-averaging to election forecasting

  • in Australian context, discoveries include:

    • poll biases stubbornly persistent: a tendency to underestimate Coalition support and overestimate Labor's

    • after ex post correction for this bias, a persistent trend to the Coalition over the closing months of Australian election cycles.

  • biases in election poll averages (interpreted as election forecasts) suggest caution in extrapolating from the Voice poll average

    • little to no experience with how referendum poll averages translate into referendum results

Thank you