Averaging Polls and Sampling Duration

In deciding how to aggregate public opinion data, the ORACLE team was tasked with determining which polls should carry more weight in the model. A poll taken one day before the election, for example, should have a larger impact on the average than a poll taken seventy-six days before the election. This kind of weighting can apply to many factors, including poll grade and sample size.

Although our class ultimately voted to use the z-test method, in which we take a new batch of polls each week and replace the old ones if necessary, we are still curious as to whether a poll’s duration -- the time elapsed between the first and last response -- factors into its accuracy.

To explain why duration could be an important factor, let’s use an example. Imagine a tight race between Candidate A and Candidate B. A polling company plans to take two national polls, one over the course of a week, and another on just the Monday of that week. On Sunday night, a viral news story shows that Candidate A habitually steals gum from his local pharmacy. But early Tuesday morning, the report is discredited as Candidate A claims he has been allergic to gum for his entire life.

Here’s how the polls turned out:

Week-Long Poll: Candidate A - 48% Candidate B - 46%

Monday Poll: Candidate A - 40% Candidate B - 52%

There’s an eye-popping difference between these two polls, and it’s not hard to see why. Every respondent to the Monday poll was under the impression that Candidate A was a shoplifter, while only about 1 in 7 respondents to the week-long poll (those surveyed on Monday, assuming responses were spread evenly across the week) had the same misconception. This is an example of a short-duration poll capturing a short-lived shift in momentum, which can lead to incorrect interpretation of the data.
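
To make the arithmetic behind this example concrete, here is a minimal Python sketch. It assumes the week-long poll collected the same number of responses on each of its seven days, which the example does not state, and the function name implied_rest_of_week is our own, chosen for illustration. Under that assumption, it backs out what respondents on the other six days must have said for the week-long poll to land where it did.

```python
def implied_rest_of_week(week_avg, monday, days=7):
    """Average support on the non-Monday days.

    Assumption (not from the example itself): every day of the
    week-long poll contributes the same number of respondents.
    """
    return (days * week_avg - monday) / (days - 1)


# Numbers from the hypothetical Candidate A / Candidate B example.
a_rest = implied_rest_of_week(week_avg=48.0, monday=40.0)
b_rest = implied_rest_of_week(week_avg=46.0, monday=52.0)

print(round(a_rest, 1), round(b_rest, 1))  # prints: 49.3 45.0
```

With equal daily samples, Candidate A would have needed roughly 49% support on the remaining six days, which shows how much a single bad day is diluted in a longer poll.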

To test this concept, we weighted the polls in our 2016 collection by duration to see if the new average was considerably different. Because some poll durations were outliers (several polls ran for 20 or 30 days), we took the logarithm of each duration plus one, log(duration+1), to bring the values closer together. Then we multiplied each poll's Clinton percentage by its logarithmic weight, summed the products, and divided the total by the sum of the weights. Nationally, the two-party Clinton percentage went from 51.3% (before adjustment) to 51.1% (after adjustment), showing little, if any, difference.

State	Two-party Clinton (unweighted)	Weighting	Two-party Clinton (weighted)
US	0.513	Multiply every poll by its duration, divide by sum	0.505
US	0.513	Multiply every poll by log(duration+1), divide by sum	0.511
Michigan	0.533	Multiply every poll by its duration, divide by sum	0.538
Michigan	0.533	Multiply every poll by log(duration+1), divide by sum	0.532
Pennsylvania	0.526	Multiply every poll by its duration, divide by sum	0.525
Pennsylvania	0.526	Multiply every poll by log(duration+1), divide by sum	0.526
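
The two weighting schemes in the table can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code we ran: the poll values below are made up, the function names are our own, and we assume the natural logarithm (the base does not change the result, because a constant factor in the weights cancels when dividing by their sum).

```python
import math


def weighted_average(polls, weight_fn):
    """Weighted mean of Clinton shares.

    polls: list of (clinton_share, duration_days) pairs.
    weight_fn: maps a duration in days to a non-negative weight.
    """
    total_weight = sum(weight_fn(days) for _, days in polls)
    if total_weight == 0:
        raise ValueError("all weights are zero")
    weighted_sum = sum(share * weight_fn(days) for share, days in polls)
    return weighted_sum / total_weight


def duration_weight(days):
    # "Multiply every poll by its duration, divide by sum"
    return days


def log_duration_weight(days):
    # "Multiply every poll by log(duration+1), divide by sum"
    return math.log(days + 1)


# Hypothetical polls (NOT the 2016 collection): (two-party Clinton share, duration in days)
polls = [(0.52, 1), (0.50, 3), (0.51, 7), (0.49, 30)]

unweighted = sum(share for share, _ in polls) / len(polls)
by_duration = weighted_average(polls, duration_weight)
by_log = weighted_average(polls, log_duration_weight)

print(f"unweighted={unweighted:.3f} duration={by_duration:.3f} log={by_log:.3f}")
```

In this made-up sample, the single 30-day poll dominates the plain duration weighting, while the logarithmic weighting stays closer to the unweighted average. That is the reason for taking logarithms when a few durations are much longer than the rest.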

Another way we checked the effect of duration was to analyze whether polls with a duration of 2 days or longer were more accurate than all polls taken as a whole. After applying this method to specific Trump-carried swing states, including Michigan, Florida, and Ohio, we found no state where the two-party Clinton share differed by more than 1 percentage point.

State	Rawpoll_Clinton	Rawpoll_Clinton D>2	2p_Clinton	2p_Clinton D>2
Michigan	42.876	41.606	53.26	52.92
Minnesota	42.245	41.925	55.6	55.8
Florida	44.55	44.12	50.83	50.98
Iowa	39.673	39.49	49.8	49.9
Ohio	41.589	40.813	49.9	49.7
Pennsylvania	44.98	44.51	52.57	52.67
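
A comparison like the one in this table could be computed along the following lines. This is a hedged sketch rather than our actual analysis: the polls are invented, the names are our own, and it assumes that the two-party share is Clinton / (Clinton + Trump) computed per poll and then averaged. The cutoff is left as a parameter because the text describes polls of 2 days or longer while the column header reads D>2.

```python
from statistics import mean


def two_party_share(clinton, trump):
    """Clinton's share of the Clinton + Trump total, in percent (assumed definition)."""
    return 100.0 * clinton / (clinton + trump)


def summarize(polls, min_duration=None):
    """Return (mean raw Clinton %, mean two-party Clinton %).

    If min_duration is given, only polls lasting at least that many days are used.
    """
    selected = [
        p for p in polls
        if min_duration is None or p["duration"] >= min_duration
    ]
    if not selected:
        raise ValueError("no polls meet the duration cutoff")
    raw = mean(p["clinton"] for p in selected)
    two_party = mean(two_party_share(p["clinton"], p["trump"]) for p in selected)
    return raw, two_party


# Hypothetical state polls (NOT real 2016 data)
polls = [
    {"clinton": 45.0, "trump": 45.0, "duration": 1},
    {"clinton": 44.0, "trump": 40.0, "duration": 3},
    {"clinton": 42.0, "trump": 42.0, "duration": 5},
]

all_raw, all_2p = summarize(polls)
long_raw, long_2p = summarize(polls, min_duration=2)

print(f"all polls:      raw={all_raw:.2f}  two-party={all_2p:.2f}")
print(f"duration >= 2:  raw={long_raw:.2f}  two-party={long_2p:.2f}")
```

Changing min_duration to 3 would match a strict "more than 2 days" reading of the cutoff for whole-day durations.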

The results do not confirm the effect of duration we were looking for. There are many possible reasons for the absence of this effect, the most notable being that short-term momentum shifts do not radically change polls. This makes sense, as research has shown that the majority of voters know who they are voting for months before the election.

Overall, the 2018 method of averaging polls was not chosen by the 2020 class, partly because it overpredicted Clinton’s vote. The z-test method described earlier overpredicted Clinton by only 1.36 percentage points, while the poll-averaging method (which weighted by time) overpredicted by 3 percentage points.

When looking at late polls in this election cycle, it may be helpful to analyze the poll’s key qualities -- time, grade, sample size -- before deciding whether that poll shows any new information. However, no individual feature should be considered make-or-break. Following the 2020 results, pollsters should analyze whether duration was a better predictor of accuracy than it was in 2016.

Presidential Election Model 2020

About Us

ORACLE of Blair is a project by seniors at Montgomery Blair High School in Silver Spring, Maryland. It was created during the Fall 2020 Political Statistics course taught by Mr. David Stein. Questions for the students about the model can be sent to mbhs.polistat@gmail.com, while Mr. Stein can be reached directly through the Blair website.

Please Note

Any views or opinions expressed on this site are those of the students in Montgomery Blair High School's 2020 Political Statistics class and do not necessarily reflect the official position of Montgomery Blair High School.