In the run-up to the first round of the presidential
election, a widely reported poll from the Indonesian
Survey Institute (LSI) suggested that Mr. Susilo
Bambang Yudhoyono (SBY) was ahead of Megawati by 23
percentage points. About 43% of the voters sampled
said that they would vote for Susilo, while about 20%
would vote for Megawati. The margin of error was 3%.
As it turned out, Susilo won 33.58% of the popular
vote in the first round and Megawati 26.29%. That is
a gap of about 7 points, far less than what the poll
suggested. The poll was clearly biased upward in
favor of Susilo. What is more damaging, it was also
biased downward against Megawati.
After the first round of the presidential election,
another survey by LSI in July showed that Susilo
would obtain 68% of the votes, far surpassing the 23%
that incumbent President Megawati would collect. That
is a 45-point difference! A month later, the LSI
reported the results of a survey of 1,200 people in
which 61.3% of the sampled voters said they would
vote for Susilo and 32.7% for Megawati. The margin of
error was 3%. That is a gap of 28.6 points. The next
month, in September, the LSI poll reported that about
61% of the voters sampled said they would vote for
Yudhoyono and 24% for Megawati, with 15% undecided.
Meanwhile, the polls by the Washington-based
International Foundation for Electoral Systems (IFES)
showed 61% support for Yudhoyono, 21% for Megawati
and 18% of voters undecided. An earlier survey by
the Institute for Social and Economic Research,
Education and Information (ISEREI) found 55.9% of
respondents said they would vote for Yudhoyono and
28.7% for Megawati, while 14.5% were undecided.
While the polls consistently showed that about 60% of
the voters would support Susilo, they also
consistently showed a negative bias against Megawati.
The polls in September painted a devastating picture
for Megawati. The IFES results showed a 40-point gap
and the LSI results a 37-point gap. With only two
weeks to election day, the race was essentially over.
If people believed the surveys, there was nothing
Megawati could have done to challenge Susilo. This is
in contrast to the claim made by IFES senior advisor
Hank Valentino, who said that Megawati could still do
much to challenge Susilo in the runoff by
capitalizing on her status as the incumbent. That
claim is not only a joke; it also shows a total lack
of understanding of the very numbers he provided.
Table 1. Election results vs. poll results
| Institution | SBY (%) | Megawati (%) | Gap (%) |
|---|---|---|---|
| Election Results | 60.9 | 39.1 | 20.8 |
| IFES, September | 61 | 21 | 40 |
| IFES, August | 63 | 28.5 | 34.3 |
| LSI, September | 61 | 24 | 37 |
| LSI, August | 61.3 | 32.7 | 28.6 |
| LSI, July | 68 | 23 | 45 |
| ISEREI | 55.9 | 28.7 | 27.2 |
All polls reported a margin of error of 3 to 4%.
It is difficult to deny that the poll results did
serious damage to Megawati’s chances. A two-candidate
race is essentially a zero-sum game. Any negative
perception of Megawati could be considered a positive
perception of Susilo, and vice versa. Take for
example the results reported by LSI in July, where
Susilo led by a 45-point margin with the election
only about two months away. Perhaps even a miracle
could not have saved Megawati from this number. The
68% for Susilo was twice the percentage of the votes
he collected in the first round on July 5, 2004. And
the 23% for Megawati was actually 3 points lower than
what she obtained in the first round. With a margin
of error of about 3%, the percentage of voters who
would vote for Megawati could range from 20% to 26%.
How on earth could one believe these numbers? Did
some of Megawati’s supporters suddenly regret voting
for her in the first round? Hardly. After all, the
poll was conducted right after the first round, and
given the not-so-bad result for Megawati in that
round, there is little reason to believe that her
supporters would back away from voting for her.
This brings us to question the validity of the
surveys. Take for instance the July LSI survey. A
survey (and a poll, for that matter) aims to estimate
the parameter values of a population. Let’s try to
understand this through the following example.
Suppose a marketing agency wishes to determine the
proportion of all the adults in Jakarta who watched a
specific TV program. This proportion is called a
population parameter. In other words, it is the true
value (proportion) of the population that watched the
program. How do we obtain it exactly? There is only
one way: through a census. But a census is very
expensive because it involves every element of the
population. Statisticians have therefore developed
random sampling techniques that deal with this cost
while minimizing the risk of biased results. The idea
is simple. We can study a population by focusing on a
random sample drawn from it. Will the sample’s
results differ from those of the population? Yes,
they will. Several factors can cause the differences,
among them measurement errors and the way we draw the
sample. To overcome the bias that the drawing could
introduce, statisticians design random sampling
techniques. And so, continuing our case, the
marketing agency selected a random sample of 1,000
adults through telephone interviews. Suppose 550 of
the respondents said that they watched the program.
This means that 55% of the adults in the sample
watched the program.
Suppose the TV station that aired the program claimed
that 65% of the population in Jakarta watched it. Can
we believe the station’s claim? Using a statistical
hypothesis test, one can conclude at the 5% level of
significance (the level I use for the remainder of
the discussion) that the TV station’s claim can be
rejected. (Since this procedure involves some
statistical concepts, readers without any prior
knowledge of statistics can send their questions to
the Institute.)
The level of significance is the probability of
rejecting a claim when the claim is in fact true.
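The test mentioned above can be sketched as a one-proportion z-test. This is a minimal illustration and not necessarily the exact procedure the author had in mind; it assumes that 550 of the 1,000 respondents watched the program (the 55% "Watched" figure), and the function name is made up for this example.

```python
import math

def one_proportion_z_test(successes, n, claimed_p):
    """Two-sided z-test of the claim that the true proportion is claimed_p."""
    p_hat = successes / n
    # Standard error computed under the claimed proportion.
    se = math.sqrt(claimed_p * (1 - claimed_p) / n)
    z = (p_hat - claimed_p) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Assumed inputs: 550 of 1,000 watched; the station claims 65%.
z, p_value = one_proportion_z_test(550, 1000, 0.65)
reject = p_value < 0.05  # 5% level of significance
print(f"z = {z:.2f}, reject the 65% claim: {reject}")
```

The sample share sits more than six standard errors below the claimed 65%, so the claim is rejected at the 5% level.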
We can also draw a conclusion using the margin of
error and the confidence interval. Continuing our
example, let’s assume that the results of the survey
are as follows:
Question: Did you watch the TV program X last night?
| Watched | 55% |
| Did not watch | 45% |
| Number of People Polled | 1000 |
| Margin of Error | +/-3.1% |
| Confidence Level | 95% |
Pollsters rely on statistical principles under which
survey results are accurate 95% of the time, provided
they are obtained from a truly random sample. So in
the poll of 1,000 people, one can be 95% confident
that the actual percentage of people in Jakarta who
watched TV program X lies somewhere between 51.9%
(55% - 3.1%) and 58.1% (55% + 3.1%), and that the
percentage who did not watch lies between 41.9% (45%
- 3.1%) and 48.1% (45% + 3.1%). What this means is
that if the poll were repeated over and over with
fresh random samples of viewers, we would expect
about 95% of the resulting intervals to contain the
true percentage. Put differently, about 1 poll out of
20 would fail to be accurate.
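The arithmetic behind the margin of error and the interval can be reproduced in a few lines. This is a sketch using the usual normal-approximation formula for a proportion, with 1.96 as the 95% critical value; the function names are illustrative only.

```python
import math

Z_95 = 1.96  # standard normal critical value for 95% confidence

def margin_of_error(p_hat, n, z=Z_95):
    """Normal-approximation margin of error for a sample proportion."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

def confidence_interval(p_hat, n, z=Z_95):
    """Return (lower, upper) bounds around the sample proportion."""
    moe = margin_of_error(p_hat, n, z)
    return p_hat - moe, p_hat + moe

# The TV example: 55% "Watched" in a sample of 1,000.
moe = margin_of_error(0.55, 1000)
low, high = confidence_interval(0.55, 1000)
print(f"margin: {moe:.1%}, interval: {low:.1%} to {high:.1%}")
```

For the TV example this gives a margin of about 3.1% and an interval of roughly 51.9% to 58.1%, matching the survey table.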
Every poll has a margin of error. Its value is
determined mainly by the size of the sample used: the
larger the sample, the smaller the margin of error.
Typically, samples of 1,000 subjects yield results
with a 3% margin of error, meaning that if the survey
indicates that 55% of the TV viewers watched the
show, the “true” percentage, if all adults in Jakarta
were surveyed, would fall somewhere within the 52% to
58% confidence interval. If the agency polled just
100 people, the margin of error would increase to
about 10%, resulting in a less precise confidence
interval of 45% to 65%. Remember what a 95 percent
confidence level means: if you were to repeat this
poll many times, the resulting confidence interval
would contain the true value you are measuring 95
percent of the time, the misses being due to random
fluctuation.
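To see how the margin of error shrinks with sample size, here is a small sketch. It assumes the worst case of a 50/50 split, which is what pollsters commonly quote, and the sample sizes are the ones mentioned in this article.

```python
import math

def worst_case_margin(n, z=1.96):
    """95% margin of error at p = 0.5, where the margin is largest."""
    return z * math.sqrt(0.25 / n)

# Sample sizes mentioned in the text: 100, 1,000 and 1,200 respondents.
for n in (100, 1000, 1200):
    print(f"n = {n:>5}: +/- {worst_case_margin(n):.1%}")
```

The margin falls with the square root of the sample size, so cutting it in half requires roughly four times as many respondents.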
Now back to the presidential election polls. Until
election day, we can never know the exact percentage
of support each candidate has unless we survey
everyone. So there is always the possibility that
poll results will be wrong. The confidence level of a
survey tells us how confident we are that our results
accurately reflect the true percentage of support in
the entire population. The margin of error gives the
range around the poll result to which that confidence
level applies.
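What a 95% confidence level means in practice can be checked with a small simulation. The sketch below assumes a hypothetical true support of 40% and a perfectly random sample of 1,200 voters per poll; both numbers are chosen for illustration only, and real polls face additional sources of error.

```python
import math
import random

def simulate_coverage(true_p, n, trials, seed=0):
    """Share of simulated polls whose 95% interval contains true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one random sample of n voters.
        yes = sum(rng.random() < true_p for _ in range(n))
        p_hat = yes / n
        moe = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - moe <= true_p <= p_hat + moe:
            hits += 1
    return hits / trials

# Hypothetical inputs: true support 40%, 1,200 voters per poll.
coverage = simulate_coverage(true_p=0.40, n=1200, trials=1000)
print(f"{coverage:.1%} of simulated polls covered the true value")
```

With truly random samples, the share of polls that cover the true value should come out close to 95%, leaving roughly 1 poll in 20 that misses.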
Based on the July LSI survey, about 68% of the
sampled voters said they would vote for Susilo and
23% for Megawati. The sample size was 1,200 and the
margin of error was 3%. That means the 95% confidence
interval for the true proportion of voters who would
vote for Megawati is 20% to 26%. In other words, if
the poll were repeated over and over with fresh
random samples of voters, we would expect about 95%
of the resulting intervals to contain her true level
of support. Similarly, the interval for the true
percentage of voters who would vote for Susilo is 65%
to 71%. The 95% confidence level also implies that
about 5 out of 100 polls would fail to be accurate.
Therefore, based on the July LSI survey, if one had
claimed before election day that 30% of the voters
would vote for Megawati, could we have accepted the
claim statistically? The answer is no. What if one
had claimed that 40% of the voters would vote for
Megawati? It is the same. We would have rejected it
immediately. So what is the highest claim that could
reasonably be accepted? The answer is about 26%. Now,
suppose one claimed that Megawati would get only 15%
of the votes. Could the claim be accepted? Again, no.
However, a claim that 20% of the voters would vote
for Megawati could be accepted. These two numbers
(20% and 26%) are the lower and upper bounds of the
interval that should contain the true population
proportion. In other words, the margin of error is
about 3%.
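The reasoning in this paragraph amounts to checking whether a claimed percentage falls inside the poll's interval. A minimal sketch, using the July LSI figures for Megawati (23% with a 3% margin); the helper name is made up for illustration.

```python
def is_plausible(claim_pct, poll_pct, margin_pct):
    """True if the claimed percentage lies within poll +/- margin."""
    return poll_pct - margin_pct <= claim_pct <= poll_pct + margin_pct

# July LSI poll: Megawati at 23%, margin of error 3 points.
POLL_PCT, MARGIN_PCT = 23, 3

verdicts = {}
for claim in (15, 20, 26, 30, 40):
    verdicts[claim] = is_plausible(claim, POLL_PCT, MARGIN_PCT)
    status = "acceptable" if verdicts[claim] else "rejected"
    print(f"claim of {claim}%: {status}")
```

Only the claims of 20% and 26% survive, since they sit exactly on the lower and upper bounds of the interval.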
Now, let’s apply the same statistical methods to
Susilo’s numbers as reported by the July LSI survey.
Since all polls reported a margin of error of 3 to
4%, the lower and upper bounds for the true
percentage of voters who would vote for Susilo are
64% and 72%. Even if we try to reconcile this result
with the IFES finding that 33.8% of respondents who
voted for the Golkar Party and 38.4% of respondents
who voted for the United Development Party (PPP) in
the April 5 legislative election chose Susilo in the
first round, it is very unlikely that 68% of the
voters would vote for Susilo in the second round.
Multiplying those shares by the percentages of the
vote that Golkar and PPP obtained in the legislative
election and adding them up, the additional
percentage of the vote that Susilo would obtain was
only about 10%.
If we use the September LSI survey, which also has a
margin of error of 3%, the 95% confidence intervals
for the true percentage of voters who would vote for
Susilo and for Megawati are 58% to 64% and 21% to
27%, respectively. Again, intervals built this way
from repeated random samples of voters would be
expected to contain the true percentages about 95% of
the time. As we have seen, all of the LSI surveys on
the presidential election were biased against
Megawati and, in some cases, biased in favor of
Susilo.
The IFES surveys produced the same results. The
September IFES survey is even more disturbing. With a
margin of error of 3%, the confidence interval for
the true proportion of voters who would vote for
Megawati runs from 18% to 24%. The 15% of undecided
voters in the September LSI result does not
convincingly account for the gap, and neither does
the 18% of undecided voters in the September IFES
survey. It is very doubtful that all of the undecided
voters finally decided to vote for Megawati.
If we relied on the LSI and IFES survey results, the
roughly 40% of the votes that Megawati managed to
obtain would be a complete impossibility. Did
Megawati turn the impossible into the possible? No,
that is not the case. It was the surveys that were
misleading. They are unreliable. They fall among the
5 out of 100 polls that are inaccurate.
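The comparison can be made explicit by checking each poll in Table 1 against the election result. The sketch below uses the figures from Table 1 and, as an assumption, the more generous 4-point end of the reported margins; like the comparison in the text, it ignores undecided voters.

```python
# Election result and poll figures as listed in Table 1.
ELECTION = {"SBY": 60.9, "Megawati": 39.1}
POLLS = [
    # (poll, SBY %, Megawati %)
    ("IFES Sep", 61.0, 21.0),
    ("IFES Aug", 63.0, 28.5),
    ("LSI Sep", 61.0, 24.0),
    ("LSI Aug", 61.3, 32.7),
    ("LSI Jul", 68.0, 23.0),
    ("ISEREI", 55.9, 28.7),
]
MARGIN = 4.0  # assumed: upper end of the reported 3 to 4% margins

def covers(poll_pct, actual_pct, margin=MARGIN):
    """True if the actual result lies within poll +/- margin."""
    return poll_pct - margin <= actual_pct <= poll_pct + margin

results = [
    (name, covers(sby, ELECTION["SBY"]), covers(mega, ELECTION["Megawati"]))
    for name, sby, mega in POLLS
]
for name, sby_ok, mega_ok in results:
    print(f"{name:9} SBY covered: {sby_ok!s:5}  Megawati covered: {mega_ok}")
```

Even with the wider margin, none of the six polls comes within range of Megawati's actual share, while four of the six cover Susilo's.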
Can your poll results be wrong?
Is it possible that the final election result could
be really different from what you predicted? Or is it
possible that your poll might just be one of those 5
times out of 100 where the results are flat-out
wrong? It is worth considering, especially with
something as large as a national election on the
line. The answer is yes. Here are some of the reasons
why this can happen.
First, the samples may not have been randomly
selected. If the surveys only interviewed registered
voters who live in big cities, the sample might be
skewed towards a certain type of voter. Or if most of
the people sampled share a certain characteristic,
the survey might be biased. In 2003, for instance, a
survey by Charney Research of New York and AC Nielsen
Indonesia, commissioned by The Asia Foundation, found
that 53 percent of voters preferred a strong leader
like former president Soeharto, and about 58 percent
of those who supported a stronger government at the
expense of rights and freedom had an educational
background of high school or more. Since the LSI and
IFES surveys were conducted by telephone, it is very
likely that most of the voters in their samples had a
high school education or above.
Secondly, the survey results might indicate a large
percentage of undecided voters. Should these
undecided voters be removed? Certainly not. If they
were removed from the results, the sample would no
longer represent the population accurately.
Third, the polls might have been conducted too early
in the campaign. If people are still unfamiliar with
the candidates or the issues and have not yet made up
their minds, they may choose a candidate who appears
knowledgeable on the issues. But in last month’s
presidential election, the LSI and IFES polls were
conducted only about 2-3 weeks before the election.
Of course, no polling technique will eliminate error
entirely. There is always the chance that your
sample, despite your best efforts to make it random,
may not accurately reflect the opinions of the
greater population. People may provide thoughtless or
dishonest answers, or end up changing their minds
when they are in the voting booth. However, this
should be reflected in the confidence intervals and
the margin of error, which, as I have explained in
detail above, is what convinces me that the surveys
conducted by the LSI and IFES fall among the
inaccurate polls.
With regard to IFES, the institution has admitted
that the results of its surveys conducted before the
July 5 election overstated the support for Susilo.
But it committed the same mistake in the run-up to
the second round of the presidential election. This
time, its surveys are obviously flawed. They are
biased against Megawati. The question, of course, is
whether the unreliable results were driven by a
political motive or caused purely by serious
statistical mistakes. It is hard to argue for the
latter. After all, IFES did not conduct just one or
two surveys. Its several results consistently show a
negative bias against Megawati.
Who is the IFES? Last time I checked, they do not
conduct polls on the US presidential election. So
what is their agenda? And why are their results so
widely quoted in the media? If we assume that polls
do affect people’s perceptions, as I believe they do,
then, absent statistical mistakes, an international
institution is essentially trying to affect
Indonesia’s presidential election. Imagine a
hypothetical “International Institute for Democracy”
based in Japan conducting polls on the US
presidential election, with the US media
enthusiastically quoting its results. Does that not
sound weird?
As for the LSI’s surveys, it is no secret that the
institution’s chairman, Denny J.A., is a supporter of
Susilo. Before supporting Susilo, Denny was a backer
of Megawati. Perhaps, realizing that the political
dynamics had changed, Denny decided to shift his
support. It was a pragmatic, and opportunistic,
decision. However, along with the shift in his
support came the not-so-easy-to-understand polls from
his institution.
The LSI’s polls are widely cited by the Indonesian
press. Their impact should never be underestimated.
Once, a voter said to me that Megawati’s chance of
being elected was essentially over. She said Megawati
was far behind in the polls. Notice her expression:
“Megawati is far behind,” not “Susilo is far ahead.”
Undoubtedly, the polls also dealt Megawati’s camp a
devastating blow. Knowing that your candidate is 40
points behind in the polls would not only be a great
disappointment; it could also impede any creative
thinking about how to boost your candidate’s chances.
Last month’s presidential election was the first
direct presidential election in the country. In such
a system, polls play a crucial role in giving a
general picture of where the candidates stand in the
race. As polls could affect voters’ perception of the
candidates, inaccurate polls can indirectly lead
voters to vote for the wrong candidate. It is
absolutely important to have reliable and accurate
polls. The LSI (Lembaga Survey Indonesia) has clearly
shown that it is incapable of providing them. As for
the IFES, it might rethink its aggressive efforts to
provide polls that have turned out to be inaccurate.
---------
Elwin Tobing teaches Statistics for Strategic
Solving Problems at the University of Iowa, the
United States.