That close race

Jojo Robles

… The website Get Real Philippines has noted “the almost algorithmic way with which Robredo chipped away at the initial one-million-vote lead of Marcos over several hours since the voting closed.” According to the article, the statistical aberration “has attracted the attention of many observers.”

On Facebook, the article said, Benjamin Vallejo Jr. “plotted the progressive decrease of Marcos’s lead over Robredo over time and found an almost perfect linear correlation.” “The correlation plotted a straight-path downward trajectory for Marcos’s lead,” the article said.

“Di kapani-paniwala [Unbelievable]!” said Vallejo, a faculty member of the University of the Philippines currently working as an exchange professor at St. Norbert College in De Pere, Wisconsin, noting the perfectly straight line….



  1. manuelbuencamino

    What has Leni Robredo done to warrant an allegation of cheating on her part? Is she a trapo? Is she a crook? Has she ever been involved in anything questionable? Is there any record of her allowing herself to be used in election chicanery? To top it all, the columnist throwing innuendos at Leni is the infamous “Tomcat”, the mysterious classmate of PNoy.

    As far as BBM is concerned, why protest the unofficial count? The margin between him and Leni is so narrow that it would be perfectly legitimate for him, should he lose, to file a protest when the official tally is completed. Let’s not jump the gun. The PPCRV count is just gossip; the Comelec count is the real story.

    Jojo Robles… come on.


    I saw a graph trending on my FB newsfeed showing an almost-straight-line relationship between the cumulative difference between BBM’s and LENI’s votes (i.e., BBM minus LENI) and the cumulative percentage of polling precincts that have transmitted their election returns. There are, however, some basic shortcomings in that representation.

    First and most obvious, both the BBM-LENI vote gaps and polling precincts covered are measured cumulatively, and hence are both functions of the time that elapsed since the start of counting at around 5 pm of May 9.

    Hence, correlating the two will surely give us a very high “R-squared” …

    (Note: I only have data from the moment LENI overtook BBM. This is based on the Rappler article…)

    Second, even by ocular inspection (okay, I did test formally for veracity), it is safe to say that the two variables are not stationary. Hence, directly correlating the two without checking for cointegration is CARELESS and SPURIOUS. In other words, the linear relationship established may be mostly traced to the time effect and not to any underlying factor. In fact, the two variables failed when I ran a formal cointegration test (which may be expected given that we are talking about hourly data within a very short 2-day time frame). Given the non-stationarity and the lack of cointegration in the data used, an objective analysis would have to apply a corrective measure, i.e., using the CHANGE in the BBM-LENI vote gaps rather than the actual gaps themselves. This is basic time-series statistics. And true enough, the “clean progression” and nearly linear relationship dissipated to randomness, just like biased BBM supporters’ random and baseless rants.

    (Note: First differencing is the technique used to remove non-stationarity. It is computed with the formula: (Rappler’s updated gap this hour) minus (Rappler’s updated gap last hour).)

    Although we have limited data points, it may be safe to say that the “clean progression” claimed elsewhere using the wrong data is fraudulent and misleading. I just hope what was trending earlier is only a case of carelessness because it’s unfortunate when people, like nerd-turned-villains in superhero movies, bend the principles of science to deceive others.
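The differencing argument above can be illustrated with a minimal sketch. It does not use the Rappler figures or reproduce the commenter's cointegration test; the two series below are synthetic and generated independently of each other (the names `gap_steps` and `precinct_steps`, the seed, and the sample size are all assumptions made for illustration). The point is only that two cumulative totals can show a very high R-squared simply because both trend with time, and that the relationship can vanish once first differences are taken.

```python
import numpy as np


def r_squared(x, y):
    """Squared Pearson correlation between two 1-D arrays."""
    r = np.corrcoef(x, y)[0, 1]
    return float(r ** 2)


rng = np.random.default_rng(0)
n = 2000  # synthetic number of updates, far more than the real hourly data

# Two INDEPENDENT series of per-update changes (synthetic, not election data).
gap_steps = rng.normal(loc=-1.0, scale=0.5, size=n)      # change in the vote gap
precinct_steps = rng.uniform(low=0.5, high=1.5, size=n)  # newly reporting precincts

# Cumulative totals both drift with time even though the steps are unrelated.
gap_level = np.cumsum(gap_steps)
precinct_level = np.cumsum(precinct_steps)

# Correlating the cumulative levels picks up the shared time trend.
r2_levels = r_squared(gap_level, precinct_level)

# First differencing: (value this update) minus (value last update).
r2_diffs = r_squared(np.diff(gap_level), np.diff(precinct_level))

print(f"R^2 of cumulative levels: {r2_levels:.3f}")
print(f"R^2 of first differences: {r2_diffs:.3f}")
```

With unrelated inputs like these, the levels should correlate almost perfectly while the differenced series should not, which is the pattern the comment describes.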

  3. [UPDATED PROJECTION] by Mig Barretto García
    Just to be fair to everyone concerned:

    1.) It’s natural for the total votes cast for President to be bigger than the total for Vice President. Back in 2010, the total votes cast for President exceeded the total votes cast for Vice President by 671,000. So, sorry Sandro, your math does not add up.

    2.) The 55.7 million versus 54.36 million supposed discrepancy simply reflects the 54.36 million registered voters nationally plus the 1.37 million total overseas absentee voters (OAV). However, the point of reference is the projected 45 million votes cast, which will be around 81 percent of total registered voters, a projection many political analysts are expecting. One should be worried if voter turnout overshoots 100%.

    3.) Robredo’s numbers, as predicted, are rising very fast because of the incoming transmissions of votes from ARMM, MIMAROPA, Samar, Negros Island, Northern Mindanao, and CARAGA. We have the illusion that Marcos is holding a commanding lead because Ilocos, Cagayan Valley, Central Luzon, and NCR (all areas where Marcos leads) had faster transmission rates and reported earlier. So no, at this point, there is no magic, only faster internet connection in the North and slower connection in the South.

    4.) To explain this sudden change further: the simple (but long) explanation for the seemingly uniform increase in voter turnout lies in the transmission rate. It follows a logarithmic path (see Figure below from Rappler): transmissions came in rapidly during the early hours of the evening and tapered off later on. The first half reflects the fast transmissions from the Luzon and NCR precincts; the second half reflects the slower transmissions coming from provinces outside Luzon where Robredo is leading.

    Why Robredo is gaining 40 thousand votes every 1% is a reflection of the slow transmission rate of the unaccounted precincts that favour Robredo. In the first half of the transmission, Marcos pulled away very quickly because it reflected the fast transmissions coming from places where he led. For example, in the Ilocos regions, Marcos easily garnered an 85 to 95 percent share of the votes.

    Moreover, it would be most unwise to make inferences based on aggregate trends; you have to break them down to the level of the provinces and highly urbanized cities to see where the votes are coming from. So yes, the results are as predictable as you can get now. And as long as the transmissions, conditional on the provinces or cities that are reporting, are sampled randomly, you can already predict the distribution of votes with as little as 20 percent reporting, a practice used in the United States when winners are declared.

    5.) The race is not yet over. The marginal differences in the Overseas Absentee Voters will crown the next Vice President. As of this point, the projected marginal difference is still too close to call, but the results are leaning Robredo. Based on the current transmission rate, the projection of a 250,000 margin of victory for Robredo is becoming increasingly plausible. The only path to an upset victory for Marcos right now is to win the OAV by as much as 60% of the votes. Currently, Marcos commands 40% of the OAV votes while Robredo has 20%.

    6.) To see whether there are irregularities in the voter distribution, one would have to look into precincts that had complete attendance and where one candidate reported an overwhelming majority. That requires a separate set of data analysis.
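The reporting-order argument in points 3.) and 4.) can be sketched with a toy model. Nothing below is real election data: the two regions, their sizes, their vote splits, and their transmission speeds are hypothetical values chosen only to show the mechanism. Candidate A is stronger in a region that transmits four times faster; candidate B is stronger in a slow region of the same size.

```python
# Hypothetical regions: (name, total votes, percent for candidate A, speed multiplier).
# All figures are invented for illustration.
REGIONS = [
    ("fast_region_favouring_A", 1_000_000, 60, 4),
    ("slow_region_favouring_B", 1_000_000, 38, 1),
]


def simulate_count(regions, batch=50_000):
    """Return candidate A's running lead over B after each reporting round."""
    remaining = {name: total for name, total, _, _ in regions}
    lead = 0
    leads = []
    while any(votes > 0 for votes in remaining.values()):
        for name, _, pct_a, speed in regions:
            reported = min(remaining[name], batch * speed)
            remaining[name] -= reported
            # Each batch splits in the region's fixed proportion:
            # A gets pct_a percent, B gets the rest, so the net is (2*pct_a - 100) percent.
            lead += reported * (2 * pct_a - 100) // 100
        leads.append(lead)
    return leads


leads = simulate_count(REGIONS)
peak_round = leads.index(max(leads)) + 1
print(f"A's peak lead: {max(leads):,} votes after round {peak_round}")
print(f"A's final lead: {leads[-1]:,} votes after round {len(leads)}")
```

In this toy setup A builds an early lead while the fast region reports, then loses it at a constant rate per round once only the slow region is left, so the decline plots as a straight line without any tampering. It says nothing about whether the actual count behaved this way; that would require the per-province breakdown the comment calls for.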

    • ricelander

      #1. How do you designate what is “natural”? By one instance in 2010? I think it is natural to fill up both positions so the difference should not be that large.

      #4 “40 thousand votes every 1% is a reflection of the slow transmission rate”
      No, x number of votes every x length of time would be a reflection of slow transmission rate


    The transmission package was discussed at length by Smartmatic software engineer Andres Sanchez. I do not remember the exact XML tags used, but I will attempt to explain the contents of the transmission package, from a senior citizen’s hazy recollection.

    The transmission package (TP) is sent from the precinct VCM to the municipal CCS and to the Transparency Server and Comelec Central server, via the Internet (Smartmatic VPN) or by hand carrying the SD card containing the TP. The TP contains the all-important precinct election return, which is a listing of all candidates in the precinct and the number of votes that each one got. The TP is the expression of the will of the people as it travels from precinct to canvassing center, and so securing the TP is very important.
    The TP contains the following fields.


    All the fields are base-64 encoded. The ip-field contains precinct identification information. The electionret-field contains the actual election return. The digitalsig-field contains the digital signatures of the BEI. The hashcode-field is the hashcode (md5?) of aaa+bbb+ccc or of some combination of these.

    The hashcodes are expected to differ from one TP to the next TP that arrives at the Transparency Server, because different TPs contain different data and so have different hashcodes. So the complaint of the “IT expert” that the hashcode has changed might not have a reasonable basis.
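The point about hashcodes can be shown with a small sketch. It is not the Smartmatic format: the field names (`ip`, `electionret`, `digitalsig`, `hashcode`), the choice of MD5, and the rule of hashing the three fields concatenated are assumptions taken from the hazy description above, and the precinct data is made up.

```python
import base64
import hashlib


def b64(text: str) -> str:
    """Base-64 encode a string, as the description says every field is."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def package_hash(fields: dict) -> str:
    """Hash the three content fields concatenated (assumed rule, assumed MD5)."""
    payload = fields["ip"] + fields["electionret"] + fields["digitalsig"]
    digest = hashlib.md5(payload.encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_package(precinct_id: str, election_return: str, signature: str) -> dict:
    """Assemble a hypothetical transmission package with its hashcode."""
    fields = {
        "ip": b64(precinct_id),
        "electionret": b64(election_return),
        "digitalsig": b64(signature),
    }
    fields["hashcode"] = package_hash(fields)
    return fields


def is_intact(fields: dict) -> bool:
    """Recompute the hash and compare it with the one carried in the package."""
    return fields["hashcode"] == package_hash(fields)


# Two made-up precincts with different returns.
tp1 = build_package("precinct-0001", "A=120;B=95", "sig-of-BEI-0001")
tp2 = build_package("precinct-0002", "A=80;B=140", "sig-of-BEI-0002")

print("hashcodes differ:", tp1["hashcode"] != tp2["hashcode"])
print("tp1 intact:", is_intact(tp1))

# Altering the return without updating the hashcode is detectable.
tampered = dict(tp1, electionret=b64("A=220;B=95"))
print("tampered copy intact:", is_intact(tampered))
```

Under these assumptions, a hashcode that changes from one package to the next is the normal case; what would be suspicious is a package whose carried hashcode no longer matches its own contents. A plain hash like this only detects accidental or unsophisticated changes, since anyone altering the return could recompute it, which is why the digital signature matters.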

    During the source code review, I asked Comelec to give the political parties and the public the “bbbb+cccc” fields, namely the election return plus the digital signature. This is so that the political parties and the public can verify for themselves that the election return is authentic and has not been tampered with. At present, Comelec gives us a listing of candidates and the number of votes each one got, already stripped of the digital signature. If Comelec is to be trusted, it should give us the election return with the digital signature.