[EL] McDonald study, birthdate distribution in real voter list

Bev Harris bev at blackboxvoting.org
Sun Sep 11 10:42:51 PDT 2011

A study by Michael McDonald questions the use of first name, last name, and
birthdate as a locator for duplicate voters. Note that this study assumes a
common name, like "Robert Smith," plus the same birthdate, then draws
conclusions against using first name, last name, and birthdate as though all
names were common, which is a fallacy.

Because real voter lists are readily available, I think such studies should use
real lists, not hypothetical models. In private communications, I mentioned my
conservative assumption that 1 in 10,000 people on a large voter list will have
the same birthdate (year, month, day). That figure is derived as follows:
365 days in a year
50 years avg voting life
365 x 50 = 18,250

It is this quick estimate of frequency of same birthdate that comes into
question in the McDonald study, but in fact, the 1 in 10,000 figure is
corroborated by the actual data.

How do I go from 1 in 18,250 to 1 in 10,000? Lop off some for the bell-shaped
curve, fewer voters in young and old groups, and put in a fudge factor for
twins, and you have a napkin-calculation figure of a chance of 1 in 10,000 to
have the same birthdate, with a few more in the baby boom years. There's no
real precision in any of these statistics on a generic basis, and for obvious
reasons it's nicer to use 1:10,000 rather than 1:13,129 or 1:9837. It gives a 
rough guideline for how many repeated birthdates to expect.
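
The napkin arithmetic above can be sketched in a few lines of Python. This is
only an illustration of the estimate as described: the function name
expected_per_birthdate is made up for this example, and the list size of
604,456 is the one reported for the real voter list discussed in this message.

```python
# Napkin estimate of how many distinct birthdates a voter list spans.
DAYS_PER_YEAR = 365
AVG_VOTING_LIFE_YEARS = 50  # figure quoted in the message

distinct_birthdates = DAYS_PER_YEAR * AVG_VOTING_LIFE_YEARS  # 365 x 50


def expected_per_birthdate(list_size, one_in):
    """Expected voters sharing one birthdate if each date holds
    roughly 1 in `one_in` of the list (hypothetical helper)."""
    return list_size / one_in


print(distinct_birthdates)
# Rounded 1-in-10,000 guideline versus the unadjusted 1-in-18,250 figure.
print(round(expected_per_birthdate(604_456, 10_000), 1))
print(round(expected_per_birthdate(604_456, distinct_birthdates), 1))
```

With the 1 in 10,000 guideline, a list of that size gives roughly 60 voters per
birthdate; the unadjusted 1 in 18,250 figure gives roughly 33.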

In a smaller database, say, 8800, you will see more repetitions than 1:10,000,
for a simple reason -- twins. I think the last figure I heard was that you have
a 1:80 chance of having twins. That twins number also gives a same last name,
but is unlikely to have same first name. Possible, don't prosecute anyone just
on this basis, but unlikely.

This 1:10,000 calculation is called into question in the McDonald study, based
on theoretical models. So I thought you might be interested in real numbers,
which really do work out to the 1 in 10,000 handy guesstimate technique.

I ran a birthdate frequency calculation on a voter list with 604,456 voters. It
was interesting for two reasons: (1) It confirmed the expectation that only one
in 10,000 birthdates will be the same in a voter list, and (2) It identified an
anomaly for one specific birthdate in this list. The bell shaped curve shows
consistency over 30,000 different dates, during the baby boom years, ranging
from 45 to 67 repetitions of a given birthdate. Except that 114 people are 
shown to have a birthdate on one day - Nov. 29, 1960.

See attached graph for the obvious and unlikely spike.
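
A birthdate frequency count of this kind might look like the following sketch.
It is not the actual procedure used on the voter list. The function name, the
1.5x-median threshold, and the toy data are all assumptions for illustration;
the toy counts are simply picked from the 45 to 67 range reported above, plus
the 114 spike.

```python
from collections import Counter
from statistics import median


def flag_birthdate_spikes(birthdates, factor=1.5):
    """Return birthdates whose count exceeds `factor` times the median count.

    Hypothetical helper; the threshold is an arbitrary choice for this sketch.
    """
    counts = Counter(birthdates)
    typical = median(counts.values())
    return {date: n for date, n in counts.items() if n > factor * typical}


# Toy data, NOT the real list: one row per voter, value is the birthdate.
sample = (
    ["1960-11-27"] * 50
    + ["1960-11-28"] * 60
    + ["1960-11-29"] * 114
    + ["1960-11-30"] * 55
    + ["1960-12-01"] * 67
)

print(flag_birthdate_spikes(sample))
```

On the toy data only the Nov. 29, 1960 entry is flagged, since 114 is well
above 1.5 times the median count of 60.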

That one date spikes into the stratosphere in this database. I then looked at 
the registration dates for voters with that birthdate, and found a big chunk of
them coinciding with another unlikely spike. The number of voter registrations
per day in this jurisdiction maxes out at about 3,000, with an average, during
heavy registration periods, of closer to 1,500. But on Sept. 3, 1991, an
unlikely 12,107 registrations are shown as being entered into the system. Many
of the Nov. 29, 1960 birthdates were entered on Sept. 3, 1991. So there's a
"hmm" for you. And perhaps a quick diagnostic tool to spot voter list

At any rate, the off-the-cuff frequency for repeated birthdates, using 1 in
10,000 for a guideline, is indeed supported by the real voter registration data.
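
The registration-date cross-check described above (looking at when the voters
with the odd birthdate were entered, and which days exceed the usual daily
maximum) could be sketched like this. The record layout, the function names,
and the toy rows are assumptions for illustration only; the default of 3,000
is the approximate daily maximum mentioned above.

```python
from collections import Counter


def registration_dates_for_birthdate(records, birthdate):
    """Count registration dates among records with the given birthdate.

    `records` is an iterable of (birthdate, registration_date) pairs
    (a hypothetical layout, not the real voter file format).
    """
    return Counter(reg for bd, reg in records if bd == birthdate)


def days_over_capacity(records, daily_max=3000):
    """Return registration dates with more entries than `daily_max`."""
    per_day = Counter(reg for _, reg in records)
    return {day: n for day, n in per_day.items() if n > daily_max}


# Toy rows, NOT real data.
records = [
    ("1960-11-29", "1991-09-03"),
    ("1960-11-29", "1991-09-03"),
    ("1960-11-29", "1988-05-10"),
    ("1955-01-02", "1991-09-03"),
    ("1971-07-04", "1992-10-01"),
]

print(registration_dates_for_birthdate(records, "1960-11-29").most_common(1))
print(days_over_capacity(records, daily_max=2))
```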

Now, using the real data it is also possible to determine the actual frequency
of any last-name, first-name combination in a given jurisdiction. The name
"Robert Smith" used in McDonald's study shows up 127 times in the 604,456
database, when you include all its permutations (Rob, Bob, Bobby etc). So
that's about a one in 5,000 chance that your name is Robert Smith in this
particular jurisdiction.

As I remember my stats class, in this problem you would need to calculate the
chance that a single item meets two low probabilities at the same time (for
example, 1 in 10,000 chance of same birthdate, at the same time as 1 in 5,000
chance of name Robert Smith). It really is unlikely that even for a common
name, you will find the same birthdate in the same jurisdiction.
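
As a rough sketch of that calculation: if the two figures are treated as
independent (an assumption, not something the data here establishes), the
chance that a single record has both the name and a given birthdate is the
product of the two. Note that this is the figure for one record, not a count
of how many pairs of records in the whole list might coincide.

```python
from fractions import Fraction

LIST_SIZE = 604_456          # voters in the list discussed above
ROBERT_SMITH_COUNT = 127     # "Robert Smith" and its permutations

p_name = Fraction(ROBERT_SMITH_COUNT, LIST_SIZE)
p_birthdate = Fraction(1, 10_000)   # the 1 in 10,000 guideline

# Assumes name and birthdate are independent of each other.
p_both = p_name * p_birthdate

print(f"name: about 1 in {round(1 / p_name):,}")
print(f"name and birthdate: about 1 in {round(1 / p_both):,}")
```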

Except for this: My data is showing that at least in Shelby County, TN, people
who did not vote are being listed as voting, people who did vote are being
listed as not voting, people who requested Republican ballots are being shown
as requesting Democratic ballots and vice versa, and at least two people are
checked in to vote as both absentee and polling place when it appears
impossible (one was overseas and one is confined to a nursing home).

In Shelby County, people who are Black are also showing up as White, and poor
Effie Washington, who was a Black woman, then got listed as an "Other, Man",
and now is listed as "Other, Woman." Or unfortunate lifelong Democrat Sharonda
Williams, a Black woman, who had her race changed to "other", then voted with a
Democratic ballot in the 2010 primary but is shown as not voting, then voted in
the next primary but is shown as voting a Republican ballot.

Yeah. In Shelby County, if you are listed as voting twice, it's probably a
database error.

Bev Harris
Founder - Black Box Voting

* * * * *

Government is the servant of the people, and not the master of them. The
people, in delegating authority, do not give their public servants the right
to decide what is good for the people to know and what is not good for them to
know. We insist on remaining informed so that we may retain control over the
instruments of government we have created.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: graph birthdate distribution oddity 11 29	1960.png
Type: image/png
Size: 21438 bytes
Desc: not available
URL: <http://webshare.law.ucla.edu/Listservs/law-election/attachments/20110911/9089cd6e/attachment.png>
