In this post we show a natural extension of the degree of anonymity model from (Diaz 2002) to voting. The result is simple, but may be useful for complex cases such as weighted voting, where privacy concerns are not limited to the standard “bisimilarity-under-swapping” definition of (Delaune 2009). These concerns may exist due to the leaking of information from election results, as remarked in (Delaune 2009):

Note that this definition [bisimilarity-under-swapping] is robust even in situations where the result of the election is such that the votes of VA and VB are necessarily revealed. For example, if the vote is unanimous, or if all other voters reveal how they voted and thus allow the votes of VA and VB to be deduced.

The idea that information may leak from results leads us to the degree of anonymity model, which is

an information theoretic model that allows to quantify the degree of anonymity provided by schemes for anonymous connections. It considers attackers that obtain probabilistic information about users. The degree is based on the probabilities an attacker, after observing the system, assigns to the different users of the system as being the originators of a message

Whereas the standard privacy definition for voting is possibilistic, the degree of anonymity is information-theoretic, and therefore about probabilistic inference. The measure is not binary, but instead quantifies how much information an attacker can gain from observing the process. In the extension to voting discussed later, the attacker gains information by observing the results, which are available irrespective of whether ballot privacy exists.

The idea of quantifying information gain naturally leads to entropy in (Diaz 2002):

H(X) = -Σ_i p_i * log2(p_i)

where p_i is the probability the attacker assigns to user i as the originator of the message. The authors then apply a normalization factor using the maximum entropy case (in other words, zero information leakage) to arrive at the degree of anonymity measure:

d = H(X) / H_M,  where H_M = log2(N) and N is the size of the anonymity set

Given the normalization, the degree ranges from 0 (the originator is fully identified) to 1 (the attacker has learned nothing).
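
To make this concrete, here is a minimal Python sketch (not from the paper; the function name and example numbers are illustrative) of how the degree of anonymity could be computed from the distribution an attacker assigns over the anonymity set:

```python
from math import log2

def degree_of_anonymity(probs):
    """Degree of anonymity d = H(X) / H_M, where H(X) is the entropy of
    the attacker's probability distribution over the anonymity set and
    H_M = log2(N) is the maximum entropy (uniform distribution)."""
    n = len(probs)
    if n <= 1:
        return 0.0  # a lone candidate sender is always identified
    h = -sum(p * log2(p) for p in probs if p > 0)
    return h / log2(n)

# Uniform distribution over 4 users: the attacker has learned nothing, d = 1.
print(degree_of_anonymity([0.25, 0.25, 0.25, 0.25]))   # 1.0
# Skewed distribution: the attacker has gained information, d < 1.
print(degree_of_anonymity([0.7, 0.1, 0.1, 0.1]))       # ~0.68
```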

Extension to voting

The degree of anonymity model is about determining the sender of a message out of a possible group (the anonymity set). We want to extend this model to the case of voting, where the attacker wishes to determine what choice a voter made. The individual votes as plaintexts are not generally available, depending on the specific secure voting scheme used. This means that the translation is not immediate: in one case we are talking about determining a one-out-of-n variable, whereas for voting it is about determining a vote choice from an election result.

But we can fit probabilities directly into the degree of anonymity model. These probabilities must be derived from election results, as this is the public information available to the attacker. Also, we’d like to do this in a general way that does not depend on the specifics of the electoral method (or rule) used or the form that an election result takes. We start with

V, C, R

these are the sets of voters, choices and election results, respectively.

The function a : V → C specifies each voter’s selection, mapping voters to choices. The function t : (V → C) → R represents the election tally, mapping the set of all choices made by voters to the corresponding result.

Ar = { a | t(a) = r } is the set of all functions a, that is, all possible combinations of voter selections, that produce the result r. Building on this, m(v, c, r) = |{ a ∈ Ar | a(v) = c }| is the number of different functions a in which voter v selected choice c and the election result is r. In other words, it counts in how many cases voter v selects c while the overall result is r.
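
As one concrete, brute-force reading of these definitions, the sketch below enumerates every assignment a : V → C, keeps those whose tally equals r to obtain Ar, and counts m(v, c, r). It assumes a simple plurality tally and a Python representation of results as (choice, count) pairs; the names and representation are illustrative, not part of the model:

```python
from itertools import product
from collections import Counter

def tally(assignment, choices):
    """A simple plurality tally t: maps an assignment a (voter -> choice)
    to a result, represented as a tuple of (choice, count) pairs."""
    counts = Counter(assignment.values())
    return tuple((c, counts.get(c, 0)) for c in sorted(choices))

def assignments_for_result(voters, choices, result):
    """Ar: every assignment function a whose tally t(a) equals the result r."""
    every_a = (dict(zip(voters, combo))
               for combo in product(choices, repeat=len(voters)))
    return [a for a in every_a if tally(a, choices) == result]

def m(voter, choice, result, voters, choices):
    """m(v, c, r): how many assignments in Ar have voter v selecting choice c."""
    return sum(1 for a in assignments_for_result(voters, choices, result)
               if a[voter] == choice)

# Example: three voters, Yes/No ballot, result with two Yes votes.
voters, choices = ["v1", "v2", "v3"], ["No", "Yes"]
r = (("No", 1), ("Yes", 2))
print(len(assignments_for_result(voters, choices, r)))   # |Ar| = 3
print(m("v1", "Yes", r, voters, choices))                # 2
```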

With these expressions in hand we can provide equivalent expressions for the terms that appear in the original degree of anonymity definition. First, the entropy corresponding to the uncertainty about a voter’s choice given an election result is

H(v, r) = -Σ_{c ∈ C} p(c | v, r) * log2( p(c | v, r) ),  where p(c | v, r) = m(v, c, r) / |Ar|

This is simply the standard expression for entropy, but with probabilities that correspond to the likelihood that a voter selected a certain choice given that a certain election result occurred. The maximum entropy is the baseline, where we have no information about the election and so use a uniform probability distribution for the voter’s choice[3], which gives

H_M = log2( |C| )

where |C| is the number of choices (the cardinality of C). Finally, the degree of privacy is

d(v, r) = H(v, r) / H_M = H(v, r) / log2( |C| )

which quantifies the degree to which voter v’s choice remains secret given the election result r. This expression is analogous to the degree of anonymity model, with a few changes. First, we speak of a degree of privacy, since we are quantifying how much is known about a voter’s choice rather than trying to de-anonymize the sender of a message. Also, this result is per voter. This characteristic is not always relevant, but it would be important for the case of weighted voting, where voters are not indistinguishable and would have different degrees. It would also be possible to produce aggregate values, such as the average or the minimum. Another generalization would be to produce expectation values over results for a given setup; as an extreme example, an election with a single voter would always have a degree of privacy of 0, irrespective of the result.
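
Putting the pieces together, here is a minimal sketch of the per-voter degree of privacy, assuming the counts m(v, c, r) for a fixed v and r have already been obtained (for example by enumeration as sketched above):

```python
from math import log2

def degree_of_privacy(m_counts):
    """d(v, r) = H(v, r) / H_M for a single voter v and result r.

    m_counts maps each choice c to m(v, c, r); the values sum to |Ar|.
    The conditional probabilities are p(c | v, r) = m(v, c, r) / |Ar|,
    and H_M = log2(|C|) is the maximum (uniform) entropy."""
    total = sum(m_counts.values())                  # |Ar|
    probs = [k / total for k in m_counts.values()]
    h = -sum(p * log2(p) for p in probs if p > 0)   # H(v, r)
    return h / log2(len(m_counts))                  # divide by H_M = log2(|C|)

# Single voter who voted Yes: m(v, Yes, r) = 1, m(v, No, r) = 0 -> d = 0.
print(degree_of_privacy({"Yes": 1, "No": 0}))       # 0.0
# Ten voters, result 5 Yes / 5 No: m = (9 choose 4) = (9 choose 5) = 126 -> d = 1.
print(degree_of_privacy({"Yes": 126, "No": 126}))   # 1.0
```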

As was stated before, the above definition is general to any election irrespective of the electoral method, result type or even ballot design. Next we see some specific examples.

Examples

We calculate values for some example cases. In all of them we use a plurality rule for the tally function t : (V → C) → R.

Yes/No vote, Single voter

This is a Yes/No vote (ballot options are Yes or No). We have a single voter, Bob.

V = { Bob }, C = { Yes, No }, R = { {Yes:1, No:0}, {Yes:0, No:1} }

In the following expression

H(v, r) = -Σ_{i=1..n} (m(v, c_i, r) / |Ar|) * log2( m(v, c_i, r) / |Ar| )

the value of n is 2, for the two possible choices (the cardinality of C). Consider the case where r = Yes, then

Ar = { (Bob, Yes) }, and |Ar| = 1

since it is the only way that Bob could have voted. Similarly

m(Bob, Yes, Yes) = 1 and m(Bob, No, Yes) = 0

again, for the same reason. The entropy then reduces to

H(Bob, Yes) = -(1/1)*log2(1/1) = 0   (the No term vanishes, since 0*log2(0) is taken as 0)

which when plugged into

d(v, r) = H(v, r) / log2( |C| )

gives

d(Bob, Yes) = d(Bob, No) = 0

The degree of privacy is zero, which matches the obvious fact that in an election with a single voter their vote will be revealed.

Yes/No vote, Unanimous result

V = { v1, …, vn }, C = { Yes, No }, R = { {Yes:n, No:0}, {Yes:0, No:n} }

In the following expression

H(v, r) = -Σ_{c ∈ C} (m(v, c, r) / |Ar|) * log2( m(v, c, r) / |Ar| )

we see that the unanimous vote case has the same form as the case of a single voter, except for a general number of voters and results:

|Ar| = 1

for all r, and also

m(v, Yes, {Yes:0, No:n} ) = 0 and m(v, Yes, {Yes:n, No:0} ) = 1 for every voter v

which leads to

d(v, r) = 0

for all v and r. Again, this matches the obvious expectation: in a unanimous election the votes of all participants are unequivocally revealed.

Yes/No vote, General case

V = { v1, …, vn }, C = { Yes, No }, R = { {Yes:n, No:0}, …, {Yes:0, No:n} }

In this case the cardinality of R is |V| + 1, because we are not restricting the results to those that are unanimous: the number of Yes votes can be anything from 0 to n. The calculation for |Ar| and m also differs from before. If we index the results in R by the number r of positive votes, then we have

|Ar| = (n choose r)

with a binomial coefficient on the right. This is the number of ways it is possible to obtain the result r, that is, the number of ways of choosing which r of the n voters voted Yes. We also have

m(v, Yes, r) = (n-1 choose r-1)

since fixing voter v’s choice as Yes leaves r-1 Yes votes to distribute among the remaining n-1 voters. Dividing the previous two expressions gives

m(v, Yes, r) / |Ar| = (n-1 choose r-1) / (n choose r) = r / n

Because this is a Yes/No election

m(v, No, r) = |Ar| - m(v, Yes, r)

which when divided by |Ar| gives

m(v, No, r) / |Ar| = 1 - r/n

The complete expression for H is therefore

H(v, r) = -( (r/n) * log2(r/n) + (1 - r/n) * log2(1 - r/n) )

At this point we can do a couple of sanity checks. First of all, the probabilities in the entropy expression, r/n and 1 - r/n, sum to 1. Secondly, they tell us a common-sense conclusion: if the number of Yes votes in an election with n voters is r, then the probability that any given voter selected Yes must be r/n. Because our method of calculating entropy is general, it is more long-winded than our intuition for the special case of plurality rule with indistinguishable voters.
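
As an additional numeric check of the derivation, here is a short sketch, using Python's math.comb, that verifies the binomial expressions and the resulting probabilities for a small election:

```python
from math import comb

def check_closed_forms(n):
    """Numerically verify |Ar| = (n choose r), m(v, Yes, r) = (n-1 choose r-1),
    and the derived probabilities r/n and 1 - r/n, for every result r."""
    for r in range(n + 1):
        a_r = comb(n, r)                              # |Ar|
        m_yes = comb(n - 1, r - 1) if r > 0 else 0    # v votes Yes
        m_no = comb(n - 1, r)                         # v votes No
        assert m_yes + m_no == a_r
        assert abs(m_yes / a_r - r / n) < 1e-12
        assert abs(m_no / a_r - (1 - r / n)) < 1e-12

check_closed_forms(10)
print("binomial expressions and probabilities agree for n = 10")
```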

Finally, the degree of privacy for the general Yes/No election is

d(v, r) = H(v, r) / log2( |C| ) = -( (r/n) * log2(r/n) + (1 - r/n) * log2(1 - r/n) )

since |C| = 2 and log2(2) = 1. Below is a graph of this function for a fixed value of 10 voters (n = 10).

The special case of unanimous elections that we saw in the previous section corresponds to the two edges of the graph, where again we get d = 0.
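
For reference, here is a short sketch that computes the plotted values from the closed-form expression above, for n = 10 voters:

```python
from math import log2

def degree_yes_no(r, n):
    """Degree of privacy for a Yes/No election with n voters and r Yes votes.
    With |C| = 2 and base-2 logarithms, H_M = 1, so d is simply the
    binary entropy of r/n."""
    p = r / n
    if p in (0.0, 1.0):
        return 0.0  # unanimous result: every vote is revealed
    return -(p * log2(p) + (1 - p) * log2(1 - p))

n = 10
for r in range(n + 1):
    print(f"r = {r:2d}   d = {degree_yes_no(r, n):.3f}")
```

At r = 0 and r = n the value is 0, and at r = n/2 it reaches 1, matching the edges and the peak of the graph.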

Summary

We have seen how the degree of anonymity model can be extended to voting. The extension is general to any voting method. Several simple examples illustrate that its calculations are consistent with results obtained using simpler methods specific to the election type. Although these examples are trivial, degree of privacy may be useful for more complicated cases where voters are distinguishable and more information may leak from results. The most immediate case of this type is weighted voting. Although the definitions presented here apply equally well to that case, applying them in their general form may require more work to avoid combinatorial explosion.


Diaz 2002 – Towards measuring anonymity

Delaune 2009 – Verifying privacy-type properties of electronic voting protocols

[3] It is also possible to use a non-uniform prior probability here. In that case the resulting conditional probabilities must be derived in such a way that relative magnitudes are preserved.