In the previous post we suggested an extension to degree of anonymity of (Diaz 2002) to voting. Recall the suggested definition for degree of privacy

where

we also mentioned that this extension could be useful for weighted voting, because in that case results leak more information than in more typical elections. The problem is that calculating the terms in the entropy expression is in general intractable due to combinatorial explosion; computing those values is only possible for small numbers of voters and weight categories.

But although we cannot compute a value for d in general, we can restrict the task to a more limited, but significant, calculation. The decision problem defined by

expresses whether or not the vote cast by voter v, when the election result is r, can be revealed. If the value of d is 0, vote secrecy is broken. Otherwise (d > 0), the attacker cannot determine what the voter selected with certainty. This is the significance of the decision problem as a restricted calculation of the general value of d. In terms of the entropy expression, the decision problem reduces to determining whether there is some choice ci to which the assigned probability is 1. That is

the condition on the right says that it is certain that voter v selected ci, and therefore d is 0.  Note that voters cannot be distinguished within their anonymity sets, defined by their weight, meaning that a successful attack requires all voters in that set to act the same way. The decision procedure is to calculate the terms on the right for every cand see if any two values are equal. The hard part is to calculate those terms, because we are dealing with weighted voting. Which brings us to techniques from other areas.

Elections as high dimensional polytopes

The calculation we are trying to make can be reformulated in terms of computational discrete mathematics, specifically integer programming and knapsack problems. An integer program is formulated as (wikipedia)

Let’s ignore the maximization part, and compare the bottom expressions with what defines a weighted voting tally under plurality. First let’s group voters according to their weight and choice:

so, for example, v(Yes, 3) would be the set of voters who selected Yes and have an assigned weight of 3. The tally function is then

which when rearranged as

has the same form as

where A is the weight matrix, x is the grouping of voters, and b is the set of results per choice. Note how the number of voters in each group is a positive integer, satisfying the two lower conditions[2]. Thus, the outcome of an election defines an integer programming problem. Geometrically, an election can be interpreted as a polytope whose interior represents the space of possible results that are consistent with available data. We can visualize this (wikipedia)

Our decision problem is the question of whether all the points in the feasible region defined by the election result have the property of encoding the same choice for a given voter. If so, then it is certain that said voter made that choice, and d = 0.

The decision procedure is then to count the number of points with the property and seeing if they equal the total number of points in the region.

Latte Integrale

Latte Integrale is a software package for lattice point counting and integration over convex polytopes. A theory page linking to academic research is here, the main result used in our use is (Koppe 2006). Latte takes several input formats to define integer programs whose lattice points can then be counted. The decision procedure is then

  1. Given some dv,r derive the equations that define the corresponding general integer program.
  2. Count the number of lattice points for the general program.
  3. For each possible cmodify the corresponding equation so that the integer program reflects the constraint that voter v selected ci
    • Count the number of lattice points for the specific case ci
    • If the numbers match dv,r = 0, voter v‘s choice has been revealed as ci.
  4. Otherwise dv,r > 0.

Successful attack examples

We used elections with Yes/No choices and plurality weighted voting to test the procedure. These examples were found by hand using comparatively small values and a similar pattern, so they’re not representative of real world cases. We show the Latte definition only in the first example, a link to all examples is provided at the bottom. The voters marked in bold have their privacy broken.

Example 1


ChoicesYes, No
Yes tally1503
Weights61, 24, 18, 12
Voters15, 10, 20, 30
Lattice points83

Example 2


ChoicesYes, No
Yes tally94
Weights17, 8, 4, 2
Voters2, 6, 6, 10
Lattice points19

Example 3


ChoicesYes, No
Yes tally323
Weights43, 18, 14, 12
Voters3, 6, 7, 9
Lattice points7

Example 4


ChoicesYes, No
Yes tally193
Weights27, 10, 8, 6
Voters3, 6, 7, 9
Lattice points10

Example 5


ChoicesYes, No
Yes tally1521
Weights61, 24, 18, 12
Voters15, 10, 20, 30
Lattice points73

The full set of example integer programs is here.

Summary

In this post we have seen how the degree of privacy definition can be restricted into a decision problem that represents whether sufficient information is leaked by the result to break privacy. The decision problem was then reformulated as an integer program whose feasible region corresponds to the set of possible election outcomes consistent with public data. We then used the Latte Integrale software and its Barvinok algorithm implementation to test the procedure and showed several examples of successful attacks. We have not looked at whether these examples are representative of real world cases (where anonymity sets may be large).

Appendix: Technical notes

  • A point in the feasible region of the polytope strictly corresponds to a set of election outcomes according to the grouping function v(c, w).

The multiplicities for each point is given (abusing notation) by

where Bin is a binomial coefficient and

The grouping function v(c, w) reduces the dimensionality of the polytpope avoiding the explosion mentioned previously.

  • One way to compute general values for d would be to use weighted ehrhart polynomials supported by the Latte –valuation=top-ehrhart feature, using the required monomials to sum the values of interest. Unfortunately this misses the multiplicities resulting from binomial coefficients at each point, and there seems no way to incorporate the binomial computations into the weighting monomials.
  • The frequency of cases with d = 0, is expected to be low due to the multiplicity of these points, which contains at least one binomial term Bin(n, n) = 1, the minimum possible value.

[1] Koppe 2006 – A primal Barvinok algorithm based on irrational decompositions

[2] Equations (inequalities) that specify the number of voters per weighted group were left out for clarity