The post Encoding plaintexts in ElGamal appeared first on nVotes - Online Voting.

The nVotes voting system uses an ElGamal re-encryption mixnet to ensure ballot privacy. When using ElGamal encryption it is necessary to first encode the protected data into the plaintext space in which ElGamal operates. Once encoded in this space, the information can be encrypted. When decrypting, the reverse process occurs: first the ciphertexts are decrypted, and then the plaintexts are decoded back into the original data.

The plaintext (and also ciphertext) space for semantically secure ElGamal is the multiplicative subgroup G_{q} of the ring of integers Z_{p}, where p is a safe prime modulus, p = 2q + 1. The standard procedure to encode data is to first convert it into an integer (a universal data type that can hold any information), and then map that integer into G_{q}. The range of integers that can be encoded in a single ciphertext is given by the order q of G_{q}, so we’re looking for a procedure to map numbers from Z_{q} into G_{q}. I recently looked for a reference for this simple procedure and found none that explains it well, so I’m writing it here.

One reference we *can* find is this from Advances in Cryptology – EUROCRYPT 2017.

First of all, when using a safe prime p = 2q + 1, the multiplicative subgroup G_{q} is precisely the set of quadratic residues mod p; this is in fact what makes the scheme semantically secure, as stated above. Given this, the above suggests the simple m x (m/p) as our encoding procedure, where (m/p) is the Legendre symbol. But why is (m/p) x m guaranteed to be a quadratic residue? The reference contains hints, which we expand below.

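First of all, the Legendre symbol is defined as

$$\left(\frac{m}{p}\right) = \begin{cases} 1 & \text{if } m \text{ is a quadratic residue modulo } p \text{ and } m \not\equiv 0 \pmod p \\ -1 & \text{if } m \text{ is a quadratic nonresidue modulo } p \\ 0 & \text{if } m \equiv 0 \pmod p \end{cases}$$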

Because the inputs lie in the range 1..q, and q is always < p, the case m ≡ 0 (mod p) never occurs, so there are only two possibilities for the encoding. The expression (m/p) x m reduces to leaving m unchanged (1 x m) or reversing its sign (-1 x m). In particular, if m is already a residue, then (m/p) x m is still a residue since (1 x m) = m.

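Second, we have that the product of two nonresidues is a residue, which follows from the multiplicativity of the Legendre symbol

$$\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right)$$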

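Third, the first supplement of the law of quadratic reciprocity states

$$\left(\frac{-1}{p}\right) = (-1)^{\frac{p-1}{2}} = \begin{cases} 1 & \text{if } p \equiv 1 \pmod 4 \\ -1 & \text{if } p \equiv 3 \pmod 4 \end{cases}$$

Since p is an odd prime, p mod 4 has only two possible values (1 and 3), so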

*if p ≡ 3 (mod 4) then −1 is a nonresidue modulo p*

Fourth, our modulus p is a safe prime p = 2q + 1, where q is also a prime. Since q is odd (q > 2), this means that

*q = 1 mod 4 or q = 3 mod 4*

Expanding each case^{[1]}

*q = 1 mod 4 ⇒ 2q = 2 mod 4 ⇒ 2q + 1 = 3 mod 4 ⇒ p = 3 mod 4*

similarly

*q = 3 mod 4 ⇒ 2q = 6 mod 4 ⇒ 2q = 2 mod 4 ⇒ 2q + 1 = 3 mod 4 ⇒ p = 3 mod 4*

Therefore in both cases, for a safe prime p, we have that

*p = 3 mod 4*

Now we combine each of the previous 4 steps of the argument in reverse. By step 4 we have that:

*p = 3 mod 4*

which by step 3 implies that

*-1 is a nonresidue modulo p*

which by step 2 implies that

*(-1 x m) is a residue if m is not a residue*

which by step 1 implies that

*(m/p) x m is a residue if m is not a residue*

Also by step 1 we saw that

*(m/p) x m = 1 x m is a residue if m is a residue*

Therefore in all cases

*(m/p) x m is a quadratic residue modulo p for a safe prime p = 2q + 1*

which is what we set out to prove.
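The argument can be spot-checked numerically. The sketch below is not part of either project discussed later; the helper names `legendre` and `check_safe_prime` and the list of primes are assumptions made for illustration. It computes the Legendre symbol with Euler's criterion and confirms, for a few small safe primes, that p = 3 mod 4, that -1 is a nonresidue, and that (m/p) x m is a residue for every m in 1..q.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    s = pow(a, (p - 1) // 2, p)
    return -1 if s == p - 1 else s  # s is 0, 1 or p - 1


def check_safe_prime(p: int) -> bool:
    """Check steps 3 and 4 and the final claim for a safe prime p = 2q + 1."""
    q = (p - 1) // 2
    squares = {(x * x) % p for x in range(1, p)}  # quadratic residues mod p
    assert p % 4 == 3                 # step 4
    assert legendre(p - 1, p) == -1   # step 3: -1 is a nonresidue
    # final claim: (m/p) x m mod p is a residue for every m in 1..q
    return all((legendre(m, p) * m) % p in squares for m in range(1, q + 1))


for p in (7, 11, 23, 47, 59):  # small safe primes, chosen for illustration
    print(p, check_safe_prime(p))
```

A check like this is no substitute for the proof, but it catches slips such as forgetting that the argument needs q to be odd.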

Let’s look at how this is implemented in two open source e-voting projects with ElGamal encryption, Helios by Ben Adida and UniVote by the Bern E-voting group. Because vote encryption takes place at the voting booth, which runs in the browser, the code is JavaScript. In Helios we can find the ElGamal encoding of integers in the elgamal.js file, which is also included in the nVotes voting booth

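    var y = m.add(BigInt.ONE);          // shift by one: zero is not in the subgroup
    var test = y.modPow(pk.q, pk.p);    // Euler's criterion: y^q mod p
    if (test.equals(BigInt.ONE)) {
      this.m = y;                       // y is already a quadratic residue
    } else {
      this.m = y.negate().mod(pk.p);    // otherwise use -y mod p
    }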

The first line is a technicality: it adds 1 to the input because the subgroup G_{q} does not include the value zero. The subsequent lines implement the (m/p) x m encoding. If you recall, for a nonzero input the Legendre symbol has two possible values, 1 and -1. The first branch of the if statement leaves m unchanged, the (1 x m) case. The second branch corresponds to changing m’s sign, the (-1 x m) case. The expression in the if statement applies Euler’s criterion to calculate the Legendre symbol

$$\left(\frac{m}{p}\right) \equiv m^{\frac{p-1}{2}} = m^{q} \pmod p$$

This shows that the JavaScript code is in effect calculating (m/p) x m. We can see a similar method in UniVote; the code is found here:

    var one = leemon.str2bigInt("1", 2, 1);
    // shift by one, as in Helios
    var t1 = leemon.add(bigIntInZq, one);
    // Euler's criterion: t1^q mod p
    var t2 = leemon.powMod(t1, encryptionSetting.q, encryptionSetting.p);
    if (leemon.equals(t2, one) == 1) {
      return t1;                                    // already a quadratic residue
    } else {
      return leemon.sub(encryptionSetting.p, t1);   // p - t1, i.e. -t1 mod p
    }

Again, we are adding 1, and then branching based on the Euler criterion. There is a small difference: whereas above the value (-1 x m) was calculated, in this case we have the value p – m. But because modular congruence is compatible with subtraction^{[2]}, then

*-m mod p = (0 – m) mod p = (p – m) mod p*

So both expressions, -m and p – m, are equivalent, mapping the input m to a quadratic residue.
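The same logic can be written compactly outside the browser. The following is a minimal sketch, not code from Helios or UniVote: the function names `encode` and `decode` are assumptions, and the decoding step does not appear in either snippet above. It relies on the observation that the shifted input y lies in 1..q, so an encoded value above q can only have come from the p - y branch.

```python
def encode(m: int, p: int, q: int) -> int:
    """Map m in [0, q-1] to a quadratic residue modulo p = 2q + 1."""
    if not 0 <= m < q:
        raise ValueError("m must be in the range [0, q-1]")
    y = m + 1                  # shift away from zero, as in both snippets
    if pow(y, q, p) == 1:      # Euler's criterion: y is a residue
        return y
    return p - y               # -y mod p, which is then a residue


def decode(x: int, p: int, q: int) -> int:
    """Invert encode: values above q can only come from the p - y branch."""
    y = x if x <= q else p - x
    return y - 1


p, q = 23, 11
encoded = [encode(m, p, q) for m in range(q)]
print(encoded)
print([decode(x, p, q) for x in encoded])
```

The round trip shows that the encoding is injective on the allowed range, which is what makes decoding after decryption possible.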

Here we show some concrete encoding examples for small p. Let’s first print the values in G_{5} for a choice of safe prime 11 = (2 x 5) + 1.

The quadratic residues are thus G_{5} = {1, 3, 4, 5, 9}. Now let’s see if the encoded values in the allowed range q = 5 actually map to residues.

where we can see that the encoded values are indeed in G_{5}. Let’s run another test for p = (2 x 11) + 1

Again, the encoding works.
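Both tests can be reproduced with a short script. This is a sketch with assumed helper names (`subgroup`, `encode`), not the code originally used for the post; it applies the encoding directly to m in 1..q, without the shift by one.

```python
def subgroup(p: int) -> list[int]:
    """Quadratic residues modulo p, i.e. the subgroup G_q for p = 2q + 1."""
    return sorted({(x * x) % p for x in range(1, p)})


def encode(m: int, p: int) -> int:
    """(m/p) x m mod p for m in 1..q, using Euler's criterion."""
    q = (p - 1) // 2
    return m if pow(m, q, p) == 1 else p - m


for p in (11, 23):
    q = (p - 1) // 2
    print(f"p = {p}, G_{q} = {subgroup(p)}")
    for m in range(1, q + 1):
        print(f"  {m} -> {encode(m, p)}")
```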

In this post we have seen how encrypting in ElGamal first requires encoding data into the correct subgroup, and how this can be done with the expression (m/p) x m in terms of the Legendre symbol. We also saw why this works, using several properties of modular arithmetic. Two implementations of this technique were shown, in which the Euler criterion was used to determine residuosity. Finally we saw some concrete examples of subgroups and encodings for small values of p and q.

[1] We are using two properties here

- if *a* ≡ *b* (mod *n*), then *k a* ≡ *k b* (mod *n*) for any integer *k* (compatibility with scaling)
- if *a*_{1} ≡ *b*_{1} (mod *n*) and *a*_{2} ≡ *b*_{2} (mod *n*), then *a*_{1} + *a*_{2} ≡ *b*_{1} + *b*_{2} (mod *n*) (compatibility with addition)

[2] Compatibility with subtraction,

- if *a*_{1} ≡ *b*_{1} (mod *n*) and *a*_{2} ≡ *b*_{2} (mod *n*), then *a*_{1} – *a*_{2} ≡ *b*_{1} – *b*_{2} (mod *n*)

The post Ballot privacy for weighted voting appeared first on nVotes - Online Voting.

In previous posts we have discussed the idea of degree of privacy and its application to weighted voting. In those posts we were concerned about what can be revealed about voters’ choices given an election result. We saw an attack method where privacy was outright broken, as well as a more general way to compute probabilities over choices. Both these methods used techniques from polyhedral geometry as implemented in programs developed by researchers in that area. Having covered this ground, here we suggest a simple modification to incorporate weighted voting into a standard mixnet-based secure voting scheme we have discussed before. To be clear, this scheme addresses ballot privacy specifically, and does not handle information leaks of the type discussed before.

The original scheme is described at a high level here. You can also find additional technical information at the nMix GitHub page and its guide; nMix implements the protocol.

The main steps of the protocol are

- Key generation
- Voting
- Mixing
- Decrypting
- Tally and verification

Our proposal for adding support for weighted voting is simple: modifications need to be made only at the voting step. During this step, users cast votes through the voting booth, which is in charge of encoding and encrypting their choices. The voting booth then sends this encrypted content to the ballotbox, where it is stored prior to mixing, decrypting and then tallying. We would like to add some mechanism by which the vote tally reflects the weights of the corresponding users. A naive first idea would be to encode and then multiply vote values such that when the plaintexts are added together they automatically reflect weights. Unfortunately, this is not a good idea, even if it sounds like it’s a good fit for the homomorphic property of ElGamal encryption.

The first problem is that weighted votes have to be validated so that they fit the census data that specifies whose vote has which weight. If the weights are part of the ciphertext, you cannot directly verify that they are correct, and instead need to resort to zero knowledge proofs, which make the scheme more complicated.

Second, there are ballot flexibility problems. For some electoral systems, for example those using preferential ballots, there is no way to reflect the user’s greater weight in the vote value. Note that this would also be a problem if attempting to carry out the entire tally in ciphertext space, as is the case for systems based on homomorphic encryption (see Adida 2009).

Finally, and this is the show stopper, the vote weights would appear when decrypting into plaintexts, which would compromise privacy. This is a severe version of the problem discussed in the previous posts: the ballots are linkable to users with the corresponding weight, irrespective of the result.

There is a simple solution: instead of multiplying values in the votes, one can multiply the entire ballots themselves. During the voting phase, encrypted ballots are received at the ballotbox. But instead of just storing the received ballot, the ballotbox (or equivalent component) can duplicate said ballot according to the user’s weight. The resulting set of ballots is then stored as usual. Cast-as-intended and recorded-as-cast verifiability mechanisms are the same, except that in the latter case the voter can verify not only that the ballot was received correctly, but also that it was duplicated the required number of times.
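As a rough illustration of the duplication step, consider the sketch below. It is not taken from the nVotes code base: the `BallotBox` class, its method names and the census dictionary are hypothetical, and ciphertexts are represented as opaque strings.

```python
from dataclasses import dataclass, field


@dataclass
class BallotBox:
    """Hypothetical ballot box storing one copy of a ballot per unit of weight."""
    census: dict[str, int]                       # voter id -> integer weight
    stored: list[str] = field(default_factory=list)

    def cast(self, voter_id: str, encrypted_ballot: str) -> int:
        """Store the encrypted ballot `weight` times and return that count."""
        weight = self.census[voter_id]           # weight comes from census data
        self.stored.extend([encrypted_ballot] * weight)
        return weight                            # the voter can check this number


box = BallotBox(census={"alice": 3, "bob": 1})
box.cast("alice", "ciphertext-A")
box.cast("bob", "ciphertext-B")
print(len(box.stored))  # number of ballots that will enter the mixnet
```

The point of the design is that the weight never enters the ciphertext: it is applied from census data when the ballot is stored, so it can be checked without any additional proofs.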

From then on, the mixing, decrypting and tallying phases are identical. Note that, since the votes are anonymized in the mix, the duplication of input ballots has no effect and no consequences for privacy. In particular, a set of duplicate plaintexts from one voter is indistinguishable from a set of individual plaintexts from several voters (modulo our previous posts). During the tally and verification phase, the proofs of shuffle grant counted-as-recorded verifiability and thus the scheme as a whole is still end-to-end verifiable. And because weighting occurs at the ballot level, it works for any electoral method or ballot design.

There is a price to pay for the simplicity, however: there will be more ballots to process in the mixnet, and the tally will take longer. The extent of this problem depends on the weights, how large they are, and how many users have them. Additionally, if there are non-integer weights it may be necessary to duplicate ballots a large number of times to preserve relative power. Or one could reduce duplication at the expense of reducing tally accuracy. A related tradeoff is discussed in (Adida 2009).
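To make the tradeoff concrete: non-integer weights can be turned into integer duplication counts by scaling with a common denominator. This sketch assumes the weights are given as exact fractions; the function name `duplication_counts` is made up for illustration.

```python
from fractions import Fraction
from math import gcd, lcm


def duplication_counts(weights: list[Fraction]) -> list[int]:
    """Smallest integer counts that preserve the ratios between the weights."""
    scale = lcm(*(w.denominator for w in weights))   # clears all denominators
    counts = [int(w * scale) for w in weights]
    common = gcd(*counts)                            # remove any shared factor
    return [c // common for c in counts]


print(duplication_counts([Fraction(3, 2), Fraction(1), Fraction(1, 4)]))
```

Weights of 1.5, 1 and 0.25 already require 6, 4 and 1 copies respectively. Rounding the weights before scaling lowers these counts, at the cost of tally accuracy.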

Depending on the magnitudes involved, the weighting via duplicating solution may not be appropriate. In spite of this possibility, it seems that in general the extra computation and tally time is well worth the simplicity of the scheme, both theoretically and in terms of implementation.

Adida 2009 – Electing a University President using Open-Audit Voting: Analysis of real-world use of Helios

The post Degree of privacy for weighted voting appeared first on nVotes - Online Voting.

The problem is that calculating the terms in the entropy expression is in general intractable due to combinatorial explosion; computing those values is only possible for small numbers of voters and weight categories.

In this post we generalize the inference method to probabilities and therefore degree of privacy. We then apply it to the previous examples of successful attacks, and check for consistency. Since those examples have small values, the method is expected to be computationally feasible.

Recall that the problem can be framed in terms of polyhedral geometry and integer programming. An election is a polytope whose feasible region corresponds to the possible outcomes consistent with public data, including the result. Points are described in terms of the vote grouping function

and the set of points consistent with a result **r** is

where **t** is the tally function and **a _{p}** encodes voter selections implied by

The first step is noticing that each point actually corresponds to a *set* of possible outcomes: point coordinates specify only the number of voters from a weight group **w**_{n} that selected each choice, not which voters they were.

The multiplicity of a point is thus

To obtain the total multiplicities for |**A _{r}**| for a result

These are the overall total multiplicities. Next we need to select those corresponding to specific votes. The fraction of the outcomes for a point in which voter **v** chose **c** is then

As before, to obtain the multiplicities for a result **r** we simply sum over all the points consistent with that result

We now have expressions for the two terms required to calculate probabilities and degree of privacy.
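Written out for the Yes/No case, the quantities described above take the following form. This is a reconstruction based on the description in this post and on the implementation shown further down, not the original typeset equations; the symbols x_n (number of Yes voters in weight group n), V_n (the voters in group n) and P_r (the lattice points consistent with result r) are notation assumed here.

$$
\begin{aligned}
\mathrm{mult}(x) &= \prod_{n} \binom{|V_n|}{x_n} \\
|A_r| &= \sum_{x \in P_r} \mathrm{mult}(x) \\
m(v, \mathrm{Yes}, r) &= \sum_{x \in P_r} \mathrm{mult}(x)\,\frac{x_n}{|V_n|}, \qquad v \in V_n \\
p(\mathrm{Yes} \mid v, r) &= \frac{m(v, \mathrm{Yes}, r)}{|A_r|}
\end{aligned}
$$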

In the previous post we needed to calculate the *number* of points in the feasible region of the election polytope. In this case our expressions depend not only on the number of points, but also on their coordinates. We need to list these points instead of just counting them, and then calculate multiplicities and proportions for each. For this we turn to the Polymake software and one of its backend tools, the Parma Polyhedra Library. We will use examples from the previous post as test elections.

This is the data for Example 2

| Parameter | Value |
| --- | --- |
| Choices | Yes, No |
| Yes tally | 94 |
| Weights | 17, 8, 4, 2 |
| Voters | 2, 6, 6, 10 |
| Lattice points | 19 |

we use the following script to convert the data into a polymake polytope and compute the lattice points

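    use application 'polytope';

    # each row [a0, a1, ..., a4] encodes a0 + a1*x1 + ... + a4*x4 >= 0,
    # here 0 <= x_n <= (number of voters in weight group n)
    my $inequalities = [
      [2, -1, 0, 0, 0],
      [6, 0, -1, 0, 0],
      [6, 0, 0, -1, 0],
      [10, 0, 0, 0, -1],
      [0, 1, 0, 0, 0],
      [0, 0, 1, 0, 0],
      [0, 0, 0, 1, 0],
      [0, 0, 0, 0, 1]
    ];

    # weighted Yes tally: -94 + 17*x1 + 8*x2 + 4*x3 + 2*x4 = 0
    my $equations = [
      [-94, 17, 8, 4, 2]
    ];

    my $e = new Polytope(INEQUALITIES=>$inequalities, EQUATIONS=>$equations);
    print $e->LATTICE_POINTS;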

we get

there are 19 rows, which correspond to the point count obtained with Latte as shown in the data. These points are then input to the calculations we obtained above to yield the degree of privacy and probabilities. This is implemented here

    import java.math.{BigDecimal, BigInteger, RoundingMode}

    object D2 extends App {
      val lines = scala.io.Source.fromFile(args(0)).getLines
      val voters = args.slice(1, args.length).map(_.toInt)
      println(s"Processing file '${args(0)}' with groups [${voters.mkString(" ")}]..")

      // we skip the first column and take n number of variables
      val lineValues = lines.map(_.split(' ').slice(1, voters.size + 1))

      // calculate total points for each voter casting Yes, plus overall total
      val totals = lineValues.map { line =>
        val lineInts = line.map(_.trim.toInt)
        val withVoters = lineInts.zip(voters)
        // multiply binomial coefficients for all coordinates
        val total = withVoters.map { case (a, b) => binomial(b, a) }.reduce(_.multiply(_))
        // the fraction of points in which the voters cast Yes
        val points = withVoters.map { case (a, b) =>
          total.multiply(BigInteger.valueOf(a)).divide(BigInteger.valueOf(b))
        }
        // the last array holds the total multiplicities for each point
        points :+ total
      }

      val arr = totals.toArray
      println(s"Found ${arr.size} lattice points")

      // sum multiplicities for each point, m(v, c, r)
      val sums = arr.transpose.map { values => values.reduce(_.add(_)) }

      // overall multiplicities, Ar
      val allSpace = new BigDecimal(sums(sums.length - 1))
      println(s"Total solutions $allSpace")

      // probabilities = m(v,c, r) / Ar
      val ps = sums.dropRight(1).map(new BigDecimal(_).divide(allSpace, 5, RoundingMode.HALF_UP))
      println("p = " + ps.mkString(" "))

      // entropies
      val hs = ps.map { prob =>
        val p = prob.doubleValue
        val q = 1 - p
        -(p * (Math.log(p) / Math.log(2)) + (q * Math.log(q) / Math.log(2)))
      }

      // degree of privacy, with 2 options Hm = 1, so d = H(x)
      // (p = 0 or p = 1 yields NaN above, which stands for zero entropy)
      println("d = " + hs.mkString(" ").replace("NaN", "0"))

      /**
       * Binomial coefficient
       */
      def binomial(n: Int, k: Int): BigInteger = {
        var ret = BigInteger.ONE
        for (i <- 0 to k - 1) {
          ret = ret.multiply(BigInteger.valueOf(n - i)).divide(BigInteger.valueOf(i + 1))
        }
        ret
      }
    }

which when run gives us

The probabilities and degree of privacy are highlighted in red. Note how the degree of privacy for the two voters in the first group is 0: we know they voted **Yes** with certainty. This matches what we expect given the successful attack data shown in the previous post. Also shown is the total number of possible outcomes consistent with the result, which in this case is 142,442. We can do the same for Example 1.
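For elections this small the figures can be cross-checked without polymake. The sketch below is an independent Python re-implementation, not the code used for the post: it enumerates the lattice points by brute force, which is only viable for small groups, and applies the same binomial multiplicities. All function names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import comb, log2, prod


def lattice_points(weights, voters, yes_tally):
    """All vectors x with 0 <= x_n <= voters_n and sum(w_n * x_n) == yes_tally."""
    ranges = [range(v + 1) for v in voters]
    return [x for x in product(*ranges)
            if sum(w * k for w, k in zip(weights, x)) == yes_tally]


def yes_probabilities(points, voters):
    """Return (|A_r|, probability of Yes for a voter in each weight group)."""
    total = 0
    per_group = [Fraction(0)] * len(voters)
    for x in points:
        mult = prod(comb(v, k) for v, k in zip(voters, x))  # outcomes behind x
        total += mult
        for n, (v, k) in enumerate(zip(voters, x)):
            per_group[n] += Fraction(mult * k, v)  # share where a voter chose Yes
    return total, [m / total for m in per_group]


def degree_of_privacy(p):
    """Binary entropy of p; with two choices the maximum entropy is 1 bit."""
    p = float(p)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))


weights, voters, yes_tally = [17, 8, 4, 2], [2, 6, 6, 10], 94  # Example 2
points = lattice_points(weights, voters, yes_tally)
total, probs = yes_probabilities(points, voters)
print(len(points), total)
print([round(float(p), 5) for p in probs])
print([round(degree_of_privacy(p), 5) for p in probs])
```

For Example 2 this finds 19 lattice points and 142,442 outcomes, in agreement with the figures above, and a probability of 1 for the first group.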

| Parameter | Value |
| --- | --- |
| Choices | Yes, No |
| Yes tally | 1503 |
| Weights | 61, 24, 18, 12 |
| Voters | 15, 10, 20, 30 |
| Lattice points | 83 |

for which we get

where again we observe the expected value of **d** = 0 among the results. Other probabilities are closer to 0.5 and therefore **d** values are closer to 1.0, but still leaking some information.

In this post we have combined results from the previous two posts to arrive at a method to obtain probabilities and degree of privacy for weighted plurality elections. We have seen how this requires taking point multiplicities into account, based on lattice point coordinates and not just point totals. We used polymake and ppl for lattice point listing, and then implemented the calculations in code for the case of |**C**| = 2 (Yes/No) elections. Applying this to the examples of the previous post gave us probabilities over voters’ choices; the presence of values p = 1.0 and **d** = 0 is consistent with the successful attacks shown there. Although the implementation shown is limited to the Yes/No case, the method could be implemented for general values of |**C**|. We have not examined how the polyhedra tools and code scale to large numbers of voters, where points and multiplicities could explode.

The post A privacy attack on weighted voting appeared first on nVotes - Online Voting.

In the previous post we suggested an extension of the degree of anonymity of (Diaz 2002) to voting. Recall the suggested definition for degree of privacy

where

we also mentioned that this extension could be useful for weighted voting, because in that case results leak more information than in more typical elections. The problem is that calculating the terms in the entropy expression is in general intractable due to combinatorial explosion; computing those values is only possible for small numbers of voters and weight categories.

But although we cannot compute a value for **d** in general, we can restrict the task to a more limited, but significant, calculation. The decision problem defined by

expresses whether or not the vote cast by voter **v**, when the election result is **r**, can be revealed. If the value of **d** is 0, vote secrecy is broken. Otherwise (**d** > 0), the attacker cannot determine what the voter selected with certainty. This is the significance of the decision problem as a restricted calculation of the general value of **d**. In terms of the entropy expression, the decision problem reduces to determining whether there is some choice **c _{i}** to which the assigned probability is 1. That is

the condition on the right says that it is certain that voter **v** selected **c _{i}**, and therefore

The calculation we are trying to make can be reformulated in terms of computational discrete mathematics, specifically integer programming and knapsack problems. An integer program is formulated as (wikipedia)

Let’s ignore the maximization part, and compare the bottom expressions with what defines a weighted voting tally under plurality. First let’s group voters according to their weight and choice:

so, for example, **v**(Yes, 3) would be the set of voters who selected Yes and have an assigned weight of 3. The tally function is then

which when rearranged as

has the same form as

where **A** is the weight matrix, **x** is the grouping of voters, and **b** is the set of results per choice. Note how the number of voters in each group is a non-negative integer, satisfying the two lower conditions[2]. Thus, the outcome of an election defines an integer programming problem. Geometrically, an election can be interpreted as a polytope whose interior represents the space of possible results that are consistent with available data. We can visualize this (wikipedia)

Our decision problem is the question of whether *all* the points in the feasible region defined by the election result have the property of encoding the same choice for a given voter. If so, then it is certain that said voter made that choice, and **d** = 0.

The decision procedure is then to count the number of points with the property and see whether it equals the total number of points in the region.

Latte Integrale

Latte Integrale is a software package for lattice point counting and integration over convex polytopes. A theory page linking to academic research is here; the main result used in our case is (Koppe 2006). Latte takes several input formats to define integer programs whose lattice points can then be counted. The decision procedure is then

- Given some **d**_{v,r}, derive the equations that define the corresponding general integer program.
- Count the number of lattice points for the general program.
- For each possible **c**_{i}, modify the corresponding equation so that the integer program reflects the constraint that voter **v** selected **c**_{i}
    - Count the number of lattice points for the specific case **c**_{i}
    - If the numbers match, **d**_{v,r} = 0: voter **v**‘s choice has been revealed as **c**_{i}.
- Otherwise **d**_{v,r} > 0.
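The procedure can be prototyped without Latte when the groups are small. In the sketch below, brute-force enumeration stands in for Latte's lattice point counting, and the constraint that a voter selected a choice is expressed at the group level (every voter in a weight group chose Yes, or every voter chose No), which is an assumption about how the per-voter constraint is encoded. Function names are hypothetical.

```python
from itertools import product


def count_points(weights, voters, yes_tally, fixed=None):
    """Count lattice points; `fixed` optionally pins group index -> Yes count."""
    fixed = fixed or {}
    ranges = [[fixed[n]] if n in fixed else range(v + 1)
              for n, v in enumerate(voters)]
    return sum(1 for x in product(*ranges)
               if sum(w * k for w, k in zip(weights, x)) == yes_tally)


def revealed_choices(weights, voters, yes_tally):
    """Map group index -> 'Yes'/'No' for groups whose vote is certain (d = 0)."""
    total = count_points(weights, voters, yes_tally)
    revealed = {}
    for n, v in enumerate(voters):
        for choice, yes_count in (("Yes", v), ("No", 0)):
            constrained = count_points(weights, voters, yes_tally, {n: yes_count})
            if total > 0 and constrained == total:
                revealed[n] = choice
    return revealed


# the election with Yes tally 94 from the examples in this post
print(revealed_choices([17, 8, 4, 2], [2, 6, 6, 10], 94))
```

For that election the only group reported is the first one (weight 17), whose two voters must both have voted Yes.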

We used elections with Yes/No choices and plurality weighted voting to test the procedure. These examples were found by hand using comparatively small values and a similar pattern, so they’re not representative of real world cases. We show the Latte definition only in the first example, a link to all examples is provided at the bottom. The voters marked in bold have their privacy broken.

    5 5
    1503 -61 -24 -18 -12
    15 -1 0 0 0
    10 0 -1 0 0
    20 0 0 -1 0
    30 0 0 0 -1
    # m(v, c, r)
    # linearity 2 1 2
    # Ar
    linearity 1 1
    nonnegative 4 1 2 3 4

| Example | Choices | Yes tally | Weights | Voters | Lattice points |
| --- | --- | --- | --- | --- | --- |
| 1 | Yes, No | 1503 | 61, 24, 18, 12 | 15, 10, 20, 30 | 83 |
| 2 | Yes, No | 94 | 17, 8, 4, 2 | 2, 6, 6, 10 | 19 |
| 3 | Yes, No | 323 | 43, 18, 14, 12 | 3, 6, 7, 9 | 7 |
| 4 | Yes, No | 193 | 27, 10, 8, 6 | 3, 6, 7, 9 | 10 |
| 5 | Yes, No | 1521 | 61, 24, 18, 12 | 15, 10, 20, 30 | 73 |

The full set of example integer programs is here.

In this post we have seen how the degree of privacy definition can be restricted into a decision problem that represents whether sufficient information is leaked by the result to break privacy. The decision problem was then reformulated as an integer program whose feasible region corresponds to the set of possible election outcomes consistent with public data. We then used the Latte Integrale software and its Barvinok algorithm implementation to test the procedure and showed several examples of successful attacks. We have not looked at whether these examples are representative of real world cases (where anonymity sets may be large).

- A point in the feasible region of the polytope strictly corresponds to a *set* of election outcomes according to the grouping function **v**(c, w).

The multiplicity of each point is given (abusing notation) by

where *Bin* is a binomial coefficient and

The grouping function **v**(c, w) reduces the dimensionality of the polytope, avoiding the explosion mentioned previously.

- One way to compute general values for **d** would be to use weighted Ehrhart polynomials supported by the Latte *--valuation=top-ehrhart* feature, using the required monomials to sum the values of interest. Unfortunately this misses the multiplicities resulting from binomial coefficients at each point, and there seems to be no way to incorporate the binomial computations into the weighting monomials.
- The frequency of cases with **d** = 0 is expected to be low due to the multiplicity of these points, which contains at least one binomial term Bin(n, n) = 1, the minimum possible value.

[1] Koppe 2006 – A primal Barvinok algorithm based on irrational decompositions

[2] Equations (inequalities) that specify the number of voters per weighted group were left out for clarity

The post Degree of privacy in voting appeared first on nVotes - Online Voting.

Note that this definition [bisimilarity-under-swapping] is robust even in situations where the result of the election is such that the votes of VA and VB are necessarily revealed. For example, if the vote is unanimous, or if all other voters reveal how they voted and thus allow the votes of VA and VB to be deduced.

The idea that information may leak from results leads us to the *degree of anonymity model,* which is

an information theoretic model that allows to quantify the degree of anonymity provided by schemes for anonymous connections. It considers attackers that obtain probabilistic information about users. The degree is based on the probabilities an attacker, after observing the system, assigns to the different users of the system as being the originators of a message

Whereas the standard privacy definition for voting is *possibilistic*, degree of anonymity is information-theoretic, and is therefore about probabilistic inference. The measure is not binary, but instead quantifies how much information an attacker can gain from observing the process. In the extension to voting discussed later, the attacker gains information by observing the results, which are available irrespective of whether ballot privacy exists.

The idea of quantifying information gain naturally leads to entropy in (Diaz 2002)

The authors then apply a normalization factor using the maximum entropy case (in other words, zero information leakage) to arrive at the degree of anonymity measure:

Given the normalization

The degree of anonymity model is about determining the sender of a message out of a possible group (the anonymity set). We want to extend this model to the case of voting, where the attacker wishes to determine what choice a voter made. The individual votes as plaintexts are not generally available, depending on the specific secure voting scheme used. This means that the translation is not immediate: in one case we are talking about determining a one-out-of-n variable, whereas for voting it’s about determining vote choice from an election result.

But we can fit probabilities directly into the degree of anonymity model. These probabilities must be derived from election results, as this is the public information available to the attacker. Also, we’d like to do this in a general way that does not depend on the specifics of the electoral method (or rule) used or the form that an election result takes. We start with

these are sets of voters, choices and election results.

the function **a** specifies each voter’s selection, modeled as a function that maps voters to choices. The function **t** represents the election tally, a function that maps the set of all choices made by voters to the corresponding result.

**A _{r}** is the set of all functions

With these expressions in hand we can provide equivalent expressions for the terms that appear in the original degree of anonymity definition. First, the entropy corresponding to the uncertainty about a voter’s choice given an election result is

This is simply the standard expression for entropy, but with probabilities that correspond to the likelihood that a voter selected a certain choice given that a certain election result occurred. The maximum entropy is the baseline, where we have no information about the election and so use a uniform probability distribution for the voter’s choice[3], which gives

Where |**C**| is the number of choices (the cardinality of **C**). Finally, the degree of privacy is

which quantifies the degree to which voter **v**’s choice remains secret given the election result **r**. This expression is analogous to the degree of anonymity model, with a few changes. First, we are talking about degree of privacy, since we are quantifying how much is known about a voter’s choice, instead of trying to de-anonymize the sender of some message. Also, this result is per voter. This characteristic is not always relevant, but would be important for the case of weighted voting, where voters are not indistinguishable and would have different degrees. It would also be possible to produce aggregate values, such as the average or the minimum value. Another generalization could be to produce expectation values over results for a given set up. As an extreme example of this case, an election with a single voter would always have a degree of privacy of 0, irrespective of the result.

As was stated before, the above definition is general to any election irrespective of the electoral method, result type or even ballot design. Next we see some specific examples.
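Because the definition only needs a tally function, it can be evaluated by brute force for very small elections. The sketch below is illustrative only: it enumerates every assignment of choices to voters, which grows as |C|^|V|, it assumes the uniform baseline described above, and the names `degree_of_privacy` and `plurality` are assumptions rather than part of any library.

```python
from collections import Counter
from itertools import product
from math import log2


def plurality(selection, choices):
    """Tally function t: number of votes per choice, as a hashable result."""
    counts = Counter(selection.values())
    return tuple(counts[c] for c in choices)


def degree_of_privacy(voter, result, voters, choices, tally=plurality):
    """d(v, r) = H(v, r) / log2(|C|), counting selections consistent with r."""
    # A_r: all selections a: V -> C whose tally equals the observed result
    consistent = [dict(zip(voters, picks))
                  for picks in product(choices, repeat=len(voters))
                  if tally(dict(zip(voters, picks)), choices) == result]
    entropy = 0.0
    for c in choices:
        m = sum(1 for a in consistent if a[voter] == c)  # m(v, c, r)
        if m:
            p = m / len(consistent)
            entropy -= p * log2(p)
    return entropy / log2(len(choices))


choices = ["Yes", "No"]
print(degree_of_privacy("Bob", (1, 0), ["Bob"], choices))  # single voter
print(degree_of_privacy("v1", (2, 2), ["v1", "v2", "v3", "v4"], choices))
```

Any other electoral method can be plugged in through the `tally` parameter, as long as it returns a value that can be compared with the observed result.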

We calculate values for example cases. In all we are using a plurality rule for the function **t**: **V** => **C** => **R**.

This is a Yes/No vote (ballot options are Yes or No). We have a single voter, Bob.

**V** = { Bob }, **C** = { Yes, No }, **R** = { {Yes:1, No:0}, {Yes:0, No:1} }

In the following expression

The value of **n** is 2, for two possible choices (the cardinality of **C**). Consider the case where **r** = Yes, then

**A**_{r} = { (Bob, Yes) }, and |**A**_{r}| = 1

since it is the only way that Bob could have voted. Similarly

**m**(Bob, Yes, Yes) = 1 and **m**(Bob, No, Yes) = 0

again, for the same reason. The entropy then reduces to

**H**(v, r) = −(1/1) · log(1/1) = 0

which when plugged into

gives

**d**(Bob, Yes) = **d**(Bob, No) = 0

The degree of privacy is zero, which matches the obvious fact that in an election with a single voter their vote will be revealed.

**V** = { v_{1} …. v_{n} }, **C** = { Yes, No }, **R** = { {Yes:n, No:0}, {Yes:0, No:n} }

In the following expression

we see that the unanimous vote case has a similar form as the case of a single voter, except for a general number of voters and results:

|**A _{r}**| = 1

for all **r**, and also

**m**(**v**, Yes, {Yes:0, No:n} ) = 0 and **m**(**v**, Yes, {Yes:n, No:0} ) = 1

which leads to

**d**(**v**, **r**) = 0

for all **v** and **r**. Again, this matches the obvious expectation: in a unanimous election the votes of all participants are unequivocally revealed.

**V** = { v_{1} …. v_{n} }, **C** = { Yes, No }, **R** = { {Yes:n, No:0} ….. {Yes:0, No:n} }

In this case the cardinality of **R** is |**V**| + 1, because we are not restricting the results to those that are unanimous. The calculation for |**A _{r}**| and

with a binomial coefficient on the right. This is the number of ways it is possible to obtain the result **r**. We also have

dividing the previous two expressions gives

Because this is a Yes/No election

which when divided by |**A _{r}**|

The complete expression for **H** is therefore

at this point we can do a couple of sanity checks. First of all, the probabilities in the entropy expression (**r**/|**V**| + 1 – **r**/|**V**|) sum to 1. Secondly, they tell us a common sense conclusion. If the number of Yes votes for an election with |**V**| voters is **r**, then the probability that any voter has selected Yes must be **r** / |**V**|. Because our method of calculating entropy is general, it is more long-winded than our intuition for the special case of plurality rule with indistinguishable voters.

Finally, the degree of privacy for the general Yes/No election is

Below is a graph of this function for a fixed value of 10 voters (|C| = 10).

The special case of unanimous elections we saw in the previous section corresponds to the two edges of the graph, where again we get d = 0.
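The shape of this function can be reproduced numerically. The following is a minimal Python sketch, not the code behind the original graph. It assumes the degree of privacy is the binary entropy of the probabilities r/n and 1 – r/n divided by its maximum value log2(2) = 1, where n is the number of voters (written |**C**| above). The function names are hypothetical. It also checks by brute-force enumeration that the fraction of vote assignments in which a given voter chose Yes is r/n.

```python
from itertools import combinations
from math import comb, log2


def yes_probability(n, r):
    """Fraction of assignments with r Yes votes in which voter 0 voted Yes."""
    # Each tuple is one possible set of Yes voters producing the result r.
    assignments = list(combinations(range(n), r))
    assert len(assignments) == comb(n, r)  # the binomial coefficient |A_r|
    with_voter = sum(1 for yes_voters in assignments if 0 in yes_voters)
    return with_voter / len(assignments)


def entropy(p):
    """Binary entropy in bits of the distribution (p, 1 - p)."""
    return sum(-x * log2(x) for x in (p, 1 - p) if x > 0)


def degree_of_privacy(n, r):
    """Assumption: entropy normalised by its maximum, log2(2) = 1 bit."""
    return entropy(r / n) / log2(2)


if __name__ == "__main__":
    n = 10
    for r in range(n + 1):
        print(r, round(degree_of_privacy(n, r), 3))
```

Under these assumptions the value is zero at both edges (r = 0 and r = n) and largest when the vote is evenly split.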

We have seen how the degree of anonymity model can be extended to voting. The extension is general to any voting method. Several simple examples illustrate that its calculations are consistent with results obtained using simpler methods specific to the election type. Although these examples are trivial, degree of privacy may be useful for more complicated cases where voters are distinguishable and more information may leak from results. The most immediate case of this type is weighted voting. Although the definitions presented here apply equally well to that case, applying them in their general form may require more work to avoid combinatorial explosion.

Diaz 2002 – Towards measuring anonymity

Delaune 2009 – Verifying privacy-type properties of electronic voting protocols

[3] It is also possible to use a non-uniform prior probability here. In that case the resulting conditional probabilities must be derived in such a way that relative magnitudes are preserved.


The post Election Methods: A typed classification appeared first on nVotes - Online Voting.

Electoral systems consist of sets of rules that govern all aspects of the voting process: when elections occur, who is allowed to vote, who can stand as a candidate, how ballots are marked and cast, how the ballots are counted (electoral method), limits on campaign spending, and other factors that can affect the outcome.

A large variety of electoral systems are employed throughout the world, each with their specific characteristics. Classifying them is not easy, and existing classifications are broad rather than fine grained. The most common classification is the one we can see for example in Golder 2005 [1]:

Similarly, Norris 2007 [2] proposes:

where a distinction is made between purely proportional systems using party lists and semi-proportional methods. Wikipedia follows Golder 2005, although the concept of semi-proportional systems is mentioned in several places.

Again, the main categories are Majoritarian, Proportional and Mixed.

We can also propose a more technical classification geared towards software engineering. By this we mean a classification which serves to construct software that implements electoral methods for the purposes of electronic voting. With this approach we focus more on the data representation and counting properties of electoral systems, and leave other details as practiced in traditional voting aside. Our classification must be precise enough to start writing code.

We can start with the following attributes which act as facets:

- Ballot data

The structure of the data that describes ballot information. For example, a unique choice, or a ranked set of choices.

- Voting rule

The algorithm that takes ballot data as input and produces result data as output. For example, plurality.

- Result data

The structure of the data that describes result information. For example, a unique winner, or a set of winners.

We can encode this classification scheme in a programming language. In this case we are doing it with Scala, and since this is not a full implementation we can create the necessary structure with abstract types. Besides the added precision, this encoding allows us to type-check our classification and reveal inconsistencies. Here’s the structure that represents an electoral method.

Where we can see data types corresponding to ballot data (Option, Choice), voting rule (Rule) and result data (Result). The types at the bottom are extra technical details required for type checking. The Method trait is then refined for each particular electoral method. Before that, we need to specify the types for voting rules, for which we choose a set of commonly used ones:

Above we can see, for example, that STV is a voting rule that uses ranked ballots since it uses a *SortedSet* data structure. Finally, here’s an example of a specific method, in this case first-past-the-post:

which is an electoral method that accepts a single choice on the ballot, uses the Plurality rule, and produces a single winner. You can see the full Scala encoding here.
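To make the three facets concrete, here is a rough Python analogue. It is not the Scala encoding referred to above, only an illustrative sketch. The names `Method`, `plurality` and `first_past_the_post` are hypothetical, ballots are assumed to be plain strings naming a single choice, and ties are broken alphabetically purely for determinism.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Generic, Sequence, TypeVar

Ballot = TypeVar("Ballot")  # facet 1: ballot data
Result = TypeVar("Result")  # facet 3: result data


@dataclass(frozen=True)
class Method(Generic[Ballot, Result]):
    """An electoral method: a named voting rule from ballots to a result."""
    name: str
    rule: Callable[[Sequence[Ballot]], Result]  # facet 2: voting rule


def plurality(ballots: Sequence[str]) -> str:
    """Single winner with the most votes (assumed alphabetical tie-break).

    Raises ValueError if there are no ballots.
    """
    counts = Counter(ballots)
    return max(sorted(counts), key=counts.__getitem__)


# Single choice on the ballot, Plurality rule, single winner.
first_past_the_post: Method[str, str] = Method("first-past-the-post", plurality)
```

A ranked method would keep the same shape but use a different ballot type, for example a tuple of choices in preference order.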

[1] http://mattgolder.com/files/research/es3.pdf

[2] https://sites.hks.harvard.edu/fs/pnorris/Acrobat/Choosing%20Electoral%20Systems.pdf


The post Anonymity and e-participation: Pros and Cons appeared first on nVotes - Online Voting.

The following serves as a starting point[20]

Our long community-network experience suggests that this weak form of identification is inadequate, if a trustworthy social environment that encourages public dialogue and deliberation is to be created. Online identity should, insofar as possible, reflect offline identity: if citizens wish to get a public answer from someone who plays a public role and appears online with her/his actual identity, they must do the same. They have to ‘show their face’ and take responsibility for participating under their actual identity[34]. This serves also to root the online community in the “proximate community” served by the network[35].

The authors consider that deliberation is indeed incompatible with anonymity and therefore privacy, as we mentioned above. Citizens participating anonymously would not be accountable or responsible for their contributions, and these are important requirements for deliberation to take place. On the other hand, the authors also observe that[20]

Nevertheless, even in online deliberative contexts, there are cases in which it is worth protecting participants’ privacy. This might occur during public consultations and discussion on sensitive issues or public assessments of an official that could bounce back on the participants, as in the case of the assessment of a teacher by his/her students as well as in the case of doctors rated by patients (e.g. http://www.patientopinion.org.uk/). In all these cases, there is the need to integrate a strong authentication policy (so that, e.g., only the students who have actually taken a class can rate the teacher) with secrecy techniques for protecting participants’ identity. Software can help achieve this by obscuring the identity of the sender of a message in such critical discussion areas.

Beyond this introduction, the report goes on to summarize the main points found in other studies in literature with short bullet-point descriptions:

Greater anonymity may increase uncivil behaviour and the use of offensive words[21,22].

Greater anonymity may reduce comment quality[23].

Greater anonymity may reduce trust, cooperation and accountability[22].

Conversely

Greater anonymity may increase participation[25] and engagement[26].

Greater anonymity may yield more information[23] and produce more honest[26] and original ideas[29].

Greater anonymity may produce more equal[30,31] interactions leading to free discussion of controversial issues.

From this we can summarize even further and identify the fundamental tension in the discussion of e-participation, deliberation and privacy. If we protect privacy with anonymity we reduce accountability, which may produce suboptimal outcomes. If we enforce real identities we reduce freedom, which again may produce suboptimal outcomes. The central tension is thus between accountability and freedom[36], modulated by the specifics of the particular case.

For example, if the object of deliberation is subject to strong social pressures and therefore self-censorship, then freedom is essential to ensure that dissenting and less represented views are heard. On the other hand, if the deliberation on the platform has a low signal to noise ratio, or is a target of uncivil behavior, measures may be required to increase accountability.

Given the opposing forces in play, it is difficult to find any general recommendations:

Prior research is not specific enough to warrant practical recommendations for Decidim, only general trends to bear in mind. Some of the drawbacks and benefits mentioned above may not appear when using anonymized pseudonyms, since that technique exists at a midpoint in the anonymity spectrum[32].

The last sentence hints at a possible equilibrium between accountability and freedom in pseudonymity. We will return to this in further posts.

References

[20] De Cindio, Fiorella. 2012. “Guidelines for Designing Deliberative Digital Habitats: Learning from E-Participation for Open Data Initiatives.” The Journal of Community Informatics 8 (2).

[21] Fredheim, Rolf, Alfred Moore, and John Naughton. n.d. “Anonymity and Online Commenting: An Empirical Study.” SSRN Electronic Journal. doi:10.2139/ssrn.2591299.

[22] Cho, Daegon, and Alessandro Acquisti. 2013. “The More Social Cues, The Less Trolling? An Empirical Study of Online Commenting Behavior.”

[23] Diakopoulos, Nicholas, and Mor Naaman. 2011. “Towards Quality Discourse in Online News Comments.” In Proceedings of the ACM 2011 Conference on Computer Supported Cooperative Work – CSCW ’11. doi:10.1145/1958824.1958844.

[25] Fredheim, Rolf, Alfred Moore, and John Naughton. n.d. “Anonymity and Online Commenting: An Empirical Study.” SSRN Electronic Journal. doi:10.2139/ssrn.2591299.

[26] Davies, Todd. 2009. Online Deliberation: Design, Research, and Practice. Stanford Univ Center for the Study.

[29] Connolly, Terry, Leonard M. Jessup, and Joseph S. Valacich. 1990. “Effects of Anonymity and Evaluative Tone on Idea Generation in Computer-Mediated Groups.” Management Science 36 (6): 689–703.

[30] Flanagin, A. J., V. Tiyaamornwong, J. O’Connor, and D. R. Seibold. 2002. “Computer-Mediated Group Work: The Interaction of Sex and Anonymity.” Communication Research 29 (1): 66–93.

[31] Klenk, Nicole L., and Gordon M. Hickey. 2011. “A Virtual and Anonymous, Deliberative and Analytic Participation Process for Planning and Evaluation: The Concept Mapping Policy Delphi.” International Journal of Forecasting 27 (1): 152–65.

[32] “Identity and Anonymity.” 2016. Accessed December 20. http://web.mit.edu/gtmarx/www/identity.html.

[34] Casapulla, G., De Cindio F., Gentile, O., & Sonnante, L. (1998). A Citizen-driven Civic Network as Stimulating Context for Designing On-line Public Services.

[35] Carroll, J.M. & Rosson, M.B. 2003. A trajectory for community networks. The Information Society, 19(5), 381-393.

[36] This is reminiscent of the privacy-integrity tension found in secure voting, and indeed there is overlap on the privacy part as it relates to freedom.


The post 3 crypto schemes for liquid democracy (III) appeared first on nVotes - Online Voting.

In part 1 and part 2 we showed two schemes supporting liquid democracy. Scheme Mixnet/Mixnet combined results from two tallies to obtain each election result. Scheme Homomorphic/Mixnet differed from Mixnet/Mixnet in that the tally of votes for delegates (votes to delegates) and delegate votes (votes by delegates) was done homomorphically, for example through additively homomorphic ElGamal. This allowed the accumulation of delegate weights to occur in ciphertext space, retaining higher levels of privacy.
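As a reminder of what accumulating in ciphertext space means, here is a minimal Python sketch of additively homomorphic (exponential) ElGamal. It is an illustration only, not the construction used in the earlier posts. The group parameters (safe prime 2039, q = 1019, generator 4) are toy values chosen for readability and are far too small to be secure, and all function names are hypothetical.

```python
import secrets

# Toy parameters (assumption): safe prime P = 2Q + 1, G generates the
# subgroup of quadratic residues of order Q. Not secure at this size.
P, Q, G = 2039, 1019, 4


def keygen():
    """Return (private key x, public key h = G^x mod P)."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)


def encrypt(h, m):
    """Exponential ElGamal: the message is placed in the exponent of G."""
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), (pow(G, m, P) * pow(h, r, P)) % P


def add(a, b):
    """Multiplying ciphertexts componentwise adds the underlying plaintexts."""
    return (a[0] * b[0]) % P, (a[1] * b[1]) % P


def decrypt(x, c):
    """Recover G^m, then find m by brute force (only feasible for small m)."""
    # c[0] has order dividing Q, so raising it to Q - x inverts c[0]^x.
    gm = (c[1] * pow(c[0], Q - x, P)) % P
    for m in range(Q):
        if pow(G, m, P) == gm:
            return m
    raise ValueError("plaintext out of range")
```

Summing the encrypted votes for one delegate and decrypting only the total reveals the delegate's weight without decrypting any individual vote. The total must stay below Q for decryption to be unambiguous.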

Continuing the theme of operating in ciphertext space we have the FH scheme, which stands for Fully Homomorphic. The possibility of using fully homomorphic encryption was suggested to me by Sandra Guasch when discussing liquid democracy at a crypto conference. In a fully homomorphic encryption scheme it is possible to compute arbitrary functions on encrypted data. For this reason some consider fully homomorphic encryption to be the holy grail of cryptography, opening the door to all sorts of applications where third parties can process information without access to its contents.

In the FH scheme below we use a fully homomorphic scheme that supports both addition and multiplication (in contrast to ElGamal, which can support only one of the two at a time). From these two primitive operations it is possible to perform arbitrary computations on encrypted data. In our case it turns out that addition and multiplication are exactly the two operations required for a liquid tally.

Above, all boxes contain double borders indicating encrypted content. We can see therefore that all tally operations occur in ciphertext space. Only when the encrypted data has been processed to result in the final tally is it decrypted. There is also an optional arrow to decrypt delegate votes, as a mechanism for voters who want to monitor their delegates’ choices and keep them honest. The sequence of operations is

- the votes for delegates are added to result in delegate weights,
- these are then multiplied by delegate choices, and
- finally combined with direct votes.
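The three steps above can be sketched on plaintext data. In the FH scheme the same additions and multiplications would be applied to ciphertexts instead. This Python sketch is illustrative only: the function name and data layout are hypothetical, delegate choices are treated as one-hot selections, and a delegate's own ballot is assumed to count only if it also appears among the direct votes.

```python
def liquid_tally(direct_votes, delegations, delegate_choices, options):
    """Plaintext analogue of the liquid tally.

    direct_votes     -- list of options chosen directly by voters
    delegations      -- list of delegate names, one per delegating voter
    delegate_choices -- mapping from delegate name to the option they chose
    options          -- all options on the ballot
    """
    # Step 1: add the votes for delegates to obtain delegate weights.
    weights = {delegate: 0 for delegate in delegate_choices}
    for delegate in delegations:
        weights[delegate] += 1

    # Step 2: multiply each weight by the delegate's (one-hot) choice.
    tally = {option: 0 for option in options}
    for delegate, choice in delegate_choices.items():
        for option in options:
            tally[option] += weights[delegate] * (1 if choice == option else 0)

    # Step 3: combine with the direct votes.
    for option in direct_votes:
        tally[option] += 1
    return tally
```

Only additions and multiplications appear in the three steps, which is why a scheme supporting both operations on ciphertexts is sufficient.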

All these operations are carried out homomorphically on encrypted data. This scheme achieves the maximum level of privacy, capable of maintaining privacy even for delegates’ votes. In practice this is not really a benefit as it’s generally desirable for delegates to cast their votes in public to maintain transparency and trust. Moreover, unlike the previous two schemes, this scheme is not practically viable. As of this writing, fully homomorphic encryption is still experimental and prohibitively inefficient. Nonetheless we chose to include it as an interesting possibility that fits in well with liquid democracy.

We wrote a demo, FHLD, using Victor Shoup’s HElib library that implements the fully homomorphic crypto scheme we referred to before. The demo is available here. Be advised that the FHLD internals are experimental and the code that deals with HElib is somewhat cryptic.

In this series of posts we have seen three schemes that support secure liquid democracy. Below is a table that sums up the main properties of each

While the last scheme is mostly of theoretical interest, the first two can be implemented today using standard cryptographic techniques. There is thus no fundamental barrier to the use of liquid democracy tools with support for privacy as well as verifiability.


The post Secure voting – a definition appeared first on nVotes - Online Voting.

What do we mean by *secure voting*?

The expression is overloaded and ambiguous. Part of the ambiguity stems from the different contexts where the term is used. On one hand, we can speak about secure voting in the general context of cybersecurity and the internet. On the other, secure voting has more specific definitions in the academic research literature on voting systems. In this post we try to clarify the meaning of secure voting, starting from general intuitions leading to more precise technical definitions.

In a general context, secure voting is understood to be about methods, software and systems that aim to protect an election from fraud and disruption. It is a question of correctness and integrity. A voting system is secure in the sense that we can trust that the results of an election are fair and correct. The main threats faced by a secure voting system are those typical of any computer security problem: hacking, intrusion, manipulation, disruption. If we restricted our discussion to this general context, secure voting would simply be another problem in cybersecurity.

However, election integrity is not the whole story. Although the concept of cybersecurity includes concerns about data theft and privacy, the emphasis on privacy in the case of voting is critical. It’s what separates secure voting from more general problems in cybersecurity. This emphasis stems from voting’s nature as a political activity, where the crucial importance of the secret ballot is well established. The importance of the secret ballot has been recognized since Roman times, and is even enshrined in the Universal Declaration of Human Rights^{[4]}.

Article 21.3 of the Universal Declaration of Human Rights states, “The will of the people…shall be expressed in periodic and genuine elections which…shall be held by secret vote or by equivalent free voting procedures.”

Accounting for this, the goal of secure voting is then protecting voter *privacy* as well as election integrity. It turns out that these two objectives are fundamentally opposed. For this reason, secure voting as a technical discipline is about finding methods that allow achieving both objectives simultaneously. It is in this domain (the academic literature on secure voting) where we find more precise definitions that factor in the core tension that general considerations about cybersecurity fail to address. And it is in this technical domain where voting-specific cryptography is employed to solve the unique problems that arise. Thus, the term “cryptographically secure voting” seems a reasonable choice to refer to this refined, more specific meaning.

At a high level^{[3]}, the goals of cryptographically secure voting are described by^{[1]}

One of the most challenging aspects in computer-supported voting is to combine the apparently conflicting requirements of privacy and verifiability. On the one hand, privacy requires that a vote cannot be traced back from the result to a voter, while on the other hand, verifiability states that a voter can trace the effect of her vote on the result. This can be addressed using various privacy-enabling cryptographic primitives which also offer verifiability.

We stress that the use of cryptography is a means to an end, not the end in itself. A voting system that includes cryptographic techniques is not necessarily a cryptographically secure voting system.

This mistake is commonly made when assessing blockchain systems: these systems use cryptography, but this cryptography has no bearing on privacy^{[5]}. Cryptography is there to satisfy certain requirements, not the other way around.

Let’s now pin down exactly what these requirements entail.

Starting with privacy^{[2]}

Privacy: In a secret ballot, a vote must not identify a voter and any traceability between the voter and its vote must be removed.

An alternative statement is^{[1]}

Ballot-privacy: no outside observer can determine for whom a voter voted

Note that the expression “no outside observer” refers to anybody that is not the voter. The important implication is that not even the administrators of the voting system or anyone with privileged access to hardware/software can violate this privacy. If this condition is not met, a system does not support privacy and is therefore not a secure voting system. Solutions that simply “forget” data, or merely store data at different locations, do not satisfy privacy. It is not enough for the system to voluntarily disregard privacy-compromising information; said information must not be available at all. Relaxing this privacy requirement makes building a voting system trivial, typically reducing to the use of SSL for communication.

Turning to verifiability, the literature presents several variants^{[1]}:

Individual verifiability (IV): a voter can verify that the ballot containing her vote is in the published set of “all” (as claimed by the system) votes.

Universal verifiability (UV): anyone can verify that the result corresponds with the published set of “all” votes.

These two requirements appeared first in the literature, and were later augmented with^{[1]}

End-to-end verifiability: a voter can verify that:

– cast-as-intended: her choice was correctly denoted on the ballot by the system,

– recorded-as-cast: her ballot was received the way she cast it,

– tallied-as-recorded: her ballot counts as received.

The notion of verifiability of a voting system is directly related to integrity, and is in fact a strictly stronger property. Not only must the system operate correctly and the election results be fair, but it must also be possible for participants and external observers to certify this unequivocally. Verifiability is one of the areas in which electronic voting systems may offer better guarantees than traditional voting. This is accomplished through cryptographic proofs and publicly available bulletin boards that collect election data.
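To illustrate the role of the bulletin board, here is a minimal Python sketch of the recorded-as-cast check only. It is a toy under stated assumptions, not a description of any real system: ballots are treated as opaque byte strings (assumed to be already encrypted on the voter's device), the receipt is simply a SHA-256 hash, and the class and function names are hypothetical. The cryptographic proofs needed for tallied-as-recorded and universal verifiability are not shown.

```python
import hashlib


def receipt(ballot: bytes) -> str:
    """Receipt kept by the voter: a hash of the opaque (encrypted) ballot."""
    return hashlib.sha256(ballot).hexdigest()


class BulletinBoard:
    """Append-only public list of cast ballots (toy, in-memory)."""

    def __init__(self):
        self._ballots = []

    def post(self, ballot: bytes) -> str:
        """Record a ballot and hand back the receipt for later checking."""
        self._ballots.append(ballot)
        return receipt(ballot)

    def published(self):
        """Everyone sees the same published set of ballots."""
        return tuple(self._ballots)


def recorded_as_cast(board: BulletinBoard, my_receipt: str) -> bool:
    """A voter checks that her ballot appears in the published set."""
    return any(receipt(b) == my_receipt for b in board.published())
```

Because the board is public, any voter can run the check independently of the system operators.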

With these definitions in hand we can suggest:

*A cryptographically secure voting system is one that supports privacy and end-to-end verifiability.*

For brevity, these systems are referred to simply as end-to-end verifiable. End-to-end verifiable systems are considered the gold standard for electronic voting. When these characteristics are further combined with general computer security techniques the result is a generally secure voting system.

We have seen that

- The term “secure voting” is generally thought to refer to cybersecurity and resistance to cyberattacks.
- However, cybersecurity is a general property of hardware/software that does not reflect the specific requirements of voting as a political process. The secret ballot is an established and indispensable requirement for voting.
- Secure voting systems must support privacy as well as integrity; these two requirements stand in opposition.
- In a system supporting privacy, no one, not even system administrators or other privileged observers can violate the secret ballot.
- In a system supporting end-to-end verifiability, voters can ensure that their vote was cast, recorded, and counted correctly.
- Cryptographically secure voting systems employ cryptographic technology to satisfy these two properties simultaneously. End-to-end verifiable voting systems are the gold standard.
- **A secure voting system is an end-to-end verifiable voting system that also employs general computer security principles.**

This last point expresses our view of what it means for a voting system to be secure. Although this definition is very demanding, we believe it is appropriate to be conservative in an area that overlaps with political decision making. Unfortunately this approach implies that many systems that are labelled secure voting systems do not in fact belong to that category.

References

[1] Privacy and Verifiability in Voting Systems: Methods, Developments and Trends [https://eprint.iacr.org/2013/615.pdf]

[2] A framework and taxonomy for comparison of electronic voting schemes [https://www2.ee.washington.edu/research/nsl/papers/JCS-05.pdf]

[3] The literature includes many other properties relevant to secure voting; we concentrate on the essential ones. Please refer to [2] for further info.

[4] Secret ballot [https://en.wikipedia.org/wiki/Secret_ballot#International_law]

[5] Unless you include specific cryptographic techniques such as zk-SNARKs (Zcash) or linkable ring signatures in addition to the blockchain. One of our proposals is related to the first case, found here. Another proposal related to blockchain’s distributed nature is here.


The post Anonymous authenticated registration for e-participation appeared first on nVotes - Online Voting.

Following the previous post, we describe a two-phase process by which citizens can anonymously register with an e-participation platform while maintaining authentication and eligibility guarantees. Technical details are left out of this description for the sake of clarity; please refer to the previous post and its accompanying report.

To start things off, we assume there is some existing credential system in place that determines which citizens are authorized and which are not. This existing system could require on-site interaction, for example via a physical id (issued by a government or other institution), or some other previously existing digital credential (like a private key or password for other existing systems).

In the case of on-site physical credentials, there is an extra step in which the citizen is provided with a temporary digital credential. Once this physical interaction has taken place the rest of the steps are common to both types of existing credentials.

In the first phase of registration, a citizen accesses a registration page and logs in with their pre-existing digital credential. He/she is thus authenticated and validated as eligible to participate on the e-participation platform. The registration page then includes JavaScript code that will run on the client device in order to generate confidential information not available to the e-participation platform. This code will

- Generate a unique random token, using a secure random source.
- Display the token such that the citizen can copy it in some format (paper, electronic device).
- Encrypt the token with a public key previously generated by a set of independent trusted authorities.
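The client-side steps above can be sketched as follows. The sketch is in Python for illustration, whereas the real code runs as JavaScript in the browser. It assumes plain ElGamal over a toy safe-prime group (2039 = 2·1019 + 1, generator 4) and encodes the token as a quadratic residue by negating non-residues. The parameters are far too small for real use, a real token would be much longer, the authorities' key would be generated jointly rather than by a single `keygen` call, and all names are hypothetical.

```python
import secrets

# Toy parameters (assumption): safe prime P = 2Q + 1 and a generator G of
# the order-Q subgroup of quadratic residues. Not secure at this size.
P, Q, G = 2039, 1019, 4


def keygen():
    """Stand-in for the authorities' key generation: (private x, public h)."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)


def new_token():
    """Unique random token from a secure random source, in the range 1..Q."""
    return secrets.randbelow(Q) + 1


def encode(t):
    """Map t in 1..Q to a quadratic residue: keep residues, negate the rest."""
    return t if pow(t, Q, P) == 1 else P - t


def decode(e):
    """Invert encode: values above Q are negated tokens."""
    return e if e <= Q else P - e


def encrypt(h, t):
    """ElGamal encryption of the encoded token under public key h."""
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), (encode(t) * pow(h, r, P)) % P


def decrypt(x, c):
    """Performed by the authorities after mixing; c[0]^(Q - x) inverts c[0]^x."""
    return decode((c[1] * pow(c[0], Q - x, P)) % P)
```

Only the ciphertext leaves the client, so the server can store it alongside the logged-in citizen without learning the token.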

The encrypted token is then sent to the server for storage. The token is accepted because it is sent by a logged in user (using their pre-existing credentials), and therefore belongs to a valid citizen. Note that, given this authentication, the server knows which citizen the encrypted token belongs to, but because the token is encrypted the link between token and citizen cannot be established.

At this point the first phase of registration is complete. Each citizen has copied their unique token while the server has stored them in encrypted form.

Phase two begins with the anonymization of the encrypted tokens, which is carried out by a mixnet (e.g. nMix) executed by the trusted authorities. At the end of the mixing process the anonymized encrypted tokens are decrypted. This results in a set of plain unencrypted tokens that now cannot be linked to their respective citizens.

To complete registration, citizens again access the registration page, but this time anonymously, without logging in as they did in phase one. Instead, they submit their token which acts as an anonymous credential. Once authenticated with this token, the client is allowed to create their user, as they would in a normal online registration process. The difference is that this newly created user is validated yet anonymous. From this point onwards users function as they would in any web platform. Citizens can access the e-participation platform using this newly created user and participate normally, but anonymously. Their user is effectively a privacy preserving pseudonym.
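Phase two can be sketched in the same spirit. In this minimal Python illustration a plain shuffle stands in for the mixnet (a real mixnet re-encrypts the ciphertexts and proves the shuffle was done correctly), decryption is passed in as a callable, and each token can be redeemed exactly once. The `Registry` class and function names are hypothetical.

```python
import secrets


def mix_and_decrypt(encrypted_tokens, decrypt):
    """Stand-in for the mixnet: shuffle, then decrypt in the new order."""
    shuffled = list(encrypted_tokens)
    secrets.SystemRandom().shuffle(shuffled)  # breaks the submission order
    return [decrypt(c) for c in shuffled]


class Registry:
    """Server-side store of valid plaintext tokens and created users."""

    def __init__(self, valid_tokens):
        self._unused = set(valid_tokens)
        self.users = set()

    def register(self, token, username):
        """Create a pseudonymous user if the token is valid and unused."""
        if token not in self._unused or username in self.users:
            return False
        self._unused.remove(token)  # each token works only once
        self.users.add(username)
        return True
```

The registry never sees which citizen holds which token, so the resulting user is validated but cannot be linked back to a login from phase one.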

A final note on pseudonymity, as mentioned in the previous post:

E-participation tools whose nature is best described as social information filtering[6], consultation and ideation co-production[15], reputation[16], or deliberation systems require pseudonymity[36] as the privacy protecting mechanism. The notion of users with linkable contributions is fundamental to these platforms.

Users within these platforms function as pseudonyms, retaining some level of identity for purposes such as deliberation, reputation and accountability. Citizens’ anonymity is protected by the fact that the pseudonym cannot be linked to the citizen, not even by the administrators of the platform. The technical term for this is unlinkable pseudonymity. In this particular case there is an exception to this: if all independent trusted authorities agree, they can revoke a citizen’s anonymity.
