Millionaire’s Problem and me :-)

Python is a beautiful world.It indeed takes you near the nature -where we can see and make things implemented in the language – nearest to what nature speaks in.

I dont know -but i believe that if God is a coder/programmer than he would have coded the world in Python language and spoken Sanskrit:-)

Well, i try to write and edit about something -that demonstrates the usefulness and ease of Python in day to day life.

Was just going through Bryan Mills article on “The Millionaires Problem” and was thinking can I have some better solution for this. And indeed I had one..Please go through the complete traditional solution provided by Yao on his research paper and let me know your way of doing this.

Will see if yours is better than mine or not?

Lets assume there are two millionaires, named Alice and Bob. Alice and Bob what to know who is richer without revealing how much money either one of them have. This is what is known as the millionaires problem, originally introduced by Andrew Yao.

Why am I talking about this? Well I’m interested in the ability to share information among people while preserving their data privacy. In this case the two millionaires will know which is richer (information) without sharing how much either is worth (their private data).

The solution we will be exploring relies upon two assumptions. First, it is assumed that the two millionaires knows approximately how rich each are. In our example we’ll assume that Alice and Bob knows that they are both worth somewhere between 1 and 10 million dollars. The second assumption is that Alice and Bob are using a public key encryption system, in this example we’ll be using RSA. With those assumptions lets begin working through an example.

We’ll give Alice 8 million dollars (i=8) and Bob 6 million dollars (j=6). Then lets assume that Alice’s public key is (17, 1591) and her private key is (89, 1591). These keys are really small to make the math easy, also note that the maximum number that can be encoded with this pair is limited (actually the maximum is 1590).

STEP 1: Lets have Bob start the process. To do this Bob first generates a N bit random number called U, we’ll choose 1180. Bob then takes that number and computes C = 1180^17 mod 1591 = 749. Note this is actually the encrypted version of Bob’s random number using Alice’s public key.

STEP 2: Bob takes the C value he computed and adds his wealth value J and subtracts one, C – J + 1 = 749 – 6 + 1 = 744. Bob then sends Alice this value.

STEP 3: After receiving this value Alice creates values Y1, Y2, …, Y10. Where Y1 is the RSA deciphering of (C – J + 1), Y2 is the deciphering of (C – J + 2), Y3 is the deciphering of (C – J + 3), etc. Although Alice doesn’t know C or J, Alice can still compute this because Bob sent her (C – J +1). This produces the following table of Y values.

x (C – J + x) RSA Function Yx
1 744 744^89 mod 1591 805
2 745 745^89 mod 1591 281
3 746 746^89 mod 1591 339
4 747 1311
5 748 1244
6 749 1180
7 750 1062
8 751 1359
9 752 219
10 753 753^89 mod 1591 942

STEP 4: Alice generates a random N/2 prime number called p, remember Bob’s random number was N bits. We’ll use 631, its prime and its close enough to N/2 bits. Using this number Alice generates Z1, Z2, Z3, … Z10. Where Z1 = Y1 mod p, Z2 = Y2 mod p, etc. This operation generates the following table of Z values.

x (C – J + x) RSA Function Yx Zx (= Yx mod 631)
1 744 744^89 mod 1591 805 174
2 745 745^89 mod 1591 281 281
3 746 746^89 mod 1591 339 339
4 747 1311 49
5 748 1244 613
6 749 1180 549
7 750 1062 431
8 751 1359 97
9 752 219 219
10 753 753^89 mod 1591 942 311

Detail Note: The random prime needs to be chosen such that the consecutive Z values differ by at least 2, the one we chose does and its easy to verify in the general case.

STEP 5: The rest of this is just the setup, heres the punchline. Alice then sends Bob the random prime she selected and the calculated Z values: Z1, Z2, Z3, … up to the i value (the number of millions Alice has i = 8). The remaining 10 values Alice sends but adds one to each value; Z9 + 1 and Z10 + 1. To recap Alice sends the first i Z values as they were calculated, she then sends the remaining Z values plus 1. Here are the numbers alice sends:

p Z1 Z2 Z3 Z4 Z5 Z6 Z6 Z8 Z9+1 Z10+1
631 174 281 339 49 613 549 431 97 219+1=220 311+1=312

STEP 6: Bob then computes G = U mod p = 1180 mod 631 = 549. Recall U is the random number that Bob selected in step 1. The value p is known because Alice told Bob the random prime she selected (p). Bob then looks at the jth value (his wealth in millions) of the 10 Z values sent to him by Alice (ignoring the random prime Alice sent first). If this jth value is equal to G then Bob knows that Alice is of greater than or equal wealth to himself. In our case G does equal the jth value (549 = 549), this means that Alice is either of greater or equal wealth. On the other hand if G is not equal to jth value it means that Bob is richer than Alice.

STEP 7: Bob then tells Alice that she is richer. Both parties now know which person is richer and was able to determine this without exposing their actual wealth. Note that Bob actually determines if Alice is greater than or equal wealth, to determine if its equal or greater than you would have to run the algorithm in reverse (swap the roles for Alice and Bob).

At the end they both know more, for example Alice knows that Bob is worth between 1 and 8 million and Bob knows that Alice is between 6 and 10 million. This is a result of just knowing who is richer than the other and the original range of values they already knew. Besides this no other information can be determined by the involved parties.

Still not convinced? Here is some python code that demonstrates how this algorithm works, it implements RSA public key encryption and then walks through one example (similar to this article). You can easily change the wealth of Alice and Bob and see how it works. How to play with this:

  1. Download million.py
  2. Execute this file using python. python million.py
  3. At the bottom of the file is the “main” body of the program. You can edit various values in this section and generally play with everything there. Specifically BOB_VALUE, ALICE_VALUE, and MAX_VALUE.

There are several implementation problems with this when in a real world problem. First, what if people lie? Bob and or Alice could simply not tell the truth about their values. By lying Alice and Bob could run the algorithm multiple times to and eventually determine exactly how rich the other party is. Yao actually addresses this issue in his original paper and proposes a few solutions. Second, computation cost is extremely high, since you are computing public key encryption values for several values the computation time could be very high. Third, the range of values, in most practical problems the range of values is greater than 10 which means lots of computation but also lots of Z values need to be sent between Alice and Bob. Fourth, if you implement this for multiple parties there can be collusion among the parties which could reveal private data about an individual in the group.

Its interesting to think that this was written in 1982 just a few years after public key encryption was discovered. Was it a problem looking for a solution or a solution looking for a problem? Which ever direction the solution came from the millionaires problem proves that you can share information yet protect ones private data. While this solution might have some issues at least there is a solution! Since this paper was published there has been several more sophisticated solutions but this one is elegant and is fairly obvious after working through a few examples.