Google Privacy Intersection and Technical Analysis 2 – Technical Overview

In the previous article, we analyzed the application scenarios of Private Join and Compute ( Google Privacy Intersection and Technical Analysis 1 – Application Scenario Analysis ). This article analyzes its technology.

Google's open source library is to use the existing cryptographic technology to solve the problem by combining existing technologies. It's a bit like Bitcoin (the bitcoin white paper parsing link ), all standing on the shoulders of giants.

How does Google pick up fruits from academia to solve practical problems in industry?

The main job of Google's open source library is to design a practical cryptographic secure computing protocol for industrial use.

01 problem model

The main problem solved by the agreement is the calculation of privacy intersection and ( Private Intersection-Sum , PIS for short ) .

The problem model can be abstracted as:

Each of the two parties has a data set containing the user's identity, and one of the parties also has an integer associated with the user's identity, for example, the integer can be the transaction amount of the user. Both parties want to know the following:

(1) The number of common users owned by both parties;

(2) The sum of the integers corresponding to these common users without revealing any private information entered by the user.

This is a privacy intersection and ( P IS ) issue.

This problem is not an imaginary problem, but comes from the specific needs of the enterprise.

For example, in an advertising campaign, calculating the specific ad conversion rate, that is, the effect of advertising. How many people bought goods because of advertising. In this demand, multiple companies may be involved. This is a situation that often occurs in corporate collaboration.

This problem has important practical value and is needed in many scenarios and has commonalities .

02 technical framework

At first glance, the above problem is very similar to the Private Set Intersection (P SI ) . Note that PIS and PSI are two issues.

PIS is a traditional cryptographic problem in which the intersection of sets is calculated without revealing the intersection.

The PIS defined by Google here is capable of doing aggregate calculations on intersections in addition to the functions performed by PIS. Obviously this will bring extra computational overhead.

Note that aggregation is the summation of elements of the same attribute.

What Google's open source library does is to expand on the PSI solution. Extend it to do aggregate calculations on the corresponding attributes without leaking the intersection.

So the architecture of the open source library is:

PSI + sums the intersection elements (without disclosing the intersection elements)

03 technical route

The technical route of the library is to first select the most effective solution as an alternative according to the existing PSI solution. The aggregation function is then implemented by adding homomorphic encryption.

Over the years, the cryptography community has had many PSI solutions. Two ways to solve the PSI problem have been chosen on the Google technology route.

One approach is based on Random Oblivious Transfer, which takes advantage of the inadvertent PRF (OPRF) technique to obtain the ability to hide the identity of the intersection elements. Then, using additive homomorphic encryption , the aggregation function is provided without divulging the intersection elements.

The second method is to construct an oblivious protocol using the encrypted Bloom filter under additive homomorphic encryption. Aggregation is still implemented by additive homomorphic encryption.

In addition to the above two protocols, a third protocol , called the DDH type protocol, is constructed. The protocol is based on the traditional set intersection protocol, using Pohlig Hellman ciphertext (based on the difficulty of judging DDH problems). This type of protocol can be thought of as an inadvertent PRF using a shared key. Similarly, the aggregation function is also implemented by additive homomorphic encryption.

04 performance

All three protocols require addition homomorphic encryption. There are currently three additive homomorphic encryption schemes:

1. Paillier encryption scheme

2. Exponential ElGamal encryption scheme

3. Ring LWE encryption scheme

From the perspective of communication efficiency and computational efficiency, Google conducted a detailed analysis of the three protocols based on these three additions and homomorphic encryption.

The data shows that the third protocol, the DDH type protocol, achieves the best communication efficiency. In the case where the input set element is 100,000 elements, only 9.28M of traffic is required.

In addition, in terms of computational efficiency, the DDH type protocol based on the ring LWE encryption scheme still achieves the best performance. In the case where the input set contains 100,000 elements and the associated integer is 32 bits, the calculation of the PIS problem takes only 395.78 seconds.

For the other two protocols, although the computational optimization is done, the computational bottleneck is mainly spent on the homomorphic operation.

—–To be continued—–
Source: Gemi chain
Author: Dr. Zhiyuan