Popular science | Worried about privacy protection? Encrypted data vaults show their strengths (core use cases and requirements analysis)

This article is based on the second part of the paper "Encrypted Data Vaults" from Rebooting the Web of Trust IX (RWOT IX), Prague, 2019. The previous installment introduced the approaches and architecture of current encrypted data vaults, the requirements derived from them, the design goals, and the risks that developers should be aware of when implementing data storage. This installment focuses on common use cases and a requirements analysis for data storage systems, along with some guidelines and design goals for building an encrypted data vault. The next issue will cover the final part of "Encrypted Data Vaults", exploring the architecture of the encrypted data vault and some security and privacy considerations.

Original: https://github.com/WebOfTrustInfo/rwot9-prague/blob/master/final-documents/encrypted->

Authors (in alphabetical order): Amy Guy, David Lamers, Tobias Looker, Manu Sporny, and Dmitri Zagidulin

Contributors (in alphabetical order): Daniel Bluhm and Kim Hamilton Duffy

First, the core use cases

The following four use cases are common application patterns for data storage systems, but they are by no means the only use cases.

1. Store and use data

The user wants to store data in a secure location but does not want the storage service provider to be able to see any of it. Only the user should be able to see and use the data.

2. Search data

Over time, users will store large amounts of data. They need to be able to search that data without the service provider learning what they have stored or what they are searching for.

3. Share data with one or more entities

Users typically share their data with multiple entities, such as other people or services. A user may decide to grant other entities access to data in their storage when the data is first saved, or at any later point. The user's storage and data are accessible to others only if the user explicitly agrees.

Users want to be able to revoke others' access at any time and, when sharing data, to set an expiration date on a third party's access.
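The sharing behavior described above (explicit grants, revocation at any time, optional expiration dates) can be illustrated with a minimal sketch. The names `Grant` and `GrantRegistry` and the `did:example:` identifiers are hypothetical and are not taken from the paper. The sketch models only the policy decision; a real system would also have to manage the encryption keys that make shared data readable.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Grant:
    """Hypothetical record of access the owner explicitly gave to one entity."""
    grantee: str
    resource: str
    expires_at: Optional[datetime] = None  # None means no expiration date
    revoked: bool = False


class GrantRegistry:
    """Keeps the owner's grants and answers "may this entity read this resource?"."""

    def __init__(self):
        self._grants = []

    def share(self, grantee, resource, ttl=None, now=None):
        # Access exists only after the owner calls share(); nothing is public by default.
        now = now or datetime.now(timezone.utc)
        expires_at = now + ttl if ttl is not None else None
        grant = Grant(grantee, resource, expires_at)
        self._grants.append(grant)
        return grant

    def revoke(self, grantee, resource):
        # Revocation takes effect immediately for every matching grant.
        for grant in self._grants:
            if grant.grantee == grantee and grant.resource == resource:
                grant.revoked = True

    def can_access(self, grantee, resource, now=None):
        now = now or datetime.now(timezone.utc)
        return any(
            g.grantee == grantee
            and g.resource == resource
            and not g.revoked
            and (g.expires_at is None or now < g.expires_at)
            for g in self._grants
        )


if __name__ == "__main__":
    registry = GrantRegistry()
    registry.share("did:example:bob", "doc-1", ttl=timedelta(days=7))
    print(registry.can_access("did:example:bob", "doc-1"))
    registry.revoke("did:example:bob", "doc-1")
    print(registry.can_access("did:example:bob", "doc-1"))
```

Passing `now` explicitly is a convenience for testing; in normal use the current UTC time is used.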

4. Store the same data in multiple places

Users need the system to back up their data across multiple storage locations to prevent data loss. These locations may be hosted by different storage providers and accessed through different protocols; one might be the user's phone and another a cloud storage service. The locations should also be able to synchronize with each other, so that no matter where the user creates or updates data, every location is brought up to date automatically, without the user's help.

Second, the requirements analysis

From the four core use cases above, we can derive some requirements for the storage system.

1. Privacy and multi-party encryption

One of the main goals of the system is to protect the privacy of an entity's data so that unauthorized parties, including the storage provider, cannot access it.

To achieve this, the data must be encrypted both in transit (over the network) and at rest (on the storage system).

Since data can be shared with multiple entities, the encryption mechanism must also support encrypting data for multiple parties, so that each authorized party can decrypt it.

2. Sharing and authorization

The system must provide an authorization mechanism that allows encrypted information to be shared with one or more entities.

The system may specify one mandatory authorization scheme while allowing alternative schemes. Examples of authorization schemes include OAuth 2.0, Web Access Control, and ZCAPs (Authorization Capabilities).

3. Identity

The system should be identifier-agnostic. In general, identifiers in the form of URNs or URLs are preferred. Although the system is expected to use decentralized identifiers (DIDs) in some way, hard-coding the implementation to DIDs is considered an anti-pattern.

4. Versioning and replication

In general, we expect the system to back up information continuously. For this reason, the system needs to support at least one mandatory versioning policy and one mandatory replication policy, while also allowing alternative versioning and replication policies.
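As a rough illustration of how a versioning policy and a copy (replication) policy can work together, the sketch below gives every document a sequence number that the client increments on each update, and lets two storage locations exchange whichever copy is newer. The names `StoredDoc`, `Replica`, and `sync` are hypothetical, and the "highest sequence number wins" rule is only one simple policy chosen for the example. It does not resolve the case where two locations produce different content under the same sequence number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredDoc:
    doc_id: str
    sequence: int      # version counter, incremented by the client on every update
    ciphertext: bytes  # already encrypted by the client; opaque to the storage location


class Replica:
    """One storage location, for example a phone or a cloud provider."""

    def __init__(self, name):
        self.name = name
        self.docs = {}

    def put(self, doc):
        # Versioning policy: accept a write only if it is newer than what is stored.
        current = self.docs.get(doc.doc_id)
        if current is not None and current.sequence >= doc.sequence:
            return False
        self.docs[doc.doc_id] = doc
        return True


def sync(a, b):
    # Replication policy: offer every document in both directions and let
    # put() keep the copy with the higher sequence number.
    for doc in list(a.docs.values()):
        b.put(doc)
    for doc in list(b.docs.values()):
        a.put(doc)


if __name__ == "__main__":
    phone, cloud = Replica("phone"), Replica("cloud")
    phone.put(StoredDoc("doc-1", 0, b"ciphertext-v0"))
    sync(phone, cloud)
    phone.put(StoredDoc("doc-1", 1, b"ciphertext-v1"))
    sync(phone, cloud)
    print(cloud.docs["doc-1"].sequence)
```

Because the replicas only compare sequence numbers and copy opaque bytes, synchronization never requires a storage location to decrypt anything.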

5. Metadata and search

The system will generally store a large amount of data, and users need to be able to retrieve it efficiently and selectively. An encrypted search mechanism is therefore a necessary feature of the system.

It is important for the client to be able to associate metadata with the data so that the data can be searched. At the same time, because the privacy of both data and metadata must be guaranteed, the metadata must be stored encrypted. In addition, service providers must be able to perform those searches in an opaque, privacy-preserving manner, without being able to see the metadata.
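One common way to let a server match queries without seeing metadata is a blinded index: the client replaces each metadata attribute with a keyed hash (HMAC) before upload, and searches by sending the same keyed hash. The sketch below is an illustration under that assumption, not the mechanism specified by the paper; the names `blind` and `OpaqueIndex` are hypothetical. Such an index supports exact-match lookups only, and the server can still observe which tags repeat across documents and queries.

```python
import hashlib
import hmac


def blind(index_key, attribute, value):
    """Client side: turn one metadata attribute into an opaque search tag."""
    # A zero byte separates name and value so ("a", "bc") and ("ab", "c") differ.
    message = attribute.encode("utf-8") + b"\x00" + value.encode("utf-8")
    return hmac.new(index_key, message, hashlib.sha256).hexdigest()


class OpaqueIndex:
    """Server side: stores only tags and document ids, never plaintext metadata."""

    def __init__(self):
        self._tags = {}

    def add(self, doc_id, tags):
        for tag in tags:
            self._tags.setdefault(tag, set()).add(doc_id)

    def find(self, tag):
        return set(self._tags.get(tag, set()))


if __name__ == "__main__":
    index_key = b"\x01" * 32  # placeholder; a real client would use a random secret key
    server = OpaqueIndex()
    server.add("doc-1", [blind(index_key, "type", "passport")])
    server.add("doc-2", [blind(index_key, "type", "invoice")])
    # The query is blinded with the same key, so the server compares tags only.
    print(server.find(blind(index_key, "type", "passport")))
```

Only holders of `index_key` can produce a tag that matches, which keeps both the stored metadata and the search terms hidden from the service provider.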

6. Communication protocol

Since the system is expected to run in a variety of operating environments, at least one communication protocol must be mandatory. It is equally important that the design allows the system to use other protocols, such as HTTP, gRPC, Bluetooth, and other wire protocols.

Third, the design goals

This section describes some of the guiding principles and design goals for building an encrypted data vault.

1. Layered and modular architecture

A layered architecture keeps the system's basic functionality easy to implement while allowing more complex functionality to be layered on top of the lower layers.

For example, Layer 1 might contain the mandatory features of the most basic system; Layer 2 might contain features that are useful for most deployments; Layer 3 might contain advanced features needed by a small subset of the ecosystem; and Layer 4 might contain extremely complex features that only a very small subset of the ecosystem needs.

2. Prioritize privacy issues

An encrypted data vault must, first and foremost, protect the privacy of the entity's data. When exploring new features, always consider their impact on privacy. A new feature that negatively affects privacy must undergo rigorous review to determine whether it is worth implementing.

3. Push implementation complexity to the client

The server should focus on storing and retrieving encrypted data. The more the server knows about the data, the greater the privacy risk to the entity storing it, and the more liability the service provider bears for the data it hosts. Pushing complexity to the client lets the service provider offer a stable server-side implementation while leaving room for innovation on the client.


To be continued. The next issue will cover the final part of "Encrypted Data Vaults", exploring the architecture of the encrypted data vault and some security and privacy considerations, so stay tuned! If you are interested in data privacy protection, please join the Ontology technology community and discuss with us.