PDK: the PIMS Development Kit

PDK components

With the PIMCity PIMS Development Kit (PDK), we offer basic and generic components that support the fundamental functionalities for Personal Information Management Systems (PIMS). These modules are released as a Software Development Kit (SDK), aiming to streamline the development and integration of PIMS. Our goal is to commoditize the complexity of creating PIMS to lower the barriers for companies to enter the web data market.

First Release of the PDK software

The source code of all modules is available online on GitLab

Tools to improve users’ privacy

These tools include functionalities that allow users to make informed decisions about which information to share and with whom.

The source code of this module is available online on GitLab

The Personal Data Safe (P-DS) is the means to store personal data in a controlled form. It implements a secure repository for the user's personal information, such as navigation history, contacts, and preferences.

The Personal Privacy Metrics (P-PM) represent the means to increase the user’s awareness. This component collects, computes and shares easy-to-understand metrics that let users know how a service stores and manages their data.

The Personal Consent Manager (P-CM) is the means to define all the user's preferences when dealing with personal data. It defines which data a service is allowed to collect and process, and which data can be shared with third parties.
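As an illustration only, the kind of preference the P-CM manages can be sketched as a default-deny lookup. This is not the P-CM API: the class `ConsentPolicy`, its methods, and the service and data-type names below are hypothetical, chosen to mirror the collect/process/share distinction described above.

```python
from dataclasses import dataclass, field

# Hypothetical action names, taken from the description above.
ACTIONS = ("collect", "process", "share")


@dataclass
class ConsentPolicy:
    """One user's consent preferences (illustrative, not the P-CM data model)."""

    # Maps (service, data_type) to the set of actions the user has allowed.
    grants: dict = field(default_factory=dict)

    def allow(self, service: str, data_type: str, action: str) -> None:
        if action not in ACTIONS:
            raise ValueError(f"unknown action: {action}")
        self.grants.setdefault((service, data_type), set()).add(action)

    def revoke(self, service: str, data_type: str, action: str) -> None:
        self.grants.get((service, data_type), set()).discard(action)

    def is_allowed(self, service: str, data_type: str, action: str) -> bool:
        # Default deny: anything not explicitly granted is refused.
        return action in self.grants.get((service, data_type), set())


policy = ConsentPolicy()
policy.allow("news-site", "navigation_history", "collect")
policy.allow("news-site", "navigation_history", "process")

print(policy.is_allowed("news-site", "navigation_history", "process"))
print(policy.is_allowed("news-site", "navigation_history", "share"))
```

The default-deny choice means a service that was never mentioned by the user gets no access, which matches the idea that the user decides what is shared and with whom.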

The Personal Privacy Preserving Analytics (P-PPA) module allows extracting useful information from data while preserving users’ privacy. It leverages concepts like k-Anonymity and Differential Privacy.
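To make the two concepts concrete, here is a minimal sketch that is independent of the P-PPA code base. It measures the k-anonymity level of a small table and answers a counting query with Laplace noise, the textbook mechanism for differential privacy. The function names, the field names, and the sample records are all assumptions made for this example.

```python
import random
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share
    the same values on all quasi-identifier fields."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values()) if groups else 0


def dp_count(records, predicate, epsilon, rng=random):
    """Counting query with Laplace noise.

    A count changes by at most 1 when one record is added or removed,
    so the noise scale is 1 / epsilon.
    """
    true_count = sum(1 for r in records if predicate(r))
    # The difference of two Exp(rate=epsilon) draws is Laplace(0, 1/epsilon).
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


# Hypothetical records; "age_band" and "zip3" act as quasi-identifiers.
records = [
    {"age_band": "30-39", "zip3": "101", "interest": "sports"},
    {"age_band": "30-39", "zip3": "101", "interest": "travel"},
    {"age_band": "40-49", "zip3": "101", "interest": "sports"},
    {"age_band": "40-49", "zip3": "101", "interest": "news"},
    {"age_band": "40-49", "zip3": "101", "interest": "sports"},
]

k = k_anonymity(records, ["age_band", "zip3"])
noisy = dp_count(records, lambda r: r["interest"] == "sports", epsilon=1.0)
print(f"k = {k}, noisy count of sports fans = {noisy:.2f}")
```

A smaller `epsilon` adds more noise and gives stronger privacy; a larger `k` means each individual hides in a bigger group.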

Tools for a new data economy

Fundamental for PIMS is the creation of a transparent, open and easily accessible data market. We identify two fundamental components and functionalities for this.

The source code of this module is available online on GitLab

DVTMP

The Data Valuation Tools from the market perspective (DVTMP) module leverages some of the most popular existing online advertising platforms to estimate the value of the audience.

DVTUP

The Data Valuation Tools from the user perspective (DVTUP) module provides estimated valuations of end-users' data for the bulk dataset they are selling through the marketplace.

Tools for novel data management

Data needs to be exported, imported and exchanged using standard mechanisms, with proper metadata that lets the system know the data source and data value, and that facilitates data aggregation from heterogeneous sources.

The source code of this module is available online on GitLab

The Data Knowledge Extraction (DKE) component offers the means to extract knowledge from raw data by implementing machine learning and big data solutions. One of the biggest challenges here is creating value out of raw data. When dealing with personal data, this must be coupled with privacy-preserving approaches, so that only the necessary data are disclosed and the data owner keeps control over them. The DKE consists of machine learning approaches that aggregate data, build abstract models to predict future data (e.g., predicting a user’s interests in recommendation systems), and fuse data coming from different sources to derive generic suggestions (e.g., supporting users' decisions with suggestions based on decisions taken by users with similar interests).
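The idea of suggestions based on users with similar interests can be sketched with a simple neighbourhood approach. This is not the DKE's algorithm, which is not described here: the example swaps in Jaccard similarity over interest sets, and the function names and sample data are hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def suggest(target: set, others: dict, top_n: int = 3) -> list:
    """Rank items the target user lacks, weighted by how similar
    each other user is to the target."""
    scores = {}
    for interests in others.values():
        sim = jaccard(target, interests)
        for item in interests - target:
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, score in ranked[:top_n] if score > 0]


# Hypothetical interest profiles.
target = {"cycling", "travel"}
others = {
    "u1": {"cycling", "travel", "hiking"},
    "u2": {"cooking", "wine"},
    "u3": {"travel", "photography"},
}

print(suggest(target, others))
```

Only the interest sets are used, not user identities, which is in the spirit of disclosing only the data that is necessary.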

The purpose of the Data Portability and Control (DPC) tool is to allow individual users to migrate their data to new platforms in a privacy-preserving fashion. More specifically, it provides methods for extracting data from one PIMS (e.g., bank data through the TrueLayer API), processing it by filtering out sensitive information or user-inputted data (e.g., removing login credentials or debit card numbers), and exporting it to another PDK module, a new PIMS (e.g., EasyPIMS), or a file in a common data interchange format such as JSON.
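The filter-then-export step can be illustrated with a short sketch. It does not use the DPC or TrueLayer APIs; the record layout, the `SENSITIVE_FIELDS` list, and the function names are assumptions made for this example.

```python
import json

# Hypothetical field names; a real deployment would define its own list.
SENSITIVE_FIELDS = frozenset({"password", "login", "card_number"})


def strip_sensitive(value, sensitive=SENSITIVE_FIELDS):
    """Recursively drop sensitive keys from nested dicts and lists."""
    if isinstance(value, dict):
        return {
            key: strip_sensitive(item, sensitive)
            for key, item in value.items()
            if key not in sensitive
        }
    if isinstance(value, list):
        return [strip_sensitive(item, sensitive) for item in value]
    return value


def export_json(record) -> str:
    """Filter a record and serialize it in a common interchange format."""
    return json.dumps(strip_sensitive(record), indent=2, sort_keys=True)


# Hypothetical record, as it might look after extraction from a source platform.
raw = {
    "account": {"login": "alice", "password": "hunter2", "currency": "EUR"},
    "cards": [{"card_number": "0000 0000 0000 0000", "type": "debit"}],
    "transactions": [{"amount": -12.5, "merchant": "Bakery"}],
}

exported = export_json(raw)
print(exported)
```

Filtering by key name is the simplest possible policy; it will miss sensitive values stored under unexpected keys, so a production filter would likely need value-based checks as well.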

The Data Provenance module OpenAPI allows developers to insert watermarks of ownership in the datasets they share in the marketplace. In general, this component is used internally by the PDK and by developers who need to control data ownership even after a dataset has left the platform. This is done by embedding difficult-to-remove watermarks into the datasets.

The Data Aggregation (DA) tool enables data owners who hold a bulk of their users’ data to aggregate and anonymize it. This allows the data to be shared in a privacy-preserving way.
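A minimal sketch of aggregate-then-anonymize, assuming a simple policy that is not taken from the DA tool itself: per-user rows are collapsed into group statistics, user identifiers are dropped, and groups smaller than a threshold are suppressed. The function name, the `min_group_size` parameter, and the sample rows are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def aggregate(records, group_key, value_key, min_group_size=3):
    """Collapse per-user rows into per-group statistics.

    Only the group key and the value are read, so identifiers such as
    "user_id" never reach the output. Groups with fewer than
    min_group_size rows are suppressed to avoid exposing individuals.
    """
    groups = defaultdict(list)
    for row in records:
        groups[row[group_key]].append(row[value_key])
    return {
        group: {"count": len(values), "mean": mean(values)}
        for group, values in groups.items()
        if len(values) >= min_group_size
    }


# Hypothetical per-user data held by a data owner.
rows = [
    {"user_id": 1, "city": "Turin", "daily_minutes": 10},
    {"user_id": 2, "city": "Turin", "daily_minutes": 20},
    {"user_id": 3, "city": "Turin", "daily_minutes": 30},
    {"user_id": 4, "city": "Madrid", "daily_minutes": 50},
]

summary = aggregate(rows, "city", "daily_minutes")
print(summary)
```

In this example the single Madrid row is suppressed because publishing a group of one would reveal that user's value directly.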