FAIR Data and Distributed Analytics

FAIR stands for Findable, Accessible, Interoperable and Reusable.

We envision crossing the borders between data silos and analyzing data across them by making data and services FAIR.

Our research focuses on methods for machine-actionable data and services to foster data-driven science and innovation.

Data FAIRification and Management

We guide practitioners in developing a value-oriented FAIR data management policy that enables organizations to manage their data throughout its lifecycle and supports their data-driven business models.

Distributed Analytics Platforms

The FIT Data Analytics Train platform makes it possible to gain the full benefit of distributed data without sharing any of it. Analytics algorithms visit the decentralized data centres and return (or travel on) with trained models of what they have learned from the data.
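As an illustration of this idea, the following minimal Python sketch lets a small linear model travel between two sites and train on each site's local data in turn. It is not the platform's implementation: the site names, the toy data, and the `local_sgd` helper are assumptions made up for this example. Only the model parameters move between sites; the records stay where they are.

```python
# Hypothetical sites with toy data following y = 2x + 1.
# In a real deployment each list would live only at its own site.
SITES = {
    "site_a": [(0.0, 1.0), (1.0, 3.0)],
    "site_b": [(2.0, 5.0), (3.0, 7.0)],
}


def local_sgd(model, data, lr=0.05, epochs=200):
    """Train a (weight, bias) pair on one site's data and return the update."""
    w, b = model
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y  # prediction error on one local record
            w -= lr * err * x      # gradient step for the weight
            b -= lr * err          # gradient step for the bias
    return (w, b)


def run_train(sites, rounds=3):
    """Send the model from site to site; only (w, b) travels."""
    model = (0.0, 0.0)
    for _ in range(rounds):
        for data in sites.values():
            model = local_sgd(model, data)
    return model


model = run_train(SITES)
print("learned weight and bias:", model)
```

With this toy data the travelling model ends up close to a weight of 2 and a bias of 1, although no site ever saw the other site's records.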

PID Systems

Persistent Identifiers (PIDs) are used for managing and sharing digital resources in complex, data-intensive production and research. PID systems identify digital objects (such as data and software) in a globally unique way and make them findable for both human and machine users.
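The core mechanism can be sketched as a registry that maps a stable identifier to a record describing where the object currently lives. The Python sketch below is a toy in-memory model, not the API of any real PID system: the `PidRegistry` class, the `example-prefix` prefix, and the example.org URLs are all hypothetical.

```python
import uuid
from dataclasses import dataclass


@dataclass
class PidRecord:
    location: str  # where the digital object can currently be retrieved
    kind: str      # e.g. "dataset" or "software"


class PidRegistry:
    """Toy resolver: the identifier stays fixed while the record may change."""

    def __init__(self, prefix: str):
        self.prefix = prefix
        self._records: dict[str, PidRecord] = {}

    def mint(self, location: str, kind: str) -> str:
        # A random UUID stands in for a globally unique suffix.
        pid = f"{self.prefix}/{uuid.uuid4()}"
        self._records[pid] = PidRecord(location, kind)
        return pid

    def resolve(self, pid: str) -> PidRecord:
        return self._records[pid]

    def relocate(self, pid: str, new_location: str) -> None:
        # The object moved; anyone holding the PID still finds it.
        self._records[pid].location = new_location


registry = PidRegistry("example-prefix")
pid = registry.mint("https://example.org/data/v1.csv", "dataset")
registry.relocate(pid, "https://example.org/archive/v1.csv")
print(pid, "->", registry.resolve(pid).location)
```

The point of the sketch is the indirection: references cite the identifier, so the object can move without breaking them.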

FAIR Capability Maturity Models and Assessment

Making your data FAIR is a journey: each organization decides on the best path for itself. A Capability Maturity Model helps organizations identify the processes that are critical to their goals and guides them in improving those processes to achieve FAIR data.

Our services

PADME

The Personal Health Train (PHT) is a novel approach that aims to establish a distributed data analytics infrastructure enabling the (re)use of distributed healthcare data while data owners stay in control of their data. The main principle of the PHT is that data remains in its original location: analytical tasks visit the data sources and are executed there. The PHT thus provides a distributed, flexible approach to using data in a network of participants, incorporating the FAIR principles. PADME is a PHT implementation developed by Fraunhofer in collaboration with RWTH Aachen University, Cologne University Hospital and Leipzig University.

Distributed Analytics (DA) was introduced to overcome the challenges of accessing and analyzing privacy-sensitive data. The main principle of DA is that the analysis task is brought to the data, instead of the data being brought to a centralized location where the analysis algorithms run.
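The two properties described here, that the task travels to the data and that data owners stay in control, can be illustrated with a short Python sketch. It is a simplified model and not PADME's actual interface: the `Train` and `Station` classes, the task name `mean_age`, and the station names and values are assumptions invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Train:
    """Carries the task and its aggregate results, never raw records."""
    task: str
    count: int = 0
    total: float = 0.0
    visited: list = field(default_factory=list)
    refused: list = field(default_factory=list)


@dataclass
class Station:
    """A data holder that decides which tasks may run on its data."""
    name: str
    records: list
    allowed_tasks: set

    def execute(self, train: Train) -> None:
        if train.task not in self.allowed_tasks:
            train.refused.append(self.name)  # the owner declines this task
            return
        # The task runs locally; only aggregates are written to the train.
        train.count += len(self.records)
        train.total += sum(self.records)
        train.visited.append(self.name)


stations = [
    Station("station_a", [5.0, 7.0], {"mean_age"}),
    Station("station_b", [6.0], {"mean_age"}),
    Station("station_c", [100.0], set()),  # permits no tasks at all
]

train = Train(task="mean_age")
for station in stations:
    station.execute(train)

mean = train.total / train.count
print("mean:", mean, "visited:", train.visited, "refused:", train.refused)
```

In this sketch the third station never contributes to the result because its owner has not approved the task, while the other two contribute only a count and a sum.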

Our study is part of the German MII and GO FAIR initiatives.

Publications