Eight (8) projects were awarded funding as a result of our Digital Life Sciences Open Call.

The project work kicked off in spring/summer 2021. The achievements at each project milestone, small and large, are described in the individual project descriptions below.

Funded project 1

Integrating EU-RI datasets for preclinical and discovery research bioimaging

 

Project team: Elisabetta Spinazzola and Sara Zullino, University of Torino

Jean-Marie Burel, Marco Comerci, Philipp Gormanns, Dario Longo, Mario Magliulo, Rafaele Matteoni, Jason Swedlow, Andrea Zaliani

Brief project summary:

European Research Infrastructures (RIs) generate an increasing amount of data across different life science domains, such as animal model strains (INFRAFRONTIER), chemical entities (EU-OPENSCREEN), and biomedical imaging (Euro-BioImaging). These three RIs are joining forces to combine their resources by building novel, open-source tools that integrate the information associated with the data held in their infrastructures. Ultimately, this will give users free, open access to collections of datasets distributed over multiple sources when they search by specific keywords.

RIs involved:

Euro-BioImaging

INFRAFRONTIER

EU-OPENSCREEN

Funded project 2

A workflow for marine Genomic Observatories data analysis

 

Project team: Christina Pavloudi, Hellenic Centre for Marine Research

Haris Zafeiropoulos, Stelios Ninidakis, Antonis Potirakis, Evangelos Pafilis, Cymon J. Cox, Gianluca De Moro, Robert Finn, Ekaterina Sakharova, Martin Beracochea, Ibon Cancio, Maria Luisa Chiusano, Erwan Corre, Katrina Exter, Nicolas Pade

Brief project summary:

Shotgun marine metagenomic datasets are produced by EMBRC's Genomic Observatories (GOs) in order to decipher the dynamics of marine ecosystems.

This project, supported by EOSC-Life, developed metaGOflow, a workflow that allows researchers to analyze this increasing amount of data more effectively and rapidly.

This initiative makes the data produced by the GOs more easily interpretable by providing the taxonomic inventories of each sample in a timely manner and in a non-technical format.

RIs involved:

EMBRC

ELIXIR

Sustainable outcomes:

 

 

Overview of metaGOflow workflow steps created by Dr. Haris Zafeiropoulos

 

Funded project 3

PDB-REDO-cloud: A flexible and scalable engine for computational structural biology

 

Project team: Anastassis Perrakis, Netherlands Cancer Institute

Maarten Hekkelman, Robbie Joosten, Hans Wienk

Brief project summary:

PDB-REDO provides a computational platform to optimise experimentally obtained biomolecular structures that allow a better understanding of the basic chemistry of life. The aim of this project was to provide cloud access to the PDB-REDO engine as a flexible, sustainable and scalable platform for structural biologists.

The new PDB-REDO-cloud allows non-expert users to run PDB-REDO directly, without needing to install a complex software package, while still having full control over their calculations. This enhanced flexibility was not previously available through the PDB-REDO webserver. Please see this publication for all information.

Expert users can design their high-throughput computational experiments with access to all PDB-REDO features. In 2022, some 1500 experiments were run.

PDB-REDO-cloud is available through an API, with example scripts provided in Python, Perl and JavaScript: https://pdb-redo.eu/api-doc. Data transactions are digitally signed and encrypted to secure users’ valuable research. Additionally, PDB-REDO-cloud allows Instruct-ERIC facilities and other organisations to integrate it into custom workflows. Such implementations are already available in the crystallographic workflow environments CCP4i2 and CCP4-cloud (https://www.ccp4.ac.uk/).

RIs involved: 

Instruct-ERIC

Sustainable outcomes:

 

Funded project 4

Expression Atlas’ RNA-Seq and Microarray analysis pipelines migration to workflow environments for cloud deployment and reproducibility

 

Project team: Irene Papatheodorou, EMBL-EBI

Pablo Moreno, Andrey Solovyev, Pedro Madrigal, Jonathan Manning

Brief project summary:

A vast amount of gene expression data is produced and processed daily in diverse areas of the life sciences. This pilot project was carried out to modernise the current Expression Atlas gene expression data analysis pipelines, ensuring they are both portable and cloud-deployable. This initiative allowed the Expression Atlas pipelines to shift from being strongly dependent on the EBI cluster and shared file system to a modern, community-maintained, explicit workflow environment that can run outside of the EBI infrastructure. More importantly, the migration of these analysis pipelines facilitates the re-use and re-analysis of RNA-Seq and microarray data by third parties.

RIs involved: 

ELIXIR

Sustainable outcomes:

 

Funded project 5

Increasing the FAIRness of Phytolith Data

 

Project team: Emma Karoune, Historic England

Carla Lancelotti, Juan José García-Granero, Marco Madella, Javier Ruiz-Pérez

Brief project summary: 

Our project aimed to increase knowledge and use of the FAIR data principles in phytolith research in order to improve the communication of methods, data sharing and archiving practices within the discipline. Phytoliths are silica bodies that are deposited in or between plant cells during the life cycle of the plant. They are used in different scientific fields, such as archaeology, palaeoecology and plant sciences, to address questions of past plant exploitation and long-term environmental and biodiversity changes.

In this project, we conducted a community survey to find out about current data sharing and opinions on the use of open research practices (Ruiz-Pérez et al., under review; dataset available here). Furthermore, we looked at published phytolith research and assessed the data and metadata within it against the FAIR data principles (Kerfant et al., in revision; preprint).

Using these two new datasets, we have drawn together FAIR recommendations for the phytolith community that will then be reviewed and adapted by the community itself, to produce FAIR phytolith guidelines (Version 1.0 on Zenodo). We are currently running a series of training workshops to upskill our community in open research skills, including standardised vocabularies and FAIR data (material available on Zenodo and videos available on YouTube). Future plans include the creation of a phytolith ontology to aid interoperability of phytolith data and an online open repository for phytolith data.
For more information, please see our website: https://open-phytoliths.github.io/FAIR-phytoliths/

 

RIs involved: 

ELIXIR

EMPHASIS

Sustainable outcomes:

The following resources can also be helpful:

Funded project 6

Reference Data Resource

 

Project team: Ignacio Eguinoa & Frederik Coppens, Flemish Institute for Life Sciences

Björn Grüning

Brief project summary: 

The project objective was to develop a centralised, community-supported repository that manages genomic reference data using the reference genome resource manager Refgenie. This initiative improved the accessibility and re-use of datasets, linked the assets associated with a genome build under a namespace identifying that build, and ensured that each reference dataset carries associated metadata with provenance information.
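
To illustrate the idea of grouping assets under a genome-build namespace, here is a minimal sketch in Python. It is not Refgenie's actual API or data model: the `Asset` and `GenomeNamespace` classes, their fields, and the example paths and provenance values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """One reference asset (hypothetical model), e.g. a FASTA file or an index."""
    name: str
    path: str          # illustrative location, not a real repository path
    provenance: dict   # e.g. which tool built the asset and from which parent


@dataclass
class GenomeNamespace:
    """All assets belonging to one genome build, keyed by asset name."""
    genome_build: str
    assets: dict = field(default_factory=dict)

    def add(self, asset):
        """Register an asset under this genome build."""
        self.assets[asset.name] = asset

    def get(self, name):
        """Look up an asset by name; raises KeyError if it is not registered."""
        return self.assets[name]


# Hypothetical example: two assets linked under the same genome build.
hg38 = GenomeNamespace("hg38")
hg38.add(Asset("fasta", "hg38/fasta/hg38.fa", {"source": "example-provider"}))
hg38.add(Asset("bowtie2_index", "hg38/bowtie2_index/",
               {"built_with": "bowtie2-build", "derived_from": "fasta"}))
```

In this sketch, every asset is reached through its genome build, and its provenance travels with it, so a user can see that the index was derived from the FASTA asset of the same build.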

RIs involved: 

ELIXIR

EMPHASIS

Sustainable outcomes:

Funded project 7

Open Source Secure Data Infrastructure and Processes for Life Sciences (OSSDIP4LIFE)

 

Project team: Andreas Rauber, TU Wien

Martin Weise, Martin Krajiczek, Dietmar Winkler, Tomasz Miksa, Niki Popper

Brief project summary: 

Secure sharing of sensitive data from the data owner to research experts is extremely challenging, both for privacy reasons and because of the massive risk involved in sharing commercially sensitive data. To address this issue, TU Wien developed a fully open-source, high-security data visiting platform called OSSDIP and, through this EOSC-Life-funded project, is working to adapt it to the needs of selected data owners in the life sciences, extending its functionality to support more flexible data analytics. This platform will allow data owners to provide:

  • highly selective access (data visiting)
  • to specific (fine-granular or aggregated) subsets of data
  • for identified individuals
  • for limited periods of time
  • to answer precisely defined questions accepted by the data owner
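
The access conditions listed above can be pictured as a single grant record that is checked on every request. The following is a minimal sketch in Python, not OSSDIP's actual implementation: the `AccessGrant` class, its fields, and the `permits` method are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AccessGrant:
    """Hypothetical record of one data-visiting permission (illustrative only)."""
    researcher_id: str          # the identified individual
    dataset_subset: frozenset   # the specific fields the data owner released
    valid_from: datetime        # start of the limited access period
    valid_until: datetime       # end of the limited access period
    approved_question: str      # the question accepted by the data owner

    def permits(self, researcher_id, requested_fields, when):
        """Return True only if the person, the requested subset and the time all match."""
        return (
            researcher_id == self.researcher_id
            and set(requested_fields) <= self.dataset_subset
            and self.valid_from <= when <= self.valid_until
        )


# Hypothetical grant: one researcher, two fields, three months.
grant = AccessGrant(
    researcher_id="researcher-42",
    dataset_subset=frozenset({"age_group", "treatment_arm"}),
    valid_from=datetime(2022, 1, 1),
    valid_until=datetime(2022, 3, 31),
    approved_question="Does treatment arm correlate with age group?",
)
```

A request is allowed only when all conditions hold at once; asking for a field outside the released subset, or asking after the period has ended, is refused.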

RIs involved: 

ECRIN

Sustainable outcomes:

 

Funded project 8

Towards FAIR data for X-ray-based structure-guided drug design

 

Project team: Jose A. Marquez, Instruct, EMBL Grenoble (France)

Sameer Velankar

Brief project summary: 

New applications in the field of X-ray-based crystallography, such as X-ray-based fragment screening, have led to a remarkable increase in data production and a greater dependence of the user community on advanced facilities like those within Instruct-ERIC. However, these facilities lack the tools to properly support their users with FAIR data handling. To address this issue, this project will develop new tools for the automated harvesting, validation, and deposition of such macromolecular crystallography data and metadata, which should considerably promote the re-use and interoperability of these data.

RIs involved: 

Instruct-ERIC

EU-OPENSCREEN

First Digital Life Sciences Open Call - Information for Grantees