Improving biodiversity research e-Infrastructures by collaboration

Research questions within biodiversity research and nature conservation often focus on geographical areas and ecosystems that extend across national borders. Biodiversity and ecosystem e-Infrastructures, however, are developed individually in each country, which makes access to regional data difficult. The Nordic-Baltic Collaboration on e-Infrastructures for Biodiversity Informatics (DeepDive) research project seeks to enable access to data across countries.

The DeepDive project, conducted under the Nordic e-Infrastructure Collaboration (NeIC), sets out to reach this goal by facilitating and intensifying collaboration in the Nordic-Baltic region. The project started in 2017 with partners from Norway, Denmark, Estonia, Finland, Iceland and Sweden.

The purpose of DeepDive research e-Infrastructures is to host data and make it available. In the context of biodiversity, data can take many forms: information about species, characteristic biological features, overviews of animals and plants, or habitat information that illustrates how species live together. Often biodiversity data are genetic data associated with the species. DeepDive’s project manager, Matthias Obst, points to the importance of taxonomy, the practice of naming and classifying species into groups according to their similarities. Even in scientific descriptions of species, which are written in Latin, a species often has several names.

“The names and the connections between the identities and names of the species are constantly changing, which makes collecting and associating data with a species a highly dynamic process,” Obst clarifies.

The role of the research e-Infrastructures is to keep track of this process and manage the change of the names and the identities of the species.
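As a rough illustration of what keeping track of changing names can involve, the following Python sketch resolves older names to the currently accepted one before grouping records. The `SYNONYMS` table, the record layout and the function names are assumptions made for this example; the article does not describe how any DeepDive partner actually stores this information.

```python
from collections import defaultdict

# Hypothetical synonym table: each older name points to the name that replaced it.
# The blue tit was formerly placed in the genus Parus.
SYNONYMS = {"Parus caeruleus": "Cyanistes caeruleus"}


def accepted_name(name):
    """Follow the synonym chain until a name with no replacement is reached."""
    seen = set()
    while name in SYNONYMS and name not in seen:
        seen.add(name)  # guard against accidental cycles in the table
        name = SYNONYMS[name]
    return name


def group_by_species(records):
    """Group records under the accepted name, whatever name they were filed under."""
    grouped = defaultdict(list)
    for record in records:
        grouped[accepted_name(record["name"])].append(record)
    return dict(grouped)


# Illustrative records only, not real observations.
records = [
    {"name": "Parus caeruleus", "country": "SE"},
    {"name": "Cyanistes caeruleus", "country": "EE"},
]
grouped = group_by_species(records)
```

Both records end up under the same species even though they were filed under different names, which is the kind of bookkeeping that becomes harder when each country maintains its own list.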

“That is a very complex issue and it is usually taken care of in each country, but not so much across countries. If I wanted to obtain all relevant information associated with a certain bird, for example which type of tree it lives in, what it eats, when it mates and how many eggs it usually lays, I could get a great deal of information in each individual country, but not for all countries where the species occurs. One might think that querying the national databases separately takes only a little extra time, but we’ve come to a point where we don't want to analyse single species anymore – we want to analyse thousands of them in ecoregions shared by Nordic countries, such as the Baltic Sea and the Baltic Shield,” Obst states.

Producing software and community-building 

The goal of the DeepDive project is to explore synergies in research e-Infrastructure development among the Nordic and Baltic countries, and establish common services based on best practice and technical interoperability to support biodiversity and ecosystem research. In other words, the aim of the project is to produce software that enables researchers to access data about different species and ecosystems also outside their country of residence.  

The project sets out to reach these goals by addressing three topics. The first and general aim of the project is to build up a community and unite the system developers and the users (scientists) of the research e-Infrastructures. This is done by looking at ways to improve interoperability among services and infrastructures that up to now have been emerging individually in each country, on a national scale. Unification is achieved by making it possible for the infrastructures to communicate with each other.  

“We are not physically moving them, we are just making the machines able to talk to each other. In other words, we are improving the interfaces and the standards of the information systems so that they can easily get the same type of information from different countries,” Obst explains.
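One way to picture this kind of interoperability is a thin mapping layer that translates each country's field names into a shared vocabulary, leaving the national systems where they are. The sketch below is only an assumption-laden example: the local field names (`art`, `datum`, `laji`, `pvm`) are invented, and the shared names are borrowed from the Darwin Core standard as an assumption, since the article does not say which standards DeepDive uses.

```python
# Hypothetical per-country mappings from local field names to shared ones.
FIELD_MAPS = {
    "SE": {"art": "scientificName", "datum": "eventDate"},
    "FI": {"laji": "scientificName", "pvm": "eventDate"},
}


def harmonise(record, country):
    """Translate one national record into the shared field names."""
    mapping = FIELD_MAPS[country]
    shared = {shared_key: record[local_key] for local_key, shared_key in mapping.items()}
    shared["countryCode"] = country  # keep track of where the record came from
    return shared


# Illustrative records only, not real observations.
combined = [
    harmonise({"art": "Cyanistes caeruleus", "datum": "2018-05-02"}, "SE"),
    harmonise({"laji": "Cyanistes caeruleus", "pvm": "2018-05-04"}, "FI"),
]
```

Once every record uses the same field names, a single analysis can run over data from several countries without knowing how each national system labels its columns.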

Working with big data

The second topic addresses the challenges of big data. New technology brings with it new kinds of data along with new ways of collecting them. Just 10 to 15 years ago, information about species was collected manually. One person or a small group of scientists collected the data and stored it on a local drive in spreadsheet applications, or perhaps merely on paper. Thanks to new technology this is changing, and the scientists of today don’t necessarily go out in the field themselves. Instead, they use robots, drones, remote sensing devices and other new systems to scan the environment and to recognise species and habitats. According to Obst, the new technologies both allow and force scientists to work with big data.

“But we are not trained to do that. Instead, we are trained to have our data on local hard drives and to work with our data in an Excel table format. Now we need to train scientists to move away from working with their own data and to include other people’s data, including data generated by robots and sensors. And that automatically means working with big data,” Obst says. 

Training and capacity-building — learning from each other 

As its third topic, the project focuses on training scientists.

“You cannot work on your desktop anymore. You have to be able to go into databases and identify the relevant criteria and construct the queries that will produce the relevant data for your research question,” Obst states.   
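To give a sense of what constructing such a query can look like, here is a minimal Python sketch using the standard library's `sqlite3` module and an in-memory database. The `occurrence` table, its columns and the rows in it are invented for the example and do not reflect any real national database.

```python
import sqlite3

# In-memory database with a hypothetical table of species observations.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE occurrence (species TEXT, country TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO occurrence VALUES (?, ?, ?)",
    [
        ("Cyanistes caeruleus", "SE", 2016),
        ("Cyanistes caeruleus", "EE", 2017),
        ("Parus major", "FI", 2017),
    ],
)


def count_by_country(conn, species, since_year):
    """Count observations of one species per country from a given year onwards."""
    cursor = conn.execute(
        "SELECT country, COUNT(*) FROM occurrence "
        "WHERE species = ? AND year >= ? "
        "GROUP BY country ORDER BY country",
        (species, since_year),  # criteria are passed as parameters, not pasted into the SQL
    )
    return cursor.fetchall()


result = count_by_country(conn, "Cyanistes caeruleus", 2016)
```

The criteria (which species, from which year) are kept separate from the query text, so the same function can be reused for other research questions.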

The training is not only offered to scientists, but to infrastructure providers as well. The research e-Infrastructures exist to serve the scientists. Hence, according to Obst, it is crucial to make sure that the research e-Infrastructures are useful for the scientists. This is achieved by learning from each other. 

“In order to make sure our infrastructures are useful for the scientists, we not only have to acquire technical competence – we also need to be able to understand the scientific problems that our infrastructure can help address. Therefore, we try to ‘breed hybrid people’ in DeepDive’s educational approach, i.e. people with both technical and scientific insight. So, we organise workshops where people with either science or technical backgrounds come together and learn from each other.”

Results and future goals

During the process, the goals of the project have been narrowed down and become more targeted, but the main themes remain. In its first year, DeepDive analysed the research e-Infrastructure landscape within biodiversity and identified its key partners. It has established a community whose members are aware of each other’s expertise. In addition, it has identified some “sweet spots”, areas where synergy can be produced.

During the next two years the DeepDive project will continue to unite users as well as solidify and consolidate the community.  

After the second year, Obst expects to see both technical and scientific results. In addition, the project will set up long-term strategic goals. One of these is to share its expertise outside the group.

“When we have a success trail, in one or two years, we can go out and offer our network to a wider community of biodiversity informatics practitioners in the north,” Obst concludes.


Biodiversity informatics is a methodological discipline that helps biodiversity research overcome issues across the whole data value chain, from data capture to analyses and data products. These issues concern vocabularies, ontologies, digitisation of collections, data sharing, data integration, data reliability (fitness for use), data quality, visualisation, analysis and long-term archival.

Text: Iiris Tarvonen

Photo of Matthias Obst: the interviewee's own