The smartest storage for the world’s largest data volumes

09.03.2018
CERN’s Large Hadron Collider is the world’s largest and most complex experimental facility, generating the world’s largest data volumes. This calls for extremely smart data handling and storage solutions, and one of the smartest solutions is distributed between four Nordic countries.

The Large Hadron Collider (LHC) outside Geneva, on the border between Switzerland and France, is the latest addition to the accelerator complex at the European Organization for Nuclear Research (CERN). The LHC allows physicists to test different theories of particle physics, mainly by accelerating beams of protons or lead nuclei up to almost the speed of light. When these particles are flying fast and carrying very high energies, researchers smash them together – in order to investigate the new and strange particles that sometimes arise.

An unprecedented amount of data

The LHC is an extremely advanced and complicated research facility, and the multitude of sensors and instruments along the accelerator produces enormous amounts of data. Current production amounts to hundreds of petabytes per year, and it is expected to increase by a factor of 10 to 100 in the near future. That is why CERN needs the Worldwide LHC Computing Grid (WLCG), which is exactly what the name implies: a global collaboration of more than 170 computing centres in 42 countries, linking together national and international grid infrastructures.

“The structure of distributed computer centres was chosen because no institution alone would have been capable of receiving, storing and processing the total output of data from CERN,” explains High Performance Computing expert Mattias Wadenstein at NeIC. He is also the manager of the distributed Nordic Tier 1 site in the WLCG.

The WLCG is composed of four levels or “tiers”, of which Tier 0 is CERN’s own data centre, distributed between Switzerland and Hungary. All the data from the LHC passes through this central hub, but it provides less than 20 per cent of the grid’s total computing capacity.

A Nordic model for the future

Tier 1, where Mattias Wadenstein is working, consists of 13 computer centres. They provide round-the-clock support for the whole grid and are responsible for storing raw and reconstructed data, as well as for performing large-scale reprocessing and storing the corresponding output. The centres are also responsible for distributing data to Tier 2 institutions, typically universities and other scientific institutes that can store sufficient data and provide adequate computing power for specific analysis tasks. The majority of the grid’s resources are located in Europe, but there are resources in North America and East Asia as well, and centres in other parts of the world also contribute to this global effort.

One of the Tier 1 centres is located in the Nordic countries, and it functions like any other Tier 1 computer centre. But a closer inspection reveals that the Nordic Tier 1 facility, which is part of the globally distributed network, is itself a distributed network with storage facilities at the universities in Oslo and Bergen in Norway, Linköping and Umeå in Sweden, Espoo in Finland and Copenhagen in Denmark.

“The Nordic facility is in fact one of the smaller Tier 1 facilities, although it is still a massive undertaking. But the most interesting thing is perhaps that the distributed structure has worked so well that it is now being seen as a model for the future,” Mr Wadenstein explains.

According to a recent evaluation report by the Spanish expert Dr Josep Flix, the Nordic model provides a unique set of competencies and is a successful example which other data and computing centres around the world could learn from. The evaluation report states that the Nordic distributed model may even prove critical for the worldwide LHC network’s future capacity to handle large volumes of data from CERN.

Rising to the challenge

One of the requirements for being one of CERN’s Tier 1 computer centres is that it should appear to be one site. Therefore, the Nordic team built a solution that hides the distributed nature effectively. “That was a challenge, and was only possible because we managed to recruit a good technical team that can maintain and evolve the setup,” Mr Wadenstein says.
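To illustrate the idea of a single-site view over distributed storage, here is a minimal, purely hypothetical sketch in Python: clients address files by a logical path, and a lookup function maps that path to a physical replica at one of the participating sites. The site names, endpoints and hashing scheme are invented for the example and do not describe the actual Tier 1 software.

```python
# Illustrative sketch only: a toy "single logical namespace" over storage
# hosted at several sites. All names and endpoints below are invented.
import hashlib
from dataclasses import dataclass


@dataclass
class StoragePool:
    site: str       # a site label, e.g. "Oslo" (hypothetical)
    endpoint: str   # URL of a site-local storage service (invented)


# Invented endpoints standing in for storage at the participating sites.
POOLS = [
    StoragePool("Oslo", "https://storage.oslo.example/data"),
    StoragePool("Linkoping", "https://storage.linkoping.example/data"),
    StoragePool("Espoo", "https://storage.espoo.example/data"),
    StoragePool("Copenhagen", "https://storage.copenhagen.example/data"),
]


def locate(logical_path: str) -> str:
    """Map a logical file name to one physical replica.

    Clients only ever see the logical path, so from the outside the
    distributed facility looks like a single site.
    """
    digest = int(hashlib.sha256(logical_path.encode()).hexdigest(), 16)
    pool = POOLS[digest % len(POOLS)]
    return f"{pool.endpoint}/{logical_path.lstrip('/')}"


if __name__ == "__main__":
    # Example: resolve an invented logical path to its (invented) physical URL.
    print(locate("/atlas/raw/run2/event-000001.root"))
```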

“One of the advantages of the Nordic structure is the national co-funding, which offsets some of the extra costs that come with distribution. In addition, the structure brings a lot of competence closer to more universities than would otherwise have been the case,” he explains. He adds that the Nordic Tier 1 is a meta-centre under the shared control of different national organisations.

“In my opinion, this is an excellent example of Nordic collaboration rising to the challenge of receiving and storing huge data flows. The key to success is a combination of good routines, a high-quality platform, and continuous development of the open source software that we are heavily dependent on,” Mr Wadenstein says.

Looking for a needle in the haystack

The LHC started up in September 2008 and achieved its greatest success so far in 2012 with the discovery of the Higgs boson, one of the most fundamental components of the fabric of our universe. Far from closing the books, this discovery opened up whole new areas of research into the stability of the universe, why it seems to hold so much more matter than antimatter, the composition and abundance of dark matter, and so on.

Mattias Wadenstein adds that the researchers at CERN and the LHC throw away more than 99.9 per cent of the data from the experiments, because they are only interested in the rare, special events among the collisions. They are looking for “the needle in the haystack”, so they have no use for data about the hay. But the remaining amount of data, which must be stored for future research, is still enormous.
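As a rough illustration of that kind of filtering, the toy Python sketch below keeps only events that pass a simple selection and discards everything else; with the invented threshold and energy distribution used here, well over 99.9 per cent of the simulated events are thrown away. The fields, numbers and selection rule are assumptions for the example, not the experiments’ actual trigger criteria.

```python
# Illustrative sketch only: keep a tiny fraction of "interesting" events and
# discard the rest. The threshold and distribution below are invented.
import random


def is_interesting(event: dict) -> bool:
    # Hypothetical selection rule: keep only events with unusually high energy.
    return event["energy_gev"] > 900.0


def simulate(n_events: int) -> None:
    kept = 0
    for _ in range(n_events):
        # Toy event with an exponentially distributed energy (mean 120 GeV).
        event = {"energy_gev": random.expovariate(1 / 120.0)}
        if is_interesting(event):
            kept += 1
    discarded = 100.0 * (n_events - kept) / n_events
    print(f"kept {kept} of {n_events} events; discarded {discarded:.3f} per cent")


if __name__ == "__main__":
    simulate(1_000_000)
```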

The LHC machine is taken down for upgrading from time to time. The latest upgrade, in 2015, was so successful that the experiments started to generate more than twice the amount of data that the facilities were prepared to accommodate. That was of course a test that the Nordic facility passed with flying colours.

According to Mr Wadenstein, one of the advantages of the distributed setup is that the system is very scalable. The future will bring new challenges: a planned LHC upgrade in 2022 could lead to a hundredfold increase in the data volumes to be stored and processed.

“I am confident that we shall be able to rise to this challenge”, says Mr Wadenstein.


Text: Bjarne Røsjø

Photo: Terje Heiestad

This article has previously been published in NordForsk Magazine 2017
