Fighting the virus with knowledge

The EU has set up a platform that enables researchers worldwide to exchange data about the coronavirus.

A researcher at the Helmholtz Centre for Infection Research dpa/pa

The symptoms of COVID-19 vary considerably. That makes it difficult for people to judge whether they are infected or not. Doctors cannot easily diagnose the disease either – they need to carry out tests to be sure. Researchers will have to understand the virus extremely well to be able to develop a vaccine. For that they need as much information as possible, and this data has to be up to date, easily accessible and protected well enough to prevent misuse of patient details.

This is why we want to help scientists to share data – across borders, disciplines and healthcare systems.

European Commission President Ursula von der Leyen

That is why the European Commission began the development of a COVID-19 Data Portal – with the goal of advancing scientific cooperation worldwide. “Scientists around the world have already produced a wealth of knowledge on coronavirus,” says Commission President Ursula von der Leyen. “But no researcher, lab or country could find the solution alone. This is why we want to help scientists to access data and share it with others – across borders, disciplines and healthcare systems.”

The database is being coordinated by the European Bioinformatics Institute (EMBL-EBI) in Hinxton, near Cambridge in the UK. It is part of the European Molecular Biology Laboratory (EMBL) based in Heidelberg. “We are mainly storing three groups of data: virus DNA sequence data, genetic data and clinical data,” says Rolf Apweiler. The German biologist is Joint Director of EMBL-EBI. The virus sequences are especially important for vaccine development, Apweiler explains: “The vaccine has to be designed so that as far as possible it does not react to changes in the virus – in other words, that it remains effective despite new developments.”

There are various theories about the mutation rate of the virus. Some researchers believe the virus mutates unusually fast, which means it can rapidly adapt to changed conditions. That would be a problem for the development of a vaccine. However, Apweiler doubts that: “This theory has emerged because some researchers accept the data as published. But data is collected using different sequencing technologies that often have different error rates.” If you take these errors into account, the virus is not mutating unusually fast, he explains. This shows how important good data is.
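Apweiler's point about error rates can be illustrated with a little arithmetic. The following is a minimal sketch, not the method used by EMBL-EBI: it assumes a simple per-base error rate for each sequencing technology and subtracts the number of errors expected over the whole genome from the number of differences observed. The function names, the genome length and all the rates and counts are hypothetical values chosen for illustration.

```python
# Illustrative only: every number below is an assumption, not data from the portal.
GENOME_LENGTH = 30_000  # assumed rough genome size in bases


def expected_errors(genome_length: int, per_base_error_rate: float) -> float:
    """Number of differences that sequencing errors alone would produce."""
    return genome_length * per_base_error_rate


def corrected_differences(observed: int, genome_length: int,
                          per_base_error_rate: float) -> float:
    """Observed differences minus the expected sequencing errors (never below zero)."""
    return max(0.0, observed - expected_errors(genome_length, per_base_error_rate))


if __name__ == "__main__":
    observed = 40  # hypothetical differences between a sample and a reference
    # Two hypothetical technologies with different assumed error rates.
    for label, rate in [("technology A", 0.0001), ("technology B", 0.001)]:
        remaining = corrected_differences(observed, GENOME_LENGTH, rate)
        print(f"{label}: about {remaining:.0f} differences left after correction")
```

In this toy example the same 40 observed differences shrink to very different numbers depending on the technology, which is why data accepted "as published" can make the virus appear to mutate faster than it does.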

Rolf Apweiler, Joint Director of the European Bioinformatics Institute EBI

Thousands of virus sequences have already been stored on the EU data platform. Tens of thousands more are now being processed and will soon go online. The database administrators expect hundreds of thousands over the coming months and years. “We need them, too, to enable machine learning algorithms to go through them,” says Apweiler.

You can give good explanations for certain reactions to the virus – for example, as a result of underlying medical conditions – but not for all.

Rolf Apweiler, Joint Director of the European Bioinformatics Institute (EBI)

The second data group covers patients’ genetic data. “People react to the virus in different ways,” says Apweiler. “You can give good explanations for certain reactions to the virus – for example, as a result of underlying medical conditions or age – but not for all.” Large amounts of data are important here too. Researchers can compare the genetic data of patients with severe disease progressions with data from those with only mild symptoms and find out which genes influence these differences. This knowledge makes it possible to assess individual risks more precisely.
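The comparison described above can be sketched in a few lines. This is a hedged illustration, not the platform's actual analysis: it assumes each patient record simply lists which genetic variants the patient carries, and it uses an odds ratio with the Haldane-Anscombe correction (adding 0.5 to each count) as one common way to compare the severe and mild groups. The `Patient` class, the variant name and the sample records are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Patient:
    severe: bool                                # True = severe course, False = mild
    variants: set = field(default_factory=set)  # hypothetical variant labels


def odds_ratio(patients, variant: str) -> float:
    """Odds of carrying `variant` in severe cases relative to mild cases.

    Adds 0.5 to each of the four counts (Haldane-Anscombe correction)
    so that an empty cell does not cause a division by zero.
    """
    a = sum(1 for p in patients if p.severe and variant in p.variants)
    b = sum(1 for p in patients if p.severe and variant not in p.variants)
    c = sum(1 for p in patients if not p.severe and variant in p.variants)
    d = sum(1 for p in patients if not p.severe and variant not in p.variants)
    return ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))


if __name__ == "__main__":
    # Made-up cohort: 3 of 4 severe patients and 1 of 4 mild patients carry variant_X.
    cohort = (
        [Patient(True, {"variant_X"})] * 3 + [Patient(True)]
        + [Patient(False, {"variant_X"})] + [Patient(False)] * 3
    )
    print(f"odds ratio for variant_X: {odds_ratio(cohort, 'variant_X'):.2f}")
```

A ratio well above 1 would suggest that the variant is more common among severe cases, but with so few patients it proves nothing, which is why the researchers need large amounts of data.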

The third data group consists of clinical data collected directly from patients – for example, information about the course of the disease or the affected organs. The virus does not only impact the respiratory tract. There are also reports of heart disorders, kidney disorders and complex pathologies among children. There is still a lack of knowledge about the long-term effects of an infection. “There’s a large gap between such clinical data and research data, which we must close,” says Apweiler.

The platform does more than store data. It also gives researchers the computing power they need to evaluate it, as well as tools for tasks such as filtering patient data according to specific criteria.

The partners in Germany include the German Network for Bioinformatics Infrastructure (de.NBI) at Bielefeld University. The network plays an important role in bringing together data from Germany and other countries.

The COVID-19 platform is part of the EU’s ERA vs Corona Action Plan. Its most important participants alongside EMBL-EBI are the European Commission and the EU member states.