Interview: ‘The fear is that data will be lost’

Henrik Schönemann | historian and activist

With the U.S. government actively removing valuable scientific data from its websites, a group of European scientists has begun to secure that data.
Marcel aan de Brugh

Henrik Schönemann sitting in front of a desk with multiple computer monitors and a laptop. The laptop is adorned with various stickers, including one that reads 'guerrilla archivist', one with 'Protect Trans Lives', and another that says 'Feminism is for everyone.' He is resting his head on his hand in a contemplative pose. The monitors display a dark-themed interface with purple accents, showing a web application with the title 'SciOp.' The left monitor shows a section titled 'Details,' while the right monitor displays a list of datasets.
Photo: Gordon Welters

Henrik Schönemann's day job is as a historian at Humboldt University in Berlin. But since January of this year he has also been spending a lot of time rescuing all kinds of data stored in the United States: data that may be at risk now that the Trump administration is censoring government communications for words and information on gender, discrimination, climate, and public health. Schönemann started an initiative, Safeguarding Research & Culture, which some 170 volunteers from the U.S., Europe and Australia have now joined.

How much time do you spend on this initiative?

“Right now about four to six hours a day, I think.”

Why did you start duplicating and securing data?

“I felt it coming when Trump was elected president. Project 2025, the conservative agenda to reshape government, was in the news a lot. That felt like a real danger to me.”

Why?

“Mainly because of Trump's statements about eliminating gender ideology. The attacks on the LGBTQI+ community. But also because of his positions on climate, public health, and everything that is un-American.”

Especially because of the attacks on the LGBTQI+ community, you say. Do you belong to that community?

“I am bisexual. I came out years ago.”

What kind of data are you duplicating?

“I started searching by keywords, like 'trans' and 'gender.' These are articles, PDFs. Then I expanded it to other topics that you know are at risk.

“We downloaded health data from the Centers for Disease Control and Prevention (CDC), data from the Department of Education, weather and climate data from NOAA, and historical data.

“You don't actually know what is still safe. For example, Enola Gay, the bomber that dropped the first atomic bomb, would be on the list of suspect words because it has 'gay' in it. Trump also ranted the other day about transgender mice. But it turned out to be research about transgenic mice.”

Isn't it a lot of work for one person?

“I expanded the initiative and brought in other people. The core team now consists of 12 people. In addition there are almost 170 volunteers. About half of them are in the U.S., 40 percent in Europe, 10 percent in Australia.

“We work with pseudonyms among ourselves. And we work with different security levels. Not everyone can access everything. It's organized a bit anarchically. It works well because of that.”

Do you only search on your own initiative?

“No, we also get requests, for example from climate scientists. The Trump administration is cleaning house at NOAA, the National Oceanic and Atmospheric Administration, which collects a lot of weather and climate data. Thousands of people are being laid off. The fear is that data will be lost. For example, we have duplicated fundamental climate data from NOAA, an archive of global radiosonde data, and all kinds of weather observations. Two researchers have written a script for us to download data on storms and paleoceanography.”

How do you store all that?

“We use torrenting, a proven technique where you store bits of information on lots of individual computers. It's the same technique which, for example, The Pirate Bay used for movies, music and video games. Only, their content was illegal. We only download publicly accessible information. We don't break through paywalls.”

Climate data requires a lot of storage space. How much do you have available?

“We can handle up to 100 terabytes. With that, you can store a lot. But data from satellites and climate models require even more. Then you're talking about petabytes. We are in contact with climate institutes about this.

“By the way, it's not just about storing data. We are also putting them online, so they can be accessed by others.”

What are your further plans?

“I want to register officially as a non-profit. If we register as an association we can also have a bank account. There are already parties who have indicated they want to support us, but that requires a lot of bureaucracy now. That will be easier as an association.”

On the night of March 25, at 3:00 AM, Schönemann emails that they have managed to download much more data from the U.S. Department of Education “just in time.” This is the largest collection yet, he writes. “Several terabytes.”