A persistent EHDS misunderstanding
Letter from the Dutch Minister of Health
On January 20, 2026, the Dutch Minister of Health, Welfare and Sport sent a letter to the House of Representatives, informing it about the current status of the EHDS implementation. This letter was largely excellent and clear. However, it contained a persistent misunderstanding. On page 6, four items or functions are listed that must be developed in the coming years to ensure a well-functioning Health Data Access Body, the new government body where permits can be requested to reuse health data for public health purposes. The Minister describes the first of these as follows: “A national dataset catalog: a catalog provides insight into the location of which data categories for secondary use.” This description is incorrect, as one can also apply for a permit concerning data that are not included in the catalog. Describing the catalog in this way would make the EHDS significantly less useful to science than the EU intended.
What should be included in the catalog?
Every entity (except micro entities) that holds EHDS data collected in a set is legally required to register those sets in the national catalogue. This could be the Health-RI catalogue, but also, for example, the DANS catalogue, established by the Royal Netherlands Academy of Arts and Sciences (KNAW), among others. DANS is one of the leading repositories in Europe. The HDAB, which is yet to be established, will have to make a choice in this regard. Data holders must register all their datasets with the HDAB (annually) for inclusion in the catalogue. A dataset is defined as: “a structured collection of electronic health data.” These datasets have to be labeled according to HealthDCAT-AP (a metadata standard) so that researchers can more easily search throughout Europe for datasets that they can reuse for their research. To this end, the national catalogue will be linked to a European catalogue.
What can you apply for at the HDAB?
However, what a researcher can request from the HDAB encompasses more than what is listed in the catalogue. One can apply for a permit to work with “data,” and this is defined separately in the EHDS (a definition that stems from the Data Governance Act). Data means: “any digital representation of acts, facts or information and any compilation of such acts, facts or information, including in the form of sound, visual or audiovisual recording.” The difference is that datasets are “structured collections” of data, while “data” also includes all kinds of unstructured data that have never been collected in a structured set. Does it matter whether or not the application procedure (DAAMS) is restricted to the catalogue? Absolutely, it makes a big difference.
You cannot label unstructured data
The first problem is that in the Netherlands, people often seem to think that data holders must register all “data” in the catalog, including all unstructured data. Healthcare institutions would then have to inventory all the healthcare data they hold and register it for the catalog, regardless of whether a scientist will ever be interested. This isn’t particularly efficient. Moreover, it’s unclear how this should be implemented. Of course, the data itself isn’t included in the catalog, only the metadata, which is a description of the data. But unstructured data is extremely difficult to describe with metadata in a way that allows others to understand what it is about. Since nobody can say how all unstructured healthcare data in the Netherlands could be included in a catalog, this approach won’t work.
Unstructured data cannot be applied for?
This leads to the second problem. If not all EHDS data are actually in the catalogue, while the application process is limited to what is in the catalogue, then all kinds of data cannot be requested. Limiting the DAAMS to the data in the catalogue therefore restricts secondary use to data that has already been used for research or statistics. What will not be possible in this way is extracting previously uncollected, and therefore unstructured, data from the healthcare system. The EHDS system would then cover existing research and statistics data, but would not extend to making available or generating entirely new datasets. Those who are aware of this can indeed insist that we include as much (unstructured) data as possible in the catalogue. It seems more logical and efficient, however, to start looking for data only when someone actually requests it, through a permit application procedure that is completely separate from the catalogue and in which one simply writes down what one needs and where it can probably be found.
Then rare data and new data still cannot be collected
The third problem is that the existing datasets, which were created under the current flawed system, contain very little rare data. Scientists seeking to advance their careers will often choose to work with more readily available data rather than searching for extremely rare data (publish or perish, as the saying goes). However, the EHDS permit system is meant to solve not only the problem of data requests that are currently being rejected (for whatever reason), but also the problem that some data are currently difficult to find. For example, based on the EHDS text, one could submit an international request regarding the data of patients diagnosed with both Henoch-Schönlein purpura and Long-Covid in the next five years (a completely arbitrary example). There should be a system in place which notifies the coordinating HDAB if such a patient is indeed found to exist. Moreover, if data holders must submit datasets to the catalog annually, and the application process is limited to that catalog, one would not be able to request newly generated data during a pandemic. This is despite the fact that the EHDS is explicitly intended to make fast data collection possible in such situations.
Split catalog and application process for greater impact
The EHDS thus introduces a permit both for requesting data from society and for reusing existing datasets. Article 67 of the EHDS contains a list of everything that must be submitted when applying for a permit. This article does not mention the catalog at all. It states what one must provide: “a description of the requested electronic health data, including their scope, time range, format, sources and, where possible, the geographical coverage.” So it does state that one must indicate what the data sources are, but the term “data holder” surprisingly does not appear there. Moreover, recital 73 states: “The health data access body (…) should assist health data users in the selection of the suitable datasets or data sources for the intended purpose of secondary use.” This means that, unlike now, it is not the scientist alone but the scientist together with the HDAB who has to locate the data in question. Therefore, when making the application, the scientist does not need to know exactly who holds the data; a general indication of the sources is enough, after which the HDAB must further assist the scientist. This will make scientific progress in all kinds of new directions possible. But for this to work, the system must be designed in such a way that the DAAMS does not require you to designate a dataset in the catalogue.
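To make this design point concrete, here is a minimal sketch in Python of what an application record could look like when the catalogue reference is optional. It is not the actual DAAMS data model: the class name, field names and validation rule are all hypothetical, and only loosely follow the elements of Article 67 quoted above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PermitApplication:
    """Hypothetical application record; all field names are illustrative."""
    description: str            # what data is requested
    scope: str
    time_range: str
    data_format: str
    sources: list[str]          # a general indication of sources is enough
    geographical_coverage: Optional[str] = None   # "where possible"
    # Optional pointers to catalogue entries; empty when the requested
    # data has never been collected in a registered dataset.
    catalogue_dataset_ids: list[str] = field(default_factory=list)


def validate(app: PermitApplication) -> list[str]:
    """Return a list of problems; an empty list means nothing is missing."""
    problems = []
    for name in ("description", "scope", "time_range", "data_format"):
        if not getattr(app, name).strip():
            problems.append(f"missing {name}")
    if not app.sources:
        problems.append("missing sources")
    # Deliberately no check on catalogue_dataset_ids: in this sketch an
    # application is complete without designating a catalogue dataset.
    return problems
```

In this sketch an application for data that has never been collected in a set, such as the rare-disease example above, passes validation with an empty catalogue_dataset_ids list, while an application without any indication of sources does not.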
There is a persistent misunderstanding about the EHDS. Data holders must have their datasets included in a data catalogue. But data users can apply for a permit, including for data that are not yet in the catalogue. The application procedure therefore cannot be tied (solely) to the catalogue.
The EHDS is about data, not bodily material. The Dutch draft Bodily Material Act is about material, not data. This might lead one to believe there's no overlap. But if you extract data from material, you're doing something with both data and material. That's why I'm discussing my thoughts on the draft act here. Spoiler alert: it's not good.