GDPR Compliance: At the Intersection of AI and Life Sciences

May 1, 2024

Late last month, the Association of Corporate Counsel (ACC) hosted a panel on artificial intelligence and how it is rapidly transforming the life sciences sector, allowing companies to leverage large datasets to accelerate drug discovery, optimize clinical trials, streamline supply chains, and more. However, because this typically involves processing highly sensitive individual health data, many of the AI systems and tools being utilized across the industry are subject to the European Union's General Data Protection Regulation (GDPR)—an influential piece of legislation that has shaped how companies and organizations throughout the world approach data privacy.

Understanding the Terrain

In enacting the GDPR, the European Union sought to enhance data privacy by giving individuals more control over their personal data and establishing rules for how organizations should collect, process, and store that information. Any company processing the personal data of individuals in the EU must comply with its provisions, including companies based outside the EU that offer goods and services there or that monitor the behavior of people in the EU, for example by tracking cookies or IP addresses of visitors to their websites from one of the EU’s member nations.

The GDPR defines "personal data" as any information relating directly or indirectly to an identified or identifiable natural person (the data subject), including data that reveals information about a person's health in the past, present, or future, such as:

  • Health data by nature (medical history, illnesses, services provided, test results, treatments, disabilities, etc.)
  • Data that, when cross-referenced with other data, allows a conclusion to be drawn about a person's state of health or health risk
  • Data that becomes health data because of its intended purpose (e.g., used for medical purposes)

The GDPR does make an exception for anonymized data. So long as the data is truly anonymous and individuals are no longer identifiable, it will not fall within the scope of the legislation. However, anonymization under the GDPR requires more than “pseudonymization,” which simply substitutes names and identifiers in a dataset with placeholders; pseudonymized data remains personal data. To ensure compliance, it’s critical that data controllers apply generalization and randomization techniques carefully enough that individuals cannot be re-identified by any means reasonably likely to be used.
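To illustrate the difference, the following Python sketch contrasts pseudonymization with generalization. The records, the field names (`name`, `age`, `postcode`, `diagnosis`), and the helper functions are all hypothetical, randomization is not shown, and passing a check like this does not by itself establish that a dataset is anonymous under the GDPR.

```python
import hashlib
from collections import Counter

# Hypothetical patient records, invented for illustration only.
RECORDS = [
    {"name": "Ana", "age": 34, "postcode": "10115", "diagnosis": "asthma"},
    {"name": "Ben", "age": 37, "postcode": "10117", "diagnosis": "diabetes"},
    {"name": "Cleo", "age": 52, "postcode": "20095", "diagnosis": "asthma"},
]


def pseudonymize(record, salt="demo-salt"):
    """Swap the name for a salted hash.

    Age and postcode are untouched, so a person can still be singled out:
    this is pseudonymization, not anonymization.
    """
    out = dict(record)
    digest = hashlib.sha256((salt + record["name"]).encode()).hexdigest()
    out["name"] = digest[:12]
    return out


def generalize(record):
    """Drop the direct identifier and coarsen the quasi-identifiers."""
    decade = record["age"] // 10 * 10
    return {
        "age_band": f"{decade}-{decade + 9}",
        "region": record["postcode"][:2] + "xxx",
        "diagnosis": record["diagnosis"],
    }


def smallest_group(rows, keys=("age_band", "region")):
    """Size of the smallest group sharing the same quasi-identifiers."""
    counts = Counter(tuple(row[key] for key in keys) for row in rows)
    return min(counts.values())


generalized = [generalize(record) for record in RECORDS]
k = smallest_group(generalized)
```

Here `k` comes out as 1 because one record is still unique in its age band and region, a sign that further generalization or suppression would be needed before the data could plausibly be treated as anonymous.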

Establishing a Legal Basis

The GDPR stipulates that personal information, including health data, may only be collected for specified, explicit, and legitimate purposes and must not be processed for other reasons incompatible with the original purpose. In terms of artificial intelligence, this standard applies whether data processing occurs in the development and training of AI systems, during the deployment of AI tools and systems to hospitals and other healthcare providers, or during ongoing training or machine learning.

For data to be processed fairly and lawfully, each processing operation must be carried out under one of the six lawful bases for processing in Article 6. Those most applicable to life sciences include:

  • Consent: Data can be processed lawfully when the subject has consented to the processing, provided consent was "freely given, specific, informed, and an unambiguous indication of the data subject's wishes."
  • Public Interest: Data can be processed when necessary to perform a task carried out in the public interest or to exercise official authority vested in the controller.
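  • Legitimate Interest: Data can be processed when necessary for the legitimate interests pursued by the controller or a third party, unless those interests are overridden by the interests or fundamental rights of the data subject.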

The GDPR prohibits the processing of personal health data unless the data controller has met one of the conditions in Article 9(2). Examples of such conditions include instances where the data subject has provided explicit consent or has manifestly made the information public, or where the processing is necessary for reasons of public interest in the area of public health.
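One way to keep track of these requirements is to record a lawful basis for every processing operation and flag the gaps. The Python sketch below assumes, as is commonly understood, that health data needs both an Article 6 basis and an Article 9(2) condition. The class, the function, and the string labels are hypothetical, and only a subset of Article 9(2) conditions is listed.

```python
from dataclasses import dataclass
from typing import List, Optional

# Labels for the six lawful bases in Article 6(1)(a)-(f).
ARTICLE_6_BASES = {
    "consent",
    "contract",
    "legal_obligation",
    "vital_interests",
    "public_interest",
    "legitimate_interests",
}

# A subset of the Article 9(2) conditions, with hypothetical labels.
ARTICLE_9_CONDITIONS = {
    "explicit_consent",
    "manifestly_made_public",
    "public_health",
    "scientific_research",
}


@dataclass
class ProcessingOperation:
    purpose: str
    lawful_basis: Optional[str] = None
    involves_health_data: bool = False
    article_9_condition: Optional[str] = None


def compliance_gaps(op: ProcessingOperation) -> List[str]:
    """List the missing legal grounds for one processing operation."""
    gaps = []
    if op.lawful_basis not in ARTICLE_6_BASES:
        gaps.append("no recognized Article 6 lawful basis")
    if op.involves_health_data and op.article_9_condition not in ARTICLE_9_CONDITIONS:
        gaps.append("health data without an Article 9(2) condition")
    return gaps


# Example: model training on health data with no Article 9(2) condition recorded.
training = ProcessingOperation(
    purpose="AI model training",
    lawful_basis="legitimate_interests",
    involves_health_data=True,
)
training_gaps = compliance_gaps(training)
```

A check like this only confirms that a basis has been recorded; whether that basis is actually valid for the stated purpose remains a legal judgment.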

Privacy Notices and Transparency

The GDPR requires that a data subject be provided with an explicit privacy notice any time personal data, including health data, is collected under a legal basis. This notice must be furnished:

  • When personal data is collected from EU residents
  • When first contact is made with an EU resident whose personal data was obtained indirectly, or within one month of obtaining that data, whichever comes first
  • Prior to using data for a purpose other than the one originally stated when that data was collected
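The timing rule for indirectly obtained data can be sketched in Python as follows. The function names are hypothetical, and counting "one month" as the same day of the following calendar month (clamped to that month's last day) is an assumption rather than a statement of how the period must be calculated.

```python
import calendar
from datetime import date
from typing import Optional


def add_one_month(d: date) -> date:
    """Same day in the next month, clamped to that month's last day."""
    year = d.year + d.month // 12
    month = d.month % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def notice_deadline(obtained_on: date, first_contact_on: Optional[date] = None) -> date:
    """Latest date for the notice when data was obtained indirectly.

    Returns the date of first contact or one month after the data was
    obtained, whichever comes first.
    """
    one_month_later = add_one_month(obtained_on)
    if first_contact_on is None:
        return one_month_later
    return min(one_month_later, first_contact_on)


# Data obtained on 31 January 2024, with no contact planned yet.
no_contact = notice_deadline(date(2024, 1, 31))

# Same data, but the data subject is first contacted on 10 February 2024.
with_contact = notice_deadline(date(2024, 1, 31), date(2024, 2, 10))
```

In this example `no_contact` is 29 February 2024, while `with_contact` is 10 February 2024 because the first contact falls before the one-month limit.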

If data is being sourced from a preexisting database, there is a chance it’s being reused for a new purpose. To mitigate their compliance risk, it’s recommended that data controllers:

  • Determine if a privacy notice has been provided to data subjects
  • Provide appropriate information to data subjects
  • Check whether an exception under Article 14(5) can be applied:
    • The data subject already has this information
    • Providing the information proves impossible or would require a disproportionate effort, particularly for scientific research

Data Subject Rights

Data subject rights under the GDPR empower individuals to retain control of their personal data, including the rights of access and erasure, and data controllers must explain how those rights can be exercised (to whom, in what form, etc.). In principle, individuals who exercise these rights are entitled to a response within one month.

If the controller no longer needs to identify a data subject for the intended purpose of processing personal data, it is not obligated to maintain, process, or acquire additional information solely for GDPR compliance. However, the controller must inform a subject seeking access to data if it cannot identify them. In that case, data subject rights will not apply unless the subject supplies additional information enabling their identification.


Navigating the intersection of GDPR compliance and artificial intelligence in the life sciences industry requires a thoughtful and proactive approach. By prioritizing transparency, legal basis, and data subjects’ rights, companies will be better positioned to meet their regulatory burden and uphold the trust of individuals whose highly sensitive health data forms the foundation of AI-driven advancements and innovations.

Blog Info
By Eric Elting, Regional Director