Tell me in two minutes
- The Australian public is nervous about AI and has low trust that companies using AI will protect their personal data. Privacy issues can arise in a number of ways: through the inclusion of personal information in training data, through users inputting personal information into AI models, and through the potential for AI models to inadvertently surface or re-identify personal information. These concerns become even more acute in use cases such as automated decision-making, profiling, sentiment analysis, and surveillance.
- In Australia, the Federal Government is preparing to update the Privacy Act 1988 (Cth) (Privacy Act) in ways that will impact the use of AI, through proposed measures to increase transparency, strengthen individual rights to access and delete information, and introduce new enforcement mechanisms (such as the proposed direct right of action for privacy violations).
- As the regulatory and litigation landscape for AI shifts, businesses should proactively establish governance frameworks to manage the development and deployment of AI systems, paying close attention to privacy risks and to upcoming changes to the Privacy Act.
Context
AI models need to be trained on data to enable them to identify patterns, learn from examples, and make predictions or decisions. The greater the quantity and the better the quality of the data they are trained on, the better they can perform. While there are intellectual property (and other) issues about how and what datasets are used by generative AI (GenAI) developers to train their models, there is also a growing appreciation of the privacy issues and risks associated with the training and use of GenAI. In a recent survey by Ipsos, Australia ranked amongst the highest nations in terms of nervousness about AI and amongst the lowest in terms of its trust that companies using AI will protect the public’s personal data.
These privacy issues include whether personal information can be inputted into a GenAI model (particularly a large language model or LLM) or used to train the model, what happens when the model surfaces personal or sensitive information, and the use of GenAI for automated decision making. We explore these risks below.
This article is part of KWM’s series on the risks of GenAI and considers the privacy risks of GenAI. Find the other articles here.
What are the key privacy risks associated with GenAI?
Using personal information held by the organisation to train GenAI models
The information an individual shares with an organisation, including their personal information, might be used to update or train a GenAI model. In this respect, one potential privacy concern is whether training a business’s GenAI model on personal information contained in its customer data would be a breach of Australian Privacy Principle (APP) 6.1, which provides that personal information held by an organisation about an individual should only be used for the primary purpose (or a related secondary purpose[1]) of collection.
In many cases, the uses of personal information disclosed in organisations’ privacy policies are broad enough to encompass the training of GenAI. However, some organisations are also updating their privacy policies and terms of service to add the training of AI on customer data as a specific purpose, introducing terms such as artificial intelligence, machine learning and GenAI into these documents. There are risks associated with this, including adverse customer reaction and publicity, and the potential for a privacy policy or terms of service to be unfair or misleading where it allows an organisation to retroactively scrape old data. Internationally, there have been several examples of adverse customer reactions, including:
- Recently, Adobe introduced new language into its terms of service, which gave rise to concerns from customers that the update would permit Adobe to use their creative works to train Adobe’s GenAI. Adobe has since rolled out new terms of service to confirm it will not use customer data to train GenAI.
- Zoom encountered a similar adverse reaction in 2023 when it updated its terms of service to state that ‘Service Generated Data’ would be used to train and tune AI algorithms and models. After customers spoke out against this inclusion, Zoom subsequently clarified (and changed the terms of service) to confirm that it does not use customer audio, video, chat and other communications to train Zoom’s AI models.
- The Irish Data Protection Commission has delayed Meta’s plans to roll out Meta AI in Europe, after it told Meta to pause training the AI model using publicly available posts, images and online tracking data. This follows scrutiny from the European advocacy group noyb, which lodged 11 complaints against Meta with data protection regulators across Europe in response to Meta’s updated privacy policy that came into effect on 26 June 2024.
It is important to note that if the Privacy Act is amended to include a fair and reasonable test, then organisations will need to show that use of personal information (that they hold) to train GenAI is fair and reasonable in the circumstances. If individual rights are introduced (such as the right to withdraw consent or a right to erasure of personal information), organisations must also ensure they are only using personal information in accordance with how those rights are exercised. We explore the upcoming changes to the Privacy Act later in this article.
Training using interactions with GenAI models
GenAI chatbots like OpenAI’s ChatGPT, Google Gemini, Anthropic’s Claude and Microsoft Bing AI can collect data (including personal information) from their interactions with customers (in particular, prompts that are inputted into them), and use that data to update the chatbot model or train an entirely new model. The use of data inputted into prompts to train GenAI models has been widely recognised as a privacy and security risk.[2] It has heightened concerns that personal information inputted into a GenAI model could be surfaced in response to a prompt from another user.
Internationally, industry has had to respond to address this risk. For example, in April 2023, OpenAI announced that users could turn off chat history in the public-facing version of ChatGPT, allowing individuals to choose which conversations could be used to train its models (data submitted via the API was already excluded from training by default).[3] Organisations that are alert to these risks will have implemented policies prohibiting staff from inputting personal information (or other confidential information) into public versions of GenAI models. They will also have taken steps to ensure that the settings for their enterprise API implementations of these GenAI models do not permit model developers to train their models on user prompts.
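By way of illustration only, the sketch below screens outbound prompts for common categories of Australian personal information before they reach an external GenAI service. The regular expressions, the categories checked and the `send_to_genai_service` placeholder are assumptions made for the purpose of the example; a real control would typically rely on a dedicated data loss prevention or PII-detection tool and a properly configured enterprise deployment.

```python
import re

# Hypothetical patterns for common categories of Australian personal information.
# A production control would use a dedicated PII-detection / data loss prevention tool.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "au_mobile": re.compile(r"(?:\+?61|0)4\d{2}[ -]?\d{3}[ -]?\d{3}"),
    "tax_file_number": re.compile(r"\b\d{3}[ -]?\d{3}[ -]?\d{3}\b"),
}


def screen_prompt(prompt: str) -> list[str]:
    """Return the categories of possible personal information detected in a prompt."""
    return [name for name, pattern in PII_PATTERNS.items() if pattern.search(prompt)]


def send_to_genai_service(prompt: str) -> str:
    """Placeholder for the organisation's approved enterprise GenAI endpoint (hypothetical)."""
    raise NotImplementedError("Connect this to your approved enterprise GenAI deployment.")


def safe_submit(prompt: str) -> str:
    """Block prompts that appear to contain personal information; otherwise forward them."""
    findings = screen_prompt(prompt)
    if findings:
        raise ValueError(f"Prompt blocked: possible personal information detected ({findings})")
    return send_to_genai_service(prompt)
```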
Surfacing personal information
Much of the discussion about AI and privacy concerns the use of personal information to train GenAI models. A related issue arises when a GenAI model is trained on personal information and is subsequently prompted to surface personal or sensitive information about an individual. The risk is increased where the personal information used to train LLMs is taken from publicly available data, for example, through ‘data scraping’[4].
Many AI developers have fine-tuned their GenAI models to mitigate the risk of such information being surfaced, but such mitigations are not foolproof. Where mitigation has not taken place, organisations that use GenAI tools which could inadvertently surface the personal information they hold (for example, about customers or employees) may be at risk of breaching APP 11 for failing to take reasonable steps to protect that personal information from misuse, interference or unauthorised access. Organisations should therefore take care to mitigate the risk of the GenAI models they use inadvertently surfacing personal information.
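One output-side control, sketched below with invented names and data, is to check generated responses against identifiers the organisation itself holds before the output is released to a user. This is a minimal sketch of the idea only; real mitigations would also involve access controls, retrieval design and model-level safeguards.

```python
def contains_held_identifiers(response: str, known_identifiers: set[str]) -> bool:
    """Return True if a model response appears to reproduce identifiers the organisation holds."""
    lowered = response.lower()
    return any(identifier.lower() in lowered for identifier in known_identifiers)


def release_response(response: str, known_identifiers: set[str]) -> str:
    """Withhold responses that appear to surface personal information held by the organisation."""
    if contains_held_identifiers(response, known_identifiers):
        return "This response was withheld because it may contain personal information."
    return response


# Illustrative usage with invented data.
known_identifiers = {"jane.citizen@example.com", "0412 345 678"}
print(release_response("You can reach Jane at jane.citizen@example.com.", known_identifiers))
```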
Automated decision making
Automated decision making (ADM) is the process of making a decision by automated means, and without human involvement. It is not a new technology and chances are many businesses already use some form of automation to support decision making. There is a clear use case for LLMs in ADM, particularly given the capabilities of LLMs to summarise information and provide a convincing response to a question.
In the context of ADM, an LLM can be used to guide a decision maker, recommend a decision, provide information, or make the entire decision itself. ADM that has a material impact on the rights or obligations of an individual raises significant issues, particularly if there is no transparency as to the basis on which a decision was made and no ability to contest it.
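To make the transparency and contestability point concrete, the sketch below records an LLM-assisted decision together with its inputs, the human reviewer and the stated reasons, so the basis for the decision can later be explained and reviewed. The structure and field names are illustrative assumptions, not a prescribed compliance format.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DecisionRecord:
    """An auditable record of an LLM-assisted decision (illustrative field names only)."""
    applicant_id: str
    inputs_summary: str         # what information was put before the model
    model_recommendation: str   # what the model suggested
    final_decision: str         # what was actually decided
    human_reviewer: str         # who reviewed or overrode the recommendation
    reasons: str                # plain-language basis for the decision, supporting review or contest
    timestamp: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())


# Illustrative usage with invented data.
record = DecisionRecord(
    applicant_id="A-1023",
    inputs_summary="Income statements and credit history for the last 24 months",
    model_recommendation="Decline",
    final_decision="Approve with conditions",
    human_reviewer="credit.officer@example.com",
    reasons="Recent income increase was not reflected in the information put to the model",
)
print(record)
```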
Article 22(1) of the EU General Data Protection Regulation (GDPR) grants individuals the right to not be subject to a decision based solely on automated processing (including profiling) which has a legal or similarly significant effect. The Court of Justice of the European Union (CJEU) recently decided that a ‘credit score’ is an automated decision for the purpose of Article 22.[5]
While there is no current prohibition on the use of ADM in Australia, there are currently reform proposals for enhanced privacy disclosures relating to use of personal information with ADM that are discussed further below. In addition, the Royal Commission into the Robodebt Scheme has made a number of recommendations around the operation of ADM in government services, recommending that among other things, there should be a clear path for those affected by decisions to seek review and the underlying business rules and algorithms should be made available to enable independent expert scrutiny.
Profiling
GenAI can also be used to profile individuals. Profiling is defined in Article 4 of the GDPR to include automated processing of personal data to analyse and predict an individual’s performance at work, economic situation, health, personal preferences, interests, reliability, behaviour, location or movements. GenAI can use predictive analytics to build a profile of an individual, covering anything from their spending habits to their health.
Profiling is not a new phenomenon (it was at the centre of the Cambridge Analytica controversy surrounding the 2016 US Presidential Election) and it can have tangible, adverse impacts on individuals where an organisation is using profiled data in its ADM processes. The profiled data can be unreliable or biased, and there is no guarantee that the AI model will produce an accurate output. Predictions are, after all, only predictions. The reforms to the Privacy Act referred to below are likely to address profiling, and possibly targeting, to some extent.
Sentiment analysis
Understanding customer sentiment can be an important tool for organisations. Sentiment analysis, also known as opinion mining or emotion AI, is a technique that uses AI to infer emotions or sentiment from an individual’s text, speech inflections or facial expressions. Even before the rise of GenAI, sentiment analysis was being used in areas such as customer service, advertising and health,[6] including to improve customer satisfaction, monitor business reputation and detect fraud.
Of course, there are also risks, including the potential for sentiment analysis to accentuate human biases.[7] Where sentiment analysis is linked to a particular individual, it will be considered personal information and must be treated accordingly. In addition, if the sentiment analysis relates to an individual’s tone or expression on a sensitive topic, such as their political opinions, it may be considered sensitive information for the purposes of the Privacy Act (irrespective of whether the analysis is correct).[8] In the case of sensitive information, more stringent privacy obligations will apply to its collection, use and disclosure. This type of analysis may also be used for customer profiling, which raises issues of its own.
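For readers who have not seen the technique in practice, the snippet below runs a minimal text-based sentiment classifier using the open-source Hugging Face transformers library with its default English model (an assumption made for the example). The point for privacy purposes is that, once the scored text is attributable to an identified individual, the output itself is personal information and, depending on the topic, potentially sensitive information.

```python
from transformers import pipeline  # pip install transformers torch

# Loads a default English sentiment model. In practice an organisation would assess
# the model's accuracy and bias before relying on its output.
classifier = pipeline("sentiment-analysis")

feedback = "I've been on hold for an hour and nobody can explain the extra charge."
result = classifier(feedback)[0]
print(result["label"], round(result["score"], 3))  # e.g. NEGATIVE 0.998

# If this score is stored against an identified customer, it is personal information
# and must be handled in accordance with the Privacy Act.
```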
Surveillance and monitoring
Surveillance systems can identify, track and monitor an individual’s behaviour in an array of online and offline contexts, a process that can be amplified by AI, which can analyse data at an unprecedented scale. For example, AI is used in facial recognition technologies (FRT) that can detect human faces in images or video, which is a concern given the sensitivity of the information captured. As FRT is used to capture ‘biometric information’ for ‘the purpose of automated biometric verification or biometric identification’, facial images (and the data derived from those images) will be treated as sensitive information under the Privacy Act.
Surveillance, monitoring and FRT have been the focus of regulatory scrutiny in Australia and internationally. In 2021, the OAIC and the UK Information Commissioner conducted a joint investigation that found that the US-based company Clearview AI Inc had breached Australian privacy law by scraping facial images of individuals from social media and the internet, a decision which was affirmed in part by the Australian Administrative Appeals Tribunal in Clearview AI Inc v Australian Information Commissioner.[9]
Although surveillance practices are already subject to the Privacy Act, they are likely to be regulated further by the upcoming reforms. Privacy Commissioner, Carly Kind, recently discussed ‘surveillance capitalism’[10] and other privacy risks in relation to ‘intrusive tracking practices’. This has been a consistent concern over the years and is likely to remain a focus of regulatory scrutiny moving forward.
Re-identification and anonymisation
GenAI’s ability to analyse and parse vast amounts of data and undertake large scale pattern recognition means it may be possible for individuals whose data is within anonymised datasets to be re-identified by cross-referencing and correlating information from different datasets (which may be anonymised or not) via something known as the ‘Mosaic Effect’. This risk is both a privacy issue and a cybersecurity issue where malicious threat actors can re-identify personal and sensitive information.
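A simplified illustration of this linkage (or ‘mosaic’) risk is sketched below using pandas and invented data: neither dataset contains a name alongside health information, yet joining them on shared quasi-identifiers links each diagnosis back to a named individual.

```python
import pandas as pd

# An 'anonymised' health dataset: names removed, but quasi-identifiers remain.
health = pd.DataFrame({
    "postcode": ["2000", "3000"],
    "birth_year": [1980, 1975],
    "sex": ["F", "M"],
    "diagnosis": ["asthma", "diabetes"],
})

# A separate, publicly available dataset containing names and the same quasi-identifiers.
public = pd.DataFrame({
    "name": ["Jane Citizen", "John Smith"],
    "postcode": ["2000", "3000"],
    "birth_year": [1980, 1975],
    "sex": ["F", "M"],
})

# Joining on the shared attributes re-identifies the individuals in the health data.
reidentified = public.merge(health, on=["postcode", "birth_year", "sex"])
print(reidentified[["name", "diagnosis"]])
```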
At the same time, GenAI can be used for data anonymisation through a range of techniques, including to create synthetic data that closely resembles real data, or by incorporating differential privacy techniques to inject noise into generated data. It is really a tool that can be used for good or evil!
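As a simple illustration of the differential privacy idea mentioned above, the sketch below applies the Laplace mechanism to a count query: random noise calibrated to the query's sensitivity and a chosen epsilon is added so that no single individual's presence materially changes the released figure. The epsilon value and the data are invented for the example; real deployments require careful parameter and privacy-budget choices.

```python
import numpy as np


def noisy_count(true_count: int, epsilon: float, sensitivity: float = 1.0) -> float:
    """Release a count with Laplace noise scaled to sensitivity / epsilon (the Laplace mechanism)."""
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_count + noise


# Illustrative: the size of an invented customer cohort, released with epsilon = 0.5.
print(round(noisy_count(true_count=1284, epsilon=0.5)))
```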
Are we going to see more laws addressing the privacy risks of GenAI?
Jurisdictions around the world are responding to the use of GenAI, especially in a privacy context. In 2023, Italy became the first Western country to temporarily ban the use of ChatGPT, with the Italian Data Protection Authority citing privacy concerns, though ChatGPT was quickly brought back online following changes to improve transparency. Privacy investigations have followed in France, Germany and the United States.
There have been some legislative responses specifically targeting GenAI; for example, the European Union has passed the AI Act (EU AI Act), which includes obligations specific to general-purpose AI models. In this context, it is important that emerging AI regulation works with, and complements, existing privacy legislation. The EU AI Act and the GDPR are designed to work together, but there are overlaps and tensions. As the EU AI Act does not come into force until August 2024, the interrelationship is still evolving.
Other countries are taking a less formal approach. As we have previously discussed, the United Kingdom is taking a ‘pro-innovation’ approach and has not passed any targeted laws,[11] and the Biden Administration in the United States issued the Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence, which is largely directed at the executive arm of the US Government with a few limited reporting obligations on private entities. Nevertheless, there is some appetite for customers to bring privacy lawsuits under existing legislation. In the United States, a class action was commenced against Microsoft and OpenAI alleging they violated various consumer and privacy-related legislation by using personal information to train their AI models, although the suit has since been dropped.
The Australian Government’s consultation on responsible AI indicates that Australia is taking a slow and steady approach to GenAI regulation more generally. In Australia, we do not have any legislation that specifically regulates GenAI, but there are legislative and regulatory regimes of general application that will be relevant to GenAI (for privacy risks, principally the Privacy Act). Australian regulators have emphasised that existing laws apply to the use of GenAI. ASIC Chair Joe Longo recently stated: “Businesses, boards, and directors shouldn’t allow the international discussion around AI regulation to let them think AI isn’t already regulated. Because it is”.
How will the upcoming Privacy Act reforms in Australia affect GenAI?
At present, the Privacy Act is the main vehicle for privacy related regulation in Australia. Although GenAI is not addressed specifically in the Privacy Act, it already regulates the collection and use of personal information (which would include use for GenAI applications). In addition, later this year, the Government expects to introduce a suite of reforms to the Privacy Act based on the Privacy Act Review Report and subsequent Government Response.
The reforms are likely to address the use of personal information for ADM. Although draft legislation is yet to be released, the proposed reforms would require entities to state in their privacy policies whether personal information will be used in ADM that has a legal, or similarly significant, effect on an individual’s rights (Proposal 19.1). Individuals will have a right to request meaningful information about how substantially automated decisions are made (Proposal 19.3). While these proposals are a start at aligning Australia’s approach to ADM with the GDPR, they clearly do not go as far as the GDPR in allowing individuals to object to being subjected to ADM.
As we have previously outlined, there are a raft of other proposed reforms to the Privacy Act that may impact on GenAI, including:
- A new requirement to ensure that collection, use and disclosure of information is fair and reasonable in the circumstances (irrespective of consent) will likely be introduced (Proposals 12.1 – 12.3). Commissioner Kind has called this out as the reform that could see the end of surveillance, online tracking or other privacy risks, as it would “prevent organisations from using consent as a shield for bad privacy practices.”
- The definition of ‘collects’ information may be extended to include inferred or generated information (Proposal 4.3). This is relevant to information generated or surfaced by GenAI. Further, through predictive analytics, GenAI might predict sensitive information about an individual (including racial origin, political opinions, religious beliefs, sexual orientation, or criminal records) or create a ‘profile’ on an individual. These types of predictions would likely be covered by the Privacy Act as ‘inferred’ or ‘generated’ information.
- Individuals may be given an express ability to withdraw consent (Proposal 11.3) and a right to erasure of their personal information (Proposal 18.3). The proposals present an interesting challenge for GenAI trained on personal information. If an individual requests the erasure of personal information, will that require the operator of an AI model to fine-tune the model so that it ‘unlearns’ that personal information?
- Entities will likely be required to conduct a Privacy Impact Assessment for all high-risk activities (Proposal 13.1). There is no doubt some use cases of GenAI will be considered a high privacy risk. As such, businesses will need to conduct a Privacy Impact Assessment that identifies the impact AI will have on an individual’s privacy rights. Many businesses are already undertaking Privacy or AI Impact Assessments in relation to AI as a matter of course.
- There will likely be a direct right of action for individuals in relation to an interference with privacy (Proposal 26.1) and a statutory tort for serious invasions of privacy (Proposal 27.1) as the Australian Government has committed to the latter. Given the changing litigation landscape in relation to GenAI and privacy, there is a good chance the direct right of action will come into play where GenAI use cases cause a breach of the Privacy Act.[12] For businesses, this means there may be a significant litigation liability if appropriate safeguards and governance frameworks are not in place.
In relation to data that might be ‘re-identified’, the Government has ‘noted’ the proposal that entities should take reasonable steps to protect de-identified data (Proposal 21.4). As the proposal was only ‘noted’, it is unlikely to be enacted now or in the immediate future. However, the Government has indicated it will consider options in relation to protecting against the risks of re-identification.
Conclusion
Given the changing regulatory and litigation landscape, businesses that have not already done so should establish an AI governance framework to oversee the development and deployment of GenAI systems, and should specifically consider how that framework addresses data and privacy risks. See our insight here. Directors may also want to consider the Directors’ Guide to AI Governance.
At a minimum, businesses need to be alert to the upcoming changes to the Privacy Act. Reform is imminent and will impact the way businesses manage the privacy risks related to GenAI.
[1] In the case of sensitive information, the secondary purpose needs to be ‘directly’ related.
[2] See, for example, https://theconversation.com/chatgpt-is-a-data-privacy-nightmare-if-youve-ever-posted-online-you-ought-to-be-concerned-199283 and https://www.techradar.com/news/samsung-workers-leaked-company-secrets-by-using-chatgpt
[3] Conversations would still be used to monitor for abuse.
[4] In August 2023, the OAIC and other international data protection regulators released a joint statement on safeguarding against unlawful data scraping, making clear that personal information that is publicly available is still subject to privacy laws.
[5] Hessen v SCHUFA Holding AG (Court of Justice of the European Union, C-634/21, 7 December 2023).
[6] Emotion AI, explained (Meredith Somers, MIT Sloan School of Management, 8 March 2019).
[7] S Kiritchenko and SM Mohammad, ‘Examining Gender and Race Bias in Two Hundred Sentiment Analysis Systems’ (2018) arXiv, abs/1805.04508.
[8] Sensitive information includes information or opinions about matters such as an individual’s political opinions, religious beliefs and sexual orientation.
[9] Clearview AI Inc v Australian Information Commissioner [2023] AATA 1069.
[10] ‘Surveillance capitalism’ is the concept of commodifying personal data for profit, a term conceived by Professor Shoshana Zuboff in the 2014 article, A Digital Declaration.
[11] We are yet to see whether the recent change of government in the UK will have any impact on its approach to AI.
[12] In Australia at present, if an individual wants to bring an action for breach of privacy, they must do so under another established cause of action (such as misleading or deceptive conduct or breach of confidence) or through a complaint to the Privacy Commissioner.
Getting lost in the changing landscape of AI regulatory requirements?
View our resources and videos developed by our experts to help you stay on top of the latest GenAI and tech developments.
- Our GenAI regulatory map will help you to understand and keep up with this fast moving regulatory and stakeholder landscape.
- This easy-to-use and regularly updated timeline will help you stay on top of important developments across key areas of tech-related regulation, including GenAI.