Meta Gains Approval To Train AI With UK User Posts
Meta can now use public posts from U.K. users in its AI training.
After pausing AI training on U.K. user posts back in July, Meta says that it has now gained approval, following negotiations with British authorities, to use public user posts in its AI training.
As per Meta:
“We will begin training for AI at Meta using public content shared by adults on Facebook and Instagram in the UK over the coming months. This means that our generative AI models will reflect British culture, history, and idiom, and that UK companies and institutions will be able to utilise the latest technology.”
Which is a fairly grandiose framing of how Meta’s using people’s data to train models in order to replicate human interaction.
Which is the main impetus here. In order to build AI models that can understand context and produce accurate responses, Meta, like every other AI development company, needs human interaction as input, so that its systems can develop an understanding of how people actually talk to each other, and refine their outputs accordingly.
So it’s less about reflecting British culture than understanding the varying use of language. But Meta’s trying to frame this in a more beneficial and appealing way, as it seeks to lessen resistance to the use of user data for AI training.
Meta’s been granted approval to use U.K. users’ public posts under legal provisions around “legitimate interests”, which ensures that it’s covered for such usage under U.K. law. Though the company is keen to note that it is not, as some have suggested, using private posts or DMs within this dataset.
“We do not use people’s private messages with friends and family to train for AI at Meta, and we do not use information from accounts of people in the UK under the age of 18. We’ll use public information – such as public posts and comments, or public photos and captions – from accounts of adult users on Instagram and Facebook to improve generative AI models for our AI at Meta features and experiences, including for people in the UK.”
As noted, Meta paused its AI training program in both the U.K. and Brazil back in July, due to concerns raised by the respective authorities in each region. According to Meta’s President of Global Affairs, Nick Clegg, Brazilian authorities have now also agreed to allow Meta to use public posts for AI training, which is another significant step for its evolving AI effort.
Though E.U. authorities are still weighing restrictions on Meta around the use of European user data.
Back in June, Meta was forced to add an opt-out for E.U. users who don’t want their posts used for AI training, via the “Right to Object” provision. E.U. authorities are still exploring the implications of using personal data for AI training, and how that meshes with the E.U.’s General Data Protection Regulation (GDPR).
Which has rankled Meta’s top brass no end.
As Clegg recently remarked in an interview:
“Given its sheer size, the European Union should do more to try and catch up with the adoption and development of new technologies in the U.S., and not confuse taking a lead on regulation with taking a lead on the technology.”
Essentially, Meta wants more freedom to be able to develop its AI tools by using all of the data at its disposal, without the regulatory shackles of the E.U.’s evolving rules.
But at the same time, users should have the right to decide how their content is utilized, or not, within these systems. And with people posting personal and family-related updates to Facebook, that’s even more relevant in this regard.
Again, Meta’s not training its systems on DMs. But even so, if, for example, you’re posting about the funeral of a family member on Facebook, you’re likely to do that publicly, in order to inform anyone who may want to pay their respects, and that could be the kind of thing that you may not feel comfortable feeding into an AI model.
Now, the chances of that appearing in a specific AI-generated response are not high. But still, it should be a choice, and thus far, the tech companies developing large language models have shown little regard for this element, with many of the biggest initial models essentially stealing data from Reddit, X, YouTube, and anywhere else they could take in human interaction to train their systems.
Really, in most respects, the development of AI systems has mirrored the initial growth of social media itself: building tools quickly, with a view to dominating the market, and with little consideration for the potential harms.
As such, a more cautious approach does make sense, and we should be considering the full implications before simply giving Meta, and others, the green light.
But essentially, if you don’t want your data being used, your best option is to switch your profiles to private.
Meta says that it will begin informing U.K. users about the change this week.