Whether it’s Apple, Google, OpenAI or Meta, training a generative artificial intelligence requires giving it something to learn from. While Apple used online media (which stirred up a small controversy along the way), Meta drew on its own internal resources: public posts from Facebook and Instagram. It was a gigantic asset that Meta had and its main competitors did not.
Meta was smart in how it trained its AI
Last June, Mark Zuckerberg announced a major new initiative for Meta: mining data from two of its social networks (Facebook and Instagram) to train its generative AI systems. However, it turns out that this practice had already been underway for years, which indicates that Meta did not play fair and withheld information about its use of user data.
Since 2007, all public posts on Facebook and Instagram, including text, photos, and comments, have been used to feed Meta’s AI. Unlike ChatGPT, which relies on a smaller dataset, Meta draws on a vast amount of user information, far exceeding standard practices. This massive data collection, carried out without users’ permission, raises significant privacy concerns.
Melinda Claybaugh, Meta’s global chief privacy officer, initially denied that user posts had been used to train the AI. However, after further investigation, she admitted that the training data did include public posts. While Meta claims not to have used posts from the accounts of users under the age of 18, it has been shown that photos of children posted by their parents were included in the data collected.
Training generative AI is a complex process. It requires massive amounts of diverse data, advanced algorithms to interpret that data, and computing infrastructure capable of processing it at scale. Human oversight also remains crucial to supervise training and avoid bias or inappropriate content, while upholding ethics and data privacy.
Meta’s practices reveal an unprecedented use of user data, one that is not limited to plain text but also extends to photos and other visual content shared on its platforms. The issue of privacy is therefore more relevant than ever as Meta continues to develop increasingly advanced AI systems.