Meta pauses plan to train AI on user data after triggering regulatory pushback
云海听松
Posted on 2024-6-20 14:15:31
21st Century Business Herald reporter Xiao Xiao reports from Beijing
This week, Meta announced that it would suspend the use of data from EU and UK users to train AI and postpone the launch of its large model in Europe.
Regulators in Ireland, the UK, Norway and elsewhere have confirmed the suspension and said the company's move is a response to regulatory requirements. The Norwegian data protection agency stated that Meta has promised to suspend the use of posts and images on Facebook and Instagram to train large models. How long the delay will last is currently uncertain, and discussions are underway with regulatory agencies in other EU countries.
Meta's plan to collect user data began last month, when the platform notified European users that a new privacy policy would take effect by the end of June: the company would use public content on Facebook and Instagram to train its large model, including interactions, status updates, photos, and captions, but excluding private chat records and information from minors' accounts. The updated privacy policy sparked opposition, and the Austrian non-profit organization NOYB immediately filed complaints with regulators in 11 European countries, requesting that emergency procedures be initiated.
The controversy is not an isolated case. How to obtain user authorization for training AI on their data is a difficult problem for all internet companies. Companies must not only understand the compliance standards but also take into account users' growing sensitivity to privacy issues. Experts interviewed by 21st Century Business Herald said that citing the EU's "legitimate interests" clause to obtain user data may become increasingly common. However, China's Personal Information Protection Law currently has no directly equivalent provision, so domestic enterprises need to pay special attention to obtaining users' explicit consent.
The "legitimate interests" clause may become a familiar face
In the complaint against Meta, NOYB identified two compliance failures:
The first is that Meta's description of artificial intelligence is too broad and does not specify the purpose of collecting and processing user information. Meta's privacy policy uses only the term "artificial intelligence technology", which NOYB founder Max Schrems believes is equivalent to saying "we will use data in the database."
"Meta did not specify what it would use this data for, nor did it set any restrictions. Artificial intelligence technology may refer to a simple chatbot, highly aggressive personalized advertising, or even lethal drone weapons." Max Schrems explained.
The second is that users are opted in to data collection by default, and the process for refusing is complex. On Facebook, for example, users who want to refuse collection of their data must navigate through five levels of pages (Settings and Privacy, Privacy Center, Generative AI, More Information, and a page on how Meta uses data to train large models) to find an objection form at the end of the document. Only by actively filling out the form and having it approved by the company can users refuse data collection.
Meta argues that its large model needs to reflect the linguistic, geographic, and cultural diversity of the European people, so its collection of user data should fall under the "legitimate interests" provided for in the General Data Protection Regulation (GDPR) and does not require separate user consent.
Generally speaking, the GDPR treats the collection of personal information as unlawful by default, but the "legitimate interests" clause exempts some situations in which data collection is necessary and user consent is not required. Such lawful collection can serve personal, commercial, or public interests.
"The industry generally believes that the EU has strict restrictions on personal information processing, but in fact, it leaves some room for interpretation through the legitimate interests clause," said Wang Xinrui, a partner at Shihui Law Firm who has worked in data compliance for many years. Wang Xinrui told 21st Century Business Herald that the legitimate interests clause is complex and flexible and requires a series of tests. It can be described as a legal basis with considerable room for interpretation.
Meta had previously cited legitimate interests to defend collecting user data for personalized advertising. However, the European Court of Justice ultimately rejected that claim, and Max Schrems therefore believes that legitimate interests are equally difficult to apply to scraping data and using it to train AI. Wang Xinrui stated that for some emerging technology scenarios, other legal bases may be difficult to establish, while legitimate interests still leave some room for interpretation. That is why Meta is trying to cite the clause, and he estimates that "this clause will repeatedly appear in various AI related cases in the future."
It should be noted that, unlike the EU, China's Personal Information Protection Law does not directly include "legitimate interests" among its statutory exemptions. However, Wang Xinrui pointed out that some typical situations provided for in the EU's General Data Protection Regulation are also covered by other provisions in China.
Lawyer Cheng Nian from Zhejiang Kenting (Beijing) Law Firm told 21st Century Business Herald that the comparable provisions in China cover only limited situations. One is responding to public health emergencies or protecting natural persons in emergency situations. The other is actions kept confidential in accordance with the law. Examples include collecting data without user consent during an epidemic or during anti-terrorism investigations by public security agencies. Business operations usually do not fall within this scope.
User data becomes a sensitive point for the industry
"We are very disappointed." "This is a setback for European innovation and for competition in artificial intelligence development, and it further delays the benefits that artificial intelligence brings to the European people." Meta complained in its blog post that it was in fact following industry practice: Google and OpenAI have already used European user data to train AI, and "compared to peers, our data collection methods are more transparent."
However, that does not appear to be the case: caution toward user data has gradually become the industry consensus. For example, ChatGPT was the first to let users refuse the use of their personal data for training by turning off the chat history function, although this inevitably affects the quality of the large model's answers. On June 19, Adobe updated its terms of service to state explicitly that its software will not use users' local or cloud content to train generative AI models.
Last year, the domestic office software WPS attempted to add a new clause to its privacy policy: "We will use the document materials you voluntarily upload as the basic materials for AI training after desensitization treatment." When users discovered the clause, it triggered a collective boycott. WPS apologized to users and promised that user documents would not be used for AI training.
At present, the technology giants that openly collect user data to train AI include Google and X. To support Musk's AI company xAI, X updated its privacy policy in September last year, stating in Section 2.1: "We may use collected and publicly available information to help train our machine learning or artificial intelligence models." Last July, Google's privacy policy also added a new clause: "We may collect publicly available online information or information from other public sources to help train Google's artificial intelligence models."
However, Deng Zhisong, senior partner of Beijing Dacheng Law Firm, told 21st Century Business Herald at the time that Google had provided a detailed explanation of the scope and purpose of its collection and processing of users' personal information. Even measured against the stricter "notice and consent" rules of the EU GDPR, Google's approach was at least formally compliant.
NOYB also pointed out that Meta hopes to collect all public and non-public personal information since 2007, covering traces of user interactions on Facebook and Instagram, which differs from AI companies' usual approach of using information publicly available on the internet.
How can companies meet compliance requirements and develop technology while respecting user rights? Wang Xinrui emphasized to 21st Century Business Herald that domestic companies that want to collect user data to train AI need to comply with the "Interim Measures for the Management of Generative Artificial Intelligence Services", which clearly stipulate that where personal information is involved, companies should obtain the individual's consent or meet other conditions provided by law. In other words, special attention must be paid to whether users have been clearly informed and have given their consent before their personal information is collected and used. If consent is not obtained in advance, there must be another legal basis, such as a legal obligation or the public interest. Otherwise, there are corresponding compliance risks.
Cheng Nian added that personal information collected through users' use of a product requires explicit consent, and sensitive information additionally requires separate consent. It is also necessary to ensure that users can easily access, correct, and delete their personal information and withdraw their consent, in particular by giving them the option to refuse data collection for AI training, so that their rights to know and to choose are protected.
CandyLake.com is an information publishing platform and only provides information storage space services.
Disclaimer: The views expressed in this article are those of the author only, this article does not represent the position of CandyLake.com, and does not constitute advice, please treat with caution.