
Algorithmic Recommendations Can Finally Be Turned Off, China’s Provisions are a World First


In the wake of the publication of China’s first algorithm regulations, this article reflects on potential pitfalls and difficulties in implementing these groundbreaking policies.


On March 1, the Provisions on the Administration of Internet Information Service Algorithmic Recommendations, jointly issued by the Cyberspace Administration of China and three other departments, formally went into effect. Requirements such as the one to “provide users with a convenient option to turn off algorithmic recommendation services” are tailored to meet the actual needs of users and set a global regulatory precedent with regard to specific algorithms. They have therefore received widespread attention. This article analyzes the ins and outs of the relevant regulations and possible future developments.


That mobile phone apps can “read your mind” has long ceased to be a secret. Many people have experienced the following: having just mentioned in a chat with a friend that they wanted to buy something, moments later they’ll see that very product appear in an app; or they browse a few news stories on a certain topic, and suddenly some app is filled with responses on that same issue.


Many internet users have long been unhappy with, or even a bit scared of, this kind of situation.


More recently, a switch has “finally appeared after a chorus of requests.” On the “Settings” page of many frequently used apps you should now be able to find something like a “turn off personalized recommendations” option.


The author found the relevant switch on “Zhihu” and “Taobao” and personally tested it out.


Why are all the familiar apps quietly giving an option to turn off personalized recommendations? The reason is that the Provisions on the Administration of Internet Information Service Algorithmic Recommendations (hereinafter “Provisions”) announced on January 4 contain the following article:


Article 17 Algorithmic recommendation service providers shall provide users with an option of not having their personal characteristics targeted, or provide users with a convenient option to turn off algorithmic recommendation services. If a user chooses to turn off an algorithmic recommendation service, the algorithmic recommendation service provider shall immediately stop providing such a service.


These Provisions, which were jointly issued by the Cyberspace Administration of China, the Ministry of Industry and Information Technology, the Ministry of Public Security, and the State Administration for Market Regulation, officially went into effect on March 1.


To the best of the author’s knowledge, this is the first law in the world to impose specific constraints on algorithmic recommendation-related conduct.


Europe’s Digital Markets Act (DMA) also contains a stipulation on “not permit[ting] the exploitation of data advantages to show users targeted advertisements without obtaining explicit permission from users.” But the DMA was only passed by the European Parliament’s Internal Market Committee last November and negotiations with European governments have yet to begin. There is a way to go before it becomes law.


In China, however, the major internet enterprises have already taken action to implement the Provisions.


Potential Hazards of Algorithmic Recommendations


In his 2006 book Infotopia, Harvard University Professor Cass Sunstein introduced the notion of the “information cocoon” in human society. He argued that people’s informational demands are not comprehensive: they attend only to the information they choose and that makes them happy, so over time the information they encounter grows ever narrower. Like silkworms wrapped in their own silk, they eventually seal themselves inside an “information cocoon,” losing the opportunity to encounter and understand different viewpoints.


Algorithmic recommendation may actually reinforce the information cocoon effect: the more interested you are in something, or the more inclined you are toward a certain viewpoint, the more likely an algorithm is to recommend material about that thing or supporting that viewpoint, so that you constantly reinforce your own interests and inclinations. Moreover, algorithmic recommendations can also be used to purposefully steer populations, influencing public opinion and even political decisions. The U.S. mathematician Cathy O’Neil thus calls recommendation algorithms “weapons of math destruction” in her book of the same name. These “weapons of math destruction” have had a real-world impact many times in the past few years.


In 2016, consultants from the main social-networking platforms such as Facebook, Google, and Twitter worked side by side in a San Antonio office as part of Project Alamo, which supported Trump’s election campaign. They put roughly 90 million U.S. dollars into digital ads. Project Alamo used subtle algorithmic recommendation techniques to influence voters: when an internet user was identified as a “pivotal voter” (e.g., a swing voter in a swing county of a swing state), the social networks would direct suggestive content at that voter, influencing the election outcome at relatively little cost.


A few months before the U.S. general election, Cambridge Analytica in the United Kingdom used Facebook user data to manipulate the U.K.’s Brexit referendum, with the result that the pro-Brexit camp unexpectedly won—exactly as Donald Trump was unexpectedly elected.


Prior to their official use in influencing U.K. and U.S. politics, similar methods had already been tested in many developing countries. In 2010, in Trinidad and Tobago, a “Do So” movement originating on Facebook led large numbers of voters of African descent to refuse to vote, thereby aiding the ethnic Indian-dominated United National Congress in the general election. In 2015, some Nigerian users were shown short, bloody, violent anti-Muslim video clips on Facebook, whose purpose was to frighten voters and manipulate the election. Once algorithmic recommendations are abused, they truly can become a “weapon of math destruction.”


Even if not intentionally abused, algorithmic recommendations may implicitly contain social bias and discrimination. In October last year, Twitter’s recommendation algorithms were found to have “unintentionally amplified the dissemination of right-wing group content”: Tweets posted by elected right-wing officials were algorithmically amplified more than those of their left-wing counterparts; the influence of right-wing media was greater than that of left-wing media.


Earlier still, the search algorithm of the professional networking website LinkedIn (which can be seen as a form of recommendation algorithm: the “most suitable” content is recommended based on search keywords) was found to discriminate by sex, ranking male job seekers higher in search results. Google’s advertising platform AdSense was found to be racially biased: if a search keyword looked like a black person’s name, AdSense was more likely to recommend advertisements for criminal record checks.


Because algorithmic recommendations come with such risks of potential harm, some European and American researchers have long proposed subjecting recommendation algorithms to controls. The measures required by these Provisions, such as algorithmic mechanism auditing, science and technology (S&T) ethics reviews, and the granting of permission to users to turn off algorithmic recommendations, have all already been proposed and recommended outside of China. However, the international giants of the internet never put these suggestions into practice. They still commonly argue that “deep learning-based algorithms cannot be audited.” To help readers comprehend the importance of the Provisions, the author will concisely introduce the technical principles behind algorithmic recommendation.


Technical Principles of Algorithmic Recommendation


The various forms of algorithmic recommendation include the forms enumerated in the Provisions, namely “generative or synthetic, personalized pushing, ranked selection, search filter, and decision-making and scheduling.” All the current mainstream methods of implementation use machine learning, and the principles behind them involve predictions based on Bayesian statistics—which sounds deep and unfathomable, but is in fact easily understood through a simple example.


Suppose you roll a die that has never been used before. What do you think the probability is of getting a 6? Of course, absent any additional information, your prediction would be “1 in 6.” Then you roll the die 20 times in a row, and every roll comes up 6. Now what would you say the probability is of getting a 6 on the next roll? Classical probability theory holds that each roll of the die is an independent random event (the results of previous rolls do not affect future rolls), so your forecast should still be “1 in 6.” But obviously no normal person would think this way.


The information that “this die came up 6 on 20 straight rolls” would clearly affect future predictions (for example, it might indicate that the die is loaded), so you would predict a high probability of a 6 on the next roll. Simply put, Bayesian statistics “predicts events that will occur in the future based on events that have already occurred in the past.” All algorithmic recommendations make this kind of prediction:


Zhihu’s personalized recommendations predict what questions and answers users might like to see.


Baidu’s search filter predicts which search results users might be interested in.


Taobao’s ranked selection predicts which products users might purchase.
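To make the die example concrete, here is a minimal sketch of the Bayesian update involved. The article contains no code, and the prior pseudo-counts below are illustrative assumptions: the prior encodes the classical belief of 1 in 6, and observed rolls shift the estimate.

```python
from fractions import Fraction

def predicted_prob_of_six(sixes_seen, rolls_seen, prior_sixes=1, prior_others=5):
    """Beta-Binomial update: prior pseudo-counts (1 six, 5 non-sixes)
    encode the classical 1/6 belief; observed rolls shift the estimate."""
    alpha = prior_sixes + sixes_seen                  # evidence for "six"
    beta = prior_others + (rolls_seen - sixes_seen)   # evidence for "not six"
    return Fraction(alpha, alpha + beta)              # posterior mean

# With no rolls observed, the prediction is the classical 1 in 6.
print(predicted_prob_of_six(0, 0))     # 1/6
# After 20 straight sixes, a Bayesian (like any normal person) suspects
# a loaded die and predicts a 6 with much higher probability.
print(predicted_prob_of_six(20, 20))   # 21/26, about 0.81
```

The same mechanism, scaled up to thousands of features, is what drives the predictions listed above.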


The “events that have already occurred in the past” on which these predictions are based are a very broad, user-related dataset, which not only contains direct user actions such as “which answers the user saw/liked/bookmarked,” but also contains a large volume of the user’s own attribute information: age, sex, region, level of education, occupation, internet connection device, things they’ve purchased, remarks they’ve made, the size of home they live in, how many people are in their household, that they like Jeff Chang, that they don’t like Cai Xukun…All this information can be used to predict the user’s preferences.


Each piece of attribute information like this is also called a “feature.” An internet company generally possesses thousands or tens of thousands of pieces of feature information on a single ordinary user. Some of this feature information comes from the company’s own business, but more feature information comes from other platforms. Enterprises such as the three main operators [China Mobile, China Telecom, and China Unicom], Weibo, Tencent, Alibaba, and mobile phone manufacturers all share users’ personal feature information with other internet apps in the form of software development kits.


Given a specific prediction, some of these pieces of feature information will have higher relevance to the prediction while other features will be of less relevance. If it is possible to trace back from the prediction result to those features that had an important effect, we can say that this algorithm is “auditable.” For example, the principle of a linear regression, which is the simplest, most basic machine learning algorithm, is to assign a weight to each feature based on past events and then to predict future events based on these weights. From a linear regression prediction model, it is possible to directly see the weight of each feature in the prediction. Therefore, a linear regression is an algorithm that is especially easy to audit.

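As a sketch of what “auditable” means here (the toy data, feature names, and numbers are invented for illustration), a linear regression’s learned weights can be read off directly after fitting:

```python
import numpy as np

# Hypothetical toy data: each row is a user, the two columns are features
# ("minutes reading tech articles", "minutes reading sports articles"),
# and y is how many tech recommendations that user clicked.
X = np.array([[30.0, 5.0],
              [25.0, 2.0],
              [2.0, 40.0],
              [5.0, 35.0]])
y = np.array([10.0, 8.0, 1.0, 2.0])

# Fit the regression by ordinary least squares.
weights, *_ = np.linalg.lstsq(X, y, rcond=None)

# The model is fully auditable: each feature's contribution to a
# prediction is simply its value times its learned weight.
for name, w in zip(["tech minutes", "sports minutes"], weights):
    print(f"{name}: weight {w:.3f}")
```

Here an auditor can see at a glance that “tech minutes” dominates the prediction, which is exactly the traceability that deeper models lack.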

Of course, the simplest, most basic algorithm suffers from the problem of insufficient predictive ability. Figuratively speaking, it is not possible to squeeze out all the information implicitly contained in feature values just by using a simple linear regression and so prediction results might not be especially good. Hence, scientists and engineers have thought up many methods for squeezing information from feature values. One method is called “feature engineering,” which put plainly means deriving new feature values from known feature values. For example, assigning a new label of “considerable buying power” or “fad follower” to a user based on the user’s mobile phone model or shopping list would be a simple instance of feature engineering.

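A hypothetical sketch of that kind of feature engineering follows; the phone models, thresholds, and label names are all invented for illustration, not taken from any real system.

```python
# Assumed lookup table of high-end phone models.
PREMIUM_PHONES = {"iPhone 13 Pro", "Galaxy S22 Ultra"}

def derive_features(user):
    """Derive new feature values from known ones."""
    derived = {}
    # A new boolean feature derived from the phone-model feature.
    derived["high_buying_power"] = user["phone_model"] in PREMIUM_PHONES
    # A new numeric feature derived from the shopping-list feature.
    derived["recent_fashion_buys"] = sum(
        1 for item in user["purchases"] if item["category"] == "fashion")
    # A new label derived from the feature we just derived.
    derived["fad_follower"] = derived["recent_fashion_buys"] >= 3
    return derived

user = {"phone_model": "iPhone 13 Pro",
        "purchases": [{"category": "fashion"}] * 4}
print(derive_features(user))
```

Each derived value simply becomes one more input feature for the prediction model.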

Another way to squeeze feature values is to regard initial feature information as a “layer” of input and then to use various mathematical methods to convert the input layer into new information nodes and in this way form a multi-layer “network.” This process of conversion may be repeated, and the greater the number of conversion layers, the “deeper” the network is said to be—this is where the term “deep learning” comes from.

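The layered conversion can be sketched in a few lines. The weights below are random rather than learned (a real network would train them), and the layer widths are arbitrary; the point is only the repeated transformation of one layer of nodes into the next.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, n_out):
    """Convert one layer of nodes into a new layer of n_out nodes:
    a linear mix of the inputs pushed through a nonlinearity."""
    w = rng.standard_normal((n_out, x.shape[0]))  # untrained weights
    return np.tanh(w @ x)

features = rng.standard_normal(8)  # the initial input "layer" of raw features
h = features
for width in (16, 16, 4):          # three conversions -> a "deeper" network
    h = layer(h, width)

# The 4 final nodes are pure products of the math, with no real-world meaning.
print(h.shape)
```

Stacking more such conversions is, literally, what makes the network “deeper.”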

Although scientists often use “neuron” and “neural network” to analogize the results of these mathematical conversions, the information nodes resulting from these conversions often have almost no real world meaning, and are purely a product of mathematical tools. So there is a saying in the profession: deep learning is like alchemy (or, in China, “like concocting immortality pills”). You toss data into a neural network, and out come the results for reasons unknown to you. If the results are less than ideal, just add a few layers to the neural network.


Because deep learning is often cloaked in this alchemy-like aura of mystery, the engineers who use it often don’t know themselves why an algorithm works. Image recognition is a case in point: widely used published models such as the VGG19 architecture stack neural networks 19 layers deep. Google’s photo service (Google Photos) was exposed on several occasions as harboring racial bias, even labeling photos of black people as “gorillas.” Afterwards, Google was unable to locate the problem in the algorithm; all it could do was remove the “gorilla” label.


In spite of the warning offered by Google’s embarrassing failure, similar problems keep popping up in the products of all the internet giants. In 2020, some Facebook users watching a video featuring a black protagonist received a recommendation prompt asking whether they wished to “continue watching videos about primates.” In 2018, Joy Buolamwini, a researcher at the MIT Media Lab, found that the facial recognition algorithms of Microsoft, IBM, and Face++ had far higher gender-classification error rates for black faces than for white ones; moreover, the darker the skin, the lower the accuracy, with error rates for black women as high as 35%. Over-reliance on alchemy-style deep learning algorithms is why these internet giants take a lackadaisical attitude toward algorithm audits, and it also makes correcting the systemic discrimination implicit in their algorithms more difficult.


China’s Provisions: Significance and Concerns


It is precisely due to the industry’s reliance on algorithmic recommendations and deep learning technology that these Provisions appear to be especially important. The author believes that the announcement of the Provisions has, on the one hand, compelled internet enterprises to constrain their own behavior and use algorithmic recommendations for good purposes, i.e., upholding a mainstream value orientation and actively spreading positive energy, and not building information cocoons or inducing user addiction. On the other hand, it has compelled internet enterprises to strengthen the building of internal capabilities, i.e., establishing algorithm auditability and proactively selecting and optimizing recommendation algorithms that are comprehensible and auditable while avoiding “technology-only-ism” and over-reliance on alchemic-type recommendation algorithms.


However, given that these Provisions are a world first, the author still has some specific concerns about their implementation.


First, how to implement algorithmic mechanism audits and S&T ethics reviews may pose a new challenge for regulators. Although the Provisions require that “algorithmic recommendation service providers shall regularly audit, evaluate, and verify algorithmic mechanisms, models, data, and application outcomes,” there may be a lot of room for fuzziness over whether internet enterprises have really carried out those audits, evaluations, and verifications, and whether algorithm results meet requirements. After all, an algorithmic recommendation audit is not like an audit for illegal or harmful information, where the discovery of such content immediately signals a problem in the auditing process. The effects of algorithmic recommendations manifest statistically, over long periods and across a broad scope. How to determine whether an audit was truly implemented may itself be a technical puzzle.


Second, on explaining the provision of algorithmic recommendation services and giving users options to turn off personalized recommendations: although all the main internet enterprises have already built these functions, it is hard to say that they are informing users “in a conspicuous manner.” The author, an IT professional who was searching deliberately, still spent a fair amount of time locating the various places where the “turn off algorithmic recommendations” option was hidden in several major apps.


Of course, internet enterprises definitely hope to hide these functions where the majority of users cannot find them. After all, a function that the vast majority of users can’t find might as well be a non-existent function. If regulators are to prevent the right of “users to turn off algorithmic recommendation services” from becoming hollow words, shouldn’t they consider something along the lines of the [General Data Protection Regulation] (GDPR) and require that personalized recommendations be provided only after “express permission” from the user?


The GDPR requires that websites obtain users’ express permission before recording user information via cookies. It compels websites to solicit user permission in a truly conspicuous manner.


Lastly, though internet enterprises will not be able to forcibly provide personalized recommendations under the Provisions, they may still coerce users into turning on (or being unable to turn off) personalized recommendations by engaging in “slowdowns.”


To take Zhihu as an example, once one has selected the “turn off personalized recommendations” option in the privacy center, information cocoon-type recommended information will indeed stop, but there will also be a sudden drop in the volume of information that appears in the app. After turning off personalized recommendations, the author made a rough count and found that no more than three new items appeared daily on the Zhihu app’s “Select” page, while responses from many days earlier were still continuously appearing. On multiple occasions, two pieces of duplicate information appeared on the first page, yet content from the “Zhihu Hot List,” which gets the most traffic, was never pushed to the “Select” page. Is it possible that such a large company as Zhihu has no content to push to users after they turn off personalized recommendations? It seems likely that they are planning to use reduced content volume to present users with a combination of incentives and disincentives to keep them from turning off personalized recommendations. I believe that internet enterprises are capable of coming up with many more such slowdown methods. In implementing the Provisions, regulators will be faced with the new challenge of how they should discover and cope with these tricks.


These practical concerns notwithstanding, the Provisions have, after all, set a new precedent by regulating specific internet algorithms. Internet technology has long been regarded by the government and the public as a mysterious black box where only the externally visible results could be regulated. Capitalists and technologists were given too much space in which to play little games. I hope that the Provisions prove to be the first step in opening the black box and that, together with other future regulatory policies and implementation measures, they rip away the veil covering internet technology and expose the games hidden in the box to reasonable, regulatory sunlight. I also hope that the relevant regulatory authorities establish sufficient technological capabilities as soon as possible so as to truly put these regulatory measures into practice.



Cite This Page

熊节 (Xiong Jie). "Algorithmic Recommendations Can Finally Be Turned Off, China’s Provisions are a World First [算法推荐终于可关闭,中国规定开世界先河]". CSIS Interpret: China, original work published in Guancha [观察者网], March 2, 2022
