Posts tagged “Filtering Technology”

CleanFeed Canada

There is now an FAQ on CleanFeed Canada as well as information on the appeal process.


MSN’s search engine detects the HTTP “Accept-Language” header and then sets, via a cookie, your “market”. Currently, there are three Chinese options: zh-cn (China), zh-hk (Hong Kong) and zh-tw (Taiwan). Your “market” will be set depending on which one of these your browser sends to the server. If your browser sends the more generic “zh” without specifying a region, the market will default to zh-cn.

Google uses geolocation by IP address to redirect you to your localized Google (e.g. if your IP address is allocated to Canada you’ll be directed to the Canadian version), with the exception of China, in which case you are redirected to a Chinese-language version. MSN behaves differently: if your default setting is Chinese Simplified but you are not in China, you will still be redirected to the zh-cn version of the search engine. This is significant because the zh-cn version is the censored version for China. This means that people outside of China whose browsers are set to Chinese Simplified will receive, by default, the censored version of the search engine.

You can, of course, go to the settings page and manually specify your market.
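The market selection described above can be sketched in a few lines of Python. This is only an illustration of the observed behaviour, not MSN’s actual code: the function name `pick_market` is hypothetical, and honouring `q` values in the header is an assumption, since only the zh-cn / zh-hk / zh-tw / zh cases were observed.

```python
from typing import Optional

SUPPORTED_MARKETS = {"zh-cn", "zh-hk", "zh-tw"}
DEFAULT_CHINESE_MARKET = "zh-cn"  # observed fallback for a bare "zh"


def pick_market(accept_language: str) -> Optional[str]:
    """Return the Chinese market implied by an Accept-Language header, or None."""
    candidates = []
    for position, item in enumerate(accept_language.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        # Sort by descending quality, then by order of appearance.
        candidates.append((-quality, position, tag))

    for _, _, tag in sorted(candidates):
        if tag in SUPPORTED_MARKETS:
            return tag
        if tag == "zh":
            return DEFAULT_CHINESE_MARKET
    return None  # no Chinese preference; other markets are out of scope here
```

For example, `pick_market("zh")` returns `"zh-cn"`, which is the case that sends users outside China to the censored version.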

Still, this appears to be a problem because the English version seems to do a very poor job of indexing Chinese sites. I am not a Chinese speaker, so I would appreciate feedback on this. Also, are the HK and TW versions compatible (given the simplified vs. traditional difference and so forth)? Is it sufficient to expect Chinese Simplified users to use the HK or TW versions?

Internet Filtering in India

India is not new to Internet filtering. Back in 2004 India’s Ministry of Communications & Information Technology ordered ISPs to start blocking web sites. The target was a particular Yahoo! Group, but the ISPs blocked access to the IP address (see Why Block by IP?) of the domain, causing all Yahoo! Groups to be blocked and illustrating one of Internet filtering’s unintended consequences. India subsequently ordered the extremist HinduUnity site to be blocked as well (which caused additional “over-blocking”). There were variations in compliance, but large ISPs such as VSNL did comply.

So India’s new filtering is not surprising. Once again the Ministry ordered sites to be blocked, some of which are blogs hosted on Blogspot and Typepad. The ISPs blocked the IP addresses of the sites, causing all the blogs hosted on them to be blocked.
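The over-blocking effect is easy to see in a small Python sketch. The hosting table below is entirely hypothetical (example hostnames and documentation IP addresses); it only illustrates why blocking one blog by IP address takes down every other blog on the same shared address.

```python
# Hypothetical hosting table: several blogs share one IP address.
HOSTING = {
    "targeted-blog.example": "192.0.2.10",
    "cooking-blog.example": "192.0.2.10",
    "travel-blog.example": "192.0.2.10",
    "unrelated-site.example": "198.51.100.7",
}


def collateral_damage(targets, hosting):
    """Return hostnames blocked unintentionally when targets are blocked by IP."""
    blocked_ips = {hosting[name] for name in targets}
    blocked_hosts = {name for name, ip in hosting.items() if ip in blocked_ips}
    return sorted(blocked_hosts - set(targets))
```

Here `collateral_damage(["targeted-blog.example"], HOSTING)` lists the two unrelated blogs that share the targeted blog’s address.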

GFW Discussion

I previously commented on the Ignoring the “Great Firewall of China” paper, but 0day pointed me to a good critique of it. The critique makes several points on the technical side, including that circumventing keyword filtering by blocking RST packets would not work for “‘hard’ IP filtering”, that China could upgrade its system to account for other noted deficiencies, and that ignoring reset packets is detectable and has security concerns.

Ignoring the “Great Firewall of China”

Richard Clayton, Steven J. Murdoch, and Robert N. M. Watson have released a paper, “Ignoring the ‘Great Firewall of China’”, that looks at China’s keyword filtering. Previous ONI work discovered the way in which China uses TCP reset packets to terminate connections based on keywords, that after triggering the filtering mechanism further connections between the two hosts will also be blocked for a varying period of time, and that the filtering is bi-directional: it affects in-bound traffic to China as well as outbound traffic from China.

But my technical analysis of the packet level filtering was less than comprehensive, and this new research by our colleagues at Cambridge provides an amazing in-depth analysis of China’s keyword filtering at the packet level. The observed behaviours I previously reported have been explained in skillful detail, and this paper has also provided some new insights into the GFW:

1. China is likely using an Intrusion Detection System (IDS), possibly Cisco’s “Secure Intrusion Detection System”, for keyword filtering, separate from their routers.

2. It is possible to “ignore” China’s keyword filtering. If both endpoints in a connection ignore TCP reset packets, the data intended to be blocked by keyword will go through the GFW.

3. The way in which China filters by keyword can be exploited for Denial-of-Service attacks if one forges the source address and issues requests that contain banned keywords. In this way communication between two targeted endpoints can be blocked.

There are a few ways in which this research can be extended:

– Similar tests can be run from within China to see if the keyword filtering is entirely symmetrical.

– Pad the request so that the bad packet appears at different points in the connection. This may help identify if China is only filtering by keyword in the URL path or in the body content of a page. Perhaps only the first X number of bytes are inspected?

– Build a list of terms that trigger the blocking mechanism (I have noticed that domain names are often treated as “keywords”).

– While there may be some impact on circumvention technologies, the problem is that both sides of the connection must be blocking TCP reset packets. Also, since the GFW blocks key sites by IP address (it is not just a keyword blocking system), these sites would remain blocked.

– It would be great to investigate with such expertise and detail the way in which China blocks IP addresses (and to correlate IP address blocking with the use of domain names hosted on that IP being used as keywords).
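To make the mechanism concrete, here is a toy Python model of the behaviour described above: a keyword triggers injected resets, the pair of hosts is then blocked for a while in both directions, and the data still gets through if both endpoints ignore the resets. It is a simulation only and sends no packets. The class and function names are hypothetical, and the 60 second block period is an assumption (the observed period varies).

```python
class ToyKeywordFilter:
    """Toy model of keyword filtering by injected TCP resets (simulation only)."""

    def __init__(self, keywords, block_seconds=60):
        self.keywords = [k.lower() for k in keywords]
        self.block_seconds = block_seconds  # assumed value; the real period varies
        self.blocked_until = {}             # frozenset({a, b}) -> expiry time

    def inspect(self, src, dst, payload, now):
        """Return True if the filter would inject resets for this packet."""
        pair = frozenset((src, dst))        # direction-independent (bi-directional)
        if self.blocked_until.get(pair, 0) > now:
            return True                     # still inside the timed block
        if any(k in payload.lower() for k in self.keywords):
            self.blocked_until[pair] = now + self.block_seconds
            return True
        return False


def delivered(filt, src, dst, payload, now, endpoints_ignore_rst):
    """The filter only injects resets; the packet itself still passes through."""
    reset_injected = filt.inspect(src, dst, payload, now)
    return endpoints_ignore_rst or not reset_injected
```

The same model also shows the Denial-of-Service issue: because the block is keyed on the pair of addresses, anyone able to forge `src` can trigger a block between two hosts they do not control.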

Tom-Skype Filtering in China

UPDATE: See the 2008 report which shows that:

  • The full text chat messages of TOM-Skype users, along with Skype users who have
    communicated with TOM-Skype users, are regularly scanned for sensitive keywords, and
    if present, the resulting data are uploaded and stored on servers in China.
  • These text messages, along with millions of records containing personal information, are
    stored on insecure publicly-accessible web servers together with the encryption key required to
    decrypt the data.

Skype’s partner in China, Tom Online, has implemented filtering of Skype’s text chat for Chinese users. Skype is not being transparent about the filtering functionality that has been introduced. Here is my initial attempt at trying to figure out Tom-Skype’s filtering.

Tom-Skype can be downloaded from Tom Online, and I installed it in Chinese and English. I also installed the 2.5 beta version; all appeared to function the same. The tests below are from Tom-Skype 2.0 installed in English.

The first thing I noticed is that Tom-Skype is bundled with an executable called ContentFilter.exe. It is an application developed by Tom Online called Tom Word Review. It is digitally signed by Skype.

Tom’s ContentFilter.exe loads after one logs into Skype and runs in the background. It is visible in the process list.

After logging in to Skype several plain text connections are made to Tom’s web server, in addition to some to Skype’s server. Some are just to get the version number of Skype the user is running while others are for the content that appears in the Tom content tab — mine had to do with the FIFA World Cup :) . But there are two rather odd connections:

Connection 1

GET /agent/skypever.php?md5=nofile HTTP/1.1
Content-Type: text/html
Accept: text/html, */*
User-Agent: Mozilla/3.0 (compatible; Indy Library)

HTTP/1.1 200 OK
Date: Tue, 13 Jun 2006 17:39:14 GMT
Server: Apache
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html


This looks like it’s checking for a version number.

Connection 2

GET /agent/keyfile HTTP/1.1
Content-Type: text/html
Accept: text/html, */*
User-Agent: Mozilla/3.0 (compatible; Indy Library)

HTTP/1.1 200 OK
Date: Tue, 13 Jun 2006 17:39:15 GMT
Server: Apache
Last-Modified: Mon, 07 Nov 2005 11:02:50 GMT
ETag: "1a73b-8166-436f345a"
Accept-Ranges: bytes
Content-Length: 33126
Connection: close
Content-Type: text/plain

This connection downloads a file called “keyfile” into Skype’s installation directory. I assume it’s a keyword list file of some sort. It has 123 lines that look like this:


I have not been able to decode this. It looks like hex but does not convert nicely under any encoding I tried (utf-8, utf-16, gb18030, gb2312 etc…).
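For anyone who wants to repeat the attempt, this is roughly what trying those encodings looks like in Python. It is a minimal sketch that assumes each line of the file is a plain hex string; if the file is encrypted or otherwise transformed, every attempt will fail or produce nonsense. The function name is hypothetical.

```python
CANDIDATE_ENCODINGS = ("utf-8", "utf-16", "gb18030", "gb2312")


def try_decodings(hex_line, encodings=CANDIDATE_ENCODINGS):
    """Map each encoding to the decoded text, or None if decoding fails."""
    try:
        raw = bytes.fromhex(hex_line.strip())
    except ValueError:
        # Not valid hex at all, so nothing can be decoded.
        return {enc: None for enc in encodings}

    results = {}
    for enc in encodings:
        try:
            results[enc] = raw.decode(enc)
        except UnicodeDecodeError:
            results[enc] = None
    return results
```

A successful decode does not by itself mean the encoding is right: several of these codecs will happily turn arbitrary bytes into unrelated characters, so the output still has to be read by someone who knows the language.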

(By the way, when you uninstall Tom-Skype the “keyfile” is not removed from your computer.)

However, I could only trigger the filtering with the word “fuck”. I tried all the words from the QQ list plus a list of political words and phrases. I also tried some Mandarin slang and some Chinese sex-related words.

Here is what it looks like. I sent text chat from one Skype account using Tom-Skype on one computer to another with the same set-up. (I was able to make a Skype user name through Tom-Skype called “falun99”; I thought they may want to filter screen names, but they do not seem to.) When you receive any text (a word, a sentence or a paragraph) that contains a keyword, in this case “fuck”, the entire message is not displayed to the user using Tom-Skype.

The message sender, using Tom-Skype, can see the text, including the banned keyword. And if that message is sent to a normal Skype user, the receiver can also see it.

However, if a message with banned words is received by a Tom-Skype user (from a normal Skype user or a Tom-Skype user), the message will not be displayed at all.

  • Tom-Skype is bundled with ContentFilter.exe, which makes two connections to Tom Online’s web server; one appears to download a keyword file.
  • Tom-Skype message blocking is done on the client side while receiving messages; normal Skype users can receive messages from Tom-Skype users that contain banned keywords.
  • The total number of keywords appears to be low; so far only “fuck” has been found.
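The observed behaviour can be summarised as a receive-side check. The Python below is a sketch of what the tests suggest, not of ContentFilter.exe itself: the function name is hypothetical, and case-insensitive substring matching is an assumption that was not tested.

```python
def displayed_to_receiver(message, receiver_is_tom_skype, keywords):
    """Return the text the receiver sees, or None if the whole message is dropped."""
    if receiver_is_tom_skype and any(k.lower() in message.lower() for k in keywords):
        return None  # the entire message is suppressed, not just the keyword
    return message   # normal Skype clients display everything
```

Note that the sender’s client type does not appear in the function at all, which matches the finding that only the receiving Tom-Skype client filters.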

Microsoft responds with omissions

Amnesty International’s campaign urging Microsoft not to engage in human rights abuses has triggered a response from Microsoft. Microsoft claims it “has increased the ability of Chinese citizens to engage in free expression” and that Amnesty’s claims of Microsoft’s censorship are misleading. A first glance indicates that Microsoft may have a few points:

1. Microsoft says it has not signed the “Public Pledge of Self Regulation” for the Chinese Internet industry (Microsoft is not listed as a member of ISOC). Still, Microsoft does self-censor its blog and search services in China.

2. Microsoft says that its beta search engine in China “does not block searches for particular key words, including ‘democracy,’ ‘freedom,’ ‘human rights’”. Indeed, searches for these terms (in English and Chinese) do produce results. (The results do seem to be weighted in favour of content hosted in China, but I’ll leave it to others to investigate that further.)

However, MSN does, in fact, censor its search results. Rather than restrict what keywords a user can search for, MSN simply removes specific web sites from the results displayed to the user. Following Google’s example, MSN indicates that results have been removed. MSN provides a link to a page that explains their filtering policy, which states that sometimes, in accordance with local law, certain results will not be shown.

For example, using the “site:” modifier, which restricts results to a particular website, a search restricted to the BBC News website returns a page that indicates that, although there are millions of items, no results were found and that some results have been removed. MSN China has removed the BBC News website from its results set.
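The difference between blocking a query and removing sites from the results can be sketched as follows. This is an illustration of the behaviour seen from the outside, with hypothetical hostnames and function name; how MSN implements it internally is unknown.

```python
from urllib.parse import urlparse


def filter_results(results, removed_sites):
    """Drop results hosted on removed sites; report whether anything was removed."""
    kept = [url for url in results if urlparse(url).hostname not in removed_sites]
    removed_notice = len(kept) != len(results)
    return kept, removed_notice
```

A “site:” query is the extreme case: every matching result is on the removed site, so the list comes back empty while the removal notice is still shown.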

The reality is that Microsoft does censor its MSN China search engine.

3. Microsoft says that users of its MSN Spaces blogging service in China “are not prohibited from using the words ‘democracy,’ ‘freedom,’ or ‘human rights’ in blog titles or blog content.” But it admits that there are restricted terms when it comes to the “account name, space name, or space sub-title – or in photo captions.” Microsoft claims that the key words ‘democracy,’ ‘freedom,’ and ‘human rights’ are not on their restricted term list. Microsoft states that “MSN Spaces does not filter blog content in any way.”

Microsoft is choosing the terms used very carefully, ostensibly to obfuscate the fact that MSN Spaces does censor users. Note the distinction between blog titles and blog content. Blog content seems to refer to the “body” of a blog post, which does not appear to be filtered, but blog titles are in fact filtered. Although the specific words noted are not (or no longer) filtered, terms such as 天安门事件 (“Tiananmen massacre” in Chinese) are in fact filtered. If a blog post title contains such terms, the user receives a warning indicating that the post contains prohibited language and the blog entry is not posted.

MSN Spaces content is in fact censored, just not in the “body” of a blog post.

Microsoft appears to be trying to divert attention from their censorship practices by focusing on the specifics of their filtering system. Researchers are at a distinct disadvantage, as Microsoft keeps the exact list of censored terms secret and can modify the list at any time. In fact, Microsoft can claim that Amnesty International is inaccurate in stating that the words ‘democracy,’ ‘freedom,’ and ‘human rights’ (presumably they mean in Chinese) are censored by MSN only because MSN modified their original restricted term list.

Research conducted by Rebecca MacKinnon in June 2005 clearly shows that MSN Spaces prevented a blog titled 我爱言论自由人权和民主, which translates to “I love freedom of speech, human rights and democracy” from being created.

While the words identified by Amnesty International are not filtered for specific blog entries or in the MSN China search engine they were used as part of MSN Spaces’ filtering and Amnesty is rightly drawing attention to this. Microsoft, on the other hand, is using precision to deflect criticism and make it appear that they don’t censor their services at all.

This underscores the need for the anti-censorship community to be thorough in our research. Since these companies (and countries) can change how and what they filter at any time, they may use this to attempt to discredit critics. It is very important for free speech advocates to accurately identify companies that are complicit in censorship worldwide.

Microsoft claims that it “has increased the ability of Chinese citizens to engage in free expression” when in fact all they have done is introduce censored services of the kind that domestic Chinese firms already provide. Instead, Microsoft is, as Amnesty International states, “aiding repression, censorship, and violation of fundamental freedoms”. By introducing yet another censored service in China, to compete for market share with other censored services, Microsoft is normalizing censorship. Rather than being the exception, censorship is becoming the rule, and when the largest and most powerful technology companies on earth support it, it becomes increasingly difficult to fight against.

Amnesty Campaign and Censorship Map

Amnesty International is currently working with the OpenNet Initiative (ONI) to help raise awareness of internet censorship around the world. Amnesty International is launching a campaign to show that online or offline the human voice and human rights are impossible to repress.

The aim of the ONI is to document empirically patterns of Internet content filtering and surveillance worldwide behind national firewalls over an extended period of time. Its reports have documented the scope, scale and sophistication of numerous filtering regimes worldwide, and have helped verify the use of commercial filtering technologies that are used to underpin these regimes. The ONI’s flash map of global filtering shows the results of these investigations.

Who chooses what to filter? Part 2

Here is another recent case that illustrates the transparency and accountability issues that affect national level filtering.

The School District of Palm Beach County uses BlueCoat’s WebFilter to block content that BlueCoat classifies as “Gay/Lesbian”, including prominent advocacy groups, but not sites that denounce homosexuality. There are two issues here: 1) the decision to block Gay/Lesbian content and 2) the classification of web sites as Gay/Lesbian.

The Children’s Internet Protection Act (CIPA) requires schools and libraries to install filtering systems in order to receive E-Rate funding. The filtering systems are to block access to “visual depictions” that “(a) are obscene, (b) are child pornography, or (c) are harmful to minors” (from the FCC Fact Sheet).

The leap from pictures to entire web sites is one that I did not notice earlier, but it is a significant case of mission creep. It is even more significant in the context of the decision of what specific content categories to filter.

The decision to block the Gay/Lesbian content category, supplied by BlueCoat, seems to have been made by the system administrator based on “common sense”:

Shawn Brinkman, a systems specialist at the District, explained the decision to block gay and lesbian issues websites by citing concerns about the appropriateness of such websites for younger students.

“We have to make judgments for all our users, which include Pre-K users,” Brinkman said. “We don’t have the technology to disallow and allow for certain age groups.”

Brinkman said that the decision was one made on “common sense.”

“I think common sense says [these websites are not] appropriate for four- or five-year-olds,” he said, adding that these topics featured on those websites are ones parents would probably want to discuss with their children.

Now, there isn’t much “common sense” in blocking human rights sites. And it certainly extends beyond the CIPA requirements. It shows that although the system was put into place based on a narrow set of guidelines, with the express purpose of protecting children from content such as pornography, it has been extended without oversight and left to basically ad-hoc decisions based on the “common sense” of an administrator.

The second part of this concerns how BlueCoat classifies sites. Sites that are against homophobia are classified as “Gay/Lesbian” while homophobic sites are classified as “Health” and “Political/Activist Groups.” BlueCoat’s classification of sites into secret block lists also determines what sites the School District of Palm Beach County’s students can have access to.
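The two decisions are separate layers, which a short Python sketch makes visible. Everything in it is hypothetical: the hostnames, the classification table and the function name are illustrative, and the “Pornography” label is an assumed category name. The real vendor lists are secret, which is the point.

```python
# Layer 1: the vendor's classification (hypothetical entries; real lists are secret).
VENDOR_CATEGORIES = {
    "advocacy-group.example": {"Gay/Lesbian"},
    "opposing-site.example": {"Health", "Political/Activist Groups"},
    "adult-site.example": {"Pornography"},
}


def is_blocked(hostname, blocked_categories, categories=VENDOR_CATEGORIES):
    """A site is blocked if any vendor-assigned category is in the local policy."""
    return bool(categories.get(hostname, set()) & set(blocked_categories))
```

With a local policy of `{"Pornography", "Gay/Lesbian"}` (layer 2, the administrator’s choice), the advocacy site is blocked while the opposing site is not, even though the administrator never made a decision about either site individually.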

Exporting Censorship

Xeni Jardin has a nice piece in the NY Times about exporting censorship:

American technology companies are taking heat for helping China’s government police the Internet. But this controversy extends well beyond China and the so-called Internet Gang of Four: Google, Yahoo, Cisco and Microsoft. Just how many American companies are complicit hit home for me last month when readers of BoingBoing e-mailed us to say they had been suddenly denied access.

The cause was SmartFilter, a product from a Silicon Valley company, Secure Computing.

SmartFilter classified BoingBoing as “nudity”, a category which most filtering admins block. This affects many schools, libraries, corporate offices, and even entire countries. (In fact, ICE is classified as “hacking” and they won’t change it.)

Secure Computing refused to provide me with a list of the governments that use its filters. However, the OpenNet Initiative, a partnership between the University of Toronto, Cambridge University and Harvard Law School, has compiled data on how such products are used in foreign nations where censorship is easy because the governments control all Internet service providers.

The initiative found that SmartFilter has been used by government-controlled monopoly providers in Kuwait, Oman, Saudi Arabia, Sudan, Tunisia and the United Arab Emirates. It has also been used by state-controlled providers in Iran, even though American companies are banned from selling technology products there. (Secure Computing denies selling products or updates to Iran, which is probably using pirated versions.)

Internet Filtering in Yemen

The OpenNet Initiative has released Internet Filtering in Yemen 2004-2005, a country study that documents the degree and extent to which the Republic of Yemen controls the information environment in which its citizens live. Drawing from technical, legal, and political sources, ONI’s research finds that Yemen limits its citizens’ access to Internet content by using commercially available American filtering technology from Websense. The ISPs primarily target pornography, gambling, proxy servers, and gay/lesbian materials, and neither blocks political material. Click Here for the full report.

Internet Filtering in Syria

Elijah pointed me to a very interesting post on Internet censorship in Syria. In the post, Ayman Hourieh talks about the blocking of Wired as well as major email providers Hotmail and Yahoo.

In addition, a Syrian blog was blocked but access seems to have been restored.

For more information on internet censorship in Syria see the Human Rights Watch report.

Google Compare

To help understand how the results of different versions of Google differ, the OpenNet Initiative has assembled a tool that lets you simultaneously compare search results. The tool is accessible here.

Human Rights and the Internet

Testimony of Nart Villeneuve at the Congressional Human Rights Caucus Members’ Briefing: Human Rights and the Internet – The People’s Republic of China Wednesday, February 1, 2006.

Mr. Chairman and Members of the Caucus:

On behalf of the Citizen Lab, I would like to thank the Congressional Human Rights Caucus for inviting me to speak on the issue of Human Rights and the Internet. As Director of Technical Research for the Citizen Lab I have worked extensively over recent years on the OpenNet Initiative, a collaboration between the Munk Centre for International Studies at the University of Toronto, the Berkman Center for Internet & Society at Harvard Law School, and the Cambridge Security Programme at Cambridge University focused on the study of Internet filtering and surveillance worldwide.

Google Media

I joined the BBC’s “World Have Your Say” program the other day to talk about Google’s censorship practices. You can listen to the show here.

Also, I was on CBC’s “The Hour” talking about the COPA case and the US gov’t attempt to acquire some of Google’s search records.