A recent article in Spectrum discusses Internet filtering in China and has some nice quotes from the OpenNet Initiative’s Derek Bambauer. The article is quite good, although the bit about China “using proxy servers to inspect URLs themselves for words that indicate banned topics” is not accurate in my opinion. I really like Seth’s quote:
“There’s a famous saying, ‘The Internet considers censorship to be damage, and routes around it.’ I say, what if censorship is in the router?” — Seth Finkelstein
This is a great quote, and I agree completely. In China, the censorship IS in the router. Nearly all modern routers come with the ability to configure Access Control Lists (ACLs). These are commonly used to combat Denial of Service (DoS) attacks, slow the spread of worms and viruses, block phishing sites, and block the addresses of known spammers. In addition to blocking IP addresses, routers can be configured to block specific strings in HTTP GET requests. For example, Cisco 12000 series routers, which were sold to China in 1998 and 2004, have this capability. From my experience, this is precisely how China is implementing keyword filtering.
Here is how Cisco describes the keyword/packet filtering capability:
Among the rich feature set, the Cisco 12000 Series ISE provides security against DoS attacks. Using the service engine’s classification and rate-limiting features, service providers can also control the amount of control plane information at any point in the network and prevent some DoS attacks. ISE technology allows prevention and detection of DoS attacks through edge policing functions including ACLs, extended ACLs, unicast Reverse Packet Filtering (RPF), and rate-limiting. The Cisco 12000 Series is unique in its ability to deploy up to 750,000 filters to the traffic at line-rate. This feature enables service providers to configure bidirectional packet filter classification using any IP or MPLS packet header information at Layer 2 or Layer 3. Traffic can be classified with such granularity that a service provider can capture and even stop unpredictable “trigger” packets.
That's right, up to 750,000 filters!
An applied use for this capability was stopping the spread of the Code Red worm in 2001. Here’s the GET request for one of the versions of Code Red:
To block Code Red traffic, routers can be configured to block any GET request that contains a keyword:
Router(config)#class-map match-any http-hacks
Router(config-cmap)#match protocol http url "*default.ida*"
Router(config-cmap)#match protocol http url "*cmd.exe*"
Router(config-cmap)#match protocol http url "*root.exe*"
Now any URL that contains “default.ida” will be blocked. China’s Computer Emergency Response Teams are aware of this technique. It’s just as easy to add match protocol http url "*falun*" or a domain "*voanews.com*", or to match the host with match protocol http host "*voanews.com*". I believe that the edge or core routers near the international gateway connection employ this keyword filtering technique as well as blocking by IP address.
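To make the wildcard matching concrete, here is a minimal Python sketch of the idea. It is not router code: it uses Python's fnmatch, which only approximates how the router evaluates these patterns, and the function name is_blocked is made up for illustration. The patterns are the ones from the class-map above plus the hypothetical additions mentioned in the text.

```python
from fnmatch import fnmatchcase

# Patterns taken from the class-map above, plus the hypothetical
# "falun" and "voanews.com" additions discussed in the text.
# Assumption: fnmatch-style "*" wildcards approximate the router's matching.
URL_PATTERNS = ["*default.ida*", "*cmd.exe*", "*root.exe*", "*falun*", "*voanews.com*"]
HOST_PATTERNS = ["*voanews.com*"]


def is_blocked(host: str, url: str) -> bool:
    """Return True if the Host header or the URL matches any pattern."""
    url_hit = any(fnmatchcase(url, pattern) for pattern in URL_PATTERNS)
    host_hit = any(fnmatchcase(host, pattern) for pattern in HOST_PATTERNS)
    return url_hit or host_hit


if __name__ == "__main__":
    # A keyword in the URL path is enough, even on an unrelated host.
    print(is_blocked("example.com", "/mirror/www.voanews.com/index.html"))
    print(is_blocked("example.com", "/index.html"))
```

Note that the second pattern list matches on the host alone, so a request for any path on a matching host would be caught, while the URL patterns catch the keyword wherever it appears in the path.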
Here is a traceroute to HRIC:
* Reverse DNS for 184.108.40.206 is (incorrectly) hrchina.org, forward DNS is hrichina.org.
The regional routers properly route the request; however, when it hits the edge router the packets are lost. In this example, the blocking is occurring by IP address: no 3-way TCP handshake is completed, no response is received from the server, and the packets are discarded by the router. Here is an Ethereal log from a computer in China trying to request
*Source IP removed
In cases where the blocking occurs because of a keyword, the TCP handshake completes, and when the GET request or the Host: header is passed through, the connection is terminated with an RST packet. Here’s a request to a non-blocked domain and IP that contains the domain voanews.com as a string in the URL path; this specific request is blocked.
*IPs removed, but you get the idea :)
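The two failure modes described above (no handshake at all versus a reset after the request is sent) can be told apart from the client side. The following is a rough Python sketch under my own assumptions: the function name classify_probe and its result labels are invented for illustration, and a timeout or reset can of course have causes other than filtering.

```python
import socket


def classify_probe(ip: str, port: int, request: bytes, timeout: float = 5.0) -> str:
    """Classify one HTTP probe by where, if anywhere, the connection fails."""
    try:
        sock = socket.create_connection((ip, port), timeout=timeout)
    except socket.timeout:
        # No SYN/ACK came back: consistent with packets dropped by IP address.
        return "no-handshake"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except OSError:
        return "error"

    with sock:
        try:
            sock.sendall(request)
            data = sock.recv(4096)
        except ConnectionResetError:
            # Handshake completed, then an RST arrived after the request:
            # consistent with the keyword-triggered behaviour described above.
            return "reset-after-request"
        except socket.timeout:
            return "no-response"
        except OSError:
            return "error"
    return "response" if data else "closed"
```

A probe would pass a raw request such as b"GET /some/path HTTP/1.0\r\nHost: example.com\r\n\r\n" (a placeholder, not a real test target) and compare the result for a path with and without a suspected keyword.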
When a “block” occurs, the connection is disrupted. The TCP connection is terminated with an RST packet, and this almost always creates a ZeroWindow condition that lasts until the host advertises a non-zero window size. When the blocking is triggered by a “keyword” in the URL (the host is otherwise accessible), the effect is that the requesting IP (the user) cannot connect to the host (the blocked website) until the host advertises a non-zero window size. This is what is generally referred to as being “banned” or being in the “penalty box”. People unfortunately suggest that this applies to the user’s whole Internet access, but it does not; it only applies to connections to the specific IP that was subject to blocking. When a single domain has multiple IPs, Google for example, this can lead to weird behaviour where the connection to Google seems fine (because the connection was to a different IP) but later searching is disrupted (a connection is issued to the original “blocked” IP).
As mentioned in the Cisco documentation about the 12000 series router, the filtering can be employed bidirectionally. You can experiment with this on inbound connections to websites hosted in China.
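Because the “penalty box” applies per destination IP, it helps to check each address of a multi-IP domain separately rather than trusting a single connection. Here is a small Python sketch of that idea; the helper names (resolve_ipv4, can_connect, check_each_address) are my own, and a failed connection here only shows that the address is unreachable right now, not why.

```python
import socket
from typing import Callable, Dict, List


def resolve_ipv4(hostname: str) -> List[str]:
    """Return the distinct IPv4 addresses the resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    addresses: List[str] = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses


def can_connect(ip: str, port: int = 80, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to ip:port completes within timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_each_address(
    hostname: str,
    port: int = 80,
    probe: Callable[[str, int], bool] = can_connect,
) -> Dict[str, bool]:
    """Probe every resolved address separately and report per-IP results."""
    return {ip: probe(ip, port) for ip in resolve_ipv4(hostname)}
```

If one address of a domain reports False while the others report True shortly after a keyword-triggered reset, that would fit the per-IP behaviour described above.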