Posts tagged “Filtering Technology”

Tor Website blocked at My Hotel




My hotel uses OpenDNS to block access to the Tor website. Google Translate is also blocked. They are categorized as “Proxy/anonymizer”. This is one of the most annoying things about filtering. I just wanted to quickly translate some text from Russian to English and then read the Tor blog and ….


Yes, in order to block the Tor Blog, which uses HTTPS, they are MITM’ing SSL. (If you accept the bad certificate, the Tor Blog is blocked.) It doesn’t look like they are MITM’ing *all* SSL but just connections to selected IP addresses.
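One way to notice this kind of interception without clicking through the warning is to compare the certificate the network presents against a fingerprint recorded earlier on a network you trust. The following is only a minimal sketch under that assumption; the helper names are hypothetical and the trusted fingerprint has to come from somewhere else.

```python
import hashlib
import ssl


def sha256_fingerprint(der_cert: bytes) -> str:
    """Return the SHA-256 fingerprint of a DER-encoded certificate as hex."""
    return hashlib.sha256(der_cert).hexdigest()


def fetch_der_cert(host: str, port: int = 443) -> bytes:
    """Fetch whatever certificate the network presents, without validating it."""
    pem = ssl.get_server_certificate((host, port))
    return ssl.PEM_cert_to_DER_cert(pem)


def looks_intercepted(observed_der: bytes, trusted_fingerprint: str) -> bool:
    """True when the presented certificate differs from the recorded one."""
    return sha256_fingerprint(observed_der) != trusted_fingerprint.lower()
```

A mismatch is not proof of a MITM (sites rotate certificates), but a mismatch that only appears on one network is a strong hint.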

It’s funny, because I often recommend OpenDNS to people in order to avoid filtering, but OpenDNS also has a filtering service.

Keyword Lists



A while back I put together various lists of keywords that have been found to be censored in some way in China. I noticed that they’ve been floating around the Net so here’s a post explaining where each of the lists came from.

  • badwords.txt – This is a list that was found on 163.com, a popular Chinese portal. It is unclear to me what the exact purpose of the list is.
    Date: November 6, 2008
    Source: http://sports.163.com/special/00051DT9/badwords.txt
  • banword.txt – This is a list that I found on TOM Online’s Skype servers. This is not the list used in Tom-Skype (the actual Skype client), but appears to be part of another product, possibly “web chat” software of some kind.
    Date: September 17, 2008
    Source: http://tcc.skype.tom.com/
  • keyword.txt – This is a keyword list from a blog provider in China.
    Date: March 18, 2005
    Source: Blog Provider in China
  • condopper.txt – This is the list of keywords found to be censored at the “gateway” level by the Concept Doppler project.
    Date: June 18, 2008
    Source: http://www.cs.unm.edu/~crandall/cd/GETRequestBlocked18June.html
  • qqdll.txt – This is the list of keywords found in QQ (Program Files\Tencent\QQGame\COMToolKit.dll), a popular Chinese instant messaging program.
    Date: July 31, 2004
    Source: http://bbs.omnitalk.org/arts/messages/3824.html

Alternative Explanation Redux



livejournal.com is now accessible in Kazakhstan and Kyrgyzstan. Why? Because it appears that Livejournal was not actually blocked by ISPs in those countries in the first place. Instead, it appears that the Sixapart network, on which Livejournal was formerly hosted, firewalled requests from IP addresses in those countries.

Livejournal was reported as blocked by ISPs in Kazakhstan after bloggers noted that the site had become inaccessible. There was speculation that the blocking was politically motivated and linked to the Livejournal blog of the Kazakh President’s former son-in-law, who is very critical of the government.

The Kazakh ISPs denied that they had blocked access to it.

“We do not block access in Kazakhstan to any internet resource, including this portal. As a profitable company, our primary concern is to have our subscribers provided with Internet services to the fullest extent,” head of Kazakhtelecom’s PR department Balzhan Ilbisinova told Interfax…

“We have found out that Internet users in Kyrgyzstan and Uzbekistan also do not have access to this resource. Therefore, I think the lack of access may be attributed to technical problems at LiveJournal’s end,” Ilbisinova indicated.

Last year I wrote about the case of dailymotion.com being temporarily blocked in Tunisia as a result of a mis-categorization by SmartFilter. I suggested that “there are often mundane, alternative explanations” for blocking, let alone inaccessibility.

Here is a traceroute to livejournal from KG. It is interesting because it passes through KZ (other traceroutes from the same ISP do not pass through KZ but display the same behavior), but even more so because the last responding hop is not in KZ or KG but is the first hop on Sixapart’s network. That is, the traceroute suggests the problem is on Livejournal/Sixapart’s end.

Tracing route to livejournal.com [204.9.177.18]
over a maximum of 30 hops:

1 237 ms 226 ms 226 ms *.elcat.kg [212.42.*.*]
2 227 ms 226 ms 226 ms *.elcat.kg [212.42.*.*]
3 226 ms 227 ms 226 ms 213.145.131.145
4 229 ms 235 ms 229 ms 92.46.59.161
5 246 ms 246 ms 245 ms alma-core-l2-6.online.kz [92.47.151.157]
6 246 ms 246 ms 245 ms alma-core-l1-6.online.kz [92.47.145.17]
7 246 ms 246 ms 245 ms asta-core-l1-1.online.kz [92.47.145.10]
8 246 ms 246 ms 245 ms asta-core-l2-1-2.online.kz [92.47.145.42]
9 246 ms * 246 ms asta-gate-1.online.kz [92.47.151.166]
10 258 ms 258 ms 258 ms clk15.transtelecom.net [217.150.58.70]
11 350 ms 351 ms 351 ms xe-3-3.r01.londen05.uk.bb.gin.ntt.net [83.231.146.85]
12 667 ms 529 ms 544 ms xe-3-2.r01.londen03.uk.bb.gin.ntt.net [129.250.2.72]
13 352 ms 351 ms 351 ms xe-2-3-0.r22.londen03.uk.bb.gin.ntt.net [129.250.2.65]
14 350 ms 350 ms 351 ms ae-0.r23.londen03.uk.bb.gin.ntt.net [129.250.4.86]
15 358 ms 358 ms 358 ms p64-2-0-0.r22.amstnl02.nl.bb.gin.ntt.net [129.250.4.105]
16 358 ms 360 ms 364 ms ae-1.r23.amstnl02.nl.bb.gin.ntt.net [129.250.4.222]
17 443 ms 440 ms 436 ms as-0.r20.asbnva01.us.bb.gin.ntt.net [129.250.5.46]
18 437 ms 441 ms 441 ms ae-0.r20.asbnva02.us.bb.gin.ntt.net [129.250.2.61]
19 478 ms * 482 ms as-1.r20.dllstx09.us.bb.gin.ntt.net [129.250.3.42]
20 483 ms 482 ms 482 ms ae-0.r21.dllstx09.us.bb.gin.ntt.net [129.250.2.59]
21 512 ms 513 ms 517 ms as-3.r21.snjsca04.us.bb.gin.ntt.net [129.250.4.25]
22 528 ms 524 ms 525 ms ae-0.r20.plalca01.us.bb.gin.ntt.net [129.250.4.118]
23 523 ms 523 ms 520 ms ae-0.r21.plalca01.us.bb.gin.ntt.net [129.250.5.118]
24 520 ms 523 ms 523 ms xe-3-4.r03.plalca01.us.bb.gin.ntt.net [129.250.4.246]
25 529 ms 523 ms 523 ms 140.174.28.110
26 526 ms 525 ms 525 ms v102-sf-core1.sixapart.com [204.9.176.19]
27 * * * Request timed out.
28 * * * Request timed out.
29 * * * Request timed out.
30 * * * Request timed out.

This behavior matches many traceroutes to livejournal from KZ that I have seen posted on forums and blogs. (I don’t have direct access to KZ myself.)

[root@localhost ~]# traceroute livejournal.com
traceroute to livejournal.com (204.9.177.18), 30 hops max, 40 byte packets
1 192.168.0.1 (192.168.0.1) 5.056 ms 4.973 ms 6.254 ms
2 92.46.31.32 (92.46.31.32) 37.094 ms * *
3 92.46.31.9 (92.46.31.9) 39.176 ms 42.572 ms 42.705 ms
4 alma-core-l2-6.online.kz (92.47.150.5) 45.835 ms 46.633 ms 49.597 ms
5 alma-gate-6-2.online.kz (92.47.151.158) 49.578 ms 51.314 ms 51.293 ms
6 62.105.145.81 (62.105.145.81) 120.029 ms 81.211.8.53 (81.211.8.53) 119.194 ms 62.105.145.81 (62.105.145.81) 85.843 ms
7 cat01.Frankfurt.gldn.net (194.186.157.138) 150.126 ms 152.341 ms 141.199 ms
8 TenGigabitEthernet7-4.ar1.FRA4.gblx.net (64.208.222.201) 203.801 ms 203.788 ms 206.627 ms
9 te7-4-10G.ar3.FRA3.gblx.net (67.17.111.178) 348.909 ms 350.545 ms 350.517 ms
10 ge-6-11.car2.Frankfurt1.Level3.net (195.122.136.245) 187.929 ms 201.878 ms 191.595 ms
11 ae-32-56.ebr2.Frankfurt1.Level3.net (4.68.118.190) 189.552 ms ae-32-52.ebr2.Frankfurt1.Level3.net (4.68.118.62) 215.396 ms ae-32-54.ebr2.Frankfurt1.Level3.net (4.68.118.126) 148.874 ms
12 ae-2.ebr1.Dusseldorf1.Level3.net (4.69.132.137) 195.665 ms 187.249 ms 190.874 ms
13 * * *
14 ae-2.ebr1.Amsterdam1.Level3.net (4.69.133.89) 217.906 ms 215.069 ms 211.648 ms
15 ae-1-100.ebr2.Amsterdam1.Level3.net (4.69.133.86) 203.764 ms 205.056 ms 214.713 ms
16 ae-2.ebr2.London1.Level3.net (4.69.132.133) 160.643 ms 173.771 ms 174.483 ms
17 ae-42.ebr1.NewYork1.Level3.net (4.69.137.70) 235.561 ms ae-43.ebr1.NewYork1.Level3.net (4.69.137.74) 242.350 ms ae-44.ebr1.NewYork1.Level3.net (4.69.137.78) 242.053 ms
18 ae-61-61.csw1.NewYork1.Level3.net (4.69.134.66) 243.298 ms ae-71-71.csw2.NewYork1.Level3.net (4.69.134.70) 254.401 ms ae-81-81.csw3.NewYork1.Level3.net (4.69.134.74) 231.585 ms
19 ae-94-94.ebr4.NewYork1.Level3.net (4.69.134.125) 233.414 ms ae-64-64.ebr4.NewYork1.Level3.net (4.69.134.113) 224.338 ms ae-74-74.ebr4.NewYork1.Level3.net (4.69.134.117) 243.520 ms
20 ae-2.ebr4.SanJose1.Level3.net (4.69.135.185) 304.158 ms 312.587 ms 314.754 ms
21 ae-74-74.csw2.SanJose1.Level3.net (4.69.134.246) 301.834 ms ae-84-84.csw3.SanJose1.Level3.net (4.69.134.250) 306.405 ms ae-94-94.csw4.SanJose1.Level3.net (4.69.134.254) 297.681 ms
22 ae-62-62.ebr2.SanJose1.Level3.net (4.69.134.209) 300.430 ms ae-82-82.ebr2.SanJose1.Level3.net (4.69.134.217) 315.784 ms ae-92-92.ebr2.SanJose1.Level3.net (4.69.134.221) 300.655 ms
23 ae-5-5.car1.Oakland1.Level3.net (4.69.134.37) 306.771 ms 306.578 ms 294.600 ms
24 SIX-APART-L.car1.Oakland1.Level3.net (4.71.200.18) 306.504 ms 294.604 ms 295.527 ms
25 v102-oak-core2.sixapart.com (204.9.176.82) 310.375 ms 321.391 ms 316.173 ms
26 * * *
27 * * *
28 * * *
29 * * *
30 * * *

In both cases the last hop is on Sixapart’s network.
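Reading the last responding hop off a long traceroute by eye is error-prone, so here is a minimal sketch of doing it mechanically. It assumes each hop line starts with the hop number and treats any line containing a full dotted-quad address as a response; the function name is my own, and masked addresses such as 212.42.*.* are deliberately not counted.

```python
import re

# A complete dotted-quad IPv4 address (latencies like "37.094" do not match).
IP_RE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")
# A hop line: leading hop number followed by the rest of the line.
HOP_RE = re.compile(r"^\s*(\d+)\s+(.*)$")


def last_responding_hop(lines):
    """Return (hop_number, text) for the last hop that shows an IP address."""
    last = None
    for line in lines:
        match = HOP_RE.match(line)
        if not match:
            continue  # header lines such as "Tracing route to ..."
        hop, rest = int(match.group(1)), match.group(2)
        if IP_RE.search(rest):
            last = (hop, rest)
    return last


sample = [
    "Tracing route to livejournal.com [204.9.177.18]",
    "25 529 ms 523 ms 523 ms 140.174.28.110",
    "26 526 ms 525 ms 525 ms v102-sf-core1.sixapart.com [204.9.176.19]",
    "27 * * * Request timed out.",
]
hop, text = last_responding_hop(sample)
print(hop, "sixapart.com" in text)
```

The same function works on the Unix-style output, since those lines also begin with the hop number.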

On November 18, 2008, Livejournal moved off Sixapart’s network and is now accessible in KG and KZ. If the KZ ISPs had been blocking it, the new IP addresses would be accessible only until the ISPs updated their blocking, but so far this has not occurred. Since the traceroutes clearly show that packets passed through KG and KZ to Sixapart’s network, my sense is that some network admin at Sixapart firewalled some IP addresses (or ranges of IPs) that corresponded to ISPs in KG and KZ, perhaps due to “bad” behaviour, such as scans, originating from those IPs. It therefore appears that the KZ and KG ISPs had nothing to do with the inaccessibility of Livejournal in those countries.

In any case I’m glad it is now accessible and hope it remains that way.

DNS and the GFW



The ability of the GFW to send RST packets in an attempt to terminate a connection between a source IP and a destination IP, based on keywords appearing in packets (keywords in GET requests and possibly the HTML responses), has been documented in http://www.cl.cam.ac.uk/~rnc1/ignoring.pdf and http://www.cs.unm.edu/~crandall/concept_doppler_ccs07.pdf. China also employs a similar system to interfere with DNS. If a DNS request to resolve a hostname is sent to an IP in China, an intermediary will respond with a DNS response containing an incorrect IP. This is not totally new; it has been documented from inside China already.

I start with a “UDP Traceroute” (DNS packets with no qname, sent with incrementing TTLs) in order to find the first hop inside China. The IP address contained in the ICMP response is checked in Team Cymru’s IP lookup service to find the AS, country and network name.

1|192.168.2.1|time-exceeded  NA
2|64.230.*.*|time-exceeded CA NA
3|64.230.*.*|time-exceeded CA NA
4|64.230.*.*|time-exceeded CA NA
5|64.230.*.*|time-exceeded CA NA
6|64.230.147.14|time-exceeded CA NA
7|206.108.103.138|time-exceeded CA NA
8|160.81.109.193|time-exceeded US SPRINTLINK - Sprint
9|144.232.10.19|time-exceeded US SPRINTLINK - Sprint
10|144.232.8.169|time-exceeded US SPRINTLINK - Sprint
11|144.232.9.224|time-exceeded US SPRINTLINK - Sprint
12|144.232.9.32|time-exceeded US SPRINTLINK - Sprint
13|144.232.2.171|time-exceeded US SPRINTLINK - Sprint
14|144.223.148.2|time-exceeded US SPRINTLINK - Sprint
15|219.158.4.193|time-exceeded CN CHINA169-BACKBONE CNCGROUP China169 Backbone

For me, the first CN hop on the way to the IP address 202.165.102.247 (www.yahoo.cn) is hop 15, whose IP is 219.158.4.193 (CHINA169-BACKBONE CNCGROUP China169 Backbone). So I send a DNS request for “www.citizenlab.org” to 202.165.102.247 (which is not a DNS server) with a TTL of 15.

###[ IP ]###
  version   = 4
  ihl       = 0
  tos       = 0x0
  len       = 0
  id        = 1
  flags     = 
  frag      = 0
  ttl       = 15
  proto     = udp
  chksum    = 0x0
  src       = 192.168.2.11
  dst       = 202.165.102.247
  options   = ''
###[ UDP ]###
     sport     = domain
     dport     = domain
     len       = 0
     chksum    = 0x0
###[ DNS ]###
        id        = 0
        qr        = 0
        opcode    = QUERY
        aa        = 0
        tc        = 0
        rd        = 1
        ra        = 0
        z         = 0
        rcode     = ok
        qdcount   = 0
        ancount   = 0
        nscount   = 0
        arcount   = 0
        \qd        \
         |###[ DNS Question Record ]###
         |  qname     = 'www.citizenlab.org'
         |  qtype     = A
         |  qclass    = IN
        an        = 0
        ns        = 0
        ar        = 0
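For readers without a packet-crafting tool, the probe above can be approximated with the standard library. This is a sketch, not the script used for the test: it builds the same kind of A-record query by hand and sends it with a limited IP TTL. Unlike the packet shown, it uses an ephemeral source port, and seeing the ICMP time-exceeded reply or the injected answers still requires a sniffer or a raw socket.

```python
import socket
import struct


def build_dns_query(qname: str, query_id: int = 0) -> bytes:
    """Build a minimal DNS A-record query with the RD flag set."""
    # Header: id, flags (0x0100 = recursion desired), qdcount=1, others 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in qname.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def send_with_ttl(payload: bytes, dst_ip: str, ttl: int, dport: int = 53) -> None:
    """Send payload over UDP with a limited TTL so it expires at a chosen hop."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
        sock.sendto(payload, (dst_ip, dport))


query = build_dns_query("www.citizenlab.org")
print(len(query))
```

For www.citizenlab.org the payload is 36 bytes; adding the 8-byte UDP header gives 44, which agrees with the UDP length quoted back in the ICMP response in this post.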

The ICMP response comes back from hop 15:

###[ IP ]###
  version   = 4L
  ihl       = 5L
  tos       = 0x0
  len       = 56
  id        = 5984
  flags     = 
  frag      = 0L
  ttl       = 241
  proto     = icmp
  chksum    = 0xf52
  src       = 219.158.4.193
  dst       = 192.168.2.11
  options   = ''
###[ ICMP ]###
     type      = time-exceeded
     code      = 0
     chksum    = 0xc2d7
     id        = 0xeacf
     seq       = 0x3af8
###[ IP in ICMP ]###
        version   = 4L
        ihl       = 5L
        tos       = 0x0
        len       = 64
        id        = 1
        flags     = 
        frag      = 0L
        ttl       = 1
        proto     = udp
        chksum    = 0xc55c
        src       = 192.168.2.11
        dst       = 202.165.102.247
        options   = ''
###[ UDP in ICMP ]###
           sport     = domain
           dport     = domain
           len       = 44
           chksum    = 0xbca

While this is occurring I also sniff the wire to see if other packets are being sent my way, and they are. Four bad DNS responses were sent my way claiming to be from 202.165.102.247.

###[ IP ]###
     version   = 4L
     ihl       = 5L
     tos       = 0x10
     len       = 98
     id        = 45372
     flags     = 
     frag      = 0L
     ttl       = 45
     proto     = udp
     chksum    = 0xe7ee
     src       = 202.165.102.247
     dst       = 192.168.2.11
     options   = ''
###[ UDP ]###
        sport     = domain
        dport     = domain
        len       = 78
        chksum    = 0xe286
###[ DNS ]###
           id        = 0
           qr        = 1L
           opcode    = QUERY
           aa        = 1L
           tc        = 0L
           rd        = 1L
           ra        = 1L
           z         = 0L
           rcode     = ok
           qdcount   = 1
           ancount   = 1
           nscount   = 0
           arcount   = 0
           \qd        \
            |###[ DNS Question Record ]###
            |  qname     = 'www.citizenlab.org.'
            |  qtype     = A
            |  qclass    = IN
           \an        \
            |###[ DNS Resource Record ]###
            |  rrname    = 'www.citizenlab.org.'
            |  type      = A
            |  rclass    = IN
            |  ttl       = 86400
            |  rdlen     = 0
            |  rdata     = '216.234.179.13'
           ns        = 0
           ar        = 0

Summary:

192.168.2.11 > 202.165.102.247 <DNSQR  qname='www.citizenlab.org.' qtype=A qclass=IN |> 0
219.158.4.193 > 192.168.2.11   time-exceeded
202.165.102.247 > 192.168.2.11 <DNSQR  qname='www.citizenlab.org.' qtype=A qclass=IN |> 
    <DNSRR  rrname='www.citizenlab.org.' type=A rclass=IN ttl=300 rdata='64.33.88.161' |>
202.165.102.247 > 192.168.2.11 <DNSQR  qname='www.citizenlab.org.' qtype=A qclass=IN |> 
    <DNSRR  rrname='www.citizenlab.org.' type=A rclass=IN ttl=86400 rdata='216.234.179.13' |>
202.165.102.247 > 192.168.2.11 <DNSQR  qname='www.citizenlab.org.' qtype=A qclass=IN |> 
    <DNSRR  rrname='www.citizenlab.org.' type=A rclass=IN ttl=86400 rdata='216.234.179.13' |>
202.165.102.247 > 192.168.2.11 <DNSQR  qname='www.citizenlab.org.' qtype=A qclass=IN |> 
    <DNSRR  rrname='www.citizenlab.org.' type=A rclass=IN ttl=86400 rdata='216.234.179.13' |>

64.33.88.161 and 216.234.179.13 are not IP addresses that “www.citizenlab.org” should resolve to.
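The summary above is easy to tally by machine. The sketch below counts the answers received for the single query and lists those that fall outside a set of addresses considered correct. The correct address is not given in this post, so the `EXPECTED` value here is a documentation-range placeholder and purely an assumption.

```python
from collections import Counter

# (claimed source, returned address) pairs taken from the summary above.
RESPONSES = [
    ("202.165.102.247", "64.33.88.161"),
    ("202.165.102.247", "216.234.179.13"),
    ("202.165.102.247", "216.234.179.13"),
    ("202.165.102.247", "216.234.179.13"),
]

# Placeholder only: substitute the address(es) the name should really have.
EXPECTED = {"192.0.2.10"}


def tally_answers(responses):
    """Count how often each address was returned for a single query."""
    return Counter(rdata for _src, rdata in responses)


def unexpected_answers(responses, expected):
    """Return the distinct answers that are not in the expected set."""
    return sorted({rdata for _src, rdata in responses} - set(expected))


print(tally_answers(RESPONSES))
print(unexpected_answers(RESPONSES, EXPECTED))
```

Several answers to one query, sent on behalf of a host that is not a DNS server, is itself the anomaly; the expected set only helps when the target is a real resolver.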

I used 38 IP addresses on 38 different AS’s in China as targets. A DNS packet was sent to the first CN hop from a UDP traceroute to each of these IPs. The IPs returned in the ICMP packets received from these hops are distributed across 11 AS’s in China.

In total, I received 8 unique bad IP addresses.

211.94.66.147 24403 CN CNNIC-CNCITYNET-AP Beijing Kuanjie Net communication technology Ltd
209.145.54.50 6428 US CDM - CDM
203.161.230.171 9925 HK HKTHOST-AP Powerbase DataCenter Services (HK) Ltd.
64.33.88.161 19916 US ASTRUM-0001 - OLM LLC
202.181.7.85 7489 AU FIRSTLINK-AS-AP First Link Internet Services
4.36.66.178 3356 US LEVEL3 Level 3 Communications
216.234.179.13 13911 CA TERA-BYTE - Tera-byte Online Services
202.106.1.2 4808 CN CHINA169-BJ CNCGROUP IP network China169 Beijing Province Network
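The list above is in the whitespace-separated form “IP, AS number, country, network name”, so it can be broken down by country with a few lines. A minimal sketch; the field names are my own.

```python
from collections import Counter

BAD_IPS = """\
211.94.66.147 24403 CN CNNIC-CNCITYNET-AP Beijing Kuanjie Net communication technology Ltd
209.145.54.50 6428 US CDM - CDM
203.161.230.171 9925 HK HKTHOST-AP Powerbase DataCenter Services (HK) Ltd.
64.33.88.161 19916 US ASTRUM-0001 - OLM LLC
202.181.7.85 7489 AU FIRSTLINK-AS-AP First Link Internet Services
4.36.66.178 3356 US LEVEL3 Level 3 Communications
216.234.179.13 13911 CA TERA-BYTE - Tera-byte Online Services
202.106.1.2 4808 CN CHINA169-BJ CNCGROUP IP network China169 Beijing Province Network
"""


def parse_line(line):
    """Split one lookup line into IP, AS number, country code and network name."""
    ip, asn, country, name = line.split(None, 3)
    return {"ip": ip, "asn": int(asn), "country": country, "name": name}


records = [parse_line(line) for line in BAD_IPS.splitlines() if line.strip()]
by_country = Counter(record["country"] for record in records)
print(by_country)
```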

Two of the IP’s are in Mainland China and one is in Hong Kong; three are in the US, one is in Canada and one is in Australia. Only one of the CN IP’s, 211.94.66.147, had a web server running when I checked, which means that this server could log the IP addresses that connect to it and the host names in the requests. Why these IPs?

I don’t know. It is pretty strange.

64.33.88.161 was the IP for falundafa.ca; the IP was blocked, so any domains that resolved to it were also blocked. This seems to be legacy blocking.

If you $host bbs.hygung.com you’ll get back most of these IP’s, along with a bunch of others. Many of these IP’s also appear on some kind of IP blocking list (another one), RobotDog anyone? Seems to be a list for a Router OS by http://www.mikrotik.com.cn/. Another site has a post about dns cache poisoning/phishing and one of these IP’s, this time affecting an ISP in Taiwan.

Anyone?

Facebook and China



There have been some reports suggesting that Facebook may be blocked in China; however, Facebook is not blocked in China. In fact, I experienced Facebook outages myself, from Canada, on July 1 too. At the recent Global Voices Summit I gave a presentation on detecting Internet filtering. While filtering is sometimes easy to detect, sometimes it is not, and often there are alternative explanations.

www.facebook.com (and zh-cn.facebook.com) resolves to a variety of IP addresses: 69.63.176.140, 69.63.184.11, 69.63.178.12 and a few others. DNS servers in China are resolving www.facebook.com properly, and these IP addresses are accessible when directly accessed from China.

However, while facebook is loading you have probably seen a domain like this, static.ak.fbcdn.net or like this static.ak.facebook.com, flash by in your browser’s status bar. Domains such as these resolve to IP addresses assigned to Akamai. Akamai is a mirroring service that has servers all over the world so depending on where you are you’ll be accessing the same content but from a different server.

One scenario is that there was some temporary issue with Akamai.

Another is that China may have blocked one of Akamai’s IP addresses. (Pakistan, for example, once disrupted access to numerous sites because they blocked portions of the Akamai network. Apparently, they did not realize that in trying to block a few sites on Akamai they ended up blocking thousands of the world’s most popular sites.)

I tested a variety of Akamai IP addresses that Chinese DNS servers resolved the “static” facebook domains to, and all were accessible from multiple points in the country.

YouTube, Geolocation & China



After reading this great post on the ONI blog, I did a bit of testing myself. As Youtomb discovered, there is a tag available through the YouTube API that indicates the country (or countries, in some cases) in which YouTube will restrict access to the video. These videos are not (necessarily) blocked by the country itself, but by YouTube.

<media:restriction type="country" relationship="deny">TH</media:restriction>
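Pulling the denied countries out of a feed entry is straightforward with an XML parser. The sketch below assumes the tag lives in the Media RSS namespace and sits somewhere inside an entry element; the sample entry is a made-up wrapper around the tag, not a real API response.

```python
import xml.etree.ElementTree as ET

MEDIA_NS = "http://search.yahoo.com/mrss/"  # Media RSS namespace

# Hypothetical, stripped-down entry used only to exercise the parser.
SAMPLE_ENTRY = (
    '<entry xmlns:media="http://search.yahoo.com/mrss/">'
    "<media:group>"
    '<media:restriction type="country" relationship="deny">'
    "PL TH DE FR"
    "</media:restriction>"
    "</media:group>"
    "</entry>"
)


def denied_countries(entry_xml: str):
    """Return the country codes listed in deny-type country restrictions."""
    root = ET.fromstring(entry_xml)
    denied = []
    for node in root.iter("{%s}restriction" % MEDIA_NS):
        if node.get("type") == "country" and node.get("relationship") == "deny":
            denied.extend((node.text or "").split())
    return denied


print(denied_countries(SAMPLE_ENTRY))
```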

I’ve updated blockpage.com and started a new album for geolocation blockpages. In this case there is a pink line near the top which states “This video is not available in your country.”

As ONI and Youtomb note, there are a variety of videos that have this tag. I’ve been able to confirm that the same behavior reported from Thailand occurs when a flagged video is accessed from Germany and France. One of the videos about Thailand is marked:

"PL TH DE FR","http://www.youtube.com/watch?v=oU9iT3vEdWo"

I checked it from Thailand, Germany and France, and all experienced the same blocking behaviour. Here’s what I’ve found blocked so far based on the info in the ONI blog:

"TH","http://www.youtube.com/watch?v=A1USDXkaJFM"
"TH","http://www.youtube.com/watch?v=L4RX2cIDa4E"
"PL TH DE FR","http://www.youtube.com/watch?v=oU9iT3vEdWo"
"TH","http://www.youtube.com/watch?v=jVbUx4TPkVs"
"TH","http://www.youtube.com/watch?v=70m1ncXQjXA"
"TH","http://www.youtube.com/watch?v=4dFjO4ZJNDE"
"PF TF YT GP DE RE FR GF MQ PM PL","http://www.youtube.com/watch?v=lt2Zsr9bwlE"
"CN","http://www.youtube.com/watch?v=3Roy0BFaUtc"
"CN","http://www.youtube.com/watch?v=Ffw4-OMmchY"
"CN","http://www.youtube.com/watch?v=tzz9rZwFENA"
"CN","http://www.youtube.com/watch?v=C1oBcPtH5aU"
"CN","http://www.youtube.com/watch?v=liwgfyc1Im4"
"CN","http://www.youtube.com/watch?v=FeXZY4eVLlo"
"CN","http://www.youtube.com/watch?v=mnIuu73X8es"
"CN","http://www.youtube.com/watch?v=kmlDqPtHV-E"
"CN","http://www.youtube.com/watch?v=aPg1yvj7thA"
"CN","http://www.youtube.com/watch?v=-0D_oGgAGmI"
"CN","http://www.youtube.com/watch?v=53QwPeImmAA"
"CN","http://www.youtube.com/watch?v=XThGzqBYrh0"
"CN","http://www.youtube.com/watch?v=_FnwTj0OuFE"
"CN","http://www.youtube.com/watch?v=kdEULgZYxK8"
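Since each row pairs a space-separated set of country codes with a video URL, the list can be regrouped per country. A small sketch, with a three-row excerpt of the list as input; it normalises typographic quotes first because copies of this list often pick them up.

```python
import csv
from collections import defaultdict

EXCERPT = '''"TH","http://www.youtube.com/watch?v=A1USDXkaJFM"
"PL TH DE FR","http://www.youtube.com/watch?v=oU9iT3vEdWo"
"CN","http://www.youtube.com/watch?v=3Roy0BFaUtc"'''


def videos_by_country(rows_text: str):
    """Map each country code to the video URLs restricted in that country."""
    # Turn curly quotes and double-prime marks back into plain double quotes.
    cleaned = rows_text.replace("“", '"').replace("”", '"').replace("″", '"')
    table = defaultdict(list)
    for row in csv.reader(cleaned.splitlines()):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        countries, url = row
        for code in countries.split():
            table[code].append(url)
    return dict(table)


grouped = videos_by_country(EXCERPT)
print(sorted(grouped))
```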

I’ve been unable to check from China because China is currently blocking all of YouTube. In short, the 3 YouTube IP’s are blocked and “www.youtube.com” has been added as a “keyword”.

Although the detailed reference guide for the API does not contain information about the blocking tag, another section of the API has some information about the restrictions:

The restriction parameter identifies the IP address that should be used to filter videos that can only be played in specific countries. By default, the API filters out videos that cannot be played in the country from which you send API requests. This restriction is based on your client application’s IP address.

To request videos playable from a specific computer, include the restriction parameter in your request and set the parameter value to the IP address of the computer where the videos will be played – e.g. restriction=255.255.255.255.

To request videos that are playable in a specific country, include the restriction parameter in your request and set the parameter value to the ISO 3166 two-letter country code of the country where the videos will be played – e.g. restriction=DE.
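As a rough illustration of the quoted documentation, this is how a request carrying the restriction parameter might be assembled. Only the `restriction` parameter comes from the text above; the endpoint is a placeholder and the `q` search parameter is an assumption made for the example.

```python
from urllib.parse import urlencode

# Placeholder endpoint for illustration only; it is not a real API URL.
BASE_URL = "https://api.example.invalid/feeds/videos"


def restricted_query(search_terms: str, restriction: str) -> str:
    """Build a query URL asking for videos playable from a country or an IP."""
    params = {"q": search_terms, "restriction": restriction}
    return BASE_URL + "?" + urlencode(params)


print(restricted_query("thailand", "DE"))
```

Requesting the same search with different country codes and comparing the results would be one way to enumerate geolocation-restricted videos without being in the country.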

ISP Filtering



After reading this great enumeration of various efforts to block accidental access to images of child sexual abuse, I updated blockpage.com to include the blockpages from Sweden, Switzerland and Denmark.

This document notes many of the unintended consequences of filtering, especially overblocking, and it challenges the wisdom of making the blocking look like an error, as opposed to presenting the user with a blockpage:

Providing such a notice seems far more likely to achieve the intended objective of discouraging access to material that is illegal to possess, and raising public awareness of the fact that such a law exists, than merely providing a ‘page not found’ notice.

In the context of Sweden it also discusses threats to block the bit torrent tracker Pirate Bay by adding it to the child pornography blocklist. Mission creep is always present.

I’ve updated blockpage.com with the blockpage that users in Denmark see when they try to access Pirate Bay.

Filtering on the grounds of copyright violation is reportedly gaining ground in Europe:

To recap, the Commission saw great merit in an anti-piracy system where Internet Service Providers (“ISPs”) would voluntarily agree to monitor their users and report the infringers to the industry reps or to the authorities, as well as possibly cut off their internet connection. From what we have heard from our sources at the Commission, a lot of the feedback they have currently received has been very supportive of the idea of filtering and monitoring. This has now emboldened some officials to push forward with plans to implement such voluntary EU-wide proposals, although nothing has yet been firmly decided. EU law clearly states that ISPs have no obligation to monitor and filter content, but the carrot they get from participating is that they are less likely to be sued by IFPI and others.

This is something that the Copyright Lobby has been slowly moving toward here in Canada.

Australia: Filtering in Test Phase



The Australian government is moving ahead with plans to introduce filtering at the ISP level. They’ll block content that is deemed to be harmful to children.

ISP-based filters will block inappropriate web pages at service provider level and automatically relay a clean feed to households.

To be exempted, users will have to individually contact their ISPs.

The exemption is an interesting approach.

Another interesting point in the article is that much of the harm children face online comes from “web 2.0” services, which the article suggests cannot (or won’t) be filtered.

The GFW of Comcast?



There have been a number of recent reports stating that Comcast is interfering with traffic including the file-sharing protocols BitTorrent and Gnutella, as well as Lotus Notes. The reports state that the technique used is the TCP RST packet technique that the GFW of China has made (in)famous. (An intermediary sends RST packets to both ends of a connection, effectively terminating it. For more technical info see Ignoring the Great Firewall and ConceptDoppler.)

Interesting.

US-made ‘censorware’ ends up in iron fists



CS Monitor reports:

The software companies involved sell this technology primarily to private companies in the US and abroad. Companies use these tools to keep employees from accessing pornography sites and websites infected with viruses.

Repressive governments also turn to these American systems, not only to filter out porn and viruses, but also to block political, religious, and other websites.

A waste of everyone’s time and money?



A column in The Guardian asks if web filters are just a waste of everyone’s time and money, and a 16-year-old Australian cracked the country’s filtering software. Rather than implement a national firewall-type filtering system, Australia makes filtering software available for free, hoping that parents will restrict what their children can view online.

Which raises the question: wouldn’t traditional parenting skills – such as talking to your children and teaching them about safe searching and surfing and being honest about sex and internet content – be a lot more effective, better-tailored and cheaper than huge government initiatives?

DNS tampering in China



So, I was doing some searching in google and baidu and noticed two sites (that appeared to be the same), voanews.cn and voanews.com.cn. Upon visiting voanews.com.cn I was surprised to find myself end up at google. voanews.com.cn, like voanews.cn, should resolve to 218.25.59.214, not google.

The other thing that stood out was that these sites did not appear to be the Voice of America. And they are not. You can look up the registrar here. The Registrant Name is 慢速英语, which babel translates as “Slow English”, which gave me a chuckle.

I did some more tweaking and voanews.com.cn is being subjected to a form of DNS tampering because it has “voanews.com” in it. It looks like China is bringing back an improved version of their old DNS spoofing. Rather than messing around with individual DNS servers, China has implemented a system which appears to operate like the RST/Keyword filtering system (see this paper for technical details).

DNS lookups for voanews.com (or voanews.com.cn) will return one or more of the following 4 IP’s:

voanews.com has address 213.169.251.35
voanews.com has address 209.36.73.33
voanews.com has address 72.14.205.99
voanews.com has address 72.14.205.104

The last two by the way are google IP addresses. Weird.

But if you sniff the connection you’ll see what happens: after the request is made, 4 spoofed results are received, although eventually the correct result is received too. By the time the true result arrives, applications relying on a DNS lookup (e.g. a web browser) have already accepted the initial spoofed result.

ME	->	CN	DNS	Standard query ANY voanews.com
CN	->	ME	DNS	Standard query response A 72.14.205.99
...
CN	->	ME	DNS	Standard query response SOA auth00.ns.uu.net MX 20 ibb2.ibb.gov MX 30 ibb1.ibb.gov MX 10 voa2.voa.gov A 128.11.143.113 NS auth00.ns.uu.net NS auth100.ns.uu.net

Domain Name System (response)
        voanews.com: type SOA, class IN, mname auth00.ns.uu.net
        voanews.com: type MX, class IN, preference 20, mx ibb2.ibb.gov
        voanews.com: type MX, class IN, preference 30, mx ibb1.ibb.gov
        voanews.com: type MX, class IN, preference 10, mx voa2.voa.gov
        voanews.com: type A, class IN, addr 128.11.143.113
        voanews.com: type NS, class IN, ns auth00.ns.uu.net
        voanews.com: type NS, class IN, ns auth100.ns.uu.net

ME	->	CN	ICMP	Destination unreachable (Port unreachable)
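The race is easier to see with a client that keeps listening instead of returning on the first datagram. Below is a minimal sketch: `collect_replies` reads everything that arrives within a time window, and `first_and_last` shows which answer a first-reply-wins client would use. The addresses come from this post, but the arrival times in `OBSERVED` are invented for illustration.

```python
import socket
import time


def collect_replies(sock, window=2.0, bufsize=4096):
    """Keep reading datagrams for `window` seconds rather than stopping early."""
    replies = []
    deadline = time.monotonic() + window
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        sock.settimeout(remaining)
        try:
            data, addr = sock.recvfrom(bufsize)
        except socket.timeout:
            break
        replies.append((time.monotonic(), addr, data))
    return replies


def first_and_last(responses):
    """Given (arrival_ms, address) pairs, return the first and last answers."""
    ordered = sorted(responses)
    return ordered[0][1], ordered[-1][1]


# Illustrative timings only; the order matches what the sniffer showed.
OBSERVED = [
    (40.0, "72.14.205.99"),
    (41.0, "72.14.205.104"),
    (42.0, "213.169.251.35"),
    (43.0, "209.36.73.33"),
    (300.0, "128.11.143.113"),
]
print(first_and_last(OBSERVED))
```

An ordinary stub resolver behaves like the first element of that pair, which is why the genuine answer arriving later does not help.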

A variety of other domain names are affected, not just voanews.com.

“.yahoo.com” briefly blocked in China



For the most part* the GFW blocks in two ways:

1) IP blocking
2) Keyword in url blocking

IP blocking is pretty easy to spot: traceroute will fail at the backbone level in China, and there will only be outgoing SYN packets to the IP, so the 3-way TCP handshake will never complete. (Note: all domains hosted on that IP are affected.)

“Keyword-in-URL” blocking is different and sometimes a bit awkward. First, the keyword-in-url filtering is bi-directional, so you can trigger it from outside -> to -> China or from China -> to -> outside.

Second, “keywords” can be domains themselves; I’ve even seen URLs used as a “keyword”. If these keywords appear in the HTTP Host header or in the GET request they will be “blocked”.
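To make the matching rule concrete, here is a toy model of it: a function that reports which keywords appear in the request line or the Host header of a raw HTTP request. It only models the behaviour observed from outside and says nothing about how the GFW is actually implemented; the function name is hypothetical.

```python
def triggered_keywords(raw_request: bytes, keywords):
    """Return the keywords found in the request line or the Host header."""
    lines = raw_request.decode("latin-1").split("\r\n")
    request_line = lines[0]
    host = ""
    for line in lines[1:]:
        if line == "":
            break  # end of headers
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip()
    haystack = (request_line + "\n" + host).lower()
    return [keyword for keyword in keywords if keyword.lower() in haystack]


request = (
    b"GET /search?q=test HTTP/1.1\r\n"
    b"Host: mail.yahoo.com\r\n"
    b"User-Agent: example\r\n"
    b"\r\n"
)
print(triggered_keywords(request, [".yahoo.com", "www.youtube.com"]))
```

Because the match is on the request contents rather than the destination, the same request sent to any host across the border would trigger it.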

Third, the way the blocking works is that the 3-way TCP handshake will be established, but when the GET request goes through, the GFW sends RST packets to both the requester and the host (spoofed to appear as if they were from one another) to tear down the connection; then the host and the requester respond to each other with more RST packets. (There is some additional variation, but that’s the basic version; see Steven Murdoch et al’s paper http://www.cl.cam.ac.uk/~rnc1/ignoring.pdf for more details.)

The tricky part is that depending on the GFW (maybe related to the load) some of the transaction will go through. So for example, you may get half (or more!) of the html before the RST packet. Also, part of the page may load because, for example, it is not until an image with a keyword in its file name is loaded that the RST packet is sent.

Finally, the most tricky part. Because of the combination of additional RST packets from the GFW (and then the RSTs from the requester and host in response), further connections between the requester and host (not the whole Internet, as often reported) are disrupted for some time. This means that if you are in China and you connect to Google (hosted outside of China) and you search for a banned keyword (the keyword goes into the GET request) you’ll be blocked. If you hit the back button in your browser, get the cached copy of Google and then search for a keyword that is NOT blocked, it will appear to be blocked because your connection to Google is still being subjected to RST packets. This sometimes results in reports that certain keywords are blocked when in fact they are not.

Another important point to recognize is that this is dependent upon IP address. So, if the site you connect to has multiple IP addresses, the behaviour may seem even more inconsistent, as your requests may be served by different IP addresses. For testing purposes it is best to connect directly to an IP rather than a domain name to ensure that you are always connecting to the same IP.
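A test along those lines can be sketched as follows: connect to one fixed IP, put the name under test in the Host header, and record whether the exchange completes, is reset, or times out. This is an assumption-laden sketch rather than the tool used here; a reset after the request is consistent with the keyword behaviour described above, while a timeout on connect is more consistent with IP blocking, but neither is conclusive on its own.

```python
import socket


def build_get(host_header: str, path: str = "/") -> bytes:
    """Build a plain HTTP/1.1 GET request with the given Host header."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host_header}\r\n"
        "Connection: close\r\n\r\n"
    ).encode("ascii")


def probe(ip: str, host_header: str, path: str = "/", timeout: float = 10.0) -> str:
    """Connect to a fixed IP and report 'ok', 'reset' or 'timeout'."""
    try:
        with socket.create_connection((ip, 80), timeout=timeout) as sock:
            sock.sendall(build_get(host_header, path))
            while sock.recv(4096):
                pass  # read until the server closes the connection
        return "ok"
    except ConnectionResetError:
        return "reset"
    except socket.timeout:
        return "timeout"
```

Running the probe twice against the same IP, first with a suspect name and then with a harmless one, also shows the lingering disruption: the second request may be reset even though its keyword is not blocked.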

On June 27 2007, I captured traffic between myself and yahoo.cn (hosted in China, as well as some other hosts in China) using “.yahoo.com” (yes, that starts with a period, e.g. if affects all *.yahoo.com domains including mail.yahoo.com) and can confirm that it was subjected to the “keyword-in-url” blocking behaviour with “.yahoo.com” as the keyword.

However, and this is my opinion, the RST packets were quite slow to arrive. In some cases the RST did not come until after the page had loaded successfully (future connections were still subjected to RSTs). It is possible that many requests for “.yahoo.com” were causing the GFW to slow down; anecdotally, the RST packets were not being received as fast as they usually are.

As of June 28, 2007, “.yahoo.com” is no longer blocked by China.

Tunisia, SmartFilter & DailyMotion



Tunisia uses commercial filtering software called SmartFilter, which is produced by the U.S. company Secure Computing, to filter Internet access. This software is configured to block pre-defined categories of content, as classified by SmartFilter, including at least four SmartFilter categories: Anonymizers, Nudity, Pornography, and Sexual Materials.

Tunisia’s Internet filtering is done in a non-transparent way. When users attempt to access a blocked page, they are not informed that the page is filtered, but instead merely receive what looks like a standard error message, a 404 “File Not Found” error. However, the actual HTTP header is not a 404 but a 403 Forbidden error generated by the filtering system, SmartFilter, in conjunction with NetCache caching servers. SmartFilter can be configured with a blockpage that tells users that the site has been blocked and why. However, unlike other countries using this exact same filtering system, Tunisia has copied the text from the Internet Explorer 404 page and used this as a blockpage to make the filtering appear to be an error.
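The mismatch between the header and the page text can be checked mechanically. The following is a rough sketch, not a tool used in the original research: the function names and the marker strings are assumptions for illustration, and a real check would need the exact text of the blockpage being looked for.

```python
def parse_status_code(raw_response):
    """Extract the numeric status code from the first line of a raw HTTP response."""
    status_line = raw_response.split(b"\r\n", 1)[0].decode("ascii", errors="replace")
    parts = status_line.split()
    # A status line looks like: HTTP/1.1 403 Forbidden
    if len(parts) < 2 or not parts[1].isdigit():
        raise ValueError(f"not an HTTP status line: {status_line!r}")
    return int(parts[1])


def looks_like_disguised_block(raw_response, not_found_markers):
    """True when the header says 403 but the body presents itself as a 404 page."""
    status = parse_status_code(raw_response)
    body = raw_response.partition(b"\r\n\r\n")[2].decode("utf-8", errors="replace").lower()
    body_claims_404 = any(marker.lower() in body for marker in not_found_markers)
    return status == 403 and body_claims_404


# Illustrative marker strings; these are assumptions, not the actual blockpage text.
MARKERS = ["file not found", "http 404"]

disguised = (
    b"HTTP/1.1 403 Forbidden\r\nContent-Type: text/html\r\n\r\n"
    b"<html><body><h1>HTTP 404 - File not found</h1></body></html>"
)
genuine = b"HTTP/1.1 404 Not Found\r\n\r\n<html>HTTP 404 - File not found</html>"

print(looks_like_disguised_block(disguised, MARKERS))  # True
print(looks_like_disguised_block(genuine, MARKERS))    # False
```

The point of the check is that a browser user only sees the body, while the status line reveals what the filtering system actually returned.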

(See Astrubal’s detailed explanation here).

Recently, the video-sharing web site dailymotion.com was blocked in Tunisia. It was blocked because SmartFilter categorized the web site as pornography and, since Tunisia blocks the pornography category, the web site was blocked. Some time between April 4, 2007 and April 9, 2007 SmartFilter removed dailymotion.com from the pornography category.

It is being reported that the site is accessible through at least one ISP in Tunisia. Depending on how frequently the various filtering and cache servers update, there will likely be some variation across ISPs for some time. Eventually, full access should be restored. (Tunisia could, as it does with a variety of other content including human rights information, add the website as a custom URL on top of its SmartFilter categories and intentionally block the site if it chooses to do so.)
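The dependency on the vendor's categorization can be sketched as a simple policy check. This is an illustrative model only, not SmartFilter's actual logic or API: the function `is_blocked` and the data structures are hypothetical, and only the four category names come from the description above.

```python
def is_blocked(host, vendor_categories, blocked_categories, custom_blocklist):
    """Decide whether a host is filtered under a category-based policy.

    vendor_categories: host -> set of categories assigned by the filtering vendor
    blocked_categories: categories the country (or ISP) has chosen to block
    custom_blocklist: hosts added locally, independent of the vendor's database
    """
    if host in custom_blocklist:
        return True
    # Blocked if the vendor's categories overlap with the blocked categories.
    return bool(vendor_categories.get(host, set()) & blocked_categories)


blocked_categories = {"Anonymizers", "Nudity", "Pornography", "Sexual Materials"}

# Vendor database before and after the site was removed from the category.
before = {"dailymotion.com": {"Pornography"}}
after = {"dailymotion.com": set()}

print(is_blocked("dailymotion.com", before, blocked_categories, set()))  # True
print(is_blocked("dailymotion.com", after, blocked_categories, set()))   # False
# A local custom entry keeps the site blocked whatever the vendor decides.
print(is_blocked("dailymotion.com", after, blocked_categories, {"dailymotion.com"}))  # True
```

Nothing in the country's own configuration changes between the first two calls; only the vendor's database does.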

This is a very significant case as it demonstrates how the decisions made by filtering companies affect Internet access in entire countries.

Mission Creep



Mission Creep:

Regardless of the initial reason for implementing Internet filtering, there is increasing pressure to expand its use once the filtering infrastructure is in place.

Norway has been filtering child abuse websites since October 2004. The system operates similarly to others (Canada, UK, Sweden): people can report illegal websites, the websites are reviewed, and if they are deemed to be illegal they are added to the block list.

These programs, especially since they are restricted to web-based filtering, do not impact a person determined to view child abuse images on the Internet. At best, they prevent casual or inadvertent access to designated websites. Filtering systems can be easily circumvented by those actively seeking and disseminating child pornography. Moreover, web-based child pornography is only the tip of the iceberg. Much of it moves by P2P or private groups online.

Also, none of these filtering systems contribute to the prosecution of those trafficking in child pornography. The Internet filtering programs that have been implemented contribute neither to the arrest and prosecution of individuals within these countries who access child pornography nor to that of the people who create or traffic in these images outside of the country.

Norway’s system differs slightly in that it is a partnership between the police and the ISPs, whereas similar programs in Canada and the UK are private partnerships between NGOs and ISPs (although the government was involved in bringing the partnership together).

In Norway, Telenor and KRIPOS, the Norwegian National Criminal Investigation Service, have introduced a filtering system to block child abuse websites on the Internet. Telenor is responsible for the technical solutions and KRIPOS provides updated lists of web sites that distribute such material. The cooperation stemmed from the efforts of the Norwegian Minister of Justice who brought the two together.

In communication with the authorities involved, I was told that block lists are produced by the police through a combination of browsing known newsgroups where such web sites are promoted and tips from a police-operated tip line. Domain names (and very few IP addresses) that contain child abuse images are added to the filter. The filtered websites are reviewed at least once a month.

Telenor does not store any data or logs on users that trigger the filter. When users trigger the filter they are presented with a webpage that indicates that the site is blocked because it contains illegal content. It also contains links to Norwegian law as well as an email address users can write to if they believe the content has been blocked in error. Also, the filter does not affect file sharing services or email.

I was also told that any websites related to payment for access to child abuse websites are added to the block list. The stated goal is to disrupt the ability of such organizations to sell and distribute child pornography to a Norwegian client base. (I am not sure what exactly these sites are; I assume they don’t mean PayPal.)

Now, 3 years later, a new filtering proposal is being floated in Norway. (The original article is only available in Norwegian.) According to the translation provided by luni.net, the proposal calls for expanding the system to block:

* Foreign gambling sites (preserving the very lucrative Government owned monopoly on gambling)
* P2P sites offering illegal downloads such as MP3s, TV shows and movies
* Sites desecrating the Flag or Coat of Arms of a foreign nation
* Sites promoting hatred towards public authorities, racism and hate speech
* Sites offering pornography that may cause offence

I am not sure if this proposal will actually be implemented. However, it clearly indicates that once the filtering system is in place there will be pressure to expand its use beyond the original stated goal.

Recently, ISPs in Canada announced that they would begin filtering child abuse websites. In response to public pressure (and here) and the work of Michael Geist, documentation has been posted that clarifies some key concerns such as the appeals process, continuing review and potential overblocking.

While many questions have been answered some key ones remain, such as:

* What is actually blocked: IPs, domains, URLs or URLs to specific images? The FAQ says they collect URLs, but does not say how ISPs are choosing to block.

* Is there any effort made to contact equivalent organizations to Cybertip (or law enforcement) in the country in which the offending content is hosted? The FAQ says “Reports deemed potentially illegal are forwarded to the appropriate law enforcement jurisdiction” but it is unclear if that means Canada or the world.

* Have the ISPs sought/received permission from the CRTC to block sites? Do they need permission from the CRTC? The FAQ says that “ISPs’ AUPs and Terms of Service permit this action” but it does not address CRTC authorization.

and …

Is it possible to have a test research site without any illegal content added to the blocklist for research purposes?
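Why the first question above matters can be shown with a small sketch. It compares one hypothetical block-list entry applied at three granularities; the function `matches_blocklist`, the host names and the shared IP address are all invented for illustration and say nothing about how any ISP actually implements its filter.

```python
from urllib.parse import urlsplit


def matches_blocklist(url, resolved_ip, blocked_url, blocked_ip, granularity):
    """Check one request against one block-list entry at a given granularity."""
    request = urlsplit(url)
    entry = urlsplit(blocked_url)
    if granularity == "url":
        return (request.hostname, request.path) == (entry.hostname, entry.path)
    if granularity == "domain":
        return request.hostname == entry.hostname
    if granularity == "ip":
        return resolved_ip == blocked_ip
    raise ValueError(f"unknown granularity: {granularity!r}")


# One listed URL; both sites are assumed to share a hosting IP.
blocked_url, blocked_ip = "http://listed.example/images/1.jpg", "192.0.2.50"
requests = [
    ("http://listed.example/images/1.jpg", "192.0.2.50"),  # the listed image itself
    ("http://listed.example/about.html", "192.0.2.50"),    # another page on the same domain
    ("http://unrelated.example/", "192.0.2.50"),           # unrelated site on the same IP
]

outcome = {
    level: [matches_blocklist(u, ip, blocked_url, blocked_ip, level) for u, ip in requests]
    for level in ("url", "domain", "ip")
}
print(outcome["url"])     # [True, False, False]
print(outcome["domain"])  # [True, True, False]
print(outcome["ip"])      # [True, True, True]
```

The same list of URLs therefore produces very different amounts of overblocking depending on how each ISP chooses to enforce it.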

In terms of mission creep, Michael Geist warned:

Canadian law would currently prohibit extending the block list to other forms of content, however, ISPs and the Internet community must be vigilant to ensure that fears of a “slippery slope” do not come to fruition.

I think the current Norway case is indicative of the direction that filtering takes. It often begins with a serious issue that demands concrete action but then becomes a blunt instrument to pursue other content areas. We do need to be vigilant.

Since filtering programs do not facilitate the removal of child abuse images and do not facilitate prosecution of those who create and traffic in them, they only conceal the problem. While there may never be an easy answer, I believe that this serious issue demands a solution that is not predicated on blocking “accidental access”, especially since the risk of mission creep is so substantial.

The issue of child abuse images on the Internet is a significant international challenge. However, this challenge should not deter us from fighting the trafficking of child abuse images. We have to make a serious effort to coordinate our efforts nationally and internationally within an institutional framework that has the expertise and authority to identify and remove child pornography and prosecute those who are creating, distributing, buying and accessing it.

I don’t have a good enough answer to this serious and important issue, but I know that filtering is not the solution.