One of the tips that security professionals love to give is to use a VPN on public wifi networks. This is great advice (I personally like PrivateInternetAccess and NordVPN). Recently I noticed that nike.com blocks traffic from Tor and VPN providers.
That got me wondering what other websites were blocking traffic from these sources so I decided to test the Alexa Top 1000 websites.
First I needed to get a list of the Top 1000 websites. To do this I used this line of command line kung fu that grabs a CSV of the top 1 million websites and puts the top 1000 in a urls.txt file:
curl -s -O s3.amazonaws.com/alexa-static/top-1m.csv.zip ; unzip -q -o top-1m.csv.zip top-1m.csv ; head -1000 top-1m.csv | cut -d, -f2 | cut -d/ -f1 > urls.txt
Here is the output from this command.
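For anyone who would rather do the same extraction in Python, here is a minimal sketch of the equivalent step. It assumes the CSV has already been downloaded and unzipped, and that each row looks like `rank,domain` as in the Alexa file. The function names `top_domains` and `write_urls` are my own for illustration and are not part of any tool mentioned here.

```python
import csv
import io


def top_domains(csv_text, limit=1000):
    """Return the first `limit` domains from 'rank,domain' CSV text."""
    domains = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        # Same idea as `cut -d/ -f1`: keep only the part before any path.
        domains.append(row[1].split("/", 1)[0])
        if len(domains) >= limit:
            break
    return domains


def write_urls(csv_path="top-1m.csv", out_path="urls.txt", limit=1000):
    """Read the unzipped CSV and write one domain per line to out_path."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        domains = top_domains(handle.read(), limit)
    with open(out_path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(domains) + "\n")
```

Calling `write_urls()` in the directory that holds top-1m.csv should produce a urls.txt much like the one the shell one-liner creates. The one difference is that this version skips malformed rows instead of counting them toward the 1000.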
I now needed to automatically take a screenshot of 1000 websites. I had started to write my own terrible Python script using Selenium until Chris Truncer pointed me to his amazing project called EyeWitness.
The command I used was:
./EyeWitness.py --web -f urls.txt
During my first test using PrivateInternetAccess I found 11 of 1000* blocked access with a 401/404, with craigslist.org, nike.com, ticketmaster.com and hilton.com being the most impactful websites on that list.
I then ran the test again through Tor (using the Tor container I built) and found 40 of 1000* blocked access with a 401/404, with many more asking for a captcha before granting access.
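I tallied these results by looking through the screenshots, but the counting could also be roughed out in code. The sketch below is not what I ran. It assumes your traffic is already going out through the VPN (or a Tor proxy) at the system level, treats a 401 or 404 as "blocked" to match the numbers above, and guesses at captcha pages by looking for the word "captcha" in the response body. The names `classify`, `fetch` and `survey` are hypothetical.

```python
import urllib.error
import urllib.request
from collections import Counter

BLOCK_STATUSES = {401, 404}   # the status codes counted as blocks in this post
CAPTCHA_HINTS = ("captcha",)  # assumption: a crude keyword check, not a real detector


def classify(status, body):
    """Label a response as 'blocked', 'captcha' or 'ok'."""
    if status in BLOCK_STATUSES:
        return "blocked"
    if any(hint in body.lower() for hint in CAPTCHA_HINTS):
        return "captcha"
    return "ok"


def fetch(url, timeout=10):
    """Return (status, first 64 KB of body) for url, including error responses."""
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read(65536).decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses raise HTTPError but still carry a status and body.
        return err.code, err.read(65536).decode("utf-8", "replace")


def survey(domains, fetcher=fetch):
    """Count how many domains fall into each category."""
    counts = Counter()
    for domain in domains:
        try:
            status, body = fetcher("http://" + domain)
        except OSError:
            # DNS failures, timeouts and refused connections land here.
            counts["unreachable"] += 1
            continue
        counts[classify(status, body)] += 1
    return counts
```

Running `survey()` over the lines of urls.txt once on a normal connection and once through the VPN would give two tallies to compare. A keyword check like this will miss captchas rendered by JavaScript and will misfire on pages that merely mention the word, which is one reason the screenshots are still worth having.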
Epilogue: I play defense in my day job. I understand the need to stop malicious traffic from reaching your website. This isn’t an indictment, just an academic exercise, although if more and more websites take this approach, tools like Tor and commercial VPNs will become less useful.
I was surprised at how many porn websites are in the top 1000 overall websites.
It takes 1.8 gigs of storage to screenshot the top 1000 websites.
*Your results will vary on what is blocked based on exit node, VPN, time you test and what color shirt you have on.