Exploring Cisco’s Top 1 Million Domains Data

Cisco offers a daily list of the million most queried domain names from Umbrella (OpenDNS) users. I had some free time this weekend, so I spun up a Lightsail server and played around with the data to see what I could find.
Grabbing the file is as simple as:
wget http://s3-us-west-1.amazonaws.com/umbrella-static/top-1m.csv.zip
You can retrieve a specific date like this:
wget http://s3-us-west-1.amazonaws.com/umbrella-static/top-1m-yyyy-mm-dd.csv.zip
(Looks like 2017-01-20 is the earliest they have online).
Once you get that downloaded and unzipped (unzip top-1m.csv.zip) you can start exploring.
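If you grab dated files often, a small helper saves retyping the URL. This is a minimal sketch that assumes the URL pattern shown above; the function names `umbrella_url` and `fetch_umbrella` are made up for this example and are not part of any Cisco tooling.

```shell
# Build the download URL for a given day (yyyy-mm-dd), or for the
# latest list when no date is given. The pattern is the one from the post.
umbrella_url() {
    base="http://s3-us-west-1.amazonaws.com/umbrella-static"
    if [ -n "$1" ]; then
        echo "$base/top-1m-$1.csv.zip"
    else
        echo "$base/top-1m.csv.zip"
    fi
}

# Download and unzip one list (hypothetical helper; needs wget and unzip).
fetch_umbrella() {
    url=$(umbrella_url "$1")
    wget -q "$url" && unzip -q -o "$(basename "$url")"
}
```

For example, `fetch_umbrella 2017-04-24` would fetch that day's archive, and `fetch_umbrella` with no argument fetches the current one.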
You can pull out the top 10 domains with this command:
head -n 10 top-1m.csv

1,google.com
2,www.google.com
3,microsoft.com
4,facebook.com
5,doubleclick.net
6,g.doubleclick.net
7,clients4.google.com
8,googleads.g.doubleclick.net
9,apple.com
10,fbcdn.net

(Full Output)

You can search for keywords with this command:
grep "opendns" top-1m.csv

437,opendns.com
719,hydra.opendns.com
720,sync.hydra.opendns.com
1314,disthost.opendns.com
2756,api.opendns.com
4565,cacerts.opendns.com
5569,ipf.opendns.com
5699,block.opendns.com
7024,updates.opendns.com
8482,bpb.opendns.com

(Full Output)
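A plain grep matches the keyword anywhere on the line, so it will also return unrelated names that merely contain the string. If you only want one domain and its subdomains, something like the following should work. It is a rough sketch, and `domain_rows` is a name invented for this example.

```shell
# Print rows whose domain is exactly $1 or a subdomain of it.
# $1: domain to look for, $2: CSV file in rank,domain format
domain_rows() {
    awk -F, -v d="$1" '
        BEGIN { n = length(d) }
        {
            name = tolower($2)
            sub(/\r$/, "", name)    # tolerate Windows line endings
            # exact match, or the name ends in ".<domain>"
            if (name == d || substr(name, length(name) - n) == ("." d)) print
        }' "$2"
}
```

Usage: `domain_rows opendns.com top-1m.csv`.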

To count the domain levels use this command:
awk -F, '{count=split($2,a,"."); print count}' top-1m.csv | sort | uniq -c | awk '{print $2,$1}' | sort -k1,1n

1 1086
2 263509
3 469756
4 193802
5 54281
6 13698
7 2952
8 689
9 172
10 16
11 26
12 2
13 1
14 1
15 1
16 1
17 1
18 1
19 1
20 1
21 1
22 1
23 1

(Full Output)
Notice anything strange here? Hint: A domain name requires at least two levels to be valid.

To find the broken DNS names in this list this command works:
cat top-1m.csv | awk -F, 'BEGIN {file="top-1m.csv" ; while ((getline line < file) > 0) {if (line ~ /#/) continue; tld[tolower(line)] = 1}} {foo=split($2,a,"."); if (foo == 1) {if (!(a[1] in tld)) {print $0}}}'  

1200,home
1490,local
2082,za
3916,lan
6350,url
10173,belkin
10869,uop
11187,localdomain
12887,localhost

(Full Output)
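One caveat: as written, the command above builds its lookup table from top-1m.csv itself, so every key is a whole "rank,domain" line and no single-label name is ever filtered out. The `#` check and the `tolower()` call suggest the table was meant to come from a TLD list instead. Here is a minimal sketch under that assumption, using the list IANA publishes at https://data.iana.org/TLD/tlds-alpha-by-domain.txt. The function name `non_tld_singles` is hypothetical, and because real TLDs are dropped, its output may differ from the sample shown above.

```shell
# Print single-label rows whose label is not a known TLD.
# $1: TLD list, one per line, '#' comment lines allowed
# $2: CSV file in rank,domain format
non_tld_singles() {
    awk -F, '
        { sub(/\r$/, "") }          # tolerate Windows line endings
        NR == FNR {                 # first file: load the TLD table
            if ($0 !~ /^#/) tld[tolower($0)] = 1
            next
        }
        {
            n = split($2, a, ".")
            if (n == 1 && !(tolower(a[1]) in tld)) print
        }' "$1" "$2"
}
```

Usage, after downloading the IANA file: `non_tld_singles tlds-alpha-by-domain.txt top-1m.csv`.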

To find domains added to the list today, I wrote a script that downloads the last two days of files and compares them for new domains:
https://gist.github.com/jgamblin/184590e2ba64371730e435ab2977e4cf

You can find the output for April 24, 2017 here.
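The comparison itself only needs standard tools. The following is a minimal sketch, not the code from the gist, and it assumes the two daily files are already unzipped on disk. The function name `new_domains` and the file names in the usage line are placeholders.

```shell
# Print domains present in today's list but not in yesterday's.
# $1: yesterday's CSV, $2: today's CSV (both rank,domain)
new_domains() {
    old=$(mktemp) && new=$(mktemp) || return 1
    # Drop the rank column; comm needs both inputs sorted the same way.
    cut -d, -f2 "$1" | tr -d '\r' | LC_ALL=C sort -u > "$old"
    cut -d, -f2 "$2" | tr -d '\r' | LC_ALL=C sort -u > "$new"
    # -13 hides lines unique to the first file and lines common to both.
    LC_ALL=C comm -13 "$old" "$new"
    rm -f "$old" "$new"
}
```

Usage: `new_domains yesterday.csv today.csv > new-today.txt`. Ranks are ignored on purpose, so a domain that only moved up or down the list is not reported.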

Overall I am really impressed with this data and will be using it to do more research and to track trends across the internet.  They have some more to do but it is an amazingly valuable free tool.
Also, I have recently fallen in love with sprunge for pushing data to an ad-free “pastebin” from the command line:

cat file.txt | curl -F 'sprunge=<-' http://sprunge.us

Top 1000 Websites Blocking VPN & TOR Users

One of the tips that security professionals love to give is to use a VPN on public wifi networks. This is great advice (I personally like PrivateInternetAccess and NordVPN). Recently I noticed that nike.com blocks traffic from TOR and VPN providers.
That got me wondering what other websites were blocking traffic from these sources, so I decided to test the Alexa Top 1000 websites.
First I needed to get a list of the Top 1000 websites. To do this I used this bit of command line kung fu, which grabs a CSV of the top 1 million websites and puts the top 1000 in a urls.txt file:
curl -s -O s3.amazonaws.com/alexa-static/top-1m.csv.zip ; unzip -q -o top-1m.csv.zip top-1m.csv ; head -1000 top-1m.csv | cut -d, -f2 | cut -d/ -f1 > urls.txt
Here is the output from this command.
I now needed to automatically take a screenshot of 1000 websites. I had started to write my own terrible Python script using Selenium until Chris Truncer pointed me to his amazing project called EyeWitness.
The command I used was:
./EyeWitness.py --web -f urls.txt
During my first test, using PrivateInternetAccess, I found that 11 of 1000* sites blocked access with a 401/404:
hilton.com
nike.com
craigslist.org
ticketmaster.com
tradeadexchange.com
blog-newstime.com
brightonclick.com
adnetworkperformance.com
kissanime.to
neobux.com
loading-delivery2.com
craigslist.org, nike.com, ticketmaster.com and hilton.com are the most impactful websites on that list.
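Screenshots have to be reviewed by eye. For a quick first pass you could also look only at status codes. This is a rough sketch, not the method used for the results above. It treats 401 and 404 as "blocked" because those are the two codes counted in this post, and the function names `is_blocked_status` and `check_urls` are made up for the example.

```shell
# Succeeds when a status code is one of the two counted as blocked here.
is_blocked_status() {
    case "$1" in
        401|404) return 0 ;;
        *)       return 1 ;;
    esac
}

# Print "status host" for every blocked site in a file of hostnames.
# curl follows redirects (-L), gives up after 15 seconds (-m), and
# reports 000 when no HTTP response was received.
check_urls() {
    while IFS= read -r host; do
        [ -n "$host" ] || continue
        code=$(curl -s -L -o /dev/null -m 15 -w '%{http_code}' "http://$host/")
        if is_blocked_status "$code"; then
            echo "$code $host"
        fi
    done < "$1"
}
```

Usage: run `check_urls urls.txt` once on a normal connection and once through the VPN, then compare the two outputs. Sites that answer with a captcha page and a 200 will not show up this way.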

I then ran the test again through TOR (using the TOR container I built) and found that 40 of 1000* sites blocked access with a 401/404:
adnetworkperformance.com
nordstrom.com
overstock.com
asos.com
prjcq.com
avito.ru
quikr.com
bestbuy.com
retailmenot.com
blog-newstime.com
secureserver.net
brightonclick.com
shopclues.com
craigslist.org
ticketmaster.com
expedia.com
tradeadexchange.com
foxnews.com
trulia.com
garmin.com
tube8.com
groupon.com
usbank.com
ticketmaster.com
irs.gov
usps.com
justdial.com
walmart.com
kohls.com
wayfair.com
lowes.com
hilton.com
whitepages.com
macys.com
xbox.com
newegg.com
zara.com
nike.com
zhihu.com
Many more, amazon.com among them, asked for a captcha before granting access.
Epilogue: I play defense in my day job. I understand the need to stop malicious traffic from reaching your website. This isn’t an indictment, just an academic exercise, although if more and more websites take this approach, tools like TOR and commercial VPNs will become less useful.
Final Notes: 
I was surprised at how many porn websites are in the top 1000 overall websites.
It takes 1.8 gigs of storage to screenshot the top 1000 websites.
*Your results will vary on what is blocked based on exit node, VPN, time you test and what color shirt you have on.

LetsEncrypt.org TLS Certificate With Nessus

Letsencrypt.org is a new project that offers free TLS certificates to allow people to encrypt their traffic.
The project is in a limited beta, so I decided that a good test would be to install one of their certificates onto a Nessus scanner I host in AWS.
The install wasn’t complicated and only took about 15 minutes and 9 commands:
cd ~
git clone https://github.com/letsencrypt/letsencrypt
cd letsencrypt
./letsencrypt-auto --agree-dev-preview --server https://acme-v01.api.letsencrypt.org/directory auth
sudo service nessusd stop
sudo cp -i /etc/letsencrypt/live/scan.jerrygamblin.com/fullchain.pem /opt/nessus/com/nessus/CA/servercert.pem
sudo cp -i /etc/letsencrypt/live/scan.jerrygamblin.com/privkey.pem /opt/nessus/var/nessus/CA/serverkey.pem
sudo cp -i /etc/letsencrypt/live/scan.jerrygamblin.com/chain.pem /opt/nessus/com/nessus/CA/cacert.pem
sudo service nessusd start

Now my padlock is green and my traffic is secure.
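To check which certificate is on disk and when it runs out, you can ask openssl. This is a minimal sketch, and `cert_summary` is a made-up helper name. Let's Encrypt certificates are only valid for 90 days, so the copy steps above have to be repeated after each renewal.

```shell
# Show who issued a PEM certificate and when it expires.
# For a chain file only the first (leaf) certificate is read.
cert_summary() {
    openssl x509 -in "$1" -noout -issuer -enddate
}
```

Usage: `cert_summary /opt/nessus/com/nessus/CA/servercert.pem` (you may need sudo to read the file).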

The 10 Minute Free* Private VPN

People ask me all the time what the one thing they should do to “stay safe” on the internet is. If I had to pick one, it would be to use a VPN when you are on a network you don’t own or trust.

It has always taken a little bit of technical skill to set up a private VPN, but my friends at WebDigi have done an amazing job of making it easy to set up a free (if you don’t use it too much) private VPN on AWS.

Here is the blog post on how to set it up. 

Here is the project’s GitHub page.

Here is the walkthrough video:

Here are some tips from me: 

  • Use L2TP and not PPTP. It is safer.
  • Try to delete and rebuild the image twice a month to delete the logs and get a new IP address (Yes, I am paranoid).
  • It is free to start, but if you send a lot of traffic through the VPN it can end up costing you a few bucks a month. Set up billing alerts.

Lessons I Learned In 2014

As 2014 draws to a close, here is a (not nearly complete) list of the lessons I learned this past year:

Ignore the sign: Jump in the bouncy castle.

There are two ways you can look at your life: What happened to you or What you did. You only get to pick one.

If you want the truth ask a 5 year old.

Find ways to forgive mistakes.

Not every problem has an entirely acceptable solution.

To get things done tell an amazing story.

Travel every chance you get. Travel makes you brave.

Be grateful for every moment you have. Every single one.

10 Books That Influenced My Life


I was challenged by my friend on Facebook to name 10 books that influenced my life. I figured if I was going to put together a list I might as well put it on my blog.

So here are the 10 books in alphabetical order that have influenced my life:

48 Laws of Power
I read this book 4 or 5 years ago and decided that if this is what it took to be successful, I didn’t want to be. I would rather be a nice guy and be “unsuccessful” than base my life on this book.

Augustus: The Life of Rome’s First Emperor
You didn’t think I could make a list of my favorite books and not include one on Roman history, did you? Augustus found Rome made of clay and left it made of marble. As Rome’s first emperor, Augustus transformed the unruly Republic into the greatest empire the world had ever seen and laid the foundation for all of Western history to follow.

Maniac Magee
I read this book when I was in 5th grade and having a hard time fitting in. It made a huge difference in my outlook on life. I still love this book.

Mere Christianity
I first read this book when I was teetering on unbelief. I’ve reread it many times since, but that first read through was life altering.

Paddington Bear
I bought this book for my son when I was in London. He and I like to read it and laugh at Paddington. This book will always be special to me.

The Outsiders
I read this book in 8th grade English. It is one of the best books I have read about class warfare and about how we all really just want to fit in.

The Pursuit of Happyness
A great story about how a man can drastically change his life if he never gives up.   One of the most inspirational stories I’ve ever read.

Titan
Titan is a biography of John D. Rockefeller, the founder of Standard Oil and the world’s first billionaire. At its core it is about work ethic and about taking what you have and making something out of it without anyone’s help.

To Kill a Mockingbird
To quote Homer Simpson, ‘To Kill a Mockingbird’ gave me no useful advice on killing mockingbirds, but it did teach me not to judge a man based on the color of his skin.

You Are Not So Smart
This book is a fun read. It talks about 48 things we do that don’t make any sense. After reading this book I started catching myself making a lot of irrational decisions on a daily basis.

Minipwner

At Derbycon I picked up a Minipwner and have spent the last few days playing with it. I came up with a few awesome scripts that you need to install on your device if you have one.

The first one is a wireless picker script that I found and edited from here:

killall -9 wpa_supplicant
iwlist wlan0 scanning > /tmp/wifiscan #save scan results to a temp file
scan_ok=$(grep "wlan" /tmp/wifiscan) #check if the scanning was ok with wlan0
if [ -z "$scan_ok" ]; then
    killall -9 wpa_supplicant
    iwlist wlan0-1 scanning > /tmp/wifiscan
fi
scan_ok=$(grep "wlan" /tmp/wifiscan) #check if the scanning was ok
if [ -z "$scan_ok" ]; then #if scan was not ok, finish the script
    printf '\nWIFI scanning failed.\n\n'
    exit 1
fi
if [ -f /tmp/ssids ]; then
    rm /tmp/ssids
fi
n_results=$(grep -c "ESSID:" /tmp/wifiscan) #save number of scanned cells
i=1
while [ "$i" -le "$n_results" ]; do
        if [ $i -lt 10 ]; then
                cell=$(echo "Cell 0$i - Address:")
        else
                cell=$(echo "Cell $i - Address:")
        fi
        j=`expr $i + 1`
        if [ $j -lt 10 ]; then
                nextcell=$(echo "Cell 0$j - Address:")
        else
                nextcell=$(echo "Cell $j - Address:")
        fi
        awk -v v1="$cell" '$0 ~ v1 {p=1}p' /tmp/wifiscan | awk -v v2="$nextcell" '$0 ~ v2 {exit}1' > /tmp/onecell #store only one cell info in a temp file
        onessid=$(grep "ESSID:" /tmp/onecell | awk '{ sub(/^[ \t]+/, ""); print }' | awk '{gsub("ESSID:", "");print}')
        oneencryption=$(grep "Encryption key:" /tmp/onecell | awk '{ sub(/^[ \t]+/, ""); print }' | awk '{gsub("Encryption key:on", "(secure)");print}' | awk '{gsub("Encryption key:off", "(open)  ");print}')
        onepower=$(grep "Quality=" /tmp/onecell | awk '{ sub(/^[ \t]+/, ""); print }' | awk '{gsub("Quality=", "");print}' | awk -F '/70' '{print $1}')
        onepower=$(awk -v v3="$onepower" 'BEGIN{ print v3 / 14}')
        onepower=${onepower:0:3}
        onepower="(Signal strength: $onepower of 5)"
        echo "$onessid    $oneencryption $onepower" >> /tmp/ssids
        i=`expr $i + 1`
done
rm /tmp/onecell
awk '{printf("%5d : %s\n", NR,$0)}' /tmp/ssids > /tmp/sec_ssids #add numbers at beginning of line
grep ESSID /tmp/wifiscan | awk '{ sub(/^[ \t]+/, ""); print }' | awk '{printf("%5d : %s\n", NR,$0)}' | awk '{gsub("ESSID:", "");print}' > /tmp/ssids #generate file with only numbers and names
printf 'Available WIFI networks:\n\n'
cat /tmp/sec_ssids #show ssids list
echo -n "Enter the numeric option for your selected network: "
read nsel
pattern=$(echo " $nsel : ")
wifissid=$(grep "$pattern" /tmp/ssids)
wifissid=$(echo "$wifissid" | awk -v pat="$pattern" '{gsub(pat, "");print}' | awk '{ sub(/^[ \t]+/, ""); print }')
wifissid=${wifissid:1:`expr ${#wifissid} - 2`}  #several commands to get clean name of ssid
if [ $nsel -lt 10 ]; then
    cell=$(echo "Cell 0$nsel - Address:")
else
    cell=$(echo "Cell $nsel - Address:")
fi
nextsel=`expr $nsel + 1`
if [ $nextsel -lt 10 ]; then
    nextcell=$(echo "Cell 0$nextsel - Address:")
else
    nextcell=$(echo "Cell $nextsel - Address:")
fi
awk -v v1="$cell" '$0 ~ v1 {p=1}p' /tmp/wifiscan | awk -v v2="$nextcell" '$0 ~ v2 {exit}1' > /tmp/cellinfo0 #store only the selected cell info in a temp file
grep -v ESSID /tmp/cellinfo0 > /tmp/cellinfo # delete ESSID line to avoid later grep mistakes
rm /tmp/cellinfo0
wifichannel=$(grep " Channel:" /tmp/cellinfo)
wifichannel=$(echo "$wifichannel" | awk '{gsub(" Channel:", "");print}' | awk '{ sub(/^[ \t]+/, ""); print }') #get clean wifi channel
wifimode=$(grep " WEP" /tmp/cellinfo) #check if encryption mode is WEP
if [ -n "$wifimode" ]; then   #check if $wifimode is not an empty string
    wifimode="wep"
else
    wifimode=$(grep "WPA2 " /tmp/cellinfo) #check if encryption mode is WPA2
    if [ -n "$wifimode" ]; then
        wifimode="psk2"
    else
        wifimode=$(grep "WPA " /tmp/cellinfo) #check if encryption mode is WPA
        if [ -n "$wifimode" ]; then
            wifimode="psk"
        else
            wifimode="none"
        fi
    fi
fi
if [ "$wifimode" != "none" ]; then #ask for password when needed
    echo -n "Enter password of the selected WIFI network: "
    read wifipass
fi
rm /tmp/cellinfo
rm /tmp/ssids
rm /tmp/sec_ssids
rm /tmp/wifiscan
#write results in the wireless config file and reset wifi interface
uci set wireless.@wifi-device[0].channel=$wifichannel
uci set wireless.@wifi-iface[0].ssid="$wifissid"
uci set wireless.@wifi-iface[0].encryption=$wifimode
uci set wireless.@wifi-iface[0].key="$wifipass"
uci commit wireless
printf '\n\nTrying to connect to WIFI network.\n(Wait a few seconds and check status with: iwconfig )\n\n\n'
wifi down
wifi up

The second script is a reverse SSH script that will allow you to connect to your Minipwner from anywhere on the net.

set -x

TARGET_HOST=ec2-XX-XX-XX-XX.compute-1.amazonaws.com
if test -n "${2}"
then
    TARGET_PORT=${2}
else
    TARGET_PORT=1111
fi
TARGET_USER='UserName'

while true
do
    echo "establishing reverse ssh tunnel to ${TARGET_HOST}:${TARGET_PORT}"
    ssh -R ${TARGET_PORT}:localhost:22 -N ${TARGET_HOST} -l ${TARGET_USER} -o ServerAliveInterval=30
    sleep 1
done
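Once the tunnel is up, the `-R` option makes sshd on the EC2 host listen on the target port and forward those connections to port 22 on the Minipwner. You then connect from the EC2 host to its own localhost. This is a small sketch of that step: `connect_minipwner` is a made-up name, and the root login is an assumption based on the OpenWrt default.

```shell
# Run this on the EC2 host, not on the Minipwner.
# $1: forwarded port (the script above defaults to 1111)
# $2: login on the Minipwner (root is an assumption)
connect_minipwner() {
    port=${1:-1111}
    user=${2:-root}
    ssh -p "$port" "$user@localhost"
}
```

Usage: `connect_minipwner` for the defaults, or `connect_minipwner 2222 admin` if you changed the port or the account.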

Yes this is super geeky but I just wanted to have them somewhere I could find them.
