Blog Posts

Finding Additions To The Umbrella DNS Popularity List

Since I started looking at the Umbrella DNS Popularity List, I have been interested in seeing how much the data changes from day to day. I fired up RStudio and wrote some terrible code, but finally got it to work with some help.

Yesterday there were 80937 new DNS names on the list that were not on the list the day before.
(Update: Here is a CSV of the 169366 domains that were not on the April 1st list but were on the May 1st list.)

Here are the new additions on a map:

Link to the full screen map.

Here is a CSV of the data with GEOIP information added. 

Here is the code I ended up with if you want to build your own:

Up next is to run these domains through VirusTotal to see if any of them are bad.

Here is a picture semi related to this blog post to make it look pretty when I share it on social media. 

Big Data’ing The Umbrella DNS Popularity List

Recently I started looking at the Umbrella DNS Popularity List and did a blog post about it here. The data seemed valuable and lacking at the same time, so I spent my *limited* free time this week learning about R and RStudio.

Protip:  If you want to play along at home there is an RStudio docker container so all you need to do is:

docker run -d -p 8787:8787 -e USER=<username> -e PASSWORD=<password> rocker/rstudio

Getting today’s list loaded into R is as simple as:

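# Get today's list
library(readr)
fn <- "top-1m.csv"
# Assumed download location for the Umbrella list; adjust if it has moved
url <- "http://s3-us-west-1.amazonaws.com/umbrella-static/top-1m.csv.zip"
if (file.exists(fn)) file.remove(fn)
temp <- tempfile()
download.file(url, temp)
unzip(temp, fn)
today <- read_csv(fn, col_names = FALSE)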

Now you have the Top 1 million DNS requests from Umbrella ready to be “big data’ed”.

At the start of this project I wanted to do the following:
  • Search the DNS names for keywords. (Done).
  • Map all the DNS records on a map. (Done, Kinda).
  • Compare today’s and yesterday’s records for new DNS records.
  • Check all the DNS records against Censys and record open ports and software.
  • Check all the DNS records against VirusTotal and see if any of them are known bad.
  • Check all the DNS records against SSLLabs and record SSL grade.
  • Take a nap.

My limited results so far follow with hopefully more to come.

Search The DNS Names

I wanted to be able to search the list for a keyword and build a table and map of the matching data. This was fairly easy, and with the help of leaflet and DataTables here is the output of searching today’s data for cisco.

Here is the map:

Here is a link to the data. 

Here is the R code I wrote:
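The R code itself is not reproduced here. As a rough stand-in, the same kind of keyword search can be sketched in shell against the downloaded CSV. This is not the code behind the map above: search_list is a hypothetical helper name, and it assumes the file keeps the rank,domain layout and sits in the current directory as top-1m.csv unless another path is given.

```shell
# search_list KEYWORD [FILE]
# Prints the "rank,domain" rows whose domain column contains KEYWORD.
# The match is literal (not a regex) and case-insensitive.
search_list() {
    awk -F, -v kw="$1" 'index(tolower($2), tolower(kw)) > 0' "${2:-top-1m.csv}"
}
```

Something like `search_list cisco > cisco.csv` would then give a small file to feed into the table and map steps.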

Map All The DNS Records On A Map.

I got started on this and quickly realized that looking up the GEOIP information and mapping a million DNS records was going to take a week, so I decided to do the Top 25,000 as a proof of concept and come back to do all 1,000,000 later (maybe).

Here is the 25,000 Map:

Here is the R code I wrote:

I also built a map with the Top 100K on it but it is huge (Load at your own risk).

…More to come.

I will be spending some more time on this over the next couple of weeks, but I can't thank @EngelhardtCR and @hrbrmstr enough for all the help they have given me over the last week. They are true data scientists and I am just a hacker with a blog.  : )

If you have any questions or suggestions please let me know on Twitter at @jgamblin.

Here is a picture semi related to this blog post to make it look pretty when I share it on social media. 

Exploring Cisco’s Top 1 Million Domains Data

Cisco offers a daily list of the million most queried domain names from Umbrella (OpenDNS) users. I had some time this weekend, so I spun up a Lightsail server and spent it playing around with the data to see what I could find.

Grabbing the file is as simple as:

You can retrieve a specific date like this:
(Looks like 2017-01-20 is the earliest they have online).
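The download commands are not shown here, so the following is only a sketch of how the current and dated files might be fetched. The base URL and the top-1m-YYYY-MM-DD.csv.zip naming are assumptions about where the lists are published, and list_url and fetch_list are hypothetical helper names; check them against the official download location before relying on them.

```shell
# Assumed base location of the published lists; verify before relying on it.
UMBRELLA_BASE="http://s3-us-west-1.amazonaws.com/umbrella-static"

# list_url [YYYY-MM-DD]
# Prints the download URL for the given day, or for the current list
# when no date is given.
list_url() {
    if [ -n "${1:-}" ]; then
        printf '%s/top-1m-%s.csv.zip\n' "$UMBRELLA_BASE" "$1"
    else
        printf '%s/top-1m.csv.zip\n' "$UMBRELLA_BASE"
    fi
}

# fetch_list [YYYY-MM-DD]
# Downloads the archive into the current directory, keeping the remote
# file name. -f makes curl fail on HTTP errors instead of saving an
# error page.
fetch_list() {
    curl -sSfO "$(list_url "${1:-}")"
}
```

With these in place, `fetch_list` would grab the current list and `fetch_list 2017-01-20` the earliest dated one.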

Once you have that downloaded and unzipped (with unzip), you can start exploring.

You can pull out the top 10 domains with this command:
head -n 10 top-1m.csv


(Full Output)

You can search for keywords with this command:
cat top-1m.csv | grep "opendns"


(Full Output)

To count the domain levels use this command:
awk -F, '{count=split($2,a,"."); print count}' top-1m.csv | sort | uniq -c | awk '{print $2,$1}' | sort -k1,1n

1 1086
2 263509
3 469756
4 193802
5 54281
6 13698
7 2952
8 689
9 172
10 16
11 26
12 2
13 1
14 1
15 1
16 1
17 1
18 1
19 1
20 1
21 1
22 1
23 1

(Full Output)
Notice anything strange here? Hint: A domain name requires at least two levels to be valid.

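To find the broken DNS names in this list, this command works. It assumes IANA's list of valid TLDs has been saved in the same directory as tlds-alpha-by-domain.txt, and it prints every single-level name that is not a real TLD:
awk -F, 'BEGIN {file="tlds-alpha-by-domain.txt"; while ((getline line < file) > 0) {if (line ~ /#/) continue; tld[tolower(line)] = 1}} {n=split($2,a,"."); if (n == 1 && !(a[1] in tld)) print $0}' top-1m.csv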


(Full Output)

Find domains added to the list for today.
I wrote a script to download the last two days of files and compare them for new domains:
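The script itself is not reproduced here. The comparison step can be sketched as follows, assuming both days have already been downloaded and unzipped; new_domains is a hypothetical helper name and both files are assumed to use the same rank,domain layout.

```shell
# new_domains OLD.csv NEW.csv
# Prints every domain that appears in NEW.csv but not in OLD.csv,
# in the order it appears in NEW.csv.
new_domains() {
    awk -F, '
        { sub(/\r$/, "", $2) }              # tolerate Windows line endings
        NR == FNR { seen[$2] = 1; next }    # first file: remember every domain
        !($2 in seen) { print $2 }          # second file: print unseen domains
    ' "$1" "$2"
}
```

For example, `new_domains yesterday.csv today.csv | wc -l` would count the additions (the file names are placeholders). The sketch expects a non-empty first file, since awk uses the record counters to tell the two inputs apart.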

You can find the output for April 24, 2017 here.

Overall I am really impressed with this data and will be using it to do more research and to track trends across the internet.  They have some more to do but it is an amazingly valuable free tool.

I have also recently fallen in love with sprunge, which lets you push data to an ad-free “pastebin” from the command line:

cat file.txt | curl -F 'sprunge=<-' http://sprunge.us

Burp Settings File

I am a huge fan of Tim Tomes and his Burp Suite Configuration Suggestions blog post. The problem is that I only use Burp a couple of times a month, so I end up facing this screen and having to reconfigure Burp on every launch:

So I built burpsettings.json that:

  • Disables Browser XSS Protection
  • Disables Burp Collaborator Server
  • Disables Intercept by Default
  • Changes Scan Mode to Thorough
  • Turns Off Anonymous Feedback

This makes my Burp startup a lot faster, and I thought I would share the config file in case it helps someone else too.

Newly Registered Domain Name Keyword Search

Today I was asked if it was possible to generate a list of the domain names registered every day with a keyword in the record (company name, city, trademark, etc.). There are a few paid services that do this, and there is also a web-based tool for it, but I wanted to automate it so I could use it with a Slack bot, so I put together this 4-line bash script:

./ keyword

This is a super simple script, but as they say, “simplicity is the ultimate sophistication”.
