NVD CVE Analysis Timing

The National Vulnerability Database plays a vital role in the CVE publication process that many people may overlook or not know they are responsible for. After MITRE publishes a CVE, the NVD enriches it with data points that make it actionable by security companies and professionals.

Some of these data points include:
CVSS 3.1 Base Score

I was recently asked how long, on average, it takes NVD to complete its initial analysis. I looked around for a bit and couldn’t find the information, so I built this jupyter notebook to find out using the NVD Change History API and the very helpful {Initial Analysis} filter the API provides.

After running the notebook for all CVEs published between January 1st, 2022, and November 15th, 2022, the average analysis time takes 5 days 22:14:22 (the Standard Deviation is 3 days 15:42:25). I will leave it to the math geeks to decide which number is more valuable, but either number is excellent considering the growth of CVEs.

The quickest analyzed CVE this year is CVE-2022-43488 at 27 minutes, and the slowest is CVE-2021-35036 at over 213 days.

Since everyone loves a good visualization, here is a quick graph (and code) of the analysis timing.

Note: The notebook is not the fastest, and with the API rate limiting, it took just over 7 hours to run the 22,619 requests. If you are interested in looking at the data, it is all here in a CSV.


For a recent project, I needed all the NVD CVE and EPSS data in Elasticsearch and couldn’t find an easy way to do it, so I built CVElk. CVElk quickly builds a local Elastic Stack using docker compose with the help of a simple shell and python script.

Philipp Krenn from Elastic also contributed an updated dashboard to the project to help with the data visualization.

Want to Help?

The project’s code is hosted on GitHub, and I am always happy to try to implement any improvements and would love to see this become useful to a wide group of people.

Security Summer Camp Talks I Can’t Wait To See

Security Summer Camp, as it is colloquially known, is three security conferences that occur during the same week in Las Vegas.

The three conferences that make up Security Summer Camp are:

While preparing for these conferences, I dug through their schedules and picked out the talks I was interested in catching.

BSides Las Vegas

BSides Las Vegas is back with a fantastic schedule and one of the best community events of the year.
Here are a few fantastic talks I will try to catch.

Blackhat USA

Blackhat USA is the “commercial conference” of the three but still has a great lineup of talks.
Here are the talks I will be catching (and presenting at)


DEF CON is probably the most well-known hacker conference in the world, and rightfully so with the quality of talks this year.

Along with these talks, they have these interest-specific villages where I will spend a lot of time. Here are the villages where I know I will be spending time.

Wrapping Up

Did I miss anything you are interested in seeing, or will you be in Las Vegas and want to chat? Let me know on Twitter at @JGamblin.

On the road again…

Covid restrictions are starting to be relaxed, so I am beginning to feel like Willie and am getting on the road again, and in the next six weeks, I am attending and presenting at these four amazing events.


I am amazingly excited to attend bsides.ky in the Cayman Islands at the end of May, where I will be leading a workshop on using Jupyter Notebooks for Security Professionals.


BSidesSF is one of my favorite BSides events, from my first time in 2014 when it was held at the DNA Lounge in SOMA to recently when it moved to City View at Metreon to be closer to RSA. I am looking forward to catching up with everyone at the event.


The RSA conference has always been the start of the security conference year, and the fact it is happening in early June is going to be interesting to see how it changes the conference. I will attend and give multiple talks on CVE data and Risk Based Vulnerability Management.


I am lucky to be able to attend Cisco Live for the first to talk to our customers and attendees about Risk Based Vulnerability Management.

If you are attending any of these events, I would like to grab some time and hear what in security is interesting to you. Please ping me on Twitter or drop me an email if you will be around.

Introducing CVE.ICU

I have spent a lot of time this year working with CVE data and most of that time in Jupyter notebooks. Over the holiday season, I decided to build a website from these notebooks using Github Actions, Github Pages and NBConvert.

CVE.ICU ended up being the end product, and here is the source code. It is still an early work in progress, but please let me know if you see anything I should add.

Tracking CPE Data Quality Issues

In a Study in Scarlet, Sherlock Holmes said, “It is a capital mistake to theorize before one has data,” which is one of my favorite Sherlock quotes. For the last month or so, my team has been dealing with missing CPE data points in the Mitre CPE data, and it finally forced me to set down and put together a new tool to analyze the data.

What Are CPEs?

CPE is an acronym for Common Platform Enumeration. It is a standardized method of describing and identifying classes of applications and operating systems in a common format as described in this NIST document.

How Are They Used?

The most common use case for CPE data is fairly straightforward; you want to find all CVEs affecting either a software package or an operating system you run. The NVD actually provides an API to allow you to do these lookups programmatically.

What Are The CPE Data Quality Issues?

When a company attaches a CPE to CVE, it has four optional data points in the JSON Scheme:

  • VersionStartIncluding
  • VersionStartExcluding
  • VersionEndIncluding
  • VersionEndExcluding

These data points allow you to narrow down the version of the software that is vulnerable to the CVE.

The correct usage of these fields is present in CVE-2020-6572. Looking at the data provided, we know anything that is Chrome 81.0.4044.92 or older is vulnerable and should be patched.

The incorrect usage of this field is present in CVE-2015-8960. Looking at the data provided, we have to assume that all versions of Chrome (IE, Firefox, Safari, and Opera) are still vulnerable.

How Many CVEs Have This Problem?

As of today, 71,811 CVEs have at least one CPE that does not include version information. Not all of these are wrong, as if your CPE is mapped to a unique version, you can find an upgrade path to remove the vulnerability.

Real World Example:

To test the data, I decided to see how many CVEs with open CPE for the 3 major browsers (Chrome, Firefox, Edge) existed.

Web BrowserCVE CountCVEs
(10 newest listed)

What Can Be Done?

In a perfect world, Mitre and NVD would make these fields mandatory and remove the ability to assign a non-versioned CPE (ex: cpe:2.3:a:microsoft:edge:-:*:*:*:*:*:*:*) to a CVE.

Where is the Code?

The code is in this jupyter notebook and can be run on Colab:


CVE Prophet

I was recently asked if I had ever thought about trying to predict CVE growth. I had not, or really didn’t even know where to start, but after some research, I found the Prophet project that is a forecasting algorithm open-sourced by Facebook and uses the GAM family of algorithms.

Using prophet with the NVD data in a Jupyter notebook was a lot easier than I expected, and for the first iteration, I am thrilled with the outcome.


This is the default prediction graph from Prophet.
This is the default prediction graph from Prophet with change points added.


Looking at the individual data points is extremely interesting. Here are the top 10 predicted days for the rest of the year, and it will be interesting to see how close the prediction is.

DatePredictionPrediction LowPrediction High


I have put the Jupyter notebook in this Github Repo and will continue to make updates and tweaks to explore time series prediction.

Exploited in the Wild? What Does That Even Mean?

The first quarter of 2021 has been a busy quarter for the Project Zero (P0) team as they announced 16 “in the wild” zeros days. That is one new announcement a week on average. This is great for driving news cycles or if you’re in marketing and need some FUD to help sales. This isn’t so great if you are on a security team and have to deal with the buzz these announcements cause every week; redirecting time and resources that could otherwise be used by your team to remove the existing risk on your network. 

Here is a quick breakdown of the 16 CVEs that P0 has released this year: 

CVEProductKnown Exploit
CVE-2021-1647Microsoft DefenderTRUE
CVE-2021-21148Google ChromeFALSE
CVE-2021-21017Acrobat ReaderFALSE
CVE-2021-1732Microsoft WindowsTRUE
CVE-2021-26855Microsoft ExchangeTRUE
CVE-2021-26857Microsoft ExchangeTRUE
CVE-2021-26858Microsoft ExchangeTRUE
CVE-2021-27065Microsoft ExchangeTRUE
CVE-2021-21166Google ChromeFALSE
CVE-2021-26411Microsoft IEFALSE
CVE-2021-21193Google ChromeFALSE

Of the 16 announcements by P0, only 6 of them have publicly available proof of concept code and only the Exchange CVEs have been weaponized as far as I can tell.  That means a lot of companies have spent a lot of resources rushing emergency patches out to their systems to defend against zero-days that make huge news headlines like these:  

The problem is while I am sure that this is a legitimate iOS security vulnerability and P0 probably did observe one group of actors using it against another group of actors; but what risk does it pose to the average system and person on the internet?

It’s important that security teams know that they need to put out the proverbial “fire” when the  “exploited in the wild” alarm is sounding. Unfortunately, a lot of these disclosures are like a fire alarm that sounds anytime there is a fire anywhere in your city versus in your actual building. If this happens too often, teams will lose faith in the “in the wild” moniker and may skip critical vulnerabilities; or alternatively, teams may exert time fixing low-risk vulnerabilities that make the headlines instead of the widely exploited vulnerabilities that are actively being used by cybercriminals.  


Vulnerabilities likely to introduce the most likely risk to your environment are vulnerabilities that have high volume (Windows vulnerabilities)  and vulnerabilities with a high velocity of exploitations (Notpetya ransomware and Mirai botnet) and should be treated differently than vulnerabilities that are low volume targeted attacks that make up the vast majority of these P0 CVEs. 

To be blunt, if an exploit is being used to target a group of people by a nation-state, it should be reported, but it is not the same as a widespread automated exploit with public code and many groups exploiting it, and it shouldn’t be treated as such. Even if we added a modifier like “privately exploited in the wild” and “publically exploited in the wild” it would be easier for security teams to understand the true risk and when they need to quickly patch their systems. 

Until we figure this out I am going to go reboot my iPhone because I have to protect myself from another zero-day.

What Day Had The Most CVEs Published?

That was the simple question I asked myself on Saturday morning, thinking the answer would likely be simple to find. It wasn’t and ended up 48 hours later with me building this jupyter notebook to find out.

I really thought it would be as easy as pulling down the NVD data feeds and running a simple nvd['Published'].value_counts().head(10) to find out that 1098 of 146450 CVEs were published on 2004-12-31.

I even produced a nice little graph:

Except, looking at it, that data didn’t make much sense. With some more research and help, it became clear the data quality from NVD is pretty poor.

Using a tool called MissingNo to get a visualization makes it obvious that only about half the CVEs in the data are complete:

White Space is Missing Data

When you drop CVEs that are missing the CVSS BaseScore to clean up the data here is what the new graph looks like:

The “best” answer to What Day Had The Most CVEs published appears to be 2020-04-15 with 508 of 72964 CVEs published that date.

Here is what the top 10 days looks like:

2020-04-15    508
2018-07-09    431
2019-12-18    364
2018-06-11    349
2018-02-15    340
2017-08-08    316
2019-09-27    309
2020-03-12    307
2018-04-18    281
2017-04-24    281

All that being said, I am not a Data Scientist so I am open to any pull requests or suggestions on how to improve the data in the notebook I built.

CVE Stuffing

I monitor the @CVENew Twitter feed to keep up with any interesting new vulnerabilities that are released. On December 11th CVE-2020-29589 was published claiming that “the kapacitor Docker images through 1.5.0-alpine contain a blank password for the root user” and that it has a CVSS score of 9.8.

This CVE was just a re-report of CVE-2019-5021, which I researched last year when it came out. AlpineLinux rightfully claims in their write up that “You are not affected unless you have shadow or Linux-pam packages installed.” Checking the DockerFile for the Kapactior image, it has neither package installed, so this container is not affected by either the root CVE-2019-5021 vulnerability or even the new CVE-2020-29589 it was just given. Mistakes happen, so I reached out to InfluxData to ask them to dispute the CVE and moved on with my day.

Then it started to happen. Over the last 7 days, the following CVEs were filed claiming the same issue with no verification or even attempting to reach out to the container owners to let them know a CVE was filed.

The descriptions have even started to worsen as with CVE-2020-35466, which lists the affected product as “Blackfire Docker image – store/blackfire/blackfire“, making it impossible even to check if the vulnerability exists.

With the expansion of CNAs, I know that the overall amount of CVEs will explode, with XSS bugs in specialty software like CVE-2019-14478 becoming more common. However, as long as there is some effort to verify the vulnerability, the data is still useful. If we get to the point where you can not even trust the data in a CVE is accurate, security teams’ ability to mitigate vulnerabilities becomes impossible. As Michael Roytman told me, “The only thing worse than no data is bad data,” and that is what is happening here; the CVE database is being stuffed with bad data. I have not found a way to contact the NVD or Mitre about these CVEs and am only having mixed luck letting the container owners know to dispute the CVEs.

Site Footer