What a Marketer Learned During a CVE Curation “Party”
Aug 25, 2020
By Stacey Bucklin
When I joined Anaconda as a marketing manager just over a year ago, I was immediately in awe of the incredibly smart and even harder-working people surrounding me. From my seat in the marketing department, I watched our technical teams build and improve upon the powerful data science tools that over 20 million users know and love.
Our commitment to bettering the data science community was quickly apparent. At Anaconda, we passionately champion the value of open-source software. As the originators of Python for data science, we keep the open-source community at the core of everything we do.
Recently, I was asked to join in a company-wide project that brings additional value to our products and customers. Although I have to admit that I was a bit intimidated at first, I was intrigued to go behind the curtain and learn more about the important work Anaconda does to allow our customers to use open-source with confidence. Here is what I learned.
A cautionary tale
Mistakes and oversights are a fact of life, and writing code is no exception. Just like any other software, open-source software is not immune to vulnerabilities, whether they are introduced accidentally or maliciously. If an organization is not actively monitoring for and mitigating risk, it is very likely that potentially dangerous vulnerabilities will creep into its models and applications over time.
Unpatched software vulnerabilities have allowed bad actors to execute some of the highest-profile data breaches in the world. Hackers have gained access to extremely sensitive information and caused hundreds of millions of dollars in damages. Many of these breaches, and the associated human and financial impact that occurred as a result, could have been prevented. Vulnerability patches are often available, but they must be identified and implemented in order to be effective.
Clearly, there is value in knowing when and how the applications and code you use are vulnerable to attacks. Armed with this knowledge, you can proactively mitigate damage to users, brands, and the community at large. But where do you start?
Common Vulnerabilities and Exposures (CVEs)
CVEs, or Common Vulnerabilities and Exposures, are publicly disclosed software vulnerabilities cataloged in the CVE list, a program maintained by the MITRE Corporation. The National Vulnerability Database (NVD), run by the National Institute of Standards and Technology (NIST), scores each CVE for severity on a scale of 0 to 10 using the Common Vulnerability Scoring System (CVSS), a formula that takes into account factors such as impact and exploitability. Knowing the CVE scores and details for a piece of software enables you to assess its risk and determine whether it can be safely used in your environment.
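For the technically curious, here is a minimal sketch of how a 0-to-10 severity score is usually turned into a plain-language rating. It assumes the score is a CVSS v3.x base score and uses the qualitative bands from that standard; the function name is my own and is not part of any Anaconda tool.

```python
def cvss_v3_severity(score: float) -> str:
    """Map a CVSS v3.x base score (0.0 to 10.0) to its qualitative rating.

    Illustrative helper only; the name is hypothetical.
    """
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS scores range from 0.0 to 10.0, got {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"       # 0.1 to 3.9
    if score < 7.0:
        return "Medium"    # 4.0 to 6.9
    if score < 9.0:
        return "High"      # 7.0 to 8.9
    return "Critical"      # 9.0 to 10.0


if __name__ == "__main__":
    for example_score in (0.0, 3.1, 5.5, 7.8, 9.8):
        print(example_score, cvss_v3_severity(example_score))
```

A team might use a mapping like this to decide, for example, that only "High" and "Critical" findings block a release, though where to draw that line is a policy choice rather than something the score dictates.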
Software vulnerabilities should always be taken seriously, but in many cases, manually chasing down patches, references and release notes can be a time-consuming endeavor that can quickly lead you down the rabbit hole.
False positives further complicate the matter. CVE data is often messy, and vulnerabilities may be flagged that in reality have no impact on the code your team is using. These false alerts may seem benign; it’s best to be cautious, right? But consider the fact that many large organizations are inherently risk averse. They often block software that is exposed to CVEs from going into production, meaning false positives can quickly translate into significant monetary impact. (And we haven’t even discussed the topic of false negatives, which could fill an entire blog on its own!)
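To make the false-positive problem concrete, here is a rough sketch of one common check: does the version you actually have installed fall inside the range a CVE affects? Everything in it is made up for illustration, including the package name, the placeholder CVE identifier, and the field names, and it assumes simple dotted numeric versions. Real version strings are messier and need a proper version parser.

```python
def parse_version(text):
    """Turn a simple dotted version like '1.4.2' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def is_affected(installed, first_affected, first_fixed):
    """True if installed is in the range [first_affected, first_fixed)."""
    version = parse_version(installed)
    return parse_version(first_affected) <= version < parse_version(first_fixed)


def split_alerts(alerts, installed_versions):
    """Separate alerts that apply to the installed version from those that do not."""
    relevant, false_positives = [], []
    for alert in alerts:
        installed = installed_versions[alert["package"]]
        if is_affected(installed, alert["first_affected"], alert["first_fixed"]):
            relevant.append(alert)
        else:
            false_positives.append(alert)
    return relevant, false_positives


if __name__ == "__main__":
    # Hypothetical data: placeholder ID and package, not a real CVE.
    alerts = [
        {"cve": "CVE-0000-0001", "package": "examplelib",
         "first_affected": "1.0.0", "first_fixed": "1.4.2"},
    ]
    installed_versions = {"examplelib": "1.5.0"}
    relevant, false_positives = split_alerts(alerts, installed_versions)
    print("relevant:", [a["cve"] for a in relevant])
    print("false positives:", [a["cve"] for a in false_positives])
```

In this made-up case the installed version is newer than the fix, so the alert is a false positive. A scanner that matches on package name alone would still flag it, which is exactly the kind of noise that curation is meant to remove.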
Suddenly, curating CVE data is not as straightforward as it once seemed. It’s one thing for a single individual to dig into this project and divert valuable attention from revenue-generating activities. It’s another thing entirely for multiple individuals serving on different teams throughout an organization to go through this exercise. Imagine the man-hours (and dollars) invested. As a wise woman once said, “Ain’t nobody got time for that.”
Enabling open-source with confidence
As a leader in the open-source data science community, Anaconda is in a unique position to determine and distribute CVE data related to our packages, and to help our customers manage risk.
In early August, we invited the entire Anaconda team to a full day of CVE curation (really, we called it a “CVE curation party”). Non-technical team members like me shadowed our engineering colleagues, allowing everyone in the company to see first-hand just what goes into chasing down vulnerability data and determining what’s valid and what’s not.
We dove deep into dozens of CVEs associated with our Conda packages to curate the most accurate and relevant data available. Sometimes, this took us to the far reaches of the Internet or to web references from decades ago. It was an eye-opening experience for those of us outside of engineering who aren’t intimately involved with Conda on a daily basis. There’s nothing like hands-on experience to make you appreciate the complexity of a task like CVE curation. I’ll personally never look at CVEs the same way again.
At the end of our day-long event, we had manually curated hundreds of CVEs! And more importantly, everyone in the company has a newfound appreciation for the level of effort that goes into maintaining Conda packages and CVE data.
We’ll continue to pull together the most reliable data available in order to achieve our goal of providing our customers with justifiable information about the security state of the software installed in their Conda environments. Armed with the information our team has curated, Anaconda customers can use open-source with confidence.