Maker Blog Series
An Introduction to the Seaborn Objects System
Nov 22, 2022
By Joshua Ebner
There’s recently been a major update to the Python data science ecosystem: a new data visualization toolkit called the Seaborn objects system. It’s still very new, but I’d argue that it’s likely to become one of the best, most user-friendly data visualization toolkits for Python. In this tutorial, I’m going to give you a quick overview of the Seaborn objects interface. I’ll explain what it is, why I like it, and how it works, and I’ll show you some examples.
I’ll remind you that the Seaborn objects system is very new. That being the case, I recommend that you read the whole tutorial; everything will make more sense that way.
What Is the Seaborn Objects System?
As you may know, Seaborn is a data visualization toolkit that has existed for several years. In its original form, it had a variety of functions for creating standard data visualizations like scatter plots, line charts, bar charts, etc.
In September of 2022, the Seaborn team released a new version of the Seaborn data visualization package (version 0.12). Along with it, they released an all-new system for doing data visualization in Python. It’s called the Seaborn objects system, and it’s based on the Grammar of Graphics, like Tableau and ggplot2 from R. It’s powerful, flexible, and easy to use. And because data visualization is so important, I’d argue that this is an important development for the Python data science ecosystem.
How Seaborn Objects Is Different From the Old Seaborn System
Before we get into more of the details of this new system, let me quickly explain how it’s different from the previous Seaborn toolkit.
Perhaps the most important feature of this new system is that it’s modular. Let me explain.
The traditional way of building a data visualization in Seaborn and many other visualization systems entails one function for each plot type: there’s a function to create scatter plots (sns.scatterplot), a function to create bar charts (sns.barplot), and a function to create line charts (sns.lineplot). And if you need to plot a trend line on top of a scatter plot, there’s another specialized function that you need to use (sns.regplot).
This is fine if you’re building run-of-the-mill scatter plots, bar charts, or simple visualizations. But it makes things complicated if you’re trying to build more complex plots, particularly plots with multiple layers and mark types.
So the traditional Seaborn system was simple, but somewhat inflexible.
Seaborn Objects: A Modular System for Data Visualization
In contrast, the new Seaborn objects system is a modular system for building visualizations. So instead of having multiple different functions to create different kinds of plots, there’s one generalized Plot class, which initializes plotting for all visualizations.
Then, there are different mark objects that you can add to that plot, each through the same add method. Want a scatter plot? Then you add dots. Want a line chart? Add lines. And so on.
And if you want a regression plot (a scatter plot with a trend line over it), then you can add dots and lines together. It’s totally modular, which enables you to build complex, multi-layer plots with a very simple syntax.
A Simple Syntax for Modifying Your Plots
Moreover, this new Seaborn system provides an intuitive system for making modifications to your plots. There’s a simple syntax for modifying axes and scales (much more intuitive than Matplotlib).
And there’s a simple syntax for making multi-panel plots. Now if you’ve read any of my past work on my blog, you’ll know that I absolutely love small multiple charts. One of my big frustrations with Matplotlib and Seaborn was that they didn’t always work well for creating small multiples.
Well, this new Seaborn objects system fixes that. Creating small multiples and pair plots is as simple as calling an additional method, as you’ll see in example 5 of my tutorial. It’s very simple, which makes it powerful.
Having said all of this, the way I’ve explained it here might seem a little abstract. Therefore, let’s dive into the syntax, so you can see it. After that, I’ll show you some examples to really help you understand it in a concrete way.
→ This post is continued on Anaconda Nucleus.
About the Author
Joshua Ebner is the Founder and Chief Data Scientist at Sharp Sight, a data science training company. Before founding Sharp Sight, Joshua did data science and analytics at Apple, Bank of America, and other Fortune 500 firms. He has a degree in physics from Cornell University.
About the Maker Blog Series
Anaconda is amplifying the voices of some of its most active and cherished community members in a monthly blog series. If you’re a Maker who has been looking for a chance to tell your story, elaborate on a favorite project, educate your peers, and build your personal brand, consider submitting an abstract. For more details and to access a wealth of educational data science resources and discussion threads, visit Anaconda Nucleus.