Intake: Discovering and Exploring Data in a Graphical Interface

 

Motivation

Do you have data that you’d like people to be able to explore on their own? Are you always passing around snippets of code to load specific data files? These are problems that people encounter all the time when working in groups and using the same datasources or when trying to distribute data to the public. Some users are comfortable interacting with data entirely programatically, but often it is helpful to use a GUI (Graphical User Interface) instead. With that in mind we have reimplemented the Intake GUI so that in addition to working in a jupyter notebook, it can be served as a web application next to your data, or at any endpoint.

What is Intake?

Intake is a lightweight set of tools for loading and sharing data. You might have seen earlier blog posts. Intake separates the concepts of the data engineer – the person curating, managing, and disseminating data – from the data user – the person analyzing and visualizing the data. The data engineer sets up catalog files describing data sources and the data user loads data without needing to know how it is stored.

Try it out

Binder

To use the GUI locally, pip/conda install intake and panel. Then run:

python -c "import intake; intake.gui.show()"

This command will open a browser tab containing a panel application (more on this in the “How it Works” section)

If you’d rather work in the notebook open a notebook and run:

import intake 
intake.gui

How to use it

Once you have a GUI, you can add and remove catalogs from the GUI using the + and – buttons, select data sources from within catalogs, read the yaml that those describes those sources. Use the plot button to see the pre-defined plots that the data engineer has defined (if you would like to define some plots see the plotting docs)

For more information on how to interact with the new GUI, see the GUI docs.

How it works (hint: Panel!)

The intake GUI is built using the latest project in the PyViz ecosystem: Panel. Panel provides a composable dashboarding solution for displaying plots, images, tables, or text and widgets for interacting with them. Panel works both in a Jupyter notebook and in a standalone secure web server using the same source code.

From the data engineer perspective this means that you can serve the intake GUI at an endpoint and use it as a data exploration tool for your data users. It also means that it is straightforward to customize and reorganize the GUI to insert your own logo, reuse parts of it in your own applications, or add new functionality.

What’s next

We have ambitious plans for the future of this GUI. In the near future, we anticipate adding a mechanism for inputting user parameters, adding some basic custom plotting, and allowing the editing and saving of catalogs. See GH#225 for a full discussion of the vision.


You May Also Like

News
Anaconda Expert to Discuss Data Science Best Practices at MIT CDOIQ Symposium
CAMBRIDGE, MA— July 11, 2017—Continuum Analytics, the creator and driving force behind Anaconda, the leading Open Data Science platform powered by Python, today announced ...
Read More
Enterprise Data Science
The Human Element in AI
The over 45 speakers at AnacondaCON 2019 delved into how machine learning, artificial intelligence, enterprise, and open source communities are accomplishing great things with...
Read More
News
Anaconda Enterprise Receives Honors in Fourth Annual Datanami Readers’ and Editors’ Choice Awards
SAN DIEGO, Sept. 23, 2019 — Anaconda’s enterprise data science platform has been recognized in the fourth annual Datanami Readers’ and Editors’ Choice Awards, pres...
Read More