Intake released on Conda-Forge

 

Intake is a package for cataloging, finding and loading your data. It has been developed recently by Anaconda, Inc., and continues to gain new features. To read general information about Intake and how to use it, please refer to the documentation.

Until recently, Intake could only be installed with Conda from the intake channel on anaconda.org. This reflected the rapid development and short release cycle of the project. Conda-Forge is an effort to provide an automated path for releasing open-source projects to the public within the Conda package manager ecosystem. From now on, Intake is available on Conda-Forge, and the recommended installation command is:

conda install -c conda-forge intake

Preparations for release

To make Intake and its data drivers more widely available, we first packaged and released the project on PyPI for installation with pip. A PyPI release is the usual precursor to a Conda-Forge release, but it also provides a separate installation route directly with pip:

pip install intake

The above command is expected to work well, but this method of installation is not yet as well tested as the Conda path.

Furthermore, we have explicitly added tests for Intake running under Windows. After fixing a number of path-syntax-related bugs, we are confident that Intake should now work well for all Windows users.

Python 2 support?

Intake does not currently run under Python 2. This has been a design choice, made in order to be able to develop more quickly. As much of the Python stack (numpy, pandas, etc.) is dropping Python 2 support, it did not seem important enough to put in the effort to add it to Intake, which is, of course, a new project without the worries of backwards compatibility.

If it turns out that there is significant pressure to support Python 2 as well, then this decision may be reversed. There is not much in the codebase that is unfriendly to Python 2; it is purely a question of developer time. However, some drivers depend on packages that are Python 3-only, and these will never be back-ported.

Release in Conda defaults

As of the time of writing, Intake is also being released on the Conda defaults channel (i.e., the one that is automatically available to any Anaconda or Conda install). This is a seal of approval that is much appreciated. It has the practical upshot that a simpler installation command, without the -c conda-forge option, will work, and installation will likely be much faster because the metadata of the Conda-Forge channel does not need to be downloaded. The process of preparing packages for defaults is somewhat more involved, since it requires that dependencies are also on defaults (otherwise there would be no point!), and so the versions of packages there are not necessarily as recent as on Conda-Forge.

To install from defaults:

conda install intake

Status of drivers

Currently (early February, 2019), the following packages have been released on Conda-Forge:

  • intake
  • intake-elasticsearch
  • intake-accumulo
  • intake-astro
  • intake-avro
  • intake-parquet
  • intake-spark
  • intake-sql
  • intake-xarray

and the following on defaults:

  • intake
  • intake-xarray

Please see the Intake project dashboard for details of releases of each package.
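
After installing by any of the routes above, it can be handy to check which of these packages are actually present in the current environment. The following is only a minimal sketch using the Python standard library (Python 3.8 or later). The helper names installed_version and report are hypothetical, and the sketch assumes that each Conda package registers Python package metadata under the same name as in the Conda-Forge list, which may not hold for every driver.

```python
from importlib import metadata

# Names copied from the Conda-Forge list above. Assumption: each package
# exposes Python metadata under the same name as its Conda package.
PACKAGES = [
    "intake",
    "intake-elasticsearch",
    "intake-accumulo",
    "intake-astro",
    "intake-avro",
    "intake-parquet",
    "intake-spark",
    "intake-sql",
    "intake-xarray",
]


def installed_version(name):
    """Return the installed version string for `name`, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def report(packages=PACKAGES):
    """Map each package name to its installed version, or None if not found."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    for name, version in report().items():
        print(f"{name}: {version or 'not installed'}")
```

A package reported as "not installed" here may simply use a different metadata name, so the project dashboard remains the place to check what has been released.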

