At Anaconda, we value open source software
We believe it is a privilege to be able to share ideas-as-code with people around the world. We continuously seek productive, sustainable ways to strengthen the open source foundation and create the architecture of the future. Through our work, we aim to empower people to improve lives and solve the world’s greatest challenges.
Innovating with New Projects to Meet Enterprise Needs
While much of the software we write at Anaconda is open source from the beginning, some of our software is not immediately freely available. That software provides livelihoods for developers, which allows them to focus on contributing to open source. We hope you will download our software and be satisfied with both the software and the knowledge that you are contributing to the present and future ecosystem of open source.
Bokeh scales visualization to Big Data
Interactive and real-time streaming visualization framework that scales to Big Data with data shading
There are many excellent plotting packages for Python, but they generally do not optimize for the particular needs of statistical plotting or multidimensional datasets. Additionally, advanced visual customization is typically difficult for non-programmers, and most libraries do not build a reified data processing pipeline that supports rich interactivity like linked brushing. Bokeh addresses these problems at their core by using a declarative data transformation scheme, and is engineered to operate in a client/server model for the modern web.
Datashader is a graphics pipeline system for creating meaningful representations of large amounts of data
HoloViews is a library for analyzing and visualizing scientific or engineering data
GeoViews is a Python library that makes it easy to explore and visualize any data that includes geographic locations
GeoViews has particularly powerful support for multidimensional meteorological and oceanographic datasets, such as those used in weather, climate, and remote sensing research, but is useful for almost anything that you would want to plot on a map! You can see lots of example notebooks at geo.holoviews.org, and a good overview is in our blog post announcement.
GeoViews is built on the HoloViews library for building flexible visualizations of multidimensional data. GeoViews adds a family of geographic plot types based on the Cartopy library, plotted using either the Matplotlib or Bokeh packages.
matplotlib is an easy-to-use interactive tool for publication-quality scientific plotting
Jupyter Notebooks allow you to create and share documents that contain live code, equations, visualizations and explanatory text
PhosphorJS simplifies and speeds up web apps
Fast, flexible, and efficient web framework
PhosphorJS is a framework for building high-performance, pluggable, desktop-style web applications that integrate easily with existing web frameworks. The PhosphorJS framework has well-defined, efficient widgets and layouts that allow a developer to design high-performance, responsive desktop-style apps for the web that consistently achieve sub-millisecond layouts. This efficient design maximizes the execution speed of business logic.
Spyder is the Scientific Python Development Environment
Powerful interactive development and numerical computing environment for Python
Spyder is a powerful interactive development environment for the Python language with advanced editing, interactive testing, debugging, and introspection features. It also serves as a numerical computing environment, thanks to its support for IPython (an enhanced interactive Python interpreter) and popular Python libraries such as NumPy (linear algebra), SciPy (signal and image processing), and matplotlib (interactive 2D/3D plotting).
Spyder may also be used as a library providing powerful console-related widgets for your PyQt-based applications – for example, it may be used to integrate a debugging console directly in the layout of your graphical user interface.
Dask parallelizes data science workloads on multi-cores and distributed clusters
Makes it easy to write complex parallel algorithms for task execution
Dask is a framework for parallelizing algorithms. It takes advantage of the available memory and compute power to optimize the memory use, execution time, and overall performance of complex algorithms. Dask creates a task graph based on the data and then intelligently schedules the execution of the tasks to optimize throughput.
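To make the task-graph idea concrete, here is a minimal sketch that uses only the Python standard library. It is not Dask's scheduler: the `graph` contents and the `execute` helper are hypothetical names chosen for illustration. Each key maps either to a plain value or to a tuple of a function followed by the keys it depends on, and tasks whose inputs are ready are submitted to a thread pool together.

```python
from concurrent.futures import ThreadPoolExecutor
from operator import add, mul

# Hypothetical task graph: a key maps to a literal value or to a
# (function, *dependency_keys) tuple.
graph = {
    "x": 1,
    "y": 2,
    "sum": (add, "x", "y"),        # depends on x and y
    "prod": (mul, "x", "y"),       # independent of "sum", can run alongside it
    "total": (add, "sum", "prod"), # needs both intermediate results
}

def execute(graph, pool):
    """Run every task, submitting all tasks whose inputs are ready as one batch."""
    results = {k: v for k, v in graph.items() if not isinstance(v, tuple)}
    pending = {k: v for k, v in graph.items() if isinstance(v, tuple)}
    while pending:
        ready = [
            key for key, (_, *deps) in pending.items()
            if all(dep in results for dep in deps)
        ]
        if not ready:
            raise ValueError("cycle or missing dependency in task graph")
        futures = {
            key: pool.submit(pending[key][0], *(results[d] for d in pending[key][1:]))
            for key in ready
        }
        for key, future in futures.items():
            results[key] = future.result()
            del pending[key]
    return results

with ThreadPoolExecutor(max_workers=2) as pool:
    results = execute(graph, pool)

print(results["total"])  # (1 + 2) + (1 * 2) = 5
```

In this sketch "sum" and "prod" are submitted in the same batch because neither depends on the other. A real scheduler makes the same kind of decision with far more information about data size and locality.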
While developers can parallelize Python manually, Dask helps automate the task with rich primitives that are aware of the execution environment and optimize the analytic execution. The Dask collections build on this core to provide dask.array and dask.dataframe, which mimic NumPy and pandas but operate in parallel and on larger-than-memory datasets.
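The larger-than-memory idea can be sketched with plain NumPy, assuming the data arrives as a stream of chunks. This is an illustration of the blocked-computation principle only, not the dask.array API; `make_chunks` and `chunked_mean` are hypothetical helpers, and the generator stands in for chunks that would normally be read from disk.

```python
import numpy as np

def make_chunks(n_chunks, chunk_size):
    """Yield consecutive blocks of the sequence 0, 1, 2, ... one at a time."""
    for i in range(n_chunks):
        start = i * chunk_size
        yield np.arange(start, start + chunk_size, dtype=np.float64)

def chunked_mean(chunks):
    """Compute a mean while holding only one chunk in memory at a time."""
    total = 0.0
    count = 0
    for chunk in chunks:
        total += chunk.sum()   # partial reduction over this block
        count += chunk.size
    return total / count

result = chunked_mean(make_chunks(n_chunks=10, chunk_size=1000))
print(result)  # mean of 0..9999 is 4999.5
```

Each per-chunk sum is independent of the others, which is what allows a parallel collection to compute the partial results on different cores or machines and combine them at the end.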
Report bugs and make feature requests through the GitHub issue tracker. For community discussion, please use [email protected]
Numba speeds up NumPy and SciPy
Compiles Python into machine code for lightning fast execution
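As a rough sketch of what this looks like in practice, the example below decorates an explicit loop with Numba's `njit` so that it is compiled on first call. The function name `sum_of_squares` and the input data are illustrative assumptions. The `try`/`except` fallback is also an assumption added here so the example still runs, uncompiled, where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback (assumption for portability): run as ordinary Python.
    def njit(func):
        return func

@njit
def sum_of_squares(values):
    """Explicit loop over a 1-D array; Numba compiles it to machine code."""
    total = 0.0
    for i in range(values.shape[0]):
        total += values[i] * values[i]
    return total

data = np.arange(1000, dtype=np.float64)
result = sum_of_squares(data)
print(result)  # 332833500.0
```

The first call includes compilation time, so any speedup shows up on later calls with the same argument types.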
NumPy provides fast vectors, matrices, and arrays in Python
Among other things, it contains:
- a powerful N-dimensional array object
- sophisticated (broadcasting) functions
- tools for integrating C/C++ and Fortran code
- useful linear algebra, Fourier transform, and random number capabilities
Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data-types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases.
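A short sketch of two of the points above: broadcasting, and a user-defined data type for generic records. The field names and values in the record layout are made up for illustration.

```python
import numpy as np

# Broadcasting: a (3, 1) column and a (4,) row combine into a (3, 4) grid
# without writing an explicit loop.
col = np.array([[0], [10], [20]])
row = np.array([1, 2, 3, 4])
grid = col + row
print(grid.shape)  # (3, 4)

# Structured dtype: a hypothetical record layout with named, typed fields,
# similar in spirit to a row in a database table.
record = np.dtype([("name", "U10"), ("age", np.int32), ("score", np.float64)])
people = np.array([("Ada", 36, 91.5), ("Linus", 28, 84.0)], dtype=record)

mean_score = people["score"].mean()
print(mean_score)  # 87.75
```

Accessing a field by name, as in `people["score"]`, returns an ordinary array view, so the usual vectorized operations apply to each column of the records.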
NumPy is licensed under the BSD license, enabling reuse with few restrictions.
pandas provides fast, flexible, and expressive data structures for working with relational or labeled data
SciPy is a rich, powerful library for scientific computing
SciPy is a Python-based ecosystem of open-source software for mathematics, science, and engineering. In particular, these are some of the core packages:
- NumPy: Base N-dimensional array package
- SciPy library: Fundamental library for scientific computing
- Matplotlib: Comprehensive 2D plotting
- IPython: Enhanced interactive console
- SymPy: Symbolic mathematics
- pandas: Data structures & analysis
conda easily packages Python, R, NumPy, SciPy & more
Eliminates package dependency and version control issues
Conda is an innovative package manager tool that allows users to mix-and-match different versions of Python, NumPy, SciPy and other packages in isolated environments and easily switch between them.
The conda command is the primary interface for managing Anaconda installations. It is great for solving enterprise integration and application deployment challenges. It can query and search the Anaconda package index and current Anaconda installation, create new Anaconda environments, and install and update packages into existing Anaconda environments.
Blaze scales Python analytics to Big Data on multiple compute engines
Fast, scalable out-of-core computations on Big Data
Blaze extends the successful array-oriented programming model of NumPy and pandas to out-of-core, distributed, and streaming data. Blaze allows analysts and scientists to productively write robust and efficient code without getting bogged down in the details of how to distribute computation for all kinds of data, especially semi-structured, sparse, and columnar data.
Blaze supports data stores and stream engines including:
- Bcolz: compressed columnar
- MongoDB: NoSQL store
- SQLAlchemy: SQL store
- Apache Spark: cluster computing framework
- PyTables: high performance HDF5
- Streaming Python: streaming data