The Ultimate Guide to Open-Source Security with Python and R

Open-source software (OSS) has emerged as a powerful force, revolutionizing the way organizations approach data science and machine learning development, collaboration, and innovation. With a wealth of benefits including transparency, cost-effectiveness, and a vast community of contributors, open-source software has garnered widespread adoption across industries. However, open-source security brings challenges and threats every day that […]

2022 State of Data Science

This year, we conducted our State of Data Science survey to gather demographic information about our community, ascertain how that community works, and collect insights into big questions and trends that are top of mind within the community. 3,493 individuals from 133 countries and regions took part in the online […]

Podcast: Data Engineering as a Scientific Tool

Show Notes In this episode, host Peter Wang is joined by Dr. Patrick Kavanagh, an astrophysicist and software developer at the Dublin Institute for Advanced Studies. Patrick works on the James Webb Space Telescope (JWST), helping to write code that allows scientists to interpret the raw data they receive from space. Patrick talks to Peter about cleaning telescope data sets […]

Optimizing Python for Speed and Compatibility

Show Notes In the penultimate episode of season one, host Peter Wang and Carl Meyer, Software Engineer at Instagram (owned by Meta), discuss considerations around making Python faster while maximizing compatibility and performance. Several years ago, Carl and his team started working on a project called Cinder in an effort to improve CPU efficiency across Meta’s servers by “[optimizing] things at the […]

Climate Science, Scientific Computing, and Data Accessibility

Show Notes This episode’s conversation between host Peter Wang and Ryan Abernathey, Associate Professor at Columbia University in the City of New York, explores climate science, scientific computing, data accessibility, and more. Topics that Peter and Ryan cover include: cloud computing; open data and collaboration; climate science and the private sector; open-source projects like Pangeo Forge and Xarray; climate data […]

Shaping Best Practices for Monitoring ML Models

Show Notes In this episode, host Peter Wang is joined by Elena Samuylova, CEO and Co-Founder of Evidently AI. Peter and Elena discuss how Evidently AI’s open-source tooling is helping users monitor machine learning (ML) models, and why that’s important. Elena has found that Evidently AI’s open-source approach is attractive to data scientists and ML engineers who are ramping […]

Podcast: Snowflake and Advanced Analytics

Show Notes In this episode, host Peter Wang speaks with Torsten Grabs, Director of Product Management at Snowflake, about how Snowflake solutions support professionals in data science, machine learning, and advanced analytics. Torsten has worked with data throughout his entire career. At Snowflake, he focuses on Snowflake’s data lake, data pipelines, and data science workloads, as well as Snowflake’s […]

Podcast: Modern Complexity and the Cybernetic Future

Show Notes In “Autopoiesis in Systems of People and Machines,” Peter Wang welcomes Paco Nathan. Paco is a Managing Partner at Derwen, Inc., a company that offers enterprise customers full-stack engineering for AI applications at scale, with an emphasis on open-source integrations. Paco forged a career in artificial intelligence when many people were skeptical of it and now boasts […]