From Notebooks to Production Data Science Systems
Briefly

In this episode, Catherine Nelson explains how to take data science work from local exploration to robust production workflows. Drawing on a background in geology and her later move into data science, she discusses the tools and techniques needed to move beyond Jupyter notebooks. Catherine highlights Python basics, dependency management, and version control as the skills that help data scientists improve the quality and efficiency of their code for production settings. She also points newcomers to Python toward practical resources for learning the necessary concepts before diving into the discussion.
She emphasized that moving from exploratory data analysis in Jupyter notebooks to production takes not just technical skills but also software engineering principles.
Catherine underscored the need for familiarity with version control systems like Git, which play a crucial role in collaborative work and in moving notebook-based projects into production environments.
Dr. Nelson described an understanding of Python packages and dependency management tools as vital for a smooth transition from local notebooks to deployable applications.
Addressing newcomers, she advised building a solid foundation in Python basics in order to engage effectively with the discussion of refactoring Jupyter notebooks for production.
Read at Talkpython