A few weeks ago, I attended the Software Sustainability Institute’s Collaborations Workshop 2018. The workshop was a great opportunity to meet fellow research scientists and discuss best practice for scientific software development. During the workshop, I attended an impromptu mini-workshop on the newly released JupyterLab.

At the time, I was a few weeks away from submitting my thesis, and so was spending a lot of time re-running my various codes and plotting the results. I’d got into the habit of doing this plotting in Jupyter notebooks, and as soon as JupyterLab was released, I installed it and started playing around. I was instantly won over by the fact that I could have multiple panes open in a single window, and so no longer needed to have multiple tabs open on my browser in order to run multiple notebooks. By the time I attended the workshop, I’d been using JupyterLab pretty extensively for about three weeks.

I was therefore pretty surprised to find that of the 20 or so people in the room, I was the only one who had yet done anything substantial with JupyterLab. I also seemed to be in the minority when it came to using Jupyter notebooks for anything more than teaching or simple examples.

During the workshop, I ended up demoing JupyterLab with some of the notebooks I was using at the time to generate plots for my thesis. In this post, I will outline how I use Jupyter in my work and why I think it’s so useful. I’ll also describe my switch to JupyterLab and why I am not going back to standard notebooks.

Jupyter

Example notebook from R3D2 documentation.

Jupyter Notebook is an open-source web application that allows you to create documents that contain both live code and rich text. Blocks of code or rich text are contained within cells. These cells can then be run individually (and not necessarily in sequence). This mix of code blocks and rich text makes notebooks much more human-readable than source code alone: the code itself can be presented right alongside its output (e.g. numbers, plots, tables), plus prose, images and equations to describe it.
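To make the idea of cells a little more concrete, here is a rough sketch of what a notebook looks like when saved to disk, assuming the version 4 notebook JSON format. The cell contents are made up for illustration, and a real notebook would normally carry more metadata than this.

```python
import json

# A notebook is stored as a JSON document containing a list of cells.
# This hand-written example assumes the version 4 notebook format;
# the cell contents are purely illustrative.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Free fall\n", "Distance fallen is $s = g t^2 / 2$."],
        },
        {
            "cell_type": "code",
            "execution_count": None,  # not yet run
            "metadata": {},
            "outputs": [],  # filled in when the cell is executed
            "source": ["g = 9.81\n", "t = 2.0\n", "0.5 * g * t**2"],
        },
    ],
}

# Serialise to the text that would be saved in a .ipynb file.
notebook_json = json.dumps(notebook, indent=1)

# Each cell records its own type, so prose and code can be mixed freely.
cell_types = [cell["cell_type"] for cell in notebook["cells"]]
print(cell_types)
```

Because each cell carries its own type, source and outputs, the application can run a single code cell and update only that cell's stored output.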

For this reason, one of the main uses of Jupyter notebooks so far has been for teaching (I did this myself when I gave a course on scientific programming last year). Small code snippets can be given, interspersed with large chunks of explanatory text and diagrams. The fact that cells can be re-run individually makes them ideal for experimenting. The graphical, web-based interface of the notebooks also looks more like the programs beginners use on a daily basis, and so can be less intimidating for them than starting with an IDE or the terminal.

Using Jupyter for research

I’ve used Jupyter notebooks for various things throughout my PhD. The two places I’ve found them most useful are for experimenting during code development and for plotting.

Experimentation & exploration

When I first start developing code, I often like to try out a number of different things before settling on the final implementation. Regardless of which language the ‘real’ code will be written in, I like using Python for these experiments due to its speed and relative simplicity (I don’t want to be worrying about invalid memory accesses when I’m trying to work out whether my algorithm works).

I find the notebook is ideal for recording these experiments. I can lay out my different attempts in one document and present my thoughts and working in a way that is easy to look back at later. The code cells allow me to change small parts of the code and re-run sections without having to run the whole code again.
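As a rough illustration of this workflow, here is a sketch of how a few cells might be laid out when comparing two attempts at the same problem. The problem itself (integrating sin(x) with two simple quadrature rules) and all the function names are my own stand-ins, not code from my thesis; the comments mark where the cell boundaries would fall.

```python
import math

# --- Cell 1: set up the test problem once ---
def f(x):
    return math.sin(x)

a, b = 0.0, math.pi
exact = 2.0  # integral of sin(x) from 0 to pi

# --- Cell 2: first attempt, midpoint rule ---
def midpoint(f, a, b, n):
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# --- Cell 3: second attempt, trapezoidal rule ---
def trapezoid(f, a, b, n):
    h = (b - a) / n
    interior = sum(f(a + i * h) for i in range(1, n))
    return h * (0.5 * f(a) + interior + 0.5 * f(b))

# --- Cell 4: compare the attempts; re-run only this cell after changing n ---
n = 100
errors = {
    "midpoint": abs(midpoint(f, a, b, n) - exact),
    "trapezoid": abs(trapezoid(f, a, b, n) - exact),
}
for name, err in errors.items():
    print(f"{name}: error = {err:.2e}")
```

Changing n, or editing one of the two attempts, only requires re-running the affected cell and the comparison cell; the set-up cell stays untouched.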

Plotting

Using the notebook for plotting.

The main thing I’ve used Jupyter for over the course of my PhD has been plotting. Especially when generating plots for my thesis, I often wanted to rerun the same plotting script many times with small tweaks until I got the figure that I wanted. Being able to see the resulting figure right next to the code that generates it is incredibly useful for this. Again, the code cells are also helpful here: I only need to load in the (sometimes quite large) datasets once. They then exist in the notebook variable space until I overwrite them or shut down the notebook. Plots for the same dataset can be generated in the same document, making it easier to explore the data and find the best way to present it.
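The load-once, plot-many-times pattern might look something like the sketch below, assuming matplotlib is installed. The dataset here is a small synthetic stand-in generated on the fly, and load_dataset and plot_signal are hypothetical names of my own choosing; in practice the first cell would read a large file from disk.

```python
import math
import matplotlib.pyplot as plt

# --- Cell 1: load the data once (synthetic stand-in for a large dataset) ---
def load_dataset(n_points=500):
    xs = [10.0 * i / (n_points - 1) for i in range(n_points)]
    ys = [math.exp(-0.3 * x) * math.cos(2.0 * x) for x in xs]
    return xs, ys

xs, ys = load_dataset()

# --- Cell 2: plotting function; edit and re-run without reloading the data ---
def plot_signal(xs, ys, color="C0", linewidth=1.5, title="Damped oscillation"):
    fig, ax = plt.subplots(figsize=(6, 3))
    ax.plot(xs, ys, color=color, linewidth=linewidth)
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    ax.set_title(title)
    fig.tight_layout()
    return fig, ax

# --- Cell 3: first version of the figure ---
fig, ax = plot_signal(xs, ys)

# --- Cell 4: a tweaked version, reusing xs and ys already in memory ---
fig_tweaked, ax_tweaked = plot_signal(
    xs, ys, color="C3", linewidth=2.5, title="Thicker line"
)
```

Only the last two cells need re-running while fiddling with the figure; xs and ys stay in the notebook's variable space the whole time.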

JupyterLab

JupyterLab.

The beta of JupyterLab was released at the end of February. Alongside Jupyter notebooks, JupyterLab allows you to use text editors, terminals, data file viewers and other custom components in a tabbed work area. It therefore feels much more like a full IDE than the notebooks do. As mentioned above, I often have multiple notebooks open at once. The ability to have these instead contained within multiple tabs (and multiple panes within the same window) pretty much sold the Lab for me instantly - the redesigned work area is much more convenient for those who like having everything open at once. Similarly, I love being able to have a terminal open in the same window so that I can run my code to generate the data I then plot in the notebook without having to continuously switch and resize windows.

The notebooks themselves have also had a bit of an update. Cells can now be reordered by dragging and dropping (previously, this required clicking on arrow buttons). They can also be copied and pasted between notebooks. Multiple instances of the same notebook can be opened at once, allowing you to compare different parts side by side.

JupyterLab is built on top of an extension system, which means that it is incredibly customisable. These extensions can provide themes, file editors & viewers, renderers and advanced settings for the Lab environment. Already, there are a number of community-developed extensions including support for live-editing of LaTeX documents, viewers for ipywidgets, Bokeh and plotly, and GitHub browsing. It will be interesting to see how these develop in the future - I think this extensibility gives the Lab the potential to seriously rival standard IDEs as the development environment of choice.

Summary

I have found Jupyter notebooks to be an invaluable tool during the course of my PhD for experimentation and data visualisation. The newly released JupyterLab provides an updated environment for Jupyter notebooks, generally improving their usability. I believe it has the potential to turn into a tool that could (in the not so distant future) rival standard IDEs.