Jupyter Notebook - Materials Informatics

Jupyter Notebooks are interactive documents that combine:

Code cells: Executable Python code
Markdown cells: Rich text documentation
Outputs: Results of code execution, displayed below the cell that produced them

They are ideal for:

Data exploration and analysis
Teaching and learning
Research reproducibility
Creating reports with live code

Jupyter Notebook Structure

Cell Types

1. Code Cells

Code cells contain executable Python code. When you run a cell, its output appears directly below it.

# Code cell - executes Python
import numpy as np
import matplotlib.pyplot as plt

# Create data
x = np.linspace(0, 10, 100)
y = np.sin(x)

print(f"Created {len(x)} data points")

Created 100 data points

2. Markdown Cells

Markdown cells contain formatted text, equations, and documentation. They are rendered rather than executed; the code cell below is shown for contrast.

# This is a code cell showing an example
message = "This is from a code cell"
print(message)

This is from a code cell

Jupyter Features

Cell Execution Shortcuts

Shortcut	Action
`Shift + Enter`	Run cell and move to next
`Ctrl + Enter`	Run cell and stay
`Alt + Enter`	Run cell and insert new cell below
`D, D` (press twice)	Delete cell
`A`	Insert cell above
`B`	Insert cell below
`M`	Change cell to Markdown
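`Y`	Change cell to Code

The single-key shortcuts (`D, D`, `A`, `B`, `M`, `Y`) work only in command mode; press `Esc` to leave edit mode first.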

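Magic Commands

Jupyter's IPython kernel provides special "magic" commands. Line magics start with % and apply to a single line; cell magics start with %% and apply to the entire cell: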

# Line magic - affects single line
%timeit sum(range(1000))

22.5 μs ± 179 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

%%timeit
# Cell magic - affects entire cell

x = sum(range(1000))

25.4 μs ± 6.8 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

Common magic commands:

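%timeit: Measure the execution time of a single line, averaged over repeated runs
%%time: Time a single run of the entire cell
%pwd: Print the working directory
%ls: List files in the working directory
%load: Load code from a file into the current cell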
%matplotlib inline: Display plots inline (default)

More details about magic commands can be found in the official documentation.
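Magic commands only work inside IPython and Jupyter. As a rough point of comparison, the sketch below reproduces the idea behind `%timeit` in plain Python with the standard-library `timeit` module. The choice of 7 runs follows the output shown above; the loop count of 1,000 is an assumption made to keep the example quick, and `%timeit` itself picks its loop count automatically.

```python
import timeit

LOOPS = 1_000  # assumed loop count; %timeit chooses this automatically

# Time the statement in 7 independent runs of LOOPS executions each
runs = timeit.repeat("sum(range(1000))", repeat=7, number=LOOPS)

# Convert each run's total time into an average time per loop
per_loop = [total / LOOPS for total in runs]

print(f"best of {len(runs)} runs: {min(per_loop) * 1e6:.1f} µs per loop")
```

The exact numbers depend on the machine, so compare timings only between runs on the same system.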

Debugging in Jupyter

Print Debugging

Use print statements to trace execution flow:

def calculate_energy(atomic_positions):
    print(f"Input shape: {atomic_positions.shape}")
    print(f"First position: {atomic_positions[0]}")
    
    # Distance of each atom from the origin (not yet an energy)
    distances = np.linalg.norm(atomic_positions, axis=1)
    print("Distance statistics:")
    print(f"  Mean: {np.mean(distances):.3f}")
    print(f"  Std: {np.std(distances):.3f}")
    
    return distances

# Test function
positions = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0]])
calculate_energy(positions)

Input shape: (3, 3)
First position: [0 0 0]
Distance statistics:
  Mean: 0.667
  Std: 0.471

array([0., 1., 1.])
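Print statements tend to pile up once the bug is found. One possible variation, sketched below, puts the trace output behind a flag so it can be switched off without deleting it. The function name `distances_from_origin` and the `verbose` parameter are hypothetical, and the sketch uses only the standard library (plain tuples instead of NumPy arrays) so that it is self-contained.

```python
import math
import statistics


def distances_from_origin(positions, verbose=False):
    """Return the Euclidean distance of each position from the origin."""
    origin = (0.0,) * len(positions[0])
    distances = [math.dist(p, origin) for p in positions]

    # Trace output is printed only when explicitly requested
    if verbose:
        print(f"Number of positions: {len(positions)}")
        print(f"Mean distance: {statistics.mean(distances):.3f}")

    return distances


positions = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
traced = distances_from_origin(positions, verbose=True)  # prints the trace
silent = distances_from_origin(positions)                # prints nothing
```

Both calls return the same list; only the amount of printed output differs.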

Checking Variable State

List the variables currently defined in the kernel's namespace:

# List all variables
%who

# List all variables with details
%whos

# Display value with type
value = 42
print(f"Value: {value}, Type: {type(value).__name__}")

calculate_energy	 message	 np	 plt	 positions	 x	 y	 
Variable           Type        Data/Info
----------------------------------------
calculate_energy   function    <function calculate_energy at 0x116a1ede0>
message            str         This is from a code cell
np                 module      <module 'numpy' from '/Us<...>kages/numpy/__init__.py'>
plt                module      <module 'matplotlib.pyplo<...>es/matplotlib/pyplot.py'>
positions          ndarray     3x3: 9 elems, type `int64`, 72 bytes
x                  ndarray     100: 100 elems, type `float64`, 800 bytes
y                  ndarray     100: 100 elems, type `float64`, 800 bytes
Value: 42, Type: int
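`%who` and `%whos` are IPython magics, so they are unavailable in a plain Python script. The sketch below shows roughly how a similar summary could be built by hand from a dictionary of names. The helper `summarize_variables` is hypothetical and far simpler than what `%whos` actually does.

```python
def summarize_variables(namespace):
    """Return (name, type name, short repr) rows for public names in a dict."""
    rows = []
    for name, value in sorted(namespace.items()):
        # Skip private and internal names such as _hidden or __name__
        if name.startswith("_"):
            continue
        rows.append((name, type(value).__name__, repr(value)[:40]))
    return rows


# A small stand-in namespace; in a notebook one could pass globals() instead
example = {"value": 42, "message": "This is from a code cell", "_hidden": None}

rows = summarize_variables(example)
for name, type_name, info in rows:
    print(f"{name:<10}{type_name:<8}{info}")
```

When called with `globals()` inside a notebook, the result also lists IPython helpers such as `In` and `Out`, which `%whos` filters out.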

Markdown for Documentation

Basic Syntax

# Heading 1
## Heading 2
### Heading 3
**Bold** and *italic*
Unordered lists with -
Ordered lists with 1.
[Link](url)
`Inline code`
Code blocks with triple backticks

# Example of formatting in a markdown cell (this is a comment in code cell)
print("Run this cell to test Jupyter setup")

Math in Markdown

Inline math: $E = mc^2$

Display math:

$$
F = ma
$$

Array notation:

$$
\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
$$

Common Pitfalls

Cell execution order matters: Always use “Restart & Run All” before sharing
Global state pollution: Variables defined in earlier cells, including cells that have since been edited or deleted, stay in memory and may interfere
Hidden state: Restart the kernel when the analysis changes significantly
Large outputs: Clear cell outputs before committing to Git
Forgotten imports: Put all imports at the top of the notebook
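The "Large outputs" item above can be automated. A notebook file is JSON, and in the nbformat 4 layout each code cell stores its results under `outputs` and its run counter under `execution_count`. The sketch below clears both on an in-memory dictionary. The function name `clear_outputs` and the sample notebook are illustrative assumptions, not part of any Jupyter API.

```python
import json


def clear_outputs(notebook):
    """Strip outputs and execution counts from a notebook dict (nbformat 4)."""
    for cell in notebook.get("cells", []):
        # Only code cells carry outputs; markdown cells are left untouched
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


# Tiny in-memory notebook standing in for the contents of an .ipynb file
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 3,
            "source": ["print('hi')"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["hi\n"]}
            ],
        },
    ],
}

cleaned = clear_outputs(notebook)
print(json.dumps(cleaned["cells"][1], indent=1))
```

For a real file, the dictionary would come from `json.load` on the `.ipynb` file and be written back with `json.dump`. Working on a copy first is a sensible precaution.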