Python Recap

We will start with the basics of Python programming. This includes setting up the programming environment, writing simple Python scripts, and using essential Python libraries for data manipulation and visualization. Since most of the students already know Python, this section will be a quick recap.

Introduction to Python

Python is a high-level, interpreted programming language created by Guido van Rossum and first released in 1991. Its design philosophy emphasizes readability and simplicity, making it an excellent choice for both beginners and experienced programmers.

Key Features of Python

  • Easy to Learn and Use: Python’s syntax is straightforward and easy to understand, which makes it an ideal language for beginners.

  • Interpreted Language: Python code is run by an interpreter without a separate compile-and-link step, so you can execute and test code interactively, which makes debugging easier.

  • Dynamically Typed: Python does not require explicit declaration of variable types, as the interpreter infers the type based on the value assigned.

  • Extensive Standard Library: Python comes with a rich standard library that provides modules and functions for various tasks, such as file handling, web development, and data manipulation.

  • Cross-Platform: Python is available on multiple platforms, including Windows, macOS, and Linux, allowing for cross-platform development.

  • Community Support: Python has a large and active community that contributes to its development and provides support through forums, tutorials, and documentation.

Applications of Python

Python is a versatile language used in various domains, including:

  • Web Development: Frameworks like Django and Flask make it easy to develop web applications.

  • Data Science and Machine Learning: Libraries such as Pandas, NumPy, and scikit-learn are widely used for data analysis and machine learning.

  • Automation and Scripting: Python is often used for writing scripts to automate repetitive tasks.

  • Scientific Computing: Libraries like SciPy and Matplotlib are used for scientific research and visualization.

  • Game Development: Libraries such as Pygame are used for developing games.

  • Embedded Systems: Python can be used in embedded systems and IoT devices with platforms like MicroPython.

Syntax and Semantics

Python’s syntax is designed to be readable and straightforward. Here are some key points:

  • Indentation: Python uses indentation to define blocks of code. This makes the code visually clear and easy to understand.

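  • Comments: Use # for single-line comments. Python has no dedicated multi-line comment syntax; a triple-quoted string (''' or """) that is not assigned to anything is often used for that purpose. When such a string is the first statement of a module, function, or class, it becomes the docstring.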

# This is a single-line comment

"""
This is a multi-line comment
spanning multiple lines.
"""

Variables and Data Types

Variables in Python are used to store data. Python is dynamically typed, meaning you don’t need to declare the type of a variable explicitly; the type is determined at runtime by the value assigned. Use the type() function to check the data type of a variable and the print() function to display its value.

Common Data Types:

  • Integer: Whole numbers, e.g., x = 5

  • Float: Decimal numbers, e.g., y = 3.14

  • String: Sequence of characters, e.g., name = "Alice"; f-string: f"Hello, {name}" (Python 3.6+)

  • Boolean: Represents True or False, e.g., is_student = True

x = 5           # Integer
y = 3.14        # Float
name = "Alice"  # String
is_student = True  # Boolean

hello_str = f"Hello, {name}!" # usage of f-string

print(x, y, name, is_student, hello_str) # print multiple variables
print(type(x), type(y), type(name), type(is_student)) # print types of variables
5 3.14 Alice True Hello, Alice!
<class 'int'> <class 'float'> <class 'str'> <class 'bool'>
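
As a small illustration of dynamic typing, the sketch below rebinds one name to values of different types and converts between the basic types. The variable names are illustrative only and are not used elsewhere in this recap.

```python
# The same name can be rebound to objects of different types.
value = 42
first_type = type(value).__name__      # name of the type of 42

value = "forty-two"
second_type = type(value).__name__     # name of the type of the string

print(first_type, second_type)         # int str

# isinstance() is usually preferred over comparing type() results directly.
print(isinstance(value, str))          # True

# Explicit conversion between basic types
as_int = int("5")        # str -> int
as_float = float(5)      # int -> float
as_str = str(3.14)       # float -> str
print(as_int, as_float, as_str)        # 5 5.0 3.14
```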

Collection Data Types:

Python has built-in data types for storing multiple elements, including lists, tuples, sets, and dictionaries. Here we cover the basics of each, but keep in mind that for practical data analysis in this course, we’ll use NumPy arrays and Pandas DataFrames, which are more powerful and flexible.

  • List: Ordered, mutable collection of items.

fruits = ["apple", "banana", "cherry"]
print(fruits[1])  # Output: banana
print(fruits[-1]) # Output: cherry

# Append an item to the end of the list
fruits.append("date")
print(fruits)  # Output: ['apple', 'banana', 'cherry', 'date']

# Append data to a list using for loop
numbers = []
for i in range(5):
    numbers.append(i)
print(numbers)  # Output: [0, 1, 2, 3, 4]

# Show every 2nd element in a list
print(numbers[::2])  # Output: [0, 2, 4]
print(numbers[::-2]) # Output: [4, 2, 0]
print(numbers[1::2]) # Output: [1, 3]

# List comprehension
squares = [x**2 for x in range(10)] # range(10) is a lazy range object covering 0 to 9 (not a generator)
print(range(10))  # Output: range(0, 10)
print(type(range(10)))  # Output: <class 'range'>
print(squares)  # Output: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

# Nested list comprehension
matrix = [[j for j in range(5)] for i in range(3)]
print(matrix)  # Output: [[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]
banana
cherry
['apple', 'banana', 'cherry', 'date']
[0, 1, 2, 3, 4]
[0, 2, 4]
[4, 2, 0]
[1, 3]
range(0, 10)
<class 'range'>
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
[[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]
  • Tuple: Ordered, immutable collection of items.

coordinates = (10, 20)
print(coordinates[0])  # Output: 10
squared_coordinates = tuple(coord**2 for coord in coordinates) # tuples are immutable, so we build a new tuple from a generator expression
print(squared_coordinates)  # Output: (100, 400)
10
(100, 400)
  • Set: Unordered collection of unique items.

unique_numbers = {1, 2, 3, 3, 4, 4, 5}
print(unique_numbers)  # Output: {1, 2, 3, 4, 5}

# Set comprehension
squared_numbers = {x**2 for x in unique_numbers}
print(squared_numbers)  # Output: {1, 4, 9, 16, 25} (set display order is not guaranteed)
{1, 2, 3, 4, 5}
{1, 4, 9, 16, 25}
  • Dictionary: Collection of key-value pairs.

student = {"name": "Alice", "age": 18}
print(student["name"])  # Output: Alice

# Dictionary comprehension
squared_dict = {num: num**2 for num in unique_numbers}
print(squared_dict)  # Output: {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

# nested dictionary
students = {
    "Alice": {"age": 18, "grade": 12},
    "Bob": {"age": 19, "grade": 11}
}
print(students["Alice"]["age"])  # Output: 18

# show the keys of the dictionary
print(student.keys())  # Output: dict_keys(['name', 'age'])

# show the values of the dictionary
print(student.values())  # Output: dict_values(['Alice', 18])
Alice
{1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
18
dict_keys(['name', 'age'])
dict_values(['Alice', 18])
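
The difference between mutable and immutable collections, and the uniqueness of set items, can be sketched as follows. This is a minimal illustration with made-up variable names, not part of the course material.

```python
# Lists are mutable: items can be reassigned in place.
point_list = [10, 20]
point_list[0] = 99

# Tuples are immutable: item assignment raises TypeError.
point_tuple = (10, 20)
try:
    point_tuple[0] = 99
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Sets keep only unique items and support set algebra.
a = {1, 2, 3}
b = {3, 4}
union = a | b      # items in either set
common = a & b     # items in both sets

print(point_list)        # [99, 20]
print(tuple_changed)     # False
print(sorted(union))     # [1, 2, 3, 4]
print(sorted(common))    # [3]
```

sorted() is used when printing the sets because the display order of a set is not guaranteed.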

Summary of Basic Data Types

In Python, dictionaries are particularly useful for working with labeled data. As you progress through the course, you’ll see that Pandas DataFrames and NumPy arrays provide more powerful ways to handle structured data, especially for materials science applications.
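
A short sketch of a dictionary used as a labeled record is shown here. The keys and the sample entry are hypothetical and chosen only for illustration.

```python
# Hypothetical labeled record; keys and values are illustrative only.
sample = {"material": "Si", "band_gap_eV": 1.12}

# .get() returns a default instead of raising KeyError for a missing key.
density = sample.get("density", "unknown")

# .items() yields (key, value) pairs in insertion order (Python 3.7+).
lines = [f"{key} = {value}" for key, value in sample.items()]

# Adding or updating an entry uses plain assignment.
sample["crystal_system"] = "cubic"

print(density)               # unknown
print(lines)                 # ['material = Si', 'band_gap_eV = 1.12']
print(list(sample.keys()))   # ['material', 'band_gap_eV', 'crystal_system']
```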

Key takeaway: Understand these basic Python data structures, but for practical data analysis, use pandas DataFrames (see pandas).

Operators

Python supports various operators for arithmetic, comparison, logical operations, etc.

Arithmetic Operators

  • Addition (+) adds two numbers.

  • Subtraction (-) subtracts the second number from the first.

  • Multiplication (*) multiplies two numbers.

  • Division (/) divides the first number by the second and returns a float, even when both operands are integers.

  • Modulus (%) returns the remainder of the division.

  • Exponentiation (**) raises the first number to the power of the second.

  • Floor Division (//) divides the first number by the second and returns the largest integer less than or equal to the result.
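
The arithmetic operators listed above can be tried on a pair of integers. The sketch below also shows how floor division and modulus behave with a negative operand, which is a common source of surprises.

```python
a, b = 7, 2

print(a / b)    # 3.5  (true division returns a float)
print(a // b)   # 3    (floor division)
print(a % b)    # 1    (remainder)
print(a ** b)   # 49   (exponentiation)

# Floor division rounds toward negative infinity, not toward zero.
print(-7 // 2)  # -4
print(-7 % 2)   # 1    (the result takes the sign of the divisor)

# divmod() returns the quotient and the remainder together.
quotient, remainder = divmod(a, b)
print(quotient, remainder)  # 3 1
```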

Comparison Operators

  • Equal to (==) checks if two values are equal.

  • Not equal to (!=) checks if two values are not equal.

  • Greater than (>) checks if the first value is greater than the second.

  • Less than (<) checks if the first value is less than the second.

  • Greater than or equal to (>=) checks if the first value is greater than or equal to the second.

  • Less than or equal to (<=) checks if the first value is less than or equal to the second.

Logical Operators

  • Logical AND (and) returns True if both operands are true.

  • Logical OR (or) returns True if at least one of the operands is true.

  • Logical NOT (not) returns True if the operand is false.

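  • With non-boolean operands, and and or return one of the operands rather than True or False: and returns the first falsy operand (or the last operand if all are truthy), and or returns the first truthy operand (or the last operand if all are falsy). not always returns a boolean.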

is and in Operators

  • is: The is operator checks if two variables refer to the same object in memory. It returns True if they do, and False otherwise.

  • in: The in operator checks if a value is present in a container (such as a list, tuple, set, string, or dictionary). It returns True if the value is found, and False otherwise. For a dictionary, in checks the keys, not the values.
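
The following sketch contrasts is with ==, and shows in applied to a dictionary and a string. The variable names are illustrative only.

```python
list_a = [1, 2, 3]
list_b = [1, 2, 3]
list_c = list_a            # a second name for the same object

print(list_a == list_b)    # True: equal contents
print(list_a is list_b)    # False: two distinct objects
print(list_a is list_c)    # True: the same object

# `is` is the idiomatic way to test for None.
result = None
print(result is None)      # True

# `in` on a dictionary tests the keys, not the values.
ages = {"Alice": 18}
print("Alice" in ages)            # True
print(18 in ages)                 # False
print(18 in ages.values())        # True

# `in` on a string tests for a substring.
print("lic" in "Alice")           # True
```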

Short-Circuiting Operators

Short-circuiting operators are a type of logical operator that stops evaluating as soon as the result is determined. In Python, and and or are short-circuiting operators. Short-circuiting operators can help improve performance by avoiding unnecessary evaluations.

  • and: If the first operand is falsy, the second operand is not evaluated because the result is already determined.

  • or: If the first operand is truthy, the second operand is not evaluated because the result is already determined.
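
To make the skipped evaluation visible, the sketch below uses a hypothetical helper, check(), that records each time an operand is actually evaluated.

```python
calls = []

def check(label, value):
    """Record that this operand was evaluated, then return its value."""
    calls.append(label)
    return value

result_and = check("left", False) and check("right", True)
print(result_and, calls)   # False ['left']

calls.clear()
result_or = check("left", True) or check("right", False)
print(result_or, calls)    # True ['left']

# A common use: guard an expression that would otherwise fail.
items = []
first_is_positive = bool(items) and items[0] > 0
print(first_is_positive)   # False; items[0] is never evaluated
```

In both cases only "left" is recorded, which shows that the right-hand operand was never evaluated.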

a = 10
b = 3
print(a + b)  # Output: 13
print(a > b)  # Output: True
print(a and b)  # Output: 3 (since both are non-zero)
print(a or b) # Output: 10 (since a is non-zero)


print(a is b)  # Output: False
print(name is hello_str)  # Output: False
print('apple' in fruits)  # Output: True
print(5 in squares)  # Output: False
print('Alice' in students)  # Output: True

print("this is true" and "this is false") # Output: this is false
print("this is true" or "this is false") # Output: this is true
print([1,2,3] and "this is true")# Output: this is true
print([1,2,3] or "this is true")# Output: [1, 2, 3]


print(True == 1)  # Output: True
print(False == 0)  # Output: True
print(True + False/True)  # Output: 1.0
print(True == "this is true")  # Output: False
print(False == "this is false")  # Output: False
print(False =="")  # Output: False
13
True
3
10
False
False
True
False
True
this is false
this is true
this is true
[1, 2, 3]
True
True
1.0
False
False
False

Control Flow

Control flow refers to the order in which statements are executed in a program. It includes conditional statements (if, elif, else), loops (for, while), and control transfer statements (break, continue, return). These statements allow you to execute code based on conditions, repeat code, and manage the program’s flow.

  • Conditional Statements: Conditional statements execute code blocks based on specific conditions. The if statement evaluates a condition and executes the code block if the condition is True. The elif (else if) statement checks multiple conditions sequentially. The else statement executes a code block if none of the preceding conditions are True.

age = 18.0
if age < 18:
    print("Minor")
elif age == 18:
    print("Just turned adult")
else:
    print("Adult")
Just turned adult
  • Loops (for, while): for loops are used to iterate over an iterable (such as a list, tuple, dictionary, set, or string) and execute a block of code for each item. The range() function is often used with for loops to generate a sequence of numbers.

while loops are used to repeatedly execute a block of code as long as a condition is true. The condition is evaluated before the execution of the loop’s body.

# For loop
for i in range(5):
    print(i)

# While loop
count = 0
while count < 5:
    print(count)
    count += 1

# Break and continue
print("Break example")
for i in range(10):
    if i == 5:
        break
    if i % 2 == 0:
        continue
    print(i)
0
1
2
3
4
0
1
2
3
4
Break example
1
3
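
For loops work on any iterable, not only range(). The sketch below iterates over a string, a list, and a dictionary, and uses the built-in helpers enumerate() and zip(); the sample data is made up for illustration.

```python
# Iterating over a string yields its characters.
letters = []
for ch in "abc":
    letters.append(ch)

# enumerate() yields (index, item) pairs.
fruits = ["apple", "banana", "cherry"]
indexed = []
for index, fruit in enumerate(fruits):
    indexed.append(f"{index}: {fruit}")

# .items() yields (key, value) pairs of a dictionary.
student = {"name": "Alice", "age": 18}
pairs = []
for key, value in student.items():
    pairs.append((key, value))

# zip() walks several sequences in parallel.
totals = [x + y for x, y in zip([1, 2, 3], [10, 20, 30])]

print(letters)   # ['a', 'b', 'c']
print(indexed)   # ['0: apple', '1: banana', '2: cherry']
print(pairs)     # [('name', 'Alice'), ('age', 18)]
print(totals)    # [11, 22, 33]
```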

Functions

Functions are reusable blocks of code that perform a specific task. They are defined using the def keyword.

Defining a Function

To define a function, use the def keyword followed by the function name and parentheses (). Inside the parentheses, you can specify parameters (arguments) that the function can accept. The function body is indented and contains the code to be executed.

For example:

def greet(name):
    return f"Hello, {name}!"

print(greet("Alice"))
Hello, Alice!
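
Parameters can also have default values, and a function can return more than one value. The sketch below extends the greet example with two hypothetical parameters (greeting and punctuation) and adds an illustrative min_max helper.

```python
def greet(name, greeting="Hello", punctuation="!"):
    """Return a greeting; `greeting` and `punctuation` have default values."""
    return f"{greeting}, {name}{punctuation}"

print(greet("Alice"))                     # Hello, Alice!
print(greet("Bob", "Hi"))                 # Hi, Bob!
print(greet("Carol", punctuation="?"))    # Hello, Carol?

def min_max(values):
    """Return two results at once; they are packed into a tuple."""
    return min(values), max(values)

lowest, highest = min_max([3, 1, 4, 1, 5])
print(lowest, highest)                    # 1 5
```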

Using *args and **kwargs

Sometimes a function needs to accept a flexible number of arguments. *args and **kwargs let a function receive a variable number of positional and keyword arguments, respectively, which is useful for writing functions that handle a varying number of inputs.

Comparison of *args and **kwargs

| Feature | *args | **kwargs |
| --- | --- | --- |
| Type | Tuple | Dictionary |
| Usage | Non-keyword arguments | Keyword arguments |
| Example | example_function(1, 2, 3) | example_function(name="Alice", age=30) |
| Access | By index | By key |
| Use Case | When the number of positional arguments is unknown | When the number of keyword arguments is unknown |

Keyword arguments (key=value) are collected into a dictionary, while non-keyword (positional) arguments are collected into a tuple.

Here’s an example of using *args and **kwargs together:

def example_function(*args, **kwargs):
    for arg in args:
        print(f"arg: {arg}")
    for key, value in kwargs.items():
        print(f"{key}: {value}")

example_function(1, 2, 3, name="Alice", age=30)
arg: 1
arg: 2
arg: 3
name: Alice
age: 30
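
The container types can be checked directly. The sketch below uses a hypothetical function, describe(), and also shows that * and ** can unpack an existing list or dictionary at the call site.

```python
def describe(*args, **kwargs):
    """Return the container types and the contents the function received."""
    return type(args).__name__, type(kwargs).__name__, args, kwargs

print(describe(1, 2, name="Alice"))
# ('tuple', 'dict', (1, 2), {'name': 'Alice'})

# The same symbols unpack an existing sequence or dictionary in a call.
positional = [1, 2, 3]
named = {"name": "Alice", "age": 30}
print(describe(*positional, **named))
# ('tuple', 'dict', (1, 2, 3), {'name': 'Alice', 'age': 30})
```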

Object-Oriented Programming (OOP)

OOP is a programming paradigm based on the concept of “objects”, which can contain data and code to manipulate that data. Python supports OOP and allows you to define classes and create objects.

Why We Need Classes:

  • Encapsulation: Classes allow you to bundle data (attributes) and methods (functions) that operate on the data into a single unit. This helps in organizing code and makes it more modular and manageable.

  • Reusability: Once a class is defined, it can be reused to create multiple objects. This avoids code duplication and promotes code reuse.

  • Inheritance: Classes can inherit attributes and methods from other classes, allowing you to create a hierarchy of classes and promote code reuse.

  • Polymorphism: Classes can define methods that can be overridden by subclasses, allowing for flexible and dynamic behavior.

Key Concepts:

  • Class: A blueprint for creating objects. It defines a set of attributes and methods that the created objects will have.

  • Object: An instance of a class. It is created using the class blueprint and can have its own unique values for the attributes defined in the class.

  • Instance: A specific object created from a class.

  • Attributes: Variables that belong to an object. They represent the state or properties of an object.

  • Methods: Functions that belong to an object. They define the behavior or actions that an object can perform.

Understanding Classes in Python:

In Python, a class is defined using the class keyword followed by the class name and a colon. The attributes and methods of the class are defined within an indented block.

Classes in Python have several important features:

  • __init__() method: A special method (constructor) that initializes a new object. It’s called automatically when creating a new object.

  • . operator: Used to access attributes and methods of an object.

  • self parameter: Refers to the instance of the class being created or acted upon.

  • Instance methods: Methods that can access and modify object attributes.

Other Special Methods in Python Classes (Optional):

  • __str__(): Called when converting object to string (e.g., print(obj))

  • __repr__(): Called for object representation (e.g., in debugger)

  • __len__(): Defines behavior for len(obj)

  • __call__(): Makes object callable like a function

  • __eq__(): Defines equality comparison (==)

  • __lt__(), __gt__(): Define less than/greater than comparisons

  • __add__(), __sub__(): Define arithmetic operations

  • __getitem__(), __setitem__(): Enable index/key access

  • __enter__(), __exit__(): Support context manager protocol

  • __del__(): Destructor method called when object is garbage collected

  • __iter__(), __next__(): Enable iteration over object

These methods allow you to customize how objects of your class behave in different contexts.
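
A few of these special methods are sketched below on a hypothetical Vector2D class, which exists only for this illustration.

```python
class Vector2D:
    """Hypothetical 2D vector used only to illustrate special methods."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"Vector2D({self.x}, {self.y})"

    def __eq__(self, other):
        if not isinstance(other, Vector2D):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __add__(self, other):
        if not isinstance(other, Vector2D):
            return NotImplemented
        return Vector2D(self.x + other.x, self.y + other.y)

    def __len__(self):
        return 2

    def __getitem__(self, index):
        return (self.x, self.y)[index]

v = Vector2D(1, 2)
w = Vector2D(3, 4)
print(v + w)                  # Vector2D(4, 6)
print(v == Vector2D(1, 2))    # True
print(len(v), v[0], v[1])     # 2 1 2
```

Because __str__ is not defined here, print() falls back to __repr__.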

Here is an example to illustrate the concept of classes and objects:

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def greet(self):
        return f"Hello, my name is {self.name} and I am {self.age} years old."

# Creating an object
person1 = Person("Alice", 30)

# Accessing methods and attributes
print(person1.greet())  # Output: Hello, my name is Alice and I am 30 years old.
Hello, my name is Alice and I am 30 years old.

Inheritance and Polymorphism

Inheritance allows a class to inherit attributes and methods from another class. The class that inherits is called the child class, and the class being inherited from is called the parent class.

Polymorphism allows methods to be used interchangeably between different classes, even if they are implemented differently.

class Animal:
    def __init__(self, name):
        self.name = name

    def speak(self):
        raise NotImplementedError("Subclass must implement abstract method")

class Dog(Animal):
    def speak(self):
        return f"{self.name} says Woof!"

class Cat(Animal):
    def speak(self):
        return f"{self.name} says Meow!"

dog = Dog("Buddy")
cat = Cat("Whiskers")

print(dog.speak())  # Output: Buddy says Woof!
print(cat.speak())  # Output: Whiskers says Meow!
Buddy says Woof!
Whiskers says Meow!

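In this example, the Dog and Cat classes inherit from the Animal class and implement the speak method differently. This demonstrates polymorphism: speak can be called on an instance of any Animal subclass without knowing its specific type. Calling speak on a plain Animal instance raises NotImplementedError, which signals that subclasses are expected to override it.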
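
The same pattern can be sketched with a second, hypothetical class hierarchy (Shape, Circle, Rectangle). Here the parent class defines describe() once, and each object supplies its own area().

```python
import math

class Shape:
    def area(self):
        raise NotImplementedError("Subclass must implement area()")

    def describe(self):
        # Defined once in the parent; calls whichever area() the subclass provides.
        return f"{type(self).__name__} with area {self.area():.2f}"

class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2

class Rectangle(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height

shapes = [Circle(1), Rectangle(2, 3)]
descriptions = [shape.describe() for shape in shapes]
print(descriptions)
# ['Circle with area 3.14', 'Rectangle with area 6.00']
print(isinstance(shapes[0], Shape))   # True
```

The loop never checks the concrete type of each object; it relies on every subclass providing area().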

Modules and Packages

Modules are files containing Python code, and packages are collections of modules. You can import and use them in your programs. To use modules and packages in a VS Code Jupyter Notebook, follow these steps:

  • Importing Modules: You can import built-in modules or third-party packages using the import statement. For example:

    import math
    print(math.sqrt(16))  # Output: 4.0
  • Using Installed Packages: Third-party packages must be installed first (for example, with pip install numpy). Once installed, you can import and use the package in your notebook:

    import numpy as np
    array = np.array([1, 2, 3])
    print(array)  # Output: [1 2 3]
  • Creating Custom Modules: You can create your own Python modules by writing Python code in a .py file. For example, create a file named mymodule.py with the following content:

    def greet(name):
        return f"Hello, {name}!"

    Then, you can import and use this custom module in your Jupyter Notebook:

    from mymodule import greet
    print(greet("Alice"))  # Output: Hello, Alice!
  • Organizing Code with Packages: You can organize your code into packages by creating a directory with an __init__.py file. For example:

    mypackage/
         __init__.py
         module1.py
         module2.py

    Then, you can import modules from the package:

    from mypackage import module1, module2
from mypackage import module1
module1.greet("Alice")  # assumes module1.py defines the greet() function shown above
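
To try the package layout without creating files by hand, the sketch below builds a throwaway package in a temporary directory and imports it. The package name demo_package and its contents are hypothetical; in a real project you would create the directory and files yourself.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build demo_package/__init__.py and demo_package/module1.py
    package_dir = Path(tmp) / "demo_package"
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")
    (package_dir / "module1.py").write_text(
        "def greet(name):\n"
        "    return f'Hello, {name}!'\n"
    )

    # Make the temporary directory importable for the duration of the demo.
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        from demo_package import module1
        message = module1.greet("Alice")
    finally:
        sys.path.remove(tmp)

print(message)  # Hello, Alice!
```

Python finds packages by searching the directories listed in sys.path, which is why the temporary directory is added before the import and removed afterwards.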