Pandas is one of the most widely used Python libraries for working with data. It offers a broad range of functionality and many conveniences for data analysts. This course provides a comprehensive introduction.
3-5 days
Professionals who work with data in other technologies such as Excel, R, or MATLAB.
Acquire the skills needed for everyday data operations with Python and pandas, including exploring, querying, cleaning, and transforming datasets, statistical and mathematical operations, time series, and visualization.
In-classroom or virtual. The entire course is hands-on and based on real-world tasks for data analysts.
Outline
Below is an example of how this course might be delivered. The content and schedule are fully customizable to fit your needs.
Labs/Exercises
Exercises are available for all topics covered. Participants typically work in a JupyterLab environment hosted by Code Sensei, but other environments (e.g. Visual Studio Code on participants' laptops) are supported as well.
Day 1: Core Python recap
We will adjust the time spent on core Python skills according to the experience level of the participants.
- Course Introduction
- Group Introductions
- Overview of Learning Environment
- Variables
- Basic data types (int, str, float, bool)
- Input, Output, Type Conversions
- If statements
- While loops
- Functions
- Lists
- Dicts
- Tuples
- Sets
- For Loops
- Exceptions
Day 2: NumPy and Pandas, Part 1
- Comprehensions
- Lambda, map, filter
- NumPy introduction
- Understanding NumPy arrays and dtypes
- Creating arrays
- Indexing and slicing NumPy arrays
- Efficient computations using NumPy
- Pandas introduction
- DataFrames and Series
- Reading/Writing a dataset (CSV, Excel, SQL, etc.)
- Exploring a Dataset with Pandas
- Columns, dtypes, info()
- Selecting and indexing, .loc, .iloc
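To give a flavor of the Day 2 topics, here is a minimal sketch of a vectorized NumPy computation followed by label-based and position-based selection on a DataFrame. All names and values (`prices`, `quantities`, the fruit labels) are made up for illustration and are not taken from the actual course material.

```python
import numpy as np
import pandas as pd

# Vectorized NumPy arithmetic: no explicit Python loop is needed.
prices = np.array([10.0, 20.0, 30.0])
quantities = np.array([1, 2, 3])
totals = prices * quantities  # element-wise product

# A small DataFrame with a string index (hypothetical sample data).
df = pd.DataFrame(
    {"price": prices, "quantity": quantities, "total": totals},
    index=["apple", "banana", "cherry"],
)

# .loc selects by label, .iloc selects by integer position.
banana_total = df.loc["banana", "total"]
first_row_price = df.iloc[0, 0]

print(df.dtypes)
print(banana_total, first_row_price)
```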
Day 3: Pandas, Part 2
- Updating selected values
- Boolean indexing
- Basic statistics
- Sorting by value and index
- Cleaning a Dataset with Pandas
- Detecting missing values
- Handling null values: bfill/ffill, dropna, fillna, interpolate
- Removing duplicates
- Converting column types
- Changing/fixing/resetting index
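As an illustration of the kind of cleaning task covered on Day 3, the sketch below detects missing values, removes a duplicate row, converts a text column to numbers, fills a gap, and sets a new index. The dataset and its column names (`city`, `visits`, `temp`) are hypothetical and only meant to show the general workflow.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data: one duplicated row, numbers stored as text,
# and one missing temperature.
raw = pd.DataFrame(
    {
        "city": ["Berlin", "Berlin", "Paris", "Rome"],
        "visits": ["3", "3", "5", "7"],
        "temp": [21.5, 21.5, np.nan, 25.0],
    }
)

# Detect missing values: count of nulls per column.
missing_per_column = raw.isna().sum()

# Remove exact duplicate rows (the first occurrence is kept).
clean = raw.drop_duplicates().copy()

# Convert the text column to a numeric dtype.
clean["visits"] = pd.to_numeric(clean["visits"])

# Forward-fill the missing temperature from the previous row.
clean["temp"] = clean["temp"].ffill()

# Use the city name as the index.
clean = clean.set_index("city")

print(missing_per_column)
print(clean)
```

Whether forward-filling is appropriate depends on the data; `dropna`, `fillna`, and `interpolate` are alternatives for the same step.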
Day 4: Pandas, Part 3
- Transforming a Dataset with Pandas
- Apply mathematical functions and statistics
- Groupby
- Changing data structure: pivot, melt, stack, unstack
- Working with Time Series
- Joining and concatenating datasets
- Visualization with pandas and matplotlib
- Standard pandas plots: line, bar, scatter, box, histogram, etc.
- Subplots and shared axes
- Styling axes, colors, and lines
- Using common Seaborn plots
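The Day 4 transformation topics can be sketched in a few lines: aggregating with groupby, reshaping between long and wide layouts with pivot and melt, and joining a second table. The `sales` and `managers` tables below are invented sample data, not part of the course material.

```python
import pandas as pd

# Hypothetical sales data in long format.
sales = pd.DataFrame(
    {
        "region": ["north", "north", "south", "south"],
        "quarter": ["Q1", "Q2", "Q1", "Q2"],
        "revenue": [100, 120, 80, 90],
    }
)

# Groupby: total revenue per region.
revenue_by_region = sales.groupby("region")["revenue"].sum()

# Pivot: one row per region, one column per quarter.
wide = sales.pivot(index="region", columns="quarter", values="revenue")

# Melt: back from the wide layout to the long layout.
long = wide.reset_index().melt(
    id_vars="region", var_name="quarter", value_name="revenue"
)

# Join: attach a (made-up) manager name to each sales row.
managers = pd.DataFrame(
    {"region": ["north", "south"], "manager": ["Ana", "Ben"]}
)
merged = sales.merge(managers, on="region", how="left")

print(revenue_by_region)
print(wide)
print(merged)
```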