Python for Data Analysis

Are you migrating your data work to Python from other technologies like Excel, R, or SPSS? Learn how to analyze and transform datasets using Python and Pandas.

Pandas is one of the most widely used Python libraries for working with tabular data. It offers a broad range of functionality and many conveniences for data analysts. This course gives a complete introduction.
3-5 days
python
pandas
jupyter
Professionals working with data in other technologies such as Excel, R, or MATLAB.
Acquire all the skills necessary for everyday data operations using Python/Pandas, including exploring, querying, cleaning, and transforming datasets, statistical and mathematical operations, time series, and visualization.
In-classroom or virtual. The entire course is hands-on and based on real-world tasks for data analysts.

Outline

Below is an example of how this course might be delivered. Of course this is fully customizable to fit your needs.

Labs/Exercises

Exercises are available for all topics covered. Participants typically work in a JupyterLab environment hosted by Code Sensei, but other environments (e.g. Visual Studio Code on participants' laptops) are supported as well.

Day 1: Core Python recap

We will adjust the time spent on core Python skills according to the experience level of the participants.

  • Course Introduction
  • Group Introductions
  • Overview of Learning Environment
  • Variables
  • Basic data types (int, str, float, bool)
  • Input, Output, Type Conversions
  • If statements
  • While loops
  • Functions
  • Lists
  • Dicts
  • Tuples
  • Sets
  • For Loops
  • Exceptions
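
To give a flavour of the Day 1 recap, here is a minimal sketch that combines a function, lists, a for loop, a type conversion, and exception handling. The function name `parse_prices` and the sample values are made up for illustration and are not taken from the course material.

```python
def parse_prices(raw_values):
    """Convert strings to floats and collect the values that cannot be parsed."""
    prices, rejected = [], []
    for value in raw_values:
        try:
            prices.append(float(value))   # type conversion: str -> float
        except ValueError:
            rejected.append(value)        # keep bad input for later inspection
    return prices, rejected


# Hypothetical input, e.g. a column read from a spreadsheet export
prices, rejected = parse_prices(["9.99", "12", "n/a", "3.5"])
print(prices)    # [9.99, 12.0, 3.5]
print(rejected)  # ['n/a']
```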

Day 2: Numpy and Pandas, Part 1

  • Comprehensions
  • Lambda, map, filter
  • Numpy introduction
  • Understanding numpy arrays and dtypes
  • Creating arrays
  • Indexing and slicing numpy arrays
  • Efficient computations using numpy
  • Pandas introduction
  • DataFrames and Series
  • Reading/Writing a dataset (CSV, Excel, SQL, etc.)
  • Exploring a Dataset with Pandas
  • Columns, dtypes, info()
  • Selecting and indexing, .loc, .iloc
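
The Day 2 topics can be sketched in a few lines: vectorized arithmetic on a NumPy array, then a small DataFrame explored with `.loc` and `.iloc`. This is only an illustration; the city names and temperatures are invented sample data built in memory, where a real exercise would typically read a file instead.

```python
import numpy as np
import pandas as pd

# NumPy: arithmetic is applied to the whole array at once, no explicit loop
temps_c = np.array([18.5, 21.0, 25.5, 30.0])
temps_f = temps_c * 9 / 5 + 32

# Pandas: a small DataFrame with string labels as its index (sample data)
df = pd.DataFrame(
    {"city": ["Oslo", "Rome", "Cairo", "Lima"], "temp_c": temps_c},
    index=["a", "b", "c", "d"],
)

df.info()                            # columns, dtypes, and non-null counts

by_label = df.loc["b", "temp_c"]     # label-based selection: row "b"
by_position = df.iloc[0, 1]          # position-based selection: first row, second column
first_two = df.iloc[:2]              # slicing rows by position
```

`.loc` selects by index label and `.iloc` by integer position, which is why the two lookups above address different rows.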

Day 3: Pandas, Part 2

  • Updating selected values
  • Boolean indexing
  • Basic statistics
  • Sorting by value and index
  • Cleaning a Dataset with Pandas
  • Detecting missing values
  • Handling null values: bfill/ffill, dropna, fillna, interpolate
  • Removing duplicates
  • Converting column types
  • Changing/fixing/resetting index
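
A typical cleaning pass from Day 3 might look like the sketch below. The column names and values are invented for illustration, and filling gaps with the column mean is just one of the strategies listed above, not a recommendation for every dataset.

```python
import numpy as np
import pandas as pd

# Sample data with a duplicated row, a missing score, and ids stored as text
df = pd.DataFrame({
    "id": ["1", "2", "2", "3", "4"],
    "score": [10.0, 20.0, 20.0, np.nan, 40.0],
})

missing = df.isna().sum()                 # detect missing values per column

df = df.drop_duplicates()                 # remove the repeated row
df["id"] = df["id"].astype(int)           # convert column type: text -> integer
df["score"] = df["score"].fillna(df["score"].mean())  # fill the gap with the mean

df = df.set_index("id").sort_index()      # use id as the index and sort by it
high = df[df["score"] > 15]               # boolean indexing
```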

Day 4: Pandas, Part 3

  • Transforming a Dataset with Pandas
  • Applying mathematical functions and statistics
  • Groupby
  • Changing data structure: pivot, melt, stack, unstack
  • Working with Time Series
  • Joining and concatenating datasets
  • Visualization with pandas and matplotlib
  • Standard pandas plots: line, bar, scatter, box, histogram, etc.
  • Subplots and shared axes
  • Styling axes, colors, and lines
  • Using common Seaborn plots
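
The transformation topics of Day 4 can be sketched on a tiny, invented sales table: grouping, monthly resampling of a time series, reshaping with `pivot` and `melt`, and a left join. All names and figures here are hypothetical sample data, and plotting is left out to keep the example short.

```python
import pandas as pd

sales = pd.DataFrame({
    "date": pd.to_datetime(
        ["2024-01-05", "2024-01-20", "2024-02-03", "2024-02-18"]
    ),
    "region": ["north", "south", "north", "south"],
    "units": [10, 4, 7, 9],
})

# Groupby: total units per region
units_by_region = sales.groupby("region")["units"].sum()

# Time series: monthly totals, labelled by month start ("MS")
monthly = sales.set_index("date")["units"].resample("MS").sum()

# Reshaping: long -> wide with pivot, then back to long with melt
wide = sales.assign(month=sales["date"].dt.month).pivot(
    index="month", columns="region", values="units"
)
long = wide.reset_index().melt(id_vars="month", value_name="units")

# Joining: attach a manager to every sale (left join on region)
managers = pd.DataFrame(
    {"region": ["north", "south"], "manager": ["Ada", "Grace"]}
)
merged = sales.merge(managers, on="region", how="left")
```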