How to Learn Python Pandas - Best Resources (2023)
So you want to learn the Pandas library and you’re looking for a book you can reference at any time? That’s exactly what you’ll find in this article. We bring you the top 10 books you can get your hands on today, at least if you opt for the digital version.
The list of books you’re about to see isn’t ordered, so book #1 isn’t better than book #7 - at least according to us. Read our description and the list of pros and cons, and decide for yourself whether it’s the right match for you.
If you decide to purchase, please use the links listed in this article. This way, we get a small fee without any impact on the total price you pay. Thank you!
Pages: 497 | Published on: Dec 2021 | Avg. rating: 4.7 | Num. reviews: 198
Years of knowledge condensed into a sub-500-page book on best practices for data manipulation with Pandas. The book covers Pandas Series manipulation, creating columns, summary statistics, grouping, pivoting, cross tabulation, time series data, visualizing, chaining, code debugging, and much more. It’s ideal for programmers, data scientists, and data engineers, and it’s packed with clearly explained real-life examples.
- Clear explanations and usage examples of Pandas Series and DataFrames
- Examples of how to combine Pandas operations efficiently
- Excellent coverage of all Pandas must-know topics
- Author explains the topics clearly and makes sure the reader understands them
- Formatting on Kindle is kind of messy
- Occasional grammatical hiccups
- Not the most up to date with the recent Pandas updates
Pages: 744 | Published on: Jun 2022 | Avg. rating: 4.6 | Num. reviews: 15
If you want a book full of hands-on examples and exercises, look no further. The Pandas Workshop packs over 700 pages and most of them are specifically on Pandas. There are minor sections on data visualization and regression modeling, but nothing that would lead you away from the main topic. Expect to learn the ins and outs of data I/O, Pandas Series, DataFrames, data exploration, transformation, preprocessing, visualization, regression modeling, and even time series. Each chapter is packed with exercises, which means you’ll be required to put in the work constantly.
- Excellent balance between explanations and exercises
- Truly a comprehensive one-stop shop for learning Pandas
- Excellent flow from one chapter to another
- It somehow feels like the authors are speaking directly to you
- Readers have found errors in a couple of places
- Authors sometimes follow practices that aren’t considered best nowadays
- Presentation is sometimes inconsistent across chapters, which is understandable given that the book has four authors
Pages: 626 | Published on: Feb 2020 | Avg. rating: 4.4 | Num. reviews: 94
Another book by Matt Harrison on this list, consisting of practical and easy-to-implement recipes for quick solutions to the most common problems in Pandas. It covers the library in around 600 pages, which makes it the most detailed one on this list. The book starts with Pandas foundations, then moves through Series and DataFrame operations, data I/O, approaches to exploratory data analysis using Pandas, filtering, grouping, and transformation operations, time series analysis, and data visualization with pure Pandas, Matplotlib, and Seaborn. It finishes off with a section on debugging and testing Pandas code.
- You’ll get a great grasp on how to work with data in Pandas efficiently
- Examples/recipes separate this book from the rest
- You can use the book only as a reference, no need to read it entirely
- There is no book index, so you’ll have to reference the topics from the table of contents
- The book is from 2020, so some functions might be deprecated
Pandas 1.x Cookbook
Practical recipes for scientific computing, time series analysis, and EDA
A book full of recipes that will show you how to solve 95% of the problems you'll encounter daily, brought to you by Matt Harrison and Ted Petrou. For the remaining 5%, you'll have to do some creative work.
Pages: 512 | Published on: Dec 2022 | Avg. rating: 4.5 | Num. reviews: 280
The book offers practical knowledge and insights for solving real-world problems with Pandas, even for beginners to Python and data analytics. Key concepts are introduced through simple examples, and exercise complexity is built incrementally. It is updated for Python 3.9 and has extended coverage of plotting with Seaborn, plus online bonus materials on GeoPandas, Dask, and Altair. The book has dedicated sections on plotting and covers machine learning extensively, which is something you may or may not like. Other than that, you can expect coverage of all Pandas topics that are a must in data analysis.
- Daniel manages to deliver without bogging readers down in theory, math, and the like
- Materials are available on a companion GitHub repo
- Provides a lot of explanation of basics in the appendixes
- Is somewhat of a complete package as it covers machine learning and data visualization
- Isn’t a complete Pandas reference since it covers data visualization and machine learning in addition to Pandas
- Some readers find the examples to be too basic
- Feels rushed in some sections
- Some users complain about the print quality
Pages: 579 | Published on: Sep 2020 | Avg. rating: 4.4 | Num. reviews: 80
Pandas is a big talking point, but not the only one. The book has an excellent primer on the Python programming language, so even beginners will feel comfortable. After the primer, there are about 350 pages related to Pandas covering the basics of Index, Series, DataFrames, data loading, cleaning, wrangling, plotting, grouping, and even time series analysis. There are also sections on NumPy and advanced NumPy if you want to learn that library as well.
- Written by the author of the Pandas library
- Gives an excellent foundation for data analysis since it covers Python, Pandas, and NumPy, but still focuses primarily on Pandas
- Excellent step-by-step instructions
- Not too many real-world examples (mostly made-up data)
- Not for complete Python beginners even though there’s a primer section
Pages: 788 | Published on: Apr 2021 | Avg. rating: 4.3 | Num. reviews: 77
A long book, to say the least. The first 250 pages cover Pandas, then the focus shifts to data visualization with Matplotlib and Seaborn for 150 pages. Next, it has a 100-page hands-on real-world example, and the remainder of the book focuses on machine learning. The book is tailored towards beginners in data science and analysis, but also Python developers who want to get into these fields. This is a useful book if you want to learn Pandas as a stepping stone for machine learning workflows. The book has a Python crash course if you’re a bit rusty on the subject.
- GitHub link for code and datasets
- Well balanced between theory and practice
- Has built-in data visualization sections so you’re not stuck only looking at table data
- Good stepping stone if you want to dive into machine learning
- Some readers report the GitHub code isn’t updated
- Some readers find it superficial
- Not the most extensive on Pandas specifically, since it also covers data visualization and machine learning
Hands-on Data Analysis with Pandas
A Python data science handbook for data collection, wrangling, analysis, and visualization
A long, heavy book covering many aspects of data science - including data analytics, data visualization, and machine learning. Almost 800 pages won't leave you wanting more, brought to you by Stefanie Molin.
Pages: 440 | Published on: Oct 2021 | Avg. rating: 5.0 | Num. reviews: 11
This book is primarily about Pandas but provides appendixes for environment configuration, Python, NumPy, and regular expressions. You’ll learn how to work with Pandas Series and DataFrame objects, how to filter DataFrames, work with indexes, reshape, group, and wrangle data, work with dates and times, how to import data into and export it out of Pandas, and also how to make some of the most basic data visualizations.
- The book author knows how to teach and doesn’t spend time on filler content
- Practical, full of useful examples, and light on theory
- Great reference book on things that are easy to forget
- Supporting code and datasets on GitHub
- Data visualization chapter is somewhat rushed and unfinished, but this isn’t the book for that anyway
Pandas in Action
Start mastering Pandas with skills you already know from spreadsheet software
Take the next steps in your data science career! This friendly and hands-on guide shows you how to start mastering Pandas with skills you already know from spreadsheet software - by Boris Paskhaver.
Pages: 446 | Published on: Jun 2017 | Avg. rating: 4.0 | Num. reviews: 4
Somewhat of an old book (2017), but still packed with 400+ pages of Pandas materials. If you opt for this one, you’ll begin by getting the big picture behind Pandas and data analysis, followed by a deep dive into Pandas concepts, such as Series, DataFrames, data indexing, handling categorical data, statistical summaries, accessing data from various sources, data aggregations, and time series modeling. There’s even a full analysis project based on historical stock prices at the end of the book, which is an excellent way of connecting theory with practice.
- Covers pretty much everything you’d want when first learning about Pandas
- Has good sections on working with time series data
- Can be slightly out of date in some places
- Lots of typos (editing seems to be rushed)
- More expensive than some more recent and feature-rich alternatives
Pages: 212 | Published on: Jun 2016 | Avg. rating: 4.2 | Num. reviews: 90
Another, somewhat older book by Matt Harrison. It offers no primer in Python or data analysis, so you’re expected to have some experience in those fields. The book covers pretty much everything you’d want in a condensed 200-page format, including Series, indexing, DataFrames, summary statistics, grouping, pivoting, reshaping, joining, handling missing values, and so on. There’s also a dedicated section on analyzing and plotting avalanche data with Pandas, which is a nice and practical way to connect all of the theoretical concepts.
- Cheapest book on the list, only $19.99 (March 2023)
- Good introductory book to Pandas if you have some experience with Python
- Condensed format
- Not for newcomers to Python as it offers no primer sections
- The book is old (2016), so don’t expect things to necessarily be up to date with the most recent pandas releases
- Lacks examples because it’s so short
Pages: 218 | Published on: Jan 2022 | Avg. rating: 3.7 | Num. reviews: 7
A concise book with a brief Python introduction and a dive into Pandas through environment setup, Series, DataFrames, data importing in the most common file formats, handling missing values, filtering and dropping rows/columns, sorting, summary statistics, grouping, merging, visualizing data, and also the basics of time series. It lacks depth since so many topics are covered in only 218 pages, and the font used in the book is also fairly large.
- Comes with a PDF copy and example files that can be downloaded for free
- Straight and to the point
- Covers many aspects of Pandas for such a short format
- Occasional errors in code and logic, which might be a dealbreaker for beginners
- Not much depth or examples