Top Free Data Science Books

There are probably thousands upon thousands of tutorials, articles, videos, and blog posts on all things data science on the internet now. Yet I’m still a big fan of books.

Throughout history, books have given wisdom, advice, and knowledge to everyone who wants to read them. Seneca, the Stoic philosopher, said something similar:

Men who have made these discoveries before us are not our masters, but our guides.

So let books also be your guide in your data science journey along with the tutorials, articles, and videos. And to help you get started or to add to your collection, below is a list of some great, free books on different aspects of data science.

Programming

The main thing that differentiates a data scientist from a data analyst or statistician is the ability to write code. It’s no secret that the two biggest languages for data science are Python and R. Both have their advantages and disadvantages, and it’s not going to hurt if you learn one over the other.

Automate the Boring Stuff is a great resource both for beginners with Python programming and for programmers with years of experience, as there are so many useful examples in this book that you can put to work. I have found it especially helpful as someone newer to the Python language but not new to programming in general.

The R Programming wikibook is a great resource for starting to learn the R programming language. It leans more heavily on statistics and math than a typical programming book, but the whole reason for R in the first place is to have a language that helps with those calculations.

Statistics

Conversely, a data scientist knows more statistics than the average programmer. Statistics is a huge field in itself, so even a basic knowledge of it can set you apart from the rest.

OpenIntro Statistics is the textbook used in Coursera’s Statistics with R specialization. I’ve been going through this book as I’ve been taking the classes and have found it a very helpful second resource for understanding statistics.

Think Stats is another introductory statistics book, but it introduces the statistics with Python code rather than with formulas. For Bayesian statistics, there’s also a companion book, Think Bayes.
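To give a flavor of what learning statistics through code can look like, here is a minimal sketch. It is not an excerpt from Think Stats; the function name `summarize` and the sample values are made up for illustration. It computes a mean and sample variance by hand and checks the result against Python’s standard `statistics` module.

```python
import statistics


def summarize(values):
    """Return the mean, sample variance, and sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    # Sample variance: sum of squared deviations divided by n - 1.
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return mean, variance, variance ** 0.5


# Made-up sample values, purely for illustration.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
mean, variance, std_dev = summarize(sample)

# Cross-check the hand-written version against the standard library.
assert abs(variance - statistics.variance(sample)) < 1e-9

print(f"mean={mean:.2f} variance={variance:.2f} std_dev={std_dev:.2f}")
```

Writing the loop out once makes the n - 1 in the sample variance concrete in a way a formula alone may not.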

Data Science

There are quite a lot of data science books out there already. However, these two are among the best I’ve come across.

The Python Data Science Handbook by Jake VanderPlas is a great reference that covers getting started with Jupyter notebooks, understanding data with pandas, visualizations with matplotlib, and even some machine learning with scikit-learn. This book goes through all aspects of what a data scientist might do during their day.

R for Data Science is similar to the Python book above, but goes through these things with R and R packages such as dplyr for analyzing data and ggplot2 for visualizations. It was written by Hadley Wickham, who wrote many of the R packages used for data science, together with Garrett Grolemund.

Machine Learning

An Introduction to Statistical Learning with Applications in R sounds like a statistics book, and it is a statistics book. However, it also covers most of the machine learning algorithms you’ll come across. After this one, feel free to dive into its big brother, The Elements of Statistical Learning.

Hands-On Machine Learning with Scikit-Learn and TensorFlow is the best book I’ve read so far on machine learning, and it is better if you get the printed book. The free version is only the Jupyter notebooks and doesn’t include all of the text. But you can still get a good idea of the code and examples the book offers.

Jonathan Wood
