# Resources for Learning Statistics and Data Mining

There is an abundance of resources on the web to help novices teach themselves statistics and machine learning, but they can be hard to track down. Below are a few resources that have helped me in the past.

1. **Khan Academy** has a series of
excellent videos on the basics of statistics and probability. They range from
5 minutes to 20 minutes in length. You can create an account to track your
progress if you desire. Although good for the basics, Khan Academy does not
cover advanced topics.

2. **The Elements of Statistical Learning**, a free ebook, is a relatively
in-depth overview of the major concepts and techniques involved in machine
learning. You will need to learn inferential statistics before reading this
book, as it assumes that the reader understands the basics.

3. **The Statsoft Online Statistics
Textbook** is a good resource on
statistics and machine learning. A reader will need a basic understanding of
statistics and probability before reading this text. It is not particularly in
depth, but is good for an overview of major concepts, or as a refresher.

4. **Machine Learning Videos from mathematicalmonk** can be found on YouTube (the
link is to the full playlist). Although I have not viewed all of these videos
personally, they appear to be a good overview of machine learning techniques.

5. **Andrew Ng’s Online Machine Learning Class**
is a simplified version of the Stanford class CS229. It glosses over most of
the mathematics involved, but is a very good introductory resource for machine
learning. The class also features quizzes and programming exercises that keep
students engaged. I would recommend that the reader study basic
statistics before attempting this class.

6. **Concepts and Applications of Inferential
Statistics** is written by a
professor at Vassar, and is a good introduction to basic statistics.

7. **Online Stat Book** is a good
resource for basic statistical concepts. It has lots of interactive exercises
interspersed throughout the text.

8. **Introduction to Statistical
Thought** covers basic
probability and statistics, as well as some more advanced topics such as time
series and survival analysis. It also has R code that the reader can
run.

9. **Linear Algebra**, a free-to-download
ebook, provides an overview of linear algebra equivalent to an
initial undergraduate course. I have not read through it yet.

10. **MIT OpenCourseWare** is an excellent
site that has full video lecture series and problem sets for hundreds of
classes including linear algebra, calculus, and statistics.

**Further Reading**

This is an interesting blog post on how to learn mathematics as a programmer.