o Audience background:
I expect the audience to have basic knowledge of Python and to know how to use libraries. Some basic knowledge of statistics is a plus but not necessary.
o After this talk:
I hope the audience will gain some insight into how to do basic statistics in Python. I believe this knowledge can serve as a foundation if they want to go deeper into machine learning with Python.
o Presentation flow:
First, I will start with the historical background of statistics and
Python. After that, I’ll introduce basic array calculation using numpy and
scipy, because the later parts of the presentation build on that concept.
Then, I will explain some fundamental functions in descriptive statistics, such as
mean and variance.
Since statistics is strongly related to probability, I will explain some basics of probability distributions: for example, what a probability density function and a probability mass function are. I will also explain the differences between discrete and continuous distributions. Because there are so many types of discrete and continuous distributions, I will choose only one of each type. After that, I will move on to working with a complete dataset using pandas and do some exploration with it. Finally, I want to discuss some statistical models that can be built in Python, from linear regression to clustering.
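A minimal sketch of the descriptive-statistics and distribution steps in this flow, assuming numpy and scipy are installed. The sample values and the choice of the binomial and normal distributions are illustrative assumptions; the talk may use different ones.

```python
import numpy as np
from scipy import stats

# Hypothetical sample data, invented for illustration only.
data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Descriptive statistics on a numpy array.
mean = data.mean()
population_var = data.var()        # divides by N (ddof=0, numpy's default)
sample_var = data.var(ddof=1)      # divides by N - 1

# Discrete distribution (assumed example: binomial).
# A probability mass function gives the probability of an exact outcome,
# here exactly 3 successes in 10 trials with success probability 0.5.
pmf_3_of_10 = stats.binom.pmf(3, n=10, p=0.5)

# Continuous distribution (assumed example: standard normal).
# A probability density function gives a density, not a probability,
# so it is evaluated at a point and integrated over an interval.
pdf_at_zero = stats.norm.pdf(0.0)
prob_within_one_sd = stats.norm.cdf(1.0) - stats.norm.cdf(-1.0)

print(mean, population_var, sample_var)
print(pmf_3_of_10, pdf_at_zero, prob_within_one_sd)
```

The `ddof` argument is worth noting in a sketch like this: numpy computes the population variance by default, while many statistics texts use the sample variance.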
o Additional resource:
http://www.mv.helsinki.fi/home/jmisotal/BoS.pdf