03 - Statistical Basics
Computing the mean
While computation of the mean is a subject as broad as life, we'll start with a simple average.
Data
Let's say we have a blood-data.csv file:
uid,date,variable,unit,value
1,2022-04-04T11:00Z,blood:glucose,mmol/l,5.34
2,2022-05-20T12:00Z,blood:glucose,mmol/l,4.74
3,2022-04-04T11:00Z,urine:glucose,mmol/l,4.12
4,2022-05-20T12:00Z,urine:glucose,mmol/l,4.11
Save it as blood-data.csv, then read the data into a dataframe:
Statistics
Read data
$ fx
In [1]: import pandas
In [2]: df = pandas.read_csv('blood-data.csv')
Computing the mean
In [3]: df.value.mean()
Out [3]: 4.5775
This is literally (5.34 + 4.74 + 4.12 + 4.11) / 4 = 4.5775.
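As a cross-check, here is a minimal sketch that recomputes the same average with only the Python standard library. It also splits the values by the variable column, since the file mixes blood and urine readings. The rows are inlined so the snippet runs on its own, and the names CSV_TEXT, overall_mean and mean_by_variable are made up for this illustration.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Same rows as blood-data.csv, inlined so the sketch is self-contained.
CSV_TEXT = """\
uid,date,variable,unit,value
1,2022-04-04T11:00Z,blood:glucose,mmol/l,5.34
2,2022-05-20T12:00Z,blood:glucose,mmol/l,4.74
3,2022-04-04T11:00Z,urine:glucose,mmol/l,4.12
4,2022-05-20T12:00Z,urine:glucose,mmol/l,4.11
"""

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Overall mean of the value column, the same quantity as df.value.mean().
overall_mean = mean(float(row["value"]) for row in rows)

# Mean per variable, so blood and urine readings are not averaged together.
values_by_variable = defaultdict(list)
for row in rows:
    values_by_variable[row["variable"]].append(float(row["value"]))
mean_by_variable = {name: mean(vals) for name, vals in values_by_variable.items()}

print(round(overall_mean, 4))
print({name: round(m, 3) for name, m in mean_by_variable.items()})
```

The overall figure should agree with the pandas result up to floating-point rounding. The per-variable means are often the more meaningful numbers when a single column holds several kinds of measurement.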
OK, enough with the simple case. Next, let's compute the required sample size for hypothesis testing.