Statistics with Python

03 - Statistical Basics

Computing the mean

There are many kinds of mean (arithmetic, weighted, geometric, and others), but we'll start with the simple arithmetic average.

Data

Let's say we have a blood-data.csv file:

uid,date,variable,unit,value
1,2022-04-04T11:00Z,blood:glucose,mmol/l,5.34
2,2022-05-20T12:00Z,blood:glucose,mmol/l,4.74
3,2022-04-04T11:00Z,urine:glucose,mmol/l,4.12
4,2022-05-20T12:00Z,urine:glucose,mmol/l,4.11

Save it as blood-data.csv, then let's read the data into a pandas DataFrame:

Statistics

Read data

$ fx

In [1]: import pandas
In [2]: df = pandas.read_csv('blood-data.csv')
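
If you'd rather not create the file first, here is a minimal sketch that reads the same four rows from an in-memory string. The `CSV_TEXT` name and the `parse_dates` option are additions for illustration, not part of the session above.

```python
import io

import pandas

# The same rows as blood-data.csv, kept in memory so the sketch runs without the file.
CSV_TEXT = """uid,date,variable,unit,value
1,2022-04-04T11:00Z,blood:glucose,mmol/l,5.34
2,2022-05-20T12:00Z,blood:glucose,mmol/l,4.74
3,2022-04-04T11:00Z,urine:glucose,mmol/l,4.12
4,2022-05-20T12:00Z,urine:glucose,mmol/l,4.11
"""

# read_csv accepts any file-like object, so StringIO stands in for the file path.
# parse_dates asks pandas to try converting the "date" column to timestamps.
df = pandas.read_csv(io.StringIO(CSV_TEXT), parse_dates=["date"])

# Inspect what was loaded: column types first, then the rows themselves.
print(df.dtypes)
print(df)
```

Checking `df.dtypes` right after loading is a cheap way to confirm that `value` came in as a number rather than as text.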

Computing the mean

In [3]: df.value.mean()
Out[3]: 4.5775

This is literally (5.34 + 4.74 + 4.12 + 4.11) / 4 = 4.5775.
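
Note that the mean above pools blood and urine readings into one number. As a hedged sketch, the snippet below repeats the arithmetic by hand and then computes one mean per `variable`. The grouping step is an addition for illustration and is not part of the session above. The values are retyped inline so the block runs on its own.

```python
import math

import pandas

# The variable and value columns from blood-data.csv, retyped so the sketch is self-contained.
df = pandas.DataFrame(
    {
        "variable": ["blood:glucose", "blood:glucose", "urine:glucose", "urine:glucose"],
        "value": [5.34, 4.74, 4.12, 4.11],
    }
)

# Arithmetic mean by hand: sum of the values divided by their count.
manual_mean = sum(df["value"]) / len(df)

# Compare with pandas using a tolerance, since floating-point sums can differ in the last digits.
assert math.isclose(manual_mean, df["value"].mean())

# One mean per measured variable, so blood and urine readings are not mixed.
per_variable = df.groupby("variable")["value"].mean()
print(per_variable)
```

Whether the pooled or the per-variable mean is the right one depends on the question being asked. The grouped version simply keeps the two kinds of measurement apart.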

OK, enough with the simple case. Next, let's compute required sample sizes for hypothesis testing.


Last updated 2 years ago