Stats refresh: Random variables
February 19, 2024
Good experimental design makes for clean data analysis.
Knowing which statistical techniques you will use in the analysis helps you plan your design.
Choose the statistical approach that best fits your needs (graphs, tests, confidence intervals, regressions).
Think about what kind of data you can collect to get the cleanest possible test of your hypothesis.
Compute the sample size necessary to meaningfully test your hypotheses.
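As a rough illustration of the sample size step, the sketch below uses the standard normal-approximation formula for a two-sided comparison of two group means, \(n \approx 2\,\big((z_{1-\alpha/2} + z_{1-\beta})/d\big)^2\) per group. The function name `sample_size_per_group`, the standardized effect size \(d\), and the defaults (\(\alpha = 0.05\), power \(= 0.80\)) are assumptions chosen for the example, not values taken from these notes.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample test of means.

    effect_size is the standardized mean difference (Cohen's d).
    alpha and power are illustrative defaults (assumptions).
    """
    std_normal = NormalDist()                      # mean 0, sd 1
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)    # critical value of the test
    z_beta = std_normal.inv_cdf(power)             # quantile matching the power
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


n_medium = sample_size_per_group(0.5)   # a "medium" effect
n_small = sample_size_per_group(0.2)    # a "small" effect needs far more data
print(n_medium, n_small)
```

An exact calculation based on the t distribution gives slightly larger numbers, so this is best read as a lower bound for planning.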
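A random variable \(X\) is a measurable function \(X: \Omega \rightarrow E\) from a sample space \(\Omega\) (a set of possible outcomes) to a measurable space \(E\). The probability that \(X\) takes a value in a measurable set \(S \subseteq E\) is the probability of the set of outcomes that \(X\) maps into \(S\):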
\[ P(X \in S) = P(\{\omega \in \Omega \mid X(\omega) \in S\}) \]
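The formula can be made concrete on a small finite sample space. The sketch below assumes two tosses of a fair coin, with \(X\) counting the heads; the names `omega`, `P`, `X`, and `prob_X_in` are hypothetical and only serve the illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered outcomes of two coin tosses.
omega = list(product("HT", repeat=2))

# Assumption: the coin is fair, so every outcome is equally likely.
P = {w: Fraction(1, len(omega)) for w in omega}


def X(w):
    """Random variable: number of heads in the outcome w."""
    return w.count("H")


def prob_X_in(S):
    """P(X in S), computed as the probability of {w in omega : X(w) in S}."""
    return sum((P[w] for w in omega if X(w) in S), Fraction(0))


print(prob_X_in({1}))      # exactly one head
print(prob_X_in({1, 2}))   # at least one head
```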
Experiment | Random variable |
---|---|
Toss two dice | \(Y =\) sum of the numbers |
Toss a coin 25 times | \(Y =\) number of heads in 25 tosses |
Apply different amounts of fertilizer | \(Y =\) yield per acre |
Can communicate / cannot communicate | \(Y =\) contribution to public good |
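For the first two rows of the table, the distribution of \(Y\) can be written out exactly. This is a minimal sketch assuming fair dice and a fair coin, which the table itself does not state; `dice_pmf` and `heads_pmf` are names made up for the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb

# Toss two dice: Y = sum of the numbers (36 equally likely ordered outcomes).
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
dice_pmf = {y: Fraction(c, 36) for y, c in sorted(counts.items())}


def heads_pmf(k, n=25):
    """Toss a coin n times: P(Y = k) for Y = number of heads (fair coin)."""
    return Fraction(comb(n, k), 2 ** n)


for y, p in dice_pmf.items():
    print(y, p)
print(float(heads_pmf(12)), float(heads_pmf(13)))
```

The last two rows (yield per acre, contribution to a public good) have no distribution that can be derived this way; their distribution has to be estimated from the collected data.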
A quantitative variable reflects a notion of magnitude; that is, the values it can take are numbers. A quantitative variable thus represents a measure and is numerical.
Note: For all measurements, we usually stop at a standard level of granularity.
Qualitative variables, also referred to as categorical variables or factors, are variables that are not numerical and whose values fit into categories.
In other words, a qualitative variable is a variable which takes as its values modalities, categories or even levels, in contrast to quantitative variables which measure a quantity on each individual.
Note that a qualitative variable with exactly two levels is also referred to as a binary or dichotomous variable.
A qualitative ordinal variable is a qualitative variable with an order implied in the levels, in contrast to a nominal variable, whose levels have no inherent order. For instance, if the severity of road accidents has been measured on a scale such as light, moderate and fatal, this variable is a qualitative ordinal variable because there is a clear order in the levels.
Another good example is health, which can take values such as poor, reasonable, good, or excellent. Again, there is a clear order in these levels so health is in this case a qualitative ordinal variable.
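One way to see what the order buys you is to encode the levels by rank: sorting and the median become meaningful, whereas a mean would not be. The sketch below uses the health levels from the example; the `responses` list is made-up data, and `HEALTH_LEVELS` and `RANK` are hypothetical names.

```python
from statistics import median_low

# Levels listed from lowest to highest; the position encodes the order.
HEALTH_LEVELS = ["poor", "reasonable", "good", "excellent"]
RANK = {level: i for i, level in enumerate(HEALTH_LEVELS)}

# Made-up survey answers, for illustration only.
responses = ["good", "poor", "excellent", "good", "reasonable"]

# Sorting by rank respects the implied order, unlike alphabetical sorting.
sorted_responses = sorted(responses, key=RANK.__getitem__)

# The median is well defined for ordinal data; median_low always returns
# an observed rank, so it maps back to an actual level.
median_level = HEALTH_LEVELS[median_low(RANK[r] for r in responses)]

print(sorted_responses)
print(median_level)
```

The ranks only carry order, not distance: nothing here says that the gap between poor and reasonable equals the gap between good and excellent.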