random notes
statistics
Probability
long-run relative frequency of an event over many repeated trials.
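
A minimal simulation sketch of the relative-frequency idea, assuming a Bernoulli trial with success probability `p` (the function name `relative_frequency` and the seed are illustrative choices, not part of the notes). As the number of trials grows, the observed fraction should settle near `p`.

```python
import random

def relative_frequency(n_trials, p=0.5, seed=0):
    """Fraction of successes in n_trials simulated Bernoulli(p) trials."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    successes = sum(rng.random() < p for _ in range(n_trials))
    return successes / n_trials

# The relative frequency gets closer to p as the number of trials grows.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(n))
```
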
Random variables
a function that takes an outcome of the experiment as input and returns a real number
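
As a small illustration (the two-dice setup and the name `total` are assumptions made for this example), the random variable below maps each outcome of rolling two dice to a real number, their sum. Counting how often each value occurs gives its distribution.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two dice.
outcomes = list(product(range(1, 7), repeat=2))

def total(outcome):
    """Random variable: maps an outcome (d1, d2) to the real number d1 + d2."""
    return outcome[0] + outcome[1]

# Probability of each value = share of outcomes mapped to that value.
counts = Counter(total(o) for o in outcomes)
distribution = {v: Fraction(c, len(outcomes)) for v, c in sorted(counts.items())}

for value, prob in distribution.items():
    print(value, prob)
```
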
Probability distributions
- plot of the values of a random variable vs. their relative frequency
- Continuous
  - Gamma -> beta & Weibull
  - normal($\mu, \sigma$)
  - lognormal
  - exponential($\lambda$)
  - Beta
  - t-distribution -> Cauchy (at 1 degree of freedom); heavy-tailed compared to normal
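
One way to read the "Gamma -> beta" link above is the standard fact that if X ~ Gamma(a, 1) and Y ~ Gamma(b, 1) are independent, then X / (X + Y) ~ Beta(a, b). The sketch below checks this by simulation. The helper name `beta_from_gammas` and the parameter values are illustrative assumptions.

```python
import random

def beta_from_gammas(a, b, rng):
    """Draw one Beta(a, b) value as X / (X + Y), X ~ Gamma(a, 1), Y ~ Gamma(b, 1)."""
    x = rng.gammavariate(a, 1.0)  # shape a, scale 1
    y = rng.gammavariate(b, 1.0)  # shape b, scale 1
    return x / (x + y)

rng = random.Random(1)
a, b = 2.0, 5.0
draws = [beta_from_gammas(a, b, rng) for _ in range(50_000)]
sample_mean = sum(draws) / len(draws)

# The sample mean should be close to the Beta(a, b) mean a / (a + b).
print(sample_mean, a / (a + b))
```
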
Estimation
- estimator = function of the sample which gives an estimate
- an estimator is a random variable; an estimate is its realized value for one sample
- finding unbiased and consistent estimators is one of the primary goals
- estimation can be used for inference
- statistics like sample mean, median, trimmed mean, extreme mean etc. are used as estimators
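
The points above can be sketched with a repeated-sampling experiment: each new sample gives a different estimate, so the estimator behaves like a random variable. Averaging the estimates hints at whether it is unbiased. The normal population, sample size, and `trimmed_mean` helper are assumptions chosen for illustration.

```python
import random
import statistics

def trimmed_mean(sample, trim_fraction=0.1):
    """Mean after dropping the lowest and highest trim_fraction of the sorted sample."""
    ordered = sorted(sample)
    k = int(len(ordered) * trim_fraction)  # number of points cut from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

rng = random.Random(2)
true_mean = 10.0

# Each repetition draws a fresh sample, so each estimator yields a new estimate.
estimates = {"mean": [], "median": [], "trimmed mean": []}
for _ in range(2_000):
    sample = [rng.gauss(true_mean, 2.0) for _ in range(50)]
    estimates["mean"].append(statistics.fmean(sample))
    estimates["median"].append(statistics.median(sample))
    estimates["trimmed mean"].append(trimmed_mean(sample))

for name, values in estimates.items():
    # Average of the estimates (near true_mean if unbiased) and their spread.
    print(name, statistics.fmean(values), statistics.stdev(values))
```

For this symmetric population all three averages should land near the true mean, while the spreads differ. The spread is what separates a more efficient estimator from a less efficient one.
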
inference and confidence intervals
- p-value
- significance level
- sample size effect on interval length
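
A rough sketch of the sample-size effect, assuming the simplest case: a z-interval for a mean with known standard deviation `sigma`. The name `ci_length` is made up for this example. Under that assumption the interval length is 2 z sigma / sqrt(n), so quadrupling the sample size halves the interval.

```python
import math
from statistics import NormalDist

def ci_length(sigma, n, confidence=0.95):
    """Length of the z-interval for a mean: 2 * z * sigma / sqrt(n)."""
    # z is the standard normal quantile leaving (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return 2 * z * sigma / math.sqrt(n)

# Quadrupling n halves the interval length.
for n in (25, 100, 400):
    print(n, round(ci_length(sigma=1.0, n=n), 4))
```
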
regression
- SST = SSE + SSR
- Least squares is the MLE for the regression coefficients when the errors are i.i.d. normal
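
The decomposition can be checked numerically on a simple linear fit. This sketch assumes SSE means the residual (error) sum of squares and SSR the regression (explained) sum of squares. Textbooks differ on these abbreviations, and the data points are made up. The identity relies on the fit including an intercept.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def sums_of_squares(xs, ys):
    """Return (SST, SSE, SSR) for the least-squares line with an intercept."""
    b0, b1 = fit_line(xs, ys)
    y_bar = sum(ys) / len(ys)
    fitted = [b0 + b1 * x for x in xs]
    sst = sum((y - y_bar) ** 2 for y in ys)              # total variation
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))  # residual (error) part
    ssr = sum((f - y_bar) ** 2 for f in fitted)          # part explained by the fit
    return sst, sse, ssr

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]  # made-up data
sst, sse, ssr = sums_of_squares(xs, ys)
print(sst, sse + ssr)  # the two values agree up to rounding error
```
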
alternative approaches to inference
- the normality assumption may be wrong
- in that case t-distribution based inference can perform worse than the Wilcoxon signed-rank test
- rank tests
- Bayesian inference, credible interval vs. confidence interval
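
To make the rank-test idea concrete, here is a sketch of the Wilcoxon signed-rank statistic computed by hand. The function name and sample differences are illustrative, and zeros are dropped, which is one common convention. It uses only the ranks of the absolute differences, so a single extreme value cannot dominate the statistic.

```python
def signed_rank_statistic(diffs):
    """Return (W_plus, W_minus): rank sums of the positive and negative differences."""
    nonzero = [d for d in diffs if d != 0]  # zero differences are dropped
    ordered = sorted(nonzero, key=abs)      # rank by absolute size
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        # Find the run of tied absolute values starting at position i.
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + j) / 2 + 1          # ties share the average rank
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus

diffs = [1.2, -0.4, 2.5, 0.9, -0.1, 3.0, 0.0, 1.7]  # made-up paired differences
print(signed_rank_statistic(diffs))
```

Under the null hypothesis of a distribution symmetric about zero, the two rank sums should be of similar size. A large imbalance is evidence against it.
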