Agresti (2021). Foundations of Statistics for Data Scientists: With R and Python
Learning Objectives
The course introduces the main concepts of inferential statistics and regression.
Knowledge: Students will learn the main tools and methods of inferential statistics and regression: the concept of a statistical model, point estimation, interval estimation, and statistical hypothesis testing. Students will also be able to solve inferential problems in practice using the R software.
Acquired expertise: Students will be able to identify the most appropriate inferential procedure for the specific goals of an analysis. They will be able to critically assess the features and limitations of the models and methods presented during the course, and to properly interpret and describe the results of their analyses.
Prerequisites
Mathematics: basic operations and properties; functions; elementary functions (power, exponential, logarithm); integral and differential calculus; basic knowledge of matrix algebra.
Teaching Methods
Lectures, exercise sessions, and labs.
Assessment Methods
Written test. The written test consists of exercises on the inferential methods introduced during the course. In particular, it covers inferential statistics and regression, with exercises to be solved with the support of the R software.
The written test is graded on a 30-point scale, with honors where applicable. The test is passed with a grade of at least 18.
The exercises assess knowledge of the methods and tools introduced during the course and verify students' ability to select the most appropriate inferential procedure to answer practical problems.
Course Syllabus
Descriptive statistics: frequency distributions, graphical representations, measures of central tendency and variability.
Elements of probability: sample space and events, event algebra, axioms of probability, conditional probability.
Random variables: simple distributions, joint and conditional distributions, expected value and variance of random variables, models for discrete random variables, models for continuous random variables.
Sampling distributions and Central Limit Theorem.
Point and interval estimation: maximum likelihood (ML) estimators, properties of ML estimators, confidence intervals for the mean, proportion, and variance, and for the difference between the means and variances of two populations.
Hypothesis testing: motivation and framework, definition of statistical hypotheses and statistical tests, type I and type II errors, significance level and power of a test, tests for the mean, proportion, and variance, and for the difference between means and variances.
Linear regression: model definition, parameter estimation and inference, model adequacy and diagnostics.