
    Antonio BALZANELLA

    Course: STATISTICS FOR DATA SCIENCE

    Master's degree programme in DATA SCIENCE

    SSD (scientific-disciplinary sector): SECS-S/01

    CFU (university credits): 9.00

    Hours per teaching unit: 72.00

    Teaching period: First Semester

    Italian

    Teaching language

    English

    Contents

    Statistics is an introductory course that requires no prior knowledge of statistics. It presents the concepts and methods of descriptive statistics, emphasizing an understanding of the principles of data collection and data analysis and of the underlying theory. A significant part of the course focuses on the use of statistics in real cases.
    The main topics of the course include:
    Introduction to statistics
    Data collection
    Displaying and summarizing data
    Relationships between variables
    Understanding and comparing distributions

    Reference texts

    Intro Stats, 5th Edition
    De Veaux, Velleman & Bock
    2018
    Pearson

    Learning objectives

    To provide the skills to:

    1. Elucidate the concept of variability, and identify and pose questions that merit further investigation.
    2. Plan a statistical data investigation that includes identifying the variables and proposing a data collection strategy that meets the aims of the investigation.
    3. Collect, manage and store statistical data ready for analysis.
    4. Apply fundamental statistical methods to explore, analyse and visualise data.
    5. Interpret statistical analyses and draw conclusions about the context under study.
    6. Use the free statistical package R for statistical computing and data analysis.

    Prerequisites

    Basic knowledge of mathematics

    Teaching methods

    To achieve the stated objectives effectively, this course will use the following teaching strategies:
    Lectures

    The lectures will focus on the theoretical aspects of statistics. They are designed to develop understanding and knowledge of descriptive statistics and to strengthen students' skills in data collection, data analysis and interpretation.
    Laboratory sessions
    Laboratory exercises will be held to introduce students to the R software. The aim is to develop data analysis and interpretation skills using software for the analysis of statistical data.

    Assessment methods

    Written mid-term test
    Final examination based on the discussion of a case study. The mid-term test accounts for 50% of the final mark.

    Course syllabus

    PART I: EXPLORING AND UNDERSTANDING DATA
    1. Introduction to statistics
    1.1 What Is Statistics?
    1.2 Data
    1.3 Variables
    1.4 Models

    2. Displaying and Describing Data

    2.1 Summarizing and Displaying a Categorical Variable
    2.2 Displaying a Quantitative Variable
    2.3 Shape
    2.4 Center
    2.5 Spread

    3. Relationships Between Categorical Variables — Contingency Tables

    3.1 Contingency Tables
    3.2 Conditional Distributions
    3.3 Displaying Contingency Tables
    3.4 Categorical Variables

    4. Understanding and Comparing Distributions

    4.1 Displays for Comparing Groups
    4.2 Outliers


    5. The Standard Deviation as a Ruler and the Normal Model

    5.1 Using the Standard Deviation to Standardize Values
    5.2 Shifting and Scaling
    5.3 Normal Models
    5.4 Working with Normal Percentiles
    5.5 Normal Probability Plots

    PART II: EXPLORING RELATIONSHIPS BETWEEN VARIABLES
    6. Scatterplots, Association, and Correlation
    6.1 Scatterplots
    6.2 Correlation
    6.3 Warning: Correlation ≠ Causation
    6.4 *Straightening Scatterplots
    7. Linear Regression

    7.1 Least Squares: The Line of “Best Fit”
    7.2 The Linear Model
    7.3 Finding the Least Squares Line
    7.4 Regression to the Mean
    7.5 Examining the Residuals
    7.6 R² – The Variation Accounted for by the Model
    7.7 Regression Assumptions and Conditions
    8. Regression Wisdom

    8.1 Examining Residuals
    8.2 Extrapolation: Reaching Beyond the Data
    8.3 Outliers, Leverage, and Influence
    8.4 Lurking Variables and Causation
    8.5 Working with Summary Values
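
    As a rough illustration of how topics 5.1, 6.2 and 7.3 connect, the sketch below standardizes values, computes the correlation from z-scores, and derives the least squares line from the correlation and the standard deviations. It is only a sketch under stated assumptions: the data are invented, the function names are hypothetical, and Python is used purely for illustration (the course itself works in R).

```python
from statistics import mean, stdev

# Hypothetical toy data, invented purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def z_scores(values):
    """Standardize values: subtract the mean, divide by the sample SD."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def correlation(xs, ys):
    """Correlation as the sum of products of z-scores, divided by n - 1."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)


def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least squares line."""
    r = correlation(xs, ys)
    slope = r * stdev(ys) / stdev(xs)        # b1 = r * s_y / s_x
    intercept = mean(ys) - slope * mean(xs)  # line passes through the means
    return intercept, slope


r = correlation(x, y)
b0, b1 = least_squares_line(x, y)
print(f"r = {r:.3f}, R^2 = {r * r:.3f}")
print(f"predicted y = {b0:.2f} + {b1:.2f} * x")
```

    For this toy data the fitted line has intercept 2.2 and slope 0.6, and the squared correlation is 0.6, i.e. the line accounts for 60% of the variation in y.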

    English

    Teaching language

    English

    Contents

    Statistics is an introductory course that assumes no prior knowledge of statistics. Basic statistical concepts and methods are presented in a manner that emphasizes understanding the principles of data collection and analysis as well as the underlying theory. Much of the course will be devoted to discussing how statistics is commonly used in the real world.
    Main topics include:
    Introduction to Statistics
    Collecting Data
    Displaying and Summarizing Data
    Relationships Between Variables
    Understanding and Comparing Distributions

    Textbook and course materials

    Intro Stats, 5th Edition
    De Veaux, Velleman & Bock
    2018
    Pearson

    Course objectives

    To provide the skills to:
    1. Elucidate the concept of variation, and identify and pose statistical questions requiring investigation.
    2. Plan a statistical data investigation, including identifying variables and measures and proposing a method of data collection that will answer the question posed.
    3. Collect, manage and store statistical data ready for analysis.
    4. Apply fundamental statistical methods to explore, analyse and visualise data.
    5. Interpret statistical analysis and draw conclusions in context and in the presence of uncertainty.
    6. Use the free and powerful statistical package R for statistical computing and reproducible analysis.

    Prerequisites

    Basic Mathematics Skills

    Teaching methods

    To effectively fulfill its stated goals, this course will make use of the following teaching strategies:
    Lectures

    Lectures will focus on the theoretical aspects of statistics. They are designed to develop students’ understanding and knowledge of descriptive statistics, and to strengthen students’ skills in data collection, data analysis and interpretation.
    Laboratory Sessions
    A series of laboratory sessions will be given to familiarise students with the software R. The purpose is to develop the data analysis and interpretation skills that students need for the analysis of statistical data.

    Evaluation methods

    Written mid-term examination
    Final examination consisting of a case study discussion.
    The mid-term examination accounts for 50% of the final mark.

