Course, academic year 2024/2025
Statistical Methods for Data Analysis - NMST711
Title: Statistické metody analýzy dat
Guaranteed by: Department of Probability and Mathematical Statistics (32-KPMS)
Faculty: Faculty of Mathematics and Physics
Valid: from 2024
Semester: summer
E-Credits: 6
Hours per week, examination: summer semester: 2/2, C+Ex
Capacity: unlimited
Min. number of students: unlimited
4EU+: no
Virtual mobility / capacity: no
State of the course: taught
Language: English
Teaching methods: full-time
Additional information: https://dl1.cuni.cz/course/view.php?id=11676
Guarantor: doc. RNDr. Zdeněk Hlávka, Ph.D.
doc. Ing. Marek Omelka, Ph.D.
Teacher(s): doc. Ing. Marek Omelka, Ph.D.
Classification: Mathematics > External Subjects, Probability and Statistics
Annotation
Resampling methods, multivariate statistical analysis, nonparametric kernel methods. A course for the students of the Faculty of Social Sciences, Charles University.
Last update: Omelka Marek, doc. Ing., Ph.D. (18.02.2025)
Aim of the course

Selected chapters from mathematical statistics that are useful in data analysis.

Last update: Omelka Marek, doc. Ing., Ph.D. (18.02.2025)
Course completion requirements

(1) Several written assignments during the semester (summing to 60 points) in which the student analyses real data sets. The student needs at least 30 points in total from these assignments to be admitted to the oral exam.

(2) Oral exam (max 40 points) covering all topics of the course, with an emphasis on the theoretical part and correct understanding. During the oral exam the student may present revisions of the written assignments (to recover some points), provided that the original solution to the assignment earned at least 50% of the points.

The student needs at least 30 points from the original (not revised) solutions to the written assignments and at least 51 points in total to pass the course. The final grade is awarded based on the total number of points using the official faculty grading system, i.e.:

91 – 100 pts Excellent A

81 – 90 pts Very good B

71 – 80 pts Good C

61 – 70 pts Satisfactory D

51 – 60 pts Satisfactory- E

0 – 50 pts Failed F
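
The rules above can be sketched as a small function. This is only an illustration, not an official calculator: the function name and arguments are hypothetical, points are assumed to be whole numbers (the text does not say how fractional totals would be handled), and a student with fewer than 30 points from the original solutions is assumed to receive F because the course is not passed.

```python
# Lower bound of each passing grade band, taken from the table above.
GRADE_BANDS = [(91, "A"), (81, "B"), (71, "C"), (61, "D"), (51, "E")]

# Minimum points required from the original (not revised) assignment solutions.
MIN_ORIGINAL_ASSIGNMENT_POINTS = 30


def final_grade(original_assignment_points: int, total_points: int) -> str:
    """Return the letter grade for a student (hypothetical helper).

    original_assignment_points: points from original, unrevised solutions (0-60).
    total_points: assignments (including any revisions) plus oral exam (0-100).
    Assumption: whole-number points; failing either condition yields "F".
    """
    if not 0 <= original_assignment_points <= 60:
        raise ValueError("original_assignment_points must be between 0 and 60")
    if not 0 <= total_points <= 100:
        raise ValueError("total_points must be between 0 and 100")

    # Both conditions must hold to pass: 30+ original points and 51+ in total.
    if original_assignment_points < MIN_ORIGINAL_ASSIGNMENT_POINTS:
        return "F"
    for lower_bound, letter in GRADE_BANDS:
        if total_points >= lower_bound:
            return letter
    return "F"
```

For example, a student with 45 points from the original assignments and 80 points in total would fall into the Good (C) band, while a student with 29 original assignment points would not pass regardless of the total.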

Last update: Omelka Marek, doc. Ing., Ph.D. (18.02.2025)
Literature

Basic references

Venables, W. N., and Ripley, B. D. (2013). Modern applied statistics with S-PLUS. Springer Science & Business Media.

Wasserman, L. (2006). All of nonparametric statistics. Springer Science & Business Media.

Reading for meeting entry requirements

Kulich, M. and Omelka, M. (2023) NMSA331: Mathematical statistics 1. https://www2.karlin.mff.cuni.cz/~omelka/Soubory/nmsa331/ms1_en.pdf

Wasserman, L. (2013). All of statistics: a concise course in statistical inference. Springer Science & Business Media.

Further recommended literature (for those who are interested in some of the topics):

Davison, A. C. and Hinkley, D. V. (1997) Bootstrap methods and their application. Cambridge University Press.

Efron, B. and Tibshirani, R. J. (1993) An introduction to the bootstrap. Chapman & Hall.

Fan, J. and Gijbels, I. (1996) Local polynomial modelling and its applications. Chapman & Hall/CRC.

Härdle, W. and Simar, L. (2015) Applied multivariate statistical analysis. Springer.

Mardia, K. V., Kent, J. T., and Bibby, J. M. (1979) Multivariate analysis. Academic Press Inc.

Omelka, M. (2023) NMST434: Modern statistical methods. https://www.karlin.mff.cuni.cz/~omelka/Soubory/nmst434/nmst434_course-notes.pdf

Wand, M. P. and Jones, M. C. (1995) Kernel smoothing. Chapman & Hall.

Last update: Omelka Marek, doc. Ing., Ph.D. (18.02.2025)
Teaching methods

Lectures and exercises.

Last update: G_M (07.05.2014)
Requirements to the exam

The oral exam covers the syllabus of the course to the extent presented in the lectures.

Last update: Hlávka Zdeněk, doc. RNDr., Ph.D. (06.10.2017)
Syllabus

Basic notions, one-sample and two-sample tests.

ANOVA and problems of multiple comparisons.

Resampling methods with a focus on the nonparametric and parametric bootstrap.

Nonparametric regression: kernel estimators of densities and regression curves.

Multivariate statistical methods: visualisation, Hotelling's test, principal components, factor analysis, discriminant and cluster analysis.

Last update: Omelka Marek, doc. Ing., Ph.D. (18.02.2025)
Entry requirements

Basic courses in probability and mathematical statistics.

The student should know the following concepts:

random variable and its characteristics - expectation, variance, standard deviation, cumulative distribution function, probability density function, quantile function,

random vector and its characteristics - variance matrix, marginal distributions, covariance, correlation coefficient,

independence of random variables/vectors,

convergence in probability and in distribution, Law of Large Numbers, Central Limit Theorem,

basics of hypothesis testing - type I error, type II error, level (size) of the test, power of the test.

See for instance: Wasserman, L. (2013). All of statistics: a concise course in statistical inference. Springer Science & Business Media.

At least elementary knowledge of the R computing environment (https://cran.r-project.org/).

Last update: Omelka Marek, doc. Ing., Ph.D. (18.02.2025)
 