Course, academic year 2023/2024
Data Science 2 - NMFP436
Title: Data Science 2
Guaranteed by: Department of Probability and Mathematical Statistics (32-KPMS)
Faculty: Faculty of Mathematics and Physics
Valid: from 2021
Semester: summer
E-Credits: 5
Hours per week, examination: summer s.:2/2, C+Ex [HT]
Capacity: unlimited
Min. number of students: unlimited
4EU+: no
Virtual mobility / capacity: no
State of the course: taught
Language: English, Czech
Teaching methods: full-time
Guarantor: RNDr. Václav Kozmík, Ph.D.
doc. RNDr. Michal Pešta, Ph.D.
Marek Teller
Class: M Mgr. FPM
M Mgr. FPM > Compulsory elective
M Mgr. PMSE
M Mgr. PMSE > Compulsory elective
Classification: Informatics > Software Applications
Mathematics > Probability and Statistics
Is pre-requisite for: NMFP556
Annotation -
Last update: doc. RNDr. Martin Branda, Ph.D. (11.12.2020)
Machine learning is a crucial part of big data analysis. It is widely used and has proved successful at solving complex tasks in many fields. This course introduces basic machine learning principles and their use in practice. It presents the most widely used methods, such as decision trees and neural networks, which will be implemented in Python during the practicals. We will focus on the analysis of real data and the interpretation of the results.
Aim of the course -
Last update: RNDr. Jitka Zichová, Dr. (06.05.2021)

An introduction to basic machine learning principles and their use in practice.

Course completion requirements -
Last update: RNDr. Václav Kozmík, Ph.D. (09.02.2022)

Details can be found on the webpage: https://www2.karlin.mff.cuni.cz/~kozmikk/DS2.php

Literature -
Last update: RNDr. Václav Kozmík, Ph.D. (11.12.2020)

Ian Goodfellow, Yoshua Bengio, Aaron Courville: Deep Learning, MIT Press, 2016.

Jürgen Schmidhuber: Deep learning in neural networks: An overview, Neural networks 61 (2015): 85-117.

Friedman, J. H. (March 1999): Stochastic Gradient Boosting, Computational Statistics and Data Analysis, vol. 38, pp. 367-378

Teaching methods -
Last update: RNDr. Jitka Zichová, Dr. (06.05.2021)

Lecture + exercises.

Requirements to the exam -
Last update: RNDr. Václav Kozmík, Ph.D. (21.04.2022)

The exam will include solving a practical task in Python, followed by a discussion of the selected algorithm, its theoretical background, and the results achieved in the practical task. The student will receive a data set together with a description of the prediction task to be solved.

Syllabus -
Last update: RNDr. Václav Kozmík, Ph.D. (11.12.2020)

Lectures:

• introduction to machine learning, motivation, examples

• general methods in machine learning: splitting the dataset into training and validation sets, overfitting, regularization

• methods using decision trees: decision trees, random forest, gradient boosting

• methods using neural networks: simple neural networks, convolutional neural networks, recurrent neural networks

• clustering methods – supervised vs. unsupervised learning

• other classification methods – support vector machine, naive Bayes
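
As a minimal illustration of two of the lecture topics above (the training/validation split and decision trees), the sketch below splits a synthetic data set and fits a depth-1 decision tree (a decision stump) in plain Python. The data, the helper names (train_validation_split, fit_stump, accuracy) and the 25% validation share are assumptions made for illustration; they are not taken from the course materials.

```python
import random


def train_validation_split(rows, validation_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (training, validation) lists."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def accuracy(threshold, rows):
    """Share of rows for which the rule `x > threshold` matches the label."""
    hits = sum((x > threshold) == label for x, label in rows)
    return hits / len(rows)


def fit_stump(rows):
    """Depth-1 decision tree: pick the threshold with the best training accuracy."""
    xs = sorted(x for x, _ in rows)
    # Candidate thresholds are midpoints between neighbouring feature values.
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    return max(candidates, key=lambda t: accuracy(t, rows))


# Synthetic data (an assumption for illustration): the label is True when
# x > 0.5, and roughly 10% of the labels are flipped as noise.
rng = random.Random(42)
data = []
for _ in range(200):
    x = rng.random()
    label = x > 0.5
    if rng.random() < 0.1:
        label = not label
    data.append((x, label))

train, validation = train_validation_split(data)
threshold = fit_stump(train)          # fitted on the training part only
train_acc = accuracy(threshold, train)
val_acc = accuracy(threshold, validation)
print(f"threshold={threshold:.3f} train={train_acc:.3f} validation={val_acc:.3f}")
```

The threshold is chosen using the training part only, so the validation accuracy estimates how the rule behaves on data it has not seen. A gap between the two numbers is the usual symptom of overfitting.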

Practicals:

• Practicals will be held in a computer lab and will use Python

• Machine learning algorithms will be applied to real data

Entry requirements
Last update: doc. Ing. Marek Omelka, Ph.D. (19.11.2021)

Necessary:

  • Basic calculus: derivatives, integrals, Taylor expansion, etc.
  • Basic probability and statistics: probability distributions, central limit theorem, statistical tests and hypotheses, Fisher information, maximum likelihood estimators
  • Basic programming skills (in any language)

Good to know:

  • Python: some basics will be covered, but the course can be challenging for students with no prior Python experience

 