Course, academic year 2023/2024
Practical Fundamentals of Probability and Statistics for Computer Linguistics - NPFL136
Title: Praktické základy pravděpodobnosti a statistiky pro komputační lingvistiku
Guaranteed by: Institute of Formal and Applied Linguistics (32-UFAL)
Faculty: Faculty of Mathematics and Physics
Valid: from 2023
Semester: summer
E-Credits: 2
Hours per week, examination: summer s.: 0/2, C [HT]
Capacity: unlimited
Min. number of students: unlimited
4EU+: no
Virtual mobility / capacity: no
State of the course: not taught
Language: English
Teaching methods: full-time
Additional information: http://ufal.ms.mff.cuni.cz/courses/pfl081
Guarantor: RNDr. Martin Holub, Ph.D.
Class: Informatics, Master's - elective
Classification: Informatics > Computer and Formal Linguistics
Incompatibility: NPFL081
Interchangeability: NPFL081
Is incompatible with: NPFL081
Is interchangeable with: NPFL081
Annotation -
Last update: RNDr. Jiří Mírovský, Ph.D. (12.05.2022)
ONLY for students in the EM Program in LCT, see http://ufal.mff.cuni.cz/lct.html. The course introduces elementary probabilistic and statistical principles, techniques, and methods used to solve computational linguistics (natural language processing) tasks. An essential part of the course is active work with data and an introduction to the R workflow while solving a given task. Part of the course consists of individual study of mutually agreed materials.
Course completion requirements -
Last update: RNDr. Jiří Mírovský, Ph.D. (12.05.2022)

Students should attend classes regularly and pass a written test during the term and/or complete assigned tasks in R. Both theoretical knowledge and practical skills will be tested.

Literature -
Last update: RNDr. Jiří Mírovský, Ph.D. (12.05.2022)

Sheldon M. Ross: A First Course In Probability. (7th Ed.) Prentice Hall, 2005.

Gonick, Larry and Woollcott Smith. The Cartoon Guide to Statistics. Harper Resource. 2005.

Syllabus -
Last update: RNDr. Jiří Mírovský, Ph.D. (12.05.2022)
  • mathematical probability, its definition and calculation
  • random variable (discrete and continuous) and its probability distribution
  • distribution function, quantile function, density
  • statistical independence
  • expected value and variance
  • properties of binomial and normal distributions
  • random sampling
  • parameters of distributions, parameter estimation, t-test
  • statistical hypothesis testing, critical values
  • contingency tables, hypothesis testing in contingency tables
  • chi-square distribution, chi-square tests
  • entropy, conditional entropy, mutual information
  • basics of programming in R (www.r-project.org)

 