Course, academic year 2024/2025
Data science in R - B90296
Title: Data Science in R
Guaranteed by: Institute of Pathological Physiology First Faculty of Medicine Charles University (11-00180)
Faculty: First Faculty of Medicine
Valid: from 2024
Semester: both
Points: 0
E-Credits: 0
Hours per week, examination: 10/0, C [HS]
Capacity: unlimited
Min. number of students: unlimited
4EU+: no
Virtual mobility / capacity: no
State of the course: taught
Language: English
Teaching methods: combined
Note: the course is intended for doctoral students only
the course can be enrolled in outside the study plan
enrollment is possible in both the winter and the summer semester
Guarantor: doc. MUDr. Petr Waldauf, Ph.D.
prof. RNDr. Jan Hendl, CSc.
Annotation
The course teaches participants the basics of working with the statistical program R in the RStudio graphical environment. The goal is to acquire fundamental experience and practical skills for rapid and efficient processing of clinical data using the tidyverse package collection. The course covers an introduction to statistical data processing in R (descriptive statistics, advanced data visualization, basic statistical tests, and an introduction to multivariate analysis). No prior experience with the R programming language is required. Instruction is hands-on, directly in RStudio, which participants install on their own computers. Teaching scripts and data sets will be provided. Participants must bring their own laptop.

The topics covered include, for example:

  • Introduction to R: overview of data science, software for data science, study literature, DataCamp, cheatsheets, introduction to R and RStudio, installation of R and RStudio, first commands in R
  • How to load and organize your data (introduction to tidyverse): why tidyverse, data import into R (csv, xlsx) with readr, glimpse, introduction to data wrangling with the pipeline, dplyr (select, filter, mutate, arrange, rename, group_by, summarise)
  • Images for articles and posters (introduction to visualization): ggplot (scatter plot, bar plot, box plot, histogram, facets, geom_smooth)
  • Publishing directly from RStudio (introduction to markdown): why markdown, reproducible research, pandoc, html, MS Word, pdf
  • Working with factors, strings, and dates/times: factor, forcats, stringr, lubridate
  • Introduction to functions
  • Relational data: wide + long data formats, what relational data are, how data are stored in databases, left_join, right_join, full_join, anti_join, wide
  • Introduction to statistical data processing: descriptive statistics (gtsummary), t-test (paired, unpaired), non-parametric tests, categorical data, chi-square test, Fisher's test, linear regression, logistic regression, time-to-event analysis, multivariate linear, logistic, and Cox regression…
Last update: Machová Marie, Bc., DiS. (15.08.2024)
Literature - Czech

Required:

  • Wickham, Hadley; Grolemund, Garrett. R for data science: import, tidy, transform, visualize and model data. Beijing; Boston; Farnham; Sebastopol; Tokyo: O'Reilly Media, 2016, 492 p. ISBN 978-1-491-91039-9.
  • Biostatistics With 'R': A Guide for Medical Doctors [online]. Available from: https://www.bigbookofr.com/

Last update: Machová Marie, Bc., DiS. (15.08.2024)