Course, academic year 2023/2024
Statistics in Biology III - Seminar on Advanced Statistical Methods - MB120P174
Title: Statistics in Biology III - Seminar on Advanced Statistical Methods
Czech title: Biostatistika III – Seminář pokročilých statistických metod
Guaranteed by: Department of Botany (31-120)
Faculty: Faculty of Science
Actual: from 2023
Semester: summer
E-Credits: 5
Examination process: summer s.:
Hours per week, examination: summer s.:0/6, C+Ex [DS]
Capacity: 20
Min. number of students: 5
4EU+: no
Virtual mobility / capacity: no
State of the course: not taught
Language: Czech
Note: enabled for web enrollment
Guarantor: prof. RNDr. Tomáš Herben, CSc.
Prerequisite: MB120P163
Annotation -
Last update: Mgr. Michal Štefánek (29.06.2019)
The course introduces students to a selection of frequently used advanced techniques of statistical data
analysis. It is a sequel to Statistics in biology and design of ecological experiments (MB120P163), which
is a prerequisite for this course. In justified cases (e.g. completion of a similar course in data analysis), the teachers
will allow enrolment without the prerequisite. The course consists of three two-day
teaching blocks combining lectures and practicals: 1) non-normally distributed response variables, analysed with generalised
linear models (GLM); 2) hierarchical experimental designs, analysed with mixed-effect models (LME, GLMM) and nested
ANOVAs; 3) models with spatially, temporally or phylogenetically correlated responses, analysed with generalised least
squares (GLS, PGLS).
Literature -
Last update: RNDr. Zdeněk Janovský, Ph.D. (25.10.2019)

Recommended literature:
Crawley, M. J. (2007) The R book. John Wiley & Sons Ltd., Chichester, UK.
Venables, W. N. & Ripley, B. D. (2002) Modern Applied Statistics with S. Springer-Verlag, New York, NY, USA.
Fitting Linear Mixed-Effects Models Using lme4 - https://cran.r-project.org/web/packages/lme4/vignettes/lmer.pdf
Pinheiro, J. C. & Bates, D. M. (2000) Mixed-Effects Models in S and S-PLUS. Springer-Verlag, New York, NY, USA.
Zuur, A., Ieno, E.N., Walker, N., Saveliev, A.A., Smith, G.M. (2009) Mixed Effects Models and Extensions in Ecology with R. Springer-Verlag, New York, NY, USA.
Zuur, A., Ieno, E.N., Smith, G.M. (2007) Analysing Ecological Data. Springer-Verlag, New York, NY, USA.
Swenson, N. (2014) Functional and Phylogenetic Ecology in R. Springer, New York, NY, USA.
Paradis, E. (2012) Analysis of Phylogenetics and Evolution with R. Springer, New York, NY, USA.

Requirements for the exam -
Last update: RNDr. Zdeněk Janovský, Ph.D. (25.10.2019)
The exam grade is awarded on the basis of three graded home assignments (one after each teaching block), which together involve analysing six data sets using the statistical techniques taught. To pass the exam, students must obtain at least 60% of the points from the home assignments.
Syllabus -
Last update: RNDr. Zdeněk Janovský, Ph.D. (19.12.2020)

Schedule of the individual two-day blocks:

Block 1 - Generalized Linear Models (GLM) and introduction to hierarchical designs

Day 1 - morning
Theory (approx. 2 h) - Introduction to GLM, concept of deviance, link functions, etc., introduction to logistic regression
Exercise (approx. 1 h) - logistic regression, its assumptions, interpretation, construction of confidence intervals for the logistic curve
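
As a minimal sketch of the confidence-interval construction named in the exercise above: one common approach (an assumption here, since the syllabus does not specify the method) is to build a Wald interval on the link (logit) scale and back-transform it with the inverse logit, which keeps the band between 0 and 1. The coefficient values, the covariance matrix and the function names are hypothetical and serve for illustration only.

```python
import math

def inv_logit(eta):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def logistic_ci(x, beta, cov, z=1.96):
    """Wald confidence band for a fitted logistic curve at predictor value x.

    beta = (intercept, slope); cov = 2x2 covariance matrix of the estimates.
    The interval is computed on the logit scale and then back-transformed.
    """
    b0, b1 = beta
    eta = b0 + b1 * x
    # Var(b0 + b1*x) = Var(b0) + 2*x*Cov(b0, b1) + x^2 * Var(b1)
    var_eta = cov[0][0] + 2.0 * x * cov[0][1] + x * x * cov[1][1]
    se = math.sqrt(var_eta)
    return inv_logit(eta - z * se), inv_logit(eta), inv_logit(eta + z * se)

# Hypothetical estimates, as they might come out of a fitted binomial GLM
beta = (-2.0, 0.8)
cov = [[0.25, -0.05], [-0.05, 0.02]]
for x in (0.0, 2.5, 5.0):
    lower, fit, upper = logistic_ci(x, beta, cov)
    print(f"x={x:.1f}  fit={fit:.3f}  95% CI=({lower:.3f}, {upper:.3f})")
```

Because the interval is symmetric only on the logit scale, the back-transformed band is generally asymmetric around the fitted probability.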

Day 1 - afternoon
Theory (approx. 1 h) - GLM with binomial and Poisson distribution, treatment of overdispersion
Exercises (approx. 3 h) - practical analyses using Poisson and binomial GLM, interpretation of diagnostic plots, detection and treatment of overdispersion
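
The detection of overdispersion mentioned above can be sketched with the Pearson dispersion statistic. The example assumes the simplest possible case, an intercept-only Poisson model, for which the fitted mean equals the sample mean. The data and the function name are hypothetical, and the course may well use other diagnostics.

```python
import math

def pearson_dispersion(counts):
    """Pearson dispersion statistic for an intercept-only Poisson model.

    For this model the fitted mean is the sample mean; values well above 1
    suggest overdispersion relative to the Poisson assumption.
    """
    n = len(counts)
    mu = sum(counts) / n
    chi2 = sum((y - mu) ** 2 / mu for y in counts)
    return chi2 / (n - 1)  # n - 1 residual degrees of freedom

counts = [0, 1, 0, 7, 2, 0, 12, 1, 0, 5]  # hypothetical count data
phi = pearson_dispersion(counts)
print(f"dispersion = {phi:.2f}")
# A quasi-Poisson style correction inflates standard errors by sqrt(phi)
print(f"SE inflation factor = {math.sqrt(phi):.2f}")
```

In a model with covariates, the same ratio is taken over the fitted means of the full model and its residual degrees of freedom.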

 

Day 2 - morning
Theory (approx. 1 h) - GLM with gamma distribution, non-canonical link functions
Exercise (approx. 2 h) - practical exercises on GLM from the whole spectrum of variants discussed so far

Day 2 - afternoon
Theory (approx. 1 h) - Hierarchical data designs and hierarchical ANOVA (split-plot, hierarchical ANOVA sensu stricto)
Exercises (approx. 1.5 h) - Identification of individual layers of hierarchical designs, practical implementation of hierarchical ANOVAs, auxiliary linear models for verification of assumptions
Theory (approx. 1.5 h) - Revision of the concept of random-effect factors and introduction to linear mixed-effect models

1st graded homework: analysis of two data sets focused on GLM usage


Block 2 - Mixed Effect Models - Linear (LME) and Generalized Linear (GLMM)

Day 1 - morning
Theory (approx. 1 h) - LME - continuation, interpretation of LME, introduction to LME testing
Exercise (approx. 2 h) - LME with one random effect, introduction to testing of fixed effect factors, interpretation of LME results

Day 1 - afternoon
Theory (approx. 1 h) - LME - continuation, differences between random-effect and mixed-effect factors, testing of random-effect factors in LME
Exercise (approx. 3 h) - LME with multiple factors with random and mixed effects

Day 2 - morning
Theory (approx. 1 h) - construction of confidence intervals in LME - model profiling and other CI construction methods, expression of the amount of explained variability within LME (pseudo-R2)
Exercise (approx. 2 h) - construction of confidence intervals for LME, calculation of pseudo-R2

Day 2 - afternoon
Theory (approx. 1 h) - transition from LME to GLMM, common problems when working with mixed effect models and how to deal with them
Exercises (approx. 3 h) - GLMM exercises and revision exercises for mixed-effect models

2nd graded homework: analysis of two data sets with hierarchical design


Block 3 - Data with temporal, spatial or phylogenetic correlation between observations - Generalized Least Squares (GLS)

Day 1 - morning
Theory (approx. 1 h) - Introduction to GLS and its use for heteroscedastic data, introduction to temporal and spatial autocorrelation of data, time series analyses, detection of spatial autocorrelation (semivariograms), functions useful for approximating the semivariogram
Exercise (approx. 2 h) - GLS with weights (heteroscedasticity), 1st-order autoregressive models, ARIMA models, spatial data autocorrelation
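
To illustrate the 1st-order autoregressive structure named above: under AR(1), the correlation between two equally spaced observations decays as rho raised to the number of time steps separating them. The sketch below builds that correlation matrix and uses the sample lag-1 autocorrelation as a simple moment estimate of rho. The residual values and function names are hypothetical, and a GLS fit would normally estimate rho jointly with the other parameters rather than in this way.

```python
def ar1_correlation(n, rho):
    """Correlation matrix of n equally spaced observations under AR(1)."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]

def lag1_autocorrelation(resid):
    """Sample lag-1 autocorrelation, a simple moment estimate of rho."""
    m = sum(resid) / len(resid)
    num = sum((a - m) * (b - m) for a, b in zip(resid, resid[1:]))
    den = sum((a - m) ** 2 for a in resid)
    return num / den

resid = [0.9, 0.7, 0.2, -0.3, -0.8, -0.6, -0.1, 0.4]  # hypothetical residuals
rho = lag1_autocorrelation(resid)
for row in ar1_correlation(4, rho):
    print(["%.2f" % v for v in row])
```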

Day 1 - afternoon
Theory (approx. 1 h) - Introduction to working with phylogenetic data, models of character evolution, phylogenetically independent contrasts (PIC)
Exercise (approx. 3 h) - recording and editing of phylogenetic data, mapping of characters to phylogenetic trees, analysis of data using PIC

Day 2 - morning
Theory (approx. 1 h) - Phylogenetic GLS (pGLS) and transformation of a phylogenetic tree into a variance-covariance matrix, phylogenetic RMA (reduced major axis regression)
Exercise (approx. 2 h) - analysis of data sets with available data on phylogeny
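
The transformation of a phylogenetic tree into a variance-covariance matrix can be sketched as follows, assuming a Brownian motion model of character evolution (the syllabus does not say which model is used). Under that model, the covariance of two tips equals the branch length they share between the root and their most recent common ancestor, and the variance of a tip equals its distance from the root. The nested-tuple tree encoding and the function name are hypothetical conveniences for this example.

```python
def vcv_brownian(tree):
    """Tip variance-covariance matrix implied by a tree under Brownian motion.

    A node is (content, branch_length): content is a tip name (str) or a
    list of child nodes. Covariance of two tips equals the branch length
    they share between the root and their most recent common ancestor.
    """
    cov = {}

    def visit(node, depth_above):
        content, length = node
        depth = depth_above + length
        if isinstance(content, str):
            cov[(content, content)] = depth  # variance = root-to-tip distance
            return [content]
        groups = [visit(child, depth) for child in content]
        # Tips in different child clades diverge at this node
        for i, left in enumerate(groups):
            for right in groups[i + 1:]:
                for a in left:
                    for b in right:
                        cov[(a, b)] = cov[(b, a)] = depth
        return [tip for group in groups for tip in group]

    tips = visit(tree, 0.0)
    return tips, [[cov[(a, b)] for b in tips] for a in tips]

# Hypothetical ultrametric tree ((A:1,B:1):2,C:3), root branch length 0
tree = ([([("A", 1.0), ("B", 1.0)], 2.0), ("C", 3.0)], 0.0)
tips, matrix = vcv_brownian(tree)
print(tips)
for row in matrix:
    print(row)
```

A matrix of this kind is what a GLS fit takes as the assumed error covariance structure among species.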

Day 2 - afternoon
Theory (approx. 0.5 h) - Phylogenetic principal component analysis (phylPCA)
Exercise (approx. 1.5 h) - continuation of tasks from morning + phylPCA

Seminar (approx. 2 h) - discussion of model problems, focusing on identifying the nature of the data and selecting appropriate analytical techniques

3rd graded homework: analysis of two data sets with spatial, temporal or phylogenetic correlation in the response variable

 
Charles University | Information system of Charles University | http://www.cuni.cz/UKEN-329.html