Module Overview

Data Science for Physics 3

This module introduces the learner to the principles and practice of modelling with advanced classification and regression methods in Data Science, with a particular focus on classification and clustering. The module emphasises applying these principles to scientific and, in particular, physical data.

Module Code

PHYS 4001

ECTS Credits

5

*Curricular information is subject to change
  • Statistical learning;
    • Maximum margin methods and constrained optimization problems;
    • Slack variables; Support vector machines;
    • Kernel transformations;
  • Regression analysis;
    • Multivariate linear regression; metrics of performance;
    • Principal components regression and partial least squares regression;
    • Support vector regression;
    • Application to classification problems (e.g. PLS discriminant analysis and logistic regression);
  • Networks;
    • Neural networks;
    • Introduction to deep learning;
  • Clustering techniques;
    • Hierarchical (HCA) and partitional clustering;
    • Distance metrics; centre-based, contiguity-based, density-based;
    • HCA; agglomerative and divisive methods; measures of inter-cluster similarity; measures of cluster validity;
    • Hard and soft clustering techniques; k-means and C-means; fuzzy algorithms;
  • Data visualisation techniques;
    • Visualisation for multivariate data;
    • Force-based graphs and networks;

Programming will be taught entirely in the computer laboratory, supplemented by lectures. The computer laboratory will be used throughout the syllabus to maximise hands-on interaction with the subject matter.

Module Content & Assessment
Assessment Breakdown      %
Other Assessment(s)       100