The aim of this module is to give learners an understanding of how natural language processing (NLP) and machine learning techniques can be applied to text analytics.
Overview & Methodology
Text and web mining methodologies, such as adaptations of CRISP-DM. Application domains for text mining and case studies.
Applying structure to textual information
Extracting structure using NLP techniques such as tokenisation, phrase recognition, named entity recognition and concept chunking. Vector generation.
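As an illustration of tokenisation and vector generation, the following is a minimal sketch that turns raw text into term-frequency vectors. The function names (`tokenise`, `build_vocabulary`, `vectorise`) and the regular-expression tokeniser are assumptions made for this example, not part of the module material; a real pipeline would typically use an NLP library.

```python
import re
from collections import Counter


def tokenise(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(documents):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in tokenise(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def vectorise(text, vocabulary):
    """Return a term-frequency vector for the text over the vocabulary."""
    counts = Counter(tokenise(text))
    # Dicts preserve insertion order, so columns follow the sorted vocabulary.
    return [counts.get(tok, 0) for tok in vocabulary]


# Hypothetical two-document corpus used only for illustration.
docs = ["Text mining finds structure in text.", "Web mining crawls the web."]
vocab = build_vocabulary(docs)
vectors = [vectorise(doc, vocab) for doc in docs]
```

Each document becomes a fixed-length list of counts, which is the form that similarity measures, classifiers and clustering methods usually expect.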
Web mining
Crawling the web. Web content mining; web usage mining. Information extraction. Semantic Web.
Classification
Learning methods for classification such as similarity measures; nearest neighbour methods; decision rules; probability-based methods. Evaluating model performance.
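To make the link between similarity measures, nearest neighbour methods and evaluation concrete, here is a minimal sketch of a k-nearest-neighbour classifier over document vectors using cosine similarity, with a simple accuracy measure. The function names, the choice of cosine similarity and the tiny training set are assumptions for illustration only.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(query, training, k=3):
    """Label the query by majority vote among its k most similar training vectors."""
    ranked = sorted(
        training,
        key=lambda item: cosine_similarity(query, item[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(predictions, gold):
    """Fraction of predictions that match the gold labels."""
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


# Hypothetical labelled term-frequency vectors.
training = [
    ([2, 0, 1], "sport"),
    ([3, 0, 0], "sport"),
    ([0, 2, 1], "politics"),
    ([0, 3, 0], "politics"),
]
predicted = knn_classify([1, 0, 0], training, k=3)
```

Accuracy is only one of several possible performance measures; precision and recall are common alternatives when classes are imbalanced.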
Clustering
Measures of similarity. Methods for clustering documents such as k-means clustering and hierarchical clustering.
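The k-means method named above can be sketched in a few lines. This is a minimal version under stated assumptions: squared Euclidean distance as the (dis)similarity measure, initial centroids supplied by the caller so the result is deterministic, and hypothetical function names.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def k_means(vectors, initial_centroids, max_iterations=100):
    """Alternate assignment and centroid update until assignments stop changing."""
    centroids = [list(c) for c in initial_centroids]
    assignments = None
    for _ in range(max_iterations):
        # Assignment step: each vector joins its nearest centroid.
        new_assignments = [
            min(range(len(centroids)), key=lambda j: squared_distance(v, centroids[j]))
            for v in vectors
        ]
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [v for v, a in zip(vectors, assignments) if a == j]
            if members:  # leave an empty cluster's centroid where it is
                centroids[j] = mean_vector(members)
    return assignments, centroids


# Hypothetical 2-dimensional document vectors forming two obvious groups.
points = [[0, 0], [0, 1], [5, 5], [6, 5]]
labels, centres = k_means(points, initial_centroids=[[0, 0], [5, 5]])
```

In practice the initial centroids are usually chosen at random or by a seeding heuristic, and the algorithm is run several times because the result depends on that choice.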
Big Data analytics
Introduction to the challenges of working with Big Data and large-scale file systems. Adapting standard analytics techniques to run in a distributed environment. Emerging trends.
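One common way to adapt an analytics task to a distributed environment is to split it into a per-partition "map" step and an associative "reduce" step. The sketch below simulates this pattern for word counting on a single machine; the function names and the in-memory partitions are assumptions for illustration, and no particular distributed framework is implied.

```python
from collections import Counter
from functools import reduce


def map_partition(lines):
    """Count whitespace-separated words in one partition of the data."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Combine two partial counts; order of merging does not affect the result."""
    return left + right


def word_count(partitions):
    """Map every partition independently, then reduce the partial results."""
    partial_counts = (map_partition(p) for p in partitions)
    return reduce(merge_counts, partial_counts, Counter())


# Hypothetical data already split into two partitions.
partitions = [["big data", "data mining"], ["text mining"]]
totals = word_count(partitions)
```

Because each partition is processed independently and the merge is associative, the map step could run on separate machines, with only the small partial counts sent over the network.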
Module Content & Assessment

Assessment Breakdown | % |
---|---|
Other Assessment(s) | 40 |
Formal Examination | 60 |