The Data Wrangling module gives students the opportunity to learn the skills and techniques most often used to build and manipulate information sources for Data Analysis tasks. It is intended to build a student’s skills and confidence with Data Programming tasks and to provide a firm foundation for further Data Analytics study.
Unit 1: Data Types – Structured versus Unstructured, Numeric versus Categorical
Unit 2: Data Programming Languages – Common Features: Frames, Matrix Manipulation, Graphics Libraries
Unit 3: Data Programming Languages II – Reading and Writing Common Data Types to File
Unit 4: Data Selection: Indices versus Named Access
Unit 5: Data Processing Operations: Applying functions to subsets of a Data Table
Unit 6: APIs for Data Access. Twitter API, Reddit API, Accessing APIs in Data Programming Languages
Unit 7: Data Storage Types: Spreadsheet Files, Relational Databases, NoSQL Databases
Unit 8: Retrieving and Storing API Data for Future Analysis
Unit 9: Data Quality – Common Quality Concerns with Data: Missing Data, Inaccurate Data
Unit 10: Data Cleansing Methods – Dropping Rows, Interpolating Values
Unit 11: Data Normalisation: Z-Score Normalisation vs Min-Max Normalisation
Unit 12: Integrating Heterogeneous Data Sets
Unit 13: Review
The module is designed to be delivered within a blended learning model, employing mixed modes (online and face-to-face) of learning, teaching and assessment.
This module is assessed entirely by continuous assessment (100% CA) and is delivered through a blend of lecture, tutorial and lab sessions.
The lecturer will define the specifics of the module on a year-by-year basis, but the languages and tools used are intended to align closely with those the programme committee has identified as useful to the programme.
Module Content & Assessment | |
---|---|
Assessment Breakdown | % |
Other Assessment(s) | 100 |