Module Overview

Big Data Technologies

The aim of this subject is to instil an understanding of the key architectures of NoSQL database technologies and how they differ from relational database technologies, both physically and conceptually. The student will also gain an appreciation of data warehousing from both relational and non-relational database perspectives.

Module Code

DBAS H3001

ECTS Credits

5

*Curricular information is subject to change

DATA WAREHOUSING

Understanding the need for On-Line Analytical Servers; Patterns of Data Warehouse and Data Mart Development; Data warehouse architecture and design (e.g. Star Schemas, Snowflake, Starflake, Fact Constellations); Measures; Slowly Changing Dimensions; Data Warehouse OLAP Query Tools

NOSQL PARADIGMS

NoSQL Database Systems: Document Databases (e.g. MongoDB); Key-Value Stores (e.g. Riak); Column Family Databases (e.g. Cassandra, HBase); Graph Databases (e.g. Neo4j); Object Oriented DB; Object Relational DB; Domain Driven Design (Aggregates); Sharding; Replication (Master-Slave, Peer-to-Peer); Schema-less environments; Brewer's CAP Theorem; BASE transactions; Eventual Consistency; Tunable Consistency; Logical Ring (vNodes, Consistent Hashing, Read-Repair, Hinted-Handoff)

HADOOP BIG DATA ENVIRONMENTS

Big Data and the 5 V's; Hadoop HDFS Storage: File Splitting/Slicing, Blocks, Rack Awareness, Replication, Write Operation, Read Operation, NameNode, DataNode, Secondary NameNode (2NN), Standby NameNode for HA, DataNode Failure; Hadoop Processing: YARN, ResourceManager, Application Master Container, Containers, NodeManager; MapReduce; MapReduce Tools (e.g. HiveQL, Pig, Impala); Spark Framework; Ingesting data (e.g. Apache Flume, Apache Sqoop)

Lectures, labs and independent study.

Module Content & Assessment
Assessment Breakdown    %
Formal Examination      60
Other Assessment(s)     40