The aim of this subject is to instil an understanding of the key architectures of NoSQL database technologies and how they differ, both physically and conceptually, from relational database technologies. The student will also gain an appreciation of data warehousing from both relational and non-relational perspectives.
DATA WAREHOUSING
Understanding the need for On-Line Analytical Servers; Patterns of Data Warehouse and Data Mart Development; Data Warehouse Architecture and Design (e.g. Star Schemas, Snowflake, Starflake, Fact Constellations); Measures; Slowly Changing Dimensions; Data Warehouse OLAP Query Tools
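As an illustration of the star schema topic above, the following is a minimal sketch using Python's built-in `sqlite3` module. The table names (`fact_sales`, `dim_product`, `dim_date`), the columns, and the sample rows are all hypothetical and chosen only for illustration; a production warehouse would normally use a dedicated engine rather than SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables describe the context of each measurement (hypothetical).
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, year INTEGER);
    -- The fact table holds the measure plus a foreign key to each dimension.
    CREATE TABLE fact_sales (
        product_key INTEGER REFERENCES dim_product(product_key),
        date_key    INTEGER REFERENCES dim_date(date_key),
        amount      REAL
    );
    INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Games');
    INSERT INTO dim_date    VALUES (10, 2023), (11, 2024);
    INSERT INTO fact_sales  VALUES (1, 10, 20.0), (1, 11, 30.0), (2, 11, 50.0);
""")

# An OLAP-style roll-up: total the measure by two dimension attributes.
rows = conn.execute("""
    SELECT p.category, d.year, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    JOIN dim_date    AS d ON d.date_key = f.date_key
    GROUP BY p.category, d.year
    ORDER BY p.category, d.year
""").fetchall()
print(rows)
```

The fact table sits at the centre and joins directly to each dimension, which is what gives the schema its star shape; a snowflake design would further normalise the dimension tables.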
NOSQL PARADIGMS
NoSQL Database Systems: Document Databases (e.g. MongoDB); Key-Value Stores (e.g. Riak); Column Family Databases (e.g. Cassandra, HBase); Graph Databases (e.g. Neo4j); Object-Oriented DB; Object-Relational DB; Domain-Driven Design (Aggregates); Sharding; Replication (Master-Slave, Peer-to-Peer); Schema-less environments; Brewer's CAP Theorem; BASE transactions; Eventual Consistency; Tunable Consistency; Logical Ring (vNodes, Consistent Hashing, Read-Repair, Hinted-Handoff)
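To make the logical ring topic concrete, here is a rough sketch of consistent hashing with virtual nodes in Python. It is a teaching toy, not the implementation used by Cassandra or Riak: the `HashRing` class, the choice of MD5, the node names, and the figure of 8 vnodes per node are all assumptions made for illustration.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a ring position using MD5 (an arbitrary choice here)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring in which each physical node owns several vnodes."""

    def __init__(self, nodes, vnodes_per_node=8):
        self._ring = []  # (position, node) pairs, sorted by position
        for node in nodes:
            for i in range(vnodes_per_node):
                self._ring.append((ring_position(f"{node}#vnode{i}"), node))
        self._ring.sort()
        self._positions = [pos for pos, _ in self._ring]

    def node_for(self, key: str) -> str:
        """Return the node owning the first vnode at or after the key's position."""
        idx = bisect.bisect_left(self._positions, ring_position(key))
        return self._ring[idx % len(self._ring)][1]  # wrap around the ring


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{n}" for n in range(1000)]
before = {k: ring.node_for(k) for k in keys}

# Adding a fourth node only moves the keys that its new vnodes take over.
bigger = HashRing(["node-a", "node-b", "node-c", "node-d"])
after = {k: bigger.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved")
```

Because the existing vnode positions do not change when a node joins, every key that moves lands on the new node, and the rest stay where they were. With naive `hash(key) % node_count` placement, most keys would typically be reassigned instead.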
HADOOP BIG DATA ENVIRONMENTS
Big Data and the 5 V's; Hadoop HDFS Storage: File Splitting/Slicing, Blocks, Rack Awareness, Replication, Write Operation, Read Operation, NameNode, DataNode, SecondaryNameNode (2NN), Standby NameNode for HA, DataNode Failure; Hadoop Processing: YARN, ResourceManager, ApplicationMaster, Containers, NodeManager; MapReduce; MapReduce Tools (e.g. HiveQL, Pig, Impala); Spark Framework; Ingesting Data (e.g. Apache Flume, Apache Sqoop)
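The MapReduce processing model listed above can be sketched in a few lines of plain Python. This is a single-process simulation of the map, shuffle, and reduce phases and does not use the Hadoop API; the function names and the sample input are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    mapped = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


sample = ["big data big clusters", "data nodes store blocks"]
counts = word_count(sample)
print(counts)
```

On a real cluster the mappers would run in parallel against HDFS blocks and the shuffle would move data between nodes; here everything happens in memory, so only the data flow is shown.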
Lectures, labs and independent study.
Module Content & Assessment | |
---|---|
Assessment Breakdown | % |
Formal Examination | 60 |
Other Assessment(s) | 40 |