Content
This module will cover the following topic areas:
Data Storage and Retrieval
- Importance of data for business.
- Understand the difference between data, information and knowledge.
- Traditional ways to store and retrieve data.
- Big Data challenges and opportunities.
Introduction to Big Data
- Defining Big Data: Sources of Big Data; The four dimensions of Big Data: Volume, velocity, variety, veracity; Introducing storage and MapReduce.
- Business application of Big Data: Big Data applications/examples in business; Delivering business benefit from Big Data; Establishing the business importance of Big Data.
- Addressing the challenge of extracting useful data/knowledge.
- Integrating Big Data with traditional data.
SQL Databases vs. NoSQL Databases
- Understand the growing amounts of data.
- Relational database management systems (RDBMSs).
- Capabilities of traditional RDBMSs.
- Overview of Structured Query Languages (e.g. SQL).
- Introduction to NoSQL databases.
- Understanding the difference between a relational DBMS and a NoSQL database.
- Identifying the need to employ a NoSQL DB.
Storing Big Data
- Analysing data characteristics: Selecting data sources for analysis.
- Introduction to selected Big Data stores, such as Hadoop, Cassandra, Amazon S3 and Bigtable.
Achieving Data Quality
- Introduction to data quality.
- Why is data quality a business problem?
- Problems when data is not "fit for purpose".
- Preparing data.
- Ways to improve data quality.
- Understand ETL (Extract, Transform, Load) procedures to improve data quality.
Knowledge-based Information Retrieval
- Introduction to knowledge-based information retrieval.
- Use of ontologies for knowledge modelling.
- Learn how to build an ontology to link knowledge with data.
- Using ontologies for information retrieval - case study.
- Machine learning for knowledge acquisition: Introduction to machine learning and pattern recognition; Capabilities of different modelling, analysis and algorithmic techniques.
Big Data and Cloud Computing (technology, challenges and trends)
- Cost of storing Big Data.
- Is cloud computing a solution?
- Issues: Privacy and trust.
- Future of Big Data and cloud computing.
- Future research trends in Big Data.
Learning and Teaching
The module is delivered through weekly lectures and tutorial sessions, which take place in consecutive weeks.
Each lecture directs the course and introduces the new ideas and skills required. Small-group tutorial sessions then enable each student to carry out the study and research exercises described in the associated worksheet under the guidance of a tutor.
The teaching material is available from Blackboard (our online learning environment).
A course text is also recommended.
Study time
This module (course) involves 2 hours of direct contact time per week for one semester, divided equally between lecture and tutorial sessions.
A 15-credit module such as this one is expected to take 150 hours to complete:
- 24 hrs contact time through lectures and face-to-face discussion
- 30 hrs coursework preparation
- 86 hrs assimilation and development of knowledge
- 10 hrs exam preparation
Assessment
The module will be assessed through a written report and an oral assessment (presentation/viva).
For more details, see our full glossary of assessment terms.