Postgraduate Courses
- DSAA 5002 Data Mining and Knowledge Discovery in Data Science [3-0-0:3]
Background: Knowledge of databases and statistics
Description: With more and more data available, data mining and knowledge discovery has become a major field of research and applications in data science. Aimed at extracting useful and interesting knowledge from large data repositories such as databases, scientific data, social media and the Web, data mining and knowledge discovery integrates techniques from the fields of databases, statistics and AI.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Extract, clean and represent data.
- 2. Explain clustering of data.
- 3. Conduct classification of data.
- 4. Apply frequent pattern discovery.
- 5. Describe outlier detection.
- DSAA 5003 Automatic Machine Learning [3-0-0:3]
Description: A recent trend in the Data Science and Machine Learning communities is to further boost the accessibility of Machine Learning techniques by reducing the tedious effort of learning model selection, hyper-parameter tuning, etc. Automatic Machine Learning (AutoML) aims to reduce this tedious effort and make Machine Learning easier to use. In this course, students will master the basics of AutoML, and understand key techniques including hyper-parameter optimization, feature engineering and meta-learning. This course also introduces common AutoML systems and covers real-world case studies on the applications of AutoML.
- DSAA 5009 Deep Learning in Data Science [3-0-0:3]
Description: In this course, theories, models, and algorithms of deep learning and their application to data science will be introduced. The basics of machine learning will be reviewed first, then some classical deep learning models will be discussed, including AlexNet, LeNet, CNN, RNN, LSTM, and BERT. In addition, some advanced deep learning techniques will also be studied, such as reinforcement learning, transfer learning and graph neural networks. Finally, end-to-end solutions for applying these techniques in data science applications will be discussed, including data preparation, data enhancement, data sampling and optimization of training and inference processes.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Identify basic knowledge of machine learning and deep learning.
- 2. Apply deep learning models.
- 3. Generate data for deep learning tasks.
- 4. Design optimized training and inference processes.
- DSAA 5012 Advanced Database Management for Data Science [3-0-0:3]
Description: In this course, the concepts and implementation schemes in advanced database management systems for data science applications will be introduced, such as disk and memory management, advanced access methods, implementation of relational operators, query processing and optimization, transactions and concurrency control. It also introduces emerging database-related techniques for data science.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Identify basic knowledge of data models and relational algebra.
- 2. Identify data storage and indexing structures.
- 3. Implement query processing techniques.
- 4. Carry out physical database design and advanced transaction management.
- DSAA 5013 Advanced Machine Learning [3-0-0:3]
Description: In this course, advanced algorithms for data science will be introduced. It covers most of the classical advanced topics in algorithm design, as well as some recent algorithmic developments, in particular algorithms for data science and analytics.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Identify basic knowledge of algorithms and their complexity analysis.
- 2. Apply advanced algorithms.
- 3. Carry out optimization by randomized and sampling algorithms.
- 4. Apply parallel and distributed algorithms.
- DSAA 5015 Parallel Programming for Data Science and Analytics [3-0-0:3]
Description: Introduction to parallel computer architectures; principles of parallel algorithm design; shared-memory programming models; message passing programming models used for cluster computing; data-parallel programming models for GPUs; case studies of parallel algorithms, systems, and applications; hands-on experience with writing parallel programs for data science and analytics.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Categorize parallel computer architectures and programming models.
- 2. Select suitable parallel programming models for data science and analytics tasks on specific hardware.
- 3. Design and implement parallel programs for specific data science and analytic tasks.
- 4. Evaluate and analyze parallel program performance for data science and analytic tasks.
- DSAA 5020 Foundation of Data Science and Analytics [3-0-0:3]
Description: This course will introduce fundamental techniques for data science and analytics. Specifically, it will teach students how to clean, integrate and store data. On top of these, it will also teach students the knowledge needed to conduct data analysis, such as Bayes rule and its connection to inference, linear approximation and its polynomial and high-dimensional extensions, principal component analysis and dimension reduction. In addition, it will also cover advanced data analytics topics including data governance, data explanation, data privacy and data fairness.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Clean, integrate and mine data.
- 2. Manage and process data.
- 3. Acquire the knowledge of data fairness, data privacy and data governance.
- DSAA 5021 Data Science Computing [3-0-0:3]
Description: This course will teach students data science computing techniques. Topics cover: (1) basic concepts of Data Science Computing and Cloud; (2) MapReduce, the de facto datacenter-scale programming abstraction, and its open-source implementation, Hadoop; and (3) Apache Spark, a new-generation parallel processing framework, and its infrastructure, programming model, cluster deployment, tuning and debugging, as well as a number of specialized data processing systems built on top of Spark.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Build a Hadoop ecosystem.
- 2. Manage data over Spark systems.
- 3. Apply the knowledge of Spark and Hadoop ecosystems for maintaining and monitoring Big Data Computing Systems.
- DSAA 5022 Data Analysis and Privacy Protection in Blockchain [3-0-0:3]
Description: This course introduces basic concepts and technologies of blockchain, such as hash functions and digital signatures, as well as data analysis and privacy protection over blockchain applications. Students will learn the consensus protocols and algorithms, the incentives and politics of the blockchain community, the mechanics of Bitcoin and Bitcoin mining, data analysis techniques over blockchain and user/transaction privacy protection.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Build a blockchain system.
- 2. Manage data over blockchain.
- 3. Acquire the knowledge of blockchain and data analytics techniques.
- 4. Develop privacy protection techniques over blockchain.
- DSAA 5024 Data Exploration and Visualization [3-0-0:3]
Background: Basic knowledge of computer programming
Description: This course covers essential techniques for data exploration and visualization. Students will learn the iterative process of data preprocessing techniques for getting data into a usable format, exploratory data analysis (EDA) techniques for formulating suitable hypotheses and validating them, and specific techniques for domain-related data exploration and visualization such as high-dimensional, hierarchical, and geospatial data. The course uses programming languages such as Python and tools like Tableau.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Understand basic concepts of exploratory data analysis and its relation to descriptive and inferential statistics, data mining, and data visualization
- 2. Describe the iterative process of data wrangling, analysis, and visualization
- 3. Understand principles for effective visualization design and select suitable visualization techniques
- 4. Implement visualizations for diverse types of datasets using Python and Tableau
- 5. Demonstrate how data exploration and visualization can be applied to real-world problems
- DSAA 5027 Spatio-Temporal Data Analysis [3-0-0:3]
Description: In this course, we will introduce spatial and multimedia database management concepts, theories and technologies, from data representation, indexing and fundamental operations to advanced query processing. Challenges and solutions for high-dimensional data will also be introduced.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Master basic knowledge of Relational Data Indexing and Query Processing.
- 2. Master knowledge of Spatial Databases.
- 3. Apply Spatial Data Organisation and Indexing; Spatial Query Processing.
- 4. Manage Spatiotemporal Data, Multimedia Databases and High Dimensional Data.
- DSAA 5037 Introduction to Graph Learning [3-0-0:3]
Previous Course Code(s): DSAA 6000B
Description: Graphs, as a very expressive model, have been widely used to model real-world entities and their relationships in application-specific networks. In this course, students will gain a thorough introduction to the basics of graph theory, as well as cutting-edge research in deep learning for graphs. The topics include graph embeddings, graph neural networks, graph clustering models, graph generative models, adversarial attacks on graphs, graph reasoning, etc.
- DSAA 6000 Special Topics [3-0-0:3]
Description: The special topics course is designed for faculty to offer a course about popular research topics. The research topics in data science and analytics change and evolve very fast. The special topics will help students to know the research trends in this area and master the state-of-the-art solutions. Students will not only learn from the lectures, but also investigate the techniques by reading papers, giving presentations and working on projects.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Gain knowledge of current research trends in the Data Science and Analytics area.
- 2. Master the state-of-the-art techniques of the chosen topic.
- 3. Apply the learned techniques to real problems on the chosen topic.
- DSAA 6010 Industry Round Table [1 credit]
Description: This course offers students opportunities to learn about the possible industry topics and supervisors that they will work with during their internship. The goals are to: (a) understand (i) problems in industry, (ii) existing datasets, (iii) AI models, (iv) service scenarios and KPIs, (v) challenges; (b) propose industrial projects based on the understanding; and (c) match their industrial projects. This course is only available for MSc(DCAI) students. Graded P or F.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Understand problems in industry
- 2. Propose industrial projects based on the understanding
- 3. Match their industrial projects
- DSAA 6018 Independent Study [1-3 credit(s)]
Description: In this course, an independent research project will be carried out under the supervision of a faculty member.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Identify knowledge related to the proposed topic.
- DSAA 6100 Practical Lab Course [1-0-6:3]
Description: This course will teach students practical programming and parallel processing skills for implementing various deep learning or machine learning models, from data preparation and feature selection to model selection, hyperparameter tuning, and final result analysis and explanation. This course is only available for MSc(DCAI) students.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Understand deep learning frameworks
- 2. Gain knowledge about data preparation and feature engineering
- 3. Grasp skills in model selection and hyperparameter tuning
- 4. Apply parallel processing techniques to speed up model training and inference
- 5. Apply knowledge to analyze and explain results
- DSAA 6101 Data Science and Analytics Program Seminar I [0 credit]
Description: In this course, students are required to attend at least 6 seminars offered by the program. The program will offer at least 10 seminars related to state-of-the-art research on data science and analytics in each term. These seminars will help students to broaden the horizons of their knowledge on data science and analytics. Graded P or F.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Identify knowledge of state-of-the-art research topics and techniques on data science and analytics.
- DSAA 6102 Data Science and Analytics Program Seminar II [1-0-0:1]
Description: In this course, students are required to attend at least 6 seminars offered by the program. The program will offer at least 10 seminars related to the state-of-the-art research on data science and analytics in each term. These seminars will help students to broaden the horizons of their knowledge on data science and analytics. Graded P or F.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Identify knowledge of state-of-the-art research topics and techniques on data science and analytics.
- DSAA 6800 Independent Project [6 credits]
Description: In this course, the student will work on a practical project. The independent project is related to real problems existing in data science and AI application domains. The student needs to conduct a literature survey, method comparison, solution selection and implementation, and an experimental study, and write the final report. The course will train students' skills in proposing an end-to-end solution for a real application problem. This course is only available for MSc(DCAI) students. Graded P or F.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Narrow down the problem scope and define the problem
- 2. Find possible solutions from existing works
- 3. Implement and improve the selected solutions
- 4. Conduct result analysis and propose possible improvements
- DSAA 6920 Industry Internship Ⅰ [2 credits]
Description: In this course, students will be trained in the industry. They will work under the guidance of their supervisors (industry and academia) to practice what they have learned in the program, and apply the data science and AI knowledge and techniques to various real-life problems. In Part I, students are required to complete an Open Topic report with oral examination on the scientific value, feasibility and technical challenges of the proposed topic. This course is only available for MSc(DCAI) students. Graded PP, P or F.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Conduct a comprehensive literature survey
- 2. Conduct a comparison study
- 3. Define the research problem formally
- 4. Discover new research methodologies
- 5. Prepare data for deep learning tasks
- DSAA 6921 Industry Internship Ⅱ [3 credits]
Prerequisite(s): DSAA 6920
Description: In this course, students will be trained in the industry. They will work under the guidance of their supervisors (industry and academia) to practice what they have learned in the program, and apply the data science and AI knowledge and techniques to various real-life problems. In Part II, students are required to complete an Intermediate report with oral examination on the student’s progress and industry collaboration progress. This course is only available for MSc(DCAI) students. Graded PP, P or F.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Program with deep learning frameworks
- 2. Debug deep learning models
- 3. Set initial values and tune hyperparameters
- 4. Optimize training and inference processes
- 5. Analyze and explain the results and models
- DSAA 6922 Industry Internship Ⅲ [3 credits]
Prerequisite(s): DSAA 6920 and DSAA 6921
Description: In this course, students will be trained in the industry. They will work under the guidance of their supervisors (industry and academia) to practice what they have learned in the program, and apply the data science and AI knowledge and techniques to various real-life problems. In Part III, students are required to complete a Final report with oral examination on the final output from the internship project and on whether the student indeed knows how to apply AI techniques to concrete data science applications. This course is only available for MSc(DCAI) students. Graded PP, P or F.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Analyze and debug results
- 2. Compare models and tune hyperparameters
- 3. Conduct data augmentation and filtering
- 4. Conduct data parallelism, model parallelism and hybrid parallelism
- 5. Write technical reports
- DSAA 6990 MPhil Thesis Research
Description: Master's thesis research supervised by co-advisors from different disciplines. A successful defense of the thesis leads to the grade Pass. No course credit is assigned.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Design, develop and conduct cross-disciplinary research in Data Science and Analytics.
- 2. Communicate research findings effectively in written and oral presentations.
- DSAA 7990 Doctoral Thesis Research
Description: Original and independent doctoral thesis research supervised by co-advisors from different disciplines. A successful defense of the thesis leads to the grade Pass. No course credit is assigned.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Design, develop and conduct cross-disciplinary research in Data Science and Analytics.
- 2. Communicate research findings effectively in written and oral presentations.