Postgraduate Courses
- MSBD 5001 Foundations of Data Analytics [3-0-0:3]
Description: This course will provide fundamental techniques for data analytics, including data collection, data extraction, data integration and data cleansing. Students will learn how to manage and optimize the analytics value chain, including collecting and extracting suitable values, selecting the right data processing methods, integrating data from various sources, and handling data governance, security and privacy for Big Data applications.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Identify useful data to extract and collect.
- 2. Use data integration skills to integrate heterogeneous data.
- 3. Apply data cleaning techniques to clean dirty data.
- 4. Use various data analytic techniques to support decision making.
- MSBD 5002 Data Mining and Knowledge Discovery [3-0-0:3]
Co-list with: CSIT 5210
Exclusion(s): COMP 5331, CSIT 5210, MFIT 5004
Description: Data mining has recently emerged as a major field of research and applications. Aimed at extracting useful and interesting knowledge from large data repositories such as databases and the Web, data mining integrates techniques from the fields of databases, statistics and AI.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Apply clustering techniques to find clusters within the data.
- 2. Use classification techniques to conduct classification and prediction.
- 3. Use the knowledge of frequent pattern mining to discover patterns from the data.
- 4. Conduct mining over social media and text data and detect outliers from the data.
- MSBD 5003 Big Data Computing [3-0-0:3]
Description: Big data systems, including Cloud Computing and parallel data processing frameworks, have emerged as enabling technologies for managing and mining massive amounts of data across hundreds or even thousands of commodity servers in datacenters. This course exposes students to both the theory of and hands-on experience with this new technology. The course will cover the following topics: (1) basic concepts of Cloud Computing and production Cloud services; (2) MapReduce, the de facto datacenter-scale programming abstraction, and its open-source implementation, Hadoop; (3) Apache Spark, a new-generation parallel processing framework, and its infrastructure, programming model, cluster deployment, tuning and debugging, as well as a number of specialized data processing systems built on top of Spark.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Build a Hadoop Ecosystem.
- 2. Manage data on Spark systems.
- 3. Apply the knowledge of Spark and Hadoop Ecosystems for maintaining and monitoring Big Data Computing Systems.
- MSBD 5004 Mathematical Methods for Data Analysis [3-0-0:3]
Description: This course will introduce mathematical formulations and computational methods (convex/non-convex optimization) to exploit structures contained in the data. Moreover, specific computational methods (randomized computational methods) will be explored for big data analysis.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Represent Big Data with structured data models.
- 2. Apply convex and non-convex approaches for various Big Data optimization tasks.
- 3. Apply randomized optimization techniques to get approximate solutions for optimization problems on Big Data.
- MSBD 5005 Data Visualization [3-0-0:3]
Description: This course will introduce visualization techniques for data from everyday life, social media, business, scientific computing, medical imaging, etc. The topics include the human visual system and perception, visual design principles, open-source visualization tools and systems, and visualization techniques for CT/MRI data, computational fluid dynamics, graphs and networks, time-series data, text and documents, Twitter data, and spatio-temporal data.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Apply knowledge of the human visual system and perception to the design of a visual system.
- 2. Use open-source visualization tools and systems to complete various design and visualization tasks.
- 3. Visualize different types of data, including graphs, networks, texts, documents, social media and spatio-temporal data, with the visualization techniques.
- MSBD 5006 Quantitative Analysis of Financial Time Series [3-0-0:3]
Co-list with: MAFS 5130
Exclusion(s): MAFS 5130, MSDM 5053
Description: Analysis of asset returns: autocorrelation, predictability and prediction. Volatility models: GARCH-type models, long range dependence. High frequency data analysis: transactions data, duration. Markov switching and threshold models. Multivariate time series: cointegration models and vector GARCH model.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Recognize market indexes, financial time series and their features.
- 2. Recognize the foundation of time series and basic time series models.
- 3. Formulate time series models to study financial data, including market returns and volatility.
- 4. Evaluate the relationships of different markets via the cointegration time series and ECM models.
- 5. Analyze real financial data with the statistical techniques from this course via a course project.
- MSBD 5007 Optimization and Matrix Computation [3-0-0:3]
Description: The course will introduce basic techniques of optimization, including unconstrained optimization and constrained optimization, and of matrix computation, including matrix analysis, linear systems, orthogonalization and least squares, and eigenvalue problems.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Apply unconstrained optimization and constrained optimization techniques to data optimization problems.
- 2. Carry out various data analysis tasks over matrices by using matrix computation techniques.
- 3. Use optimization and matrix computation techniques together to solve real-life optimization problems, such as community detection or leadership discovery.
- MSBD 5008 Introduction to Social Computing [3-0-0:3]
Description: This course is an introduction to social information network analysis and engineering. Students will learn both mathematical and programming knowledge for analyzing the structures and dynamics of typical social information networks (e.g. Facebook, Twitter, and MSN). They will also learn how social metrics can be used to improve computer system design, as people are the networks. It will cover topics such as the small world phenomenon; contagion, tipping and influence in networks; models of network formation and evolution; the web graph and PageRank; social graphs and community detection; measuring centrality; greedy routing and navigation in networks; introduction to game theory and strategic behavior; social engineering; and principles of computer system design.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Use the knowledge of social networks to discover influential nodes and communities.
- 2. Identify human behavior on social networks and recommend events or users based on user profiles and connections.
- 3. Recognize the influence of nodes and edges and design viral marketing based on social influence analysis.
- MSBD 5009 Parallel Programming [3-0-0:3]
Exclusion(s): COMP 5112
Description: Introduction to parallel computer architectures; principles of parallel algorithm design; shared-memory programming models; message passing programming models used for cluster computing; data-parallel programming models for GPUs; case studies of parallel algorithms, systems, and applications; hands-on experience with writing parallel programs for tasks of interest.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Apply principles of parallel algorithm design to verify the correctness of parallel programs.
- 2. Design and deploy parallel programs on shared-memory programming models.
- 3. Design and deploy parallel programs on shared-nothing programming models via message passing.
- 4. Improve execution efficiency via parallel programming models for GPUs.
- MSBD 5010 Image Processing and Analysis [3-0-0:3]
Description: This course will introduce the basic techniques for image data processing and analysis. Topics include image processing and analysis in spatial and frequency domains, image restoration and compression, image segmentation and registration, morphological image processing, representation and description, feature description, face recognition, iris recognition, fingerprint recognition, and image analysis topics such as medical image analysis.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Enhance image quality or effect by applying image representation, restoration and filtering.
- 2. Apply segmentation, registration and compression techniques to reduce the volume of images.
- 3. Apply various image analysis techniques to different recognition tasks and pattern discovery on media images.
- MSBD 5011 Advanced Statistics: Theory and Applications [3-0-0:3]
Description: This course introduces basic statistical principles, methodology and computational tools needed in performing data analysis. The topics of the course include parametric models, sufficiency principles, estimation methods, linear models, quantile estimation, nonparametric curve estimation, resampling methods, statistical computing and hypothesis testing.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Use parametric models to analyze data.
- 2. Apply various estimation methods to approximate parameters.
- 3. Use nonparametric curve models to investigate data.
- 4. Use the knowledge about statistical computing for various reasoning tasks.
- MSBD 5012 Machine Learning [3-0-0:3]
Exclusion(s): COMP 5212, CSIT 5910
Description: The course introduces fundamentals of machine learning, including concept learning, evaluating hypotheses, supervised learning, unsupervised learning, reinforcement learning, Bayesian learning and ensemble methods.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Apply concept learning and decision tree learning to various decision making tasks.
- 2. Use Bayesian learning and Artificial Neural Networks for classification and recommendation.
- 3. Use computational learning theory to improve learning efficiency.
- 4. Identify the requirements of learning tasks and apply suitable learning techniques.
- MSBD 5013 Statistical Prediction [3-0-0:3]
Description: This course will introduce statistical prediction models and algorithms, including regression models, classification, additive models, graphical models and networks, model assessment and selection, model inference and model averaging.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Conduct prediction based on statistical models.
- 2. Investigate different prediction tasks and apply different models.
- 3. Revise the current statistical models to meet the requirements of real-life prediction problems.
- MSBD 5014 Independent Project [3 credits]
Description: An independent project carried out under the supervision of a faculty member. This course may be repeated for credit.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Investigate existing problems on Big Data and conduct original Big Data research.
- 2. Apply acquired knowledge of Big Data on privacy protection and policy making.
- 3. Design solutions to improve the existing computing and analytic techniques on Big Data.
- MSBD 5015 Artificial Intelligence [3-0-0:3]
Previous Course Code(s): MSBD 6000A
Exclusion(s): CSIT 5900
Description: This course will cover advanced topics in AI, including machine learning, agent design, multiagent systems, game search, natural language processing, and knowledge representation and reasoning systems.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Examine and formulate problems as AI heuristic search problems.
- 2. Model and design autonomous agents.
- 3. Examine and formulate multiagent problems as games.
- 4. Examine and formulate problems as machine learning problems.
- 5. Integrate reasoning and machine learning in natural language understanding.
- MSBD 5016 Deep Learning Meets Computer Vision: Practice and Applications [3-0-0:3]
Previous Course Code(s): MSBD 6000G
Description: Computer vision and relevant algorithms have started to make their way into various applications and become more relevant in our daily life: face detection, drones, and self-driving cars, to name a few. Recent developments in deep neural network architectures, large datasets and training techniques have greatly advanced the performance of these state-of-the-art visual recognition systems. This course will investigate common deep learning architectures with a focus on learning effective models for various computer vision tasks. Part of the course evaluation involves student-proposed course projects focusing on building a working computer vision system using deep learning algorithms and techniques.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Put together a simple image classification pipeline, based on the k-NN or the SVM/Softmax classifier.
- 2. Write backpropagation code, and train Neural Networks and Convolutional Neural Networks.
- 3. Implement recurrent networks, and apply them to image captioning on Microsoft COCO.
- 4. Explore methods for visualizing the features of a pretrained model on ImageNet.
- 5. Train a generative adversarial network to generate images that look like a training dataset.
- 6. Design and implement a deep neural network for computer vision applications.
- MSBD 5017 Introduction to Blockchain Technology [3-0-0:3]
Previous Course Code(s): MSBD 6000D
Description: This course introduces basic concepts and technologies of blockchain, such as the hash function and digital signature, as well as blockchain applications, especially in Fintech. Students will learn the consensus protocols and algorithms, the incentives and politics of the blockchain community, and the mechanics of Bitcoin and Bitcoin mining. The course also covers the limitations and possible improvements of the blockchain system.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Identify the fundamental technologies behind blockchain and cryptocurrencies.
- 2. Analyze the algorithms, optimization, and trade-offs of the distributed system.
- 3. Develop decentralized applications over blockchain.
- 4. Write smart contracts on the Ethereum platform.
- 5. Identify the token economy and mining ecosystem.
- 6. Design advanced consensus algorithms, taking into consideration scalability, security, and privacy.
- 7. Write a blockchain technical white paper supporting a solid business plan.
- MSBD 5018 Natural Language Processing [3-0-0:3]
Previous Course Code(s): MSBD 6000H
Exclusion(s): COMP 5221
Description: This course is an introduction to Natural Language Processing. It gives a brief overview of the subject, including the task background, its computational problem setting and general methodological approaches. It also covers several state-of-the-art techniques that support industrial-level services, including topic modelling, deep learning models, and their applications in search engines, chatbots and QA systems.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Identify the basic principles behind machine learning algorithms for Natural Language Processing (NLP) data.
- 2. Implement programs for NLP tasks.
- 3. Formulate machine learning solutions to domain problems.
- 4. Describe the complexity of real-world problems.
- MSBD 5019 Spatial and Multimedia Databases [3-0-0:3]
Previous Course Code(s): MSBD 6000J
Exclusion(s): CSIT 6000P
Description: Data types in many new application domains, such as spatial, multimedia and a wide range of scientific applications, are often represented as multi-dimensional data. Data management and query processing techniques for multi-dimensional data are significantly different from those for relational data. This course covers a selection of database topics to help students think critically about issues related to techniques for multidimensional data management and analytics. It introduces spatial and multimedia database management concepts, theories and technologies, from data representation, indexing and fundamental operations to advanced query processing, as well as challenges and solutions for high dimensional data.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Identify data management and analytics challenges for large-scale complex data, especially for spatial and multimedia data.
- 2. Represent complex data types as multi-dimensional vectors to support similarity-based search.
- 3. Identify fundamental operations and advanced query processing techniques for spatial, spatiotemporal and multimedia data.
- 4. Use the knowledge of multidimensional databases to process queries with spatial, spatiotemporal and multimedia data.
- 5. Identify challenges and solutions for high dimensional data in the context of big data and data science.
- MSBD 6000 Special Topics [3-0-0:3]
Description: Selected topics in big data reflecting recent developments in techniques and tools which are not covered by existing courses. May be repeated for credit if different topics are covered.
Intended Learning Outcomes
On successful completion of the course, students will be able to:
- 1. Recognize the major issues and technical problems in a specific topic in the big data field.
- 2. Identify key technical solutions to the problems in the topic.
- 3. Solve specific problems in this topic.