SIG731 - Data Wrangling

Year:

2024 unit information

Enrolment modes:

Trimester 3: Great Learning

Credit point(s):

1

EFTSL value:

0.125

Prerequisite:

Nil

Corequisite:

Must be enrolled in S773 Master of Data Science (Global)

Incompatible with:

SIT731

Study commitment

Students will on average spend 150 hours over the trimester undertaking the teaching, learning and assessment activities for this unit.

This will include educator-guided online learning activities within the unit site.

Scheduled learning activities - online

Online independent and collaborative learning including optional scheduled activities as detailed via the Great Learning platform.

Note:

This unit is part of the Master of Data Science (Global) program and is restricted to online international students who reside outside Australia.

Content

Data Science (DS) and Artificial Intelligence (AI) are popular fields for making sense of data that have been collected in large quantities from various sources. Accurate exploration and modelling using DS and AI rely heavily on appropriately prepared data. Data wrangling is the process of preparing raw data appropriately for modelling purposes. The aim of this unit is to learn various data wrangling methodologies and the programming techniques needed to perform them. This includes programming in Python to perform various data wrangling tasks; extracting data from different sources; working with, storing and retrieving different types of data; applying sampling techniques and inspecting the data; cleaning data by identifying outliers/anomalies and handling missing data; transforming, selecting and extracting features; performing exploratory analysis; visualising data using various tools; summarising data appropriately; and performing basic statistical analysis and modelling using basic machine learning. Techniques for maintaining data privacy and exercising ethics in data manipulation will also be covered in this unit.

Hurdle requirement

To be eligible to obtain a pass in this unit, students must meet certain milestones as part of the portfolio, and must achieve a mark of at least 50% in the online quiz.