SIT796 - Reinforcement Learning


2024 unit information

Enrolment modes: Trimester 1: Waurn Ponds (Geelong), Online
Credit point(s): 1
EFTSL value: 0.125

Prerequisite: SIT720, SIT771 and SIT787.
For students enrolled in S464: Must have completed 16 credit points.
For students enrolled in S470, S506, S507, S508, S535, S536, S538, S577, S677, S735, S737, S739, S770, S778, S779, S789: SIT720 and SIT787.

Corequisite: Nil
Incompatible with: Nil

Study commitment

Students will on average spend 150 hours over the teaching period undertaking the teaching, learning and assessment activities for this unit.

This will include educator guided online learning activities within the unit site.

Scheduled learning activities - campus

1 x 2 hour online lecture per week, 1 x 2 hour practical experience (workshop) per week.

Scheduled learning activities - online

Online independent and collaborative learning including 1 x 2 hour online lecture per week (recordings provided), 1 x 2 hour practical experience (workshop) per week.


Reinforcement Learning (RL) is one of the three fundamental paradigms of Machine Learning. Inspired by psychology and neuroscience, it is concerned with the development of software agents capable of taking actions in an environment to achieve one or more goals. RL differs from supervised learning in that it does not require correct input/output pairs, and incorrect actions do not need to be directly corrected; instead, agents balance exploration and exploitation to search for optimal policies. In this unit, students will explore, research and implement solutions to a range of RL problems, or Markov Decision Processes (MDPs), including variants such as discrete-time MDPs, Semi-MDPs (SMDPs), continuous-time MDPs, Partially Observable MDPs (POMDPs) and Multi-objective MDPs (MOMDPs). In solving these problems, students will apply their knowledge and skills with a range of techniques such as multi-armed bandits, reward design, value iteration, policy gradient, temporal difference learning, on-policy and off-policy learning, eligibility traces, feature construction and continuous action. Additionally, students will research and discuss topics such as curriculum, interactive, inverse, transfer, ensemble, hierarchical, curiosity, multi-goal, multi-agent, multi-objective and deep RL.
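
As an illustration of the exploration/exploitation balance mentioned above, the following is a minimal sketch of an epsilon-greedy agent on a multi-armed bandit. It is not taken from the unit materials: the function name run_epsilon_greedy, the Gaussian reward model, and the parameter values are all assumptions chosen for the example.

```python
import random


def run_epsilon_greedy(true_means, epsilon=0.1, steps=5000, seed=0):
    """Play a Gaussian multi-armed bandit with an epsilon-greedy agent.

    true_means is a hypothetical list of each arm's mean reward; the agent
    never sees it directly and only observes noisy sampled rewards.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean of observed reward per arm

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: choose an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: choose the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Assumed reward model: unit-variance Gaussian noise around the mean.
        reward = rng.gauss(true_means[arm], 1.0)

        # Incremental update of the sample mean for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = run_epsilon_greedy([0.2, 0.5, 1.5])
    print("estimated means:", [round(e, 2) for e in estimates])
    print("pull counts:", counts)
```

With a small epsilon the agent spends most of its pulls on the arm it currently believes is best, while the occasional random pull keeps its estimates of the other arms from going stale.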

Hurdle requirement

To be eligible to obtain a pass in this unit, students must meet certain milestones as part of the portfolio.

Unit Fee Information

Fees and charges vary depending on the type of fee place you hold, your course, your commencement year, the units you choose to study and their study discipline, and your study load.

Tuition fees increase at the beginning of each calendar year and all fees quoted are in Australian dollars ($AUD). Tuition fees do not include textbooks, computer equipment or software, other equipment or costs such as mandatory checks, travel and stationery.

Use the Fee estimator to see course and unit fees applicable to your course and type of place.

For further information regarding tuition fees, other fees and charges, invoice due dates, withdrawal dates and payment methods, visit our Current Students website.