RGBD sensors project an infrared pattern and calculate depth from the reflected light using an infrared sensitive camera. In this research, the depth sensing capabilities of two sensors are compared under various conditions. The experiments took place on a hired squash court (Figure 1).
Figure 1: Experiments were conducted on a hired squash court. The depth sensors were located in different vertical positions perpendicular to the planar surface at the front of the squash court.
Depth resolution decreases as the distance between the sensor and a planar surface increases, and the Xtion's depth resolution is better than the Kinect's. Depth accuracy was also evaluated by measuring the number of levels taken by each of the 640x480 pixels over a period of 1000 frames. The normalised histogram of the pixel bit values was calculated. Roughly 18% of the pixels maintain a single value, around 46% alternate between two different levels, about 23% take three different levels, and the rest take four or more levels. In other words, the values of around half of the pixels fluctuate between two values. The entropy of each pixel was calculated at distances ranging from 500mm to 2500mm, along with the average entropy of pixels for both the Kinect and Xtion depth sensors, as shown in Figure 2. The entropy decreases as the distance between the sensor and the planar surface increases, indicating lower depth accuracy.
Figure 2: This figure shows the entropy for each pixel of both the Kinect and Xtion depth sensors, where the sensors are placed at different distances ranging from 500mm to 2500mm from the planar surface at the front of the squash court and 1000 frames were captured. Each column of the figure represents the results at a certain distance, the first column being the results at a distance of 500mm. The top row shows the Kinect sensor results, while the bottom row shows the Xtion sensor.
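The per-pixel measurements described above can be sketched as follows. This is a minimal illustration, not the code used in the experiments: it assumes the captured depth frames are available as an integer NumPy array of shape (frames, height, width), and the function names are hypothetical.

```python
import numpy as np

def per_pixel_entropy(frames):
    """Shannon entropy (in bits) of each pixel's depth values over time.

    frames: integer array of shape (n_frames, height, width).
    """
    frames = np.asarray(frames)
    n_frames, height, width = frames.shape
    flat = frames.reshape(n_frames, -1)
    entropy = np.zeros(flat.shape[1])
    for i in range(flat.shape[1]):
        # Relative frequency of every distinct depth level seen at this pixel.
        _, counts = np.unique(flat[:, i], return_counts=True)
        p = counts / n_frames
        entropy[i] = -np.sum(p * np.log2(p))
    return entropy.reshape(height, width)

def levels_per_pixel(frames):
    """Number of distinct depth levels each pixel takes over time."""
    frames = np.asarray(frames)
    flat = frames.reshape(frames.shape[0], -1)
    levels = np.array([len(np.unique(flat[:, i])) for i in range(flat.shape[1])])
    return levels.reshape(frames.shape[1:])

if __name__ == "__main__":
    # Small synthetic stand-in for a 1000-frame capture of a planar surface.
    rng = np.random.default_rng(0)
    demo = 1500 + rng.integers(0, 2, size=(1000, 4, 6))
    print(per_pixel_entropy(demo).mean())
    print(np.bincount(levels_per_pixel(demo).ravel()))
```

The explicit loop over pixels keeps the sketch readable. A full 640x480 capture would benefit from a vectorised implementation.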
Many assembly operations involve repetitive motions, uncomfortable postures, and other ergonomic hazards. Ergonomic assessments help to increase the productivity and performance of organisations by reducing the rate of workplace injuries and helping to prevent them. The aim of this research is to enable the use of RGBD cameras for real-time ergonomic assessment in assembly operations. The accuracy and high sampling rates of RGBD sensors create an opportunity to monitor operators and alert them if their current posture is risky and may cause joint or muscle injuries. RGBD cameras are suitable for shop floor environments due to their portability, relatively low cost compared to other motion tracking sensors, and rapid automatic calibration.
Methods used for RULA scoring. a) Scoring using joint angles displayed on 2D tracked skeleton. b) Scoring using joint angles displayed on a virtual mannequin. c) Scoring using voxels displayed on a virtual mannequin.
RULA, or Rapid Upper Limb Assessment, is a survey method used for the evaluation of ergonomics in workplaces. It helps to decrease the probability of upper limb injuries. The posture of a worker is evaluated in the workplace by assessing muscle effort and external loads applied to the worker's body. The score of each joint of the skeleton is calculated from the joint's orientation, with reference to the RULA assessment scores. Our system provides visual feedback where the user's limbs are highlighted with different colours to indicate the RULA score for each limb.
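As an illustration of scoring a joint from its orientation, the sketch below computes the angle between the trunk and the upper arm from tracked 3D joint positions and maps it to the base RULA upper-arm score. It is a simplified example, not the system's implementation: the function names and the sample joint coordinates are hypothetical, the angle is assumed to be flexion, and the RULA adjustments for a raised shoulder, abduction or arm support are omitted.

```python
import numpy as np

def angle_between(u, v):
    """Angle in degrees between two 3D vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def upper_arm_score(flexion_deg):
    """Base RULA upper-arm score from a signed flexion angle in degrees.

    Negative angles denote extension (arm behind the trunk).
    """
    if -20 <= flexion_deg <= 20:
        return 1
    if flexion_deg < -20 or flexion_deg <= 45:
        return 2
    if flexion_deg <= 90:
        return 3
    return 4

if __name__ == "__main__":
    # Hypothetical tracked joints (metres); y points up, z points forward.
    shoulder = np.array([0.0, 1.4, 0.0])
    elbow = np.array([0.0, 1.1, 0.15])
    hip = np.array([0.0, 0.9, 0.0])
    flexion = angle_between(hip - shoulder, elbow - shoulder)
    print(flexion, upper_arm_score(flexion))
```

Each score could then be mapped to a colour for the limb highlighting described above.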
The event related potential (ERP) technique is a derivative of Electroencephalography (EEG), which measures the brain activity during cognitive processing of a sensory, motor or cognitive task. ERPs are obtained by time-locked averaging of EEG recordings for a specific sensory, motor or cognitive task.
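Time-locked averaging can be sketched as below. This is a minimal example under stated assumptions: the EEG is held in a NumPy array of shape (channels, samples), event onsets are given as sample indices, and baseline correction using the pre-stimulus interval is an added convention rather than something stated in the text. The function name and parameters are hypothetical.

```python
import numpy as np

def erp_average(eeg, event_samples, pre, post):
    """Average EEG epochs time-locked to event onsets.

    eeg: array of shape (n_channels, n_samples).
    event_samples: sample indices at which the events occur.
    pre, post: number of samples kept before and after each event.
    Returns an array of shape (n_channels, pre + post).
    """
    eeg = np.asarray(eeg, dtype=float)
    epochs = []
    for s in event_samples:
        if s - pre < 0 or s + post > eeg.shape[1]:
            continue  # skip events too close to the recording edges
        epoch = eeg[:, s - pre:s + post]
        # Assumed convention: subtract the mean of the pre-stimulus interval.
        baseline = epoch[:, :pre].mean(axis=1, keepdims=True)
        epochs.append(epoch - baseline)
    if not epochs:
        raise ValueError("no complete epochs available")
    return np.mean(epochs, axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    recording = rng.normal(0.0, 1.0, size=(2, 5000))
    events = np.arange(200, 4800, 100)
    # Add the same small response 30 samples after every event.
    recording[:, events + 30] += 2.0
    erp = erp_average(recording, events, pre=50, post=100)
    print(erp.shape, erp[:, 80])
```

Averaging over many trials suppresses activity that is not time-locked to the event, which is why the response stands out in the average even when it is hidden in single trials.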
Information flow analysis, or dynamic effective connectivity analysis, applied to ERP data is a vital technique for understanding higher cognitive processing and the functional differences under different events. Among the tools used in effective connectivity analysis, Granger Causality (GC) has found a prominent place. GC analysis based on strictly causal multivariate autoregressive (MVAR) models does not account for instantaneous interactions among the sources. If instantaneous interactions are present, GC based on strictly causal MVAR models will lead to erroneous conclusions about the underlying information flow.
Current research focuses on using extended MVAR (eMVAR) models for GC-based information flow analysis of ERP data. The eMVAR model accounts for instantaneous interactions by adding a zero-lag component to the conventional MVAR model. The development of adaptive estimation techniques, such as constrained Kalman filtering, for eMVAR model identification and the use of these techniques in the source space are currently under investigation, with the goal of improving the extraction of the underlying information flow from ERP data. As well as improving the theoretical algorithms used in ERP-based information flow analysis, these techniques are also used in a variety of applied research, including cognitive load assessment, development of brain-machine interfaces and rehabilitation-related studies.
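The difference between the two model classes can be written out as follows. The notation is an assumption made for illustration (M sources, model order p, coefficient matrices A(k) and B(k)) and follows the form commonly used in the literature rather than anything specified on this page.

```latex
% Strictly causal MVAR model: only lagged terms (k >= 1)
\mathbf{x}(n) = \sum_{k=1}^{p} \mathbf{A}(k)\,\mathbf{x}(n-k) + \mathbf{u}(n)

% Extended MVAR (eMVAR) model: the zero-lag term B(0) captures
% instantaneous interactions; B(0) has a zero diagonal
\mathbf{x}(n) = \sum_{k=0}^{p} \mathbf{B}(k)\,\mathbf{x}(n-k) + \mathbf{w}(n)

% Relation between the two representations
\mathbf{A}(k) = \left[\mathbf{I} - \mathbf{B}(0)\right]^{-1}\mathbf{B}(k),
\qquad
\mathbf{u}(n) = \left[\mathbf{I} - \mathbf{B}(0)\right]^{-1}\mathbf{w}(n)
```

When B(0) is non-zero, the residuals u(n) of the strictly causal model are correlated across channels, which is how instantaneous interactions can distort a GC analysis based on A(k) alone.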
Biometric identification and authentication using physiological modalities such as a person's face, fingerprints or gait have been widely investigated because of their significance in many security applications. Recent research reveals that individuals can be identified from Electrocardiogram (ECG) signals, as the hearts of different individuals differ physiologically and geometrically, resulting in unique ECG signals. Human identification from ECG signals is robust to falsification compared to other biometric methods, which has created a great deal of interest in the signal processing community.
Figure 1: The framework of the proposed method for human identification from ECG signals
Most previous methods extract feature representations based on individual heartbeat waveforms or fiducial points, which require segmentation of individual heartbeats or detection of fiducial points. However, accurately segmenting individual heartbeats or detecting fiducial points is an arduous procedure, especially for those ECG signals that contain noise. In this work, a novel framework is proposed to extract compact and discriminative features from ECG signals based on sparse representation of local segments. The proposed method is able to capture both local and global structural information and does not need to segment individual heartbeats or detect fiducial points.
Figure 2: An example of ECG signals and the corresponding sparse features
With the advancement of modern imaging technology and reduction in the cost of electronic hardware, surveillance systems are now commonly installed in public places. Automatic analysis of human motion across a large number of video sequences is a challenging task. When human figures are recorded in a crowded scene, the resulting video sequences often contain occlusions. Moreover, there may exist substantial variations within the same class of motions performed by different subjects. Even the same activities may be performed by the same subjects at different speeds, giving rise to temporal variations of the same activities. Due to these challenges, human motion analysis algorithms that can model complex scenarios, and are simultaneously robust to viewpoint variation, noise and occlusion are highly desirable.
The bag-of-words representation of video sequences
This research proposed a general framework for the analysis of human motion in videos based on the bag-of-words representation and the probabilistic Latent Semantic Analysis (pLSA) model. The framework consists of detecting human subjects in videos, extracting pyramid Histogram of oriented Gradients (HoG) descriptors, constructing a visual codebook by k-means clustering, and supervised learning of the pLSA model for recognition.
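The codebook and bag-of-words steps can be sketched as below. This is a simplified illustration that assumes descriptors have already been extracted as rows of a NumPy array. It uses a plain Lloyd's k-means with hypothetical function names, and it does not cover the detection, HoG extraction or pLSA stages.

```python
import numpy as np

def assign(descriptors, centres):
    """Index of the nearest codeword for every descriptor."""
    diff = descriptors[:, None, :] - centres[None, :, :]
    return (diff ** 2).sum(axis=2).argmin(axis=1)

def build_codebook(descriptors, k, n_iter=20, seed=0):
    """Plain k-means (Lloyd's algorithm) returning k codewords."""
    descriptors = np.asarray(descriptors, dtype=float)
    rng = np.random.default_rng(seed)
    start = rng.choice(len(descriptors), size=k, replace=False)
    centres = descriptors[start].copy()
    for _ in range(n_iter):
        labels = assign(descriptors, centres)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):  # keep the old centre if a cluster is empty
                centres[j] = members.mean(axis=0)
    return centres

def bow_histogram(descriptors, centres):
    """Normalised histogram of codeword occurrences for one video."""
    descriptors = np.asarray(descriptors, dtype=float)
    centres = np.asarray(centres, dtype=float)
    labels = assign(descriptors, centres)
    hist = np.bincount(labels, minlength=len(centres)).astype(float)
    return hist / hist.sum()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    training = rng.normal(size=(500, 16))   # stand-in for HoG descriptors
    codebook = build_codebook(training, k=8)
    video = rng.normal(size=(40, 16))
    print(bow_histogram(video, codebook))
```

Each video is thereby reduced to a fixed-length histogram, which is the input a topic model such as pLSA expects.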
This work has investigated different strategies for the application of optimal fault-tolerant force within human-robot cooperation for the slow pushing or lifting of an object. Six different strategies were presented to optimally maintain a cooperative force despite manipulator failure through a locked joint event. These strategies determined the post failure cooperation of the faulty manipulator and the human.
Human-robot force cooperation in load lifting application
The aim of this research is to extend
This research analysed existing methods of harvesting energy from human body motion and introduced a novel system for harvesting energy from abdominal motion. This method allows the harvesting to occur during completely passive motion with a special focus on the user-friendliness of the harvester.
The pipe inspection system is an efficient camera tractor for the inspection of pipes 200mm in diameter or larger. Its powerful drive enables proficient pipeline inspection over long distances. The electrically adjustable lift device enables optimal positioning of the camera in the pipe. In combination with a lowering device, the horizontally and vertically displaceable folding plug for the camera cable and the camera receptacle ensure simple, safe introduction of the camera tractor to the sewer line. After inspection, the fast reverse speed makes it possible to quickly complete the procedure. All components and sub-assemblies are designed for maximum ruggedness and reliability. The unit has integrated high-power LEDs and is pressure tested up to 100 bar. The well inspection system has pressure-watertight pan and tilt degrees of freedom (DOFs). Pan rotates continuously and tilt has a rotation range of ±90°.
The cable reel is a synchronised, fully automatic, motor driven cable winch that holds up to 500m (1500 ft) of cable. The traction regulating device synchronises the coordination of the camera tractor (inspection system) and the cable winch and is developed for optimal operating conditions during sewer inspections. The automatic cable guide ensures level winding of the camera cable. The fully integrated swivel winch boom and remote control unit enable the operator to work professionally and efficiently. A steel rope winch for easy, precise deployment of the camera systems ensures safe and ergonomic operation. A digital length measuring system and a workplace light for optimal lighting of the manhole opening complete the product equipment.
We propose a framework that uses human gestures as input to trigger events within a DI-Guy simulation scenario in real time, which could greatly help users to control events and avatar reactions in the scenario. We use a Microsoft Kinect device for motion capture and wrote our own plug-in for gesture recognition, which associates gestures with various commands in the scene. The framework is a distributed system in which different modules communicate and synchronise through data streams. This provides a scalable, loosely coupled, highly cohesive modular framework where any component can be altered or modified without redesigning the whole system. The proposed framework could provide fast, affordable yet reliable environments for real-time interactive crowd simulations. It supports direct user gesture input and allows for direct and collateral avatar reactions based on artificial intelligence. The gesture libraries recognise the most commonly used gestures and could easily be expanded to meet further requirements.
Figure 1: Skeleton tracking using Kinect (top). Intuitive interfacing with crowd simulation package (DI-Guy) (bottom).
In the vast majority of image fusion cases, one source image holds high-frequency information at spatial coordinates where the other image holds low frequencies. While this is quite common when fusing infrared and thermal images with visual images (at night), expanding image fusion to accept multi-source, multi-modal images raises concerns about saturating the resulting fused image. It also highlights the need to study fusion capacity in order to minimise the overlapping of high frequencies that causes fusion artefacts.
Figure 1: Fusing a natural scene with saturated (8-bits) images
Figure 2: Fusion capacity maps using localised mutual information
Natural scene imaging statistics suggest that the probability of finding a uniform image tends to zero. Therefore, the fusion capacity of an image can be approximated by measuring the distance between the normalised histogram of the examined image and the uniform distribution using mutual information. Using the normalised local mutual information measure, fusion capacity can be formulated as follows:
where x is the source image with normalised histogram X and u is the uniformly distributed image with a normalised uniform histogram U.
Over years of evolution, fish have adapted to their hydrodynamic environment and become expert swimmers, which has driven many researchers to study their body structure and swimming characteristics. With the development of propulsion theories and robotic technologies, research on biomimetic robot fish with high velocity, high efficiency and high manoeuvrability has become a hotspot. Such research gives humans more opportunities to explore the mysterious underwater world and to develop more efficient propellers for ships and underwater vehicles. We designed a new free-swimming biomimetic robot fish (Fig. 1) with a biomimetic tail that simulates the carangiform tail, a barycenter-adjustor for descending and ascending motions, and multiple sensors. The robot fish can communicate with the outside through an information relay system on the water. We also proposed a three-dimensional (3D) computational fluid dynamics simulation of the biomimetic robot fish in Fluent. A user-defined function (UDF) defines the movement of the robot fish, and a dynamic mesh is used to mimic the fish swimming in water. The hydrodynamic analysis (for example, Fig. 2) of the robot fish provides comparative data about hydrodynamic properties and guides improvements to the design, remote control and flexibility of the underwater robot fish. Figure 2 shows pressure contours (N·m⁻²) at four times within one fish tail swing period: (a) approaching the left maximal rotation position (t=0.1s); (b) intermediate position during oscillating motion (t=0.25s); (c) continuing through the intermediate position during oscillating motion (t=0.4s); (d) approaching the right maximal rotation position (t=0.55s).
Figure 1: The functional prototype of the biomimetic robot fish
One well acknowledged drawback of traditional parallel kinematic machines (PKMs) is that the ratio of accessible workspace to robot footprint is small for these structures. This is most likely a contributing reason why relatively few PKMs are used in industry today. The SCARA-Tau structure is a parallel robot concept designed with the explicit goal of overcoming this limitation and developing a PKM with a workspace similar to that of a serial type robot of the same size. Research done at CISR shows how a proposed variant of the SCARA-Tau PKM can improve the usability of this robot concept further by significantly reducing the dependence between tool platform position and orientation of the original concept. The inverse kinematics of the proposed variant has been derived and a comparison made between this structure and the original SCARA-Tau concept, both with respect to platform orientation changes and workspace.
Fig. 1. The SCARA-Tau prototype (left). A model of the same structure seen from the top (middle) and a model of the modified structure seen from the top (right).
Parallel kinematic machines (PKMs) are receiving increasing attention, both in academia and industry. One rapidly growing use for PKMs is in the solar photovoltaic (PV) industry, where a large number of DELTA robots are used. There is a rising demand for fast robots with larger workspace than what is available today. The SCARA-Tau robot is a novel PKM with a large workspace that would be highly suitable for applications in the PV industry. Present research efforts at CISR are aimed at evaluating the feasibility of using this robot in solar cell manufacturing applications. Adapting the SCARA-Tau robot to solar cell manufacturing involves finding optimal structural parameters for this application. This is a multi-purpose optimization problem. An optimal workspace should have a large volume below the lowest upper arm of the robot, while the isotropic properties and dexterity of the structure should be kept high.
Fig. 1. Parallel kinematic robots used for solar cell manufacturing (left) and the SCARA-Tau prototype (right)
Stereo correspondence estimation is one of the most active research areas in the field of computer vision, and a number of techniques have been proposed and developed, each with advantages and shortcomings. Among the techniques reported, multiresolution analysis based stereo correspondence estimation has gained a lot of research focus in recent years. Although the most widely employed media for multiresolution analysis are wavelet and multiwavelet bases, relatively little work has been reported in this context. In this research we address some of the issues and shortcomings of the work reported in this domain. While addressing the shortcomings of the existing algorithms, we also propose a new technique to overcome some flaws that could significantly degrade algorithm performance and that have not been addressed in earlier proposals.
Our algorithm uses multi-resolution analysis enforced with wavelet/multiwavelet transform modulus maxima to establish correspondences between the stereo pairs of images. A variety of wavelet and multiwavelet bases, possessing distinct properties such as orthogonality, approximation order and shape, are employed to analyse their effect on the performance of correspondence estimation. The idea is to provide a knowledge base for understanding and establishing relationships between wavelet and multiwavelet properties and their effect on the quality of stereo correspondence estimation. In addition, a comparative performance analysis of the proposed algorithm against eight well-known existing algorithms is performed to provide an insight into the capabilities of the proposed algorithm as well as the potential of wavelet and multiwavelet theories in stereo vision.
(RMS) for number of selected images with wavelets/multiwavelets basis highlighted with different colours
In many imaging applications, such as medical imaging and surveillance operations, it is beneficial to extract key details from captured images. However, in such applications the imaging technology used often results in low quality images, making it difficult to extract meaningful information. For example, a surveillance camera may capture a wide field of view, at low resolution, but if a forensic team needs to identify a suspect's face or a car number plate it is often not possible.
Super-resolution is a method of enhancing images so that features of interest can be extracted in fine detail, while still using low resolution imaging hardware. Multi-frame super-resolution fuses information from a series of low resolution images to create an image, or series of images, with a higher spatial resolution. As well as increasing resolution, this concept can be used to extend the field of view, remove moving objects and correct degradations inherent in the imaging process, such as noise, blur and spatial-sampling errors. Input images for multi-frame super-resolution algorithms must be captured from slightly different perspectives of a scene, resulting in a sub-pixel shift between images, so that detail not captured by the original imaging system can be recovered.
Current super-resolution algorithms make a range of assumptions about the low resolution input images to simplify the scenario. They generally use a simple affine image transform, permit only minute displacement between images and assume all images belong to a planar scene. Removing these restrictions greatly increases the number of possible applications for this technology. Allowing significant displacement between images and using a projective, rather than affine, transform means super-resolution may be used to create a high resolution view of a three-dimensional scene.
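The difference between the two transform models can be illustrated with a short sketch. It is not part of the super-resolution algorithm itself: the matrices are made-up examples and the function name is hypothetical. An affine transform is a 3x3 matrix whose last row is (0, 0, 1), while a projective transform (homography) has a general last row, so the mapped coordinates must be divided by a position-dependent scale.

```python
import numpy as np

def warp_points(points, transform):
    """Map 2D points through a 3x3 affine or projective transform."""
    pts = np.asarray(points, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homogeneous @ np.asarray(transform, dtype=float).T
    # For an affine matrix the third coordinate is always 1, so the
    # division changes nothing; for a homography it varies per point.
    return mapped[:, :2] / mapped[:, 2:3]

if __name__ == "__main__":
    corners = [[0, 0], [100, 0], [100, 100], [0, 100]]
    affine = [[1.0, 0.1, 5.0],
              [0.0, 1.0, -3.0],
              [0.0, 0.0, 1.0]]
    projective = [[1.0, 0.1, 5.0],
                  [0.0, 1.0, -3.0],
                  [0.001, 0.0, 1.0]]
    print(warp_points(corners, affine))
    print(warp_points(corners, projective))
```

Under the affine model parallel lines stay parallel. The projective model lets them converge, which is what a change of viewpoint of a planar surface in a three-dimensional scene produces.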
An image fusion metric does not suit all fusion algorithms. Some algorithms have to be evaluated with specially designed metrics. This section discusses, through counter examples, the compatibility challenges facing image fusion metrics with respect to a certain fusion algorithm.
Figure 1 illustrates different incompatibility challenges between image fusion algorithms and metrics. The proposed duality index depends on performing fusion experiments where we know exactly what the result should be. Developing such an index is fairly simple, although it has a solid mathematical background in the abstract algebra literature. The duality index depends mainly on estimating whether the fusion operator measures the actual data transferred from a source image when it is fused with a non-informative image, also called a zero image. The image fusion algorithm/metric duality index is then defined as DI0,:⊕x→R where ⊕ is the set of all fusion algorithms and is the set of all fusion metrics. The duality index is estimated as the average error in the equation below for all images in the domain.
While single-domain image fusion systems aim to add information from source images into one fused image, the main objective of multi-domain image fusion is to incorporate information captured by sensors observing a certain phenomenon from different viewpoints. This allows the observer to understand the whole situation. However, not all sensors are equally accurate, low cost, portable and readily available. In fact, some of these sensors have limitations according to the nature of the captured data, power consumption and cost of the device. Most of these specially designed sensors adopt a colour map that best demonstrates the captured information. This is because most non-visual sensors are too complex to be equipped with large memory modules, faster codecs or modules to support higher bandwidth. This work values image properties during the fusion process.
where F⊕ is the set of fusable images with their properties and Pi⊕i is the ith image property
Color-map and Histogram Fusion
Multi-sensor data fusion refers to the process of combining data from different sources to improve the quality of the information and the accuracy of measurements. It is at the heart of a wide range of applications including military, smart buildings and bridges, satellites and industry. The main problem with multi-sensor data fusion is that it is highly dependent on the quality of the measurements obtained from each sensor. Since these sensors operate in real environments, there is no guarantee that their outputs are accurate. Inaccuracies can arise from the health of the sensors, for example the battery charge condition, or from the high-noise nature of the environment. Some errors may also occur due to faulty communication channels. The result is the loss of measurements or packets, which may cause the data fusion process to diverge.
The main goal of this research is to incorporate the possibility of missing measurements or packets in the data fusion process and to obtain the optimal data fusion for systems suffering from this problem. The research finds the minimum error variance of the resulting system state estimation using a Kalman-like recursive filter. Figures 1 and 2 show the performance of the newly developed optimal filter against traditional data fusion techniques in terms of error variance and MSE.
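The effect of missing measurements on a recursive estimator can be illustrated with a textbook scalar Kalman filter that skips the update step whenever a measurement is lost. This is the standard intermittent-observation treatment, shown only for illustration. It is not the optimal filter developed in this research, and the model parameters below are made-up values.

```python
import math

def kalman_with_dropouts(measurements, a=1.0, q=0.01, r=0.25, x0=0.0, p0=1.0):
    """Scalar Kalman filter for x[k] = a*x[k-1] + w, z[k] = x[k] + v.

    measurements: sequence of floats; None or NaN marks a lost packet.
    q, r: process and measurement noise variances (assumed known).
    Returns the state estimates and their error variances.
    """
    x, p = x0, p0
    estimates, variances = [], []
    for z in measurements:
        # Prediction step always runs.
        x = a * x
        p = a * p * a + q
        # Update step runs only when a measurement actually arrived.
        if z is not None and not math.isnan(z):
            gain = p / (p + r)
            x = x + gain * (z - x)
            p = (1.0 - gain) * p
        estimates.append(x)
        variances.append(p)
    return estimates, variances

if __name__ == "__main__":
    zs = [1.1, 0.9, None, None, 1.05, float("nan"), 0.95]
    est, var = kalman_with_dropouts(zs)
    for z, e, v in zip(zs, est, var):
        print(z, round(e, 4), round(v, 4))
```

During a run of lost packets the error variance grows by q at every step, which shows how prolonged losses can degrade the fused estimate.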
Kalman filtering is a well-known technique for state estimation of real-time dynamic systems. It provides the minimum error variance of state estimation when the system is subject to noise from the environment and the sensors. However, Kalman filters assume perfect modelling of the system and complete knowledge of the system parameters. In real-world situations these assumptions are not valid, and the outputs of the Kalman filters will not be robust. This research addresses the problem of system modelling when the system under consideration suffers from uncertainties. It also provides the robust design of a new filter that provides optimal filtering for systems under uncertainties without prior knowledge of the uncertainty values.
The uncertainties addressed in this research are the uncertainties in the modelling parameters and the uncertainties in the observations. The first kind may take place due to inaccurate modelling of the systems, linearization of nonlinear systems or model reduction. The second kind of uncertainty is due to poor communication channels, unhealthy sensors, high manoeuvring of tracked objects or high noise in the environment. The results of this research include finding an upper bound on the estimation error covariance and finding the optimal filter parameters that guarantee the estimation error covariance to fall below this upper bound. Figures 1 and 2 compare the result of state estimation of a two state system suffering from uncertainties in the modelling parameters and uncertainties in the observation process using the proposed robust filter and the conventional Kalman filter.
Despite the advantages of MRI, there are problems related to the use of MRI machines. The biggest problem when using MR imaging is that it is a time consuming process where a single scan may take 30 minutes. This creates large patient backlogs that delay their medical diagnosis and proper medication. Another major problem is that the MRI scanning process is sensitive to patients' movements. In some cases it is hard to control the movement of the patient such as in the case of children. In this case the patient is required to do the MRI scan again which increases the backlog. The MRI machines are complex and very expensive equipment. This makes increasing the number of operating MRI machines an impractical solution for reducing the backlogs. The practical solution would be to increase the speed of the scanning process itself. However, the speed of MRI machines is physically limited and cannot be any faster as long as the same number of measurements is required.
This research addresses the problem of increasing the speed of the MRI scanning process using a novel measurement sampling technique that guarantees the minimum number of acquired measurements without compromising the scan quality. The idea is to find, before starting the scanning process, the best set of measurements to represent the whole set of measurements. This eliminates redundancies in the measurements. The new data acquisition technique reduces the scan time by one third without introducing any noise. Higher speeds can be achieved at the cost of introducing additive noise. The results of this research also include guaranteed robustness of the image reconstruction in the presence of frequent patient movements. The next figure shows the result of image acquisition from 65% of the measurements and the result after robustly filtering the output image.
This research investigates a methodology for generating a distinctive object representation offline, using short-baseline stereo fundamentals to triangulate highly descriptive object features in multiple pairs of stereo images. Several sparse 2.5D perspective views are generated and then registered into a single coordinate space. Having some a priori knowledge, such as the proposed sparse-feature model, is very useful when detecting an object and estimating its pose in real-time systems, such as augmented reality.
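The triangulation step for a single matched feature can be sketched as follows. The example assumes a rectified stereo pair with a known focal length (in pixels), baseline and principal point. These parameter names and the sample values are illustrative assumptions, not details of the system described here.

```python
def triangulate(u_left, v_left, u_right, focal_px, baseline_m, cx, cy):
    """3D position of a feature matched in a rectified stereo pair.

    u_left, v_left: pixel coordinates of the feature in the left image.
    u_right: horizontal pixel coordinate of the same feature in the right image.
    Returns (x, y, z) in metres in the left camera frame.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a valid match")
    z = focal_px * baseline_m / disparity   # depth from similar triangles
    x = (u_left - cx) * z / focal_px
    y = (v_left - cy) * z / focal_px
    return x, y, z

if __name__ == "__main__":
    point = triangulate(u_left=345.0, v_left=250.0, u_right=320.0,
                        focal_px=500.0, baseline_m=0.1, cx=320.0, cy=240.0)
    print(point)
```

Because depth is inversely proportional to disparity, a short baseline gives small disparities and therefore coarser depth at long range, so accurate feature localisation matters for the registered model.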
Figure - 2.5D perspective view of a box edge
Creating a highly programmable surface operating at relatively high speed and in real time is an area of research with many challenges. Such a system has applications in the field of optical telescopes, product manufacturing, and giant 3D-screens and billboards for advertising and artwork. This research investigates various system designs, modularity, programmability and the system control intelligence. A simulation environment was developed to streamline system reconfiguration to translate complex mathematical functions into 3D shapes virtually before being displayed on the physical surface.
World-leading operations are increasingly relying on modelling and simulation to develop more efficient systems and to produce higher quality products and services. Modelling and simulation give scientists and engineers a better understanding of three-dimensional and time-dependent phenomena, as well as providing a platform for predicting future behaviour. Virtual training (VT) systems utilising advanced virtual reality technology are of growing importance in a wide range of application domains, such as the aerospace, engineering, medical and education fields.
The perceived capability and technological affordance of VT systems for enhancing human abilities to learn abstract concepts and complex procedural tasks have led to their wide adoption for training and to fast-paced technological advancements. Despite this, ways to evaluate the efficacy of such technology remain unclear. In addition, how to better design such technology to achieve effective learning outcomes remains a significant challenge, particularly given the complexity of individual differences in human performance, adaptation and acceptance of the new software and hardware devices evoked by 3D VT systems.
It is imperative to obtain a solid understanding of the important elements that contribute to effective learning via this class of applications. This problem is addressed by developing a new evaluation method that focuses on cognitive, affective and skill-based learning dimensions. Results highlight the contribution of the method in analysing user performance and in understanding individual differences in learning ability and user experience within the 3D VT. The results indicate that the method is effective in providing adequate, reliable and valid evaluation results in a comprehensive and systematic fashion. In addition, by exploring spatial knowledge and technical skill acquisition, knowledge visualisation and user perceptions in a machine assembly training scenario, this research suggests that 3D VT is effective in integrating multimodal system feedback and presents information appropriate to user input. The 3D VT is also effective in supporting cognition and performance, and in inducing positive user perception and affect.