FEIT Research Project Database

Multi-modal deep dictionary learning framework for managing smart city assets

Project Leader: Marimuthu Palaniswami
Staff: Karim Seghouane, Aravinda Rao
Collaborators: SenSen Networks
Sponsors: Australian Research Council (ARC), Linkage Project
Primary Contact: Marimuthu Palaniswami (palani@unimelb.edu.au)
Keywords: optimisation; sensor fusion
Disciplines: Electrical & Electronic Engineering
Research Centre: ARC Research Network on Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP)

Australia is undergoing urbanisation on an unprecedented scale, with 80–90% of the population living in urban areas [1], alongside population growth [2]. Our cities are becoming denser, posing challenges to existing road infrastructure, public transport and services. The gross replacement value of Australia’s assets is $438 billion, of which assets worth $47 billion are in poor or very poor condition [3]. Managing city assets (traffic signs, parking signs, public transport signs, hospital and hotel signs) is becoming difficult due to the rapid development of cities [4, 5], and most cities lack a complete database of their assets.

Currently, auditing assets involves visiting each asset and recording its condition in a customised database using a few measures (such as whether it is in good condition, needs maintenance because of regular wear and tear, is damaged by the environment or vandalism, is broken due to an accident, or is completely missing). This process considers only a few aspects of the assets, such as their location in the city, and lacks an integrated overview of how assets operate with one another. As a result, the current practice of managing assets:

  1. is prohibitive because of high personnel cost and time
  2. is subject to human error in assessing the condition of assets
  3. forces motorists and pedestrians to make subjective calls when assets are missing or damaged
  4. lacks tools to optimally digitise assets.

The current solutions to auditing assets lack:

  1. robustness to occlusion and unusual noise captured while acquiring the data
  2. an online approach to handle very large datasets
  3. the ability to handle nonlinearities in the datasets
  4. a quality measure to recognise the state of assets.

Deep learning is widely used in object detection and classification because its layered structure learns features hierarchically. However, it is typically a supervised approach in which features are optimised for the specific task at hand, so they may not transfer well to other tasks. In addition, deep learning algorithms can suffer from vanishing gradients as more layers are added, which makes deeper networks more difficult to train.
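The vanishing-gradient effect can be illustrated numerically. The sketch below is an illustration only: it assumes a chain of scalar sigmoid layers with unit weights, which is a simplification chosen here and not a model from this project, and the function name `gradient_through_chain` is hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def gradient_through_chain(depth: int, x: float = 0.5, weight: float = 1.0) -> float:
    """Return d(output)/d(input) for `depth` stacked scalar sigmoid layers.

    Assumption for illustration: every layer is a = sigmoid(weight * a_prev).
    """
    grad = 1.0
    activation = x
    for _ in range(depth):
        activation = sigmoid(weight * activation)
        # Chain rule: each layer multiplies the gradient by
        # weight * sigmoid'(z) = weight * a * (1 - a), which is at most 0.25 * weight.
        grad *= weight * activation * (1.0 - activation)
    return grad


if __name__ == "__main__":
    for depth in (1, 5, 10, 20):
        print(depth, gradient_through_chain(depth))
```

With unit weights, every layer scales the gradient by a factor of at most 0.25, so the gradient reaching the first layer shrinks geometrically with depth. This is one simple way to see why deeper networks of this kind become harder to train.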

On the other hand, dictionary learning represents data by learning overcomplete dictionaries with a global objective. Dictionary learning is used in a variety of applications, such as image denoising, image restoration and image classification. It provides representations of data that generalise and can be easily adapted to new, unseen test data. It also requires comparatively less data for training the models. Moreover, it provides closed-form equations for further analysis and interpretation of solutions, which helps in applying them to real-world scenarios and understanding the underlying signals. Therefore, in this project, we will explore and build a multi-modal deep dictionary learning framework for recognising assets and reporting their condition.
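To make the idea concrete, the following is a minimal sketch of single-layer dictionary learning, not the framework proposed in this project. It assumes the common objective 0.5*||Y - D X||_F^2 + lam*||X||_1 with unit-norm atoms, and it swaps in two standard stand-ins chosen for brevity: iterative soft-thresholding (ISTA) for the sparse codes and a closed-form least-squares update (method of optimal directions) for the dictionary. All function names and parameter values are hypothetical.

```python
import numpy as np


def sparse_code(Y, D, lam, n_iter=50):
    """ISTA for min_X 0.5*||Y - D X||_F^2 + lam*||X||_1 (D held fixed)."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2  # 1 / Lipschitz constant of the gradient
    X = np.zeros((D.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        Z = X - step * (D.T @ (D @ X - Y))                        # gradient step
        X = np.sign(Z) * np.maximum(np.abs(Z) - lam * step, 0.0)  # soft-threshold
    return X


def update_dictionary(Y, X, D_old, eps=1e-8):
    """Closed-form least-squares update D = Y X^T (X X^T + eps*I)^-1, then renormalise."""
    G = X @ X.T + eps * np.eye(X.shape[0])
    D = np.linalg.solve(G, X @ Y.T).T
    norms = np.linalg.norm(D, axis=0)
    unused = norms < eps              # atoms that no sample used in this pass
    D[:, unused] = D_old[:, unused]   # keep their previous (unit-norm) value
    norms[unused] = 1.0
    return D / norms


def learn_dictionary(Y, n_atoms, lam=0.1, n_outer=10, seed=0):
    """Alternate sparse coding and dictionary updates; columns of Y are samples."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((Y.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)
    for _ in range(n_outer):
        X = sparse_code(Y, D, lam)
        D = update_dictionary(Y, X, D)
    return D, sparse_code(Y, D, lam)
```

For example, `D, X = learn_dictionary(Y, n_atoms=16)` on an 8-by-40 data matrix `Y` returns an overcomplete 8-by-16 dictionary and a 16-by-40 matrix of sparse codes. The dictionary update is the kind of closed-form expression referred to above: for fixed codes it is an ordinary least-squares solution, which is what makes the learned atoms comparatively easy to analyse.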

This project aims to develop a world-first automated asset auditing technology. Dictionary learning and deep learning algorithms have shown promising results in detecting, tracking and understanding assets (or objects in general) in public places for smart city operations. Using such learning techniques, we intend to create new models for automated:

  • identification of city assets
  • determination of the condition of the assets (damaged, broken, or missing)
  • integration with city services for decision-making and service delivery.

Significance/innovations of this project

Our proposed project will fundamentally change the practice of auditing assets. Our solution will deliver state-of-the-art dictionary learning and deep learning technologies, integrated with multi-modal sensing technologies, to create dictionaries of assets in existing cities. The outcomes of the project will provide automated tools for auditing cities and operating them efficiently. Our innovation is a novel multi-level solution framework for detecting, recognising and characterising city assets, with the following components:

  • A new multi-modal deep dictionary learning framework that handles data sets of different modalities and dimensions by forcing the different features to interact through their sparse codes
  • A robust variant of the proposed dictionary learning and deep learning procedure to handle occlusion and unusual noise in data
  • An online version of the proposed dictionary learning procedure to handle very large data sets
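
Two of the ideas in the list above can be sketched in a few lines. The project text does not specify how modalities are coupled or how the online procedure works, so everything below is an assumption made for illustration: modalities are coupled by giving each sample a single sparse code shared across per-modality dictionaries, and the online step is a simplified update based on running sufficient statistics. ISTA is used as a stand-in sparse solver, and all names are hypothetical.

```python
import numpy as np


def ista(Y, D, lam=0.1, n_iter=100):
    """Solve min_X 0.5*||Y - D X||_F^2 + lam*||X||_1 by iterative soft-thresholding."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2
    X = np.zeros((D.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        Z = X - step * (D.T @ (D @ X - Y))
        X = np.sign(Z) * np.maximum(np.abs(Z) - lam * step, 0.0)
    return X


def shared_sparse_codes(signals, dictionaries, lam=0.1):
    """Assumed coupling: one code per sample, shared by every modality.

    signals[m] has shape (dim_m, n_samples); dictionaries[m] has shape
    (dim_m, n_atoms). Stacking them row-wise minimises the summed
    reconstruction error over modalities with a single code matrix.
    """
    Y = np.vstack(signals)
    D = np.vstack(dictionaries)
    return ista(Y, D, lam)


def online_step(D, A, B, Y_batch, lam=0.1, eps=1e-8):
    """Fold one mini-batch into running statistics and refresh the dictionary.

    A accumulates X X^T and B accumulates Y X^T, so earlier batches never
    need to be revisited or kept in memory.
    """
    X = ista(Y_batch, D, lam)
    A = A + X @ X.T
    B = B + Y_batch @ X.T
    D_new = np.linalg.solve(A + eps * np.eye(A.shape[0]), B.T).T  # D = B A^-1
    norms = np.linalg.norm(D_new, axis=0)
    unused = norms < eps              # atoms not used so far keep their old value
    D_new[:, unused] = D[:, unused]
    norms[unused] = 1.0
    return D_new / norms, A, B
```

In use, `shared_sparse_codes([Y_a, Y_b], [D_a, D_b])` returns one code matrix for two modalities of different dimension, and `online_step` is called once per incoming batch, starting from zero matrices `A` (n_atoms by n_atoms) and `B` (dim by n_atoms). Memory use then depends on the dictionary size and not on the number of samples seen, which is the property that matters for very large data sets.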


Prof Marimuthu Palaniswami (palani@unimelb.edu.au)

Dr Karim Seghouane (abd-krim.seghouane@unimelb.edu.au)

Dr Aravinda Rao (aravinda.rao@unimelb.edu.au)


  1. Trading Economics, 2019. [Online]. Available: https://tradingeconomics.com/australia/urban-population-percent-of-total-wb-data.html
  2. Australian Bureau of Statistics, 2018.
  3. J. Roorda and Associates, “National state of the assets: Roads and community infrastructure report,” Australian Local Government Association (ALGA), Report, 2015.
  4. Department of Infrastructure and Regional Development, “Trends: Infrastructure and transport to 2030,” Australian Government, Report, 2014.
  5. K. Rosier and M. McDonald, “The relationship between transport and disadvantage in Australia,” 2011. [Online]. Available: https://aifs.gov.au/cfca/publications/relationship-between-transport-and-disadvantage-austr