MSE Research Project Database

Resource allocation for attention in the context of intent (in both cooperative and adversarial settings)


Project Leader: Adrian Pearce
Primary Contact: Adrian Pearce (adrianrp@unimelb.edu.au)
Keywords: applied control theory; artificial intelligence; autonomous systems; machine learning; optimisation and programming languages
Disciplines: Computing and Information Systems

Artificial intelligence has the potential to vastly amplify and augment human intelligence, especially in human-agent teaming applications. One key challenge is to ensure that an agent's values and goals are aligned with a human user's intentions, so that it behaves safely and fairly. Another is managing attention in multi-agent settings, which typically involves physical actions (e.g. robot motion), sensing actions (e.g. human gaze, robot sensor scheduling) and interleaved reasoning steps (e.g. AI planning). Attention for both humans and robots is constrained by individual perspective and occlusion, by the number of sensing actions that can be performed over time, and by physical actions such as motion and changes in physical configuration. The problem becomes especially interesting in complex and dynamic multi-agent settings, where the choice of these actions depends on beliefs about the intentions of either a collaborator or an adversary; this is the key challenge that this thesis will tackle. This thesis focuses on how each agent allocates the finite resources it can spend on attention, an allocation that must take into account the intent of other agents.
On one hand, techniques in deep learning have demonstrated, at least in principle, that attention can be used to draw 'global' dependencies between input and output, with self-attention serving as an alternative to recurrence ('Attention Is All You Need', Ashish Vaswani et al., Google Brain). On the other hand, classical planning is a compelling emerging framework for this task: it has been used successfully in unison with deep networks (for example DeepMind’s AlphaGo and AlphaZero) and relies on a domain-independent (task-independent) modelling language (semantics).
There are advantages in having an explicit (symbolic) model representation: learning behaviours is usually faster and more sample-efficient; prior knowledge and experience can be integrated more easily; and the model can be flexibly reused for a wide variety of goals and objectives. Model-based approaches also enable counterfactual reasoning (“what would have happened if …”), which is difficult without a model. Relevant technologies include, but are not limited to, classical planning, mathematical programming, reinforcement learning and deep learning, constraint programming, and epistemic reasoning (reasoning about ‘nested’ belief).