PROJECT SUMMARY
Transparent Machines: From Unpacking Bias to Actionable Explainability
Focus Areas: News and Media, Transport and Mobility, Health, and Social Services
Status: Active
ADMs, their software, algorithms, and models are often designed as “black boxes”, with little effort placed on understanding how they work. This lack of understanding affects not only the end users of ADMs but also the stakeholders and developers, who need to be accountable for the systems they create. The problem is often exacerbated by inherent biases in the data on which the models are trained.
Further, the widespread adoption of deep learning has led to an increasing number of minimally interpretable models being deployed, in contrast to traditional models such as decision trees, or even Bayesian and statistical machine learning models.
Explanations of models are also needed to reveal potential biases in the models themselves and assist with their debiasing.
This project aims to unpack biases in models that may come from the underlying data, as well as biases in software (e.g. a simulation) that could be designed with a specific purpose and angle from the developers’ point of view. It also investigates techniques for generating diverse, robust and actionable explanations across a range of problems, data types and modalities, from large-scale unstructured data to highly varied sensor data and multimodal data. To this end, we look to generate counterfactual explanations that share a dependence on the data distribution and the local behaviour of the black-box model, and to offer new metrics that measure the opportunity cost of choosing one counterfactual over another. We further aim to explore the intelligibility of different representations of explanations to diverse audiences through an online user study.
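To illustrate the basic idea of a data-grounded counterfactual explanation, the sketch below performs a simple nearest-unlike-neighbour search against a black-box classifier. The dataset, model and distance metric are illustrative assumptions, not the project’s actual method; restricting candidates to real training instances is one simple way to keep counterfactuals tied to the data distribution.

```python
# Minimal sketch of a nearest-unlike-neighbour counterfactual search.
# All modelling choices here are illustrative, not the project's method.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)  # black-box model

query = X[0]                            # instance to be explained
predicted = model.predict([query])[0]   # its current prediction

# Search only among real instances that the model assigns to a different
# class, so the counterfactual stays on the observed data distribution.
candidates = X[model.predict(X) != predicted]
distances = np.linalg.norm(candidates - query, axis=1)
counterfactual = candidates[np.argmin(distances)]

print("Query prediction:         ", predicted)
print("Counterfactual prediction:", model.predict([counterfactual])[0])
print("Features that changed:    ", np.flatnonzero(counterfactual != query))
```

In practice, several such candidates would be compared, which is where metrics for the opportunity cost of choosing one counterfactual over another come in.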
PUBLICATIONS
i-Align: An Interpretable Knowledge Graph Alignment Model, 2023
Salim, F., Scholer, F., et al.
TransCP: A Transformer Pointer Network for Generic Entity Description Generation with Explicit Content-Planning, 2023
Salim, F., et al.
Contrastive Learning-Based Imputation-Prediction Networks for In-hospital Mortality Risk Modeling Using EHRs, 2023
Salim, F., et al.
How Robust is your Fair Model? Exploring the Robustness of Diverse Fairness Strategies, 2023
Small, E., Chan, J., et al.
Equalised Odds is not Equal Individual Odds: Post-processing for Group and Individual Fairness, 2023
Small, E., Sokol, K., et al.
Helpful, Misleading or Confusing: How Humans Perceive Fundamental Building Blocks of Artificial Intelligence Explanations, 2023
Small, E., Xuan, Y., et al.
Navigating Explanatory Multiverse Through Counterfactual Path Geometry, 2023
Small, E., Xuan, Y., Sokol, K.
Mind the gap! Bridging explainable artificial intelligence and human understanding with Luhmann’s Functional Theory of Communication, 2023
Sokol, K., et al.
Measuring disentangled generative spatio-temporal representation, 2022
Chan, J., Salim, F., et al.
FAT Forensics: A Python toolbox for algorithmic fairness, accountability and transparency, 2022
Sokol, K., et al.
Analysing Donors’ Behaviour in Non-profit Organisations for Disaster Resilience: The 2019–2020 Australian Bushfires Case Study, 2022
Chan, J., Sokol, K., et al.