Transparent Machines: From Unpacking Bias to Actionable Explainability
Focus Areas: News and Media, Transport and Mobility, Health, and Social Services
Research Program: Machines
ADMs, their software, algorithms, and models, are often designed as “black boxes” with little effort put into understanding how they work. This lack of understanding affects not only the final users of ADMs but also the stakeholders and developers, who must be accountable for the systems they create. The problem is often exacerbated by inherent biases in the data on which the models are trained.
Further, the widespread use of deep learning has led to an increasing number of minimally interpretable models being deployed, in contrast to traditional models such as decision trees or Bayesian and statistical machine learning models.
Explanations of models are also needed to reveal potential biases in the models themselves and to assist with debiasing them.
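As a toy illustration of the kind of bias an explanation might help surface, the sketch below computes a simple demographic-parity gap: the difference in how often a model predicts the favourable outcome for each group. The predictions and group labels here are invented for illustration and are not drawn from the project.

```python
# Illustrative sketch only: predictions and group labels are made up.

def demographic_parity_gap(predictions, groups):
    """Largest difference in positive-prediction rate between any two groups."""
    rates = {}
    for g in set(groups):
        group_preds = [p for p, gg in zip(predictions, groups) if gg == g]
        rates[g] = sum(group_preds) / len(group_preds)
    return max(rates.values()) - min(rates.values())

preds = [1, 0, 1, 1, 0, 0, 1, 0]    # hypothetical model outputs (1 = favourable)
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
gap = demographic_parity_gap(preds, groups)  # 0.75 - 0.25 = 0.5
```

A gap near zero suggests the favourable outcome is distributed evenly across groups; a large gap flags a disparity worth explaining or debiasing.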
This project aims to unpack biases in models that may stem from the underlying data, as well as biases in software (e.g. a simulation) that may be designed with a particular purpose or angle from the developers’ point of view. The project also investigates techniques for generating diverse, robust, and actionable explanations across a range of problems, data types, and modalities, from large-scale unstructured data to highly varied sensor data and multimodal data. To this end, we look to generate counterfactual explanations that depend jointly on the data distribution and on the local behaviour of the black-box model, and to propose new metrics that measure the opportunity cost of choosing one counterfactual over another. We further aim to explore, through an online user study, how intelligible different representations of explanations are to diverse audiences.
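As a rough, self-contained sketch of the counterfactual idea, and not the project’s actual method, the code below runs a greedy coordinate search for the smallest change to an input that flips a black-box decision, with the size of the change serving as a crude stand-in for the cost of one counterfactual over another. The loan-approval scorer, the feature names, and the step size are all assumptions made for illustration.

```python
# Illustrative sketch: the scorer, features, and step size are invented.

def black_box_score(x):
    """Hypothetical black-box model scoring (income, debt);
    a positive score means the loan is approved."""
    income, debt = x
    return 0.6 * income - 0.9 * debt - 3.0

def predict(x):
    return int(black_box_score(x) > 0)

def counterfactual(x, score_fn, step=0.05, max_iter=2000):
    """Greedy coordinate search for a nearby input that flips the decision.

    At each step, perturb one feature at a time in either direction and
    keep the candidate closest to the decision boundary (smallest |score|),
    stopping once the predicted label changes. The L1 distance from x is
    returned as the 'cost' of the counterfactual.
    """
    start_label = int(score_fn(x) > 0)
    current = list(x)
    for _ in range(max_iter):
        if int(score_fn(current) > 0) != start_label:
            break  # decision flipped: counterfactual found
        best, best_abs = None, None
        for i in range(len(current)):
            for sign in (1.0, -1.0):
                cand = list(current)
                cand[i] += sign * step
                if best_abs is None or abs(score_fn(cand)) < best_abs:
                    best, best_abs = cand, abs(score_fn(cand))
        current = best
    cost = sum(abs(a - b) for a, b in zip(x, current))
    return current, cost

x = [10.0, 2.0]  # e.g. income and debt in arbitrary units (made up)
cf, cost = counterfactual(x, black_box_score)
# predict(x) == 1 (approved); predict(cf) == 0, at L1 distance `cost` from x
```

A label-only search like this treats the model as fully opaque; richer methods additionally constrain the counterfactual to lie on the data manifold, which is one sense in which an explanation can depend on both the data distribution and the model’s local behaviour.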