Cameron Buckner's central concern is the possibility that the errors produced by neural networks may not be errors at all, but may instead tell us something about the reality the networks survey and about the differences between human and machine intelligence.
It all starts with adversarial examples: manipulated inputs that cause neural networks, depending on their training data, to misclassify images, while the same manipulations do not fool humans. The features responsible for deep learning's perceived failures are not merely the result of overfitting, idiosyncrasies, or noise in the data; they may contain truths that are simply not recognizable to humans (they are non-robust). Explanations for why this is so range from historical contingency and social construction to evolutionary development. That such features are non-robust, however, leaves open the possibility that human perception, and practices built on it such as science, are limited, and could potentially move beyond those limitations with the right tool, one equipped with a different cognitive apparatus.
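The adversarial examples Buckner starts from are typically produced by gradient-based perturbation methods. A minimal sketch of one such method, the fast gradient sign approach, is given below on a toy linear classifier; the weights and input here are invented for illustration and do not come from Buckner's text:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical trained weights for a 4-"pixel" input, class 1 vs class 0.
w = np.array([1.5, -2.0, 0.5, 1.0])
b = 0.1

x = np.array([0.2, 0.1, 0.9, 0.4])   # input the model classifies as class 1
p = sigmoid(w @ x + b)               # confidence for class 1

# Gradient of the loss -log(p) w.r.t. x for the true label y=1 is (p - 1) * w;
# the attack steps a small amount eps in the direction of the gradient's sign.
eps = 0.3
grad = (p - 1.0) * w
x_adv = x + eps * np.sign(grad)

p_adv = sigmoid(w @ x_adv + b)
print(round(p, 3), round(p_adv, 3))  # confidence drops below 0.5: class flips
```

The point the essay builds on is visible even in this toy case: the perturbation is bounded (no coordinate moves by more than eps), yet it flips the classification, because it targets directions the model is sensitive to rather than anything a human would consider a meaningful change to the input.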
Whether such non-robust features can be useful to us, given that they escape our perception, depends on the specific use case. On the one hand, we ought to accept that non-robust features are not only noise but may reflect useful patterns in natural data. On the other hand, we should not be tempted to believe that these features must be predictively useful and reflect real signals about our object of interest. For Buckner, the middle way is that of artifacts, which are often predictively useful but are not objective features of the input that must necessarily be tracked. It follows that non-robust but predictively useful features, which may lead to errors from a human perspective, cannot be disqualified a priori as garbage, but neither can they be assumed to be useful.
As Buckner notes, there is the question of whether understanding and knowledge, and hence use, can derive from features that remain inaccessible to us. Could evidence based on non-robust features count as progress in science? Buckner's solution is one of taxonomy, etiological theory, and causality. As I understand it, we would try to frame the unknown features, placing them in a readable structure that makes them intelligible to us, while remaining largely unaware of the non-robust features themselves. If I am reading this right, however, this is merely a workaround. Just as one cannot fully translate words between languages, something always being lost in translation, wouldn't such workarounds for encapsulating non-robust features only defer the problem? In the end, we are still trying to translate something unintelligible into our spectrum of understanding.
There also seems to be a potential differentiation between non-robust features that are deep learning artifacts and real features detectable only via "alien" cognition. What is the relevant difference here? I take a feature to be non-robust when its existence results from calculation, or from a further level of abstraction, which is different from an alien cognition, which is more like putting on a set of distorting glasses. Nevertheless, both imply an ability of conception different from that of humans. Immanuel Kant proposed that whatever we perceive depends on our human apparatus; Heidegger likewise held that even mathematical truths could never capture a universal truth but were always limited by the human capacity for interpretation. If Buckner is right, this not only opens up a variety of dimensions of explanatory methods, it also asks us to take a new stance toward our ability to understand the world and toward what to do with intelligences that view it fundamentally differently. This also seems to have an ethical dimension: if we do not understand the basis of a system's results and cannot look inside the black box, can the use of its potentially crucial predictions be morally justified?
Of course, there is a larger metaphysical and epistemological issue at work here, namely how we are to interpret features that are supposedly in the world but which we cannot perceive. This line of thought has a long tradition: Kant, for instance, both held that humans possess a priori knowledge that structures our experience and, in his critiques, described how we are led into error in our conception of the world. With neural networks and Buckner's suggestion, the unintelligible is now brought to the main stage as something to make use of. However, we are concerned not only with ontological elements we cannot conceive of, but with elements that, I believe, can be argued to exist only as a result of calculations on data derived from the world (thus good old dualism). Even if a machine claims to have found something in the world, how can we validate that it is ontologically real; how do we know that it is not just a bug?
Buckner, Cameron (2020). "Adversarial Examples and the Deeper Riddle of Induction." https://arxiv.org/ftp/arxiv/papers/2003/2003.11917.pdf