Inscrutable Features: The Epistemology of Artificial Intelligence and the Limits of Human Knowledge

Are Deep Learning networks able to provide knowledge about the world which we humans cannot gather ourselves? Do such networks even tell us something about a reality which is not intelligible to humans? These questions are at the heart of a debate sparked by some of the recent developments in Deep Neural Network systems brought about by Machine Learning. Machine Learning networks are based on statistics and probability theory; what these systems do extremely well is find statistically based patterns in large amounts of data (Mitchell 2019). Usually, an engineer will train a system by both adding training data and providing an evaluation of the system's calculations, giving the system reason to further fine-grain its algorithm. However, this is also where the systems' weakness lies, for while deep learning models "can routinely achieve superior performance on novel natural data points which are similar to those they encountered in their training set, presenting them with unusual points in data space […] can cause them to produce behavior that the model evaluates with extreme confidence, but which can look to human observers like a bizarre mistake" (Buckner, p.2). Such systems regularly make mistakes in understanding human language and logic (Alpaydin, p.91) or in classifying images (Dreyfus, p. xv, p. xxxiii, p. xxxv), and they are readily deceived by strategically manipulated adversarial examples (Buckner 2020). But what if a falsely identified picture is not really a mistake, but contains pieces of knowledge about the world brought about by a non-human intelligence?
Researchers have recently shown that DNNs were discovering highly predictive features in adversarial examples that generalize well to novel real-world data; they were hence not merely concerned with junk, but gained usable insights from the manipulated data, apparently because the features they detected carry predictively useful information that is present in real-world input data (Buckner, p.6). So what we perceive may not be all there is to the world, and machines may help us get at those hidden truths. This thesis has some philosophical tradition to back it up. Plato held that intelligence means acting according to ideal ideas, which however remain unknown to us (Platon, p.283-285), and Kant worried that nothing we perceive can be guaranteed independent of the processing constraints imposed by our cognitive architecture (Buckner, p.18). More recently, Kris McDaniel has argued in a similar manner that humans make inductive inferences based on a sample size that is large enough but never complete; it is just a way of getting on in the world (McDaniel 2020; Buckner, p.9). There nevertheless remain things in the world which are "projectible", corresponding to objects that objectively belong together but which we do not directly perceive (McDaniel 2020). The fundamental claim underlying the idea of human-inscrutable features thus belongs to the tradition which differentiates between what something looks like, for a human or a machine, and what it really is (Buckner, p.6).
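The mechanics behind such adversarial attacks can be sketched on a toy model. The following is a minimal numpy illustration, not any of the systems discussed above: a linear "classifier" with made-up weights is confidently right about an input, yet a perturbation of only 5% of the pixel range, accumulated across all dimensions, flips its verdict.

```python
import numpy as np

# A toy linear classifier standing in for a trained network:
# score = w . x, score > 0 -> class "cat". (Illustrative weights, not a real model.)
d = 100
sign = np.where(np.arange(d) % 2 == 0, 1.0, -1.0)
w = 2.0 * sign                    # "learned" weights with alternating signs

# A "pixel" image in [0, 1] that the model classifies with high confidence:
# each pixel deviates only slightly, but the margin adds up across dimensions.
x = 0.5 + 0.02 * sign             # pixel values 0.48 or 0.52

def confidence(v):
    """Sigmoid of the linear score: the model's probability for class 'cat'."""
    return 1.0 / (1.0 + np.exp(-w @ v))

# Gradient-sign (FGSM-style) attack: nudge every pixel by eps against the
# gradient of the score. Each pixel moves by just 5% of the pixel range,
# yet the small shifts accumulate over all 100 dimensions and flip the
# prediction with high confidence in the wrong direction.
eps = 0.05
x_adv = x - eps * np.sign(w)      # gradient of (w @ x) w.r.t. x is just w
```

The high-dimensional accumulation is the point: no single pixel changes perceptibly, but the model's decision rests on a pattern that sums tiny, humanly invisible contributions.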

There has been some critique aimed at this line of thought, but that is not what this paper is about. So, given the distinction between features of the world and features which humans perceive: if there is a realm of knowledge which stays hidden to the human mind, could it still be of use to us? The science of protein folding had long been held to consist of properties which cannot be reduced to patterns in lower-level details, which is why abstract explanatory models were needed (Buckner 2020). The AlphaFold system was however able to beat these models on a majority of the test proteins and achieve a 15% jump in accuracy (Buckner 2020). I am not a scientist, and I do not know whether this 15% increase is a world-changing development. But what it hints at, according to Buckner, is that the "interaction fingerprints" which deep learning networks were able to identify in this scenario are just like the sorts of (non-robust) features that cause image-classifying networks to be susceptible to adversarial attacks (Buckner, p.11). They are features which human intelligence does not grasp.

If we assume that ideas such as Buckner's carry some weight, the question arises whether and how information derived from human-inscrutable features should be used.

On the one hand, there are technical limitations which we ought to be concerned about. As Melanie Mitchell points out, neural networks become ever more opaque with growing depth and complexity. This is also due to the fact that a machine learning system autonomously finds its own way to produce the most accurate result (Mitchell 2019). So if a system is not only opaque in its inner workings, but its output is also based on human-inscrutable features, it becomes rather tricky, if not impossible, to do any form of quality assurance on the system. A trustworthy system, according to the European Commission's expert group on AI, is also a system which satisfies the principle of explicability: it needs to be transparent about why it generated a certain output (European Commission, p.19), something which a system relying on human-inscrutable features will have a tough time doing.

Another drawback of these systems' setup is that subsymbolic systems may be well suited for perceptual or motor tasks for which humans cannot easily define rules, but are weak when it comes to logic and reasoning (Mitchell 2019). As Buckner notes, modeling the difference between causation and correlation is a characteristic weak spot for deep learning (Buckner, p.21). So there are definite limits on what kind of information can validly be provided, and an obligation on users to put the results into context.
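This correlation-versus-causation weak spot can be illustrated with a toy sketch, again using synthetic data and a plain least-squares fit rather than any system discussed above: a model that latches onto a spurious correlate looks excellent on its training data, and fails as soon as the correlation breaks.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000

# Causal feature: x1 actually determines the label y.
x1 = rng.normal(size=n)
y = (x1 > 0).astype(float)

# Spurious feature: x2 tracks the label almost perfectly in the training
# data (say, an artifact of how the data was collected) but has no causal
# link to it.
x2 = y + 0.1 * rng.normal(size=n)

# A plain least-squares "model" fit on both features plus an intercept.
X = np.column_stack([x1, x2, np.ones(n)])
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# At deployment the collection artifact is gone: x2 is now unrelated noise,
# the correlation is broken, and accuracy collapses toward chance.
x1_new = rng.normal(size=n)
y_new = (x1_new > 0).astype(float)
x2_new = rng.normal(size=n)
X_new = np.column_stack([x1_new, x2_new, np.ones(n)])
acc = np.mean((X_new @ w > 0.5) == y_new)
```

The fit puts most of its weight on the cleaner, spurious feature, because nothing in the statistics distinguishes a cause from a correlate; that distinction has to be supplied by the user putting the result into context.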

Another problem is that since we are concerned with features which may or may not be junk, we really do not know whether something is of value or not. Furthermore, even if some feature is recognized as being predictive, this does not mean that it will tell us something about the causes and effects we are interested in (Buckner, p.16). Buckner sees this too, and believes that whether and how such features are used must depend on their relevance to our purposes and how we interpret them (Buckner, p.14). This, however, leaves the problem of how such an analysis is to be done. Buckner believes that establishing tools of taxonomy, etiological theory and causality will help us frame human-inscrutable features and make them more usable and predictable (Buckner, p.11). But there is, I believe, a valid suspicion here: not only would such a framing system as proposed by Buckner be man-made and perhaps distort the features by forcing them into a human context, but human-inscrutable features can also be held to be mere results of man-made training data and evaluations.

On a more abstract level, we can ask what good information derived from human-inscrutable features can actually do us. On the one hand, these systems seem to enhance our tooling, as becomes apparent in AlphaFold, without our knowing how. But a result obtained without our knowing how it was produced does not seem to enhance our knowledge in any meaningful way. It is a bit like giving somebody a fish instead of teaching the person how to fish (Buckner, p.10).

All this does not take away the credit due to a system that finds patterns, as in the case of AlphaFold. Such systems seem suited to providing insights in fields of vast and detailed data, where the result may trigger progress regardless of whether our knowledge of the causes increases. This may work well for many projects in natural science, but less so for philosophy, where it is arguably the way in which we derive a conclusion that matters. And this, I hold, goes for many of mankind's projects. As we do not know about the validity of the data (junk or inscrutable, biased by humans), we cannot depend on it blindly, and we surely should not make life-changing decisions based on such calculations. On the ethical side of things, we will find it hard to hold a system or its engineers accountable for any results the system may produce if the level of abstraction makes its inner workings ever more opaque.


Alpaydin, Ethem (2016). Machine Learning: The New AI. MIT Press, Cambridge.

Buckner, Cameron (2020). Adversarial Examples and the Deeper Riddle of Induction.

Dreyfus, Hubert L. (1992). What Computers Still Can't Do: A Critique of Artificial Reason. The MIT Press, Cambridge.

European Commission (2019). Ethics Guidelines for Trustworthy AI.

McDaniel, Kris (2020). This Is Metaphysics. Wiley Blackwell. Chapter 1.4: Do Things Objectively Belong Together?

Mitchell, Melanie (2019). Artificial Intelligence. Farrar, Straus and Giroux. Chapter 2: Neural Networks and the Ascent of Machine Learning.

Platon (1952). Sämtliche Werke II. Phaidon Verlag, Wien.

