Computer Scientists Should Read Wittgenstein

by Viyan Poonamallee

May 10th, 2023

LPHI 3118

Table of Contents

Introduction
Wittgenstein and Turing
Computer Science as an Empirical Practice
Stochastic Models
AGI and the Evacuation of Philosophy
Conclusion
References

Introduction

This paper examines the schism between Wittgenstein’s ordinary language philosophy and the traditional empirical science that presupposes formal and symbolic representations of logic, in relation to the philosophical history of computer science. Wittgenstein’s connection to Turing will be investigated and then contrasted with the opposing view of formal and empirical logic in computer science, showing a distinction between two philosophies that serve as highly important paradigms in the history of computer science and artificial intelligence. Finally, we will examine how recent work has given way to a spontaneous abandonment of both philosophical outlooks, now that artificial intelligence research belongs to an industry that is uninterested in inheriting this philosophical and methodological history.

Wittgenstein and Turing

In Cambridge, 1939, Ludwig Wittgenstein taught a series of lectures on the foundations of mathematics. In these lectures, there is a notable record of attendance by one Alan M. Turing, a young mathematician and logician who had received his PhD only a year prior. Wittgenstein, likely less interested in Turing as a human being than as a board to bounce arguments off, enthusiastically brought him into the conversation of his lectures. “Suppose I say to Turing, ‘This is the Greek letter sigma’, pointing to the sign σ. Then when I say ‘show me the Greek sigma in this book’, he cuts out the sign I showed him and puts it in the book.—Actually these things don’t happen … because we have all been trained from childhood to use such phrases as ‘This is the letter so-and-so’ in one way rather than another” (Wittgenstein, 1975, p. 20). This statement in particular is about training, something that Wittgenstein’s late philosophy emphasizes to a great degree. For him, we all come into our existence in the world through the process of learning, particularly learning techniques for functioning in various circumstances. This builds to Wittgenstein’s larger argument on mathematics, defining it less as a discovery of natural phenomena than as an invented language constructed in accordance with an inherited background (Wittgenstein, 1975, p. 22). Like most other activities in Wittgenstein’s idea of the human condition, math is a constructed technique and applied method, rather than something to be found existing prior to our engagement with it. Turing, being a logician of the analytic tradition, was naturally opposed to this conception, favoring instead a formalistic and symbolically immutable conception of math and the logic that governs it. And that’s it. Beyond those lectures, the recorded history of interaction between the two thinkers is sparse to nonexistent.
With how brief and impersonal these recorded interactions are, the connection between Turing and Wittgenstein can conceivably be read as notable, but ultimately historically unimportant. Two highly influential thinkers met at a prestigious university once and argued about math. It’s not the first time gays have fought, and it will certainly not be the last. Yet something interesting happens. In 1950, more than ten years later, Turing would go on to write what is largely considered to be the most influential text on artificial intelligence in history. In the very first paragraph, he writes, “This should begin with definitions of the meaning of the terms ‘machine’ and ‘think.’ The definitions might be framed so as to reflect so far as possible the normal use of the words” (Turing, 1950, p. 433). He starts his argumentation by acknowledging the ordinary use of language, an idea central to Wittgenstein’s work, which aims at bringing “words back from their metaphysical to their everyday use” (Wittgenstein, 2009, p. 53e). During the section dedicated to refuting arguments against him, Turing has an entire dialogue with the specter of ordinary language philosophy in the subsection “The Argument from Informality of Behavior,” where he grapples with the issues relating to the denial of inner rules and representations. Wittgenstein famously denies that same notion of inner rules: “For our forms of expression, which send us in pursuit of chimeras, prevent us in all sorts of ways from seeing that nothing extraordinary is involved” (Wittgenstein, 2009, p. 48e). Especially interesting is that at no point does Turing try to assert that we certainly have those inner rules, arguing instead that whether we do is empirically indeterminate (Turing, 1950, p. 449). Towards the end, Turing’s proposition for how to go about actually getting a computer to demonstrate human behavior is to model the mind of a human child and then train it (Turing, 1950, p. 457).
The final paragraph suggests teaching a machine as one would teach a child the meanings of things, by pointing and naming (Turing, 1950, p. 460), in a description that strikingly resembles the ideas of teaching meanings within Wittgenstein’s writing (Wittgenstein, 2009, p. 9e). Years after their recorded interactions, Turing is thinking of Wittgenstein, of language games. To this day, despite the implied connection, the philosophical link between Turing and Wittgenstein is ambiguous. They worked in different fields, and to what extent Turing was amenable, opposed, or ambivalent to Wittgenstein’s larger philosophy is a matter of speculation. However, it is difficult to deny the evidence of one influencing the other, and it is emblematic of Wittgenstein’s influence on computer science holding more weight than is often credited. There are even veritable historical accounts of his importance in other work within the development of computer science (Liu, 2021).

Computer Science as an Empirical Practice

As it has developed, computer science has diverged significantly from any association with Wittgenstein’s ordinary language philosophy, borrowing more from the strict empiricist traditions through its maturation. The ‘sci’ part of comp-sci isn’t for nothing, after all. Computer science is thus defined as a field of empirical inquiry that deals in experimentation, one that “develops scientific hypotheses which it then seeks to verify by empirical inquiry” (Newell and Simon, 1976, p. 120). Most important to this view for our purposes is the presupposition of a general symbolic language that can be used to model intelligent thought in nature. “Symbols lie at the root of intelligent action, which is, of course, the primary topic of artificial intelligence” (Newell and Simon, 1976, p. 114). This notion of mental processes as symbolic and formal is typically referred to as, or associated with, the computational model of the mind (Fodor, 2010, p. 226), one in which thought, for all empirical purposes, is a function of a logically perfect set of objective rules. This model also carries a strong implication of behaviorism, where the mind is reduced to empirical demonstrations of itself. While computer science is not explicitly a behaviorist model of inquiry, behaviorism is greatly amenable to the philosophical background of computer science, “sufficiently vague so that it is not terribly difficult to give them [behaviorism and Gestalt psychology] information processing interpretations” (Newell and Simon, 1976, p. 120). It should be noted that at this point we have completely left the Wittgensteinian account of computation. Early interpretations of his work after his death often posit him as compatible with behaviorist models of the mind, an easy conclusion to come to, given his emphasis on practical application being part of the criteria for performing thought. However, Wittgenstein himself denies this interpretation (Wittgenstein, 2009, p. 109e).
An interlocutor says to Wittgenstein, “[Y]ou will surely admit that there is a difference between pain behaviour with pain and pain-behaviour without pain[,]” to which Wittgenstein replies, “Admit it? What greater difference could there be?” (Wittgenstein, 2009, PPF 304e). While application is important, it isn’t just simple application, where the problem of consciousness is experimentally removed from the equation. Particularly, Wittgenstein’s language games are constructed to pose a refutation of a general form of logic or language altogether, made fundamentally incompatible with the formalistic interpretation of thought used in computer science. There can’t be a truly descriptive set of symbols that formally describe all forms of language, for there is no common denominator between them. “Instead of pointing out something common to all that we call language, I’m saying that these phenomena have no one thing in common” (Wittgenstein, 2009, p. 35e). None of this is to say that Wittgenstein has been ousted from the field forever, although he is less prevalent in the AI discourse. As we’ll see next, he’ll be back.

Stochastic Models

Strangely enough, we see a reversal with current AI models in computer science, a movement away from thought as represented by immutable symbolic language. To illustrate this, let’s look at a practical example. I’ve created an interactive model that roughly approximates the process of stochastic language generation. Let’s call this model BadGPT, a technically inaccurate name as it is not a large language model, and isn’t even a neural network for that matter, but I need a glib name that engages the tech that everyone is currently talking about. BadGPT generates sentences based on a mathematical construct called Markov chains that are used on an organized corpus of text; this can be considered a highly basic training process. For said training process, I fed it the entire text of Philosophical Investigations, which it then organized into a workable corpus. After that, the actual algorithm is relatively simple: pick a word to start the sentence with, and once BadGPT has that word, simply evaluate what the most likely word is to come based on the training data and stop when you see a period. Please give it a try yourself to see it in action.
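For readers who want to see the mechanics, the procedure just described can be sketched in a few lines of Python. This is not BadGPT’s actual source, which is not reproduced in this paper; it is a minimal illustration under stated assumptions. The names `train` and `generate`, the toy corpus, and the 50-word cap are all hypothetical, and the sketch samples the next word in proportion to how often it followed the current one, where a stricter reading of “most likely” would always pick the single most frequent follower.

```python
import random
from collections import Counter, defaultdict


def train(text):
    """Count, for every word in the corpus, which words follow it."""
    words = text.split()
    table = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        table[current][following] += 1
    return table


def generate(table, start, rng=None, max_words=50):
    """Walk the chain from `start`; stop at a period, a dead end, or the cap."""
    rng = rng or random.Random()
    sentence = [start]
    while not sentence[-1].endswith(".") and len(sentence) < max_words:
        followers = table.get(sentence[-1])
        if not followers:
            break  # this word was never followed by anything in the corpus
        candidates = list(followers)
        weights = list(followers.values())
        # Only the previous word is consulted; the rest of the sentence is ignored.
        sentence.append(rng.choices(candidates, weights=weights)[0])
    return " ".join(sentence)


# Toy corpus standing in for the full training text (an assumption for illustration).
corpus = "the word is used in the game. the game is played with the word."
model = train(corpus)
print(generate(model, "the"))
```

Note that `generate` looks only at the last word it produced. Nothing in the loop checks whether the sentence as a whole holds together.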

If, when you run it, you get absolute nonsense back, it’s because it is nonsense. The main problem that makes it so incoherent is the fact that the determination of each word in the sequence is ignorant of context; the larger string isn’t checked for any semblance of structural cohesiveness. However, this is not to show how bad of a programmer I am (I’m actually pretty good if anyone is hiring), but to show the underbelly of these models. Like all stochastic language models, BadGPT creates nonsense by design based purely on probability. It is, at its core, a program that simply executes the probabilistic generation of words. The important thing to understand is that the greatest stochastic models operate on the same principles. By design, ChatGPT does not at any point conduct symbolic or meaningful analysis of the shapes it arranges. It simply has an absurd amount of data that it is trained on, which it can then use to reproduce what someone would probably say in response to a query. The new language model, the stochastic model, is not a formal one. Rather, it operates in quite the Wittgensteinian fashion. Taking from Turing’s original emphasis on the ‘training’ of a model, a stochastic model must learn before it can function, and the implementation of symbolic language is unimportant to its calculations. Philosophically, what we get out of this methodology isn’t quite pure Wittgenstein, but a somewhat bizarre amalgamation of associated thoughts. Functionally, the stochastic model is uninterested in inner rules and representations, or any general form upon which the developer builds an algorithm. In fact, neural networks are built to be essentially opaque to user and developer alike (Grimes, 2022), so truly no one knows what’s actually going on in there. Instead of any truly formal immutable logic, the demonstration is all that is needed to meet the criteria of proper functioning. Simply input data, and check whether the output meets your criteria for success.
Going back to the source, rather than ask, ‘Can machines think?’ someone like Turing goes for ‘Can machines convincingly act like they’re thinking?’ (Turing, 1950, p. 434). The model isn’t concerned with anything beyond a convincing imitation. A close reader will see that, while associated, this isn’t true to Wittgenstein, as it still falls too close to behaviorist models of thought. More accurately, Wittgenstein is subsumed into behaviorism, information science, and the computational model of the mind, which is closer to early positivist readings of him than to the current understanding of his work. The stochastic model is somehow both interpretations of computer science and neither at the same time.

AGI and the Evacuation of Philosophy

Here’s where things go off the rails. The big topic of discussion regarding the future of artificial intelligence technology is the possibility of artificial general intelligence (AGI), meaning AI that equals or surpasses humans in intellectual capability. An easy benchmark for people thinking about it would be the moment that a program is capable of performing a function in relation to new stimuli without ever needing a programmer to design a contingency for the function itself. It’s an incredibly popular idea within the public consciousness, presumably due to its revolutionary implications and its ideological accessibility, conditioned by decades of pop culture and science fiction describing fictionalized versions of what AGI implies. What’s worth noting is that the people making the technology also particularly fixate on the idea of it. Companies like OpenAI state their mission to be centered around AGI, ensuring that “artificial general intelligence (AGI)—by which we mean highly autonomous systems that outperform humans at most economically valuable work—benefits all of humanity” (OpenAI, 2018). For context, the entire charter that this line is taken from is about OpenAI’s preparations for AGI. On another page regarding preparation and development, OpenAI illuminates their desire to “tackle alignment problems both in our most capable AI systems as well as alignment problems that we expect to encounter on our path to AGI” (Leike et al., 2022). This is a good time to remind the reader that this is a hypothesis about a possible technology that does not currently exist. Keep that in mind as you read about their “path to AGI.” OpenAI is presupposing AGI as an inevitable consequence of their technological development, an assertion that seemingly isn’t motivated by the simple pursuit of knowledge through experimentation.
Granting that this isn’t simply magical thinking, let’s assume that they see AGI as a natural perfection of the stochastic model. If we look at the actual technology, there are plenty of reasons that the ChatGPTs of the world are not the gateway to the AGI that these organizations are so confident in their ability to manifest. On the basis of energy, they’re a nightmare. Ecologically, a trained Transformer emits nearly 58 times as much CO2 as the average human and requires exponentially more energy usage for even the most granular improvements (Bender et al., 2021, p. 612). There are numerous issues fundamental to the idea that all you need to make these models better is a larger dataset (Bender et al., 2021, pp. 613-615). This isn’t even to mention issues in general conduct, where in order to keep ChatGPT successfully moderated so as not to instruct its users on how to make a pipe bomb, OpenAI used Kenyan laborers who were paid less than $2 an hour to fine-tune its system (Perrigo, 2023). This is the cost of maintaining these models at their current level of competence. Now, imagine what would need to be done to reach AGI. This isn’t an empirical inquiry anymore. If we actually look at what the people helming this research have to say, we’ll find their statements confusing at best. Demis Hassabis, chief executive officer at Google’s DeepMind, adumbrates the ideology of the project as “solving intelligence, and then using that to solve everything else” (Simonite, 2020). Whatever is meant by “solving intelligence” is ultimately left ambiguous. The article ends with “If everything works out as Hassabis hopes, his ethics board will eventually have real work to do” (Simonite, 2020). What does that even mean? The idea of “solving intelligence” resembles something closer to alchemy than science, and the ethics board statement treads a thin line between idealistic and psychotic. Even its Wittgensteinian influences seem to be only implicit at this point.
All critical reflection has been subsumed into a form of thinking that completely evacuates any philosophy, favoring a mission that has no interest in using philosophy to guide method and instead uses technology to guide everything else. “[W]e abandon any goals of theoretical insight into tasks and treat LMs as ‘just some convenient technology’” (Bender et al., 2021, p. 615). None of this is even to speak about whether AGI is possible, imminent, or inevitable. What’s meant to be acknowledged here is the apodictic certainty that practitioners in a ‘field of empirical inquiry’ seem to have for it. As Newell and Simon said, “Society often becomes confused about [the purpose of computer science], believing that computers and programs are to be constructed only for the economic use that can be made of them” (Newell and Simon, 1976, p. 114). What was once considered a confusion of the layman in regard to a form of inquiry now applies just as effectively to the producers of research in the field. While they may not admit this outright, simply looking at how the big players talk lays it bare: they want to develop, and are very certain that they can develop, a new instrumental technology for the purpose of affecting the world. I won’t spend too much time speculating upon the philosophical motivations for this turn, but it is worth acknowledging that for companies like these, the technologies that they presuppose, like AGI, are what justify their very existence. Digital technology is an industry now in a way that it never was in the 50s or 80s. The status of computer science as an empirical practice (and within it the subfield of artificial intelligence) has been reduced to nothing, because to allow doubt in the tech optimist’s hypothesis is to doubt your own viability in the market.

Conclusion

We need philosophy back. I would first posit that to return to the formal symbolic interpretation of computer science would be a mistake. Its confidence in a static set of rules, disinterest in what isn’t immediately observable, and certainty in the computational model of the mind, all lay the groundwork for allowing phrases like “solving intelligence” to be squared as valid ends to research. You can only “solve intelligence” if you believe that all the preliminary work in attaining necessary tools has already been done, that philosophy has no more questions to answer and that the task is exclusively for science now. Rather, I would suggest a perspective closer to Wittgenstein and closer to ordinary language philosophy. Wittgenstein’s language games still propose an importance to application and demonstration that is methodologically useful to the work of computer science, but it doesn’t necessarily presume all important content to be within those boundaries. An openness to the simple, the average, that which can’t be immediately deconstructed and subsumed into the empirical, while still accepting it as present within the picture can go a long way in grounding thought. “We have got on to slippery ice where there is no friction, and so, in a certain sense, the conditions are ideal; but also, just because of that, we are unable to walk. We want to walk: so we need friction. Back to the rough ground!” (Wittgenstein, 2009, p. 51e).

Note

I don’t go into it in this paper, but the cited text, Lydia H. Liu’s “Wittgenstein in the Machine” (2021), is a strong account of Wittgenstein’s influence on computation through Margaret Masterman, one of Wittgenstein’s closest students, who was instrumental to the development of computational linguistics. It is another strong tie between Wittgenstein and the history of computer science.

References

Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the Dangers of Stochastic Parrots: Can Language Models be Too Big? Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–623. https://doi.org/10.1145/3442188.3445922

Fodor, J. (1980). Methodological solipsism considered as a research strategy in cognitive psychology. Behavioral and Brain Sciences, 3(1), 63-73. doi:10.1017/S0140525X00001771

Grimes, B. (2022, November 25). Artificial Neural Network: Here’s everything you need to know about black box of AI. Interesting Engineering. https://interestingengineering.com/innovation/artificial-neural-network-black-box-ai

Liu, L. H. (2021). Wittgenstein in the Machine. Critical Inquiry, 47(3), 425–455. https://doi.org/10.1086/713551

Leike, J., Schulman, J., & Wu, J. (2022, August 24). Our approach to alignment research. OpenAI. https://openai.com/blog/our-approach-to-alignment-research

Newell, A., & Simon, H. A. (1976). Computer Science as Empirical Inquiry. Communications of the ACM, 19(3), 113–126. https://doi.org/10.1145/360018.360022

OpenAI. (2018, April 9). OpenAI Charter. https://openai.com/charter

Perrigo, B. (2023, January 18). OpenAI used Kenyan workers on less than $2 per hour: Exclusive. Time. https://time.com/6247678/openai-chatgpt-kenya-workers/

Simonite, T. (2020, April 2). How Google Plans to Solve Artificial Intelligence. MIT Technology Review. https://www.technologyreview.com/2016/03/31/161234/how-google-plans-to-solve-artificial-intelligence/

Turing, A. M. (1950). Computing Machinery and Intelligence. Mind, LIX(236), 433–460. https://doi.org/10.1093/mind/lix.236.433

Wittgenstein, L. (2009). Philosophical Investigations. (G. E. M. Anscombe, P. M. S. Hacker, & J. Schulte, Trans.). Wiley-Blackwell.

Wittgenstein, L. (1975). Wittgenstein’s Lectures on the Foundations of Mathematics: Cambridge, 1939 from the Notes of R.G. Bosanquet and others. (C. Diamond, Ed.). University of Chicago Press.