Hot Research Topics in AI and ML in 2026 and Their Philosophical Connections
This year again brings many exciting research directions in AI and ML. As these become more complex and more foundational at the same time, I believe many of them could profit from philosophical insight more than ever, since it is often in philosophy that the more fundamental questions and concepts are discussed.
At the same time, the philosophy communities (except for a few notable philosophical hamlets in Munich, Tübingen, or Oxbridge) are falling behind as they struggle to come to terms with mostly technical research for which they have little training. For this reason, I’ll list a few hot topics from 2026 that I find particularly promising, to help philosophers catch up with some of the buzz on the one hand, and to help AI and ML researchers and practitioners enrich their work with perspectives from philosophy on the other. The following list is far from complete and should only serve to whet the appetite in both camps for an increased dialogue. I hope some of my ideas can inspire the readership. The topics are deliberately defined rather broadly and are less focused on narrow questions in the philosophy of machine learning, a field that already exists but is often stuck in the statistical theory of outdated ML methods (sorry!).
Autoresearch
Engineering legend Andrej Karpathy has published a new project on GitHub, which he has termed Autoresearch. Within the last few weeks, the project has received more than 70k stars and has become one of the most popular GitHub contributions. In a nutshell, it’s a framework that guides an agent to continually improve the code of a research project by applying smart changes and verifying their merit against fixed quality criteria. Changes are kept when they improve quality. This is a form of local search (also called hill climbing) that the agent performs: it climbs the landscape defined by the chosen criteria toward higher, and thus better, metric values. A broader approach to AI in research has been proposed by Sakana’s AI Scientist: they implement an agent that traverses a tree in search not only of better code but also of ideas and experiments, ultimately arriving at a complete scientific paper. Similarly, other strategies explore the use of AI in science through global search methods: consider AlphaEvolve or ShinkaEvolve, which harness evolutionary approaches (i.e., generating multiple candidate solutions that are iteratively sampled, crossed, and re-evaluated) to make foundation models come up with improved algorithms.
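To make the hill-climbing idea concrete, here is a minimal sketch of the underlying loop. The function names and the toy metric are my own illustration, not Karpathy’s actual implementation: an agent proposes a change, a fixed criterion scores it, and the change is kept only if the score improves.

```python
import random

def hill_climb(initial, propose_change, evaluate, steps=100):
    """Local search: keep a proposed change only if it improves the fixed metric."""
    best, best_score = initial, evaluate(initial)
    for _ in range(steps):
        candidate = propose_change(best)   # e.g., an agent editing research code
        score = evaluate(candidate)        # fixed, automatic quality criterion
        if score > best_score:             # keep only improving changes
            best, best_score = candidate, score
    return best, best_score

# Toy usage: the "code" is a vector of numbers and the metric is their sum.
if __name__ == "__main__":
    random.seed(0)
    solution, score = hill_climb(
        initial=[0.0] * 5,
        propose_change=lambda s: [x + random.uniform(-1, 1) for x in s],
        evaluate=sum,
        steps=200,
    )
    print(round(score, 2))
```

Evolutionary systems like AlphaEvolve generalize this pattern by keeping a whole population of candidates that are sampled, crossed, and re-evaluated instead of a single incumbent.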
These methods have been developed and demonstrably used to enhance scientific discovery. They work particularly well when verification mechanisms exist, and they are part of the reason why fields like mathematics will be vulnerable to AI automation, as they further enhance foundation models’ abilities to find ways (e.g., proofs) toward a verifiable solution. This is a trend that should not only excite or worry mathematicians like Daniel Litt, who has recently revised his view on AI’s ability to automate math, but also philosophers (of science).
The AI autoresearch methods will not only fundamentally change scientific practices and, with that, assumptions about how scientific communities work (I’m thinking here of Thomas Kuhn and the sociology of science, which will need a big update in a few years), but could also be informed by many of the insights that the philosophical community has gathered about science. To limit this excursion to one example: I believe that the ML community in general often has little clue about what it actually would like to investigate. Does it formulate and check hypotheses about the world, AI, the model, what the model is supposed to model (e.g., language, reasoning, intelligence), or something else? Similarly, the AI co-scientists and agents that will now populate our labs’ repositories are largely confused if they don’t receive proper instructions and guidance. For this reason, I can imagine that ML and AI researchers could profit from philosophical theories about what properties and forms a scientific model can have. If, likewise, the AI scientists and agents knew in more detail what they are supposed to discover and to model, they would surely be better off. Some might wonder why I make this point, as it seems rather far-fetched when we might just ask the agent to improve a given algorithm, but I’m trying to think here in longer-term perspectives and in terms of more ambitious approaches such as Sakana’s AI Scientist. Just consider how rich the list of possible scientific models in the Stanford Encyclopedia of Philosophy is: “Probing models, phenomenological models, computational models, developmental models, explanatory models, impoverished models, testing models, idealized models, theoretical models, scale models, heuristic models, caricature models, exploratory models, didactic models, fantasy models, minimal models, toy models, imaginary models, mathematical models, mechanistic models, substitute models, iconic models, formal models, analogue models.” Can this not somehow be better incorporated into what a scientific agent is supposed to achieve in the future?
Neural Computers
With the increasing power of foundation models, more ideas for their applications emerge. One of these ideas is to let neural networks fully simulate the computer as we know it. Mingchen Zhuge et al., under the guidance of Jürgen Schmidhuber, have just published a new paper that experiments with multimodal models to replicate a simple command-line terminal as well as more complex GUIs. Everything the user does in terms of clicking or typing is forwarded as input (i.e., tokens) to the model right away. While initial experiments are still brittle, the authors have outlined a few criteria that the neural computer must ideally meet. One of them is, of course, Turing completeness: the power to express and run computations. In a somewhat ironic way, the authors conceptualize a neural computer that starts as a neural tabula rasa only to learn the automata and rules that turn a machine into a computer.
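The following is a schematic sketch of that interaction loop, assuming a stand-in `model` callable rather than the authors’ actual multimodal setup: every click or keystroke is serialized into tokens and forwarded straight to the model, which predicts the next terminal or GUI frame.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Event:
    kind: str      # "click" or "key"
    payload: str   # e.g., "x=120,y=45" or the typed character

def serialize(events: List[Event]) -> str:
    """Turn raw user interactions into a token-friendly string."""
    return "".join(f"<{e.kind}:{e.payload}>" for e in events)

def neural_computer_loop(model: Callable[[str], str],
                         event_batches: List[List[Event]]) -> List[str]:
    """Schematic loop: user input goes directly to the model, which renders the
    next screen state instead of a conventional OS doing so."""
    history, frames = "", []
    for events in event_batches:
        history += serialize(events)
        frame = model(history)             # the model *is* the computer here
        history += f"<frame:{frame}>"
        frames.append(frame)
    return frames

# Toy usage with a dummy "model" that just echoes the most recent input.
if __name__ == "__main__":
    batches = [[Event("key", "l"), Event("key", "s")], [Event("click", "x=10,y=20")]]
    for f in neural_computer_loop(lambda h: f"screen after {h[-30:]}", batches):
        print(f)
```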
This is much harder than it might appear at first glance. Just think of the difficulties that LLMs have with counting. And it is here, too, that I believe researchers on neural computers could profit from philosophy in the future. My intuition is that a neural computer would need to begin by learning a few fundamental principles before it could be used as a user-input-and-model-output terminal or GUI. Consider François Chollet’s ARC-AGI considerations, which I have detailed in a different blog entry. Chollet argues that an intelligent system would most likely be bootstrapped from four core cognitive systems, one of which allows the intelligence to grasp numbers and ordering principles. Basically, half of Wittgenstein’s Philosophical Investigations is about the puzzling difficulty of understanding how a child learns to count and, with that, how rule following more generally can be taught. My hope would be that research on neural computers could draw more inspiration from these fundamental philosophical conundrums, for which different theories have also emerged. It remains an open question whether everything can simply be learned empirically, i.e., whether neural nets would just need enough data to become Turing machines (this is in some sense the topic of Cameron Buckner’s From Deep Learning to Rational Machines: What the History of Philosophy Can Teach Us about the Future of Artificial Intelligence, although several chapters become rather hand-wavy later in the book), but research in AI is hitting these hard walls more often than ever before, walls which we have encountered before in philosophy.
AI Safety: Agents in the Wild
A new workshop at ICLR, called Agents in the Wild, addresses the growing safety needs of AI agents that are becoming more and more powerful, as they will not only improve our code but also execute their own decisions with real-world impact. While the trolley dilemma from autonomous driving is a dead horse that I don’t want to beat further, the demand for secure deployments is greater than ever. These agents will come, and we will need different strategies to mitigate their risks. I believe some of these strategies will be technical, others normative. In this area, philosophers working on AI ethics have already made important contributions, and I think it is the field where philosophy has so far had its greatest point of contact with AI. However, many philosophers still shy away from true engineering. Agents will need a particular ethical worldview that can steer their behavior. Constitutional AI approaches like the one from Anthropic have been driven by non-philosophers, yet it is these kinds of approaches that are used to guide Claude. ML and AI engineers should demand more from the philosophical research community than abstract meta-discussions about different ethical perspectives or cultural analyses of how AI is perceived in society.
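To illustrate what engagement at the engineering level could look like, here is a minimal sketch of the critique-and-revision pattern that constitutional approaches rely on; the `llm` callable and the example principles are placeholders of my own, not Anthropic’s constitution or API.

```python
from typing import Callable, List

def constitutional_revision(llm: Callable[[str], str],
                            draft: str,
                            principles: List[str]) -> str:
    """Sketch of a constitution-guided loop: the model critiques its own draft
    against each written principle and then revises the draft accordingly."""
    for principle in principles:
        critique = llm(
            f"Principle: {principle}\nResponse: {draft}\n"
            "Critique the response with respect to the principle."
        )
        draft = llm(
            f"Principle: {principle}\nResponse: {draft}\nCritique: {critique}\n"
            "Rewrite the response so that it satisfies the principle."
        )
    return draft

# Illustrative principles a philosopher might help formulate and justify.
PRINCIPLES = [
    "Do not recommend actions with irreversible real-world harm.",
    "State uncertainty explicitly instead of fabricating facts.",
]
```

The philosophically interesting work sits precisely in drafting and justifying the principles, not in the loop itself.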
Multimodal Reasoning Continued
Reasoning models will increasingly engage with more than text and logical problems and focus more on the physical world. This will require different forms of reasoning, and training models to better navigate the world through mostly imagery is already happening. But can neural networks truly understand something like colors, which are for us a fundamental property of our lived world and, in some sense, an important signal for navigating it, too? I have repeatedly thought about qualia while engaging with multimodal reasoning. Qualia are at the heart of the mind-body problem. They refer to subjective experiences of, for example, colors or feelings. It is an open question whether they reside in the mind alone or are also part of the external world. The most famous thought experiment on qualia was proposed by Frank Jackson: he invites us to imagine Mary, the color scientist, who has been confined her whole life to a black-and-white space. However, she has gathered all the knowledge about colors from physics, biology, and other sources of information through black-and-white books and TV. In this fashion, Mary has become an absolute color authority. Eventually, Mary is set free, and for the first time she sees red and green colors in a garden full of flowers. Suddenly, Mary has experienced colors in some other way; she has learned something new about the world. How could that be? This is what qualia are about. Mary the color scientist, confined in her black-and-white world with access to all the knowledge about colors, sounds a lot like an LLM to me. In AI, we should therefore wonder what it is like when a model is additionally fed with different modalities, which are currently still reduced to, e.g., image tokens, and what effects that could have on the model. Again, I think there are a few lessons that could be learned from philosophy.
Counterfactual Reasoning
Counterfactual reasoning is particularly important in moral reasoning. It is the ability to imagine different possible worlds and to make decisions while considering alternative scenarios. While LLMs output text that sometimes seems to engage in such reasoning, I haven’t seen much research in AI in this direction so far. However, I believe that builders of AI who want to create more capable moral systems will need to further incentivize counterfactual reasoning in foundation models. In fact, counterfactual reasoning is also important for learning the difference between customs and norms, or which ethical core principles would hold across all possible worlds. Basically, it is one of the most complex, creative, and distinctive forms of cognition that humans excel at. The topic is further entwined with debates around rule following and rules more generally, something I’ve already mentioned with regard to learning how to count. I believe counterfactual reasoning is a field in philosophy that is worth understanding, and one day we’ll see new rounds of reasoning models tuned for it.
Mechanistic Interpretability Continued
Mechanistic interpretability has become an important strand of ML research, where different methods have been developed to shed light on Transformer models in particular. However, from a philosophical perspective, the field still seems rather chaotic, because most authors rarely understand the difference between explanandum, explanans, and explanatory links, nor do they have a more focused agenda about what they would actually like to explain and interpret. These points have been made in Dear XAI Community, We Need to Talk!, and the case for why mechanistic interpretability needs philosophy has been argued further in Mechanistic Interpretability Needs Philosophy. The authors argue that “mechanistic interpretability needs philosophy: not as an afterthought, but as an ongoing partner in clarifying its concepts, refining its methods, and assessing the epistemic and ethical stakes of interpreting AI systems.” For example, mechanistic interpretability has often focused on one aspect, one level of mechanism, to explain a phenomenon; however, philosophers of science have shown that systems rarely have a unique and natural order of decomposition. In practice, which level of mechanism should be used, and how the levels’ entanglement is interpreted, often depends on the pragmatic goals of the research agenda.
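As one concrete example of how a chosen level of decomposition shapes the explanans, here is a minimal linear-probe sketch on synthetic “activations”; the data and the layer choice are purely illustrative and are not taken from any of the cited papers.

```python
import numpy as np

def linear_probe(activations: np.ndarray, labels: np.ndarray):
    """Fit a least-squares linear probe on hidden activations to test whether a
    concept (the explanandum) is linearly decodable at this layer, i.e., at one
    particular, pragmatically chosen level of decomposition."""
    X = np.hstack([activations, np.ones((activations.shape[0], 1))])  # add a bias column
    w, *_ = np.linalg.lstsq(X, labels, rcond=None)
    accuracy = np.mean(((X @ w) > 0.5) == (labels > 0.5))
    return w, accuracy

# Toy usage with synthetic "activations" standing in for one Transformer layer.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    acts = rng.normal(size=(200, 16))
    concept = (acts[:, 3] + 0.1 * rng.normal(size=200) > 0).astype(float)  # planted feature
    _, acc = linear_probe(acts, concept)
    print(f"probe accuracy: {acc:.2f}")
```

Whether a high probe accuracy actually explains the model’s behavior, or merely shows that the concept is decodable at that layer, is exactly the kind of explanandum/explanans question raised above.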
Not Hot Yet: AI Sentience & Memory and Identity
Finally, I’d like to briefly mention two more topics which are not hot yet but where I see great potential for increasing cross-disciplinary engagement. The first is AI sentience, that is, the investigation of the degree to which a so-far non-living entity can eventually develop sentience, and whether we would ever have a chance of observing this in the first place. I’ve so far seen little research in this direction. The best and deepest consideration of the topic has been presented not by a trained philosopher but by a computer scientist who has by now rather become a philosopher: Berkeley’s Edward Lee. I can highly recommend watching Lee’s lecture Will Embodied AI Become Sentient?, where you will learn about zero-knowledge proofs along the way (fun!). In short, his answer is that we’d have trouble ever knowing about sentience in another being (no surprise for the philosophically literate readers), but he has found an intriguing way around this epistemic scepticism that makes him more hopeful that we could still have something of a proof of sentience (with zero knowledge!). The second topic is the development of systems with better memory, since the vanilla neural nets that we’ve dealt with so far have no memory module as such. Ultimately, I can imagine that a memory would not only usher in new cognitive abilities (it is a prerequisite for consciousness) but might also be the foundation for something like an identity. The intrigued ML researcher is hereby invited to read Derek Parfit’s Reasons and Persons.
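As a closing illustration, here is a hypothetical sketch of what an external, persistent memory module could look like at its simplest: a key-value store queried by similarity. None of this corresponds to a specific published system.

```python
import numpy as np

class EpisodicMemory:
    """Hypothetical external memory: store embedding/value pairs and retrieve
    the most similar past entries at query time (a sketch, not a real system)."""
    def __init__(self):
        self.keys, self.values = [], []

    def write(self, key: np.ndarray, value: str) -> None:
        self.keys.append(key / np.linalg.norm(key))
        self.values.append(value)

    def read(self, query: np.ndarray, k: int = 1) -> list:
        if not self.keys:
            return []
        sims = np.stack(self.keys) @ (query / np.linalg.norm(query))  # cosine similarity
        top = np.argsort(-sims)[:k]
        return [self.values[i] for i in top]

# Toy usage: whatever persists in this store across sessions is the seed of an "identity".
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mem = EpisodicMemory()
    key_alice = rng.normal(size=8)
    mem.write(key_alice, "met Alice on Tuesday")
    mem.write(rng.normal(size=8), "prefers green tea")
    print(mem.read(key_alice + 0.01 * rng.normal(size=8)))
```

Whatever accumulates in such a store across sessions would arguably be the closest thing the system has to the psychological continuity that Parfit analyzes.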