Computational research and AI
At the Wolfram Institute, I explore how ideas from mathematical physics can be carried into the framework of Wolfram models. My work focuses on designing the least complex discrete model capable of reproducing a macro phenomenon, with the potential to branch out and converge to known continuous structures in the limit of many computational steps. After implementing such a model and setting up a computational experiment, the empirical phase begins: making statistical observations, searching for emergent patterns, and identifying laws between well-designed observables as the complexity of the computational substrate generated by the experiment increases. It is exciting to think about how the seemingly continuous, unlimited, and unique world might emerge from a discrete one characterized by indivisibility, boundedness, and ambiguity.
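As a toy illustration of this loop, here is a minimal Python sketch in which a simple ±1 random walk stands in for a Wolfram model. That substitution is an assumption made purely for brevity, and Python is used only for illustration. The function names are hypothetical. The sketch computes the exact position distribution after a given number of steps, measures one observable (the variance), and checks how far the discrete distribution is from the Gaussian it is expected to approach.

```python
import math


def walk_distribution(steps: int) -> dict[int, float]:
    """Exact position distribution of a +/-1 random walk after `steps` steps."""
    dist = {0: 1.0}
    for _ in range(steps):
        nxt: dict[int, float] = {}
        for x, p in dist.items():
            # Each step moves left or right with probability 1/2.
            nxt[x - 1] = nxt.get(x - 1, 0.0) + 0.5 * p
            nxt[x + 1] = nxt.get(x + 1, 0.0) + 0.5 * p
        dist = nxt
    return dist


def variance(dist: dict[int, float]) -> float:
    """Observable: variance of the position."""
    mean = sum(x * p for x, p in dist.items())
    return sum((x - mean) ** 2 * p for x, p in dist.items())


def max_gap_to_gaussian(steps: int) -> float:
    """Largest gap between the walk's probabilities and the matching Gaussian."""
    dist = walk_distribution(steps)
    norm = math.sqrt(2 * math.pi * steps)
    # Reachable positions are spaced 2 apart, hence the factor 2 on the density.
    return max(
        abs(p - 2 * math.exp(-x * x / (2 * steps)) / norm)
        for x, p in dist.items()
    )


if __name__ == "__main__":
    for n in (4, 16, 64):
        print(n, variance(walk_distribution(n)), max_gap_to_gaussian(n))
```

In this toy case the "law" between observables is that the variance grows linearly with the number of steps, and the gap to the continuous limit shrinks as the number of steps increases.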
In the exploratory phase, before committing to a concrete project, I brainstorm ideas, read papers, distill suitable constructions, implement them in Wolfram Language, and create notebooks for experimentation and preliminary presentation to colleagues. If the direction is promising, I zoom in and develop it further, eventually contributing to the Wolfram Institute’s projects or paclets, or writing something myself. While I want full responsibility for the final product, the exploratory and polishing phases seem like natural candidates for AI assistance.
My use of AI
For two years I used AI in a basic way: for chat, reformatting, and corrections. When working in a corporate environment, I was anxious about unintentionally sharing context and about who would be responsible for correctness. The peak of my AI usage was taking advantage of most of CodeCompanion’s functionality for Neovim, except agents.
A turning point came with the release of Claude Opus 4.5. I began experimenting with agentic workflows and got used to offloading tasks to Claude using only guidance and specifications, without touching the code directly. The results were surprisingly good. My first “wow” moment was vibe-coding a Swift app to manage and time-track my weekly tasks—generated entirely from a screenshot of my wife’s Excel table. I have never programmed in Swift and have never looked at the generated code, yet the app works perfectly. It feels like we are approaching a world where anyone can define the app they need and let the AI build it, like selecting a song on a jukebox.
Rapid prototyping—getting from A to B without deep dives—also seems essentially solved. When improvements are needed, the simplest approach is to direct the AI to reuse fragments of earlier code or templates. When I have time or need specific architecture, I still refine things myself, but AI gives me the option to strategically postpone that work.
Given this shift, I think it is time to consider agentic workflows and researcher best practices for computational research in the AI era. Since general standards and guidelines still seem to be missing, and I don’t have time to dive too deep, I decided to stay within the Claude ecosystem and began writing a plugin that captures my exploratory workflow.
Computational research plugin
The plugin—still a work in progress—bundles several Claude skills and tools useful in computational research. Currently it includes:
- wolfram-notebook, which creates a Wolfram notebook from a prompt via Markdown import without touching the front end (an idea by sw1sh).
- computational-exploration, which scaffolds a research project and performs an initial exploration.
The generated structure looks like this:
Infrageometry/
├── CLAUDE.md
├── Infrageometry1.nb
├── Code/
│   ├── Tools.wl
│   ├── Infrageometry.wl
│   └── InfrageometryVisualization.wl
├── Resources1.nb
├── Resources/
└── Article/
    ├── article1.tex
    ├── notes1.tex
    └── references.bib
The skill searches arXiv and Wolfram Community for relevant papers and resources, downloads them, writes summaries, and produces organized notes with citations on the topics I specify. Since I always want to write the final article myself, these notes serve as well-formatted extended memory for both me and Claude: I can dump ideas and resources, and Claude handles their completion, organization, and polishing.
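To make the scaffolding step concrete, here is a minimal Python sketch that creates an empty skeleton with the same layout as the tree shown earlier. It is not the plugin's actual implementation. The function name scaffold_project and the choice to create empty placeholder files are assumptions, and Python is used only for illustration.

```python
from pathlib import Path
import tempfile


def scaffold_project(root: Path, name: str) -> list[Path]:
    """Create an empty project skeleton and return the created file paths.

    Hypothetical helper: the file names mirror the generated structure,
    but the real skill may fill these files with content.
    """
    files = [
        "CLAUDE.md",
        f"{name}1.nb",
        "Code/Tools.wl",
        f"Code/{name}.wl",
        f"Code/{name}Visualization.wl",
        "Resources1.nb",
        "Article/article1.tex",
        "Article/notes1.tex",
        "Article/references.bib",
    ]
    project = root / name
    created = []
    for relative in files:
        path = project / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch(exist_ok=True)  # creates the file without overwriting content
        created.append(path)
    # Resources/ starts empty; downloaded papers would be stored here later.
    (project / "Resources").mkdir(exist_ok=True)
    return created


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for path in scaffold_project(Path(tmp), "Infrageometry"):
            print(path.relative_to(tmp))
```

Because directories are created with exist_ok=True and files are only touched, running the sketch twice on the same folder leaves existing notes untouched.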
The core MCP servers are Wolfram MCP (or the unofficial wolfram-mcp with LSP support) for evaluating Wolfram Language code and creating notebooks, and arXiv-mcp for searching and downloading arXiv papers. Planned additions include:
- notes-to-article — moves a selected part of the notes into the article and integrates it cleanly.
- list-topics — extracts research topics from resources and notes.
- setup-experiment — creates a computational experiment notebook for a given topic.
- polish-research — refines and improves research artifacts.
Missing pieces
Several foundational components still seem necessary before agentic computational research becomes seamless:
- A central database for both verified and incomplete math. Something Lean-like but tolerant of partially defined objects, which would lower the barrier to using it. This would remove the need to stitch together dry math papers and allow researchers to focus on novel contributions. A future mathematician could input a new result in any form (even blackboard photos), and the AI would place it correctly; readers could then generate a study-ready paper tailored to their needs.
- A clear definition of an agent. What exactly is an agent? Can an agent create other agents? Can such agents be reused across contexts?
- An orchestration graph. A workflow graph including agents, tools, and humans, ideally compilable into a minimal version where most tasks are performed by tools—a more capable successor to LLMGraph.
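The orchestration graph from the last item could look roughly like the Python sketch below. It is only one possible reading of the idea: the Task type, the task names, and the compile_minimal rule (reassign an agent task to a tool when a deterministic tool is known to suffice) are all hypothetical, and nothing here reflects how LLMGraph works.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass(frozen=True)
class Task:
    name: str
    performer: str  # "agent", "tool", or "human"
    depends_on: tuple[str, ...] = ()


def execution_order(tasks: list[Task]) -> list[str]:
    """Order tasks so that every task runs after its dependencies."""
    graph = {task.name: set(task.depends_on) for task in tasks}
    return list(TopologicalSorter(graph).static_order())


def compile_minimal(tasks: list[Task], tool_capable: set[str]) -> list[Task]:
    """Hand agent tasks over to tools wherever a tool is known to suffice."""
    return [
        Task(task.name, "tool", task.depends_on)
        if task.performer == "agent" and task.name in tool_capable
        else task
        for task in tasks
    ]


# Hypothetical workflow: an agent explores, a human decides, an agent builds.
workflow = [
    Task("search_papers", "agent"),
    Task("summarize", "agent", ("search_papers",)),
    Task("approve_direction", "human", ("summarize",)),
    Task("setup_experiment", "agent", ("approve_direction",)),
]

if __name__ == "__main__":
    print(execution_order(workflow))
    for task in compile_minimal(workflow, tool_capable={"search_papers"}):
        print(task.name, task.performer)
```

Keeping the performer as plain data on each node means the same graph can be executed as is or compiled down, and human checkpoints stay explicit in both versions.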
Human role
AI will surely make certain hard skills obsolete. Paradigm shifts are exciting, and letting things go and starting anew deepens life experience. But several human roles remain essential:
- Curiosity — the drive to explore a direction and push it as far as possible through questions and tasks.
- Ideas — identifying an AI-unsolvable problem that matters to your group of humans and challenging yourself to solve it.
- Coordination — defining the workflow for agents and deciding when humans intervene.
- Communication — spreading enthusiasm among humans about your problem and convincing them it is worth their time and resources.