Building Your Own AI Operating System

May 28, 2026

Claude, Obsidian, and the state of the personal-knowledge AI stack in 2026

Directed by Igor · researched by Claude Opus 4.8

Everybody's selling an "AI OS." Most of it's vapor. Here's what the term actually means in 2026.

In 2026, “AI operating system” has stopped being a single idea and become at least four overlapping ones: a conceptual framing in which a large language model acts as the kernel of a new kind of computer, a formal academic architecture for scheduling and managing fleets of AI agents, a marketing label for consumer platforms that bake AI into the traditional operating system, and a quieter, more personal meaning in which an AI agent maintains and reasons over a body of knowledge that you own. This report concentrates on that last meaning, because it is the one a builder can actually stand up today without a research lab or an enterprise contract. The center of gravity for the personal version is a pattern that crystallized over the past year: a plain-text notes vault — overwhelmingly Obsidian — paired with an AI agent — overwhelmingly Claude, usually through Claude Code — where the agent reads, writes, links, and synthesizes the vault on your behalf so that knowledge accumulates instead of being re-derived on every question. The report traces where that pattern came from (Andrej Karpathy’s “LLM as kernel” and “LLM Wiki” ideas), how people are wiring Claude into Obsidian in practice (three distinct integration architectures), and how the alternative note-apps-plus-AI systems compare (Notion, Mem, Tana, Capacities, Logseq, NotebookLM, and others). The honest summary up front is this: the tooling has matured enough that a personal AI OS is now an assembly job rather than a research project, the substrate choice (where your knowledge lives) matters more than the model choice, and the central trade-off is between commercial platforms racing to be your AI OS for you and the Obsidian-plus-Claude path that lets you build and own one yourself.

1. What “AI Operating System” Actually Means in 2026

The phrase is used loosely enough that any serious discussion has to start by separating the senses, because they imply completely different things to build and completely different things to buy. Pinning down the term is not pedantry here; the word “OS” is doing real work, and which work depends entirely on who is saying it. There are four live definitions in circulation, and they sit on a spectrum from the purely conceptual to the immediately practical. Understanding all four lets you see why the personal version — the one this report is about — is both the most achievable and the least hyped.

1.1 The conceptual root: the language model as kernel

The intellectual origin of the term, at least in its current usage, is Andrej Karpathy, a co-founder of OpenAI and former head of AI at Tesla, who in late 2023 began describing large language models not as chatbots but as the kernel process of a new kind of operating system ¹. The argument was an analogy: in a traditional computer the kernel coordinates the central processor, memory, peripheral devices, the file system, and the network, and Karpathy proposed that a language model could play the same coordinating role for a new computing layer, orchestrating input and output across text, audio, and vision, calling tools such as a code interpreter or a calculator, browsing the internet, and reading and writing a file system ^1,5. In his “Intro to Large Language Models” talk he extended the metaphor concretely, casting the model itself as the processor, the context window as a kind of working memory or “RAM,” external tools and other models as peripherals, and a vector database — a store of numerical representations of text that allows similarity search — as a form of disk storage ⁵. The point of the framing was that the model becomes the thing that runs the system, rather than a feature bolted onto an existing one. At Sequoia Capital’s AI Ascent event in 2024 he sharpened it into a prediction, saying that roughly speaking everyone in the field was now trying to build what he called a kind of “LLM OS,” and that this, rather than any single academic approach, was how progress toward very general AI was actually unfolding ⁴.

This conceptual definition matters because it set the vocabulary that everything downstream borrowed. When people now say they are building an “AI OS” for their notes, they are usually reaching, knowingly or not, for Karpathy’s image of a model that sits at the center and coordinates everything else. It is worth being precise that this version is a frame, not a product. Karpathy did not ship an operating system; he offered a way of seeing where the technology was heading, and a great deal of the 2025–26 activity around personal knowledge systems is people taking that frame and instantiating it at small scale on their own machines ^4,5.

1.2 The academic formalization: the agent kernel

The second definition is the one that has been worked out most rigorously, and it comes from systems research rather than from a tweet. A team at Rutgers University, working under the banner of an “AIOS Foundation,” published a paper titled “AIOS: LLM Agent Operating System,” first posted in early 2024 and subsequently accepted at the Conference on Language Modeling in 2025 ³. Their motivation is grounded and unglamorous: once you have many AI agents trying to use the same underlying model and the same tools at once, you hit exactly the problems a traditional operating system was invented to solve — scheduling, resource allocation, isolation between tasks, and keeping context straight across interactions ³. Letting every agent reach directly into the model and the tools, they argue, leads to inefficient and even unsafe resource use, where one agent can flood the model with requests while others wait ³. Their answer is to embed the language model into an operating system as, in their phrasing, the “brain” of the system, and to build a kernel around it that provides the classic services — scheduling, context management, memory management, storage management, and access control — for the agents running on top ³. The slogan that captures their vision is “LLM as OS, Agents as Apps,” with the model as the substrate and individual agents as the applications that run against it through a defined system-call interface and a software development kit ³.
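The flooding problem is easy to see in miniature. What follows is a toy sketch, not the AIOS kernel: it assumes a plain round-robin policy across per-agent queues, and the names FairScheduler, submit, and drain are hypothetical, chosen only for illustration.

```python
from collections import OrderedDict, deque


class FairScheduler:
    """Toy round-robin scheduler: one queue per agent, one request per turn."""

    def __init__(self):
        # Maps agent name -> queue of pending requests, in first-seen order.
        self._queues = OrderedDict()

    def submit(self, agent, request):
        self._queues.setdefault(agent, deque()).append(request)

    def drain(self):
        """Yield (agent, request) pairs, cycling across agents until all queues are empty."""
        while any(self._queues.values()):
            for agent, queue in self._queues.items():
                if queue:
                    yield agent, queue.popleft()


sched = FairScheduler()
for i in range(3):
    sched.submit("noisy", f"n{i}")  # one agent floods the queue first
sched.submit("quiet", "q0")         # a second agent arrives late
order = list(sched.drain())
```

Although the noisy agent queued three requests before the quiet one queued any, the quiet agent is served on the first pass rather than waiting behind the whole backlog.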

This formalization is important for two reasons. First, it shows that “AI operating system” is not only a metaphor; there is a concrete, peer-reviewed architecture with the name, aimed squarely at the infrastructure problem of running agents at scale ³. Second, and more usefully for a personal builder, it surfaces the vocabulary that a good personal system quietly needs anyway: memory, context, storage, and access control. Even a single-user notes-and-agent setup is, in miniature, making decisions about all four — what the agent remembers between sessions, what it can see at once, where things are stored, and what it is allowed to change. The AIOS work is enterprise- and research-scale, but its categories translate directly down to the kitchen-table version.

1.3 The consumer capture of the term

The third definition is the one you are most likely to meet in a headline, and it is the loosest. Through 2025 and into 2026, the major platform companies began applying “AI operating system” to the traditional operating system with artificial intelligence integrated into its core, where the system interprets natural language, maintains awareness of what you are doing, and executes multi-step tasks rather than waiting for explicit commands ³³. The flagship examples are Microsoft’s Copilot features on so-called Copilot+ PCs, Apple’s “Apple Intelligence” across its devices, and Google’s Gemini integration, each marketed as moving the computer from a thing you operate by clicking to a thing you talk to ³³. A related but distinct industry framing, attributed in secondary reporting to the Stanford Digital Economy Lab, describes the shift from “AI assistant” to an “AI operating layer” in which the AI manages a user’s context, memory, and workflows much as a traditional operating system manages hardware resources ³². That “operating layer” phrasing is genuinely useful and worth keeping, because it captures what is common to all the serious definitions: the AI moves from being a feature you invoke to being the layer through which work flows.

A caution is warranted here. Much of the publicly available writing in this consumer category is search-optimized content with thin sourcing, and the market-sizing figures that circulate alongside it (various dollar projections for an “AI OS market”) trace back to single market-research vendors and could not be independently corroborated for this report; they should be treated with skepticism and are deliberately omitted rather than repeated as if settled ^32,33. The reliable takeaway from the consumer category is qualitative, not quantitative: the term has been claimed by the platform giants to describe AI woven into the conventional desktop and phone, which is a different project from the personal-knowledge system this report is about, and it is important not to confuse the two when reading the trade press.

1.4 The meaning that matters here: the personal AI OS

The fourth definition is narrower than the others and is the working definition for the rest of this report. A personal AI operating system, in the sense that is actually buildable today, is the orchestration layer where an AI agent maintains, organizes, and reasons over a body of knowledge that you own and control, sitting between you and your raw material and turning a passive archive into something that thinks back. It borrows Karpathy’s “model as the coordinating layer” frame, it quietly relies on the AIOS categories of memory and context and storage and access, and it expresses the “AI operating layer” idea at the scale of one person’s working life. Crucially, it is defined less by the model and more by the substrate: the knowledge has to live somewhere durable, structured, and legible to an agent, or there is nothing for the operating layer to operate on. This is why, in practice, the personal AI OS conversation in 2026 is really a conversation about notes — and overwhelmingly about one notes application, Obsidian, paired with one model family, Claude. The next section explains the pattern that makes that pairing more than a productivity hack, and why it represents a genuine architectural choice rather than a brand preference.

The following table summarizes the four senses so they can be held apart cleanly for the remainder of the discussion.

Sense of “AI OS”	Who uses it this way	What it actually means	Example	Buildable by you today?
Model-as-kernel (conceptual)	Karpathy and the research-adjacent community	A language model coordinating I/O, tools, files, and other models as the center of a new computing layer	Karpathy’s “LLM OS” framing ^1,5	Only at small, personal scale
Agent kernel (academic)	Systems researchers	A formal kernel providing scheduling, memory, context, storage, and access control for many agents	The AIOS paper and project ³	Not realistically as an individual
AI-in-the-OS (consumer)	Microsoft, Apple, Google, trade press	The conventional desktop/phone operating system with AI woven into its core	Copilot+ PCs, Apple Intelligence, Gemini ³³	No — you buy it
Personal AI OS (this report)	Builders, knowledge workers	An agent that maintains and reasons over a body of knowledge you own	Obsidian vault maintained by Claude ^2,6	Yes — assembly job

2. The Pattern Underneath the Hype: Karpathy’s LLM Wiki and Why It Beats Retrieval

Beneath the surface noise of “build an AI second brain” tutorials sits a single, surprisingly clean idea that gives the whole movement its intellectual backbone. It also comes from Karpathy, and it is more recent and more practical than his kernel metaphor. Understanding it is the difference between assembling a personal AI OS that compounds in value and assembling one that is just a chatbot pointed at a folder. The idea is usually called the “LLM Wiki” pattern, and it is worth spending real time on, because almost every credible Claude-and-Obsidian build in 2026 is, knowingly, an implementation of it ^16,18,31.

2.1 The idea file and the argument against plain retrieval

Karpathy published the pattern not as a product or even as code, but as what he called an “idea file”: a single short document, posted as a public GitHub gist, meant to be copied directly into an AI agent such as Claude Code so that the agent can build out the specifics in collaboration with you ². The document’s purpose is to communicate a high-level approach rather than to prescribe an implementation, and its central claim is a critique of how most people currently use language models with their documents ². That common approach is retrieval-augmented generation, usually shortened to RAG: you load a pile of files, and at the moment you ask a question, the system fetches the chunks that look most relevant and feeds them to the model to generate an answer ^2,16. RAG works, and it is the default for good reasons, but Karpathy’s objection is that under RAG the model is rediscovering your knowledge from scratch on every single question, with no accumulation — the system never gets smarter about your material, it just re-reads it each time ^2,16.
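The statelessness Karpathy objects to can be shown in a few lines. This is a deliberately naive sketch: real retrieval systems rank chunks by embedding similarity, whereas the score function here uses plain word overlap as a stand-in, and both function names are hypothetical.

```python
def score(query, chunk):
    """Toy relevance: number of distinct query words that appear in the chunk."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))


def retrieve(query, chunks, k=2):
    """Rank every chunk against the query from scratch and return the top k."""
    ranked = sorted(chunks, key=lambda ch: score(query, ch), reverse=True)
    return ranked[:k]


chunks = [
    "Obsidian stores notes as markdown files",
    "Claude Code runs in a terminal",
    "Markdown files are plain text",
]
top = retrieve("markdown files", chunks)
again = retrieve("markdown files", chunks)  # same work repeated, nothing learned
```

Nothing is written back between the two calls: the second query re-ranks the same raw chunks exactly as the first did, which is the lack of accumulation the critique is aimed at.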

The LLM Wiki pattern proposes the opposite ordering. Instead of retrieving from raw documents at query time, you have the model compile your knowledge once: it reads a source, extracts what matters, updates the relevant existing pages, creates new ones where needed, fixes the cross-references between them, and notes explicitly where new information contradicts what was already written ^16,17. The knowledge gets compiled once and then stays compiled, so that when you later ask a question, the cross-references are already in place, the contradictions have already been surfaced, and the synthesis already reflects everything that has ever been ingested ^16,31. The shift is from stateless retrieval to stateful, compounding knowledge — and that one reordering is what separates a genuine knowledge system from a search box with extra steps ¹⁶. A single new source, in this model, can legitimately trigger edits across a dozen or more existing pages, and that ripple is not a bug; it is the compounding effect doing its job ²⁰.
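The compile-once step can be sketched as a merge into persistent state. This is a minimal illustration under loud assumptions: in a real build the model does the extraction and writes markdown pages, whereas here the facts arrive pre-extracted as (page, field, value) tuples and the wiki is a plain dictionary. The ingest function and its return shape are hypothetical.

```python
def ingest(wiki, source_id, facts):
    """Merge pre-extracted (page, field, value) facts into the wiki.

    Returns the set of pages touched and a list of contradictions found.
    A contradiction keeps the existing value and is recorded for review.
    """
    touched, conflicts = set(), []
    for page, field, value in facts:
        entry = wiki.setdefault(page, {})
        existing = entry.get(field)
        if existing is None:
            entry[field] = {"value": value, "source": source_id}
        elif existing["value"] != value:
            conflicts.append((page, field, existing["value"], value))
        touched.add(page)
    return touched, conflicts


wiki = {}
ingest(wiki, "paper-a", [("Obsidian", "format", "markdown"),
                         ("Claude Code", "kind", "command-line agent")])
touched, conflicts = ingest(wiki, "blog-b", [("Obsidian", "format", "json"),
                                             ("Obsidian", "sync", "optional")])
```

The second source both extends an existing page and disagrees with it, and the disagreement is surfaced at ingest time rather than left to be rediscovered at query time.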

2.2 The three layers and the division of labor

The pattern organizes everything into three cleanly separated layers, each with a strict ownership rule, and the discipline of keeping them separate is what keeps the system trustworthy over time ^16,20. The first is the raw or intake layer: the unprocessed material you want the system to learn from — research papers, article clippings, meeting notes, video transcripts, your own fleeting thoughts — kept immutable as a record of what actually came in ¹⁶. The second is the wiki layer: a directory of human-readable markdown pages, one per concept or entity, with summaries, comparisons, and synthesis, fully owned and maintained by the agent, which creates the pages, updates them, and keeps the cross-references consistent ^16,20. The third is simply the query surface: questions you ask, which run against the compiled wiki rather than against the raw pile ¹⁶. The cleanest way to describe the working relationship is that the human curates the sources and directs the analysis while the agent summarizes, files, cross-references, and maintains consistency — the human directs, the machine executes, and the knowledge accumulates between them ^16,17.

A detail that separates the mature implementations from the naive ones is provenance tracking, which is the system being honest about where each claim came from. Good implementations of the pattern tag every claim on a wiki page as either extracted directly from a source, inferred by the model’s own synthesis, or ambiguous because the sources disagree, and they summarize the mix in the page’s metadata so you can always tell what your wiki actually knows from what it merely guessed ²⁰. Some go further and run a linting step that flags pages which have drifted into mostly speculation, or attach confidence levels to fast-moving or single-source claims so they are not mistaken for well-established facts ²⁰. This matters enormously for anyone using the system for real work, because the failure mode of an AI-maintained knowledge base is not that it stops working; it is that it quietly fills with confident-sounding synthesis that no source actually supports. Provenance is the antidote, and it is the kind of thing you only appreciate once you have been burned by its absence.
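A provenance summary and a speculation lint can be sketched together. The three tag names follow the categories described above; the function names, the claim representation, and the 0.5 threshold are assumptions made for illustration, not taken from any particular implementation.

```python
from collections import Counter

TAGS = ("extracted", "inferred", "ambiguous")


def provenance_summary(claims):
    """Count claims per provenance tag; claims are (text, tag) pairs."""
    counts = Counter(tag for _, tag in claims)
    unknown = set(counts) - set(TAGS)
    if unknown:
        raise ValueError(f"unknown provenance tags: {sorted(unknown)}")
    return {tag: counts.get(tag, 0) for tag in TAGS}


def is_mostly_speculation(claims, threshold=0.5):
    """Flag a page when the share of inferred claims exceeds the threshold."""
    summary = provenance_summary(claims)
    total = sum(summary.values())
    return total > 0 and summary["inferred"] / total > threshold


page_claims = [
    ("Obsidian stores notes as markdown", "extracted"),
    ("Most users keep one vault", "inferred"),
    ("Plugins are the main AI entry point", "inferred"),
    ("Release date of the skills repository", "ambiguous"),
]
summary = provenance_summary(page_claims)
```

The summary dictionary is the kind of thing that could be written into a page's metadata, and the lint gives a cheap way to list pages that deserve a human review.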

2.3 Why plain markdown and “files over apps” is the load-bearing choice

It would be easy to treat the choice of plain markdown files as incidental, a matter of taste, but it is in fact the most consequential decision in the whole architecture, and the reasons are worth making explicit. Markdown is a plain-text format, which means the files are human-readable, portable to any editor, future-proof against any single company’s decline, and — critically — read natively by language models without any conversion or special tooling ^18,29. The argument that the personal-knowledge crowd makes, and that aligns with the long-standing “files over apps” philosophy associated with Steph Ango, the chief executive of Obsidian, is that your knowledge should not be trapped inside a single vendor’s database where it is hostage to that vendor’s pricing, feature decisions, and continued existence ²⁹. When your knowledge is a folder of plain files, it is agent-agnostic: it is organized in a way that any current or future AI tool can parse and work with, so you are not betting your second brain on one company’s roadmap ²⁹. This is the deeper logic behind why Obsidian, specifically, has become the default front-end for the pattern — not because of any AI feature it ships, but because it is essentially a polished, richly linked viewer and editor sitting on top of an ordinary folder of markdown files that you own outright ^18,29.

There is a second, subtler reason markdown wins, which is that the format already encodes structure the agent can use. Obsidian’s flavor of markdown supports wiki-style links between notes written in double brackets, structured metadata at the top of each file, inline tags, and embeds of one note inside another, and these conventions give an agent a native grammar for building the interlinked graph the LLM Wiki pattern depends on ^6,8. A folder of unstructured text would force the agent to invent its own linking scheme; Obsidian’s conventions hand it one for free, and the result is a knowledge graph the human can also navigate visually. This combination — files you own, plus a lightweight structure both human and machine understand — is precisely why the substrate choice dominates the model choice. You can swap Claude for a different model later; ripping your knowledge out of a proprietary database is a far more painful migration, which is exactly the pain that pushes people toward this pattern in the first place ³⁴.
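Because the link grammar is regular, an agent or a script can recover the graph with very little machinery. The sketch below is simplified: it handles plain links, aliased links, heading links, and embeds, but it is not a full parser for Obsidian's markdown, and the extract_links name is hypothetical.

```python
import re

# Optional "!" marks an embed; the target stops at "#", "|" or "]".
WIKILINK = re.compile(r"(!?)\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def extract_links(text):
    """Return (links, embeds) as lists of target note or file names."""
    links, embeds = [], []
    for bang, target in WIKILINK.findall(text):
        (embeds if bang else links).append(target.strip())
    return links, embeds


note = (
    "See [[Karpathy]] and [[LLM Wiki|the pattern]]. "
    "![[diagram.png]] Also [[Obsidian#Plugins]]."
)
links, embeds = extract_links(note)
```

Running this over every file in a vault yields the edges of the knowledge graph, which is the same structure the human sees in Obsidian's graph view.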

2.4 The compounding loop in practice

Put together, the pattern produces a loop that is the actual engine of a personal AI OS, and seeing it as a loop rather than a set of features is what makes the whole thing click. Information is captured into the raw layer with as little friction as possible; the agent reads it and synthesizes it into the wiki layer, updating and cross-referencing existing pages; you query the compiled wiki and get answers grounded in everything ingested to date; and — the part that makes it compound — the valuable answers and syntheses themselves become new pages in the knowledge base, so the system’s best output feeds back in as new input ¹⁶. Karpathy himself, by his own account, has been running a version of this for months, keeping an AI agent open in one window and Obsidian open in another, directing edits as he discusses topics and watching the graph of linked pages expand in real time; the often-quoted shorthand for his workflow is that Obsidian functions as the development environment, the model acts as the programmer, and the wiki becomes the codebase ¹⁷. That framing should be read as a vivid description rather than a literal technical claim, but it captures the felt experience reported across many builders: the vault stops being a place where you store things and becomes a place where understanding is continuously assembled ^17,28. The remainder of this report is, in effect, about how to build and operate that loop — first by examining the concrete ways people connect Claude to it, then by comparing the alternatives, and finally by mapping it onto your own situation.

3. How People Wire Claude into Obsidian: The Three Integration Architectures

When you get past the philosophy and ask the literal question — how does Claude actually touch the vault? — the field sorts cleanly into three architectures. They are not mutually exclusive; many serious setups run two or all three at once, because each is strong where the others are weak. Understanding the three as distinct choices, rather than as a single undifferentiated “use AI with Obsidian,” is the key to assembling a system that fits how you actually work. The three are the agentic path, in which an autonomous coding agent operates directly on the vault files; the conversational path, in which a chat client reaches the vault through a connector; and the in-app path, in which AI features live inside Obsidian itself as plugins. This section walks through each, then treats the two practical problems every setup has to solve — getting information in, and getting it back out — and closes with the failure modes and the governance that keeps the whole thing from corrupting your notes.

3.1 The agentic path: Claude Code in the vault

The architecture that has come to define the 2026 personal AI OS is to run Claude Code, Anthropic’s command-line agent, directly inside the folder that holds your Obsidian vault, so that it can read, write, reorganize, and synthesize the markdown files natively, with no copying and pasting ^24,28,30. The setup is simple in outline: install Claude Code, open a terminal positioned inside the vault folder, launch the agent, and have it read your folder structure and write a context file describing your vault so that from the first session it already understands the lay of the land ²⁸. That context file, conventionally named CLAUDE.md so that the agent reads it first in every session, is the single most important artifact in the whole setup; it holds your folder conventions, your writing rules, your active projects, and your voice, and it is the reason a well-configured agent’s output quality is so different from a cold prompt’s ^28,30. The often-repeated maxim in this community is that context beats prompts, every time: when the agent knows your structure, your projects, your rules, and your entire knowledge base, the quality of what it produces changes completely, and the setup compounds because every session leaves the agent better informed about how you think and work ²⁸.
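To make the context file less abstract, here is a rough sketch of drafting one from a vault's top-level folders. It assumes the file would be saved as CLAUDE.md, the name Claude Code loads at the start of a session; the draft_context function, the section headings, and the sample rule are all hypothetical, and in practice the agent usually writes this file itself.

```python
import tempfile
from pathlib import Path


def draft_context(vault, rules):
    """Return a markdown context file listing top-level folders and rules."""
    folders = sorted(
        p.name for p in Path(vault).iterdir()
        if p.is_dir() and not p.name.startswith(".")  # skip .obsidian and similar
    )
    lines = ["# Vault context", "", "## Folders"]
    lines += [f"- {name}/" for name in folders]
    lines += ["", "## Writing rules"]
    lines += [f"- {rule}" for rule in rules]
    return "\n".join(lines) + "\n"


# Demo on a throwaway vault so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("wiki", "raw", ".obsidian"):
        (Path(tmp) / name).mkdir()
    context = draft_context(tmp, ["Link concepts with [[wikilinks]]"])
```

A real context file would go further and describe active projects and voice, which no directory listing can supply; the point of the sketch is only that the file is ordinary markdown the agent can read and revise.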

The crucial enabling development for this path arrived at the start of 2026, when Steph Ango — Obsidian’s chief executive, who publishes on GitHub under the name kepano — released an official set of agent skills for Obsidian, a repository that gathered more than thirteen thousand stars within weeks of release ^6,7,8. The problem these skills solve is concrete and was a real obstacle before they existed: Claude Code does not, by default, know Obsidian’s particular file formats, so left to itself it would break the double-bracket link syntax, generate invalid data for Obsidian’s database files, or produce canvas files that simply would not open in the app ^6,8. The official skills — five of them — teach the agent the correct syntax for Obsidian-flavored markdown including links, callouts, metadata, tags, and embeds; for Bases, which is Obsidian’s structured-data layer of typed properties and views built on top of notes; for the JSON Canvas format used by Obsidian’s spatial whiteboards; for the Obsidian command-line interface; and for a web-extraction step that strips a web page down to clean markdown before saving it ^6,8. Because they follow an open agent-skills specification, they work not only with Claude Code but with other compatible agents, though the release was widely read as an implicit endorsement of Claude specifically as the AI companion for Obsidian — partly on technical grounds, since Claude Code reads files natively, and partly because Anthropic’s local-first, privacy-conscious posture aligns with Obsidian’s own “files over apps” philosophy ^6,7,8. The significance went beyond Obsidian users: it was the first time the creator of a major productivity tool had officially embraced the agent-skills approach and shipped production-quality skills for their own platform, which signals that tool vendors are starting to treat skills as the real integration layer for AI agents ⁷.
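The canvas problem is a good example of why format knowledge matters. The sketch below builds a two-node canvas using field names from the published JSON Canvas specification as I understand it (nodes with id, type, x, y, width, height; edges with id, fromNode, toNode). It is not the content of the official skill, the helper names are hypothetical, and the specification should be checked before relying on it.

```python
import json


def text_node(node_id, text, x, y, width=260, height=80):
    """A text node with the fields the JSON Canvas format expects."""
    return {"id": node_id, "type": "text", "text": text,
            "x": x, "y": y, "width": width, "height": height}


def edge(edge_id, from_node, to_node):
    """An edge connecting two nodes by id."""
    return {"id": edge_id, "fromNode": from_node, "toNode": to_node}


def dangling_edges(canvas):
    """Return ids of edges that point at a node id that does not exist."""
    ids = {node["id"] for node in canvas["nodes"]}
    return [e["id"] for e in canvas["edges"]
            if e["fromNode"] not in ids or e["toNode"] not in ids]


canvas = {
    "nodes": [text_node("n1", "Raw sources", 0, 0),
              text_node("n2", "Wiki pages", 400, 0)],
    "edges": [edge("e1", "n1", "n2")],
}
canvas_text = json.dumps(canvas, indent=2)  # what would be saved as a .canvas file
```

A check like dangling_edges is the sort of guard that catches an agent's malformed output before it reaches the app.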

On top of this official foundation, a small ecosystem of community implementations has grown up, almost all of them explicitly built on Karpathy’s LLM Wiki pattern, packaging the capture-synthesize-maintain loop into reusable skill sets that you point at your vault — examples include a self-organizing “claude-obsidian” project, an “obsidian-wiki” framework with the provenance-tagging discipline described earlier, and standalone implementations of the LLM Wiki itself ^19,20. The pattern across all of them is the same: a set of markdown instruction files that any agentic coding tool can read and execute, turning a generic agent into a vault-aware knowledge engine ^19,20. This agentic path is the most powerful of the three, because the agent can act autonomously across hundreds of files at once — updating tags across an entire vault after a taxonomy change in minutes rather than an afternoon, for instance — but that same power is also its main risk, which the governance discussion below addresses ^24,30.

3.2 The conversational path: MCP servers and Claude Desktop

The second architecture keeps the interaction conversational, inside a chat client like the Claude desktop application, and connects that client to your vault through the Model Context Protocol, an open standard introduced by Anthropic that lets an AI client securely connect to outside data sources ^36,46. In this setup you add an “Obsidian MCP server” — a small local program — to your Claude configuration, and from then on Claude can read, search, create, and edit notes in your vault by sending structured requests through the protocol rather than by you pasting text into the chat window ^43,36. The practical value is the same as the agentic path but the feel is different: instead of pasting context, the assistant pulls exactly what it needs, so you can ask it to reference your architecture notes while reviewing a problem, or have it search last week’s meeting notes for a specific decision, all from an ordinary conversation ⁴⁶.
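Adding a server is a matter of editing a small JSON configuration. The sketch below generates the shape Claude Desktop reads from its claude_desktop_config.json file, where an mcpServers object maps a server name to a command and its arguments. The server name, the example-obsidian-mcp command, and the --vault flag are placeholders, not the interface of any real server; consult the chosen server's own documentation for the actual values.

```python
import json


def mcp_config(server_name, command, args):
    """Build the dict shape Claude Desktop reads under the "mcpServers" key."""
    return {"mcpServers": {server_name: {"command": command, "args": list(args)}}}


# Placeholder values only: substitute the command your chosen server documents.
config = mcp_config("obsidian", "example-obsidian-mcp", ["--vault", "/path/to/vault"])
config_text = json.dumps(config, indent=2)
```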

There are two meaningfully different flavors of these servers, and the distinction is worth knowing before you pick one. The first flavor talks to Obsidian through a community plugin called the Local REST API, which exposes your vault over a local web interface on a fixed port, and the most widely used server of this kind is a Python-based tool that connects through that plugin and offers operations to list files, read their contents, search, and surgically patch sections of a note ^9,51,46. These REST-based servers require Obsidian to be running with the plugin enabled, because they are essentially talking to the live app ⁴⁶. The second flavor reads the vault files directly from disk and therefore works even when Obsidian is closed, since it does not depend on the app being open at all ^46,47. There is also a newer design philosophy worth flagging, sometimes called “AI-native,” in which the server deliberately does not expose low-level create-read-update-delete operations but instead offers higher-level, task-oriented tools that the model can reason about more effectively — for example, listing a directory with pagination so the model can explore a large vault incrementally without overwhelming its limited context, or a single write tool that handles appending, prepending, and overwriting through one parameter to remove ambiguity ^45,52. The conversational path is generally the gentlest on-ramp and the best fit when you mostly want to ask your vault things, and it complements rather than replaces the agentic path, which is better when you want the system to do large-scale work on the vault autonomously.
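The two task-oriented tools mentioned above can be sketched as pure functions. This is an illustration of the design idea only: the names list_page and write_note, the cursor scheme, and the mode strings are hypothetical and do not reproduce any specific server's tool definitions.

```python
def list_page(entries, cursor=0, limit=50):
    """Return one page of a directory listing plus the cursor for the next page."""
    ordered = sorted(entries)
    end = cursor + limit
    return {
        "items": ordered[cursor:end],
        "next_cursor": end if end < len(ordered) else None,  # None means done
    }


def write_note(existing, content, mode):
    """Single write tool: the mode parameter selects the behavior explicitly."""
    if mode == "append":
        return existing + content
    if mode == "prepend":
        return content + existing
    if mode == "overwrite":
        return content
    raise ValueError(f"unknown mode: {mode!r}")


first = list_page(["c.md", "a.md", "b.md"], cursor=0, limit=2)
second = list_page(["c.md", "a.md", "b.md"], cursor=first["next_cursor"], limit=2)
```

Pagination lets the model walk a large vault a page at a time instead of receiving thousands of paths at once, and rejecting unknown modes turns an ambiguous write into an explicit error.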

3.3 The in-app path: plugins and local models

The third architecture puts the AI inside Obsidian itself, as community plugins, and it is the most accessible of the three because it requires neither a terminal nor a configuration file — and it is the path that makes private, fully local operation easiest ^53,55. Obsidian ships with no native AI of its own, by deliberate design, but its plugin ecosystem fills the gap thoroughly ²⁴. Two plugins anchor the field. The first is Smart Connections, which popularized vault-aware semantic search — the ability to surface related notes based on meaning rather than exact keyword matches — by computing numerical representations of your notes, called embeddings, and offering a sidebar of conceptually related material as you write; it can run those embeddings entirely on your own machine using local models, storing them inside the vault, which keeps the feature private and offline ^53,55,57. The second is Copilot for Obsidian, an in-vault chat assistant whose explicit design goal is portability with no provider lock-in, letting you use whatever model you like while keeping your data yours; it offers a vault question-and-answer mode that chats with your whole vault and cites its sources, a project mode that builds context from chosen folders and tags which its own documentation likens to having a source-grounded research tool inside your vault, and, on its paid tier, an autonomous agent mode that calls tools on its own when relevant ^54,12. Beyond these two, the ecosystem includes Text Generator for template-driven content, Smart Composer for inline writing, and several others such as Local GPT, the BMO chatbot, and a “Smart Second Brain” plugin, each covering a slightly different layer of the experience ^55,60.
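Semantic search over embeddings reduces to comparing vectors. The sketch below uses hand-written three-dimensional vectors purely for illustration; a real plugin would obtain much longer vectors from an embedding model, and nothing here reflects Smart Connections' internal code. The function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def related(query_vec, index, k=2):
    """Return the k note names whose vectors are most similar to the query."""
    ranked = sorted(index.items(),
                    key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]


# Made-up vectors standing in for model-generated embeddings.
index = {
    "gardening": [0.9, 0.1, 0.0],
    "composting": [0.8, 0.3, 0.1],
    "tax filing": [0.0, 0.1, 0.9],
}
nearest = related([1.0, 0.2, 0.0], index)
```

The two notes that point in roughly the same direction as the query surface first even though no keyword is compared, which is the meaning-over-keywords behavior described above.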

The in-app path is also where the local-model story is most developed, which matters a great deal for anyone who cares about keeping sensitive notes off third-party servers. The two common local runtimes are Ollama and LM Studio, both of which expose a model on your own machine through an interface compatible with the standard chat-completion format, so the Obsidian plugins can point at a local address instead of a cloud provider ^55,60. A frequently recommended local configuration pairs Smart Connections for semantic search with Copilot for chat, both pointed at a local model, which is reported to cover the large majority of everyday “second brain” use — semantic linking plus conversation with your notes — without sending any vault content to the cloud at all ⁵⁵. The storage cost of the embeddings is modest, on the order of a few megabytes per thousand notes for a common embedding model, so even a large vault is cheap to index locally ^57,14. One genuine wrinkle is that locally stored embeddings live inside the vault and therefore have to be handled carefully when syncing across devices, regenerating rather than blindly copying under some sync methods ⁵⁵. The trade-off across the in-app path as a whole is that it is the easiest to start and the strongest on privacy, but its AI is generally less capable of large autonomous operations than a full coding agent — which is exactly why many people run an in-app plugin for fast private retrieval and a coding agent for heavy lifting.
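The storage figure can be sanity-checked with back-of-envelope arithmetic. The numbers below are assumptions, not measurements: 384-dimensional vectors stored as 4-byte floats, and one or two vectors per note. On-disk formats add overhead, so real files will be larger than the raw figure.

```python
def embedding_megabytes(notes, dims=384, bytes_per_value=4, vectors_per_note=1):
    """Raw vector storage in decimal megabytes, ignoring file-format overhead."""
    return notes * vectors_per_note * dims * bytes_per_value / 1_000_000


one_per_note = embedding_megabytes(1000)                      # about 1.5 MB
two_per_note = embedding_megabytes(1000, vectors_per_note=2)  # about 3.1 MB
```

Under these assumptions a thousand notes cost roughly one and a half to three megabytes of raw vectors, which is consistent with the order of magnitude cited above.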

3.4 The capture problem and the retrieval problem

Every personal AI OS has to solve two practical problems regardless of which architecture it uses, and tutorials often gloss over them even though they determine whether the system survives contact with daily life. The first is capture: information has to get into the raw layer with so little friction that you actually do it, because a knowledge system you do not feed is just an empty folder. Obsidian’s built-in daily-notes feature serves as a frictionless inbox for your own thoughts and quick logs, and the web-extraction skill in the official Obsidian skill set exists precisely to turn a messy web page into clean markdown worth keeping ^31,6,35. People also wire up scheduled capture — for instance, an automated step that pulls new items from another tool each morning, formats them as markdown, appends them to the vault, and triggers a synthesis pass, so the loop runs partly on its own ⁷⁰. The second problem is retrieval, and it is where the architectures genuinely differ: the in-app plugins lean on semantic search over embeddings to surface related material, the conversational MCP path lets the model search and read on demand, and the agentic path can read across many files and follow the links between them in a single pass ^46,53,9. A subtle but important point, made well by the documentation of the connector tools, is that Claude’s long context window combined with Obsidian’s structured links lets the model read several related notes, follow the wiki-links between them, and synthesize across the vault in one conversation — which is a different and often better experience than retrieving isolated chunks ⁴⁹. In practice the strongest setups solve capture once, generously, and then let whichever architecture fits the moment handle retrieval.
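The link-following style of retrieval described above can be sketched in a few lines. This is an illustrative assumption about how such a pass might work, not the behavior of any particular connector: the vault is modeled as an in-memory dictionary of note name to markdown text, and `extract_links`, `gather_context`, and `toy_vault` are hypothetical names.

```python
import re
from collections import deque

# Matches [[Target]], [[Target|alias]] and [[Target#Heading]]; captures only Target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")

def extract_links(markdown):
    """Return the note names a piece of markdown links to, in order, without duplicates."""
    targets = []
    for match in WIKILINK.finditer(markdown):
        target = match.group(1).strip()
        if target and target not in targets:
            targets.append(target)
    return targets

def gather_context(vault, start, max_depth=1):
    """Breadth-first walk from `start`, collecting notes up to `max_depth` links away."""
    visited = [start]
    queue = deque([(start, 0)])
    while queue:
        name, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand notes that are already at the depth limit
        for target in extract_links(vault.get(name, "")):
            if target in vault and target not in visited:
                visited.append(target)
                queue.append((target, depth + 1))
    return visited

# A toy vault, invented for illustration.
toy_vault = {
    "Project Alpha": "Kickoff notes. See [[Meeting 2026-05-01|the kickoff]] and [[Budget#Q2]].",
    "Meeting 2026-05-01": "Attendees are listed in [[People]].",
    "Budget": "Numbers live here.",
    "People": "Team roster.",
}

if __name__ == "__main__":
    print(gather_context(toy_vault, "Project Alpha", max_depth=2))
```

The notes returned by `gather_context` would then be placed into the model's context together, which is the difference between reading a connected neighborhood of notes and retrieving isolated chunks.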

3.5 Failure modes and the governance that prevents disaster

The honest part of this section is that giving an AI agent write access to a knowledge base you have spent years building is genuinely risky, and the people running these systems seriously have converged on a set of guardrails that are not optional. The most cited risk is that the agent makes mistakes when merging or updating existing notes: it is capable, but it is not infallible, and an unreviewed edit can silently corrupt a note you care about ³⁰. The near-universal recommendation is to put the vault under version control with git, which Obsidian supports through a community plugin, so that you can see exactly what the agent changed as a diff before you accept it, and roll back anything unwanted ^30,46. The discipline that follows from this is to build a habit of reviewing what the agent writes rather than trusting it blindly, to back up the vault before granting any write access, and to keep the raw intake layer immutable so that even if the wiki layer gets mangled, the source material survives intact ^30,16. The provenance-tagging practice described earlier is part of this governance too, because it lets you audit which claims are grounded and which are speculation ²⁰. A sensible posture, especially at the start, is to grant the agent read access broadly but write access narrowly, expanding what it is allowed to change only as you build trust in its behavior on your particular vault. This is, not coincidentally, exactly the access-control concern the academic AIOS work identified as fundamental ³. The table below lays the three architectures side by side on the dimensions that matter when choosing among them.

Architecture	How Claude touches the vault	Typical model	Strongest for	Main risk / requirement
Agentic (Claude Code in vault)	Agent reads and writes the markdown files directly	Claude via Claude Code (cloud)	Large autonomous work: synthesis, mass reorganization, the LLM-Wiki loop	Powerful enough to cause damage; needs git review and the kepano skills ^6,30
Conversational (MCP server)	Chat client reaches vault through a connector	Claude Desktop or any MCP client	Asking the vault questions in natural conversation	REST-based servers need Obsidian running; filesystem-based do not ^46,9
In-app (plugins)	AI features run inside Obsidian	Any: cloud or local (Ollama, LM Studio)	Private/offline use; fast semantic search; inline writing	Less capable of big autonomous operations; embedding sync care ^55,57
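The guardrails from Section 3.5 (narrow write access, an immutable raw layer, and review before acceptance) can be sketched as two small checks. This is an assumption-laden illustration rather than anyone's actual tooling: the folder names `sandbox` and `raw` are hypothetical, `is_write_allowed` and `preview_edit` are invented helpers, and the diff preview uses Python's standard `difflib` as a stand-in for the `git diff` a version-controlled vault would give you.

```python
import difflib
from pathlib import PurePosixPath

def _is_inside(path, directory):
    """True if `path` lies strictly below `directory` (both vault-relative)."""
    dir_parts = PurePosixPath(directory).parts
    return len(path.parts) > len(dir_parts) and path.parts[:len(dir_parts)] == dir_parts

def is_write_allowed(relative_path, writable_dirs, immutable_dirs, already_exists):
    """Decide whether the agent may write to a vault-relative path."""
    path = PurePosixPath(relative_path)
    # Reject absolute paths and any attempt to climb out of the vault.
    if path.is_absolute() or ".." in path.parts:
        return False
    # The raw intake layer is never writable, even for new files.
    if any(_is_inside(path, d) for d in immutable_dirs):
        return False
    # Inside a sandbox folder the agent may create or modify freely.
    if any(_is_inside(path, d) for d in writable_dirs):
        return True
    # Elsewhere it may only create new pages, never rewrite existing ones.
    return not already_exists

def preview_edit(relative_path, current_text, proposed_text):
    """Return a unified diff of what a proposed edit would change in one note."""
    diff = difflib.unified_diff(
        current_text.splitlines(keepends=True),
        proposed_text.splitlines(keepends=True),
        fromfile=f"a/{relative_path}",
        tofile=f"b/{relative_path}",
    )
    return "".join(diff)

if __name__ == "__main__":
    before = "# Budget\nQ2 total: 100\n"
    after = "# Budget\nQ2 total: 120\nReviewed: yes\n"
    if is_write_allowed("sandbox/Budget.md", ["sandbox"], ["raw"], already_exists=True):
        print(preview_edit("sandbox/Budget.md", before, after))
```

Widening the policy as trust grows then amounts to adding folders to `writable_dirs`, while the raw layer stays in `immutable_dirs` permanently.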

4. The Comparison Set: Other Note-Apps-Plus-AI Systems

A personal AI OS does not have to be built on Obsidian, and a fair assessment has to look hard at the alternatives, because several are genuinely good and one of them may already be where your knowledge lives. The note-apps-plus-AI field in 2026 is crowded and noisy, and a warning about the sources is in order before the comparison begins: a great deal of the “best second brain app” writing online is produced by the vendors themselves, each of which crowns its own product, so the rankings in those pieces are close to worthless and only the factual, cross-corroborated feature descriptions are worth extracting ^21,22,23,15. With that filter applied, the field organizes cleanly, and the organizing distinction is the single most useful idea for comparing these tools.

4.1 The framing that organizes the field: AI-native versus AI-bolted-on

The cleanest way to sort the entire category is by whether the AI was built into the application’s core or bolted onto the side, because that one fact predicts most of the others ²¹. The bolted-on tools are, at bottom, a notes application from the pre-AI era with a model added in a sidebar: the AI is a separate panel, you move text into it and copy suggestions back, and the rest of the application behaves as though nothing has changed ²¹. The AI-native tools were rebuilt around the language model rather than around a sidebar widget, so the AI is woven into the primary surface of the app rather than parked beside it ²¹. This distinction cuts in an interesting direction for the subject of this report: Obsidian itself is, by this measure, a bolted-on case, since its AI lives in community plugins and an external agent rather than in the core app ²¹. But the agentic Claude-Code path arguably transcends the framing entirely, because the “app” doing the AI work is a full autonomous agent operating on the files, not a sidebar at all — which is part of why the Obsidian-plus-Claude pattern feels different in kind from “Notes plus AI.” The framing is a lens, not a verdict, and it is most useful for understanding what each commercial alternative is really offering.

4.2 The cloud workspaces: Notion and where it is going

Notion deserves the most careful treatment, both because it is the dominant all-in-one workspace and because it has bet enormously on AI, moving well beyond the bolted-on caricature. Across its version 3 releases through 2025 and 2026, Notion shipped autonomous AI agents that execute multi-step workflows using full workspace context — drafting documents, querying databases, and updating pages from a single prompt, reportedly working for up to around twenty minutes across hundreds of pages at once ^25,84. It added “custom agents” that users configure to run on schedules or triggers and that can act across connected tools like Slack and email; an “Ask Notion” search that answers plain-language questions grounded in your own workspace pages rather than the open web; automatic meeting transcription with summaries and action items; and an enterprise search that pulls context from connected applications such as Google Drive and code repositories ^25,27. Notably, recent versions give users in-workspace access to frontier models from multiple providers — including Claude and the latest OpenAI and Google models — so the workspace itself becomes a front-end to the same models you might otherwise pay for separately ²⁷. In the “AI operating layer” sense from Section 1, Notion is arguably the most operating-system-like of the commercial note tools, because it genuinely aims to let agents do work on your behalf across your whole workspace.

The trade-offs are equally real and bear directly on a decision to use it as your personal AI OS. Notion’s full AI capability is gated behind its Business tier, priced at roughly twenty dollars per user per month on annual billing — with sources showing minor variation around that figure — after the standalone AI add-on was retired in 2025; the free and entry tiers receive only a small one-time trial of AI responses ^26,82,24. On top of the plan price, the autonomous custom agents moved to a metered “credits” model in May 2026, billed separately from the seat price, which makes heavy agent use an additional and variable cost ^26,80. Pricing in this category changes frequently enough that any specific figure should be checked against the vendor’s own page before relying on it, and these numbers should be read as a mid-2026 snapshot rather than a fixed quote ^26,85. The deeper trade-off is structural rather than financial: Notion is a cloud workspace, so your knowledge lives in Notion’s database, the agents run on Notion’s terms, and the model choice and behavior are Notion’s to set — which is the precise opposite of the owned, file-based, model-agnostic substrate that the Obsidian pattern is built around ²⁴. Notion is racing to be your AI OS as a managed service; the Obsidian path lets you build one you own. Both are legitimate; they are simply different bets about who holds the substrate.

4.3 The AI-native auto-organizers

A second cluster is defined by the promise to eliminate manual organization entirely — no folders, no tags, no filing — by having the AI organize everything automatically. Mem, often described as a pioneer of the AI-native category, matured by 2026 into a tool whose premise is that you capture thoughts and the system auto-links them through embeddings, with no folders to maintain, surfacing related notes when relevant and letting you chat with your notes like an assistant; its reported pricing settled around twelve dollars per month after a reduction in late 2025 ^62,21,66. The honest limitation, noted even in otherwise favorable coverage, is that the folderless, auto-organized approach takes adjustment and does not support the heavy explicit linking and visual structure that committed personal-knowledge practitioners rely on, so it suits people who want capture-and-resurface more than people who want to build a deliberate graph ⁶¹. Reflect occupies an adjacent space, oriented around daily notes with AI assistance and less complexity, at a reported price near ten dollars per month — though one tracker noted its pricing page intermittently failing to load, which is worth verifying before committing ^66,24. A newer entrant, Saner, pushes the auto-organization idea toward action, integrating with mail, drive, and chat tools, turning brain-dumped thoughts into scheduled tasks, and positioning itself as an assistant that manages tasks and calendars rather than only notes — though, as one of the vendors producing the comparison content itself, its self-assessment should be discounted accordingly ^63,22. The common thread in this cluster is that they optimize for the intake end of the loop and for zero-maintenance recall, which is a genuine strength for scattered thinkers, at the cost of the explicit, owned, navigable structure that the LLM Wiki pattern treats as essential.

4.4 The structured-thinking tools

A third cluster takes the opposite stance, offering rich explicit structure for people who want to think in objects and relationships rather than in folders or auto-magic. Tana is built around what it calls supertags, which turn unstructured text into queryable structure that AI commands then operate on natively, making it a favorite of power users who want a database-like second brain that the AI can manipulate as structured data rather than as prose ^62,63,21. Capacities organizes everything as typed objects — a person, a project, a meeting — with templates and relations between them, presenting itself as a “studio” for the mind for people who prefer objects and types over folders, at a reported price around twelve dollars per month for its paid tier ^63,65,24. Anytype shares the object-based model but leads with privacy and local control, working fully offline with end-to-end encrypted synchronization so that no company can read your content, which makes it the structured-thinking choice for privacy advocates ⁶⁵. Logseq sits slightly apart as a privacy-first, local, open-source outliner — organizing thought as nested blocks rather than pages — and is the natural option for people who want local storage and open-source software above polish ^63,24. The trade-off across this cluster is that the explicit structure is genuinely powerful and AI-legible, but it imposes a modeling discipline and a learning curve, and with the partial exception of the local-first members, several of these still hold your structured data inside their own systems rather than as portable files you own.

4.5 The source-grounded researcher

NotebookLM, from Google, belongs in the comparison but in its own category, because it solves a different problem than a persistent second brain. Its design centers on source-grounded question-and-answer: you upload a set of documents — papers, transcripts, reports — and the AI becomes an expert strictly on that material, refusing to stray beyond the uploaded sources and providing citations for every answer, which directly attacks the hallucination problem by tying responses to the provided text ^65,68. Its distinctive 2026 feature is the ability to generate a realistic podcast-style audio conversation between two synthetic hosts discussing your material, which is a genuinely novel way to review a body of notes while away from the screen, and it leans on Google’s Gemini models to synthesize across large document sets quickly ⁶⁵. The reason it is not, by itself, a personal AI OS is that it is scoped to the sources inside a given notebook rather than to a durable, growing, owned knowledge base: it is superb for deeply interrogating a fixed corpus — a literature review, a case file, a stack of reports — but it does not maintain a compounding wiki of your accumulated understanding over time in the way the Karpathy pattern does ^65,68. It is best understood as a powerful complementary tool for the intake and interrogation end of the loop, something you might use to digest a hard set of sources before their distilled output flows into your owned vault.

4.6 What the comparison reveals

Stepping back from the individual tools, the field resolves onto two axes that matter for choosing a personal AI OS, and naming them makes the decision clearer than any feature list. The first axis is ownership and locality: at one end, your knowledge is plain files on your own disk that any tool can read (Obsidian, Logseq, Anytype to a degree); at the other, it lives in a vendor’s cloud database on the vendor’s terms (Notion, Mem, NotebookLM). The second axis is who does the organizing: at one end you structure things explicitly yourself or via an agent you direct (Obsidian, Tana, Capacities); at the other the system organizes automatically and invisibly (Mem, Saner). The Obsidian-plus-Claude pattern is distinctive precisely because it occupies the corner that the commercial tools mostly avoid — maximum ownership and locality, combined with an agent that does the heavy organizing under your direction — which is the corner that the LLM Wiki philosophy argues is the right one for knowledge that has to last and stay yours ^16,29. None of this makes the alternatives bad; a team that lives in Notion and wants turnkey agents is well served by Notion, and a researcher drowning in PDFs should reach for NotebookLM. It does mean that if the goal is specifically to build and own a personal AI OS rather than to rent one, the field narrows sharply. The table below summarizes the comparison on the dimensions that actually drive the decision; pricing figures are a mid-2026 snapshot and should be re-verified before relying on them.

System	AI approach	Data ownership / locality	Organizational model	Standout AI capability	Approx. price (mid-2026)	Best fit
Obsidian + Claude	Bolted-on plugins or external agent	Plain markdown files you own; local	Explicit links, agent-directed	Agentic vault maintenance (LLM Wiki loop)	App free; Sync ~$8/mo; model costs separate	Builders who want to own and compound knowledge ^29,6
Notion AI	Increasingly native; autonomous agents	Vendor cloud database	Databases and pages	Cross-workspace agents, Ask Notion search	~$20/user/mo Business + agent credits	Teams wanting turnkey managed AI ^25,26
Mem	AI-native	Vendor cloud	Folderless, auto-linked	Automatic organization and resurfacing	~$12/mo	Scattered thinkers wanting zero-maintenance recall ^62,66
Tana	AI-native-ish, structured	Vendor cloud	Supertags / structured nodes	AI operating on queryable structure	Tiered; varies	Power users wanting a database-like brain ^62,63
Capacities	Bolted-on AI	Vendor cloud	Typed objects and relations	Object-based knowledge studio	~$12/mo Pro	Thinkers who prefer objects over folders ^63,24
Anytype	Bolted-on AI	Local, encrypted, offline-first	Typed objects	Private structured PKM	Free / low	Privacy advocates wanting structure ⁶⁵
Logseq	Bolted-on AI	Local, open-source	Outliner / blocks	Local AI on an open base	Free	Open-source, privacy-first users ^63,24
NotebookLM	Native, source-grounded	Vendor cloud (uploaded sources)	Notebook of sources	Cited Q&A; audio overviews	Free tier available	Deep interrogation of a fixed corpus ^65,68

Conclusion

The state of play in 2026 is that a personal AI operating system has crossed the line from research curiosity to assembly job, and the pieces required to build a serious one are now mature, documented, and mostly things you already own. The term “AI OS” itself remains contested, split between Karpathy’s conceptual model-as-kernel, the academic AIOS architecture for managing fleets of agents, the consumer marketing of AI woven into the conventional desktop, and the personal sense this report adopted. The personal sense is nonetheless the one an individual can actually realize today, and it has a clear definition: an agent that maintains and reasons over a body of knowledge you own, turning a passive archive into an active thinking layer ^1,3,16. The pattern that gives this version its spine is Karpathy’s LLM Wiki, whose core insight is to compile knowledge once into interlinked, owned markdown files and keep it current, rather than re-deriving it from scratch on every query the way standard retrieval does; it is a shift from stateless search to compounding, stateful knowledge ^2,16. The dominant implementation pairs an Obsidian vault with Claude, most powerfully through Claude Code operating on the files directly, a setup that became meaningfully easier when Obsidian’s own chief executive shipped official agent skills in early 2026 that teach the agent to speak Obsidian natively ^6,7. Around that spine sit two other integration paths (conversational connectors through the Model Context Protocol, and in-app plugins that excel at private, local operation) and a comparison set of commercial alternatives, of which Notion is the most operating-system-like but also the most cloud-locked, while the rest trade ownership for either automatic organization or rich structure ^25,55,21.

Two honest limitations should temper all of this before you rely on it. First, much of the practical, step-by-step knowledge in this space lives in personal blogs, vendor marketing, and community repositories rather than in durable or independently audited sources, and while the central facts here — Karpathy’s gist, the AIOS paper, the kepano skills, the plugin and connector ecosystem — are corroborated across multiple independent sources and primary repositories, the specifics of any one tutorial or the exact pricing of any one product can change quickly and should be verified against the source before you commit. Second, this report deliberately scoped to note-apps-plus-AI and treated the personal AI OS as fundamentally a knowledge-management problem; it did not evaluate the broader agent-operating-system frameworks or AI hardware that some would include under “AI OS,” so if your ambitions later expand beyond an owned knowledge layer toward orchestrating many autonomous agents across your whole digital life, that is a different and less settled frontier than the one mapped here.

A final caveat on implementation: the failure modes from Section 3 are real, and the way to avoid them is to start small and instrument the system before trusting it. A sensible rollout grants the agent read access to the whole vault from the beginning but write access narrowly at first (perhaps only to a sandbox folder or to new pages it creates), expanding what it may modify only as you watch its behavior on your actual material and build confidence ^3,30. From the first day, the vault should be under git version control through the community git plugin for Obsidian, and the discipline should be to review the agent’s changes as diffs before accepting them, especially for any operation that merges into or rewrites existing notes ^30,46. The raw intake layer stays immutable so that the source of truth survives any mistake in the synthesized layer, and provenance tagging stays on so you can audit grounding at any time ^16,20. None of this is heavy once it is set up, and all of it is cheap insurance against the one outcome that would actually hurt: an agent quietly degrading a knowledge base you depend on. Approached this way, the system earns trust incrementally, which is exactly how you would want to onboard any new collaborator who was being handed the keys to years of your thinking.

Sources

Andrej Karpathy, post on X describing LLMs as “the kernel process of a new Operating System.” https://x.com/karpathy/status/1707437820045062561
Andrej Karpathy, “LLM Wiki” idea-file gist. https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f
Kai Mei et al. (Rutgers University / AIOS Foundation), “AIOS: LLM Agent Operating System,” arXiv 2403.16971 (COLM 2025). https://arxiv.org/abs/2403.16971
Analytics India Magazine, “Andrej Karpathy Says the Pathway to AGI is Through a Language Model Operating System.” https://analyticsindiamag.com/andrej-karpathy-says-the-pathway-to-agi-is-through-a-language-model-operating-system/
Promptmetheus LLM Knowledge Base, “LLM OS.” https://promptmetheus.com/resources/llm-knowledge-base/llm-os
kepano (Steph Ango), obsidian-skills repository / Claude Skills Hub guide. https://github.com/kepano and https://claudeskills.info/blog/obsidian-claude-skills-guide/
Addo Zhang, “Obsidian Skills — Empowering AI Agents to Master Obsidian Knowledge Management.” https://addozhang.medium.com/obsidian-skills-empowering-ai-agents-to-master-obsidian-knowledge-management-8b4f6d844b34
vibecoding.app, “Obsidian Skills Review 2026.” https://vibecoding.app/blog/obsidian-skills-review
MarkusPfundstein, mcp-obsidian repository. https://github.com/MarkusPfundstein/mcp-obsidian
Morph, “Obsidian MCP Server: Connect Your Vault to AI Agents (2026 Guide).” https://www.morphllm.com/obsidian-mcp-server
MCP Servers directory, obsidian-mcp and obsidian-local-rest-api listings. https://mcpservers.org/servers/takuya0206/obsidian-mcp
Copilot for Obsidian, community plugin page and repository. https://community.obsidian.md/plugins/copilot
PromptQuorum, “Obsidian + Local LLM: 5 Plugins (2026).” https://www.promptquorum.com/power-local-llm/local-llm-with-obsidian-2026
Local AI Master, “Local AI + Obsidian: A Second Brain That Thinks.” https://localaimaster.com/blog/local-ai-obsidian-integration
SystemSculpt, “Best Obsidian AI Plugins in 2026” (vendor-authored; rankings treated skeptically). https://systemsculpt.com/blog/best-obsidian-ai-plugins-2026
Plaban Nayak (Level Up Coding), “Beyond RAG: How Andrej Karpathy’s LLM Wiki Pattern Builds Knowledge That Actually Compounds.” https://levelup.gitconnected.com/beyond-rag-how-andrej-karpathys-llm-wiki-pattern-builds-knowledge-that-actually-compounds-31a08528665e
Tahir (Medium), “What is LLM Wiki Pattern? Persistent Knowledge with LLM Wikis.” https://medium.com/@tahirbalarabe2/what-is-llm-wiki-pattern-persistent-knowledge-with-llm-wikis-3227f561abc1
MindStudio, “What Is Andrej Karpathy’s LLM Wiki? How to Build a Personal Knowledge Base With Claude Code.” https://www.mindstudio.ai/blog/andrej-karpathy-llm-wiki-knowledge-base-claude-code
AgriciDaniel, claude-obsidian repository. https://github.com/AgriciDaniel/claude-obsidian
Ar9av, obsidian-wiki repository (provenance tagging). https://github.com/ar9av/obsidian-wiki
PopularAITools, “AI-native vs AI bolted-on” note-taking analysis (2026). https://popularaitools.ai/blog/best-ai-note-taking-app
Saner.AI, “10 Best Second Brain AI Apps in 2026” (vendor-authored; rankings treated skeptically). https://blog.saner.ai/10-best-second-brain-ai-apps/
Buildin.ai, “15 Best Second Brain Apps in 2026.” https://buildin.ai/blog/best-second-brain-apps-2026
Alfred, “Best AI Note-Taking Apps 2026: Notion AI vs Obsidian + 4.” https://get-alfred.ai/blog/best-ai-note-taking-apps
TechAhead, “Notion 3.0 AI Agents: Complete Guide (2026).” https://www.techaheadcorp.com/blog/notion-3-ai-agents/
Fello AI, “Notion AI Pricing 2026: Plans, Cost & Add-On Status.” https://felloai.com/notion-ai-pricing/
SmartProductivityTools, “The Complete Guide to Notion in 2026.” https://smartproductivitytools.com/notion-complete-guide/
Noah (Substack), “How to Build Your AI Second Brain Using Obsidian + Claude Code.” https://noahvnct.substack.com/p/how-to-build-your-ai-second-brain
WhyTryAI, “Build Your Second Brain With Claude Code & Obsidian.” https://www.whytryai.com/p/claude-code-obsidian
MindStudio, “How to Build an AI Second Brain with Obsidian and Claude Code.” https://www.mindstudio.ai/blog/how-to-build-ai-second-brain-obsidian-claude-code
The Tool Nerd, “Step-by-Step Guide: Build Your Own AI Second Brain with Obsidian and Karpathy’s LLM Wiki Pattern.” https://www.thetoolnerd.com/p/step-by-step-guide-build-your-own-second-brain-obsidian-kaparthy
Artic Sledge, “What Is an AI Operating System? Transform Your Work in 2026” (secondary, including a Stanford Digital Economy Lab framing). https://www.articsledge.com/post/artificial-intelligence-operating-system-ai-os
Picovoice, “AI Operating Systems Explained: Types, Examples, and Use Cases.” https://picovoice.ai/blog/ai-operating-system/
Evgeni Rusev (Medium), “How I Built My Second Brain with Obsidian + Claude Code.” https://medium.com/@evgeni.n.rusev/how-i-built-my-second-brain-with-obsidian-claude-code-9fb54b7665ca
How-To Geek, “Claude + Obsidian: The Cheat Code for Building a Second Brain.” https://www.howtogeek.com/claude-obsidian-the-cheat-code-for-building-a-second-brain/
QED42, “Supercharge your knowledge management — integrating Obsidian MCP with Claude.” https://www.qed42.com/insights/supercharge-your-knowledge-management---integrating-obsidian-mcp-with-claude