The Death of the User Interface As We Know It?

Mircha Emanuel D'Angelo

Note: This is an AI-generated translation from the original Italian article: La morte dell'interfaccia utente come la conosciamo?

There is something profoundly ironic about the condition of the digital user in 2026. Never in the history of computing have we had access to so many tools, so powerful, so affordable. And yet, anyone who works with a computer spends a significant portion of their day not doing things, but navigating between things. We open applications, search for features in menus, fill out forms, move data from one system to another, learn new interfaces that replace old ones for no obvious reason. Software promised to free us from repetitive work; instead, it has created new forms of repetitive work, disguised as "productivity."

The SaaS (Software as a Service) model has democratized access to tools that once required enormous investments. Today anyone can have a CRM, a project management system, an analytics suite, paying just a few euros per month. But this abundance has a hidden cost: every tool is a world unto itself, with its own logic, its own visual vocabulary, its own conventions. The average user in a company uses dozens of different applications, and each one needs to be learned. The user interface, born to make the computer accessible to everyone, has itself become a barrier, a cognitive tax we pay every day.

Perhaps it is time to ask ourselves: was the graphical interface as we know it truly the endpoint, or merely a transitional phase? What if the next leap in human-machine interaction is already underway, quietly, while we stubbornly click buttons designed in the 1980s?

From the Terminal to Touch, and Back Again

To understand where we are going, it is worth remembering where we came from. The history of the user interface has been told for decades as a linear progression toward "naturalness." First there was the command line, austere and powerful, but reserved for initiates who knew the arcane syntax. Then came the graphical interface (windows, icons, menus, pointer) and suddenly the computer became accessible to anyone who could move a mouse. The 1984 Macintosh, Windows 95, the web with its clickable links: each step seemed to bring us closer to a more human, more intuitive interaction.

The touchscreen appeared to complete this arc. With the iPhone in 2007, the interface became literally tactile: no more intermediaries, the finger touches the digital object directly. Children learn to use a tablet before they can read. What could be more natural?

But every transition has involved trade-offs that are rarely discussed. The GUI made the computer accessible to non-technical users, certainly, but it also hid complexity instead of eliminating it. The command-line user saw exactly what was happening; the GUI user sees metaphors (folders, trash cans, windows) that protect them from the underlying reality but also separate them from it. Touch made everything "intuitive" but dramatically impoverished the possibilities of input: a finger on a screen can do much less than a keyboard, and in fact on mobile devices we have gone back to typing, painfully, on virtual keyboards.

The command line never truly disappeared. Developers never abandoned it. I live in the terminal. And there is a reason: the CLI is composable. Small programs that do one thing well, connected by pipes, orchestrated by scripts. It is the Unix philosophy, articulated by Doug McIlroy in the 1970s: "Write programs that do one thing and do it well. Write programs to work together."

This philosophy never found an equivalent in the world of GUIs. Graphical applications are monoliths that do not talk to each other except through painstakingly built integrations. Every SaaS is an island.

And here in 2026 we are witnessing something unexpected: a return to the textual interface, but in a radically new form. Not the rigid syntax of the terminal, but natural language. Not commands to memorize, but intentions to express. And behind it, not a deterministic parser, but a language model that understands.

The SaaS Paradigm: Triumph and Crisis

Before exploring the new, it is necessary to understand why the old is creaking. The SaaS model was an extraordinary success. It eliminated the need to install software, slashed distribution costs, enabled continuous updates without user intervention. For software companies, it transformed one-time sales into recurring revenue streams. For users, it made tools accessible that were previously reserved for large organizations.

But the success of SaaS has brought with it structural pathologies that are now evident.

The first is fragmentation. Because every specific problem has spawned its own dedicated SaaS, the user finds themselves operating in dozens of different environments. Data is scattered everywhere: contacts in the CRM, tasks in the project manager, documents in cloud storage, conversations in the team chat, emails in the mail client. Tools like Zapier or Make were born precisely to build bridges between these islands, but they are fragile solutions requiring constant maintenance and non-trivial technical skills.

The second pathology is feature creep. A SaaS must justify its monthly subscription. The universal answer is to add features. Slack, born as a simple chat, has become a platform with canvas, workflows, huddles, clips, and dozens of integrations. Notion, born as an advanced notepad, has become a database, wiki, project manager, website. Every tool tends to expand until it covers adjacent territories, bloating the interface, multiplying menus, burying simple functions under layers of complexity.

The third pathology is more subtle: the inversion of the relationship between means and end. The user does not want to "use Notion"—they want to organize their ideas. They do not want to "navigate Salesforce"—they want to know if a client is ready to buy. The interface should be transparent, an invisible means toward a result. Instead, it has become opaque, a constant presence demanding attention, learning, and cognitive maintenance.

This inversion was perhaps inevitable. A graphical interface is, by its nature, an imposed language. The designer decides which actions are possible, where they are placed, what they are called. The user must learn that language. And since every application has different designers with different ideas, the user must learn dozens of different languages, often inconsistent with one another.

The Agent as a New Paradigm

The change underway is radical, even if on the surface it may seem like just an evolution. It is not about adding a chatbot to an existing application—that is how the old world tries to absorb the new without understanding it. It is about completely rethinking the relationship between user and software.

In the traditional paradigm, the user performs atomic actions through the interface: clicks a button, fills in a field, selects an option. The software executes that specific action. If the user wants to achieve a complex result, they must mentally decompose that result into a sequence of atomic actions and perform them one by one, often across different applications.

In the emerging paradigm, the user expresses an intention in natural language, and an AI agent takes care of translating that intention into concrete actions. The agent understands what the user wants to achieve, determines which tools are needed, orchestrates them, handles errors, and returns the result.

Consider a concrete example: "Prepare a weekly report for the team on Project Alpha's progress, including completed tasks, delayed ones, and a burndown chart, then email it to the whole team."

Today, this innocent sentence requires: opening the project manager, filtering tasks by project and week, exporting the data, opening a spreadsheet, creating the chart, opening a document editor, formatting the report, going back to the project manager for the team list, opening the email client, composing the message, attaching the document, sending. Twenty minutes of manual work, thirty if something goes wrong.

With a sufficiently capable and well-integrated agent, it is one sentence. The agent queries the project manager's API, processes the data, generates the chart, composes the document, retrieves the team's email addresses, and sends. The user expresses the intention, the agent manages the complexity.
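To make the mechanics concrete, here is a minimal sketch of the loop such an agent runs: ask the model for the next action, execute it, feed the result back, and repeat until done. Everything in it is illustrative. The tool names, the scripted stand-in for a real language model, and the step format are assumptions made for this example, not any vendor's API.

```python
# Illustrative sketch of an agent's tool-use loop. All names are hypothetical:
# a real agent would use an LLM instead of ScriptedModel, and would discover
# its tools from connected services rather than a hard-coded dict.
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class Step:
    tool: str | None                        # None means the model is finished
    args: dict[str, Any] = field(default_factory=dict)
    answer: str = ""

class ScriptedModel:
    """Stand-in for an LLM: replays a fixed plan for the weekly-report example."""
    def __init__(self, plan: list[Step]):
        self._plan = iter(plan)

    def next_step(self, history: list[dict]) -> Step:
        return next(self._plan)

def run_agent(intention: str, model: ScriptedModel,
              tools: dict[str, Callable[..., Any]]) -> str:
    """Ask the model for the next action, execute it, feed the result back."""
    history: list[dict] = [{"role": "user", "content": intention}]
    while True:
        step = model.next_step(history)
        if step.tool is None:
            return step.answer
        result = tools[step.tool](**step.args)      # invoke the chosen tool
        history.append({"role": "tool", "name": step.tool, "content": result})

# Stub tools for the report example; real ones would call actual services.
tools = {
    "list_tasks": lambda project, week: [{"title": "Design review", "status": "done"}],
    "render_burndown": lambda tasks: "burndown.png",
    "send_email": lambda to, body, attachment: f"sent to {to}",
}

plan = [
    Step("list_tasks", {"project": "Alpha", "week": "2026-W07"}),
    Step("render_burndown", {"tasks": []}),
    Step("send_email", {"to": "team@example.com", "body": "Weekly report",
                        "attachment": "burndown.png"}),
    Step(None, answer="Report sent to the team."),
]

print(run_agent("Prepare the weekly report for Project Alpha", ScriptedModel(plan), tools))
```

The loop itself is trivial; all the intelligence lives in the model's choice of the next step.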

I write this not as an outside observer, but as a daily practitioner. My workdays largely unfold in terminal windows where conversations with AI agents alternate with traditional commands. It is a hybrid experience, straddling two worlds: the precision of the classic CLI and the flexibility of natural language. And what is striking, after months of this practice, is how quickly the old way of working—opening applications, navigating menus, filling out forms—begins to feel inefficient, almost primitive. Not because it was wrong, but because there is now an alternative that more closely matches the way we think.

This is not science fiction. In 2026, we have tools that natively support dozens of integrations via MCP, that maintain persistent context across sessions, and that reason well enough to handle multi-step tasks: we are already inside this transition!

MCP: The Protocol That Enables Composition

The Model Context Protocol, introduced by Anthropic and rapidly adopted as a de facto standard, is the technical piece that makes this new paradigm possible. To understand its importance, a web history analogy is useful.

In the 1980s, computers could already communicate over networks, but every system used different protocols. Then came HTTP, a simple and universal protocol: any client could talk to any server, as long as both followed the same conventions. This standardization enabled the explosion of the web. It did not matter if you used a Mac or a PC, Netscape or Internet Explorer: the web page was the same.

MCP does something similar for AI agents. It defines a standard way to expose capabilities—things that a service can do—and to allow agents to discover and use them. An MCP server can expose access to a filesystem, a database, a third-party API, any resource or functionality. An MCP client, typically an AI agent, can connect to these servers, discover which capabilities are available, and use them to perform actions.

The architecture is elegant in its simplicity. The MCP server declares which "tools" it makes available, with a natural language description of what they do and which parameters they accept. The AI agent, when it needs to perform an action, can examine the available tools, choose the appropriate one, invoke it with the right parameters, and interpret the result. The protocol handles communication, authentication, and error management.
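To make this tangible, here is what a minimal MCP server might look like with the official Python SDK's FastMCP helper; the server name and the list_tasks tool are invented for this example. The function's docstring and type hints become the natural-language description and parameter schema that agents see.

```python
# A minimal MCP server sketch, assuming the official Python SDK ("mcp" package).
# The server name and tool are hypothetical; the docstring and type hints are
# surfaced to agents as the tool's description and parameters.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("project-tracker")

@mcp.tool()
def list_tasks(project: str, week: str) -> list[dict]:
    """List the tasks for a given project in a given ISO week."""
    # A real server would query the project manager's API or database here.
    return [{"title": "Design review", "status": "done", "project": project}]

if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio by default
```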

What makes MCP powerful is not the individual integration, but the composition. An agent connected to ten different MCP servers can combine their capabilities in ways that none of the individual services had anticipated. It can read data from a database, process it, write the results to a document, and send it by email—all in a single conversation, orchestrating services that know nothing about each other.
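From the client side, discovery and invocation might look like the following sketch, again assuming the official Python SDK; the server script and tool name are the hypothetical ones from the previous example. An agent composing ten services would simply hold ten such sessions and route each call to the right one.

```python
# Client-side sketch: connect to a server, discover its tools, invoke one.
# Assumes the official Python SDK; server script and tool name are hypothetical.
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    server = StdioServerParameters(command="python", args=["project_tracker.py"])
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            listing = await session.list_tools()            # discovery
            print([tool.name for tool in listing.tools])
            result = await session.call_tool(               # invocation
                "list_tasks", {"project": "Alpha", "week": "2026-W07"})
            print(result.content)

asyncio.run(main())
```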

There is an echo of the Unix philosophy in all of this: "Write programs that do one thing and do it well. Write programs to work together." MCP servers are the new Unix programs: small, focused, composable. But instead of being connected by pipes and orchestrated by bash scripts, they are connected by a standard protocol and orchestrated by a language model that understands intentions.

The Twilight of Traditional RAG

Another technical change deserves attention, because it illuminates the direction of the emerging paradigm. Until a few months ago, the standard solution for giving LLMs access to external knowledge was RAG (Retrieval-Augmented Generation). The idea is simple: take your documents, break them into chunks, transform them into numerical vectors (embeddings), save them in a vector database. When the user asks a question, transform the question into a vector, search for the most "similar" chunks, and inject them into the prompt alongside the question. The model answers using that information.

RAG worked, and continues to work for certain use cases. But its limitations have become increasingly evident. Chunking breaks context: a coherent document gets cut into pieces that lose the overall meaning. Retrieval is imperfect: vector similarity does not always capture semantic relevance. The user has no control over or visibility into what is retrieved. And the architecture assumes that knowledge is static—indexed once and then queried—while often the relevant knowledge changes continuously or needs to be verified at the source.
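For readers who have never built one, a toy version of the whole pipeline fits in a few lines. The bag-of-words "embedding" below is a deliberate stand-in for a learned embedding model, and the fixed-size chunking makes visible exactly the context-breaking problem just described.

```python
# Toy RAG pipeline: chunk, "embed", index once, retrieve, assemble the prompt.
# The bag-of-words embedding is a stand-in for a real embedding model.
import math
from collections import Counter

def chunk(text: str, size: int = 40) -> list[str]:
    """Fixed-size word chunks: the step that breaks document context."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text: str) -> Counter:
    return Counter(text.lower().split())    # toy stand-in for a learned embedding

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na, nb = (math.sqrt(sum(v * v for v in c.values())) for c in (a, b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, index: list[tuple[Counter, str]], k: int = 3) -> list[str]:
    q = embed(question)
    ranked = sorted(index, key=lambda entry: cosine(q, entry[0]), reverse=True)
    return [text for _, text in ranked[:k]]

# Index once (static knowledge), then inject the retrieved chunks into the prompt.
docs = ["Project Alpha shipped its beta in January and the burndown improved."]
index = [(embed(c), c) for doc in docs for c in chunk(doc)]
context = retrieve("How is Project Alpha going?", index)
prompt = "Answer using only this context:\n" + "\n---\n".join(context)
```

Every weakness listed above is visible here: the chunk boundaries are arbitrary, the similarity measure is crude, and the index knows nothing about what changed after it was built.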

In 2026, several technical evolutions have eroded RAG's centrality. Context windows are enormously larger: models that accept hundreds of thousands of tokens can simply contain entire documents instead of retrieving fragments. Persistent memory, natively integrated even in consumer products like Claude, allows maintaining context across different conversations without the need for external infrastructure. Mature tool use allows querying sources in real time instead of relying on potentially outdated indexes.

RAG does not disappear, but changes its role. From a load-bearing architecture, it becomes one tool among many that the agent can choose to use when appropriate. For searching a massive archive of historical documents, a vector index still makes sense. For answering a question about the current quarter's data, it is better to query the database directly. The agent, not the architecture, decides which approach to use.

This reflects a broader pattern: in the new paradigm, architectural decisions shift from design-time to run-time. It is not the system designer who decides how to retrieve information; it is the agent that decides, case by case, based on context and the user's intention.
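One way to picture the shift: instead of hard-wiring a retrieval pipeline at design time, the designer exposes both strategies as tools and lets the model's choice do the routing, request by request. The names, descriptions, and schema format below are invented for illustration.

```python
# Sketch: both retrieval strategies exposed as tools. The descriptions are what
# the model reads to decide, per request, which path to take; no pipeline is
# fixed in advance. Names, wording, and schema format are hypothetical.
tools = [
    {
        "name": "search_archive",
        "description": "Vector search over the historical document archive. "
                       "Use for older, unstructured material.",
        "input_schema": {"type": "object",
                         "properties": {"query": {"type": "string"}}},
    },
    {
        "name": "query_sales_db",
        "description": "Run a read-only SQL query against the live sales database. "
                       "Use for current, structured figures.",
        "input_schema": {"type": "object",
                         "properties": {"sql": {"type": "string"}}},
    },
]
```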

What Remains of the Interface?

If the primary interaction with software becomes conversational, mediated by agents that understand natural language, what is the fate of the graphical interface? Is the GUI destined to disappear?

The answer is probably more nuanced. The graphical interface does not disappear, but its role changes. From control panel—the place where the user performs actions—it becomes display—the place where the user views results and confirms decisions.

Some functions of the graphical interface remain essential. The visualization of complex information (charts, maps, tables, diagrams) benefits enormously from visual representation. No textual description can replace a well-made chart for grasping a trend. The agent can generate the visualization, but the interface must display it.

Confirmation of critical actions is another area where the interface maintains a role. Do we really want an agent to delete files, send important emails, or modify sensitive data without an explicit confirmation step? A "Confirm" or "Cancel" button may seem primitive, but it is an essential security checkpoint.

Exploration without a precise question is a third case. Sometimes we do not know what we are looking for. We want to browse, get a sense of things, discover what is there. A navigable interface allows for serendipity; a conversation requires already knowing what to ask.

Finally, there is the question of accessibility. Not everyone can or wants to interact via text. Language difficulties, cognitive disabilities, personal preferences: a visual interface with clickable elements remains more accessible for many users.

But the center of gravity shifts. The graphical interface becomes a complement to conversational interaction, not the primary channel. The user speaks with the agent to say what they want; the interface displays the results and allows targeted interventions. It is an inversion compared to the current model, where the interface is primary and any AI assistant is an add-on.

The Hacker Perspective: Liberation or New Dependence?

For those who grew up in hacker culture, who read the Jargon File and Raymond's essays, who consider software freedom a founding value, this new paradigm presents a profound tension.

On one hand, there is something genuinely liberating. For decades, the non-technical user has been a prisoner of the interfaces others designed for them. They could not modify them, could not extend them, often could not even truly understand them. They had to adapt their way of thinking to the language of the software. Now, for the first time, the relationship can be inverted: it is the software that adapts to the user's language. Anyone who can express an intention can command a complex machine. It is a democratization of computational power that would have made Engelbart smile.

MCP is an open protocol. The specification is public, implementations are available, anyone can build servers and clients. There is a vibrant ecosystem of open-source integrations. The Unix philosophy of composition, which seemed lost in the world of SaaS monoliths, returns in a new form.

But on the other hand, the concerns are serious. Who controls the agent? In the SaaS model, at least, data resided on specific servers, interfaces were inspectable, workflows were predictable. In the agent model, every intention passes through a language model. If that model is proprietary, if it is a service from Anthropic, OpenAI, or Google, the dependence is total and opaque.

Richard Stallman has always insisted on the difference between "free software" and merely gratis software. Free software is software whose source code you can study, modify, and redistribute. It guarantees freedom, not just price. But the most capable LLMs are not free software in any meaningful sense. Even "open weight" models (those whose weights are released) are not truly open source: you cannot retrain them without enormous resources, you cannot know exactly what they learned or from where, and you cannot modify them in any significant way.

Cory Doctorow coined the term enshittification to describe the inevitable cycle of platforms: first they attract users with genuine value, then they extract value from users to give it to business partners, and finally they extract value from everyone to maximize profit, until the platform becomes unusable and collapses. If AI agents become the universal layer of interaction with software, what prevents the same cycle?

Transparency is another crucial concern. When I click a button in an interface, I know what I am doing—or at least I can know, if I want to inspect. When I ask an agent to do something, what actions does it actually take? What data does it read? To which services does it transmit it? The opacity of LLM reasoning extends to the opacity of actions.

Bruce Schneier talks about trust in technological systems: whom we trust, why, and what happens when that trust is betrayed. Relying on an AI agent means trusting the company that develops it, its infrastructure, its policies, its future decisions. It is a very concentrated trust.

Future Scenarios and Open Questions

It is impossible to predict with certainty how this paradigm will evolve. But we can imagine alternative scenarios and ask ourselves which conditions would make them more or less likely.

The optimistic scenario sees the emergence of an open and competitive ecosystem. MCP and similar protocols become consolidated standards, universally adopted. Open-source models reach competitive capabilities with proprietary ones, allowing anyone to control their own agent. MCP servers proliferate, offering integrations with every imaginable service. The user can choose which model to use, which capabilities to enable, where their data resides. Software becomes truly composable and customizable as never before. Hackers build elaborate configurations of agents and tools, sharing them as they once shared dotfiles and scripts.

The pessimistic scenario sees the concentration of power in a few gatekeepers. Two or three companies control the dominant agents, because training competitive models requires resources that only they can afford. Every interaction with software is mediated, recorded, analyzed, monetized. MCP integrations exist, but the truly useful ones require premium subscriptions. Open-source models always remain a step behind, sufficient for experimenting but not for serious work. The conversational interface becomes the new walled garden, more insidious because it feels more "natural": the walls do not look like walls; it just seems like a conversation with a friendly assistant.

The most likely scenario, perhaps, is a middle ground. Proprietary agents will dominate the consumer market, where ease of use and integration are priorities. But in the enterprise world, in the technical community, among those who have the skills and motivation to build alternatives, hybrid solutions will emerge. Local models for sensitive tasks, self-hosted agents for organizations that do not want to depend on external clouds, open-source tool ecosystems for those willing to invest time in exchange for control.

Open questions remain that will require answers in the years to come. How do we ensure transparency in the agent's decisions? Is a log of actions taken sufficient, or is something more needed? How do we preserve privacy when every intention passes through a model that might be trained on user data? How do we prevent "not knowing how to talk to the AI" from becoming the new form of digital illiteracy, excluding those who struggle to express themselves clearly and in a structured way? How do we regulate a paradigm where actions are mediated by systems that no one fully understands?

A Journey That Begins Again

About 38 years ago (yes, it feels strange writing that), in front of an Amstrad CPC 464 with its green phosphor monitor, I was learning to dialogue with a machine. There were no icons to click, no windows to drag: there was a blinking cursor and the need to know what to type. Every command was a small conquest, every syntax error a lesson. Then came the BBSes, nights spent exploring textual worlds through a crackling modem, and finally Metro Olografix, the discovery that behind those terminals there was a community, a culture, a way of thinking about the relationship between humans and machines. The GUI, when it arrived, was promising but also suspect: it simplified, certainly, but at what cost? It hid things, but what, exactly?

Today we are at a similar passage, perhaps a deeper one. The graphical interface does not die (nothing truly dies in software—layers accumulate), but it changes roles, retreats to the background, yields center stage to something new. Conversational interaction, mediated by agents that understand and act, promises to free us from the tyranny of imposed interfaces. But every liberation brings new dependencies, every democratization conceals new concentrations of power.

The Model Context Protocol is open. Agents can be self-hosted. Alternatives exist, for those who seek them. But the default direction—the one followed by those who do not consciously choose—leads toward a few large providers mediating every interaction with the digital world.

The question is not whether this future will arrive: it is already arriving. The question is who will build it, according to which values, with what safeguards. It is a technical question, but also a political, ethical, and social one. It is the question that hacker culture has been asking since the beginning, every time a new technology redefines the relationship between humans and machines.

The answer, as always, will depend on who takes the responsibility of giving it.

#AI #MCP #SaaS #User-Interface #AI-Agents #Open-Source #Hacker-Culture