tools, not friends

10/27/25

some notes on metaphor in technology, and what makes software useful:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For decades, digital interfaces have relied on object-based metaphors that taught us to organize and manipulate. Files live in folders, documents go in trash cans, items are placed in shopping carts; these metaphors helped us understand unfamiliar systems by mapping them onto tangible things. But when we wrap AI technology in chat bubbles and friendly language, we inherit all the assumptions that come with human conversation, most of which don't apply to statistical prediction engines.

Beyond the conceptual mismatch, there's a functional one. The blank text box is optimized for quick back-and-forth messaging, not for the structured processes where AI could meaningfully extend human ability. Chat became the default, but we can do better.

How interfaces shape understanding

Lakoff and Johnson showed that metaphors aren't linguistic decoration, they're cognitive infrastructure. The metaphors we choose create conceptual frameworks that determine what inferences we draw, what actions seem reasonable, what conclusions feel obvious. This matters for interface design because different metaphors lead to fundamentally different relationships with technology. The object metaphor gave us a workable grammar of interaction– folders, clipboards, trash cans, windows. But to frame an LLM as a conversational partner is to inherit an entirely different framework: one oriented around human relationship. And that’s a problem.

Unlike traditional software, where logic flows from explicit code, AI systems resist easy interpretation. Yet we routinely frame them as thinking beings, inviting a mismatch between interface and reality. The metaphor, though, is seductive– as social psychologist Adam Mastroianni points out, we’re "hopeless anthropomorphizers." We name our cars, see faces in clouds, attribute intention where none exists. So when fluent language is delivered in a warm tone, packaged in the same friendly chat bubbles we use to text our friends, we respond accordingly: trust, confide, feel understood.

To be clear, this form factor isn’t always harmful. If you’re practicing French or riffing on an idea, it will probably even help! But in most contexts, the conversational partner metaphor becomes a liability. It feels familiar, but that familiarity misleads– it encourages users to outsource the very capacities AI lacks: judgment, ethical reasoning, creativity, nuanced decision-making.

The interface is where the stakes live. It’s where we choose the metaphors that define our relationship with this technology, which shapes everything downstream: what users expect, what they trust, what they delegate. To interact with AI as though it were human is to adopt the wrong frame. The question, then, is what the right lens looks like– and how to build interfaces that channel AI’s strengths toward genuine augmentation rather than simulated collaboration.

The case for constraints

If metaphors set the frame, then constraints set the function. The conversational interface, for all its appeal, is the wrong tool for most kinds of work. LLMs– compressed representations of recorded knowledge– add to the confusion by creating an alluring sense of generality. A system that can write code, draft reports, answer questions, and plan meals seems infinitely adaptable. But this is the generality trap: mistaking breadth for depth, competence for specialization. A system that tries to do everything for everyone ends up mediocre at most things.

As Amelia Wattenberger points out, good tools communicate their purpose through design. A slider hints at adjustment, a dropdown signals discrete choices, a spreadsheet grid invites structured input and calculation. Chatbots, on the other hand, offer no such affordances. They don’t give any cues about what kinds of input will work or what the system can actually do. Every user starts from scratch, forced to learn by trial and error what could have been built into the design. What looks like neutrality is really abdication: without structure, the interface outsources design decisions to the model itself.

Technologist Linus Lee puts it plainly:

"There's this temptation to preserve the generality of the model as much as possible. But there's no true generality. It just means you're letting the post-trainers of whatever model you're using be your product designers... If you want to build good products on top of language models, you should be opinionated about what the users should type, and what they should want to do with it."

Good products can’t dodge those choices. The software creator needs to understand the jobs to be done, what "good" looks like, and how to structure the interface so users aren't left guessing how to use it.
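
To make that concrete, here's a minimal sketch (in TypeScript, with entirely hypothetical names) of what an opinionated interface might collect instead of a blank prompt box: the product decides which inputs matter and assembles the prompt itself.

```typescript
// A hypothetical request schema for a report-drafting tool.
// The interface decides what inputs matter; the user never faces a blank box.
type Audience = "executive" | "technical" | "general";
type Length = "one-pager" | "full-report";

interface DraftRequest {
  sourceNotes: string;   // raw material the user already has
  audience: Audience;    // a discrete choice, like a dropdown
  length: Length;        // another constrained control
  mustInclude: string[]; // points the draft is not allowed to drop
}

// The prompt is assembled by the product, not typed by the user.
function toPrompt(req: DraftRequest): string {
  return [
    `Draft a ${req.length} for a ${req.audience} audience.`,
    `Base it only on these notes:\n${req.sourceNotes}`,
    `It must cover: ${req.mustInclude.join("; ")}.`,
  ].join("\n\n");
}

const prompt = toPrompt({
  sourceNotes: "Q3 churn rose 4%; support tickets doubled after the pricing change.",
  audience: "executive",
  length: "one-pager",
  mustInclude: ["churn trend", "likely cause", "proposed next step"],
});
```

Nothing about this particular shape is canonical; the point is that the schema, not the model, carries the product's opinion about what a good request looks like.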

What real augmentation looks like

This is where a critical distinction emerges: automation versus augmentation. Automating a rote task is fundamentally different from helping someone think better. Most LLM tools today automate in fragments (summarize this, rewrite that), offloading just enough work to erode the user’s agency without actually extending their ability.

Real augmentation requires automation to be nested inside structure. Interfaces must be domain-specific, tuned to the way people actually think, decide, and act. They should carry context across interactions, offloading repetition and calculation to the model while leaving judgment and direction to the human. The point is not to conjure a robot sidekick, but to build something closer to a surgical instrument– an extension of the hand and mind: precise, purposeful, built to be controlled.

Wattenberger calls this stacked automation: augmenting human capability by automating the rote tasks that feed into bigger decisions. The spreadsheet is the canonical example. Its value wasn’t just in faster number-crunching– it was the structure that let people reason more effectively about numbers, relationships, and outcomes. The interface offloaded the drudgery of calculation while preserving human agency in judgment and design.
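
As a rough illustration of the same pattern in an AI context (purely a sketch, with made-up function names): the model handles the rote layer, while the structure hands the judgment layer back to the person.

```typescript
// A toy sketch of stacked automation: the model does the rote work
// (summarizing each source), while the human keeps the judgment
// (deciding which findings actually matter). `summarize` stands in
// for any model call; it is not a real API.
type Finding = { source: string; summary: string };

async function summarize(text: string): Promise<string> {
  return text.length > 140 ? text.slice(0, 140) + "…" : text; // placeholder
}

// Rote layer: automated, runs over every source without asking.
async function gatherFindings(docs: Record<string, string>): Promise<Finding[]> {
  return Promise.all(
    Object.entries(docs).map(async ([source, text]) => ({
      source,
      summary: await summarize(text),
    }))
  );
}

// Judgment layer: nothing moves forward until a person decides what to keep.
function humanReview(findings: Finding[], keep: (f: Finding) => boolean): Finding[] {
  return findings.filter(keep);
}
```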

But unlike spreadsheets, AI carries a more menacing trait: it excels at sounding fluent in domains where it lacks real competence. The interface, then, becomes the safeguard: a place to encode purpose and constraint, channeling the model’s real strengths (scale, memory, speed) toward amplifying human capability, rather than pretending to replace it.

Designing clarity into AI interfaces

It cannot be overstated: LLMs are weird! They’re powerful, versatile, and genuinely exciting in their potential, but also probabilistic, uncanny, and often inscrutable. Designing well with them means acknowledging their strangeness and strategically shaping the interfaces that contain them.

So what might better interfaces look like? I think the opportunity lies in creative exploration of what becomes possible when we stop defaulting to standard chat. The key is designing affordance-rich surfaces that make the system's behavior as legible as possible, while supporting how people actually work.

Some possibilities for what such interfaces might explore or make visible:

  • Persisting response objects: Outputs don’t disappear into a chat stream; they live as editable blocks, cards, or documents that you can revisit, annotate, branch off, or combine– more like working material than DM replies. (A rough data-model sketch of this idea follows the list.)

  • Multi‑scale “zoom” mode: Users can fluidly shift between layers of abstraction, depending on the level of detail required, like semantic zoom in a map, but for ideas, writing, or reasoning. The interface adapts to show appropriate tools and views. (e.g. The user can read an article at the sentence level, as paragraph summaries, or as a full-document argument map.)

  • Ambient cues and overlays: Instead of launching a separate conversation with the LLM, the interface subtly augments the workspace: color‑coded alerts, inline status bars, micro‑notifications– all without modal disruption.

  • Workflow instrumentation dashboard: The system tracks progress, reveals coverage, and flags gaps, embedding live, contextual awareness into the workspace itself. (e.g. In a grant-writing interface, AI tracks required sections, shows which are complete or incomplete, and flags where tone or evidence is weak.)

  • Inline intervention tools: During a reasoning or generation process, you can pause, annotate, adjust sub‑goals, or edit a step in the chain of thought, steering the system without resetting the entire context.

  • Transparent reasoning layers: Instead of delivering only answers, the system surfaces how it got there: what was compressed or omitted, which reasoning paths were taken, reducing the “black‑box” mystique.

  • Scope & style dials: Users control how broad or narrow the AI should be, with toggles or sliders for depth, creativity, confidence, or viewpoint count– shaping the task rather than submitting an open‑ended query.

  • Branchable, non‑linear workspaces: Instead of a one-dimensional thread, the interface supports branching flows, saved states, reuse of contexts, and nested workspaces– matching how humans revise, revisit, and adapt over time.

  • Input‑diagnostics overlay: Instead of just entering a prompt and waiting, users see live feedback on how the system is reading their input– which words it’s giving weight to, where ambiguity lives, where it’s unsure. (e.g. You type “analyze risks,” and the overlay prompts you to specify the type of risk. Instead of waiting for the model to misinterpret, you see where clarification is needed.)
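
To ground the first idea a little, here's a rough data-model sketch (TypeScript, all names hypothetical) of what a persisting, branchable response object might look like: outputs become durable working material you can edit, annotate, and fork, instead of messages that scroll away.

```typescript
// A rough sketch of "persisting response objects": model outputs become
// durable, editable working material rather than chat messages.
// Everything here is hypothetical, not an existing API.
interface ResponseObject {
  id: string;
  content: string;        // editable by the user after generation
  annotations: string[];  // the user's own notes, attached to the output
  parentId?: string;      // set when this object was branched off another
  createdAt: Date;
}

class Workspace {
  private objects = new Map<string, ResponseObject>();

  add(content: string, parentId?: string): ResponseObject {
    const obj: ResponseObject = {
      id: crypto.randomUUID(),
      content,
      annotations: [],
      parentId,
      createdAt: new Date(),
    };
    this.objects.set(obj.id, obj);
    return obj;
  }

  annotate(id: string, note: string): void {
    this.objects.get(id)?.annotations.push(note);
  }

  // Branching: start a new line of work from any earlier output,
  // without losing or overwriting the original.
  branch(fromId: string, revisedContent: string): ResponseObject {
    return this.add(revisedContent, fromId);
  }
}
```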

The design imperative

Framing matters more than we often realize. If we think of AI as a collaborator, we risk designing systems that mislead users into trusting the technology in ways it can’t uphold. Label it as a tool instead– something you can control and adjust, wielded with purpose– and suddenly your interface can be honest about its limitations while amplifying what AI does well, keeping the work that matters most firmly in human hands.

In building AI-integrated software, the imperative is to be specific, opinionated, and purposeful; because if we don't choose what matters, the model will. And the model– no matter how charismatic its tone– doesn't care how it's affecting you, because it can’t care. It can only compute.

Caring, then, is up to the software creator.


