A peek inside CLI tools

Agents are LLMs with tool use. They don’t just respond to you; they can go and do things for you. But what does ‘tool use’ actually mean? What tools?

The most common tools come in the form of CLIs (command-line interfaces). Agents communicate in text and CLIs are text in, text out, so it’s a natural fit. A CLI is a text-based way to control software: you type a command, something happens.

Here’s a simple example: organising files using the bash tool.

"Rename all 400 product photos to match our SKU format, resize them to 1200x1200, and sort them into folders by category."

‘mkdir’ is the command for ‘make directory’ (a directory is a folder). Here it’s creating five: output, output/shoes, output/bags, output/jackets and output/hats.
Flags modify what a command does: -p here means ‘create any missing parent folders too.’ So if ./output/ doesn’t exist yet, it’ll make that too.
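
A command matching that description might look like the sketch below. The folder names come from the example. The second command, which just lists what was created, is added here for illustration and assumes you run it in an empty scratch folder.

```shell
# Create the output folder plus one subfolder per category.
# -p: also create missing parent folders (here ./output),
#     and don't raise an error if a folder already exists.
mkdir -p output/shoes output/bags output/jackets output/hats

# List every folder that now exists under output, one path per line.
find output -type d | sort
```

Naming only the four category folders is enough to get all five, because -p creates output along the way.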

It does all this in seconds. It would take you a couple of hours manually.

This is one CLI, called bash, the general-purpose command line that comes with most Mac and Linux computers. But there are purpose-built CLIs for specific jobs too:

Stripe CLI — pull revenue data, manage subscriptions, test payments
Playwright — control a web browser: navigate, click, fill forms, take screenshots
AWS CLI — spin up servers, manage databases, scale infrastructure
Vercel CLI — deploy a website live in one command

Each of these is a separate tool an agent can use. The file organising example used one tool (bash). But give an agent the Stripe CLI too and now it can pull your revenue numbers. Add Playwright and it can browse the web. Add Vercel and it can deploy what it builds.

That’s what “tool use” means. The more CLIs you give an agent access to, the more it can do. Your job is to make sure it has the right ones for the task.

It all sounds a bit technical, and it is, but you’ll only see those raw commands if you’re using a terminal or watching them fly by in tools like Claude Code. They’re there even when you don’t see them.

If an agent like Cowork is doing a task, you can click to expand what it ran and see the detail — like this example listing files to find recent fund updates.
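
As a rough idea of what such an expanded command could look like, here is a sketch of a file search. The folder and file names are hypothetical and are created only so the example runs; the actual command an agent picks will depend on your files.

```shell
# Hypothetical folder and files, created here only so the example runs.
mkdir -p updates
touch updates/fund-a-update.pdf updates/fund-b-update.pdf updates/old-notes.txt

# Find files with "update" in the name that were modified in the last 7 days.
# -type f: files only; -name: match on file name; -mtime -7: changed less than 7 days ago.
find updates -type f -name '*update*' -mtime -7 | sort
```

The agent reads the text that comes back (a list of matching paths) and uses it to decide what to open next.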

Every agent is running commands like this under the hood. The interface just hides and abstracts them away.

Chronicle – Cursor for slides. Turn ideas and notes into stunning, professional decks in minutes.*
Paper Snapshot - Snapshot your live website and paste it into Paper as editable HTML/CSS layers.
Ghostwriter by Sierra - Chat with an agent to build more agents.
Mario, founder of the popular open source agent Pi, wrote a post yesterday, “Thoughts on slowing the fuck down”, that says software quality appears to be declining as more companies rely on agents.
Building CLIs for agents - Eric from Cursor wrote a thread on making CLIs that actually work for agents. ElevenLabs has already made their CLI agent-friendly using these tips.
Building deep research that works from your CLI with BrowserBase. (resulting code)
Hark – New AI lab from Brett Adcock (yes, the Figure robotics guy). 8 months in stealth, focused on "the most advanced personal intelligence" paired with next-gen hardware.
GitHub has been going down wayyy too often these days. Plans to fix it and alternatives are starting to show up.
How USV built a team of internal agents that live in their group email threads and learn from team feedback.
Feynman - Read papers, research and get cited meta-analysis for your question from your CLI.
Brave registered the .agent TLD and is making it a community effort. I tried to reserve 10 domains 😬
Lil Agents – Tiny AI companions that live above your dock. Each one has its own Claude session and mini window. Now open source. Adorable.

merch gifts have gone up a level ty

1:12 PM · Mar 26, 2026

Since we all know that terminals are made for complex UIs... I decided to make T1Code (1T, because a terminal is all you need). I know really likes this kind of complex UI right on the terminal... so lets hope he likes it!

9:53 PM · Mar 25, 2026

Cursor cloud agents can now run on your infrastructure. Get the same cloud agent harness and experience, but keep your code and tool execution entirely in your own network.

cursor.com

Run cloud agents in your own infrastructure · Cursor

6:32 PM · Mar 25, 2026

AI will help discover new science, such as cures for diseases, which is perhaps the most important way to increase quality of life long-term. AI will also present new threats to society that we have to address. No company can sufficiently mitigate these on their own; we will

5:01 PM · Mar 24, 2026

Introducing Expect
Let agents test your code in a real browser
1. Run Claude Code / Codex to QA your app
2. Watch a video of every bug found
3. Fix and repeat until passing
Run as a CLI or agent skill. Fully open source

4:06 PM · Mar 25, 2026

Introducing the new dev-browser cli. The fastest way for an agent to use a browser is to let it write code. Just `npm i -g dev-browser` and tell your agent to "use dev-browser"

4:27 PM · Mar 25, 2026

Announcing - the open standard for Agent Companies Import and run entire companies with a single command Just run `npx add <repo/company>` More 👇

4:12 PM · Mar 25, 2026

Daniel Griesser@DanielGri

I updated my interactive subagents to free up the main agent to be interactive as well (basically /btw but just a normal continuation) and the subagent asynchronously returns its result to the starting session

10:50 AM · Mar 24, 2026
