We keep hearing about giving agents the right context - that’s our job now.
But how do you actually give your agent instruction files so it writes/designs/codes/whatever like YOU want it to?
A pattern I see a lot is getting your agent to interview you.
Then add a specific ask based on your goals…
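The "specific ask" can be as simple as a prompt along these lines (the wording here is mine, as a sketch, not a magic formula):

```
Interview me, one question at a time, about the course I want to create.
Dig into my goals, my audience, and what "done" looks like for a student.
After ~20 questions, draft a section-by-section outline with a short
"what this covers" note under each section.
```

One question at a time matters: it keeps the agent probing instead of dumping a generic questionnaire on you.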
I recently did this with the course I’m working on.
I hate courses. I don't think the majority of them do all that much teaching. They walk you through steps that mimic progress towards the end goal. But once you 'graduate' (not many people do), you're on your own.
But real life is never that straightforward: you'll always hit bumps in the road, and courses don't give you the knowledge to navigate them.
I get stuck with blank page syndrome. I need something, anything, to start me off - even if it’s AI slop.
So I asked my agent to interview me.
The agent asked me 20 questions and I spoke my rambling thoughts back. I was genuinely surprised how often it remembered to probe deeper, or asked clarifying questions like 'do you actually mean to go down this route or that one?'
All in all, a very helpful exercise in getting off the blank page. I now have a number of sections with 'what this covers' notes. Even if at first glance I know I'm going to remove/merge/edit a lot, I'm moving forward.
What am I building this week?
I’m furiously working away on the course I mentioned above, named Fork Off.
I want to revisit my OpenClaw/personal agent memory system - has anyone found one that they absolutely love and swear by?
I really want to make a YouTube wrapper for my kids where I can pre-approve channels I let them watch. Fuck CocoMelon, ASMR, cutting coloured sand and all that crap. YT’s algo just constantly surfaces these. Also if you make a YT video for kids, please put the thumbnail scene at the start of the video - or you get meltdowns 🙃
Ben’s Bites is brought to you by Reevo
Go stackless and get back to selling. Remember when selling meant talking to people? Before the tab-switching and endless sync errors. Reevo brings it all back to one platform. Prospecting, calls, pipeline, and reporting all in a single tab. From prospect to close. Go Stackless. reevo.ai*
You can schedule recurring cloud-based tasks in Claude Code, and you can now let Claude use your computer to complete tasks. It tries your connectors first; if there's no connector, it'll use your computer to open the app (your computer must be on!). Plus, projects are now available in Cowork.
Factory Missions are long-running agents designed to automate large software tasks, like building applications from scratch. This is genuinely the closest feeling of AGI I've ever had: you spend decent time planning your mission, but then it just does everything end to end.
ChatGPT now has a library of the files you upload, making it easier to reference them. OpenAI is also planning to simplify its product experience and launch one “superapp” - much like Claude has done with their Desktop product.
Cursor launched Composer 2 as its latest 'in-house' coding model. It came to light that the model was a tuned version of Kimi's 2.5 open-source model - a detail Cursor failed to mention, which caused some rumblings on X. They boasted about their high scores on their own benchmark, CursorBench, but only compared themselves against Claude Code/Codex (not the other harnesses that outperform them), which feels weird considering Cursor is a harness itself. They also released 'Glass', their new interface that follows the 3-column layout lots of apps are using.
SpaceX, Tesla and xAI launched TERAFAB, the largest chip manufacturing facility ever (1TW/year). This post from Sequoia partner Shaun Maguire puts forward the idea that everyone is sleeping on xAI, and that it will win in AI.
New model, worse benchmark. Plot twist: the truth files were wrong. AssemblyAI found their AI was penalized for correctly transcribing words that human labelers missed. Live workshop March 31 on why WER breaks and how to fix your eval pipeline.*
* sponsors who make this newsletter possible :)
Wanna partner with us for the next quarter?