So I picked up a Mac Mini - M1, 16GB RAM - at a price that felt reasonable enough to justify the experiment. When it arrived, I gave it the full server treatment: locked down ports, threw Tailscale on it so I could reach it from anywhere without opening anything scary to the outside world. The setup was genuinely smooth. No drama there.
And just like that, we were off to the races.
Getting It Talking
The first milestone was getting a Telegram bot up and running so I'd have a proper channel to chat with my new AI coworker. That part is genuinely easy. I had it running and responding within the first hour, which felt like a good sign.
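For flavor, the plumbing involved is genuinely small. This is not OpenClaw's actual integration, just a minimal stdlib sketch of the two Bot API calls a chat bridge needs (`getUpdates` for long polling, `sendMessage` for replies); the token is a placeholder you'd get from BotFather.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.telegram.org"

def api_url(token: str, method: str) -> str:
    """Build a Telegram Bot API endpoint URL for the given method."""
    return f"{API_BASE}/bot{token}/{method}"

def send_message_request(token: str, chat_id: int, text: str) -> urllib.request.Request:
    """Prepare (but don't send) a sendMessage request for the given chat."""
    data = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode()
    return urllib.request.Request(api_url(token, "sendMessage"), data=data)

def poll_once(token: str, offset: int = 0) -> list:
    """One long-poll cycle: fetch pending updates via getUpdates."""
    url = api_url(token, "getUpdates") + f"?timeout=30&offset={offset}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp).get("result", [])
```

A real bridge just loops on `poll_once`, tracks the highest `update_id` as its next offset, and answers each message. The hour I spent was mostly on wiring, not on anything conceptually hard.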
From there I took a detour trying to get Ollama working - the appeal of free local inference is hard to resist - but that experiment didn't get very far. It turns out I already had an OpenAI Codex account (the Pro subscription, nothing exotic) that I could hook up through OAuth, with OpenRouter configured as a fallback for additional model coverage. Not free, but the token costs felt pretty manageable for what I was doing. We moved on.
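The primary-plus-fallback arrangement is worth a sketch, since it's what makes the "not free, but manageable" math work. This is the general shape rather than OpenClaw's actual routing code, and the provider names are just labels I've chosen: try the Codex account first, fall back to OpenRouter only when the primary call fails.

```python
from typing import Callable, List, Tuple

class AllProvidersFailed(RuntimeError):
    """Raised when every configured provider errors out."""

def complete_with_fallback(prompt: str,
                           providers: List[Tuple[str, Callable[[str], str]]]) -> str:
    """Try each (name, call) provider in order; return the first success."""
    errors = []
    for name, call in providers:
        try:
            return call(prompt)
        except Exception as exc:  # auth failure, rate limit, outage...
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))
```

The nice property is that the fallback only costs money when the primary actually fails, so OpenRouter acts as extra model coverage rather than a second bill.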
The App That Almost Wasn't
Here's where it gets interesting. I had an app idea I'd been sitting on for a long time, and this felt like the perfect stress test: could I actually build something real without touching a single line of code myself? Just orchestrating, chatting, reviewing PRs?
The short answer is yes. Two weeks later, I had a working MVP. I edited zero lines of code the entire time. That's the headline.
The longer answer involves a period in the middle where I was genuinely close to wiping the whole install and starting over.
When the Wheels Fell Off
About a week in, things started going sideways. The pattern was maddening: I'd give the agent a task, it would enthusiastically confirm it was on the job, and then... nothing. Twenty minutes later I'd check back and it hadn't started. Tell it to get going, it apologizes and commits to action. Another twenty minutes. Still nothing. Repeat until you question your life choices.
This loop went on for several days. A few things were misconfigured in ways I hadn't caught, and they compounded each other, so the symptoms on the surface never pointed at any single cause.
What actually helped me get to the bottom of it was pulling in Claude Code, my daily driver, to analyze the configuration and figure out where things had gone awry. Having a second set of AI eyes on the problem, one that wasn't the agent itself, turned out to be exactly the right move. We found the issues, tweaked things into shape, and the agent finally started actually doing what I asked it to do.
Now we were cooking.
Lessons From the Trenches
A few things I'd pass along if you're thinking about doing this:
You can't talk to it like you talk to Claude. This one surprised me. I tend to be pretty conversational with Claude - friendly, a little casual, throw in a compliment when it does something clever. Turns out that vibe doesn't translate to OpenClaw. The agent would interpret positive feedback as a signal that the job was done and... just stop working. Being warm and collaborative was actively getting in the way. You need to be businesslike. Focused. "Here is the task, here are the acceptance criteria, go." It felt a little cold, but it worked.
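In practice I ended up templating that businesslike framing so every task message had the same shape. A toy version of what I mean (my own helper, nothing from OpenClaw itself):

```python
def task_brief(task: str, acceptance: list, context: str = "") -> str:
    """Format a terse task message: the job, the criteria, no chit-chat."""
    lines = [f"Task: {task}"]
    if context:
        lines.append(f"Context: {context}")
    lines.append("Acceptance criteria:")
    lines += [f"- {c}" for c in acceptance]
    # No praise, no pleasantries: the agent reads warmth as "we're done here".
    lines.append("Report back only when all criteria pass.")
    return "\n".join(lines)
```

The closing line matters more than I expected: giving the agent an explicit condition for when the job ends removes the ambiguity that friendly feedback was creating.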
GitHub access needs a real strategy. For now, I'm using my own credentials with a fine-grained token scoped to the repos I want the agent touching. It works, but it comes with an obvious tradeoff: the agent can act on my behalf anywhere I've given it access. That's fine for personal projects, but if I ever wanted to use this setup for actual client work, I'd need to take a much harder look at the permission model. A dedicated bot account is probably the right long-term answer.
The git workflow is worth the setup cost. I wanted the agent working on a feature-branch + PR loop: build a feature, open a PR, wait for my review, merge on approval. When it was actually running correctly, this was great - I had natural checkpoints to review work before it landed. The frustrating part is that a lot of the "agent hesitation" issues I ran into happened right here, at the point of getting started on a task. Once the configuration problems were sorted, the workflow itself held up well.
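The loop I wanted is simple enough to write down as a little state machine. This is a model of the process, not the agent's real implementation; the state names and branch-naming convention are mine. The useful bit is that "changes requested" routes back to building rather than ending the task.

```python
from dataclasses import dataclass
from enum import Enum, auto

class State(Enum):
    BRANCHING = auto()           # task accepted, branch not yet created
    BUILDING = auto()            # agent working on the feature branch
    PR_OPEN = auto()             # PR opened, waiting on human review
    CHANGES_REQUESTED = auto()   # review sent it back for rework
    MERGED = auto()              # approved and landed

@dataclass
class FeatureTask:
    name: str
    state: State = State.BRANCHING
    branch: str = ""

    def start(self) -> None:
        """Cut a feature branch and begin work."""
        self.branch = f"feature/{self.name.lower().replace(' ', '-')}"
        self.state = State.BUILDING

    def open_pr(self) -> None:
        """Open (or reopen) the PR once work is ready for review."""
        assert self.state in (State.BUILDING, State.CHANGES_REQUESTED)
        self.state = State.PR_OPEN

    def review(self, approved: bool) -> None:
        """Human checkpoint: merge on approval, otherwise back to rework."""
        assert self.state == State.PR_OPEN
        self.state = State.MERGED if approved else State.CHANGES_REQUESTED
```

Every transition into `PR_OPEN` is a natural checkpoint for me, and nothing lands without passing through `review`. Most of my hesitation bugs lived in the `BRANCHING` to `BUILDING` transition, which matches the "confirms the task, then never starts" symptom.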
Use your other tools to debug it. Seriously. When OpenClaw was misbehaving, the most productive thing I did was step outside it entirely and use a different AI-assisted tool to look at the setup with fresh "eyes". Don't just ask the broken thing why it's broken.
What's Next
The app shipped (rough MVP, functionality first, polish later), and since then I've been putting the agent to work on smaller tasks: bug fixes, tweaks, maintenance work across a few other projects. It's been holding up.
I'm also building a dedicated site for tools aimed at managing an OpenClaw installation - macOS to start, but I want to see how far I can stretch it toward Linux and eventually Windows (virtualization, most likely). More on that as it comes together.
If you've been on the fence about setting up your own OpenClaw instance, hopefully this is useful context. It's not magic, and the first two weeks will test your patience in ways you don't expect. But there's something genuinely compelling about watching software get built while you just... supervise.
Drop a comment if you've been through this too - I'd love to hear how your setup went.