The most interesting thing about everyone trying to position themselves as AI experts is the futility of it: the technology explicitly promises tomorrow's models will be better than today's, which means the skill investment is deflationary: the best time to learn anything is tomorrow, when a better model will be better at doing the same work - because you don't need to be (conversely if you're not good at debugging and reverse engineering now...)
doesn’t that presume no value is being delivered by current models?
I can understand applying this logic to building a startup that solves today’s ai shortcomings… but value delivered today is still valuable even if it becomes more effective tomorrow.
You nailed it. That's exactly how I feel. Wake me up when the dust settles, and I'll deep dive and learn all the ins and outs. The churn is just too exhausting.
I don't get the pressure. I don't know about you, but my job for a long time has been continually learning new systems. I don't get how so many of my peers fall into this head trip where they think they are gonna get left behind by what amounts to anticipated new features from some SaaS one day.
How do you both hold that the technology is so revolutionary because of its productive gains, but at the same time so esoteric that you better be on top of everything all the time?
This stuff is all like a weird toy compared to other things I have taken the time to learn in my career. The sense of expertise people claim at all comes off to me like a guy who knows the Taco Bell secret menu, or the best set of coupons to use at Target. It's the opposite of intimidating!
A slow, deliberate approach is an excellent idea. We get nowhere by jumping at every shiny thing. But life may be too short to wait for the dust to settle.
I may just be a "doomer", but my current take is we have maybe 3-5 years of decent compensation left to "extract" from our profession. Being an AI expert will likely extend that range slightly, but at the cost of being one of the "traitors" that helps build your own replacement (but it will happen with or without you).
I have a reading list of a bunch of papers I didn't get through over the past 2 years. It is crazy how many papers on this list are completely not talked about anymore.
I kinda regret going through the SeLU paper lol back in the late 2010s.
Is this the product? I don't want to jump on the detractor wagon, but I read the post and watched the video, and all I gathered is that it dumps the context into the commit. I already do this.
Git is totally fine keeping a few extra text files. These are ephemeral anyway. The working sessions just get squashed down and eliminated by the time I've got something worth saving anyway. At that point, I might keep an overview file around describing what the change does and how it was implemented.
(I will give the agent boom a bit of credit: I write a lot more documentation now, because it's essentially instruction and initial instruction to anything else that works on it. That's a total inversion, and I think it's good.)
The bigger problem is, like others have said, there's no one true flow. I use different agents for different things. I might summarize a lot of reasoning with a cheap model to create a design document, or use a higher reasoning model to sanity check a plan, whatever. It's a lot like programming in English. I don't want my tool to be prescriptive and imposing its technical restrictions on me.
All of that aside: it's impossible that this tool raised $60 million. The problem with this post is that it's supposed to be a hype post about changing the game "entirely" but it doesn't give us a glimpse into whatever we're supposed to be hyped about.
I have it (claude, codex) summarise what we've discussed about a design, big change, put it in an MD file and then I correct it, have it re-read it and then do the change.
Then later if it goes off piste in another session tell it to re-read the ADDs for x, y and z.
If someone could make that process less clunky, that would be great. However it's very much not just funnelling every turd uttered in the prompt onto a git branch and trying to chug the lot down every session.
Very similar for me. I have a plans folder in my root where I store the plans while they're either under improvement or under implementation. Once they're done they're moved into the plans/old folder. So far it's worked great. It's a couple of manual steps extra but very helpful record.
Pretty much the same thing. I don't find it to be a burden. Regarding the product, I'm willing to believe I just don't see big picture, but without some peek at the magic, I don't know how much easier this could really be.
We went from having new JavaScript frameworks every week to having new AI frameworks every week. I'm thinking I should build a HN clone that filters out all posts about AI topics...
Looking at the most popular agent skills, heavily geared towards react and JS, I think a lot of the most breathless reports of LLM success are weighted towards the same group of fashion-dependent JavaScript developers.
The same very online group endlessly hyping messy techs and frontend JS frameworks, oblivious to the Facebook and Google sized mechanics driving said frameworks, are now 100x-ing themselves with things like “specs” and “tests” and dreaming big about type systems and compilers we’ve had for decades.
I don’t wanna say this cycle is us watching Node jockeys discover systems programming in slow motion through LLMs, but it feels like that sometimes.
Just give me your bank account, claude API, Mother's maiden name, your zip code, your 3 digit security code, and anything else you think I might need to live as malfist the magnificent. Can I call you that?
I've long wished for a 'filter' feature for the hn feed -- namely the old trend of web3 slop -- but with little else than keywords to filter, it would likely be tedious and inaccurate. Ironically, I think with AI/LLMs it could be a little easier to analyze.
This is how software is being written now. What you propose is like joining a forum called "Small-Scale Manufacturing News" and filtering out all 3D-printing articles.
The context preservation problem is genuinely painful - I've been using task.md files and CLAUDE.md conventions to maintain agent state across sessions, and it's duct tape at best. First-class "checkpoints" that capture reasoning alongside diffs is an appealing idea.
But I'm skeptical of building this as a separate platform rather than as tooling on top of git. The most useful AI dev workflow improvements I've seen (cursor rules, aider conventions, claude hooks) all succeeded precisely because they stayed close to existing tools. The moment you ask developers to switch their entire SDLC stack, adoption becomes the real engineering challenge - not the tech.
Curious whether the open source commitment means the checkpoint format itself will be an open spec that other tools can build on.
The CLI is open source, everyone can use it and it does work with git only. So, no separate platform needed. The platform only provides convenience to view checkpoints at the moment. However you can also view them in the CLI. It's here https://github.com/entireio/cli
Agents can save their reasoning into markdown files, and commit those files to Git. Are "Checkpoints" just a marketing term for that, or is there more to it?
Claude Code already does this, you can access it with /resume, /rewind and /fork. I'd imagine building a version that saves in the repo instead of in the home folder would take very minimal effort.
The domain expired a few days ago and was purchased by someone else and then changed. There's a recreation of the original here https://html5zombo.com/
Exactly ... tired by all the marketing hyperbole talk. Just show what your product does in a simple example / showcase. If it's good, people will like it. You can save yourself a lot of text copy and user time that way.
The problem is that when it comes to (commercial) developer tools and services, everyone can/wants to be everything, so why let a simple statement or a showcase limit you? "Hey, we are a container scanning service... But we can also be a container registry too, a CI, a KeyValue store, an agent sandbox provider, git hosting? We can do quick dev deployments/preview too. Want a private npm registry? Automated pull request reviews? Code Signing service? We are working on a new text editor btw"
I feel like these types of pages are less geared towards actual users of the product and more towards the investors who love the vague and flowery language. We're no longer in a world where the path to profitability was the objective goal anyway, so it makes sense to me that the marketing of software is becoming increasingly detached from reality.
It's almost like an extension of the "if you're not paying for the product, you are the product" idea. If you're assessing a tool like this and the marketing isn't even trying to communicate to you, the user, what the product does, aren't you also kind of "the product" in this case too?
Seems they install a Git hook or something that executes on commit and saves your chatbot logs associated with the commit hash. This is expected to somehow improve on the issue that people are synthesising much more code than they could read and understand, and make it easier to pass along a bigger context next time you query your chatbots, supposedly to stop them from repeating "mistakes" that have already wasted your time.
What it does? Imagine a multi line commit message.
Yes yes a Dropbox comment. But the problem here is 1 million people are doing the same thing. For this to be worth 60M seed I suspect they need to do something more than you can achieve by messing around locally.
"Claude build me a script in bash to implement a Ralph loop with a KV store tied to my git commits for agent memory."
> Spec-driven development is becoming the primary driver of code generation.
This sounds like my current "phase" of AI coding. I have had so many project ideas for years that I can just spec out, everything I've thought about, all the little ideas and details, things I only had time to think about, never implement. I then feed it to Claude, and watch it meet my every specification, I can then test it, note any bugs, recompile and re-test. I can review the code, as you would a Junior you're mentoring, and have it rewrite it in a specific pattern.
Funnily enough, I love Beads, but did not like that it uses git hooks for the DB, and I can't tie tickets back to ticketing systems, so I've been building my own alternative, mine just syncs to and from github issues. I think this is probably overkill for what's been a solved thing: ticketing systems.
I am going lower level - every individual work item is a "task.md" file, starts initially as a user ask, then add planning, and then the agent checks gates "[ ]" on each subtask as it works through it. In the end the task files remain part of the project, documenting work done. I also keep an up to date mind map for the whole project to speed up start time.
And I use git hooks on the tool event to print the current open gate (subtask) from task.md so the agent never deviates from the plan, this is important if you use yolo mode. It might be an original technique; I have never heard of anyone using it. A sticky note in the tool response, printed by a hook, that highlights the current task and where the current task.md is located. I have seen stretches of 10 or 15 minutes of good work done this way with no user intervention. Like a "Markdown Turing Machine".
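A minimal sketch of the "print the current open gate" idea, assuming gates are written as Markdown checkboxes (`- [ ]` / `- [x]`) in task.md. The names `current_gate` and `sticky_note` and the wording of the reminder are hypothetical, not taken from the commenter's tooling, and how the output gets wired into an agent's tool hook depends on the agent.

```python
import re
from typing import Optional

# Matches a Markdown checkbox line such as "- [ ] write tests" or "* [x] plan".
CHECKBOX = re.compile(r"^\s*[-*]\s*\[( |x|X)\]\s*(.+?)\s*$")


def current_gate(task_md: str) -> Optional[str]:
    """Return the text of the first unchecked gate, or None if all are checked."""
    for line in task_md.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group(1) == " ":
            return match.group(2)
    return None


def sticky_note(task_md: str, path: str = "task.md") -> str:
    """Format the reminder a hook could append to each tool response."""
    gate = current_gate(task_md)
    if gate is None:
        return f"All gates in {path} are checked."
    return f"Current gate ({path}): {gate}"


# Made-up task file for illustration.
example = """# Task: add CSV export
- [x] plan the change
- [ ] write the exporter
- [ ] add tests
"""
print(sticky_note(example))
```

A hook script would read the real task.md from disk and print the note; the parsing is deliberately tiny so it stays cheap to run on every tool call.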
That's hilarious, I called it gates too for my reimplementation of Beads. Still working on it a bit, but this is the one I built out a month back, got it into git a week ago.
For me a gate is: a dependency that must pass before a task is closed. It could be human verification, unit testing, or even "can I curl this?" "can I build this?" and gates can be re-used, but every task MUST have one gate.
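One way to model that definition, as a rough sketch: a gate is a named, reusable check, and a task refuses to close until it has at least one gate and all of its gates pass. The `Gate` and `Task` classes are hypothetical illustrations, not the commenter's implementation, and the lambdas stand in for real checks such as running a build or waiting for human sign-off.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Gate:
    """A reusable check that must pass before a task may close."""
    name: str
    check: Callable[[], bool]


@dataclass
class Task:
    title: str
    gates: List[Gate] = field(default_factory=list)
    closed: bool = False

    def close(self) -> None:
        # Enforce the rule that every task must have a gate.
        if not self.gates:
            raise ValueError(f"task {self.title!r} has no gate")
        failed = [g.name for g in self.gates if not g.check()]
        if failed:
            raise RuntimeError(f"gates not passed: {', '.join(failed)}")
        self.closed = True


builds = Gate("can I build this?", lambda: True)   # stand-in for a real build
human = Gate("human verification", lambda: False)  # not yet signed off

task = Task("add CSV export", gates=[builds, human])
try:
    task.close()
except RuntimeError as err:
    print(err)
```

Because gates are plain objects, the same `builds` gate can be attached to any number of tasks, which matches the "gates can be re-used" part of the description.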
My issue with git hooks integration at that level is, and I know this sounds crazy, that not everyone is using git. I run into legacy projects, or maybe it's still greenfield as heck, and all you have is a POC zip file your manager emailed you for whatever awful reason. I like my tooling to be agnostic to models and external tooling so it can easily integrate everywhere.
Yours sounds pretty awesome for what it's worth, just not for me, wish you the best of luck.
Task management is fundamentally straightforward and yet workflow specific enough that I recommend everyone just spend a few hours building their own tools at this point.
Me too. I've been using spec-kitty [0], a fork of Spec Kit. Quite amazing how a short interview on an idea can produce full documents of requirements, specs, tasks, etc. After a few AI projects, this is my first time using spec driven development, and it is definitely an improvement.
I shall give the benefit of the doubt given they are "building in the open". I feel my current setup already does all this though, so I struggle to see the point.
It’s funny. The whole “review intent", "learning" from past mistakes, etc, is exactly what my current set up does too. For free. Using .md files said agents generate as they go.
Hah. "If it's not too much trouble, would you mind if we disable the rimraf root feature?"
Gotta bully that thing man. There's probably room in the market for a local tool that strips the superfluous niceties from instructions. Probably gonna save a material amount of tokens in aggregate.
The AI fatigue is real, and the cooling-off period is going to hurt. We’re deep into concept overload now. Every week it’s another tool (don’t get me started on Gas Town) confidently claiming to solve… something. “Faster development”, apparently.
Unless you’re already ideologically committed to this space, I don’t see how the average engineer has the energy or motivation to even understand these tools, never mind meaningfully compare them. That’s before you factor in that many of them actively remove the parts of engineering people enjoy, while piling on yet another layer of abstraction, configuration, and cognitive load.
I’m so tired of being told we’re in yet another “paradigm shift”. Tools like Codex can be useful in small doses, but the moment it turns into a sprawling ecosystem of prompts, agents, workflows, and magical thinking, it stops feeling like leverage and starts feeling like self-inflicted complexity.
Your point about the overwhelming proliferation of AI tools and not knowing which are worth any attention and which are trash is very true. I feel that a lot today (my solution is basically to just lean into one or two and ask for recommendations on other tools with mixed success).
The “I’m so tired of being told we’re in another paradigm shift” comments are widely heard and upvoted on HN and are just so hard to comprehend today. They are not seeing the writing on the wall and following where the ball is going to be even in 6-12 months. We have scaling laws, multiple METR benchmarks, internal and external evals of a variety of flavors.
“Tools like codex can be useful in small doses” the best and most prestigious engineers I know inside and outside my company do not code virtually at all. I’m not one of them but I also do not code at all whatsoever. Agents are sufficiently powerful to justify and explain themselves and walk you through as much of the code as you want them to.
Yeah, I’m not disputing that AI-assisted engineering is a real shift. It obviously is.
My issue is that we’ve now got a million secondary “paradigm shifts” layered on top: agent frameworks, orchestration patterns, prompt DSLs, eval harnesses, routing, memory, tool calling, “autonomous” workflows… all presented like you’re behind if you’re not constantly replatforming your brain.
Even if the end-state is “engineers code less”, the near-term reality for most engineers is still: deliver software, support customers, handle incidents, and now also become competent evaluators of rapidly changing bot stacks. That cognitive tax is brutal.
So yes, follow where the ball is going. I am. I’m just not pretending the current proliferation is anything other than noisy and expensive to keep up with.
> I don’t see how the average engineer has the energy or motivation to even understand these tools, never mind meaningfully compare them
This is why I use the copilot extension in VS code. They seem to just copy whatever useful thing climbs to the surface of the AI tool slop pile. Last week I loaded it up and Opus 4.6 was there ready to use. Yesterday I found it has a new Claude tool built in which I used to do some refactoring... it worked fine. It's like having an AI tool curator.
I don't understand how this is different from giving an agent access to github logs? The landing page is terrible at explaining what it does. I guess they are just storing context in git as well?
So is this just a few context.md files that you tell the agent to update as you work and then push it when you are done???
Huh, the checkpoint primitive is something that I've been thinking about for a while, excited to see how it's implemented in the CLI. Git-compatible structures seem to be a pretty big pull whenever they're talking about context management.
Actually interesting, but how's that different from just putting your learning / decision context into the normal commit text (body) ? An LLM can search that too, and doesn't require a new cli tool.
EDIT: Or just keep a proper (technical) changelog.txt file in the repo. A lot of the "agentic/LLM engineering frameworks" boil down to best approaches and proper standards the industry should have been following decades ago.
After I have an AI do a task, I ask the next one to look at that plan and the git diff, and to double-check and validate it.
I don't see the need for a full platform that is separate from where my code already lives. If I'm migrating away, it's to something like tangled, not another VC funded company
Hey, is JJ compatibility in the cards? Considering the blog article hints at a goal of a developerless agent-to-agent automation platform I'm guessing developer conveniences are a side quest rn?
I had a similar, admittedly poorly thought out idea a few months back.
I wanted to more or less build Jira for agents and track the context there.
If I had to guess 60 million is just enough to build the POC out. I don't see how this can compete though, OpenAI or Anthro could easily spin up a competitor internally.
This is a good idea but I feel like you could get something similar by just adding an instruction for the agent to summarize the context for the commit into a .context/commit/<sha> file as a git hook.
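As a rough sketch of that suggestion, the helper below writes a summary to `.context/commit/<sha>` under the repo root. The function name `write_commit_context`, the sample sha, and the sample summary are all made up for illustration; producing the summary (by the agent) and triggering the helper (from a hook) are left out.

```python
import tempfile
from pathlib import Path


def write_commit_context(repo_root: Path, sha: str, summary: str) -> Path:
    """Write an agent's summary to <repo_root>/.context/commit/<sha>."""
    target = repo_root / ".context" / "commit" / sha
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(summary.rstrip() + "\n", encoding="utf-8")
    return target


# Demonstration in a throwaway directory; the sha and summary are made up.
with tempfile.TemporaryDirectory() as tmp:
    path = write_commit_context(Path(tmp), "abc1234", "Switched parser to streaming.")
    relative = path.relative_to(tmp).as_posix()
    saved = path.read_text(encoding="utf-8")

print(relative)
```

One design wrinkle: a commit's sha only exists once the commit has been made, so a file named after it cannot be part of that same commit. It would have to land in a follow-up commit or stay untracked.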
Exactly. I don't want to wade through a whole session log just to get to reasoning, and more importantly, I don't want to taint my current agent context with a bunch of old context.
Context management is still an important human skill in working with an agent, and this makes it harder.
Checkpoints sounds like an interesting idea, and one I think we'll benefit from if they can make it useful.
I tried a similar(-ish) thing last year at https://github.com/imjasonh/cnotes (a Claude hook to write conversations to git notes) but ended up not getting much out of it. Making it integrated into the experience would have helped, I had a chrome extension to display it in the GitHub UI but even then just stopped using it eventually.
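For readers unfamiliar with git notes, here is a small sketch of attaching text to a commit without changing the commit itself. It only builds the command line; the helper name `notes_command` and the notes ref `agent-sessions` are assumptions for illustration and are not taken from cnotes.

```python
from typing import List


def notes_command(sha: str, text: str, ref: str = "agent-sessions") -> List[str]:
    """Build a `git notes` invocation that attaches text to a commit."""
    # --ref keeps these notes in their own namespace (refs/notes/<ref>);
    # -f overwrites any note already stored for this commit under that ref.
    return ["git", "notes", "--ref", ref, "add", "-f", "-m", text, sha]


cmd = notes_command("HEAD", "Session summary: chose streaming parser.")
print(" ".join(cmd[:6]))
```

Inside a repository, the list could be passed to `subprocess.run(cmd, check=True)`. Notes live outside the commit history, which may be part of why they are easy to forget about unless a UI surfaces them.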
disclosure: i run a startup that will most likely be competitive in the future.
I welcome more innovation in the code forge space but if you’re looking for an oss alternative just for tracking agent sessions with your commits you should check out agentblame
Another of your competitors here. It makes me giggle that we're going after the entire developer experience while Entire is only looking at a small corner of it.
Did you have to choose an adjective to name your product? Now it’s going to be very confusing for search engines and LLMs.
“Tell me more about entire.”
“Entire what?”
“You know, that entire thing.”
love the shout but git-ai is decidedly not trying to replace the SCMs. there are teams building code review tools (commercial and internal) on top of the standard and I don't think it'll be long before GitHub, GitLab and the usual suspects start supporting it since folks in the community have already been hacking it into Chrome extensions - this one got play on HN last week https://news.ycombinator.com/item?id=46871473
This feels a bit like when some Hubbers broke off to work on PlanetScale, except without the massively successful, proven-to-be-scalable open source tool to build off (Vitess).
If you're approaching this problem-space from the ground up, there are just so many fundamental problems to solve that it seems to me that no amount of money or quality of team can increase your likelihood of arriving at enough right answers to ensure success. Pulling off something like this vision in the current red-ocean market would require dozens of brilliant ideas and hundreds of correct bets.
The lack of explanation of what it is and does is a tell of what gullible audience they are seeking.
Tech marketing has become a lot like dating, no technical explanation and intellectual honesty, just word words words and unreasonable expectations.
People usually cannot be honest in their romantic affairs, and here it is the same. Nobody can state: we just want to be between you and whatever you want to accomplish, rent seeking forever!
Will they ever care to elaborate HOW things work and the rationale behind stating this provides any benefit whatsoever? Perhaps this is not intended for the type of humans that care about understanding and logic?
Essentially all software is augmented with agentic development now, or if not, built with technology or on platforms that are
It's like complaining about the availability of the printing press because it proliferated tabloid production, while preferring beautifully hand-crafted tomes. It's reactively trendy to hate on it because of the vulgar production it enables and to elevate the artisanal extremes that escape its apparent influence
What part of Voyager I and Voyager II are "augmented with agentic development?"
Surely if all software is augmented with agentic development now, our most important space probes have had their software augmented too, right?
What about my blog that I serve static pages on? What about the xray machine my dentist uses? What about the firmware in my toaster? Does the New York Stock Exchange use AI to action stock trades? What about my telescope's ASCOM driver?
You’re talking about a 1970s satellite? I guess you win the argument?
Blog: I use AI to make and blog developers are using agentic tools
X-ray machine: again a little late here, plus if you want to start dragging in places that likely have a huge amount of bureaucracy I don’t know that that’s very fair
Firmware in your toaster: cmon these are old basic things, if it’s new firmware maybe? But probably not? These are not strong examples
NYSE to action on stock trades; no they don’t use AI to action on stock trades (that would be dumb and slow and horribly inefficient and non-deterministic), but may very well now be using AI to work on the codebase that does
Let’s try to find maybe more impactful examples than small embodied components in toasters and telescopes, 1970s era telescopes that are already past our solar system.
I’m saying you’re missing the point and the spirit of the argument. Yes, you are right, voyager doesn’t use agentic AI! I don’t even think the other examples you used are as agentic-free as you think. They may or may not be! What’s the point you want to make?
It's really not as integral as you make it sound. If I make one PR on a widely used open source tool with a small fix, is most software development augmented by me?
Outside of simply not being true, the sentiment of what you're saying isn't much different than:
"Essentially all software is augmented with Stack Overflow now, or if not, built with technology or on platforms that is."
Agentic development isn't a panacea nor as widespread as you claim. I'd wager that the vast majority of developers treat AI as a more specific search engine to point them in the direction they're looking for.
AI hallucination is still a massive problem. Can't tell you the number of times I've used agentic prompting with a top model that writes code for a package based on the wrong version number or flat out invents functionality that doesn't exist.
I just cannot fathom how people can say something like this today. Agentic tools have now passed an inflection point. People want to point out the shortcomings and fully ignore that you can now make a fully functioning iPhone app in a day without knowing Swift or front end development? That I can at my company do two projects simultaneously, both of them done in about 1/4 the time and one would not have even been attempted before due to the SWE headcount you would have to steal. There are countless examples I have in my own personal projects that are just such an obvious counterexample to the moaning “I appreciate the craft” people or “yea this will never work because people still have to read the code” (today sure and this is now made more manageable by good quality agents, tomorrow no. No you won’t need to read code.)
I've found that the effort required to get a good outcome is roughly equal to the effort of doing it myself.
If I do it myself, I get the added bonus of actually understanding what the code is doing, which makes debugging any issues down the line way easier. It's also generally better for teams b/c you can ask the 'owner' of a part of the codebase what their intuition is on an issue (trying to have AI fill in for this purpose has been underwhelming for me so far).
Trying to maintain a vibecoded codebase essentially involves spelunking through an unfamiliar codebase every time manual action is needed to fix an issue (including reviewing/verifying the output of an AI tool's fix for the issue).
(For small/pinpointed things, it has been very good. e.g.: write a python script to comb through this CSV and print x details about it/turn this into a dashboard)
In sonnet 4 and even 4.5 I would have said you are absolutely right, and in many cases it slows you down especially when you don’t know enough to sniff trouble.
Opus 4.5 and 4.6 is where those instances have gone down, waaay down (though still true). Two personal projects I had abandoned after sonnet built a large pile of semi working cruft it couldn’t quite reason about, opus 4.6 does it in almost one shot.
You are right about learning but consider: you can educate yourself along the way. In some cases it’s no substitute for writing the code yourself, and in many cases you learn a ton more because it’s an excellent teacher and you can try out ideas to see which work best or get feedback on them. I feel I have learned a TON about the space, though unlike when I code it myself I may not be extremely comfortable with the details. I would argue we are about 30% of the way to the point where writing things yourself is not just no longer relevant, it’s a disservice to your company.
The founder has only forked repositories on GitHub that are sort of light web development related.
His use of bombastic language in this announcement suggests that he has never personally worked on serious software. The deterioration of GitHub under his tenure is not confidence inspiring either, but that of course may have been dictated by Nadella.
If you are very generous, this is just another GitHub competitor dressed up in AI B.S. in order to get funding.
Founder here. I built commercial insurance software for Windows 95 in the 1990s, driver assistant systems at Mercedes and at Bosch in the early 2000s, dozens of iPhone apps as contractor, a startup called HockeyApp (acquired by Microsoft), and various smaller projects, mostly in Ruby on Rails. And of course, when I left Microsoft & GitHub, 10 years of green boxes were removed from my GitHub profile.
Sorry for not contributing to the discussion (as per the guidelines), but is it just me or does this blog post read a lot like LLM-filled mumbo jumbo? Seems like I could trim half of the words there and nothing would be lost.
Just have a data lake with annotated agent sessions and tool blobs (you should already be keeping this stuff for evals), then give your agent the ability to query it. No need for a special platform, or SaaS.
As for SDLC, you can do some good automations if you're very opinionated, but people have diverse tastes in the way they want to work, so it becomes a market selection thing.
I'm interested to see if they will try to tackle the segregation of human vs AI code.
The downside of agents is that they make too many changes to review. I prefer being able to tell which changes I wrote or validated apart from the code the AI wrote.
For people trying to understand the product (so far), it seems that entire is essentially an implementation of the idea documented by http://agent-trace.dev.
I am already overloaded with information (generated by AI and humans) on my day to day job, why do I need this additional context, unless company I work for just wants to spend more money to store more slop?
How is it different than reversing it, given a PR -> generate prompt based on business context relevant to the repo or mentioned issues -> preserve it as part of PR description
I barely look at git commit history, why should I look for even higher cardinality data, in this case: WTF, are you doing, idiot, I said don't change the logic to make tests pass, I said properly write tests!
New agent framework / platform every week now. It's crazy how fast things move...just when you get comfortable with an AI flow something new comes out...
I don't see how we need a brand new paradigm just because LLMs evidently suck at sharing context in their Git commits. The rules for good commits still apply in The New Age. Git is still good enough, LLMs (i.e. their developer handlers) just need to leverage it.
Personally, I don't let LLMs commit directly. I git add -p and write my own commit messages -- with additional context where required -- because at the end of the day, I'm responsible for the code. If something's unclear or lacks context, it's my fault, not the robot's.
But I would like to see a better GitHub, so maybe they will end up there.
I did test it and use it and trashed it because there is very little value, actually none for me. These problems are easily solved in other ways by anyone who has any experience with these tools. Getting a $60M round for this stuff is ridiculous.
Which only reinforces that someone just lit $60M on fire. It's trivial to do this and there are so many ways people do things; having the AI build something custom for you is better than paying some VC funded platform to build something for the average user.
$60M seed to wrap git hooks in YAML config. The AI tooling bubble is just VCs subsidizing solutions looking for problems while developers want less complexity, not more.
I really hate this trend of naming companies using dictionary words just because they can afford to spend cash on the domain name instead of engineering. Render, fly, modal, entire and so on.
Really struggling to figure out what this is at a glance. Buried in the text is this line which I think is the tl;dr:
"As a result, every change can now be traced back not only to a diff, but to the reasoning that produced it."
This is a good idea, but I just don't see how you build an entire platform around this. This feels like a feature that should be added to GitHub. Something to see in the existing PR workflow. Why do I want to go to a separate developer platform to look at this information?
I'm sure I'm missing something but can you not ask the llm to add the reasoning behind the commit in the comments as part of the general llm instructions?
Oh man I'm tired. This reminds me of the docker era. It's all moving fast. Everyone's raising money. And 24 months from now it's all consolidating. It's all a nice hype game when you raise the funding but the execution depends on people finding value in your products and tools. I would argue yes many of these things are useful but I'd also argue there's far too much overlap, too many unknowns and too many people trying to reinvent the whole process. And just like the container era I think we're going to see a real race to zero. Where most of the dev tools get open sourced and only a handful of product companies survive, if that. I want to wish everyone the best of luck because I myself have raised money and spent countless years building Dev tools. This is no easy task especially as the landscape is changing. I just think when you raise $60m and announce a cli. You're already dead, you just don't know it. I'm sorry.
I see the value since I built a similar tool different approach. Then there's Beads, which is what inspired my project, with some tens of thousands of developers using it or more now? I'm not sure how they figure how many users they have.
In my case I don't want my tools to assume git, my tools should work whether I open SVN, TFS, Git, or a zip file. It should also sync back into my 'human' tooling, which is what I do currently. Still working on it, but its also free, just like Beads.
I wouldn't wanna be in the rat race myself, but I know people who salivate at the opportunity to create some popular dev tool to get acquired by MS, Google or Amazon or whichever of the big tech companies that decide this could work well in their cloud ecosystem.
On the one hand they think these things provide 1337x productivity gains, can be run autonomously, and will one day lead to "the first 1 person billion dollar company".
And in complete cognitive dissonance also somehow still have fantasies of future 'acquisition' by their oppressors.
Why acquire your trash dev tool?
They'll just have the agents copy it. Hell, you could even outright steal it, because apparently laundering any licensing issues through LLMs short circuits the brains of judges to protohuman clacking rocks together levels.
With 60 million you could have waited for a bigger announcement? There's "AI fatigue" among the target market for these sorts of tools, advertising unfinished products will take its toll on you later.
This is what Claude had to say about your comment if we're doing this now:
Imagine being so intellectually lazy that you can't even be bothered to form your own opinion about a product. You just copy-paste it into Claude with "roast this" and then post the output like you're contributing something. That's not criticism, that's outsourcing your personality to an API call. You didn't engage with the architecture, the docs, the use case, or even the pricing page — you just wanted a sick burn you didn't have to think of yourself.
Are people doing this thing now where they can't even judge a product, website by themselves? Or read and analyze anything without asking an LLM to do it for them.
2026: The year everyone fried their brain with Think for Me SaaS.
I think there's a distribution of agency in humans, hence why we have insults like "npcs". It's probably not fair to use that word to describe people, but the cliche has some truth in it and I think a lot of tech exploits this.
I personally rarely need to use Google Maps, and if I do it's a glance at the beginning of a trip, and I can find my way there through normal navigation. I might look again if I get lost, whereas I have friends that use it to give directions to go five blocks. I don't think sense of direction is innate either, but it's a muscle you build, and some people choose not to work on that muscle and they suffer the consequences, albeit minor consequences.
I think we are seeing something similar with LLMs with the development and maintenance of reading, planning, creative and critical thinking skills. While some people might have a higher baseline, I think everyone has the ability to strengthen those muscles, and the world implores us to in many situations. However, now we can pay Altman $0.0010 cents to offload that workout onto a GPU, much like people do with navigation and maps. Tech companies love to exploit the dopamine-driven response from taking shortcuts and getting somewhere quickly; it's no different here.
I think (/know) the implications of this are much more hazardous than the consequences of not exercising your navigational abilities, and at least with navigation there are fallbacks to assist people (signs, landmarks, etc.). There are no societal fallbacks for LLM-assisted thinking once someone becomes dependent on it for all aspects of analysis, planning and creativity. Once it is taken away (or they can't afford the quality of output they previously did), where do those natural abilities stand? The implications are very terrifying in my opinion.
I'm personally trying to stay as far away as possible from these things. I see where this is heading and it's not as inconsequential as needing Maps to navigate 5 blocks. I do not want my critical thinking skills correlated 1:1 to the quality and quantity of tokens I can afford or have access to, any more than I want my navigational abilities correlated 1:1 to the quality of Maps service available to me.
People will say that this is cope, it's the new calculator, whatever. Have fun. I promise you that not knowing trigonometry but having access to an LLM does not give you the ability to write CAD software. I actually think not using these will give you a huge competitive advantage in the future. Someone who has great navigation skills will likely win a navigational competition in the mountains, or survive longer in certain situations.
While the scope of those skills is narrow, it still proves a point[0]. The scope of your reading, critical thinking, creativity and planning skills is not limited.
[0]: It should be noted that some of the world's most high-agency and successful people actually participate in navigation as a sport called orienteering, and spend boatloads of money on it. I wonder why that is?
I know a guy who uses AI to answer every question in his life. It tells him how to raise his kids, how to spend time with his wife. He takes it to the park with him and asks it what he should do there (on his phone). When people ask him questions, he forwards those questions directly to his phone and uses the response.
For any new piece of technology, there is a subset of people whom it will completely and utterly destroy.
I kinda regret going through the SeLU paper lol back in the late 2010s.
Is this the product? I don't want to jump on the detractor wagon, but I read the post and watched the video, and all I gathered is that it dumps the context into the commit. I already do this.
How's your ability to get an enterprise to mandate their 5000 employees to use it? That's what most of these types of rounds are about.
(I will give the agent boom a bit of credit: I write a lot more documentation now, because it's essentially instruction and initial instruction to anything else that works on it. That's a total inversion, and I think it's good.)
The bigger problem is, like others have said, there's no one true flow. I use different agents for different things. I might summarize a lot of reasoning with a cheap model to create a design document, or use a higher reasoning model to sanity check a plan, whatever. It's a lot like programming in English. I don't want my tool to be prescriptive and imposing its technical restrictions on me.
All of that aside: it's impossible that this tool raised $60 million. The problem with this post is that it's supposed to be a hype post about changing the game "entirely" but it doesn't give us a glimpse into whatever we're supposed to be hyped about.
2. Don't put it in the message. Put it in files.
Then later if it goes off piste in another session tell it to re-read the ADDs for x, y and z.
If someone could make that process less clunky, that would be great. However, it's very much not a matter of just funnelling every turd uttered in the prompt onto a git branch and trying to chug the lot down every session.
The same very online group endlessly hyping messy techs and frontend JS frameworks, oblivious to the Facebook and Google sized mechanics driving said frameworks, are now 100x-ing themselves with things like “specs” and “tests” and dreaming big about type systems and compilers we’ve had for decades.
I don’t wanna say this cycle is us watching Node jockeys discover systems programming in slow motion through LLMs, but it feels like that sometimes.
But I'm skeptical of building this as a separate platform rather than as tooling on top of git. The most useful AI dev workflow improvements I've seen (cursor rules, aider conventions, claude hooks) all succeeded precisely because they stayed close to existing tools. The moment you ask developers to switch their entire SDLC stack, adoption becomes the real engineering challenge - not the tech.
Curious whether the open source commitment means the checkpoint format itself will be an open spec that other tools can build on.
Just say what your thing does. Or, better yet, show it to me in under 60 seconds.
Web sites are the new banner ads and headings like that are the new `<blink>`.
It's been like this since the Dotcom era
Or did you forget that you can do anything at zombo.com?
It appears to be rather slow today, but here's a Wiki link for the uninitiated- https://en.wikipedia.org/wiki/Zombo.com
It's still around, but has been redesigned and it's under "new management". Further proof that the internet is dying.
Edit: Actually it may just be aimed at investors. Who cares about having a product?
The fact that the first image you see has "$60M seed" in big text, I have to agree, this does not feel aimed at devs.
It's almost like an extension of the "if you're not paying for the product, you are the product" idea. If you're assessing a tool like this and the marketing isn't even trying to communicate to you, the user, what the product does, aren't you also kind of "the product" in this case too?
Everything is AI at zombo.com.
Yes yes a Dropbox comment. But the problem here is 1 million people are doing the same thing. For this to be worth 60M seed I suspect they need to do something more than you can achieve by messing around locally.
"Claude build me a script in bash to implement a Ralph loop with a KV store tied to my git commits for agent memory."
This sounds like my current "phase" of AI coding. I have had so many project ideas for years that I can just spec out, everything I've thought about, all the little ideas and details, things I only had time to think about, never implement. I then feed it to Claude, and watch it meet my every specification, I can then test it, note any bugs, recompile and re-test. I can review the code, as you would a Junior you're mentoring, and have it rewrite it in a specific pattern.
Funnily enough, I love Beads, but did not like that it uses git hooks for the DB, and I can't tie tickets back to ticketing systems, so I've been building my own alternative; mine just syncs to and from GitHub issues. I think this is probably overkill for what's been a solved thing: ticketing systems.
And I use git hooks on the tool event to print the current open gate (subtask) from task.md so the agent never deviates from the plan; this is important if you use yolo mode. It might be an original technique, I've never heard of anyone using it. A sticky note in the tool response, printed by a hook, that highlights the current task and where the current task.md is located. I have seen stretches of 10 or 15 minutes of good work done this way with no user intervention. Like a "Markdown Turing Machine".
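For anyone wondering what that sticky note could look like in practice, here is a minimal sketch. It assumes task.md uses GitHub-style checkboxes ("- [ ]" open, "- [x]" done), which is a guess and not something the comment above specifies; the function names and the "[plan]" prefix are made up for illustration. How the hook gets wired into a given agent is left out, since that depends on the tool.

```python
import re

# Assumed task.md convention (not stated in the thread):
# "- [ ] text" is an open gate, "- [x] text" is a closed one.
CHECKBOX = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")


def current_gate(task_md_text):
    """Return the text of the first unchecked item, or None if all are done."""
    for line in task_md_text.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group(1) == " ":
            return match.group(2).strip()
    return None


def sticky_note(task_md_text, path="task.md"):
    """Build the reminder a hook could append to each tool response."""
    gate = current_gate(task_md_text)
    if gate is None:
        return f"[plan] All gates in {path} are closed."
    return f"[plan] Current gate: {gate} (see {path})"


if __name__ == "__main__":
    example = "- [x] Set up repo\n- [ ] Add login endpoint\n- [ ] Write tests\n"
    print(sticky_note(example))
```

The point of keeping it this small is that the hook only has to read one file and print one line, so it is cheap enough to run on every tool call.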
For me a gate is: a dependency that must pass before a task is closed. It could be human verification, unit testing, or even "can I curl this?" "can I build this?" and gates can be re-used, but every task MUST have one gate.
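A rough sketch of that gate rule as a data structure, in case it helps. The class and field names are hypothetical and not taken from either tool discussed here; the only things it encodes are the two rules from the comment above: a task must have at least one gate, and every gate must pass before the task closes.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Gate:
    """A reusable check, e.g. "unit tests pass" or "can I curl this?"."""
    name: str
    check: Callable[[], bool]  # returns True when the gate passes


@dataclass
class Task:
    title: str
    gates: List[Gate] = field(default_factory=list)
    closed: bool = False

    def close(self):
        """Close the task only if it has a gate and every gate passes."""
        if not self.gates:
            raise ValueError(f"task {self.title!r} has no gate")
        failing = [gate.name for gate in self.gates if not gate.check()]
        if failing:
            raise RuntimeError("gates not passed: " + ", ".join(failing))
        self.closed = True
```

Because a Gate is just a name plus a callable, the same gate object can be attached to several tasks, and a human-verification gate can be a callable that reads an approval flag.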
My issue with git hooks integration at that level is, and I know this sounds crazy, that not everyone is using git. I run into legacy projects, or maybe it's still greenfield as heck, and all you have is a POC zip file your manager emailed you for whatever awful reason. I like my tooling to be agnostic to models and external tooling so it can easily integrate everywhere.
Yours sounds pretty awesome for what it's worth, just not for me, wish you the best of luck.
https://github.com/Giancarlos/GuardRails
I'm confused how this is any different to the pretty standard agentic coding workflow?
Beads is a nightmare.
[0]: https://github.com/Priivacy-ai/spec-kitty
Gotta bully that thing man. There's probably room in the market for a local tool that strips the superfluous niceties from instructions. Probably gonna save a material amount of tokens in aggregate.
The AI fatigue is real, and the cooling-off period is going to hurt. We’re deep into concept overload now. Every week it’s another tool (don’t get me started on Gas Town) confidently claiming to solve… something. “Faster development”, apparently.
Unless you’re already ideologically committed to this space, I don’t see how the average engineer has the energy or motivation to even understand these tools, never mind meaningfully compare them. That’s before you factor in that many of them actively remove the parts of engineering people enjoy, while piling on yet another layer of abstraction, configuration, and cognitive load.
I’m so tired of being told we’re in yet another “paradigm shift”. Tools like Codex can be useful in small doses, but the moment it turns into a sprawling ecosystem of prompts, agents, workflows, and magical thinking, it stops feeling like leverage and starts feeling like self-inflicted complexity.
The “I’m so tired of being told we’re in another paradigm shift” comments are widely heard and upvoted on HN and are just so hard to comprehend today. They are not seeing the writing on the wall and following where the ball is going to be even in 6-12 months. We have scaling laws, multiple METR benchmarks, internal and external evals of a variety of flavors.
“Tools like codex can be useful in small doses” the best and most prestigious engineers I know inside and outside my company do not code virtually at all. I’m not one of them but I also do not code at all whatsoever. Agents are sufficiently powerful to justify and explain themselves and walk you through as much of the code as you want them to.
My issue is that we’ve now got a million secondary “paradigm shifts” layered on top: agent frameworks, orchestration patterns, prompt DSLs, eval harnesses, routing, memory, tool calling, “autonomous” workflows… all presented like you’re behind if you’re not constantly replatforming your brain.
Even if the end-state is “engineers code less”, the near-term reality for most engineers is still: deliver software, support customers, handle incidents, and now also become competent evaluators of rapidly changing bot stacks. That cognitive tax is brutal.
So yes, follow where the ball is going. I am. I’m just not pretending the current proliferation is anything other than noisy and expensive to keep up with.
This is why I use the copilot extension in VS code. They seem to just copy whatever useful thing climbs to the surface of the AI tool slop pile. Last week I loaded up and Opus 4.6 was there ready to use. Yesterday I found it has a new Claude tool built in which I used to do some refactoring... it worked fine. It's like having an AI tool curator.
I also keep getting job applications for AI-native 'developers' whatever that means.
So is this just a few context.md files that you tell the agent to update as you work and then push it when you are done???
EDIT: Or just keep a proper (technical) changelog.txt file in the repo. A lot of the "agentic/LLM engineering frameworks" boil down to best approaches and proper standards the industry should have been following decades ago.
I don't see the need for a full platform that is separate from where my code already lives. If I'm migrating away, it's to something like tangled, not another VC funded company
I wanted to more or less build Jira for agents and track the context there.
If I had to guess 60 million is just enough to build the POC out. I don't see how this can compete though, Open AI or Anthro could easily spin up a competitor internally.
The readme is a bit more to the point.
Commit hook > Background agent summarizes (in a data structure) the work that went into the commit > saves to a note
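That pipeline can be approximated with stock git. The sketch below is only an illustration under assumptions: the background agent's summarization step is stubbed out as a plain function that receives fields already known, and the notes ref name "agent-context" and all function names are invented here. It does rely on `git notes add`, which is a real git subcommand for attaching text to a commit without changing the commit itself.

```python
import json
import subprocess


def build_summary(prompt, files_changed, reasoning):
    """Pack the session context into a small JSON-serializable dict.

    Stand-in for the "background agent summarizes" step.
    """
    return {
        "prompt": prompt,
        "files_changed": sorted(files_changed),
        "reasoning": reasoning,
    }


def notes_command(summary, commit="HEAD", ref="agent-context"):
    """Return the argv that attaches the summary to a commit as a git note."""
    body = json.dumps(summary, indent=2, sort_keys=True)
    # -f overwrites an existing note on the same commit under this ref.
    return ["git", "notes", f"--ref={ref}", "add", "-f", "-m", body, commit]


def save_note(summary, commit="HEAD"):
    """Run the command; intended to be called from e.g. a post-commit hook."""
    subprocess.run(notes_command(summary, commit), check=True)
```

Reading it back would be `git notes --ref=agent-context show HEAD`. Keeping the data in a separate notes ref means the commit history and commit hashes stay untouched.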
Built similar (with a better name) a week ago at a hackathon: https://github.com/eqtylab/y
Context management is still an important human skill in working with an agent, and this makes it harder.
https://news.ycombinator.com/item?id=338286
I guess when you are Ex-Github CEO, it is that easy raising a $60M seed. I wonder what the record for a seed round is. This is crazy.
I tried a similar(-ish) thing last year at https://github.com/imjasonh/cnotes (a Claude hook to write conversations to git notes) but ended up not getting much out of it. Making it integrated into the experience would have helped, I had a chrome extension to display it in the GitHub UI but even then just stopped using it eventually.
https://github.com/eqtylab/y
I welcome more innovation in the code forge space, but if you’re looking for an OSS alternative just for tracking agent sessions with your commits you should check out agentblame
https://github.com/mesa-dot-dev/agentblame
It's not like $60m in funding was given as charity.
If you're approaching this problem-space from the ground up, there are just so many fundamental problems to solve that it seems to me that no amount of money or quality of team can increase your likelihood of arriving at enough right answers to ensure success. Pulling off something like this vision in the current red-ocean market would require dozens of brilliant ideas and hundreds of correct bets.
But seriously, $300M valuation for a CLI tool that adds some metadata to Git commits. I don't know what to say.
Tech marketing has become a lot like dating, no technical explanation and intellectual honesty, just word words words and unreasonable expectations.
People usually cannot be honest in their romantic affairs, and here it is the same. Nobody can state: we just want to be between you and whatever you want to accomplish, rent seeking forever!
Will they ever care to elaborate HOW things work and the rationale behind stating this provides any benefit whatsoever? Perhaps this is not intended for the type of humans that care about understanding and logic?
It's like complaining about the availability of the printing press because it proliferated tabloid production, while preferring beautifully hand-crafted tomes. It's reactively trendy to hate on it because of the vulgar production it enables and to elevate the artisanal extremes that escape its apparent influence
Surely if all software is augmented with agentic development now, our most important space probes have had their software augmented too, right?
What about my blog that I serve static pages on? What about the X-ray machine my dentist uses? What about the firmware in my toaster? Does the New York Stock Exchange use AI to action stock trades? What about my telescope's ASCOM driver?
Blog: I use AI to make and blog developers are using agentic tools
X-ray machine: again a little late here, plus if you want to start dragging in places that likely have a huge amount of bureaucracy I don’t know that that’s very fair
Firmware in your toaster: c’mon, these are old basic things, if it’s new firmware maybe? But probably not? These are not strong examples
NYSE to action on stock trades; no they don’t use AI to action on stock trades (that would be dumb and slow and horribly inefficient and non-deterministic), but may very well now be using AI to work on the codebase that does
Let’s try to find maybe more impactful examples than small embodied components in toasters and telescopes, 1970s era telescopes that are already past our solar system.
The denial runs deep
"Essentially all software is augmented with Stack Overflow now, or if not, built with technology or on platforms that are."
Agentic development isn't a panacea, nor as widespread as you claim. I'd wager that the vast majority of developers treat AI as a more specific search engine to point them in the direction they're looking for.
AI hallucination is still a massive problem. Can't tell you the number of times I've used agentic prompting with a top model that writes code for a package based on the wrong version number or flat out invents functionality that doesn't exist.
If I do it myself, I get the added bonus of actually understanding what the code is doing, which makes debugging any issues down the line way easier. It's also generally better for teams b/c you can ask the 'owner' of a part of the codebase what their intuition is on an issue (trying to have AI fill in for this purpose has been underwhelming for me so far).
Trying to maintain a vibecoded codebase essentially involves spelunking through an unfamiliar codebase every time manual action is needed to fix an issue (including reviewing/verifying the output of an AI tool's fix for the issue).
(For small/pinpointed things, it has been very good. e.g.: write a python script to comb through this CSV and print x details about it/turn this into a dashboard)
Opus 4.5 and 4.6 are where those instances have gone down, waaay down (though still true). Two personal projects I had abandoned after sonnet built a large pile of semi working cruft it couldn’t quite reason about, opus 4.6 does it in almost one shot.
You are right about learning, but consider: you can educate yourself along the way. In some cases it’s no substitute for writing the code yourself, and in many cases you learn a ton more because it’s an excellent teacher and you can try out ideas to see which work best or get feedback on them. I feel I have learned a TON about the space, though unlike when I code it myself I may not be extremely comfortable with the details. I would argue we are about 30% of the way to the point where it’s not just no longer relevant, it’s a disservice to your company to be writing things yourself.
I see zero reason for a person to care about the checkpoints.
And for agents, full sessions just needlessly fill context.
So not sure what is being solved by this.
His use of bombastic language in this announcement suggests that he has never personally worked on serious software. The deterioration of GitHub under his tenure is not confidence inspiring either, but that of course may have been dictated by Nadella.
If you are very generous, this is just another GitHub competitor dressed up in AI B.S. in order to get funding.
Oh, nevermind, it’s some MS dude.
As for SDLC, you can do some good automations if you're very opinionated, but people have diverse tastes in the way they want to work, so it becomes a market selection thing.
Productizing the building blocks of the platform seems like the smart play in today's environment honestly.
I am already overloaded with information (generated by AI and humans) in my day to day job, why do I need this additional context, unless the company I work for just wants to spend more money to store more slop?
How is it different than reversing it, given a PR -> generate prompt based on business context relevant to the repo or mentioned issues -> preserve it as part of PR description
I barely look at git commit history, why should I look for even higher cardinality data, in this case: WTF, are you doing, idiot, I said don't change the logic to make tests pass, I said properly write tests!
There is no Composer 2.0. There is Cursor 2.0 and Composer 1.5.
I couldn't find any references to Composer 2.0 anywhere. When did that come out?
- https://cursor.com/blog/composer-1-5
Do we have new words for smaller amounts or is this inflation at work?
1. Tom Preston-Werner (Co-founder). 2008 – 2014 (Out for, eh... look it up)
2. Chris Wanstrath (Co-founder). 2014 – 2018
(2018: Acquisition by Microsoft: https://news.ycombinator.com/item?id=17227286)
3. Nat Friedman (Gnome/Ximian/Microsoft). 2018 – 2021
4. Thomas Dohmke (Founder of HockeyApp, some A/B testing thing, acquired by Microsoft in 2014). 2021 - 2025
There is no GitHub CEO now, it's just a team/org in Microsoft. (https://mrshu.github.io/github-statuses/)
https://github.com/entireio