I would describe myself as “cautiously optimistic” about generative AI. LLMs are clearly capable of doing some useful things. Anyone who tries to argue otherwise at this point sounds a lot like they’re trying to argue that the sky isn’t blue or that the moon is made of cheese.
For a little background, I first heard about GPT-2 back when it came out, a few years before ChatGPT hit the world stage, when some of the smartest nerds I know were marveling at the fact that you could get it to play chess competently. This was a surprising result, because LLMs use statistical techniques over text to predict an output given an input. On paper, a model like that shouldn’t know what “chess” is, and it certainly shouldn’t be competent at the game.
Early ChatGPT didn’t really impress me, and it routinely got stuff wrong. My best example would violate confidentiality, so I’m not going to try very hard to convince you that old ChatGPT sucked; I think that should be obvious by now. Mostly I mention it to explain why I didn’t bother trying to use it or its competitors until much later.
Once Google announced Bard, I joined that bandwagon and made a few halfhearted attempts to get it to do something useful. It mostly couldn’t, which also shouldn’t be surprising. One example: I asked it to plan a Hawaiian vacation for me. I figured planning a vacation was the kind of thing GenAI could probably handle; there should be plenty of examples out there of vacations people enjoyed. It really, really wanted me to go see the Pearl Harbor memorial. (I did not.)
Okay, so that’s all well and good. Jump to today. I use ChatGPT at work and Claude at home, and for my use cases I think Claude is a lot better. But I wanted to talk a little about what I’m actually using them for. What do I think they’re good at?
ChatGPT has proven itself quite adept at taking my stream of consciousness and making something other people can understand from it. If I start typing out the details of an email I’d like to send (the important points, the audience, the tone), it will generate an email that’s perfectly cromulent. Same with a Slack message, and same with a Jira ticket: if I know basically what needs to happen and the order it needs to happen in, it saves me the trouble of formatting that into a proper set of acceptance criteria.
Claude, with its Artifacts feature, has helped me develop and then fill out templates for data that I want to live inside my knowledge management system. I asked Claude for a template for a System of Systems document, and together we refined it to the level of detail I wanted. Then I tossed it my list of systems, and it did a decent first pass at filling everything in.
The recurring theme is that GenAI saves me from typing but not from thinking. In the best case it will prompt me with questions I might not have considered, but the important connection-making and organizational work is still happening in my brain. Given that thinking, even delivered as an unstructured stream of consciousness, the LLM can then impose an order on it and produce an output that might be useful to someone else trying to understand what I thought. I don’t really expect this to change, and once the hype cycle properly dies down I think we’ll discover that this is roughly the limit of what an LLM can do. It’ll get better at doing it, but it can’t really have an “original” thought.
My question is: how much money would you pay for the functionality I’ve just described? Right now both ChatGPT and Claude have generous free tiers, but that’s because their investors expect them to eat the world at some point, and expect that, once eaten, GenAI will have become so essential that people won’t remember how to work without it. They’re building their captive audience today, and eventually that bill is going to come due.
I suspect the true cost of this service is a lot more than I’d be willing to pay to avoid typing an email, but time will tell. Maybe we’ll find a way to make “good enough” LLMs operate affordably. We’re still in the “a computer takes up an entire room and only universities have them” phase of this technology, and right now I have five different computers on my desk, all of which are more capable than a mainframe was in 1982.
I’m also concerned about the environmental impact, in the sense that we burn a truly fantastic amount of electricity to provide this service. But that concern is more general than LLMs: we burn fantastic amounts of electricity to do lots of things, and as a society we seem generally incapable of building real solutions to that problem.
Finally, I’m concerned that OpenAI, Anthropic, and every other GenAI company out there had to basically steal all of the world’s knowledge to build this thing. I mention this because I think it’s worth mentioning: copyright infringement is apparently okay when Microsoft does it, but if a consumer pirates a copy of Windows, suddenly they care? I’m not persuaded by arguments about how hard this problem would be to solve, or how much it would cost. If we wanted to, we would. But I definitely think it’s fair to factor this into your rubric for how seriously to respect the rights of those corporations in turn.
For now I continue to stand by my policy of no GenAI content on this blog; it would entirely defeat the purpose. I’m not trying to impress anyone. I’m barely even trying to convince anyone. These are my thoughts, filtered through my own artistic sense of the right words to convey them. 90% of the value is that I took the time to express them, and 10% is whatever value other people might derive from consuming them. But it would be a tactical error to keep avoiding GenAI indefinitely. If you’re not engaging with it yet, I’d strongly recommend thinking about where it might be able to help you; the answer is almost certainly not “nowhere.”