ChatGPT: Prose Over Poetry
Being the first out of the gate gives ChatGPT more name recognition than its competitors. In the same way that people use the brand name "Kleenex" to refer to a tissue, ChatGPT has become the catch-all reference to AI for a lot of people. Because it’s got the longest history, it also has more variations, options, and the confusion that comes with more variations and options.
ChatGPT is the customer-facing product that OpenAI offers to interface with its own LLM. Within ChatGPT, you can choose from among OpenAI's family of models—we're going to focus on the latest, GPT-4o, which also happens to be the one you can access with Type.
Pro-tip: If you have ChatGPT selected in Type and you're in Power Mode, you're using GPT-4o. If Power Mode is disabled, it's GPT-4o mini. The difference between the two is a trade-off between speed and power: the mini version is faster and requires less power, but may not perform as well on more complex prompts.
TL;DR:
ChatGPT's Strengths
Complex Requests and Reasoning
ChatGPT became the first hot AI product because it was the first one to sufficiently blow our collective minds.
We'd all dealt with automated chatbots by the time of ChatGPT's release, so we were used to their limitations: you had to select from a variety of existing inputs to get a pre-scripted answer. Failing that, you could type something and the bot would latch onto the first keyword it recognized, ignore context, and give you a pre-scripted answer tied to that keyword.
ChatGPT, on the other hand, is great at complex reasoning and multi-step instructions. "Who wrote the first book about AI taking over the world, and is there any prediction in that book that came true?” It’s a question that requires a couple of layers of abstract and relational thought, which ChatGPT had no problem with:
You might say that this technically isn't the answer because it's not about an AI dystopia—GPT even admits as much in its response—but that's what’s so impressive about it. ChatGPT gets to the real intent of the question, which is when did we start thinking and worrying about this stuff? The question really isn't about AI, in other words, it's about the relationship humans have with it. This is what we mean by "complex reasoning."
Look at the response to the same question from Claude:
Claude goes for the literal answer: a book that specifically mentions AI taking over. The answer is correct, but maybe not as illustrative or helpful. Notice how Claude's description merely tells you what the book is about, compared to ChatGPT's discussion of the book's themes.
Reliably Accurate
Beyond these complex reasoning abilities, ChatGPT is also generally more reliable with factual information. When AI makes something up, it's called "hallucinating," and ChatGPT is less likely to just make stuff up. Perhaps that's because it's also fluent in more than 100 languages: science says learning many languages sharpens the human brain, so maybe it benefits an AI brain, too.
In any case, GPT-4 Turbo has one of the lowest hallucination rates, at around 1.7%. (Note: this statistic is current as of this writing. The page that link takes you to updates frequently throughout the day, and the number may be a bit higher or lower than what we've quoted here.)
GPT's Weaknesses
Robotic Writing
Have you ever curled up on a rainy Sunday morning with a cup of hot coffee and got lost in the lyrical prose of the owner's manual for your blender? Neither have we.
If you're looking for anything beyond dry, academic language, you'll need to go elsewhere—GPT is a fount of information but not really great at making it interesting. If you've ever asked a 10-year-old about dinosaurs, it's the same vibe. Of course, writers using AI as a tool aren't actually handing off the copy they get without fact-checking and significantly editing it first (you aren't doing that, right?), so this might not be a huge concern. But an LLM with a more engaging way with words would be more helpful when you need inspiration.
ChatGPT is your assistant for research and first drafts, not the muse for your soul-baring memoir.
Small Context Window
Another drawback to ChatGPT is its relatively small context window. The context window is the amount of text an AI model can handle and remember at one time. It determines how much information the model can keep track of before it starts forgetting earlier parts of a conversation or text. Developers refer to context window size in terms of "tokens," but we're going to express it in more practical terms for lay people: the number of words the AI can "remember" at any time.
ChatGPT's latest model, GPT-4o, has a larger context window than its predecessors, but it's still small compared to its competitors': roughly 96,000 words. You probably won't have a single conversation with ChatGPT that approaches that length, so no worries there. For research or summarization purposes, on the other hand, ChatGPT will likely lose the plot if you feed it text any longer than 300 pages or so. Sorry, David Foster Wallace fans, but at 575,000 words your Infinite Jest summary will have to wait.
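If you want to run the word-count math yourself, here's a minimal Python sketch. It uses the common rule of thumb that one token is roughly three-quarters of an English word; the token limits below are the figures commonly cited at the time of writing and may well change, so treat both as assumptions rather than gospel.

```python
# Rule of thumb (per OpenAI's guidance): 1 token is about 0.75 English words.
WORDS_PER_TOKEN = 0.75

# Commonly cited context limits, in tokens, at the time of writing.
# These are assumptions and subject to change as models are updated.
CONTEXT_LIMITS_TOKENS = {
    "gpt-4o": 128_000,             # roughly 96,000 words
    "claude-3.5-sonnet": 200_000,  # roughly 150,000 words
    "gemini-1.5-pro": 1_000_000,   # roughly 750,000 words
}

def tokens_to_words(tokens: int) -> int:
    """Estimate how many English words fit in a given token budget."""
    return int(tokens * WORDS_PER_TOKEN)

def fits_in_context(word_count: int, model: str) -> bool:
    """Check whether a text of `word_count` words fits a model's window."""
    return word_count <= tokens_to_words(CONTEXT_LIMITS_TOKENS[model])

# Infinite Jest runs roughly 575,000 words:
print(fits_in_context(575_000, "gpt-4o"))          # False
print(fits_in_context(575_000, "gemini-1.5-pro"))  # True
```

The same back-of-the-envelope conversion is where the word counts quoted throughout this post come from: 128,000 tokens × 0.75 gives you the 96,000-word figure for GPT-4o.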
Claude: A More Expressive ChatGPT Alternative
Claude is a chat product that was designed almost as a response to ChatGPT's drawbacks. It was created by a company called Anthropic and runs on their own proprietary LLM, which is also called Claude.
Like ChatGPT, there are a number of different models Claude can work with (like Claude 3 Opus, or Claude 3.5 Sonnet). Just as we did in the previous section, we’ll use the term “Claude” to refer to the entire family of models and the interface.
Anthropic's mission in developing Claude was to create an AI model that prioritizes humanity and ethics above all. The company wanted to make AI more predictable, building in better safety measures so it acts more in line with what users want. By focusing on the ethical side of AI, Anthropic set out to build a tool that is not only as capable as the others but also more trustworthy, safer, and more aligned with the human experience of conversation.
Where ChatGPT has high IQ, Claude has high EQ.
TL;DR:
Claude's Strengths
Large Context Window
The biggest technical difference between Claude and ChatGPT is the context window: Claude has consistently maintained a larger context window than ChatGPT, meaning it can handle larger jobs of ingesting and summarizing texts—up to about 150,000 words. Claude can carry on a coherent conversation with you for...well, however long it takes for the two of you to speak 150k words. The bigger context window also gives Claude the ability to ingest and summarize DFW's 140,000-word The Broom of the System—but analyzing Infinite Jest in full is still a pipe dream.
Expressive, Natural Language
What makes Claude stand out, and why it's become a favorite for writers, is how much more expressive it is than ChatGPT. It's probably "read" those two books mentioned above, in addition to mountains of other literature. The answers it gives read a lot more like a human's, giving concise responses that answer the question without droning on, adding unnecessary context, or incessantly repeating jargon-y or corny words and phrases.
To show you what we mean, we fed all the text of this post—minus the conclusion—to Claude and ChatGPT and asked them to write the final word. It’s a stark contrast.
This facility with language also means Claude can, with direction, write in a variety of styles: conversational, casual, professional, even humorous, all largely successfully.
You can see that in the example above, where both AI assistants tried to close on a joke. Claude’s is pretty good and includes a callback to an earlier joke in the post. ChatGPT tried, so A for effort, but really all it did was reuse a comical phrase without adding to it, and so it just comes off as that Steve Buscemi meme.
Claude also includes many more ethical guardrails than ChatGPT or Gemini, as part of Anthropic's mission to ensure Claude's output aligns with user values and doesn't provide harmful answers. This is why Claude is better suited to tasks that are more focused on the craft of writing: it's great at helping you polish up your own dry prose, and it makes a good colleague to bounce ideas off of and get feedback from. It even has a knack for creative exercises, like dialogue and fiction. As you'll see, though, writing fiction is something Claude does even when you might not want it to.
Claude's Weaknesses
Regurgitation over Reasoning
As you saw in the ChatGPT section above, Claude doesn't reason quite as well, making it a less attractive solution if you're writing something that would benefit from that skill set. A product review, for example, wouldn't be ideal for Claude. It can easily find all the features, and then regurgitate the information back to you, but ChatGPT is better suited to suss out and then explain what the benefits or drawbacks of those features are.
High Rate of Hallucination
Still, Claude's biggest weakness is its tendency to hallucinate. Its latest model, 3.5 (aka Sonnet), outperforms its model 3 predecessor (Opus), but it still has a pretty high making-stuff-up rate of 8.7%. That's just too high a rate to deal with—you'll spend more time fact-checking than you would doing the research yourself—so Claude is really best for more creative endeavors.
And while it can understand and speak around 50 languages, most of its training was in English, so you'll get your best results in that language.
Gemini: You Just Never Know What to Expect
For the first time in decades, Google finds itself an outlier in a tech race. That probably won't last long, as its own engine is growing in popularity and use. There's no option to use Gemini within Type, but Google has recently integrated a Gemini-based assistant into Google Docs.
Still, more people will be relying on its frontend chat for research and writing assistance as Google pushes its new tech, so it's a good idea to examine what it offers.
TL;DR:
Gemini's Strengths
Multimodal Input for Audiovisual Prompts
Gemini's biggest strength is its ability to process and understand more than just text. Feed it an image, audio recording, or a video clip; it can "see" and "hear" the contents before summarizing or taking questions on them. In a world where video content is increasingly prevalent online, this is a major reason to keep an eye out for Gemini-powered apps/assistants. We can see how this would be especially helpful for businesses wanting to record and summarize meetings, or making educational videos more accessible by transcribing them.
Largest Context Window
Another edge Gemini has is its context window size. At around 750,000 words, it beats out Claude in how much information it can handle, and that's just what the public has access to. Infinite Jest can finally be read in one sitting!
Behind the scenes, Google has been working to enlarge Gemini's window—third-party developers now have access to the latest version, which doubles the context window to 2 million tokens, or roughly 1.5 million words. That's a lot of words, but this is likely more for video and audio purposes, both of which pack a lot more data around every word that's spoken. Gemini's 2 million tokens give it the ability to process about 2 hours of video and 22 hours of audio—and this is very likely the direction that Google is headed with its AI.
Gemini's Weaknesses
Factually Unreliable
Without a doubt, Gemini is the weakest of the three when it comes to factual accuracy. Gemini 1.5 has a hallucination rate of 9.1%, but this isn't even the worst of its offenses. The internet had a good laugh at Google's expense when the company pushed Gemini out to the search engine and allowed it to answer search queries.
In the first few months of Gemini's release, the way it sourced information was the subject of scrutiny, as the app didn't seem to distinguish between reputable and non-reputable sources. It made no distinction between, say, the Washington Post and a user comment on Reddit or a conspiracy theory website.
Google's search ended up doling out answers that ranged from the absurd to the irresponsible: in fact, geologists did not suggest eating a rock a day, and gasoline is definitely not an ingredient you'd want to cook with. Six months on, the system has definitely improved, but that high hallucination rate does nothing to allay fears of misinformation.
It didn't help that just two months ago, the whole engine failed twice during a pre-scripted demo at the Google Pixel event.
None of this is said to denigrate Gemini: it's the newest product of the bunch, and bugs are to be expected. Plus, it's Google, and that name alone will keep people interested. For right now, though, your best use of Gemini is to quickly get typed summaries of audio and video files, which is a huge help for researchers.
The Last Word
So in a situation where it's Claude vs. ChatGPT vs. Gemini, which do you choose? When you're writing in Type, your options are Claude or ChatGPT, and the answer is: it all depends on what you're doing. But because you can switch back and forth, you can choose your model with sentence-by-sentence granularity.
For longer passages that require superior abstract reasoning, ChatGPT is the go-to. If it comes back sounding like a marketing brochure full of jargon, you can switch over to Claude and ask it to rewrite it for lay people. And while you can't write on Type using Gemini, it's a great resource if you're using audio or video as your source.
And no matter which one you choose for whatever task, remember the three most important rules of writing with AI:
- Fact Check
- Fact Check
- Fact Check
And don't try to pass off any drafts or revisions that have been purely generated by AI. Your voice needs to be front and center.
You are the writer, AI is your assistant.