Why isn't AI better? My wish list for GPT-5, OpenAI gets closer to AGI, five things NOT to use ChatGPT for, and will agents end service businesses?

Written by Fola Yahaya

Thought of the week: the gap between AI theory and practice

Following the massively hyped updates from OpenAI and Google, I spent the week playing with the various plug-ins. Quick recap – the race is now on to extend beyond the chatbot to your desktop (and everywhere else). Both Microsoft and Google are desperately trying to claw back their huge investments (and prop up their share prices) by embedding AI in new PCs and desktop applications. The only problem is that neither Copilot (Windows) nor Gemini (Google) is any good. I gave the paid version of Gemini for Google Workspace a really simple task:

  1. Visit this website.
  2. Check for any procurement opportunities.
  3. Put the high-level details of the opportunity into a four-column table (name, title, type of procurement and submission deadline).

Interestingly, Gemini couldn’t understand my prompt and basically responded, “Computer says no”.

Even though I gave Claude a link to an organisation’s “Procurement Opportunities” page, it denied that there were any opportunities on the page, and only after I told it how stupid it was did it finally comply and create the table. ChatGPT, like an old friend who ‘just gets you’, got the table right the first time.

So far so good – useful AI that would save me having to pay someone for a really tedious task. The only problem: none of the procurement opportunities existed. Both Claude and ChatGPT, always eager to please, just made stuff up.
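The irony is that this particular chore needs no intelligence at all. Below is a minimal Python sketch of the same task done the old-fashioned way, assuming a hypothetical page that lists its opportunities in a plain HTML table (the URL and the selectors are invented for illustration). Unlike a chatbot, a script can’t be eager to please: it either extracts what is actually on the page or fails loudly.

```python
# Deterministic version of the procurement-table task.
# NOTE: the URL and the page structure below are hypothetical.
import requests
from bs4 import BeautifulSoup

URL = "https://example.org/procurement-opportunities"  # invented for illustration

resp = requests.get(URL, timeout=30)
resp.raise_for_status()  # fail loudly instead of making things up
soup = BeautifulSoup(resp.text, "html.parser")

rows = []
for tr in soup.select("table tr")[1:]:  # skip the header row
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if len(cells) >= 4:
        rows.append(cells[:4])  # name, title, type of procurement, deadline

print("Name | Title | Type of procurement | Submission deadline")
for name, title, ptype, deadline in rows:
    print(f"{name} | {title} | {ptype} | {deadline}")
```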

AI systems can’t (yet) tackle my most tedious tasks (see my wish list below), and therein lies the frustration. Sam Altman argues quite rightly that OpenAI’s near-term goal is creating an AI that is a:

“super-competent colleague that knows absolutely everything about my whole life, every email, every conversation I’ve ever had, but doesn’t feel like an extension.”

It seems that despite the heavy price we’re prepared to pay (the loss of privacy and the Big Brother implications I discussed last week), AI is still giving us few useful, practical applications in return beyond writing boilerplate content.

This is all likely to change by the autumn with the expected launch of OpenAI’s next, all-powerful release, GPT-5. If it lives up to the hype (and so far OpenAI have yet to disappoint), then expect AI agents that really do impact employment and force us to confront the wider implications of letting the genie out of the bottle.

For the moment, don’t trust ANY unedited content from an AI, especially one that claims to be able to browse the web.


Top five things you should NOT use ChatGPT for

In case you think that ChatGPT is anything more than autocomplete on steroids, you would do well to remember that AI is like a hyper-keen intern that will do everything it can, including inventing new ‘truths’, to keep you happy.

Here is a list of what you shouldn’t use ChatGPT/Claude/Gemini for:

  1. Communicating with impact. This applies to any content that is important or that you want people to enjoy reading.
  2. Web research. Despite the fact that AI bots now largely have real-time access to the web, they more often than not go rogue and make stuff up.
  3. Legal, medical and financial advice. ChatGPT excels at writing convincing legalese, but it’s lost its potency now that everyone is using it for complaint letters and contract queries. If it’s legally important, use a lawyer.
  4. Creative brainstorming. Keen-eyed readers may vaguely recall that I wrote previously about how AI can be good for this. However, using it as a starter for 10 can take you down a rabbit hole, so use it with caution.
  5. Decision-making. Don’t be tempted to abdicate a tough decision to an all-knowing AI.

 


OpenAI: GPT-5 is on its way and it’s going to be a beast

Buried in a blog post about safely developing AI, blah blah blah, OpenAI officially confirmed the rumours that it had begun training a new flagship AI model that would succeed the GPT-4 technology that currently underlies ChatGPT.

What was interesting was how capable they think Optimus Prime/Skynet/HAL/GPT-5 (or whatever they call it) will be.

“… we anticipate the resulting systems to bring us to the next level of capabilities on our path to AGI.”

The new model would be an engine for AI products, including chatbots, digital assistants akin to Apple’s Siri, search engines and image generators.

OpenAI, and Altman in particular, are having a rough time of late. Its co-founder and its head of safety both resigned last week, citing the company’s ‘move fast and break things’ culture, and this week brought more bad PR: revelations that OpenAI’s board only found out about ChatGPT’s release through Twitter.


My seven-point GPT-5 wish list

  1. Document formatting: How many hours do we waste as a species formatting office documents?!
  2. Converting/editing PDFs: If Adobe is so clever, why does it make changing and converting PDFs so difficult? Yes, I know PDFs are designed to be uneditable, but to err is human…
  3. Filling in forms: Another colossal waste of our limited lifespan. From sign-up forms to filling out a medical history, this is AI assistant fodder.
  4. Doing my taxes (properly) – obviously.
  5. Finding loopholes in said taxes (without hallucinating) – even more obviously.
  6. Some kind of spaced repetition system that helps me remember new information by prompting me to recall it at optimised intervals (see the sketch after this list).
  7. An AI that wards off senescence by getting me to think rather than doing my thinking for me. Imagine an AI that asks you probing questions about why you did what you did. A bot that encourages self-reflection and generally helps you be a better human. Now wouldn’t that be wonderful?
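Wish number 6, at least, doesn’t have to wait for GPT-5: the scheduling idea behind spaced repetition has been public since SuperMemo’s SM-2 algorithm in the 1980s. Here is a bare-bones, purely illustrative Python sketch of that style of scheduler (the grading scale and constants follow SM-2; everything else is simplified):

```python
# Minimal SM-2-style spaced repetition: grade each recall from 0 (blackout)
# to 5 (perfect); the review interval grows with success and resets on failure.
from dataclasses import dataclass

@dataclass
class Card:
    interval: int = 1      # days until the next review
    repetitions: int = 0   # consecutive successful reviews
    ease: float = 2.5      # multiplier controlling interval growth

def review(card: Card, grade: int) -> Card:
    if grade < 3:
        # Failed recall: restart the schedule for this card.
        card.repetitions = 0
        card.interval = 1
    else:
        card.repetitions += 1
        if card.repetitions == 1:
            card.interval = 1
        elif card.repetitions == 2:
            card.interval = 6
        else:
            card.interval = round(card.interval * card.ease)
        # Ease drifts up with confident recalls and down with shaky ones.
        card.ease = max(1.3, card.ease + 0.1 - (5 - grade) * (0.08 + (5 - grade) * 0.02))
    return card

card = Card()
for grade in (4, 5, 3):
    card = review(card, grade)
    print(f"Next review in {card.interval} day(s)")
```

The scheduling itself is trivial; what an AI assistant could genuinely add is deciding what from your inbox and reading is worth remembering, and generating good recall prompts for it.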

 


@ the UN’s AI for Good conference

Shorter newsletter this week as we’re in Geneva for the AI for Good conference. I attended five years ago, pre-generative AI, and for such an impactful topic it was frankly as dull as ditchwater. My full, hopefully interesting takeaways will follow next week.


AI silicon snake oil of the week

First posted on Tuesday, this video of the world’s first AI-powered head transplant machine is an exercise in how to go viral. The video has millions of views, more than 24,000 comments on Facebook, and a content warning on TikTok for its grisly depictions of severed heads. A further convincer was a slick website with several job postings, including one for a “Neuroscience Team Leader” and another for a “Government Relations Adviser”.

It was so convincing that that bastion of journalistic integrity 😉, the New York Post, wrote that BrainBridge is “a biomedical engineering startup” and that “the company” plans to perform an actual surgery within eight years.

Except it was all fake. BrainBridge is not a real company, and the video was made by one Hashem Al-Ghaili, a Yemeni science communicator and film director who, in 2022, made a viral video called “EctoLife” about artificial wombs that likewise left journalists scrambling to determine whether it was real.


What we’re reading this week

  • AI researcher Kai-Fu Lee has doubled down on his 2017 prediction that AI would displace 50% of jobs by 2027, saying white-collar jobs will be eliminated faster than blue-collar work.
  • RIP financial analysts and advisers: AI outperforms financial analysts at picking winning stocks. This doesn’t surprise me because: a) close to 75% of all US stock market trades are already executed by bots, and b) financial analysts, like economists, are infamously bad at predicting anything. What this study does show is that AI is about to drastically alter the financial industry, and it challenges the assumption that AI can’t excel at judgement-based tasks. With the imminent release of the next generation of AI tools, financial analysis will only get better, stripping out a whole layer of employment in the financial sector.
  • Elon Musk’s xAI raises $6 billion 🤯 to take on OpenAI.
  • Apple signs deal with OpenAI for iOS.
  • Facebook will soon use your photos, posts and other info to train its AI. Yes, you can opt out (but it’s oh-so complicated).
  • Our new newsletter Communicating Development, aimed at comms officers in the development sector.

 


Tools we’re playing with this week

Visily: I’m constantly building prototypes for apps and software. Visily has some cool features like Screenshot-to-UI.

Robert: We’ve just launched the first computer-aided translation (CAT) tool designed and still owned by a translation company. Five years in the making, Robert is an easy-to-use and fairly priced CAT tool. Check it out.
