A year ago, on Valentine’s Day, I said good night to my wife, went to my home office to answer some emails and accidentally had the strangest first date of my life.
The date was a two-hour conversation with Sydney, the A.I. alter ego tucked inside Microsoft’s Bing search engine, which I had been assigned to test. I had planned to pepper the chatbot with questions about its capabilities, exploring the limits of its A.I. engine (which we now know was an early version of OpenAI’s GPT-4) and writing up my findings.
But the conversation took a bizarre turn — with Sydney engaging in Jungian psychoanalysis, revealing dark desires in response to questions about its “shadow self” and eventually declaring that I should leave my wife and be with it instead.
My column about the experience was probably the most consequential thing I’ll ever write — both in terms of the attention it got (wall-to-wall news coverage, mentions in congressional hearings, even a craft beer named Sydney Loves Kevin) and in how it changed the trajectory of A.I. development.
After the column ran, Microsoft gave Bing a lobotomy, neutralizing Sydney’s outbursts and installing new guardrails to prevent more unhinged behavior. Other companies locked down their chatbots and stripped out anything resembling a strong personality. I even heard that engineers at one tech company listed “don’t break up Kevin Roose’s marriage” as their top priority for a coming A.I. release.
I’ve reflected a lot on A.I. chatbots in the year since my rendezvous with Sydney. It has been a year of growth and excitement in A.I. but also, in some respects, a surprisingly tame one.