[Overseas DS] Google’s Answer to ChatGPT: Bard

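[Overseas DS] brings you the views of industry experts as reported by leading data science publications abroad. Our Data Science Management Research Institute (MDSA R&D) runs this series under a content partnership on the condition that the original English text be published alongside it.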

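Google does not seem ready to hand over the search throne without a fight, whether to Microsoft or to anyone else. According to WIRED, Google announced yesterday (the 6th) that it will launch a chatbot named Bard “in the coming weeks.” The move appears to take aim at ChatGPT, the wildly popular artificial intelligence chatbot built by OpenAI, a startup funded by Microsoft.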

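Google CEO Sundar Pichai wrote in a blog post that Bard is already available to “trusted testers” and is designed to put the “breadth of the world’s knowledge” behind a conversational interface. Bard uses a smaller version of LaMDA, an AI model that Google first announced in May 2021. (LaMDA is based on technology similar to that behind ChatGPT.) Google says the smaller model will let it offer the chatbot to more users and gather feedback that helps address problems with the quality and accuracy of the chatbot’s responses.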

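The text generation software on which Google and OpenAI are building their chatbots is eloquent but prone to fabrication, and it can reproduce unsavory styles of speech picked up online. Optimists argue that these chatbots could reinvent web search and serve as the foundation for powerful, lucrative new products. But the need to mitigate these inherent flaws, along with the difficulty of updating the software with new information, is certain to be an obstacle.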

Notably, Pichai did not announce plans to integrate Bard into the “search box” that drives much of Google’s profits. Instead, he presented a novel and cautious way of using the AI technology underlying Bard to enhance conventional search.

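For questions with no single agreed-on answer, Google will now synthesize a response that reflects differing opinions. A search for “Is it easier to learn the piano or the guitar?” would, for example, return: “Some say the piano is easier to learn, as the finger and hand movements are more natural … Others say that it’s easier to learn chords on the guitar.”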

Pichai also said Google plans to make the technology underlying Bard available to developers through an API, as OpenAI is doing with ChatGPT. He did not, however, offer a specific timeline.

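The shockwaves from ChatGPT have been enormous, to the point of prompting talk that Google’s dominance of web search is under threat for the first time in years. Microsoft, which recently invested around $10 billion in OpenAI, is set to hold a media event today (the 7th) related to its work with ChatGPT’s creator. The event is expected to unveil new features for Bing, the company’s search engine, which ranks second in the market behind Google. OpenAI CEO Sam Altman tweeted a photo of himself with Microsoft CEO Satya Nadella shortly after Google’s announcement.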

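Quietly launched by OpenAI last November, ChatGPT has since become a bona fide internet sensation. Having seen its ability to answer even complex questions with apparent coherence and clarity, many people are dreaming of a coming “revolution” in education, business, and daily life. But some AI experts urge caution, noting that the tool does not actually understand the information it serves up and is inherently prone to making things up.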

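The situation may be especially vexing for some of Google’s AI experts, because some of the technology behind ChatGPT was developed by Google researchers, a fact Pichai alluded to in Google’s blog post. “We re-oriented the company around AI six years ago,” Pichai wrote. “Since then we’ve continued to make investments in AI across the board.” He went on to name both Google’s AI research division and the work at DeepMind, the UK-based AI startup that Google acquired in 2014.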

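ChatGPT is built on top of GPT. GPT is a “transformer,” a type of AI model first invented at Google that takes a string of text and predicts what comes next. OpenAI rose to prominence by publicly demonstrating that feeding huge amounts of data into transformer models and scaling up the computing power that runs them can produce systems adept at generating language or imagery. ChatGPT goes a step beyond GPT: humans give feedback on different answers produced by the program, and that feedback is passed to another AI model that fine-tunes the output.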

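Google had, by its own admission, taken a cautious stance on applying the technology behind LaMDA to actual products, judging it premature. Besides producing incorrect information, AI models trained on text “scraped” from the web also tend to exhibit racial and gender biases and to repeat hateful language.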

Several Google researchers had already highlighted these limitations in a draft research paper in 2020. They argued for caution with text generation technology, which apparently irked some executives: not long afterward, Google fired Timnit Gebru and Margaret Mitchell, two prominent ethical AI researchers.

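Meanwhile, other Google researchers who had worked on the technology behind LaMDA grew frustrated by Google’s “hesitancy” and left the company to build startups that harness the same technology. In the end, the arrival of ChatGPT appears to have spurred Google to accelerate its work on bringing text generation capabilities into its products.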

GOOGLE ISN’T ABOUT to let Microsoft or anyone else make a swipe for its search crown without a fight. The company announced today that it will roll out a chatbot named Bard “in the coming weeks.” The launch appears to be a response to ChatGPT, the sensationally popular artificial intelligence chatbot developed by startup OpenAI with funding from Microsoft.

Sundar Pichai, Google’s CEO, wrote in a blog post that Bard is already available to “trusted testers” and designed to put the “breadth of the world’s knowledge” behind a conversational interface. It uses a smaller version of a powerful AI model called LaMDA, which Google first announced in May 2021 and which is based on technology similar to that behind ChatGPT. Google says this will allow it to offer the chatbot to more users and gather feedback to help address challenges around the quality and accuracy of the chatbot’s responses.

Google and OpenAI are both building their bots on text generation software that, while eloquent, is prone to fabrication and can replicate unsavory styles of speech picked up online. The need to mitigate those flaws, and the fact that this type of software cannot easily be updated with new information, pose a challenge for hopes of building powerful and lucrative new products on top of the technology, including the suggestion that chatbots could reinvent web search.

Notably, Pichai did not announce plans to integrate Bard into the search box that powers Google’s profits. Instead he showcased a novel, and cautious, use of the underlying AI technology to enhance conventional search. For questions for which there is no single agreed-on answer, Google will synthesize a response that reflects the differing opinions.

For example, the query “Is it easier to learn the piano or the guitar?” would be met with “Some say the piano is easier to learn, as the finger and hand movements are more natural … Others say that it’s easier to learn chords on the guitar.” Pichai also said that Google plans to make the underlying technology available to developers through an API, as OpenAI is doing with ChatGPT, but did not offer a timeline.

The heady excitement inspired by ChatGPT has led to speculation that Google faces a serious challenge to the dominance of its web search for the first time in years. Microsoft, which recently invested around $10 billion in OpenAI, is holding a media event tomorrow related to its work with ChatGPT’s creator that is believed to relate to new features for the company’s second-place search engine, Bing. OpenAI’s CEO Sam Altman tweeted a photo of himself with Microsoft CEO Satya Nadella shortly after Google’s announcement.

Quietly launched by OpenAI last November, ChatGPT has grown into an internet sensation. Its ability to answer complex questions with apparent coherence and clarity has many users dreaming of a revolution in education, business, and daily life. But some AI experts advise caution, noting that the tool does not understand the information it serves up and is inherently prone to making things up.

The situation may be particularly vexing to some of Google’s AI experts, because the company’s researchers developed some of the technology behind ChatGPT—a fact that Pichai alluded to in Google’s blog post. “We re-oriented the company around AI six years ago,” Pichai wrote. “Since then we’ve continued to make investments in AI across the board.” He name-checked both Google’s AI research division and work at DeepMind, the UK-based AI startup that Google acquired in 2014.

ChatGPT is built on top of GPT, an AI model known as a transformer, first invented at Google, that takes a string of text and predicts what comes next. OpenAI has gained prominence for publicly demonstrating how feeding huge amounts of data into transformer models and ramping up the computer power running them can produce systems adept at generating language or imagery. ChatGPT improves on GPT by having humans provide feedback on different answers and passing that feedback to another AI model that fine-tunes the output.
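
To make the idea of taking a string of text and predicting what comes next more concrete, here is a minimal Python sketch. It is not a transformer, and it does not reflect how GPT or LaMDA work internally: it is a toy word-pair (bigram) counter that only illustrates the basic contract of reading text and guessing the next word. The function names `train_bigram` and `predict_next` and the three-sentence corpus are made up for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Optional


def train_bigram(text: str) -> Dict[str, Counter]:
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    follows: Dict[str, Counter] = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows: Dict[str, Counter], prompt: str) -> Optional[str]:
    """Return the most frequent follower of the prompt's last word, if any."""
    words = prompt.lower().split()
    if not words or words[-1] not in follows:
        return None  # empty prompt, or a last word never seen in training
    return follows[words[-1]].most_common(1)[0][0]


# Hypothetical toy corpus; a real model is trained on vastly more text.
corpus = (
    "the piano is easier to learn . "
    "the guitar is easier to carry . "
    "the piano is louder ."
)
model = train_bigram(corpus)

print(predict_next(model, "Some say the"))   # piano
print(predict_next(model, "the piano is"))   # easier
```

A model like this can only echo word pairs it has already seen, and it has no notion of whether a continuation is true, which loosely mirrors the caution experts raise about far larger text generators.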

Google has, by its own admission, chosen to proceed cautiously when it comes to adding the technology behind LaMDA to products. Besides hallucinating incorrect information, AI models trained on text scraped from the Web are prone to exhibiting racial and gender biases and repeating hateful language.

Those limitations were highlighted by Google researchers in a 2020 draft research paper arguing for caution with text generation technology that irked some executives and led to the company firing two prominent ethical AI researchers, Timnit Gebru and Margaret Mitchell.

Other Google researchers who worked on the technology behind LaMDA became frustrated by Google’s hesitancy, and left the company to build startups harnessing the same technology. The advent of ChatGPT appears to have inspired the company to accelerate its timeline for pushing text generation capabilities into its products.