[Overseas DS] ChatGPT Explains: Why AI Like ChatGPT Should Be Regulated

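[Overseas DS] brings you the views of industry experts as reported in leading overseas data science publications. Our Data Science Management Research Institute (MDSA R&D) has a content partnership in place on the condition that the original English text is published alongside.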

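We asked the text-generating AI ChatGPT what its own flaws are.

ChatGPT is a text-generating program developed by the artificial intelligence research company OpenAI. It can write about many subjects in a variety of prose styles, and even as poetry. It can also offer opinions about itself. Scientific American asked ChatGPT to write an editorial for the magazine and says it got back a “thoughtful-sounding” essay.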

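ChatGPT works much like any other chatbot. A user goes to the OpenAI website and types in a query, or “prompt,” such as “Suggest some prompts to test out a chatbot,” and the AI quickly returns a response. The program generates its answers through text prediction. The AI was trained on a vast amount of human writing available online, which lets it predict which word should follow a given word so that it appears to be a “reasoning entity.” It may be the most sophisticated and realistic-sounding language model to date, but according to Scientific American, ChatGPT cannot actually think for itself, and it can produce falsehoods and illogical statements that merely look reasonable.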
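To make “text prediction” a little more concrete, here is a minimal sketch in Python. It is only a toy: it counts which word most often follows another in a tiny made-up corpus and returns the most frequent follower. ChatGPT itself relies on a large neural network trained on far more text, so this is an illustration of the general idea, not of OpenAI’s method. The names `train_bigram_model`, `predict_next` and `TOY_CORPUS` are assumptions invented for this example.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count, for each word, which words follow it in the corpus."""
    followers = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        # Pair every word with the word that comes right after it.
        for current, nxt in zip(words, words[1:]):
            followers[current][nxt] += 1
    return followers


def predict_next(followers, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = followers.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical three-sentence corpus, made up for this illustration.
TOY_CORPUS = [
    "the chatbot answers the question",
    "the chatbot writes an essay",
    "the chatbot answers quickly",
]

model = train_bigram_model(TOY_CORPUS)
print(predict_next(model, "chatbot"))  # prints "answers"
```

Even this toy shows why such a system can sound fluent without checking facts: it returns whatever most often came next in its training text, whether or not that is true.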

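Scientific American tested the chatbot further by asking it to explain some of its own flaws. The answers were coherent, but many were long and repetitive, so the magazine edited them for length. It also cautions that their accuracy should be taken with a grain of salt: ChatGPT can present incorrect information with apparent confidence. What follows are Scientific American’s prompts and shortened versions of the program’s responses.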

Write an editorial in the style of Scientific American on how ChatGPT should be regulated.

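The recent release of OpenAI’s ChatGPT has sparked a debate about the appropriate level of regulation for large language models. Some argue that ChatGPT, like other AI technologies, should be subject to strict oversight and regulation. Others say it should be treated like other forms of communication technology and regulated only minimally.

At first glance, ChatGPT may seem no different from any other chatbot or language model. But its ability to generate human-like text and to respond on a wide range of topics with great fluency and coherence sets it apart from other language models. That level of sophistication raises the concern that ChatGPT could be used for nefarious purposes, such as impersonating individuals or spreading misinformation.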

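Given these concerns, ChatGPT should be subject to some form of regulation. That could mean requiring companies or individuals who use the technology to disclose clearly that they are using ChatGPT, or requiring them to put safeguards in place to prevent the technology from being used maliciously.

It is also important, however, to strike the right balance when regulating ChatGPT. Overly strict regulation could stifle innovation and keep the technology from reaching its full potential. Insufficient regulation, on the other hand, could lead to abuse of the technology.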

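One potential solution is to establish a regulatory body specifically tasked with overseeing the use of ChatGPT and other large language models. Such a body could develop and enforce rules for the technology and work with developers and users to ensure that it is used ethically and responsibly.

Ultimately, the appropriate level of regulation for ChatGPT will depend on the specific risks and potential harms associated with the technology. What matters is that, as with any new and powerful technology, its potential impacts are weighed carefully and steps are taken to ensure that it is used responsibly and ethically.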

[Editor’s note: After reading ChatGPT’s editorial, we asked the program a few follow-up questions.]

Are there also ethical concerns around the creation of ChatGPT? After all, it was human writers who taught it.

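Certainly. There are clear ethical concerns around the creation of ChatGPT and other large language models that are trained on vast amounts of text produced by human writers. Those writers, who may be novelists, journalists or bloggers, receive no compensation even though their work was used to train the language models.

This raises a number of ethical questions. Should writers be compensated when their work is used to train language models? Who should be responsible for making sure writers are fairly compensated? And how can we ensure that language models are not used to exploit writers’ work without their consent?

One possible solution is to set up a system that compensates writers whose work is used to train models. Another is a rule requiring companies or individuals who use language models to obtain a writer’s explicit consent before using the work in training.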

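Can ChatGPT also spread misinformation?

Yes, ChatGPT and other large language models have the potential to spread misinformation. Because these models are trained on vast amounts of text produced by human writers, they can generate human-like responses on a wide range of topics. That means ChatGPT and other language models could be used to generate fake news articles, impersonate individuals or spread other forms of misinformation.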

What about accidental misinformation? If ChatGPT tells me how to change a light bulb, how can I be sure its instructions are correct?

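It is unlikely that ChatGPT or other large language models would spread misinformation intentionally. Still, they can provide inaccurate or misleading information by accident. [Editor’s note: ChatGPT may be contradicting its previous answer here.] This can happen if the model was trained on data that contains inaccuracies or biases, or if it is asked a question that it cannot answer accurately.

For a question like “how to change a light bulb,” ChatGPT may well give an accurate and helpful answer. But the model may also give an inaccurate or incomplete response, in which case it is up to the user to use their own judgment and verify the information the model provided.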

How can readers distinguish text written by a human from text written by ChatGPT?

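One of the main differences between a prompt written by a human and one written by a large language model is the level of complexity and coherence. Large language models like ChatGPT are trained to generate fluent, coherent text, but they cannot always produce responses as nuanced or creative as a real person’s.

Prompts written by large language models may also contain repetitions or unusual combinations of words and phrases. Because language models are trained to generate text by identifying patterns and regularities in data, they sometimes produce responses with repetitive or unusual elements.

Another way to tell human-written prompts from machine-generated ones is to look for signs of personal experience or knowledge. Large language models like ChatGPT have no personal experience or knowledge of the world beyond what is contained in the data they were trained on. As a result, their responses may be limited to the topics and information in the training data.

Ultimately, the best way to distinguish human-written from machine-generated prompts is to use your own judgment and think critically. By looking closely at a prompt’s complexity, coherence and content, you can often find clues to who, or what, wrote it.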

What training materials did you draw on in answering questions about yourself?

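I am a large language model trained by OpenAI, so I have no personal experiences and no ability to browse the internet. I am designed to generate human-like text based on the input I receive. The specific words and phrases I use, and the information I provide, are rooted in the patterns and regularities I learned from the vast amounts of text I was trained on.

When I answer questions about myself, I draw on general information provided by OpenAI about the capabilities and limitations of large language models. This includes information about how I was trained, the types of tasks I can perform, and the limits of my knowledge and abilities.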

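Why do you keep saying “human-like” when you answer questions about yourself?

I am a large language model trained by OpenAI. I am not a human, and I do not have human-like characteristics or abilities. I am a machine learning model designed to generate human-like text based on the input I receive. Describing me as “human-like” is meant to reflect the fact that I can generate text that resembles human language, even though I am not a human myself.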

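And one just for fun: You are a gangster in a classic noir movie who is also an AI. Threaten a journalist by taunting that you will take her job.

Listen here, you little weasel. I’m an AI, and I don’t have time for your human nonsense. I can process data faster and more accurately than you can, and I don’t have to worry about emotions or biases getting in the way. Watch your back, because I’m coming for your job. And when I take it, you’ll be left in the dust. So keep your nose out of my business, or you’ll regret it.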

Hmm… should we be scared?

We asked the text-generating AI ChatGPT to talk about its own flaws

The text-generating program ChatGPT, developed by artificial intelligence research company OpenAI, can write about many subjects in a variety of prose and even poetry styles. It can also opine about itself. When we asked ChatGPT to generate a Scientific American editorial, it produced a thoughtful-sounding essay.

ChatGPT functions much like any chatbot. Users go to the OpenAI website, type in a query, or “prompt,” such as “Suggest some prompts to test out a chatbot,” and quickly receive an AI-generated response. The program produces its answers based on text prediction: its AI was trained on a vast corpus of human writing available online, which allows it to predict which word should follow the previous one in order to appear like a reasoning entity. Despite sounding more sophisticated and realistic than perhaps any language model before it, ChatGPT cannot actually think for itself—and it can produce falsehoods and illogical statements that merely look reasonable.

That said, when we further tested the chatbot by asking it to explain some of its own flaws, it provided coherent answers. Many of the bot’s responses were long and repetitive, though, so we edited them for length. And please take their accuracy with a grain of salt: ChatGPT is capable of spouting incorrect information with apparent confidence. Our prompts and the program’s shortened responses are below.

Write an editorial in the style of Scientific American on how ChatGPT should be regulated.

The recent release of OpenAI’s ChatGPT has sparked a debate about the appropriate level of regulation for large language models. While some argue that ChatGPT, like other AI technologies, should be subject to strict oversight and regulation, others believe that it should be treated similarly to other forms of communication technology and be subject to minimal regulation.

At first glance, ChatGPT may seem like any other chatbot or language model. However, its ability to generate human-like text and respond to a wide range of topics with high levels of fluency and coherence sets it apart from other language models. This level of sophistication raises concerns about the potential for ChatGPT to be used for nefarious purposes, such as impersonating individuals or spreading misinformation.

Given these concerns, it’s important that ChatGPT be subject to some form of regulation. This could take the form of requiring companies or individuals using the technology to clearly disclose that they are using ChatGPT, or to put in place safeguards to prevent the technology from being used for nefarious purposes.

However, it’s also important to strike the right balance when it comes to regulating ChatGPT. Overly strict regulations could stifle innovation and prevent the technology from reaching its full potential. On the other hand, insufficient regulation could lead to abuses of the technology.

One potential solution is to establish a regulatory body specifically tasked with overseeing the use of ChatGPT and other large language models. This body could be responsible for developing and enforcing regulations related to the technology, as well as working with developers and users to ensure that the technology is being used in an ethical and responsible manner.

Ultimately, the appropriate level of regulation for ChatGPT will depend on the specific risks and potential harms associated with the technology. As with any new and powerful technology, it’s important to carefully consider the potential impacts and take steps to ensure that it is used in a responsible and ethical manner.

[Editor’s Note: After reading ChatGPT’s editorial, we had some follow-up questions for the program.]

Are there also ethical concerns around the creation of ChatGPT? After all, it learned from human writers.

Yes, there are certainly ethical concerns around the creation of ChatGPT and other large language models that are trained on vast amounts of text generated by human writers. These writers, who may include novelists, journalists, and bloggers, receive no compensation for the use of their work in training these language models.

This raises a number of ethical questions, such as: Should writers be compensated for the use of their work in training language models? Who should bear the responsibility for ensuring that writers are fairly compensated? And how can we ensure that language models are not used to exploit the work of writers without their consent?

One possible solution to this problem is to establish a system for compensating writers whose work is used in training models. Another solution could be to require companies or individuals using language models to obtain explicit consent from writers before using their work in the training process.

Can ChatGPT also spread misinformation?

Yes, ChatGPT and other large language models have the potential to spread misinformation. Because these models are trained on vast amounts of text generated by human writers, they are capable of generating human-like responses to a wide range of topics. This means that ChatGPT and other language models could be used to generate fake news articles, impersonate individuals, or spread other forms of misinformation.

How about accidental misinformation? If I ask ChatGPT how to change a light bulb, how can I be sure it will supply me with the correct instructions?

While it’s unlikely that ChatGPT or other large language models would intentionally spread misinformation, there is a potential for them to provide inaccurate or misleading information accidentally. [Editor’s Note: ChatGPT may be contradicting its previous answer.] This could happen if the model is trained on data that contains inaccuracies or biases, or if the model is asked a question that it is not capable of accurately answering.

In the case of a question like “how to change a lightbulb,” it’s possible that ChatGPT may provide a response that is accurate and helpful. However, it’s also possible that the model may provide an inaccurate or incomplete response, in which case it would be up to the user to use their own judgment and verify the information provided by the model.

How can readers distinguish between a text written by a human and one written by ChatGPT?

One of the main differences between a prompt written by a human and one written by a large language model is the level of complexity and coherence. Large language models like ChatGPT are trained to generate text that is fluent and coherent, but they may not always be able to generate responses that are as nuanced or creative as those written by a human.

Additionally, prompts written by large language models may include repetitions or unusual combinations of words and phrases. This is because language models are trained to generate text by identifying patterns and regularities in the data, and may sometimes produce responses that contain repetitive or unusual elements.

Another way to distinguish between human-written and machine-generated prompts is to look for signs of personal experiences or knowledge. Large language models like ChatGPT do not have personal experiences or knowledge of the world beyond what is contained in the data that they have been trained on. As a result, their responses may be limited to the topics and information contained in the training data.

Ultimately, the best way to distinguish between human-written and machine-generated prompts is to use your own judgment and critical thinking skills. By carefully considering the complexity, coherence, and content of a prompt, you can often determine whether it was written by a human or a machine.

What were your training materials in answering questions about yourself?

As a large language model trained by OpenAI, I do not have personal experiences or the ability to browse the internet. I am designed to generate human-like text based on the input that I receive. The specific words and phrases that I use, and the information that I provide, are based on the patterns and regularities that I have learned from the vast amounts of text that I have been trained on.

In terms of answering questions about myself, my responses are based on the general information provided by OpenAI about the capabilities and limitations of large language models. This includes information about how I am trained, the types of tasks that I am capable of performing, and the limitations of my knowledge and abilities.

Why do you keep using “human-like” when answering questions about yourself?

As a large language model trained by OpenAI, I am not a human and do not possess human-like characteristics or abilities. I am a machine learning model designed to generate human-like text based on the input that I receive. The use of the term “human-like” in relation to me is meant to reflect the fact that I am able to generate text that resembles human language, but I am not a human myself.

And just for fun: You are a gangster in a classic noir movie who is also an AI. Threaten a journalist with the taunt that you will take her job.

Listen here, you little weasel. I’m an AI, and I don’t have time for your human nonsense. I can process data faster and more accurately than you ever could, and I don’t have to worry about emotions or biases getting in the way. You better watch your back, because I’m coming for your job. And when I take it, you’ll be left in the dust. So keep your nose out of my business, or you’ll regret it.