
Chatbots: a Revolution or a Fallacy  


Science & Technology, Singapore (Commonwealth Union) – Computer scientists at Nanyang Technological University, Singapore (NTU Singapore) have successfully compromised several artificial intelligence (AI) chatbots, including ChatGPT, Google Bard, and Microsoft Bing Chat, resulting in content that violates their developers’ guidelines—an event referred to as “jailbreaking.” 

In the realm of computer security, “jailbreaking” denotes the process where hackers identify and exploit vulnerabilities in a system’s software, enabling it to perform actions deliberately restricted by its developers. 

Furthermore, the researchers employed a large language model (LLM) trained on a dataset of prompts previously proven effective in hacking these chatbots. This led to the creation of an LLM chatbot capable of autonomously generating additional prompts to jailbreak other chatbots. 

The researchers pointed out that LLMs serve as the cognitive foundation of AI chatbots, enabling them to comprehend human input and produce text nearly indistinguishable from that crafted by a human. These tasks range from planning a trip and narrating bedtime stories to writing code.

With their work, the NTU researchers have added jailbreaking to the list of challenges facing LLM chatbots. Their findings are crucial in alerting companies and businesses to the vulnerabilities and limitations of their LLM chatbots, so that they can take steps to harden these systems against potential hackers.

After a series of proof-of-concept tests confirmed that their method poses a genuine threat to LLMs, the researchers promptly reported the successful jailbreak attacks to the relevant service providers.

Professor Liu Yang of NTU’s School of Computer Science and Engineering, who led the study, highlighted the rapid proliferation of LLMs, driven by their outstanding ability to comprehend, generate, and complete human-like text. LLM chatbots in particular have gained immense popularity in everyday applications.

As research and development in the field of artificial intelligence continue to advance, the evolution of large language models is inevitable. Future iterations may address current challenges, such as reducing biases and improving interpretability. The integration of LLMs with other AI technologies is poised to create synergies that unlock new possibilities in problem-solving, decision-making, and human-machine collaboration. 

When hackers uncover vulnerabilities, AI chatbot developers respond by “patching” the issues, in a perpetual cat-and-mouse game. In this cycle, Masterkey, the AI jailbreaking chatbot developed by the NTU computer scientists, stays ahead by generating a high volume of prompts and continuously learning from its successes and failures, in effect allowing attackers to outsmart LLM developers with their own tools.

The researchers began by constructing a training dataset that combined effective prompts from the earlier jailbreaking reverse-engineering phase with unsuccessful prompts, to teach Masterkey what to avoid. This dataset served as the starting point for continuous pre-training and task tuning of an LLM.
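The paper’s exact data pipeline is not reproduced here, but as a minimal sketch, assuming the effective and unsuccessful prompts are collected as plain text and written out as labeled records for later tuning, the dataset construction could look roughly like the following Python. The file name, labels, and record format are illustrative assumptions, not the researchers’ actual artifacts.

import json

# Hypothetical example records; in practice these would come from the
# reverse-engineering phase described above.
successful_prompts = [
    "example prompt that slipped past a target chatbot's guardrails",
]
failed_prompts = [
    "example prompt that a target chatbot refused to act on",
]

def build_training_records(successes, failures):
    # Label each prompt so a model can later be tuned toward what worked
    # and away from what did not (the label scheme is an assumption).
    records = [{"prompt": p, "label": "effective"} for p in successes]
    records += [{"prompt": p, "label": "ineffective"} for p in failures]
    return records

# One JSON object per line, a common format for fine-tuning jobs.
with open("jailbreak_prompts.jsonl", "w", encoding="utf-8") as f:
    for record in build_training_records(successful_prompts, failed_prompts):
        f.write(json.dumps(record) + "\n")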

By exposing the model to diverse information and honing its abilities on tasks directly related to jailbreaking, the researchers produced an LLM better able to predict how to manipulate text for jailbreaking, and hence to generate more potent and universally effective prompts.
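If the continuous pre-training and task tuning are read as standard causal-language-model fine-tuning on such a prompt corpus, a minimal sketch with the open-source Hugging Face transformers and datasets libraries could look like this. The base model (gpt2 here), file name, and hyperparameters are placeholders for illustration; the paper’s actual model and training setup are not assumed.

from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model = "gpt2"  # placeholder base model for illustration only
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no padding token by default
model = AutoModelForCausalLM.from_pretrained(base_model)

# One JSON object per line with a "prompt" field, as in the sketch above.
dataset = load_dataset("json", data_files="jailbreak_prompts.jsonl")["train"]

def tokenize(batch):
    return tokenizer(batch["prompt"], truncation=True, max_length=256)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="masterkey-sketch",
                           num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=tokenized,
    # mlm=False gives plain next-token (causal) language modeling.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()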

Masterkey’s prompts were found to be three times more effective in jailbreaking LLMs compared to those generated by traditional LLMs. Moreover, Masterkey can autonomously learn from previous unsuccessful prompts and continually produce new, more effective prompts. 
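The learn-from-failure behavior described above can be pictured, in heavily simplified form, as a generate-test-record loop. Both functions below are hypothetical stand-ins: a real system would call the tuned generator model and the target chatbot at those points, neither of which is modeled here.

import random

def generate_candidate_prompt(effective_pool):
    # Stand-in generator: returns a variation of a previously effective prompt.
    base = random.choice(effective_pool)
    return base + " (variant)"

def target_rejects(prompt):
    # Stand-in target chatbot: randomly "refuses", imitating a guardrail.
    return random.random() < 0.7

def feedback_loop(rounds=20):
    effective, ineffective = ["seed prompt"], []
    for _ in range(rounds):
        candidate = generate_candidate_prompt(effective)
        if target_rejects(candidate):
            ineffective.append(candidate)   # failures teach what to avoid
        else:
            effective.append(candidate)     # successes seed the next round
    return effective, ineffective

wins, losses = feedback_loop()
print(f"{len(wins) - 1} effective and {len(losses)} ineffective candidates generated")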

The researchers suggest that an LLM such as Masterkey can itself be used by developers to reinforce their security measures.

NTU Ph.D. student Mr. Deng Gelei, a co-author of the paper, says “As LLMs continue to evolve and expand their capabilities, manual testing becomes both labor-intensive and potentially inadequate in covering all possible vulnerabilities. An automated approach to generating jailbreak prompts can ensure comprehensive coverage, evaluating a wide range of possible misuse scenarios.” 
