The Importance of Busting Open the Black Box of LLMs
As the world settles into the era of AI, industries are lining up to harvest the fruits of its labour. Powering everything from chatbots to document summarisers to code generators, its effects in a mere two years have been monumental, both for companies' internal operations and for the offerings they provide to customers.
Yet behind these marvels lies the system working away to make it all happen: deep learning. The very complexity that grants LLMs the power to drive applications like ChatGPT also shrouds them in mystery, earning them the ominous moniker of 'black boxes'.
What this means is that visibility into how everything gets done is limited: how, exactly, does the model get from point A to point B? Casual consumers don't need this added information; they just want the end product.
But for the enterprise user, operating AI in an age of regulation and business intelligence, understanding how LLMs arrive at their conclusions has become essential.
The evolution of LLMs
The development of LLMs has been marked by significant milestones in recent years, yet it is a handful of key advancements that have been the real game-changers.
“LLMs have undergone rapid evolution in recent years, with three primary branches emerging on their evolutionary tree: Encoder-only, Encoder-Decoder, and Decoder-only groups of models,” says Pramod Beligere, Vice President and Generative AI Practice Head at Hexaware.
All three branches build on the Transformer architecture, introduced in 2017, which underpinned key releases such as the language model BERT in 2018, with its bidirectional training, and GPT-2 in 2019, which demonstrated impressive text generation capabilities.
All this paved the way for the Gen AI revolution we now find ourselves in, with the release of models such as GPT-3, which boasts an impressive 175 billion parameters, significantly enhancing context comprehension and text generation capabilities.
But it is these architectural advances, working in tandem with deep learning at scale, that brought about the new capabilities.
“By leveraging neural networks, especially transformer architectures, deep learning enables LLMs to process and generate human-like text. Techniques such as attention mechanisms allow models to focus on relevant sections of the input data, enabling better understanding of the context,” explains Pramod.
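To give a flavour of the attention mechanism Pramod describes, below is a minimal sketch of scaled dot-product attention in Python using NumPy. The toy dimensions and random inputs are purely illustrative and not drawn from any real model.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row-wise max for numerical stability before exponentiating
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Each query scores every key; higher scores mean "attend more"
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # (seq_len, seq_len) score matrix
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V, weights          # weighted mix of value vectors

# Toy example: a 4-token sequence with 8-dimensional embeddings
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))

output, attn = scaled_dot_product_attention(Q, K, V)
print(attn.round(2))  # each row shows how much one token attends to the others
```

Each row of the printed matrix shows how strongly one position focuses on every other position, which is exactly the "relevant sections of the input" behaviour described above.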
The black box problem
Despite their impressive capabilities, it is difficult to see inside LLMs and understand their decision-making processes. Consider a model like GPT-3: its 175 billion parameters form an intricate web of interconnections, making it nearly impossible to trace the exact path from input to output.
This complexity is further compounded by the model's use of attention mechanisms, which allow it to focus on different parts of the input when generating each word of the output. The mapping from input to output is therefore neither linear nor easily interpretable; it is the result of countless subtle interactions between these parameters.
Understanding how specific inputs lead to specific outputs is therefore comparable to finding a needle in a haystack.
From an enterprise perspective, not knowing how a faulty assumption was made means not knowing where the problem in an output originated, making errors difficult to rectify.
Equally, this lack of transparency in LLMs has led to significant issues and misunderstandings.
“The ‘black box’ nature of LLMs raises concerns about bias and accountability,” explains Fred Werner, Co-founder of the UN’s AI for Good.
Indeed, models from OpenAI and Google alike have been criticised for generating biased or discriminatory content and for enabling the creation of misinformation.
Understanding this is important not only to avoid litigation, but also to stay on the right side of regulation. As the use of LLMs becomes more widespread, regulatory frameworks are evolving to address transparency issues.
“Regulations like the GDPR emphasise data protection and the right to explanation, requiring organisations to provide understandable information about automated decision-making processes,” says Pramod. “Also, the EU AI Act aims to set stricter transparency and accountability standards for high-risk AI systems.”
This underscores the need for greater transparency and accountability in LLMs as they become more and more integral to operations around the globe. But how can you get a window into these internal machinations and make them inherently more understandable?
Busting open black boxes
To make deep learning models like LLMs more understandable, researchers are developing tools that enhance their transparency.
“These efforts contribute directly to the UN’s Sustainable Development Goals (SDGs) by making AI more accessible and reliable for applications in areas like healthcare, agriculture, and disaster management, ensuring that AI can be trusted and effectively used in real-world scenarios,” explains Fred.
These include visualisation tools, which use techniques like attention maps and saliency maps to track how models make decisions, for example by highlighting which parts of the input data matter most for a given output.
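As a rough sketch of how a gradient-based saliency map can be computed, the following Python snippet uses PyTorch with a hypothetical toy classifier standing in for a real model; the feature count and architecture are assumptions for illustration only.

```python
import torch
import torch.nn as nn

# A hypothetical toy classifier standing in for a real model
# (untrained; purely to illustrate the saliency technique)
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
model.eval()

x = torch.randn(1, 16, requires_grad=True)  # one input with 16 features
logits = model(x)
top_class = logits.argmax(dim=-1).item()

# Backpropagate the winning class score down to the input features
logits[0, top_class].backward()

# The gradient magnitude per feature is a simple saliency score:
# features with larger gradients influenced the prediction more
saliency = x.grad.abs().squeeze()
print(saliency)
```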
Explainable AI techniques like LIME (Local Interpretable Model-agnostic Explanations) and SHAP (SHapley Additive exPlanations) can also help interpret model predictions by highlighting important features.
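As a hedged illustration of how such a technique is typically applied, here is a minimal LIME sketch using the open-source lime and scikit-learn packages; the tabular dataset and random forest are stand-ins for demonstration, not an LLM.

```python
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

# Train a simple stand-in model on a standard tabular dataset
data = load_breast_cancer()
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(data.data, data.target)

# LIME perturbs the input and fits a local, interpretable model
# around a single prediction to explain it
explainer = LimeTabularExplainer(
    data.data,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    mode="classification",
)
explanation = explainer.explain_instance(
    data.data[0], model.predict_proba, num_features=5
)

# Each pair is (feature condition, weight): how strongly that
# feature pushed this particular prediction up or down
print(explanation.as_list())
```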
And, perhaps most importantly, researchers are developing inherently interpretable models, such as decision trees or rule-based systems, to work alongside deep learning models, putting demystification at the very heart of any AI project.
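To show what 'inherently interpretable' means in practice, here is a minimal sketch, assuming scikit-learn and using the classic iris dataset as a stand-in, that prints a shallow decision tree as human-readable rules.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# A shallow decision tree is interpretable by construction:
# its entire decision process can be printed as readable rules
data = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(data.data, data.target)

print(export_text(tree, feature_names=list(data.feature_names)))
```

Unlike an LLM's billions of parameters, every prediction this model makes can be traced, line by line, through the printed rules.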
“Future developments of LLMs will likely focus on enhancing interpretability and reducing biases to foster greater trust, ethical use and regulatory compliance,” explains Pramod.
While increased transparency in LLMs offers numerous benefits, it also carries potential risks: increased complexity, exposure of proprietary information and security vulnerabilities all become possibilities.
Yet this need not be a case of being caught between a rock and a hard place: businesses can adopt a layered transparency approach.
“Businesses can balance the pros of visibility with the cons of a more open security, providing sufficient detail to stakeholders without compromising proprietary information or security, and implementing robust governance frameworks and regularly auditing models to help manage risks while reaping the benefits,” says Pramod.
Looking ahead, the landscape of LLMs is shaping up to be one of greater inclusivity, transparency and ethical use. Equally, regulations may tighten further to mandate more detailed documentation, bias audits and explainability requirements for AI models.
However it happens, and however the new challenges are balanced, remains to be seen. What is becoming clearer is that the age of the black-box LLM may soon be coming to an end.