Mitigating Safety Risks with AI-Powered Applications

September 26, 2024

by Serdar Badem and Camille Crowell-Lee 

A version of this post was published in The New Stack. 

Enterprises want to make customer interactions more efficient by adding generative AI (GenAI) to their chatbot applications. However, navigating the safety concerns and risks of incorporating large language models (LLMs) into existing or new applications can leave a lot of open questions.

According to the analyst firm IDC*, 30% of organizations cite “loss of control over data and IP” (intellectual property) as a hurdle to adopting GenAI. But data loss is just one of the risks you need to plan around when implementing GenAI in your applications.

Get Prepared: Regulation Is Coming to AI

The EU AI Act and the regulatory frameworks emerging in the United States, China and India are setting guidelines to mitigate the risks associated with AI-powered applications.

Besides being prepared for emerging regulation, defining a safety strategy for your AI-powered apps is essential to avoid security pitfalls. Consider the business impacts and how to mitigate risks when incorporating the LLMs needed for delivering GenAI applications.

Here are the elements to consider when incorporating GenAI into your applications safely:

    • Content anomaly detection. Monitor outputs for anomalies like inaccurate responses or hallucinations to help ensure content is appropriate and correct. Models have been known to misbehave (think of chatbots that recommended adding glue to pizza), so they need constant monitoring and tuning.
    • Data protection. Safeguard data privacy and security throughout the AI life cycle to prevent leakage of private data. For example, in the event that chatbot users accidentally input personally identifiable information (PII), redaction strategies should be in place to prevent that data from proliferating (a minimal redaction sketch follows this list).
    • Accidental IP infringement. LLMs trained on large datasets can inadvertently leak sensitive information, and queries designed to exploit this behavior can extract confidential data, posing privacy risks. Organizations should also consider the provenance of the data in their base model: content should be ethically sourced, not indiscriminately scraped from the web on the assumption of “fair use” (think Andersen v. Stability AI in 2023).
    • Explainability and transparency. Ensure that AI models provide clear and understandable outputs. This enables easier troubleshooting and identification of errors or biases in the model, key to building trust for your organization’s brand.
    • Application security. Implement security measures to protect intelligent applications from vulnerabilities. Just like any application, code security is paramount.
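
To make the data protection point concrete, here is a minimal sketch of input redaction, assuming a regex-based scrubber applied to user messages before they reach the model or its logs. The patterns and the redact_pii helper are illustrative only; a production deployment would rely on a dedicated PII detection service and cover far more entity types.

```python
import re

# Illustrative patterns only -- a production redactor would use a dedicated
# PII detection library and cover many more entity types.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b(?:\+?1[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b"),
}

def redact_pii(text: str) -> str:
    """Replace recognizable PII with typed placeholders before the text
    is sent to the LLM or written to logs."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED_{label.upper()}]", text)
    return text

user_message = "My email is jane.doe@example.com and my SSN is 123-45-6789."
print(redact_pii(user_message))
# -> My email is [REDACTED_EMAIL] and my SSN is [REDACTED_SSN].
```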

Adversarial Resistance: Protect Your LLMs From Bad Actors

The missteps outlined above are largely inadvertent, but what if you are targeted by bad actors?

Malicious attackers who attempt to exploit your AI-powered applications can quickly become a problem if you have not properly prepared.

LLMs are vulnerable to these adversarial attacks because they are stateless and malleable. If a bad actor injects inputs that are intentionally crafted to confuse the model, they can force it to produce incorrect or harmful outputs.
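
As one illustration of a defensive layer, the sketch below screens user input against a small deny-list of common injection phrases and keeps system and user content in separate message roles rather than concatenating them into a single prompt. The INJECTION_PATTERNS list and the screen_user_input and build_messages helpers are illustrative names, not any particular product's API, and determined attackers will paraphrase, so this belongs alongside model-side guardrails and output filtering.

```python
import re

# Phrases common in naive prompt-injection attempts. Illustrative only --
# determined attackers paraphrase, so treat this as one layer among several.
INJECTION_PATTERNS = [
    r"ignore (all|any|previous|prior) (instructions|rules)",
    r"disregard (the|your) (system prompt|instructions)",
    r"you are now (in )?developer mode",
    r"reveal (your|the) (system prompt|hidden instructions)",
]

def screen_user_input(text: str) -> bool:
    """Return True if the input looks like a prompt-injection attempt."""
    lowered = text.lower()
    return any(re.search(pattern, lowered) for pattern in INJECTION_PATTERNS)

def build_messages(user_text: str) -> list[dict]:
    """Keep system instructions and user content in separate roles rather
    than concatenating them into one string an attacker can rewrite."""
    if screen_user_input(user_text):
        raise ValueError("Input rejected by injection screen")
    return [
        {"role": "system", "content": "You are a support assistant. Answer only product questions."},
        {"role": "user", "content": user_text},
    ]
```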

Figure: Safety considerations in using a large language model. (Source: VMware by Broadcom)

Protecting your models from being compromised is essential for ensuring you don’t end up in a model jailbreak scenario. The risks of not adopting a proactive security stance for your LLMs can be devastating to your organization’s brand in several ways.

    • Spreading misinformation. LLMs can be used to spread false or misleading information deliberately. This includes generating fake news, conspiracy theories or biased content that can misinform and manipulate public opinion (consider chatbots that answered U.S. election questions incorrectly 27% of the time).
    • Harmful prompts and responses. LLMs can be prompted to generate harmful, dangerous or illegal content. This includes instructions on creating weapons, committing crimes or self-harm.
    • Facilitation of criminal activities. LLMs can assist in planning and executing criminal activities by providing detailed instructions on illegal activities or helping to coordinate illicit operations.
    • Bias and discrimination. LLMs can perpetuate and amplify biases present in their training data, leading to discriminatory or prejudiced outputs. This can harm marginalized groups and reinforce stereotypes (think of Microsoft's Tay launch).
    • Evasion of safety mechanisms. Fine-tuning with harmful data can easily compromise the safety mechanisms of base LLMs. This allows malicious actors to create versions of the models that ignore or bypass built-in safety features.
    • Manipulation and deception. Advanced LLMs can create realistic fake content, including deepfakes, that can deceive individuals. This poses risks in personal, professional and political contexts (consider the debunked deepfake audio of a British politician).
    • Interaction with vulnerable users. Interactions with LLMs can adversely affect vulnerable users. For example, individuals experiencing mental health crises might receive responses that exacerbate their condition. Over-relying on LLMs for critical decisions without adequate oversight may lead to poor decision-making and harmful outcomes.

Benefits and Action Plan for Risk Mitigation

Organizations using GenAI to enhance end-user experiences need to take a risk-management approach that addresses both people and technology transformation. With a proper AI risk-mitigation plan in place, you can capitalize on GenAI safely.

Model safety is a new consideration when you are implementing intelligent software delivery programs. Assessing all the building blocks that go into a retrieval-augmented generation (RAG) app is crucial for keeping downstream behavior within appropriate guardrails, so being intentional with your plan early on will help you avoid problems with your model later.
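
As a sketch of what that early assessment might cover, the example below walks a single RAG request through safety checkpoints at each building block: input redaction and injection screening (reusing the helpers sketched earlier), retrieval restricted to approved sources, and a grounding check on the output. The retrieve_documents and generate_answer stubs are placeholders for your vector store and model client, and the grounding heuristic is deliberately crude; treat the structure, not the specifics, as the takeaway.

```python
# Placeholder stubs: swap in your vector store and model client. Retrieval is
# restricted to approved, ethically sourced collections so provenance is known.
def retrieve_documents(question: str) -> list[str]:
    return ["Product X supports single sign-on via SAML and OIDC."]

def generate_answer(question: str, docs: list[str]) -> str:
    return "Product X supports single sign-on via SAML and OIDC."

def is_grounded(answer: str, docs: list[str], threshold: float = 0.5) -> bool:
    # Crude grounding heuristic: share of answer words that also appear in the
    # retrieved context. Real deployments use dedicated evaluation tooling.
    answer_words = set(answer.lower().split())
    context_words = set(" ".join(docs).lower().split())
    return len(answer_words & context_words) / max(len(answer_words), 1) >= threshold

def answer_question(question: str) -> str:
    question = redact_pii(question)           # input checkpoint (sketched earlier)
    if screen_user_input(question):           # injection checkpoint (sketched earlier)
        return "Sorry, I can't help with that request."
    docs = retrieve_documents(question)       # retrieval checkpoint
    answer = generate_answer(question, docs)  # generation
    if not is_grounded(answer, docs):         # output checkpoint
        return "I couldn't produce a reliable answer to that question."
    return answer

print(answer_question("Does Product X support single sign-on?"))
```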

Implementing a mitigation plan not only helps ensure clear governance over data, application design, training processes and outputs, but it also increases confidence in the decisions made by AI models. With stronger security measures and trustworthy AI outputs, organizations can gain their customers’ trust and increase customer satisfaction.

Here are activities for improving the risk management, security and governance of your AI-powered apps.

    • Identify governance needs. Assess the risks specific to your organization, including the data sources used, potential biases and security threats. Identify areas where controls are needed to mitigate these risks.
    • Establish a governance structure. Create and implement a governance structure that includes clear roles and responsibilities for managing AI models and their outputs. This structure should also outline processes for handling potential issues or incidents from both a personnel and technology standpoint.
    • Create an AI model inventory. Keep track of all AI models in use, their purpose and the associated risks for proper governance and risk management. Understand the potential risks associated with each AI project you undertake and determine which components of your policy apply.
    • Design your intelligent apps for security. Design your AI applications with security in mind, including using secure coding practices and performing regular vulnerability assessments. Use industry-standard security measures such as encryption, authentication and access control to safeguard data and models.
    • Continuously monitor for anomalies. Regularly monitor and assess AI model outputs for anomalies or potential biases, using tools such as content anomaly detection systems or bias detection algorithms. Continuously monitor for vulnerabilities that may compromise the safety and trustworthiness of your application (a minimal monitoring sketch follows this list).
    • Implement explainability and transparency. Incorporate interpretability techniques into your model design to provide transparent outputs. Ensure that your AI models can provide clear explanations for their decisions, and make sure all stakeholders understand the reasoning behind these outputs.
    • Train your team on best practices and reevaluate regularly. Educate your team on the latest best practices, including security measures, explainability techniques and data protection protocols. As new risks arise or data sources change, it’s crucial to update and reevaluate your AI trust, risk and security management controls regularly to maintain a robust security posture.
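
To illustrate the continuous-monitoring activity, here is a minimal sketch of an output monitor that flags responses containing PII-like strings and keeps a rolling anomaly rate you can alert on. The patterns, window size and alert threshold are assumptions to adapt; a real deployment would add bias and hallucination detectors and feed the flags into your observability stack.

```python
import re
from collections import deque

# Patterns that should never appear in chatbot responses. Illustrative only.
LEAK_PATTERNS = [
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),  # email address
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),         # US SSN format
]

class OutputMonitor:
    """Flags responses that leak PII-like strings and tracks a rolling
    anomaly rate over the most recent responses."""

    def __init__(self, window: int = 100, alert_rate: float = 0.05):
        self.results = deque(maxlen=window)
        self.alert_rate = alert_rate

    def check(self, response: str) -> bool:
        """Record one model response; return True if it was flagged."""
        flagged = any(p.search(response) for p in LEAK_PATTERNS)
        self.results.append(flagged)
        if self.anomaly_rate() > self.alert_rate:
            print(f"ALERT: anomaly rate {self.anomaly_rate():.1%} "
                  f"over last {len(self.results)} responses")
        return flagged

    def anomaly_rate(self) -> float:
        return sum(self.results) / max(len(self.results), 1)

monitor = OutputMonitor()
monitor.check("Your ticket has been escalated to our support team.")
monitor.check("Sure, the customer's email is jane.doe@example.com.")  # flagged, triggers alert
```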

Enterprises are eager to move from experimentation with AI to running intelligent workloads in production safely. While AI brings unique risks, reinforcing fundamental cloud native patterns and practices can help you get there more quickly.


Want to learn more? Watch this VMware and Forrester on-demand webinar for an insightful discussion where we cover the latest GenAI trends and essential considerations to help you execute on a comprehensive AI strategy quickly. Watch now!

*IDC, The IT Industry's AI Pivot: Thinking Global, Delivering Locally, doc #DIR2024_GS2_RV, May 2024
