Generative AI Product Problems #5: Security

What are the security risks of deploying LLMs to production and what can you do to stay prepared?

Sundar Solai

Oct 31, 2023 — 2 min read

Don’t let bad actors hijack your LLM.

If you’re going to put your AI assistant out into the world, you need to be confident you aren’t introducing vulnerabilities.

You’ll need to be prepared for a variety of LLM abuse vectors, including:

Prompt injection: An adversary could tell your model to ignore previous instructions and do something else.
Jailbreaking: Related to prompt injection, jailbreaking involves circumventing an LLM’s safety measures to elicit inappropriate responses.
Prompt leaking: Somebody could coax an LLM to unveil how exactly it’s been prompted, potentially exposing trade secrets or sensitive information.
Training data poisoning: Instead of directly attacking your LLM, somebody tampering with its training data or any dataset the model consumes can just as well corrupt its behavior.
Unauthorized code execution: The more you integrate your LLM with other systems and APIs, the larger the surface area you have to protect. Somebody malicious could try and use your LLM to break into other systems or execute arbitrary code.

So what should you do?

There’s no one-size-fits-all solution, especially because attackers are regularly devising new strategies.

That said, here are some things you can try:

Filtering: Scan inputs for abusive keywords or suspicious text and filter them out if detected. You can use a conventional ML model or even another LLM.
Defend your prompts: Incorporate instructions in your prompts that prepare your LLM for malicious inputs. This could be as simple as telling your LLM to ignore further instructions, or something more sophisticated like random sequence enclosure.
Defend your product: Consider how you’re exposing your LLM to users in the first place. It’s often safer to put an LLM behind the scenes and let it pull specific strings you’ve defined (much like a puppet show). Or, you can try limiting how users provide input to your LLM. Pre-processing their text before giving it to your model could limit vulnerabilities.

Throughout all of this, it’s important to have a clear understanding of how your users are interacting with your model. Context.ai is the analytics layer that gives you that visibility. With it, you can monitor behavior at the individual user level and put a stop to bad actors.

Request a demo today at context.ai/demo.

What product experiences are enabled by multi-agent LLM frameworks?

It feels like everyone is excited about multi-agent frameworks - even though their performance isn’t yet ready for prime-time. These performance problems are improving with increasingly powerful models like Claude 3 and GPT-4o - and great things are expected from GPT-5, a launch that will likely make agentic workflows

Launching Custom Conversion Events - Product Update | July 2024

Today we’re launching support for custom conversion events 🧾 This addresses one of the biggest challenges in the LLM ecosystem - proving ROI 📈 Context.ai users can now log custom conversion events with their LLM conversation transcripts, indicating where users completed an action: a purchase, a link click, or even

Are your LLM Products Guardrails working?

How do you know if the guardrails on your LLM product are working? 🛡️🎯 Some people wait until they show up in the The New York Times - like McDonald's, Air Canada, or Chevrolet Conversational LLM products are a challenging consumer experience as users can ask an infinite number

Is LLM progress slowing?

LLMs haven’t significantly improved since GPT4: is progress slowing? 🐢 Dramatically more powerful model training clusters are being built: 15 of them, with 31 times more power than trained GPT4 This means models much more powerful than GPT4 are coming 🐇 SemiAnalysis did a phenomenal deep dive into this topic -

Read more

What product experiences are enabled by multi-agent LLM frameworks?

Launching Custom Conversion Events - Product Update | July 2024

Are your LLM Products Guardrails working?

Is LLM progress slowing?