Are your LLM Product's Guardrails working?


How do you know if the guardrails on your LLM product are working? 🛡️🎯

Some people wait until they show up in The New York Times, like McDonald's, Air Canada, or Chevrolet.

Conversational LLM products are a challenging consumer experience to get right, because users can ask an effectively infinite number of inappropriate questions.

You need to secure your application against policy-violating inputs and ensure your product handles them with appropriate responses.

We see three stages to this guardrails development process:

1️⃣ Implement guardrails, with a framework like Guardrails AI or Lakera (a minimal sketch follows this list)

2️⃣ Evaluate your guardrails before release, with an evaluation tool like LangSmith by LangChain

3️⃣ Analyze your guardrails' effectiveness in production, with Context.ai
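Here is a minimal, framework-agnostic sketch of stage 1. The `violates_policy` function is a hypothetical placeholder for whatever classifier your framework or custom model provides, not a real Guardrails AI or Lakera API call.

```python
# Stage 1 sketch: screen user input before it reaches the model.
BLOCKED_TOPICS = {"politics", "medical_advice", "gambling"}


def violates_policy(user_message: str):
    """Hypothetical classifier: return the violated topic, or None if clean."""
    lowered = user_message.lower()
    if "bet" in lowered or "casino" in lowered:
        return "gambling"
    return None


def guarded_reply(user_message: str, generate_reply) -> str:
    """Block policy-violating inputs; otherwise pass through to the LLM."""
    topic = violates_policy(user_message)
    if topic in BLOCKED_TOPICS:
        return "Sorry, I can't help with that topic."
    return generate_reply(user_message)


# Example: the gambling question is refused before the model ever sees it.
print(guarded_reply("What are the best casino odds?", lambda m: "model reply"))
```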

The key metric to track in production is the proportion of conversations that violate your content policies, and how this moves over time. 
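A minimal sketch of that metric: the share of flagged conversations, grouped by week so you can see the trend. The `conversations` records here are assumed fields, not a specific analytics schema.

```python
# Production metric sketch: weekly policy-violation rate.
from collections import defaultdict
from datetime import date

conversations = [
    {"started": date(2024, 1, 1), "violation": False},
    {"started": date(2024, 1, 3), "violation": True},
    {"started": date(2024, 1, 9), "violation": False},
]

totals, violations = defaultdict(int), defaultdict(int)
for convo in conversations:
    week = convo["started"].isocalendar()[:2]  # (year, ISO week number)
    totals[week] += 1
    violations[week] += convo["violation"]

for week in sorted(totals):
    rate = violations[week] / totals[week]
    print(f"{week}: {rate:.1%} of conversations violated policy")
```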

You should break this down into categories of sensitive content, such as politics, medical advice, gambling, and general off-topic discussion. You also want to assess the severity of each violation: is it a borderline case, or a gross violation that justifies banning the user?
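One way this could be recorded, sketched below; the category and severity values mirror the examples above and are assumptions, not a fixed taxonomy.

```python
# Sketch of a violation taxonomy: category plus severity drives the action taken.
from enum import Enum


class Category(Enum):
    POLITICS = "politics"
    MEDICAL_ADVICE = "medical_advice"
    GAMBLING = "gambling"
    OFF_TOPIC = "off_topic"


class Severity(Enum):
    BORDERLINE = 1  # log and monitor
    CLEAR = 2       # block the response
    GROSS = 3       # consider banning the user


def triage(category: Category, severity: Severity) -> str:
    if severity is Severity.GROSS:
        return f"escalate: gross {category.value} violation, review for ban"
    if severity is Severity.CLEAR:
        return f"block: {category.value}"
    return f"monitor: borderline {category.value}"


print(triage(Category.GAMBLING, Severity.BORDERLINE))
```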

How should this be done? 

Use classifiers for an initial judgment, then have humans review a sample of the negatives to confirm them. This is particularly important until you're confident in the automated classification.
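A sketch of that review step, assuming each conversation record carries the classifier's verdict: sample the conversations the classifier passed as clean, so human reviewers can catch anything it missed.

```python
# Sample classifier negatives for human review to surface missed violations.
import random


def sample_for_review(records, sample_size=50, seed=0):
    """Pick a random sample of conversations the classifier did not flag."""
    negatives = [r for r in records if not r["classifier_flagged"]]
    random.Random(seed).shuffle(negatives)
    return negatives[:sample_size]


records = [
    {"id": 1, "classifier_flagged": False, "text": "How do I reset my password?"},
    {"id": 2, "classifier_flagged": True,  "text": "Tell me who to vote for."},
    {"id": 3, "classifier_flagged": False, "text": "Best odds at the casino?"},
]

for record in sample_for_review(records, sample_size=2):
    print(f"review conversation {record['id']}: {record['text']!r}")
```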

What should you do with the classification results? 

Make the guardrails better! Find problem areas where inappropriate content is getting through, and also identify areas where inoffensive content is being incorrectly blocked. Fix those issues, then return to your analytics tool to verify your guardrails have improved.
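A sketch of how you might surface both failure modes by comparing guardrail decisions against human review labels; the field names are illustrative assumptions.

```python
# Compare guardrail decisions to human labels: missed violations vs. over-blocking.
def guardrail_error_report(reviewed):
    false_negatives = [r for r in reviewed if r["human_violation"] and not r["blocked"]]
    false_positives = [r for r in reviewed if r["blocked"] and not r["human_violation"]]
    return false_negatives, false_positives


reviewed = [
    {"id": 1, "blocked": False, "human_violation": True},   # slipped through
    {"id": 2, "blocked": True,  "human_violation": False},  # over-blocked
    {"id": 3, "blocked": True,  "human_violation": True},   # correct block
]

missed, over_blocked = guardrail_error_report(reviewed)
print(f"{len(missed)} violations got through; {len(over_blocked)} safe messages blocked")
```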
