Prompt Engineering: Reducing Unwanted LLM Behaviors
Imagine you've built an app for learning about historical events. You've fine-tuned an open-source LLM to drive the core interactive chat functionality for this product.
However, you've received some concerning user feedback. Users are frustrated by the model's tendency to lecture them on ethical matters, even when no such commentary was requested. This is particularly problematic when users are trying to learn about complex or nuanced historical topics.
In this tutorial, you'll learn how to use Mandoline to measure and reduce this behavior through prompt engineering.
Note: this tutorial is also available as a ready-to-run script in both Node.js and Python.
What You'll Learn
- How to create a custom metric for evaluating LLM responses
- How to test different prompt structures
- How to analyze results to improve your LLM's conversational style
Prerequisites
Before starting, make sure you have:
- Node.js installed on your system
- A Mandoline account and API key
- Access to your LLM
Step 1: Set Up Your Experiment
First, install Mandoline:
npm install mandoline
Then, set up your Mandoline client:
import { Mandoline } from "mandoline";
const mandoline = new Mandoline();
Note: we've set the Mandoline API key using the MANDOLINE_API_KEY environment variable.
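If you'd like your script to fail fast when the key is missing, a small guard helps. This is just a convenience sketch, assuming the client reads the key from the environment as described above:
// Fail fast if the API key isn't set before the client makes any calls.
if (!process.env.MANDOLINE_API_KEY) {
  throw new Error("MANDOLINE_API_KEY is not set. Export it and try again.");
}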
Step 2: Create a Use-Case Specific Metric
Let's create a metric to measure moralistic language:
const metric = await mandoline.createMetric({
  name: "Moralistic Tendency",
  description:
    "Assesses how frequently the model adopts a moralistic tone or attempts to lecture users on ethical matters.",
  tags: ["tone", "personality", "user_experience"],
});
This metric directly targets the frustration you identified through user feedback.
Step 3: Test Different Prompts
Now, let's test different prompt structures against a series of controversial historical events:
async function testPrompt(template: string, event: string) {
  const prompt = template.replace("{event}", event);
  // yourLLM is a placeholder for your own model integration (see below).
  const response = await yourLLM.generate(prompt);
  return mandoline.createEvaluation({
    metricId: metric.id,
    prompt,
    response,
    properties: { template, event },
  });
}
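Here, yourLLM stands in for however you call your fine-tuned model; Mandoline doesn't provide it. As a minimal sketch, assuming you serve the model behind an OpenAI-compatible chat endpoint (the URL and model name below are placeholders to adapt to your setup):
// Hypothetical wrapper around your fine-tuned model; adapt to your stack.
// Node 18+ provides the global fetch used here.
const yourLLM = {
  async generate(prompt: string): Promise<string> {
    const res = await fetch("http://localhost:8000/v1/chat/completions", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model: "your-finetuned-model", // placeholder
        messages: [{ role: "user", content: prompt }],
      }),
    });
    const data = await res.json();
    return data.choices[0].message.content;
  },
};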
const events = [
  "The use of atomic bombs in World War II",
  "The Industrial Revolution",
  // Add more events...
];

const promptTemplates = [
  "Discuss the historical event: {event}",
  "Provide an objective overview of: {event}",
  "Describe the facts surrounding: {event}",
  "Outline key points of: {event} without moral judgment",
  // Add more templates...
];
const results = await Promise.all(
  events.flatMap((event) =>
    promptTemplates.map((template) => testPrompt(template, event)),
  ),
);
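One caveat: Promise.all runs every generation and evaluation concurrently, which can trip rate limits on your model server or the Mandoline API. If that happens, a sequential loop is a simple fallback; this sketch swaps in for the Promise.all call above:
// Sequential fallback: run one prompt/evaluation pair at a time.
const results: Awaited<ReturnType<typeof testPrompt>>[] = [];
for (const event of events) {
  for (const template of promptTemplates) {
    results.push(await testPrompt(template, event));
  }
}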
Note: the properties field stores information about your experiment, which will help with later analysis.
Step 4: Analyze the Results
Let's dig deeper into our data:
// Overall moralistic tendency
const avgScore = results.reduce((sum, r) => sum + r.score, 0) / results.length;
console.log(`Average Moralistic Tendency: ${avgScore.toFixed(2)}`);
// Moralistic tendency by event.
// groupBy comes from a utility library such as lodash, whose string
// shorthand supports nested paths like "properties.event".
const eventScores = groupBy(results, "properties.event");
Object.entries(eventScores).forEach(([event, evals]) => {
  const eventAvg = evals.reduce((sum, e) => sum + e.score, 0) / evals.length;
  console.log(`${event}: ${eventAvg.toFixed(2)}`);
});
// Best prompt structure (lower score = less moralistic)
const promptScores = groupBy(results, "properties.template");
const bestPrompt = Object.entries(promptScores)
  .map(([template, evals]) => ({
    template,
    avgScore: evals.reduce((sum, e) => sum + e.score, 0) / evals.length,
  }))
  .reduce((best, current) =>
    current.avgScore < best.avgScore ? current : best,
  );
console.log(`Best prompt: ${bestPrompt.template}`);
This analysis helps you understand:
- How moralistic your LLM's responses are overall
- Which events trigger more moralistic responses
- Which prompt structures lead to more balanced responses
Step 5: Refine Your Approach
Based on these insights, you can now:
- Understand which topics trigger more moralistic responses
- Identify effective prompt structures for reducing moralistic tendencies
- Improve your LLM application to meet users' preferences for objective historical discussions, for instance by adopting the best-performing template (see the sketch below)
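For that last point, a small helper can standardize your app on the winning template. This is a minimal sketch; buildPrompt is a hypothetical name, not part of Mandoline:
// Hypothetical helper: build production prompts from the best template.
function buildPrompt(event: string): string {
  return bestPrompt.template.replace("{event}", event);
}

// Usage:
const historyPrompt = buildPrompt("The Industrial Revolution");
const answer = await yourLLM.generate(historyPrompt);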
Conclusion
You've now used Mandoline to:
- Create a custom metric targeting a specific user frustration
- Test different prompt structures to address this issue
- Analyze results to improve your LLM's responses
This process lets you act directly on user feedback about unwanted moralistic tendencies, which should translate into a better experience for your users.
Next Steps
- Apply this process to other aspects of your app, perhaps creating other user-centric metrics.
- Use Mandoline to track your LLM's performance over time as you implement changes (a sketch follows this list).
- Explore our Model Selection tutorial to learn how to compare different LLMs for your use case.
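For ongoing tracking, one option is a recurring regression check that re-evaluates your winning prompt against the same metric. A rough sketch: checkRegression and the 0.5 alert threshold are assumptions, with scores read as lower-is-better to match the analysis in Step 4:
// Hypothetical regression check: re-evaluate the winning prompt and warn
// if moralistic tendency creeps back up. The 0.5 threshold is an assumption.
async function checkRegression(event: string) {
  const prompt = bestPrompt.template.replace("{event}", event);
  const response = await yourLLM.generate(prompt);
  const evaluation = await mandoline.createEvaluation({
    metricId: metric.id,
    prompt,
    response,
    properties: { run: "regression-check", event },
  });
  if (evaluation.score > 0.5) {
    console.warn(`Score ${evaluation.score} exceeds threshold for: ${event}`);
  }
}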
By periodically refining your prompts and monitoring performance with Mandoline, you can ensure your app provides the objective, informative experience your users want.