Mandoline API Reference

Table of Contents

  1. Authentication
  2. Installation
  3. Setup
  4. Data Models
  5. Metrics
  6. Evaluations
  7. Advanced Concepts


Authentication

To use the Mandoline API:

  1. Sign up for a Mandoline account.
  2. Get your API key from the account page.


Installation

To install the Mandoline Node.js SDK:

npm install mandoline


Setup

Initialize the Mandoline client with your API key:

import { Mandoline } from "mandoline";
const mandoline = new Mandoline({ apiKey: "your-api-key" });

Or use an environment variable:

// Set MANDOLINE_API_KEY in your environment
const mandoline = new Mandoline();
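
This reference does not say what happens when MANDOLINE_API_KEY is unset, so it may help to fail fast before creating the client. The following is a minimal sketch; `requireApiKey` is a hypothetical helper, not part of the Mandoline SDK, and it takes the environment as an argument so it can be exercised without touching the real process environment.

```typescript
// Hypothetical helper (not part of the Mandoline SDK): reads the API key
// from an environment-like object and fails early when it is missing.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env["MANDOLINE_API_KEY"]?.trim();
  if (!key) {
    throw new Error(
      "MANDOLINE_API_KEY is not set; export it before creating the client.",
    );
  }
  return key;
}

// Possible usage in Node.js (assumes the constructor shown above):
//   const mandoline = new Mandoline({ apiKey: requireApiKey(process.env) });
```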

Data Models

Here are the main data models used in Mandoline:

type UUID = string;
type SerializableDict = { [key: string]: any };
type NullableSerializableDict = SerializableDict | null;
type StringArray = ReadonlyArray<string>;
type NullableStringArray = StringArray | null;

interface Metric {
  id: UUID;
  createdAt: string;
  updatedAt: string;
  name: string;
  description: string;
  tags?: NullableStringArray;
}

interface MetricCreate {
  name: string;
  description: string;
  tags?: NullableStringArray;
}

interface MetricUpdate {
  name?: string;
  description?: string;
  tags?: NullableStringArray;
}

interface Evaluation {
  id: UUID;
  createdAt: string;
  updatedAt: string;
  metricId: UUID;
  prompt: string;
  prompt_image?: string;
  response?: string;
  response_image?: string;
  properties?: NullableSerializableDict;
  score: number;
}

interface EvaluationCreate {
  metricId: UUID;
  prompt: string;
  prompt_image?: string;
  response?: string;
  response_image?: string;
  properties?: NullableSerializableDict;
}

interface EvaluationUpdate {
  properties?: NullableSerializableDict;
}


Metrics

Metrics are used to evaluate specific aspects of LLM performance. To learn more about metrics, see our Core Concepts guide.

Create a Metric

Creates a new evaluation metric.

async createMetric(metric: MetricCreate): Promise<Metric>


  • metric: MetricCreate object
    • name: string (required)
    • description: string (required)
    • tags: NullableStringArray (optional)

Returns: Promise<Metric>


const newMetric = await mandoline.createMetric({
  name: "Response Clarity",
  description: "Measures how clear and understandable the LLM's response is",
  tags: ["clarity", "communication"],
});

Get a Metric

Fetches a specific metric by its unique identifier.

async getMetric(metricId: UUID): Promise<Metric>


  • metricId: UUID (required)

Returns: Promise<Metric>


const metric = await mandoline.getMetric(
  "550e8400-e29b-41d4-a716-446655440000",
);

List Metrics

Fetches a list of metrics with optional filtering.

async getMetrics(options?: {
  skip?: number;
  limit?: number;
  tags?: NullableStringArray;
  filters?: SerializableDict;
}): Promise<Metric[]>


  • options: (optional)
    • skip: number (optional, default: 0)
    • limit: number (optional, default: 100, max: 1000)
    • tags: NullableStringArray (optional)
    • filters: SerializableDict (optional)

Returns: Promise<Metric[]>


const metrics = await mandoline.getMetrics({
  skip: 0,
  limit: 50,
  tags: ["clarity", "communication"],
});

Update a Metric

Modifies an existing metric's attributes.

async updateMetric(metricId: UUID, update: MetricUpdate): Promise<Metric>


  • metricId: UUID (required)
  • update: MetricUpdate object
    • name: string (optional)
    • description: string (optional)
    • tags: NullableStringArray (optional)

Returns: Promise<Metric>


const updatedMetric = await mandoline.updateMetric(
  "550e8400-e29b-41d4-a716-446655440000",
  {
    description: "Updated description for the metric",
    // Fields not included will not be updated
  },
);

Delete a Metric

Removes a metric permanently.

async deleteMetric(metricId: UUID): Promise<void>


  • metricId: UUID (required)

Returns: Promise<void>


await mandoline.deleteMetric("550e8400-e29b-41d4-a716-446655440000");


Evaluations

Evaluations in Mandoline apply metrics to specific LLM interactions. To learn more about evaluations, see our Core Concepts guide.

Create an Evaluation

Performs an evaluation for a single metric on a prompt-response pair. Supports both text and image inputs.

async createEvaluation(evaluation: EvaluationCreate): Promise<Evaluation>


  • evaluation: EvaluationCreate object
    • metricId: UUID (required)
    • prompt: string (required)
    • prompt_image: string (optional)
    • response: string (optional)
    • response_image: string (optional)
    • properties: NullableSerializableDict (optional)

Returns: Promise<Evaluation>

Note: At least one of response or response_image must be provided. Images should be base64 encoded with data URL format (e.g. data:image/[type];base64,[data]).
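
The two rules in the note above can be checked on the client before a request is sent. The following is a minimal sketch under stated assumptions: `EvaluationInput`, `toImageDataUrl`, `IMAGE_DATA_URL`, and `checkEvaluationInput` are hypothetical names that are not part of the Mandoline SDK, and the regular expression is only a loose shape check for data:image/[type];base64,[data], not a full validator.

```typescript
// Field names mirror EvaluationCreate in this reference; everything else
// here is a hypothetical helper, not part of the Mandoline SDK.
interface EvaluationInput {
  prompt: string;
  prompt_image?: string;
  response?: string;
  response_image?: string;
}

// Builds a data URL from a MIME type and already base64-encoded image bytes.
function toImageDataUrl(mimeType: string, base64Data: string): string {
  return `data:${mimeType};base64,${base64Data}`;
}

// Loose shape check for data:image/[type];base64,[data].
const IMAGE_DATA_URL =
  /^data:image\/[a-z0-9.+-]+;base64,[A-Za-z0-9+/]+={0,2}$/i;

// Returns a list of problems; an empty list means the input looks acceptable.
function checkEvaluationInput(input: EvaluationInput): string[] {
  const problems: string[] = [];
  if (input.response === undefined && input.response_image === undefined) {
    problems.push("Provide at least one of response or response_image.");
  }
  for (const field of ["prompt_image", "response_image"] as const) {
    const value = input[field];
    if (value !== undefined && !IMAGE_DATA_URL.test(value)) {
      problems.push(`${field} is not a base64 image data URL.`);
    }
  }
  return problems;
}
```

In Node.js, the base64 payload would typically come from something like `Buffer.from(bytes).toString("base64")` before being passed to `toImageDataUrl`.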


// Text-only evaluation
const textEvaluation = await mandoline.createEvaluation({
  metricId: "550e8400-e29b-41d4-a716-446655440000",
  prompt: "Explain quantum computing",
  response: "Quantum computing uses quantum mechanics...",
  properties: { model: "my-llm-model-v1" },
});

// Image-based evaluation
const imageEvaluation = await mandoline.createEvaluation({
  metricId: "550e8400-e29b-41d4-a716-446655440000",
  prompt: "Describe this image",
  prompt_image: "...",
  response: "The image shows a sunset over mountains",
  properties: { model: "my-vision-model-v1" },
});

Note: This is a compute-heavy operation and is therefore rate limited to 3 requests / second. If you exceed this limit, you'll receive a RateLimitExceeded error.
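
One way to stay under the limit described above is to space requests on the client side. The following is a minimal sketch, not an SDK feature: `RequestSpacer` is a hypothetical class that takes the current time as an argument and returns how long the caller should wait before sending the next request. It only accounts for calls made through one instance, so a RateLimitExceeded error is still possible if other processes share the same API key.

```typescript
// Hypothetical helper (not part of the Mandoline SDK): hands out send slots
// that are at least 1000 / requestsPerSecond milliseconds apart.
class RequestSpacer {
  private readonly minIntervalMs: number;
  private nextAllowedAt = 0;

  constructor(requestsPerSecond: number) {
    // For 3 requests per second this rounds up to 334 ms between requests.
    this.minIntervalMs = Math.ceil(1000 / requestsPerSecond);
  }

  // Reserves the next slot and returns the delay (in ms) to wait from `now`.
  reserve(now: number): number {
    const startAt = Math.max(now, this.nextAllowedAt);
    this.nextAllowedAt = startAt + this.minIntervalMs;
    return startAt - now;
  }
}

// Possible usage before each createEvaluation call:
//   const spacer = new RequestSpacer(3);
//   const delayMs = spacer.reserve(Date.now());
//   await new Promise((resolve) => setTimeout(resolve, delayMs));
//   await mandoline.createEvaluation({ ... });
```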

Get an Evaluation

Fetches details of a specific evaluation.

async getEvaluation(evaluationId: UUID): Promise<Evaluation>


  • evaluationId: UUID (required)

Returns: Promise<Evaluation>


const evaluation = await mandoline.getEvaluation(
  "550e8400-e29b-41d4-a716-446655440000",
);

List Evaluations

Fetches a list of evaluations with optional filtering.

async getEvaluations(options?: {
  skip?: number;
  limit?: number;
  metricId?: UUID;
  properties?: NullableSerializableDict;
  filters?: SerializableDict;
}): Promise<Evaluation[]>


  • options: (optional)
    • skip: number (optional, default: 0)
    • limit: number (optional, default: 100, max: 1000)
    • metricId: UUID (optional)
    • properties: NullableSerializableDict (optional)
    • filters: SerializableDict (optional)

Returns: Promise<Evaluation[]>


const evaluations = await mandoline.getEvaluations({
  skip: 0,
  limit: 50,
  metricId: "550e8400-e29b-41d4-a716-446655440000",
  properties: { model: "my-llm-model-v1" },
});

Update an Evaluation

Modifies an existing evaluation's properties.

async updateEvaluation(evaluationId: UUID, update: EvaluationUpdate): Promise<Evaluation>


  • evaluationId: UUID (required)
  • update: EvaluationUpdate object
    • properties: NullableSerializableDict (optional)

Returns: Promise<Evaluation>


const updatedEvaluation = await mandoline.updateEvaluation(
  "550e8400-e29b-41d4-a716-446655440000",
  {
    properties: { reviewed: true },
  },
);

Delete an Evaluation

Removes an evaluation permanently.

async deleteEvaluation(evaluationId: UUID): Promise<void>


  • evaluationId: UUID (required)

Returns: Promise<void>


await mandoline.deleteEvaluation("550e8400-e29b-41d4-a716-446655440000");

Evaluate Multiple Metrics

Performs evaluations across multiple metrics for a given prompt-response pair. Supports both text and image inputs.

async evaluate(
  metrics: Metric[],
  prompt: string,
  prompt_image?: string,
  response?: string,
  response_image?: string,
  properties?: NullableSerializableDict,
): Promise<Evaluation[]>


  • metrics: Metric[] (required) - An array of metrics to evaluate against
  • prompt: string (required) - The prompt to evaluate
  • response: string (optional) - The response to evaluate
  • properties: NullableSerializableDict (optional) - Additional properties to include with the evaluations
  • prompt_image: string (optional) - Base64 encoded image with data URL format
  • response_image: string (optional) - Base64 encoded image with data URL format

Note: At least one of response or response_image must be provided. Images should be base64 encoded with data URL format (e.g. data:image/[type];base64,[data]).

Returns: Promise<Evaluation[]>


const metrics = await mandoline.getMetrics({ tags: ["depth"] });
const evaluations = await mandoline.evaluate(
  metrics,
  "Explain the theory of relativity",
  "The theory of relativity, proposed by Albert Einstein...",
  { model: "my-llm-model-v1" },
);
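
Because each returned evaluation carries a metricId and a score, the results can be joined back to the metrics that produced them. The following is a minimal sketch; `scoresByMetricName`, `MetricLike`, and `EvaluationLike` are hypothetical names, and the sketch assumes at most one evaluation per metric (a later evaluation for the same metric overwrites an earlier one).

```typescript
// Minimal structural types carrying only the fields used below; the field
// names follow the Metric and Evaluation models in this reference.
interface MetricLike {
  id: string;
  name: string;
}

interface EvaluationLike {
  metricId: string;
  score: number;
}

// Hypothetical helper: maps each evaluation's score to its metric's name,
// falling back to the raw metricId when the metric is not in the list.
function scoresByMetricName(
  metrics: MetricLike[],
  evaluations: EvaluationLike[],
): Record<string, number> {
  const namesById = new Map<string, string>();
  for (const metric of metrics) {
    namesById.set(metric.id, metric.name);
  }
  const scores: Record<string, number> = {};
  for (const evaluation of evaluations) {
    const name = namesById.get(evaluation.metricId) ?? evaluation.metricId;
    scores[name] = evaluation.score;
  }
  return scores;
}
```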

Advanced Concepts


Pagination

Mandoline uses offset-based pagination for listing metrics and evaluations:

  • skip: Number of items to skip before returning results.
  • limit: Maximum number of items to return in a single request.


// Get first 50 metrics
const firstPage = await mandoline.getMetrics({ limit: 50 });
// Get next 50 metrics
const secondPage = await mandoline.getMetrics({ skip: 50, limit: 50 });

Because limit is capped at 1000, retrieving more than 1000 items requires multiple requests.
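
Collecting every item therefore means looping over pages. The following is a minimal sketch, not an SDK feature: `fetchAll` and `PageOptions` are hypothetical names, and the loop assumes that a page shorter than the requested limit means there are no more items, which this reference does not state explicitly.

```typescript
// Mirrors the skip/limit options accepted by getMetrics and getEvaluations.
type PageOptions = { skip?: number; limit?: number };

// Hypothetical helper: requests pages until one comes back short.
async function fetchAll<T>(
  fetchPage: (options: PageOptions) => Promise<T[]>,
  pageSize = 100,
): Promise<T[]> {
  if (pageSize < 1 || pageSize > 1000) {
    throw new RangeError("pageSize must be between 1 and 1000");
  }
  const all: T[] = [];
  for (let skip = 0; ; skip += pageSize) {
    const page = await fetchPage({ skip, limit: pageSize });
    all.push(...page);
    // Assumption: a short (or empty) page signals the end of the results.
    if (page.length < pageSize) break;
  }
  return all;
}

// Possible usage with the client from the Setup section:
//   const allMetrics = await fetchAll((options) => mandoline.getMetrics(options), 1000);
```

Passing a callback keeps the loop independent of which list method is paged, so the same helper could wrap getEvaluations with extra filters merged into the options.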
