
Mandoline API Reference

Table of Contents

  1. Authentication
  2. Installation
  3. Setup
  4. Data Models
  5. Metrics
  6. Evaluations
  7. Advanced Concepts

Authentication

To use the Mandoline API:

  1. Sign up for a Mandoline account.
  2. Get your API key from the account page.

Installation

To install the Mandoline Node.js SDK:

npm install mandoline

Setup

Initialize the Mandoline client with your API key:

import { Mandoline } from "mandoline";
 
const mandoline = new Mandoline({ apiKey: "your-api-key" });

Or use an environment variable:

// Set MANDOLINE_API_KEY in your environment
const mandoline = new Mandoline();
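
If you rely on the environment variable, it can help to fail fast when it is missing. The sketch below is not part of the SDK: resolveApiKey is a hypothetical helper, and how the client itself reads MANDOLINE_API_KEY is not described in this reference.

```typescript
// Hypothetical helper (not part of the Mandoline SDK): use the explicit key
// if one is given, otherwise fall back to MANDOLINE_API_KEY from the env map.
type Env = Record<string, string | undefined>;

function resolveApiKey(explicitKey: string | undefined, env: Env): string {
  const key = (explicitKey ?? env["MANDOLINE_API_KEY"] ?? "").trim();
  if (key === "") {
    throw new Error("No API key: pass apiKey or set MANDOLINE_API_KEY");
  }
  return key;
}

// Usage (assumes a Node.js runtime):
//   const mandoline = new Mandoline({ apiKey: resolveApiKey(undefined, process.env) });
```

Passing the environment in as an argument keeps the helper easy to test without touching real process state.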

Data Models

Here are the main data models used in Mandoline:

type UUID = string;
 
type SerializableDict = { [key: string]: any };
type NullableSerializableDict = SerializableDict | null;
 
type StringArray = ReadonlyArray<string>;
type NullableStringArray = StringArray | null;
 
interface Metric {
  id: UUID;
  createdAt: string;
  updatedAt: string;
  name: string;
  description: string;
  tags?: NullableStringArray;
}
 
interface MetricCreate {
  name: string;
  description: string;
  tags?: NullableStringArray;
}
 
interface MetricUpdate {
  name?: string;
  description?: string;
  tags?: NullableStringArray;
}
 
interface Evaluation {
  id: UUID;
  createdAt: string;
  updatedAt: string;
  metricId: UUID;
  prompt: string;
  prompt_image?: string;
  response?: string;
  response_image?: string;
  properties?: NullableSerializableDict;
  score: number;
}
 
interface EvaluationCreate {
  metricId: UUID;
  prompt: string;
  prompt_image?: string;
  response?: string;
  response_image?: string;
  properties?: NullableSerializableDict;
}
 
interface EvaluationUpdate {
  properties?: NullableSerializableDict;
}

Metrics

Metrics are used to evaluate specific aspects of LLM performance. To learn more about metrics, see our Core Concepts guide.

Create a Metric

Creates a new evaluation metric.

async createMetric(metric: MetricCreate): Promise<Metric>

Parameters:

  • metric: MetricCreate object
    • name: string (required)
    • description: string (required)
    • tags: NullableStringArray (optional)

Returns: Promise<Metric>

Example:

const newMetric = await mandoline.createMetric({
  name: "Response Clarity",
  description: "Measures how clear and understandable the LLM's response is",
  tags: ["clarity", "communication"],
});

Get a Metric

Fetches a specific metric by its unique identifier.

async getMetric(metricId: UUID): Promise<Metric>

Parameters:

  • metricId: UUID (required)

Returns: Promise<Metric>

Example:

const metric = await mandoline.getMetric(
  "550e8400-e29b-41d4-a716-446655440000",
);

List Metrics

Fetches a list of metrics with optional filtering.

async getMetrics(options?: {
  skip?: number;
  limit?: number;
  tags?: NullableStringArray;
  filters?: SerializableDict;
}): Promise<Metric[]>

Parameters:

  • options: (optional)
    • skip: number (optional, default: 0)
    • limit: number (optional, default: 100, max: 1000)
    • tags: NullableStringArray (optional)
    • filters: SerializableDict (optional)

Returns: Promise<Metric[]>

Example:

const metrics = await mandoline.getMetrics({
  skip: 0,
  limit: 50,
  tags: ["clarity", "communication"],
});

Update a Metric

Modifies an existing metric's attributes.

async updateMetric(metricId: UUID, update: MetricUpdate): Promise<Metric>

Parameters:

  • metricId: UUID (required)
  • update: MetricUpdate object
    • name: string (optional)
    • description: string (optional)
    • tags: NullableStringArray (optional)

Returns: Promise<Metric>

Example:

const updatedMetric = await mandoline.updateMetric(
  "550e8400-e29b-41d4-a716-446655440000",
  {
    description: "Updated description for the metric",
    // Fields not included will not be updated
  },
);
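
The comment in the example above says that omitted fields are left unchanged. As a purely local illustration of that partial-update behavior, here is a small sketch. The merge presumably happens on the server; applyMetricUpdate and the trimmed-down types below are hypothetical, and the treatment of an explicit null for tags is an assumption.

```typescript
// Trimmed-down local copies of the documented shapes, for illustration only.
interface MetricFields {
  name: string;
  description: string;
  tags?: ReadonlyArray<string> | null;
}
type MetricPatch = Partial<MetricFields>;

// Hypothetical local model of a partial update: keys that are absent from
// (or undefined in) the patch keep their current value; keys that are
// present overwrite it. Assumption: an explicit null for tags overwrites.
function applyMetricUpdate(current: MetricFields, patch: MetricPatch): MetricFields {
  const next: MetricFields = { ...current };
  if (patch.name !== undefined) next.name = patch.name;
  if (patch.description !== undefined) next.description = patch.description;
  if (patch.tags !== undefined) next.tags = patch.tags;
  return next;
}
```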

Delete a Metric

Removes a metric permanently.

async deleteMetric(metricId: UUID): Promise<void>

Parameters:

  • metricId: UUID (required)

Returns: Promise<void>

Example:

await mandoline.deleteMetric("550e8400-e29b-41d4-a716-446655440000");

Evaluations

Evaluations in Mandoline apply metrics to specific LLM interactions. To learn more about evaluations, see our Core Concepts guide.

Create an Evaluation

Performs an evaluation for a single metric on a prompt-response pair. Supports both text and image inputs.

async createEvaluation(evaluation: EvaluationCreate): Promise<Evaluation>

Parameters:

  • evaluation: EvaluationCreate object
    • metricId: UUID (required)
    • prompt: string (required)
    • prompt_image: string (optional)
    • response: string (optional)
    • response_image: string (optional)
    • properties: NullableSerializableDict (optional)

Returns: Promise<Evaluation>

Note: At least one of response or response_image must be provided. Images should be base64 encoded with data URL format (e.g. data:image/[type];base64,[data]).

Example:

// Text-only evaluation
const textEvaluation = await mandoline.createEvaluation({
  metricId: "550e8400-e29b-41d4-a716-446655440000",
  prompt: "Explain quantum computing",
  response: "Quantum computing uses quantum mechanics...",
  properties: { model: "my-llm-model-v1" },
});
 
// Image-based evaluation
const imageEvaluation = await mandoline.createEvaluation({
  metricId: "550e8400-e29b-41d4-a716-446655440000",
  prompt: "Describe this image",
  prompt_image: "...",
  response: "The image shows a sunset over mountains",
  properties: { model: "my-vision-model-v1" },
});
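
The prompt_image and response_image fields expect a data URL. A minimal sketch of building one from raw bytes in Node.js follows. toImageDataUrl is a hypothetical helper, not an SDK function, and it assumes the global Buffer is available.

```typescript
// Hypothetical helper (not part of the SDK): encode raw image bytes as a
// data URL of the form data:image/[type];base64,[data].
function toImageDataUrl(bytes: Uint8Array, mimeType: string): string {
  if (!mimeType.startsWith("image/")) {
    throw new Error(`Expected an image MIME type, got "${mimeType}"`);
  }
  const base64 = Buffer.from(bytes).toString("base64");
  return `data:${mimeType};base64,${base64}`;
}

// Usage (the file path is illustrative):
//   import { readFileSync } from "node:fs";
//   const promptImage = toImageDataUrl(readFileSync("photo.png"), "image/png");
```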

Note: This is a compute-heavy operation, so it is rate limited to 3 requests per second. If you exceed this limit, you'll receive a RateLimitExceeded error.
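
One way to stay under the limit is to space calls out on the client. The sketch below is an assumption-laden example, not SDK functionality: SlotScheduler and throttle are hypothetical names, and the wrapper only spaces out call start times. It does not retry after a RateLimitExceeded error.

```typescript
// Reserves evenly spaced start times so calls never exceed a fixed rate.
class SlotScheduler {
  private nextFreeAt = 0;
  private readonly minIntervalMs: number;

  constructor(minIntervalMs: number) {
    this.minIntervalMs = minIntervalMs;
  }

  // Returns how many ms the caller should wait before starting,
  // and reserves the following slot for the next caller.
  reserve(nowMs: number): number {
    const startAt = Math.max(nowMs, this.nextFreeAt);
    this.nextFreeAt = startAt + this.minIntervalMs;
    return startAt - nowMs;
  }
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Wraps an async function so that call starts are spaced to the given rate.
function throttle<A extends unknown[], R>(
  fn: (...args: A) => Promise<R>,
  requestsPerSecond: number,
): (...args: A) => Promise<R> {
  const scheduler = new SlotScheduler(Math.ceil(1000 / requestsPerSecond));
  return async (...args: A) => {
    const waitMs = scheduler.reserve(Date.now());
    if (waitMs > 0) await sleep(waitMs);
    return fn(...args);
  };
}

// Usage (assumes an initialized client named mandoline):
//   const createEvaluation = throttle(
//     (e: EvaluationCreate) => mandoline.createEvaluation(e),
//     3,
//   );
```

Separating the slot arithmetic from the timer keeps the scheduling logic deterministic and easy to check.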

Get an Evaluation

Fetches details of a specific evaluation.

async getEvaluation(evaluationId: UUID): Promise<Evaluation>

Parameters:

  • evaluationId: UUID (required)

Returns: Promise<Evaluation>

Example:

const evaluation = await mandoline.getEvaluation(
  "550e8400-e29b-41d4-a716-446655440000",
);

List Evaluations

Fetches a list of evaluations with optional filtering.

async getEvaluations(options?: {
  skip?: number;
  limit?: number;
  metricId?: UUID;
  properties?: NullableSerializableDict;
  filters?: SerializableDict;
}): Promise<Evaluation[]>

Parameters:

  • options: (optional)
    • skip: number (optional, default: 0)
    • limit: number (optional, default: 100, max: 1000)
    • metricId: UUID (optional)
    • properties: NullableSerializableDict (optional)
    • filters: SerializableDict (optional)

Returns: Promise<Evaluation[]>

Example:

const evaluations = await mandoline.getEvaluations({
  skip: 0,
  limit: 50,
  metricId: "550e8400-e29b-41d4-a716-446655440000",
  properties: { model: "my-llm-model-v1" },
});

Update an Evaluation

Modifies an existing evaluation's properties.

async updateEvaluation(evaluationId: UUID, update: EvaluationUpdate): Promise<Evaluation>

Parameters:

  • evaluationId: UUID (required)
  • update: EvaluationUpdate object
    • properties: NullableSerializableDict (optional)

Returns: Promise<Evaluation>

Example:

const updatedEvaluation = await mandoline.updateEvaluation(
  "550e8400-e29b-41d4-a716-446655440000",
  {
    properties: { reviewed: true },
  },
);

Delete an Evaluation

Removes an evaluation permanently.

async deleteEvaluation(evaluationId: UUID): Promise<void>

Parameters:

  • evaluationId: UUID (required)

Returns: Promise<void>

Example:

await mandoline.deleteEvaluation("550e8400-e29b-41d4-a716-446655440000");

Evaluate Multiple Metrics

Performs evaluations across multiple metrics for a given prompt-response pair. Supports both text and image inputs.

async evaluate(
  metrics: Metric[],
  prompt: string,
  prompt_image?: string,
  response?: string,
  response_image?: string,
  properties?: NullableSerializableDict,
): Promise<Evaluation[]>

Parameters:

  • metrics: Metric[] (required) - An array of metrics to evaluate against
  • prompt: string (required) - The prompt to evaluate
  • response: string (optional) - The response to evaluate
  • properties: NullableSerializableDict (optional) - Additional properties to include with the evaluations
  • prompt_image: string (optional) - Base64 encoded image with data URL format
  • response_image: string (optional) - Base64 encoded image with data URL format

Note: At least one of response or response_image must be provided. Images should be base64 encoded with data URL format (e.g. data:image/[type];base64,[data]).

Returns: Promise<Evaluation[]>

Example:

const metrics = await mandoline.getMetrics({ tags: ["depth"] });
const evaluations = await mandoline.evaluate(
  metrics,
  "Explain the theory of relativity",
  "The theory of relativity, proposed by Albert Einstein...",
  { model: "my-llm-model-v1" },
);
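
Because at least one of response or response_image is required, a client-side check before sending a request can catch mistakes early. This is a hypothetical pre-flight sketch, not part of the SDK; validateResponseFields and the regular expression are assumptions based only on the data URL format described in the note above.

```typescript
// Hypothetical pre-flight check (not part of the SDK) for the documented rule:
// at least one of response or response_image must be provided.
interface ResponseFields {
  response?: string;
  response_image?: string;
}

// Matches the documented shape data:image/[type];base64,[data] (prefix only).
const DATA_URL_PREFIX = /^data:image\/[a-zA-Z0-9.+-]+;base64,/;

// Returns a list of human-readable problems; an empty list means the input passes.
function validateResponseFields(input: ResponseFields): string[] {
  const problems: string[] = [];
  if (input.response === undefined && input.response_image === undefined) {
    problems.push("Provide at least one of response or response_image");
  }
  if (
    input.response_image !== undefined &&
    !DATA_URL_PREFIX.test(input.response_image)
  ) {
    problems.push("response_image should be a data:image/[type];base64,[data] URL");
  }
  return problems;
}
```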

Advanced Concepts

Pagination

Mandoline uses offset-based pagination for listing metrics and evaluations:

  • skip: Number of items to skip before returning results.
  • limit: Maximum number of items to return in a single request.

Example:

// Get first 50 metrics
const firstPage = await mandoline.getMetrics({ limit: 50 });
 
// Get next 50 metrics
const secondPage = await mandoline.getMetrics({ skip: 50, limit: 50 });

Because limit is capped at 1000, retrieving more than 1000 items requires multiple requests with increasing skip values.
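
A loop over skip and limit can collect every page. The helper below is a sketch under assumptions: fetchAll and PageFetcher are hypothetical names, and the loop treats a page shorter than the requested size as the last one.

```typescript
// Any function that accepts skip/limit and resolves to one page of items.
type PageFetcher<T> = (options: { skip: number; limit: number }) => Promise<T[]>;

// Hypothetical helper (not part of the SDK): keep requesting pages until a
// page comes back shorter than pageSize, then return everything collected.
async function fetchAll<T>(
  fetchPage: PageFetcher<T>,
  pageSize = 100,
): Promise<T[]> {
  const all: T[] = [];
  for (let skip = 0; ; skip += pageSize) {
    const page = await fetchPage({ skip, limit: pageSize });
    all.push(...page);
    if (page.length < pageSize) break; // short (or empty) page: nothing left
  }
  return all;
}

// Usage (assumes an initialized client named mandoline):
//   const allMetrics = await fetchAll((o) => mandoline.getMetrics(o), 1000);
```

If the total happens to be an exact multiple of the page size, the loop makes one extra request that returns an empty page before stopping.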
