Mandoline API Reference
Authentication
To use the Mandoline API:
- Sign up for a Mandoline account.
- Get your API key from the account page.
Installation
To install the Mandoline Node.js SDK:
npm install mandoline
Setup
Initialize the Mandoline client with your API key:
import { Mandoline } from "mandoline";
const mandoline = new Mandoline({ apiKey: "your-api-key" });
Or use an environment variable:
// Set MANDOLINE_API_KEY in your environment
const mandoline = new Mandoline();
Data Models
Here are the main data models used in Mandoline:
type UUID = string;
type SerializableDict = { [key: string]: any };
type NullableSerializableDict = SerializableDict | null;
type StringArray = ReadonlyArray<string>;
type NullableStringArray = StringArray | null;
interface Metric {
  id: UUID;
  createdAt: string;
  updatedAt: string;
  name: string;
  description: string;
  tags?: NullableStringArray;
}
interface MetricCreate {
  name: string;
  description: string;
  tags?: NullableStringArray;
}
interface MetricUpdate {
  name?: string;
  description?: string;
  tags?: NullableStringArray;
}
interface Evaluation {
  id: UUID;
  createdAt: string;
  updatedAt: string;
  metricId: UUID;
  prompt: string;
  prompt_image?: string;
  response?: string;
  response_image?: string;
  properties?: NullableSerializableDict;
  score: number;
}
interface EvaluationCreate {
  metricId: UUID;
  prompt: string;
  prompt_image?: string;
  response?: string;
  response_image?: string;
  properties?: NullableSerializableDict;
}
interface EvaluationUpdate {
  properties?: NullableSerializableDict;
}
Metrics
Metrics are used to evaluate specific aspects of LLM performance. To learn more about metrics, see our Core Concepts guide.
Create a Metric
Creates a new evaluation metric.
async createMetric(metric: MetricCreate): Promise<Metric>
Parameters:
- metric: MetricCreate object
  - name: string (required)
  - description: string (required)
  - tags: NullableStringArray (optional)
Returns: Promise<Metric>
Example:
const newMetric = await mandoline.createMetric({
  name: "Response Clarity",
  description: "Measures how clear and understandable the LLM's response is",
  tags: ["clarity", "communication"],
});
Get a Metric
Fetches a specific metric by its unique identifier.
async getMetric(metricId: UUID): Promise<Metric>
Parameters:
- metricId: UUID (required)
Returns: Promise<Metric>
Example:
const metric = await mandoline.getMetric(
  "550e8400-e29b-41d4-a716-446655440000",
);
List Metrics
Fetches a list of metrics with optional filtering.
async getMetrics(options?: {
  skip?: number;
  limit?: number;
  tags?: NullableStringArray;
  filters?: SerializableDict;
}): Promise<Metric[]>
Parameters:
- options: (optional)
  - skip: number (optional, default: 0)
  - limit: number (optional, default: 100, max: 1000)
  - tags: NullableStringArray (optional)
  - filters: SerializableDict (optional)
Returns: Promise<Metric[]>
Example:
const metrics = await mandoline.getMetrics({
  skip: 0,
  limit: 50,
  tags: ["clarity", "communication"],
});
Update a Metric
Modifies an existing metric's attributes.
async updateMetric(metricId: UUID, update: MetricUpdate): Promise<Metric>
Parameters:
- metricId: UUID (required)
- update: MetricUpdate object
  - name: string (optional)
  - description: string (optional)
  - tags: NullableStringArray (optional)
Returns: Promise<Metric>
Example:
const updatedMetric = await mandoline.updateMetric(
  "550e8400-e29b-41d4-a716-446655440000",
  {
    description: "Updated description for the metric",
    // Fields not included will not be updated
  },
);
Delete a Metric
Removes a metric permanently.
async deleteMetric(metricId: UUID): Promise<void>
Parameters:
- metricId: UUID (required)
Returns: Promise<void>
Example:
await mandoline.deleteMetric("550e8400-e29b-41d4-a716-446655440000");
Evaluations
Evaluations in Mandoline apply metrics to specific LLM interactions. To learn more about evaluations, see our Core Concepts guide.
Create an Evaluation
Performs an evaluation for a single metric on a prompt-response pair. Supports both text and image inputs.
async createEvaluation(evaluation: EvaluationCreate): Promise<Evaluation>
Parameters:
- evaluation: EvaluationCreate object
  - metricId: UUID (required)
  - prompt: string (required)
  - prompt_image: string (optional)
  - response: string (optional)
  - response_image: string (optional)
  - properties: NullableSerializableDict (optional)
Returns: Promise<Evaluation>
Note: At least one of response or response_image must be provided. Images should be base64 encoded with data URL format (e.g. data:image/[type];base64,[data]).
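As a minimal sketch of the data URL format described in the note, the helper below base64-encodes raw image bytes and prepends the data:image/[type];base64, prefix. The name toImageDataUrl and its parameters are hypothetical and not part of the Mandoline SDK; the sketch relies only on the standard btoa global available in browsers and recent Node.js versions.

```typescript
// Hypothetical helper (not part of the Mandoline SDK): builds a data URL
// of the form data:image/[type];base64,[data] from raw image bytes.
function toImageDataUrl(bytes: Uint8Array, imageType: string): string {
  // btoa expects a "binary string" in which each character holds one byte.
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return `data:image/${imageType};base64,${btoa(binary)}`;
}

// Two arbitrary bytes standing in for real image data.
const exampleUrl = toImageDataUrl(new Uint8Array([104, 105]), "png");
console.log(exampleUrl);
```

In Node.js the bytes could come from fs.promises.readFile, which returns a Buffer (a Uint8Array subclass); the resulting string could then be passed as prompt_image or response_image.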
Example:
// Text-only evaluation
const textEvaluation = await mandoline.createEvaluation({
  metricId: "550e8400-e29b-41d4-a716-446655440000",
  prompt: "Explain quantum computing",
  response: "Quantum computing uses quantum mechanics...",
  properties: { model: "my-llm-model-v1" },
});
// Image-based evaluation
const imageEvaluation = await mandoline.createEvaluation({
  metricId: "550e8400-e29b-41d4-a716-446655440000",
  prompt: "Describe this image",
  prompt_image: "data:image/jpeg;base64,/9j/4AAQSkZJRg...",
  response: "The image shows a sunset over mountains",
  properties: { model: "my-vision-model-v1" },
});
Note: Creating an evaluation is compute-heavy, so it is rate limited to 3 requests per second. If you exceed this limit, you'll receive a RateLimitExceeded error.
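One way to stay under this limit is to space calls out on the client side. The following is a minimal sketch, not part of the Mandoline SDK: createThrottle is a hypothetical helper that enforces a minimum gap between task starts (about 334 ms for 3 requests per second). The now and sleep parameters are injectable only so the helper can be tested without real delays.

```typescript
type Sleep = (ms: number) => Promise<void>;

// Hypothetical helper (not part of the Mandoline SDK): returns a wrapper
// that starts at most `maxPerSecond` tasks per second by spacing them evenly.
function createThrottle(
  maxPerSecond: number,
  now: () => number = () => Date.now(),
  sleep: Sleep = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
) {
  const minGapMs = 1000 / maxPerSecond;
  let nextSlot = 0; // earliest time (ms) at which the next task may start

  return async function throttled<T>(task: () => Promise<T>): Promise<T> {
    const current = now();
    const startAt = Math.max(current, nextSlot);
    // Reserve the slot before waiting so concurrent callers queue up in order.
    nextSlot = startAt + minGapMs;
    const waitMs = startAt - current;
    if (waitMs > 0) {
      await sleep(waitMs);
    }
    return task();
  };
}
```

A call could then be wrapped as limited(() => mandoline.createEvaluation({ ... })), where limited is the function returned by createThrottle(3). This only smooths traffic from a single process; a RateLimitExceeded error returned by the API would still need to be handled.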
Get an Evaluation
Fetches details of a specific evaluation.
async getEvaluation(evaluationId: UUID): Promise<Evaluation>
Parameters:
- evaluationId: UUID (required)
Returns: Promise<Evaluation>
Example:
const evaluation = await mandoline.getEvaluation(
  "550e8400-e29b-41d4-a716-446655440000",
);
List Evaluations
Fetches a list of evaluations with optional filtering.
async getEvaluations(options?: {
  skip?: number;
  limit?: number;
  metricId?: UUID;
  properties?: NullableSerializableDict;
  filters?: SerializableDict;
}): Promise<Evaluation[]>
Parameters:
- options: (optional)
  - skip: number (optional, default: 0)
  - limit: number (optional, default: 100, max: 1000)
  - metricId: UUID (optional)
  - properties: NullableSerializableDict (optional)
  - filters: SerializableDict (optional)
Returns: Promise<Evaluation[]>
Example:
const evaluations = await mandoline.getEvaluations({
  skip: 0,
  limit: 50,
  metricId: "550e8400-e29b-41d4-a716-446655440000",
  properties: { model: "my-llm-model-v1" },
});
Update an Evaluation
Modifies an existing evaluation's properties.
async updateEvaluation(evaluationId: UUID, update: EvaluationUpdate): Promise<Evaluation>
Parameters:
- evaluationId: UUID (required)
- update: EvaluationUpdate object
  - properties: NullableSerializableDict (optional)
Returns: Promise<Evaluation>
Example:
const updatedEvaluation = await mandoline.updateEvaluation(
  "550e8400-e29b-41d4-a716-446655440000",
  {
    properties: { reviewed: true },
  },
);
Delete an Evaluation
Removes an evaluation permanently.
async deleteEvaluation(evaluationId: UUID): Promise<void>
Parameters:
- evaluationId: UUID (required)
Returns: Promise<void>
Example:
await mandoline.deleteEvaluation("550e8400-e29b-41d4-a716-446655440000");
Evaluate Multiple Metrics
Performs evaluations across multiple metrics for a given prompt-response pair. Supports both text and image inputs.
async evaluate(
  metrics: Metric[],
  prompt: string,
  response?: string,
  properties?: NullableSerializableDict,
  prompt_image?: string,
  response_image?: string,
): Promise<Evaluation[]>
Parameters:
- metrics: Metric[] (required) - An array of metrics to evaluate against
- prompt: string (required) - The prompt to evaluate
- response: string (optional) - The response to evaluate
- properties: NullableSerializableDict (optional) - Additional properties to include with the evaluations
- prompt_image: string (optional) - Base64 encoded image with data URL format
- response_image: string (optional) - Base64 encoded image with data URL format
Note: At least one of response or response_image must be provided. Images should be base64 encoded with data URL format (e.g. data:image/[type];base64,[data]).
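The two constraints in this note can be checked on the client before a request is sent. The sketch below is an illustration only: validateEvaluationInput, EvaluationInput, and the DATA_URL_PATTERN regular expression are hypothetical names, and the exact validation the Mandoline service performs may differ.

```typescript
// Hypothetical input shape mirroring the prompt/response fields documented above.
interface EvaluationInput {
  prompt: string;
  prompt_image?: string;
  response?: string;
  response_image?: string;
}

// Assumed pattern for data:image/[type];base64,[data]; the service may be
// stricter or more lenient than this.
const DATA_URL_PATTERN = /^data:image\/[a-z0-9.+-]+;base64,[A-Za-z0-9+/]+={0,2}$/i;

// Returns a list of human-readable problems; an empty list means the input
// satisfies the constraints stated in the note.
function validateEvaluationInput(input: EvaluationInput): string[] {
  const problems: string[] = [];
  if (input.response === undefined && input.response_image === undefined) {
    problems.push("Provide at least one of response or response_image.");
  }
  const imageFields = ["prompt_image", "response_image"] as const;
  for (const field of imageFields) {
    const value = input[field];
    if (value !== undefined && !DATA_URL_PATTERN.test(value)) {
      problems.push(`${field} should be a base64 data URL (data:image/[type];base64,[data]).`);
    }
  }
  return problems;
}
```

Returning a list of problems rather than throwing on the first one lets a caller report every issue at once before spending a rate-limited request.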
Returns: Promise<Evaluation[]>
Example:
const metrics = await mandoline.getMetrics({ tags: ["depth"] });
const evaluations = await mandoline.evaluate(
  metrics,
  "Explain the theory of relativity",
  "The theory of relativity, proposed by Albert Einstein...",
  { model: "my-llm-model-v1" },
);
Advanced Concepts
Pagination
Mandoline uses offset-based pagination for listing metrics and evaluations:
- skip: Number of items to skip before returning results.
- limit: Maximum number of items to return in a single request.
Example:
// Get first 50 metrics
const firstPage = await mandoline.getMetrics({ limit: 50 });
// Get next 50 metrics
const secondPage = await mandoline.getMetrics({ skip: 50, limit: 50 });
Because limit is capped at 1000, retrieving more than 1000 items requires multiple requests.
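The multi-request pattern can be sketched as a small loop over skip and limit. This is an illustration, not part of the Mandoline SDK: fetchAll and PageFetcher are hypothetical names, and the loop assumes that a page shorter than the requested limit marks the end of the results.

```typescript
// Any function that returns one page of items for a given skip/limit pair.
type PageFetcher<T> = (options: { skip: number; limit: number }) => Promise<T[]>;

// Hypothetical helper: collects every item by requesting consecutive pages
// until a page comes back shorter than `pageSize`.
async function fetchAll<T>(fetchPage: PageFetcher<T>, pageSize = 100): Promise<T[]> {
  if (pageSize < 1) {
    throw new RangeError("pageSize must be at least 1");
  }
  const items: T[] = [];
  let skip = 0;
  while (true) {
    const page = await fetchPage({ skip, limit: pageSize });
    items.push(...page);
    if (page.length < pageSize) {
      break; // assumed to be the last page
    }
    skip += pageSize;
  }
  return items;
}
```

With the client from the Setup section, this could be called as fetchAll((options) => mandoline.getMetrics(options), 1000). With offset-based pagination, items created or deleted while the loop runs may be skipped or repeated.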