Mandoline API Reference
Table of Contents
Authentication
To use the Mandoline API:
- Sign up (opens in a new tab) for a Mandoline account.
- Get your API key from the account page (opens in a new tab).
Installation
To install the Mandoline Node.js (opens in a new tab) SDK:
npm install mandoline
Setup
Initialize the Mandoline client with your API key:
import { Mandoline } from "mandoline";
const mandoline = new Mandoline({ apiKey: "your-api-key" });
Or use an environment variable:
// Set MANDOLINE_API_KEY in your environment
const mandoline = new Mandoline();
Data Models
Here are the main data models used in Mandoline:
type UUID = string;
type SerializableDict = { [key: string]: any };
type NullableSerializableDict = SerializableDict | null;
type StringArray = ReadonlyArray<string>;
type NullableStringArray = StringArray | null;
interface Metric {
id: UUID;
createdAt: string;
updatedAt: string;
name: string;
description: string;
tags?: NullableStringArray;
}
interface MetricCreate {
name: string;
description: string;
tags?: NullableStringArray;
}
interface MetricUpdate {
name?: string;
description?: string;
tags?: NullableStringArray;
}
interface Evaluation {
id: UUID;
createdAt: string;
updatedAt: string;
metricId: UUID;
prompt: string;
response: string;
properties?: NullableSerializableDict;
score: number;
}
interface EvaluationCreate {
metricId: UUID;
prompt: string;
response: string;
properties?: NullableSerializableDict;
}
interface EvaluationUpdate {
properties?: NullableSerializableDict;
}
Metrics
Metrics are used to evaluate specific aspects of LLM performance. To learn more about metrics, see our Core Concepts guide.
Create a Metric
Creates a new evaluation metric.
async createMetric(metric: MetricCreate): Promise<Metric>
Parameters:
metric
:MetricCreate
objectname
:string
(required)description
:string
(required)tags
:NullableStringArray
(optional)
Returns: Promise<Metric>
Example:
const newMetric = await mandoline.createMetric({
name: "Response Clarity",
description: "Measures how clear and understandable the LLM's response is",
tags: ["clarity", "communication"],
});
Get a Metric
Fetches a specific metric by its unique identifier.
async getMetric(metricId: UUID): Promise<Metric>
Parameters:
metricId
:UUID
(required)
Returns: Promise<Metric>
Example:
const metric = await mandoline.getMetric(
"550e8400-e29b-41d4-a716-446655440000",
);
List Metrics
Fetches a list of metrics with optional filtering.
async getMetrics(options?: {
skip?: number;
limit?: number;
tags?: NullableStringArray;
filters?: SerializableDict;
}): Promise<Metric[]>
Parameters:
options
: (optional)skip
:number
(optional, default: 0)limit
:number
(optional, default: 100, max: 1000)tags
:NullableStringArray
(optional)filters
:SerializableDict
(optional)
Returns: Promise<Metric[]>
Example:
const metrics = await mandoline.getMetrics({
skip: 0,
limit: 50,
tags: ["clarity", "communication"],
});
Update a Metric
Modifies an existing metric's attributes.
async updateMetric(metricId: UUID, update: MetricUpdate): Promise<Metric>
Parameters:
metricId
:UUID
(required)update
:MetricUpdate
objectname
:string
(optional)description
:string
(optional)tags
:NullableStringArray
(optional)
Returns: Promise<Metric>
Example:
const updatedMetric = await mandoline.updateMetric(
"550e8400-e29b-41d4-a716-446655440000",
{
description: "Updated description for the metric",
// Fields not included will not be updated
},
);
Delete a Metric
Removes a metric permanently.
async deleteMetric(metricId: UUID): Promise<void>
Parameters:
metricId
:UUID
(required)
Returns: Promise<void>
Example:
await mandoline.deleteMetric("550e8400-e29b-41d4-a716-446655440000");
Evaluations
Evaluations in Mandoline apply metrics to specific LLM interactions. To learn more about evaluations, see our Core Concepts guide.
Create an Evaluation
Performs an evaluation for a single metric on a prompt-response pair.
async createEvaluation(evaluation: EvaluationCreate): Promise<Evaluation>
Parameters:
evaluation
:EvaluationCreate
objectmetricId
:UUID
(required)prompt
:string
(required)response
:string
(required)properties
:NullableSerializableDict
(optional)
Returns: Promise<Evaluation>
Example:
const newEvaluation = await mandoline.createEvaluation({
metricId: "550e8400-e29b-41d4-a716-446655440000",
prompt: "Explain quantum computing",
response: "Quantum computing uses quantum mechanics...",
properties: { model: "my-llm-model-v1" },
});
Note: This is a compute-heavy operation and is therefore rate limited to 3 requests / second. If you exceed this limit, you'll receive a RateLimitExceeded
error.
Get an Evaluation
Fetches details of a specific evaluation.
async getEvaluation(evaluationId: UUID): Promise<Evaluation>
Parameters:
evaluationId
:UUID
(required)
Returns: Promise<Evaluation>
Example:
const evaluation = await mandoline.getEvaluation(
"550e8400-e29b-41d4-a716-446655440000",
);
List Evaluations
Fetches a list of evaluations with optional filtering.
async getEvaluations(options?: {
skip?: number;
limit?: number;
metricId?: UUID;
properties?: NullableSerializableDict;
filters?: SerializableDict;
}): Promise<Evaluation[]>
Parameters:
options
: (optional)skip
:number
(optional, default: 0)limit
:number
(optional, default: 100, max: 1000)metricId
:UUID
(optional)properties
:NullableSerializableDict
(optional)filters
:SerializableDict
(optional)
Returns: Promise<Evaluation[]>
Example:
const evaluations = await mandoline.getEvaluations({
skip: 0,
limit: 50,
metricId: "550e8400-e29b-41d4-a716-446655440000",
properties: { model: "my-llm-model-v1" },
});
Update an Evaluation
Modifies an existing evaluation's properties.
async updateEvaluation(evaluationId: UUID, update: EvaluationUpdate): Promise<Evaluation>
Parameters:
evaluationId
:UUID
(required)update
:EvaluationUpdate
objectproperties
:NullableSerializableDict
(optional)
Returns: Promise<Evaluation>
Example:
const updatedEvaluation = await mandoline.updateEvaluation(
"550e8400-e29b-41d4-a716-446655440000",
{
properties: { reviewed: true },
},
);
Delete an Evaluation
Removes an evaluation permanently.
async deleteEvaluation(evaluationId: UUID): Promise<void>
Parameters:
evaluationId
:UUID
(required)
Returns: Promise<void>
Example:
await mandoline.deleteEvaluation("550e8400-e29b-41d4-a716-446655440000");
Evaluate Multiple Metrics
Performs evaluations across multiple metrics for a given prompt-response pair.
async evaluate(
metrics: Metric[],
prompt: string,
response: string,
properties?: NullableSerializableDict
): Promise<Evaluation[]>
Parameters:
metrics
:Metric[]
(required) - An array of metrics to evaluate againstprompt
:string
(required) - The prompt to evaluateresponse
:string
(required) - The response to evaluateproperties
:NullableSerializableDict
(optional) - Additional properties to include with the evaluations
Returns: Promise<Evaluation[]>
Example:
const metrics = await mandoline.getMetrics({ tags: ["depth"] });
const evaluations = await mandoline.evaluate(
metrics,
"Explain the theory of relativity",
"The theory of relativity, proposed by Albert Einstein...",
{ model: "my-llm-model-v1" },
);
Advanced Concepts
Pagination
Mandoline uses offset-based pagination for listing metrics and evaluations:
skip
: Number of items to skip before returning results.limit
: Maximum number of items to return in a single request.
Example:
// Get first 50 metrics
const firstPage = await mandoline.getMetrics({ limit: 50 });
// Get next 50 metrics
const secondPage = await mandoline.getMetrics({ skip: 50, limit: 50 });
For queries larger than 1000 items, multiple requests are required.