Dexter is a powerful TypeScript library for working with Large Language Models (LLMs), with a focus on real-world Retrieval-Augmented Generation (RAG) applications. It provides a set of tools and utilities to interact with various AI models, manage caching, handle embeddings, and implement AI functions.
- Comprehensive Model Support: Implementations for Chat, Completion, Embedding, and Sparse Vector models, with efficient OpenAI API integration via openai-fetch.
- Advanced AI Function Utilities: Tools for creating and managing AI functions, including createAIFunction, createAIExtractFunction, and createAIRunner, with Zod integration for schema validation.
- Structured Data Extraction: Dexter supports OpenAI's structured output feature through createExtractFunction, which uses the response_format parameter with a JSON schema derived from a Zod schema.
- Flexible Caching and Tokenization: Built-in caching system with custom cache support, and advanced tokenization based on tiktoken for accurate token management.
- Robust Observability and Control: Customizable telemetry system, comprehensive event hooks, and specialized error handling for enhanced monitoring and control.
- Performance Optimization: Built-in support for batching, throttling, and streaming, optimized for handling large-scale operations and real-time responses.
- TypeScript-First and Environment Flexible: Fully typed for excellent developer experience, with minimal dependencies and compatibility across Node.js 18+, Deno, Cloudflare Workers, and Vercel edge functions.
To install Dexter, use your preferred package manager:
npm install @dexaai/dexter
This package requires Node.js >= 18 or an environment with fetch support.
This package exports ESM. If your project uses CommonJS, consider switching to ESM or using the dynamic import() function.
Here's a basic example of how to use the ChatModel:
import { ChatModel } from '@dexaai/dexter';
const chatModel = new ChatModel({
  params: { model: 'gpt-3.5-turbo' },
});
const response = await chatModel.run({
  messages: [{ role: 'user', content: 'Tell me a short joke' }],
});
console.log(response.message.content);
Here's how to stream a chat completion with handleUpdate:
import { ChatModel, MsgUtil } from '@dexaai/dexter';
const chatModel = new ChatModel({
  params: { model: 'gpt-4' },
});
const response = await chatModel.run({
  messages: [MsgUtil.user('Write a short story about a robot learning to love')],
  handleUpdate: (chunk) => {
    process.stdout.write(chunk);
  },
});
console.log('\n\nFull response:', response.message.content);
Here's how to extract structured data with createExtractFunction:
import { ChatModel, createExtractFunction } from '@dexaai/dexter';
import { z } from 'zod';
const extractPeopleNames = createExtractFunction({
  chatModel: new ChatModel({ params: { model: 'gpt-4o-mini' } }),
  systemMessage: `You extract the names of people from unstructured text.`,
  name: 'people_names',
  schema: z.object({
    names: z.array(
      z.string().describe(
        `The name of a person from the message. Normalize the name by removing suffixes, prefixes, and fixing capitalization`
      )
    ),
  }),
});
const peopleNames = await extractPeopleNames(
  `Dr. Andrew Huberman interviewed Tony Hawk, an idol of Andrew Hubermans.`
);
console.log('peopleNames', peopleNames);
// => { names: ['Andrew Huberman', 'Tony Hawk'] }
Here's how to define an AI function and pass it to a ChatModel as a tool:
import { ChatModel, MsgUtil, createAIFunction } from '@dexaai/dexter';
import { z } from 'zod';
const getWeather = createAIFunction(
  {
    name: 'get_weather',
    description: 'Gets the weather for a given location',
    argsSchema: z.object({
      location: z.string().describe('The city and state e.g. San Francisco, CA'),
      unit: z.enum(['c', 'f']).optional().default('f').describe('The unit of temperature to use'),
    }),
  },
  async ({ location, unit }) => {
    // Simulate API call
    await new Promise((resolve) => setTimeout(resolve, 500));
    return {
      location,
      unit,
      temperature: Math.floor(Math.random() * 30) + 10,
      condition: ['sunny', 'cloudy', 'rainy'][Math.floor(Math.random() * 3)],
    };
  }
);
const chatModel = new ChatModel({
  params: {
    model: 'gpt-4',
    tools: [{ type: 'function', function: getWeather.spec }],
  },
});
const response = await chatModel.run({
  messages: [MsgUtil.user('What\'s the weather like in New York?')],
});
console.log(response.message);
Here's how to generate embeddings with the EmbeddingModel:
import { EmbeddingModel } from '@dexaai/dexter';
const embeddingModel = new EmbeddingModel({
  params: { model: 'text-embedding-ada-002' },
});
const response = await embeddingModel.run({
  input: ['Hello, world!', 'How are you?'],
});
console.log(response.embeddings);
The Dexter library is organized into the following main directories:
- src/: Contains the source code for the library
  - model/: Core model implementations and utilities
  - ai-function/: AI function creation and handling
- examples/: Contains example scripts demonstrating library usage
- dist/: Contains the compiled JavaScript output (generated after build)
Key files:
- src/model/chat.ts: Implementation of the ChatModel
- src/model/completion.ts: Implementation of the CompletionModel
- src/model/embedding.ts: Implementation of the EmbeddingModel
- src/model/sparse-vector.ts: Implementation of the SparseVectorModel
- src/ai-function/ai-function.ts: AI function creation utilities
- src/model/utils/: Various utility functions and helpers
The ChatModel class is used for interacting with chat-based language models.
new ChatModel(args?: ChatModelArgs<CustomCtx>)
- args: Optional configuration object
  - params: Model parameters (e.g., model, temperature)
  - client: Custom OpenAI client (optional)
  - cache: Cache implementation (optional)
  - context: Custom context object (optional)
  - events: Event handlers (optional)
  - debug: Enable debug logging (optional)
- run(params: ChatModelRun, context?: CustomCtx): Promise<ChatModelResponse>
  - Executes the chat model with the given parameters and context
- extend(args?: PartialChatModelArgs<CustomCtx>): ChatModel<CustomCtx>
  - Creates a new instance of the model with modified configuration
 
The CompletionModel class is used for text completion tasks.
new CompletionModel(args?: CompletionModelArgs<CustomCtx>)
- args: Optional configuration object (similar to ChatModel)
- run(params: CompletionModelRun, context?: CustomCtx): Promise<CompletionModelResponse>
  - Executes the completion model with the given parameters and context
- extend(args?: PartialCompletionModelArgs<CustomCtx>): CompletionModel<CustomCtx>
  - Creates a new instance of the model with modified configuration
 
The EmbeddingModel class is used for generating embeddings from text.
new EmbeddingModel(args?: EmbeddingModelArgs<CustomCtx>)
- args: Optional configuration object (similar to ChatModel)
- run(params: EmbeddingModelRun, context?: CustomCtx): Promise<EmbeddingModelResponse>
  - Generates embeddings for the given input texts
- extend(args?: PartialEmbeddingModelArgs<CustomCtx>): EmbeddingModel<CustomCtx>
  - Creates a new instance of the model with modified configuration
 
The SparseVectorModel class is used for generating sparse vector representations.
new SparseVectorModel(args: SparseVectorModelArgs<CustomCtx>)
- args: Configuration object
  - serviceUrl: URL of the SPLADE service (required)
  - Other options similar to ChatModel
- run(params: SparseVectorModelRun, context?: CustomCtx): Promise<SparseVectorModelResponse>
  - Generates sparse vector representations for the given input texts
- extend(args?: PartialSparseVectorModelArgs<CustomCtx>): SparseVectorModel<CustomCtx>
  - Creates a new instance of the model with modified configuration
 
Creates a function to extract structured data from text using OpenAI's structured output feature.
This is a better way to extract structured data than using the legacy createAIExtractFunction function.
createExtractFunction<Schema extends z.ZodObject<any>>(args: {
  chatModel: Model.Chat.Model;
  name: string;
  schema: Schema;
  systemMessage: string;
}): (input: string | Msg) => Promise<z.infer<Schema>>
Creates a function meant to be used with OpenAI tool or function calling.
createAIFunction<Schema extends z.ZodObject<any>, Return>(
  spec: {
    name: string;
    description?: string;
    argsSchema: Schema;
  },
  implementation: (params: z.infer<Schema>) => Promise<Return>
): AIFunction<Schema, Return>
Creates a function to extract structured data from text using OpenAI function calling.
createAIExtractFunction<Schema extends z.ZodObject<any>>(
  {
    chatModel: Model.Chat.Model;
    name: string;
    description?: string;
    schema: Schema;
    maxRetries?: number;
    systemMessage?: string;
    functionCallConcurrency?: number;
  },
  customExtractImplementation?: (params: z.infer<Schema>) => z.infer<Schema> | Promise<z.infer<Schema>>
): ExtractFunction<Schema>
Creates a function to run a chat model in a loop, handling parsing, running, and inserting responses for function & tool call messages.
createAIRunner<Content = string>(args: {
  chatModel: Model.Chat.Model;
  functions?: AIFunction[];
  shouldBreakLoop?: (msg: Msg) => boolean;
  maxIterations?: number;
  functionCallConcurrency?: number;
  validateContent?: (content: string | null) => Content | Promise<Content>;
  mode?: Runner.Mode;
  systemMessage?: string;
  onRetriableError?: (error: Error) => void;
}): Runner<Content>
Utility class for creating and checking message types.
- MsgUtil.system(content: string, opts?): Msg.System
- MsgUtil.user(content: string, opts?): Msg.User
- MsgUtil.assistant(content: string, opts?): Msg.Assistant
- MsgUtil.funcCall(function_call: { name: string; arguments: string }, opts?): Msg.FuncCall
- MsgUtil.funcResult(content: Jsonifiable, name: string): Msg.FuncResult
- MsgUtil.toolCall(tool_calls: Msg.Call.Tool[], opts?): Msg.ToolCall
- MsgUtil.toolResult(content: Jsonifiable, tool_call_id: string, opts?): Msg.ToolResult
Utility for encoding, decoding, and counting tokens for various models.
- createTokenizer(model: string): Tokenizer
Utilities for caching model responses.
- type CacheStorage<KeyType, ValueType>
- type CacheKey<Params extends Record<string, any>, KeyType = string>
- defaultCacheKey<Params extends Record<string, any>>(params: Params): string
OpenAI Client (openai-fetch)
Dexter uses the openai-fetch library to interact with the OpenAI API. This client is lightweight, well-typed, and provides a simple interface for making API calls. Here's how it's used in Dexter:
- Default Client: By default, Dexter creates an instance of OpenAIClient from openai-fetch when initializing models.
- Custom Client: You can provide your own instance of OpenAIClient when creating a model:
  import { OpenAIClient } from 'openai-fetch';
  import { ChatModel } from '@dexaai/dexter';
  const client = new OpenAIClient({ apiKey: 'your-api-key' });
  const chatModel = new ChatModel({ client });
- Client Caching: Dexter implements caching for OpenAIClient instances to improve performance when creating multiple models with the same configuration.
- Streaming Support: The openai-fetch client supports streaming responses, which Dexter utilizes for real-time output in chat models.
- Structured Output: Dexter supports OpenAI's structured output feature through createExtractFunction, which uses the response_format parameter with a JSON schema derived from a Zod schema.
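One way to picture the client caching mentioned in the list above is memoizing clients by their configuration. The sketch below is an illustration under assumptions, not Dexter's implementation: FakeClient, getCachedClient, and the cache key format are hypothetical names, and FakeClient stands in for a real API client.

```typescript
// Hypothetical stand-in for an API client; it only stores its options.
class FakeClient {
  readonly apiKey: string;
  readonly baseUrl: string;

  constructor(opts: { apiKey: string; baseUrl: string }) {
    this.apiKey = opts.apiKey;
    this.baseUrl = opts.baseUrl;
  }
}

// One client per distinct configuration.
const clientCache = new Map<string, FakeClient>();

function getCachedClient(opts: { apiKey: string; baseUrl?: string }): FakeClient {
  const baseUrl = opts.baseUrl ?? 'https://api.openai.com/v1';
  // Assumption: the configuration is small enough to serialize as the cache key.
  const key = JSON.stringify([opts.apiKey, baseUrl]);

  let client = clientCache.get(key);
  if (!client) {
    client = new FakeClient({ apiKey: opts.apiKey, baseUrl });
    clientCache.set(key, client);
  }
  return client;
}
```

With this shape, two models created with the same options would share one client instance, while a different API key or base URL would produce a separate one.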
Dexter defines a set of message types (Msg) that closely align with the OpenAI API's message formats but with some enhancements for better type safety and easier handling. The MsgUtil class provides methods for creating, checking, and asserting these message types.
- Msg.System: System messages
- Msg.User: User messages
- Msg.Assistant: Assistant messages
- Msg.Refusal: Refusal messages (thrown as errors in Dexter)
- Msg.FuncCall: Function call messages
- Msg.FuncResult: Function result messages
- Msg.ToolCall: Tool call messages
- Msg.ToolResult: Tool result messages
These types are designed to be compatible with the ChatMessage type from openai-fetch, with some differences:
- Dexter throws a RefusalError for refusal messages instead of including them in the Msg union.
- The content property is always defined (string or null) in Dexter's types.
- Creation Methods: system, user, assistant, funcCall, funcResult, toolCall, toolResult
- Type Checking Methods: isSystem, isUser, isAssistant, isRefusal, isFuncCall, isFuncResult, isToolCall, isToolResult
- Type Assertion Methods: assertSystem, assertUser, assertAssistant, assertRefusal, assertFuncCall, assertFuncResult, assertToolCall, assertToolResult
- Conversion Method: fromChatMessage converts an openai-fetch ChatMessage to a Dexter Msg type
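The split between creation, checking, and assertion methods maps onto TypeScript type guards and assertion functions. The following is a minimal sketch of that pattern using simplified, hypothetical message shapes (SimpleMsg and its members); Dexter's real Msg types and MsgUtil methods carry more fields and cases than shown here.

```typescript
// Simplified, hypothetical message shapes for illustration only.
type SystemMsg = { role: 'system'; content: string };
type UserMsg = { role: 'user'; content: string };
type AssistantMsg = { role: 'assistant'; content: string };
type ToolCallMsg = {
  role: 'assistant';
  content: null;
  tool_calls: { id: string; type: 'function'; function: { name: string; arguments: string } }[];
};
type SimpleMsg = SystemMsg | UserMsg | AssistantMsg | ToolCallMsg;

// Creation helper: builds a correctly typed message.
const user = (content: string): UserMsg => ({ role: 'user', content });

// Type guards: narrow the union inside an `if`.
function isToolCall(msg: SimpleMsg): msg is ToolCallMsg {
  return msg.role === 'assistant' && 'tool_calls' in msg && msg.content === null;
}

function isAssistant(msg: SimpleMsg): msg is AssistantMsg {
  return msg.role === 'assistant' && typeof msg.content === 'string';
}

// Assertion: throws instead of returning false, narrowing for the rest of the scope.
function assertAssistant(msg: SimpleMsg): asserts msg is AssistantMsg {
  if (!isAssistant(msg)) {
    throw new Error(`Expected an assistant message, got role "${msg.role}"`);
  }
}

function summarize(msg: SimpleMsg): string {
  if (isToolCall(msg)) {
    // msg.tool_calls is available here because the guard narrowed the type.
    return `calls ${msg.tool_calls.map((call) => call.function.name).join(', ')}`;
  }
  return `${msg.role}: ${msg.content}`;
}
```

Guards suit branching code, while assertions suit places where any other message type is a programming error.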
 
Dexter includes a telemetry system for tracking and logging model operations. The telemetry system is based on the OpenTelemetry standard and can be integrated with various observability platforms.
- Default Telemetry: By default, Dexter uses a no-op telemetry provider that doesn't perform any actual logging or tracing.
- Custom Telemetry: You can provide your own telemetry provider when initializing models. The provider should implement the Telemetry.Provider interface:
  interface Provider {
    startSpan<T>(options: SpanOptions, callback: (span: Span) => T): T;
    setTags(tags: { [key: string]: Primitive }): void;
  }
- Span Attributes: Dexter automatically adds various attributes to telemetry spans, including model type, provider, input tokens, output tokens, and more.
- Usage: Telemetry is used internally in the AbstractModel class to wrap the run method, providing insights into model execution.
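As a rough illustration of the provider interface quoted above, here is a sketch of a provider that records spans in memory. The Provider interface is copied from the text; the shapes of SpanOptions, Span, and Primitive are not given in this document, so the minimal versions below are assumptions, and RecordingTelemetry is a hypothetical name.

```typescript
// Assumed minimal shapes; the real Telemetry types may differ.
type Primitive = string | number | boolean | null | undefined;
interface SpanOptions {
  name: string;
  attributes?: Record<string, Primitive>;
}
interface Span {
  setAttribute(key: string, value: Primitive): void;
}

// Interface as quoted in the list above.
interface Provider {
  startSpan<T>(options: SpanOptions, callback: (span: Span) => T): T;
  setTags(tags: { [key: string]: Primitive }): void;
}

interface SpanRecord {
  name: string;
  attributes: Record<string, Primitive>;
  durationMs: number;
}

class RecordingTelemetry implements Provider {
  readonly spans: SpanRecord[] = [];
  private tags: Record<string, Primitive> = {};

  startSpan<T>(options: SpanOptions, callback: (span: Span) => T): T {
    const attributes: Record<string, Primitive> = { ...this.tags, ...options.attributes };
    const span: Span = {
      setAttribute: (key, value) => {
        attributes[key] = value;
      },
    };
    const startedAt = Date.now();
    const finish = (): void => {
      this.spans.push({ name: options.name, attributes, durationMs: Date.now() - startedAt });
    };

    let result: T;
    try {
      result = callback(span);
    } catch (err) {
      finish(); // record the span even when the callback throws
      throw err;
    }
    if (result instanceof Promise) {
      // Keep the span open until the asynchronous work settles.
      return result.finally(finish) as unknown as T;
    }
    finish();
    return result;
  }

  setTags(tags: { [key: string]: Primitive }): void {
    Object.assign(this.tags, tags);
  }
}
```

A recorder like this is mostly useful in tests; a production provider would forward the same calls to an observability SDK instead of an array.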
Dexter provides a flexible caching system to improve performance and reduce API calls:
- Cache Interface: The cache must implement the CacheStorage interface:
  interface CacheStorage<KeyType, ValueType> {
    get: (key: KeyType) => Promise<ValueType | undefined> | ValueType | undefined;
    set: (key: KeyType, value: ValueType) => Promise<unknown> | unknown;
  }
- Default Cache Key: Dexter uses a default cache key function that creates a SHA512 hash of the input parameters.
- Custom Cache: You can provide your own cache implementation when initializing models:
  import { ChatModel } from '@dexaai/dexter';
  const customCache = new Map();
  const chatModel = new ChatModel({ cache: customCache });
- Cache Usage: Caching is automatically applied in the AbstractModel class. Before making an API call, it checks the cache for a stored response. After receiving a response, it stores it in the cache for future use.
- Cache Invalidation: Cache invalidation is left to the user. You can implement your own cache invalidation strategy based on your specific use case.
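Since invalidation is left to the user, one possible strategy is a time-to-live cache that satisfies the CacheStorage interface quoted above. The sketch below also shows how a SHA512 key could be derived from parameters. It is an illustration under assumptions: TtlCache, sha512Key, and stableStringify are hypothetical names, and the sorted-key serialization is a choice made here, not necessarily what Dexter's defaultCacheKey does.

```typescript
import { createHash } from 'node:crypto';

// Interface as quoted in the list above.
interface CacheStorage<KeyType, ValueType> {
  get: (key: KeyType) => Promise<ValueType | undefined> | ValueType | undefined;
  set: (key: KeyType, value: ValueType) => Promise<unknown> | unknown;
}

// Serialize with sorted object keys so property order does not change the hash.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(',')}]`;
  }
  if (value !== null && typeof value === 'object') {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(obj[key])}`);
    return `{${entries.join(',')}}`;
  }
  const json = JSON.stringify(value) as string | undefined;
  return json ?? 'null';
}

function sha512Key(params: Record<string, unknown>): string {
  return createHash('sha512').update(stableStringify(params)).digest('hex');
}

// Entries expire ttlMs after they are written and are dropped lazily on read.
class TtlCache<V> implements CacheStorage<string, V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

The injectable clock (the now parameter) is there only to make expiry testable without waiting. An instance could be passed as the cache option in the same way the Map is in the Custom Cache item.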
Dexter includes a tokenization system based on the tiktoken library, which is used by OpenAI for their models. This system is crucial for accurately counting tokens and managing model inputs and outputs.
- Tokenizer Creation: The createTokenizer function creates a Tokenizer instance for a specific model:
  const tokenizer = createTokenizer('gpt-3.5-turbo');
- Tokenizer Methods:
  - encode(text: string): Uint32Array: Encodes text to tokens
  - decode(tokens: number[] | Uint32Array): string: Decodes tokens to text
  - countTokens(input?: string | ChatMessage | ChatMessage[]): number: Counts tokens in various input formats
  - truncate({ text: string, max: number, from?: 'start' | 'end' }): string: Truncates text to a maximum number of tokens
- Model Integration: Each model instance has its own Tokenizer, which is used internally for token counting and management.
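To make the truncate method concrete, here is a sketch of token-based truncation built on encode and decode. Two things are assumptions: the whitespace tokenizer is a stand-in used only to keep the example self-contained (it is not tiktoken and produces different counts), and the meaning of from (which end tokens are removed from) is a guess at the semantics rather than documented behavior.

```typescript
// Stand-in tokenizer: one token per whitespace-separated word. NOT tiktoken.
const encode = (text: string): string[] => text.split(/\s+/).filter(Boolean);
const decode = (tokens: string[]): string => tokens.join(' ');
const countTokens = (text: string): number => encode(text).length;

function truncate(args: { text: string; max: number; from?: 'start' | 'end' }): string {
  const { text, max, from = 'end' } = args;
  const tokens = encode(text);
  if (tokens.length <= max) return text; // already within budget

  // Assumption: 'end' removes tokens from the end (keeps the beginning),
  // and 'start' removes tokens from the start (keeps the most recent text).
  const kept =
    from === 'start' ? tokens.slice(tokens.length - max) : tokens.slice(0, max);
  return decode(kept);
}
```

Truncating on token boundaries rather than characters is what keeps the result within a model's token limit; with a real tokenizer the same slice-then-decode approach applies.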
Dexter provides a system of event hooks that allow you to add custom logic at various points in the model execution process. These hooks are defined in the Model.Events interface:
- Available Hooks:
  - onStart: Called before the model execution starts
  - onApiResponse: Called after receiving a response from the API
  - onComplete: Called after the model execution is complete
  - onError: Called if an error occurs during model execution
- Hook Parameters: Each hook receives an event object with relevant information, such as timestamps, model parameters, responses, and context.
- Usage: Event hooks can be defined when creating a model instance:
  const chatModel = new ChatModel({
    events: {
      onStart: [(event) => console.log('Starting model execution', event)],
      onComplete: [(event) => console.log('Model execution complete', event)],
    },
  });
- Multiple Handlers: Each event can have multiple handlers, which are executed in the order they are defined.
- Async Handlers: Event handlers can be asynchronous functions. Dexter uses Promise.allSettled to handle multiple async handlers.
- Extending Models: When using the extend method to create a new model instance, event handlers are merged, allowing you to add new handlers without removing existing ones.
  const boringModel = new ChatModel({ params: { model: 'gpt-4o', temperature: 0 } });
  const funModel = boringModel.extend({ params: { temperature: 2 } });
  const cheapAndFunModel = funModel.extend({ params: { model: 'gpt-4o-mini' } });
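The handler behavior described in the list above (multiple handlers, Promise.allSettled, merging on extend) can be sketched in a few lines. This is an illustration of the pattern, not Dexter's source: Handler, Events, emit, and mergeEvents are hypothetical names, and only two of the four hooks are modeled.

```typescript
type Handler<E> = (event: E) => void | Promise<void>;

interface Events<E> {
  onStart?: Handler<E>[];
  onComplete?: Handler<E>[];
}

// Start handlers in definition order and wait for all of them to settle.
// Resolves to the number of handlers that failed, so one bad handler
// cannot prevent the others from running.
async function emit<E>(handlers: Handler<E>[] | undefined, event: E): Promise<number> {
  const results = await Promise.allSettled(
    // The async wrapper turns synchronous throws into rejections.
    (handlers ?? []).map(async (handler) => handler(event)),
  );
  return results.filter((result) => result.status === 'rejected').length;
}

// Merge by concatenation: extending adds handlers and never removes existing ones.
function mergeEvents<E>(base: Events<E>, extra: Events<E>): Events<E> {
  return {
    onStart: [...(base.onStart ?? []), ...(extra.onStart ?? [])],
    onComplete: [...(base.onComplete ?? []), ...(extra.onComplete ?? [])],
  };
}
```

Using Promise.allSettled rather than Promise.all means a rejected handler is reported alongside the others instead of short-circuiting them, which fits hooks that exist for logging and monitoring.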
MIT © Dexa