Transforms

Enrich your knowledge graph with powerful transformations

Overview

Transforms are composable functions that enrich your knowledge graph. Chain them together using the transform() API to build sophisticated knowledge representations.
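
For example, a pipeline that chunks documents, embeds the chunks, and links related chunks can be chained in a single call. The snippet below is a sketch: it assumes transform is exported from '@open-evals/generator' alongside the individual transforms, that baseGraph is an existing knowledge graph, and that openai comes from the AI SDK's '@ai-sdk/openai' package.

import { transform, chunk, embed, relationship } from '@open-evals/generator'
import { RecursiveCharacterSplitter } from '@open-evals/rag'
import { openai } from '@ai-sdk/openai'

// Split documents, embed the resulting chunks, then connect similar chunks.
const enriched = await transform(baseGraph)
  .pipe(
    chunk(
      new RecursiveCharacterSplitter({ chunkSize: 512, chunkOverlap: 50 })
    )
  )
  .pipe(embed(openai.embedding('text-embedding-3-small')))
  .pipe(relationship(0.7))
  .apply()

Each transform in this chain is documented individually below.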

Core Transforms

chunk

Splits documents into smaller, semantically meaningful pieces:

import { chunk } from '@open-evals/generator'
import { RecursiveCharacterSplitter } from '@open-evals/rag'

const graph = await transform(baseGraph)
  .pipe(
    chunk(
      new RecursiveCharacterSplitter({
        chunkSize: 512,
        chunkOverlap: 50,
      })
    )
  )
  .apply()

What it does: For each document node, creates multiple chunk nodes using the provided text splitter. Maintains parent-child relationships between documents and chunks.

When to use: Essential for RAG systems and any scenario where you need granular access to document content. Smaller chunks improve retrieval precision.
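
To sanity-check the split, you can count the resulting chunk nodes. This sketch assumes chunk nodes carry type 'chunk' (mirroring the 'document' type used in the embedProperty example below) and that the applied graph exposes the same getNodes() accessor the tap example uses; splitter is a placeholder for any configured text splitter.

import { chunk } from '@open-evals/generator'

const chunked = await transform(baseGraph).pipe(chunk(splitter)).apply()

// Assumption: chunk nodes are typed 'chunk'.
const chunkCount = chunked.getNodes().filter((node) => node.type === 'chunk').length
console.log(`Split documents into ${chunkCount} chunks`)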

embed

Adds vector embeddings for semantic similarity:

import { embed } from '@open-evals/generator'

const graph = await transform(baseGraph)
  .pipe(embed(openai.embedding('text-embedding-3-small')))
  .apply()

What it does: Generates embeddings for all chunk nodes, storing them in the embedding property. Uses the embedding model you provide.

When to use: Required for semantic search and relationship detection. Run after chunking so each chunk gets its own embedding.
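
A quick way to confirm the embeddings landed where you expect is a tap after the embed step (tap is documented below). This sketch assumes chunkedGraph is a graph that has already been chunked, that chunk nodes are typed 'chunk', and that the vector is readable as node.embedding; adjust to the actual node shape if it differs.

import { embed, tap } from '@open-evals/generator'

const graph = await transform(chunkedGraph)
  .pipe(embed(openai.embedding('text-embedding-3-small')))
  .pipe(
    tap((g) => {
      // Assumption: embeddings are exposed as node.embedding on chunk nodes.
      const missing = g.getNodes().filter((n) => n.type === 'chunk' && !n.embedding)
      console.log(`${missing.length} chunks still have no embedding`)
    })
  )
  .apply()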

relationship

Detects and creates connections between related chunks:

import { relationship } from '@open-evals/generator'

const graph = await transform(baseGraph).pipe(relationship(0.7)).apply()

What it does: Compares chunk embeddings to find semantically similar content. Creates bidirectional relationships between related chunks with similarity scores.

When to use: After embedding. Enables multi-hop question generation by connecting related concepts across your knowledge base.
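
To see how many links a given threshold produces, you can pair relationship() with tap() (documented below). In this sketch, embeddedGraph stands for a graph that has already been chunked and embedded, and getRelationships() is hypothetical, named by analogy with the getNodes() accessor used in the tap example; substitute the graph API's actual accessor.

import { relationship, tap } from '@open-evals/generator'

const graph = await transform(embeddedGraph)
  .pipe(relationship(0.7))
  .pipe(
    tap((g) => {
      // Hypothetical accessor - check the graph API for the real method name.
      console.log(`Created ${g.getRelationships().length} relationships at threshold 0.7`)
    })
  )
  .apply()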

summarize

Generates concise summaries of documents:

import { summarize } from '@open-evals/generator'

const graph = await transform(baseGraph)
  .pipe(summarize(openai.chat('gpt-4o')))
  .apply()

What it does: Uses an LLM to generate summaries for each document node, stored in the summary property.

When to use: When you need document-level summaries, for example as input to persona generation and to synthesizers.
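
To inspect the generated summaries, add a tap after the summarize step. This sketch assumes the summary is readable as node.summary on document nodes (the summary property mentioned above); adjust if the node shape differs.

import { summarize, tap } from '@open-evals/generator'

const graph = await transform(baseGraph)
  .pipe(summarize(openai.chat('gpt-4o')))
  .pipe(
    tap((g) => {
      for (const node of g.getNodes()) {
        // Assumption: the summary is exposed directly as node.summary.
        if (node.type === 'document') console.log(node.summary)
      }
    })
  )
  .apply()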

embedProperty

Creates embeddings for specific node properties:

import { embedProperty } from '@open-evals/generator'

const graph = await transform(baseGraph)
  .pipe(summarize(llm))
  .pipe(
    embedProperty(openai.embedding('text-embedding-3-small'), {
      embedProperty: 'summary', // Property to embed
      propertyName: 'summaryEmbedding', // Where to store embedding
      filter: (node) => node.type === 'document',
    })
  )
  .apply()

What it does: Embeds a specific property instead of the main content. Useful for embedding summaries, titles, or other metadata.

When to use: When you want separate embeddings for different aspects of your nodes (e.g., content vs. summary embeddings).
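
For example, a single pipeline can maintain both content embeddings on chunk nodes and summary embeddings on document nodes. A sketch, using the same splitter, embedModel, and llm placeholders as the other examples on this page:

import { chunk, embed, summarize, embedProperty } from '@open-evals/generator'

const graph = await transform(baseGraph)
  .pipe(chunk(splitter))
  .pipe(embed(embedModel)) // content embeddings on chunk nodes
  .pipe(summarize(llm))
  .pipe(
    embedProperty(embedModel, {
      embedProperty: 'summary',
      propertyName: 'summaryEmbedding',
      filter: (node) => node.type === 'document',
    })
  )
  .apply()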

tap

Inspect or modify the graph mid-pipeline:

import { tap } from '@open-evals/generator'

const graph = await transform(baseGraph)
  .pipe(chunk(splitter))
  .pipe(
    tap((g) => {
      console.log(`Created ${g.getNodes().length} nodes`)
      // Optionally modify graph here
    })
  )
  .pipe(embed(embedModel))
  .apply()

What it does: Allows you to inspect or modify the graph between transforms without breaking the pipeline.

When to use: Debugging, logging progress, or applying custom modifications that don't fit into a standard transform.
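
For instance, a tap can act as a fail-fast sanity check between stages (same splitter and embedModel placeholders as above):

import { tap } from '@open-evals/generator'

const graph = await transform(baseGraph)
  .pipe(chunk(splitter))
  .pipe(
    tap((g) => {
      // Stop early if chunking produced nothing, before paying for embeddings.
      if (g.getNodes().length === 0) {
        throw new Error('Chunking produced no nodes; check the splitter configuration')
      }
    })
  )
  .pipe(embed(embedModel))
  .apply()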
