How to implement AI vector search and related posts with pgvector

At the end of this tutorial, you should be able to set up your own vector search with text embeddings in a Next.js app. This is a tutorial that mostly consists of coding samples taken directly from the Sanity codebase.

You can see the results right here on Sanity. The related posts section underneath each post is generated with pgvector. So is the search.

The stack I used:

  • Open AI's text-embedding-ada-002 model
  • Next.js
  • Prisma
  • PostgreSQL

Start by setting up the Prisma client:

This step is needed to get Prisma to cooperate with Next.js.

// Setting up prisma

import { PrismaClient } from "@prisma/client";

import { IS_DEVELOPMENT } from "@/utils";

/**
 * This is basically:
 *  export const IS_DEVELOPMENT = process.env.NODE_ENV === "development";
 */


let prisma: PrismaClient;

if (IS_DEVELOPMENT) {
  // @ts-ignore
  if (!global.prisma) {
    // @ts-ignore
    global.prisma = new PrismaClient();
  }

  // @ts-ignore
  prisma = global.prisma;
} else {
  prisma = new PrismaClient();
}

export { prisma };

Set up your Prisma schema

// schema.prisma

generator client {
  provider = "prisma-client-js"
  previewFeatures = ["postgresqlExtensions"]
}

datasource db {
  provider = "postgresql"
  url = env("POSTGRES_PRISMA_URL")
  directUrl = env("POSTGRES_URL_NON_POOLING")
  extensions = [vector]
}

model Post {
  mongoId         String   @map("mongo_id")
  id              String   @id @default(cuid())
  content         String   @db.VarChar(40000)
  parentPostId    String?  @map("parent_post_id")
  parentPostSlug  String?  @map("parent_post_slug")
  userId          String   @map("user_id")
  score           Int      @default(0)
  commentCount    Int      @default(0) @map("comment_count")
  tags            String[]
  createdAt       DateTime @default(now()) @map("created_at")
  slug            String   @unique
  images          Json    
  status          PostStatus
  publishedAt     DateTime? @map("published_at")
  embedding      Unsupported("vector(1536)")?
  @@map("posts")
}

enum PostStatus {
  PUBLISHED
  DRAFT
  AWAITING_DELETION
  AWAITING_PUBLICATION
}

At some point you'll need to push your changes with a Prisma command, for example prisma db push. If that does not ring a bell, I recommend reading up on Prisma then returning here.

You don't need most of the Post model fields. I have them because I use them internally at Sanity.

Generating embeddings

We use open AI's ada model to generate embeddings.

// Generate embeddings

"use server";
import OpenAI from "openai";

import { OPEN_AI_API_KEY } from "@/utils";

const openai = new OpenAI({
  apiKey: OPEN_AI_API_KEY,
});

export const generateEmbedding = async (input: string) => {
  const embeddingData = await openai.embeddings.create({
    model: "text-embedding-ada-002",
    input,
  });
  const [{ embedding }] = embeddingData.data;
  return embedding;
};

Implementing vector search

I use the functions below to return related posts and to run post search.

// Searching posts and getting related posts
"use server";
import { prisma } from "@/lib";
import { RegularPost } from "@/types";
import { ServerTracker } from "@/utils";

import { generateEmbedding } from "../../openai";

// eslint-disable-next-line @typescript-eslint/no-explicit-any
const cleanPosts = (posts: any[]) => {
  return posts.map((post) => {
    return {
      ...post,
      publishedAt: post.publishedAt?.toISOString() ?? null,
      createdAt: post.createdAt?.toISOString() ?? null,
    };
  }) as RegularPost[];
};

const searchByEmbedding = async (embedding: number[]) => {
  const posts = (await prisma.$queryRaw`
    SELECT
      content,
      parent_post_id as "parentPostId",
      parent_post_slug as "parentPostSlug",
      user_id as "userId",
      score,
      comment_count as "commentCount",
      tags,
      created_at as "createdAt",
      slug,
      images,
      status,
      published_at as "publishedAt",
      mongo_id AS "postId",
      1 - (embedding <=> ${embedding}::vector) as similarity
    FROM posts
    WHERE embedding IS NOT NULL
    AND LENGTH(content) > 200
    ORDER BY  similarity DESC
    LIMIT 25
  `) as any[]; // eslint-disable-line @typescript-eslint/no-explicit-any

  return posts;
};

export const searchPosts = async (query: string): Promise<RegularPost[]> => {
  ServerTracker.trackSearchQuery({ query });
  const shortQuery = query.substring(0, 100);
  const embedding = await generateEmbedding(shortQuery);
  const posts = await searchByEmbedding(embedding);
  return cleanPosts(posts);
};

export const getRelatedPostsByVector = async (
  postId: string,
): Promise<RegularPost[]> => {
  const [post] = (await prisma.$queryRaw`
    SELECT mongo_id, CAST(embedding AS text) AS embedding
    FROM posts
    WHERE mongo_id = ${postId}
    LIMIT 1;
  `) as {
    mongo_id: string;
    embedding: string;
  }[];

  if (!post || !post.embedding) {
    return [];
  }

  const relatedPosts = (await prisma.$queryRaw`
    SELECT
      content,
      parent_post_id as "parentPostId",
      parent_post_slug as "parentPostSlug",
      user_id as "userId",
      score,
      comment_count as "commentCount",
      tags,
      created_at as "createdAt",
      slug,
      images,
      status,
      published_at as "publishedAt",
      mongo_id AS "postId",
      1 - (embedding <=> ${post.embedding}::vector) as similarity
    FROM posts
    WHERE embedding IS NOT NULL
    AND parent_post_id IS NULL
    AND mongo_id != ${postId}
    ORDER BY  similarity DESC
    LIMIT 10
  `) as any[]; // eslint-disable-line @typescript-eslint/no-explicit-any

  return cleanPosts(relatedPosts);
};

Thanks for reading! If this post was helpful to you, please upvote this post or leave a comment!

reply

Other posts you might like

How I implemented slugs on Sanity - a TypeScript code sample

The lack of human-readable slugs on Sanity had bothered me for a while and I finally got around to fixing them last Sunday. The old, slugless URL structure probably wasn't doing me any favors in terms of SEO and user experience. I'm hoping the new format can give Sanity a much needed SEO boost. Plus, I can finally tell which post is which in Google Search Console and Vercel Analytics.

The Result

Before

https://www.sanity.media/p/64c375049f5d6b05859f10c6

After

https://www.sanity.media/p/64c375049f5d6b05859f10c6-delicious-post-workout-milkshake-recipe

Isn't this much clearer?

The Code

When writing the code I had the following goals in mind:

programmingjavascriptmongoosebuilding in publicmongodb
1 comment

How I built a chat app using Streams API, Next.JS, Redis and Vercel

Last week I added a chat feature to Sanity. In this article, I'll guide through how I built it using Streams API, Next.js, Redis and Vercel.

Sanity chat

Before we start, a quick disclaimer: there are much better ways to build a chat application, for example by using WebSockets. Vercel unfortunately doesn't support WebSockets and I didn't want to spin a dedicated server, which is why I used Streams API. Using Streams API the way I use it here is most likely not the best use of resources but it works and is a good enough solution for my small scale use. If you're on the same boat, keep reading.

If the chat takes off, I'll have to move it to a dedicated Socket.io server, a serverless WebSocket on AWS, or something similar to reduce costs.

Storing messages in Redis

I use the KV (Redis) database from Vercel to store the last 100 messages. Here is the code used to send and read messages.

import { MAX_CHAT_MESSAGE_LENGTH } from "@/utils";

const MAX_MESSAGES = 100;

export const addChatMessage = async ({
programmingvercelstreams apibackendnext.jsreactredisjavascript
reply

How I struggled to fix votes on Sanity

Ever since I implemented upvotes a few months ago, I had been struggling with user upvotes/downvotes request occasionly timing out. The bug persisted for a few months and the few times I tried to debug it, I had no success. Is it the database schema? Nope, I use similar schemas for other collections and they work fine. An inefficient MongoDB query? Same thing. No indexing? I indexed the DB even though there are barely any votes in the collection. An issue with Vercel cold start? Also not it, everything within the norm.

Last Friday the rest of the app was finally ready and I wanted to start inviting some users, so I gave up and decided to pay $20/month for Vercel Pro to increase the timeout from 10 to 60 seconds and worry about the bug another day. And then I checked the logs on Vercel Pro...

Unhandled error: MongooseError: Operation `userVotes.findOne()` buffering timed out after 10000ms
    at Timeout.<anonymous> (/var/task/sanity_client/node_modules/mongoose/lib/drivers/node-mongodb-native/collection.js:175:23)
    at listOnTimeout (node:internal/timers:569:17)
    at process.processTimers (node:internal/timers:512:7)

Because Mongoose timeout is 10000ms and Vercel's timeout is also 10000ms but this includes the cold start time, this error never popped up on my free plan....

sanityprogrammingvercelmongodbbuilding in public
reply

How to use AWS Amplify Authentication with Next.js Server Actions

No next-auth required:

// amplifyServerUtils.ts

import { createServerRunner } from '@aws-amplify/adapter-nextjs';
import config from '@/amplifyconfiguration.json';

export const { runWithAmplifyServerContext } = createServerRunner({
  config
});
// actions.ts

"use server";
import { cookies } from "next/headers";
programmingawsamplifynext.jsamplify v6cognito
reply

Feature announcement πŸŽ‰

You can now tag your posts! This will make it easier for your content to be discovered, both on Sanity and in search engines - a much-needed SEO boost. You can add up to 5 tags per post and the length limit is 35 characters.

feature announcementsanitytagsseosanity tips
reply

A user acquisition conundrum

One of the fastest ways I can attract the first users to Sanity and start getting feedback is by posting about it and engaging on Facebook and Twitter. The thing is, I find them too addictive and full of dark patterns, which is why started Sanity in the first place. I always tell myself I’ll only use them for a specific purpose, or for 15 minutes a day, or only occasionally, and then I gradually get sucked into using them all the time. I start out by visiting once every few days, I check for new posts from a few people I follow, read some AI news, visit groups etc. With time, I find myself using them more and more until several weeks later, I check them throughout the day. This is more of a problem for me with Twitter but, to a lesser extent, also applies to Facebook.

So the question is - how to I get the word out about Sanity without using the addictive social media networks it’s meant to be the alternative to?

I'm focusing on writing content and optimizing SEO for now but it's a slow process. Any tips?

indie makersbuilding in publicseosocial mediasocial media addiction
reply

Quick feature announcement

I updated the layout a bit to make it easier to find replies to your posts and comments. This is not backward-compatible yet, so you are only going to see the posts and replies created from now on. I'll run a database migration some time in the next few days so that older replies also appear there.

sanityfeature announcementbuilding in publicux
reply

Is there a secure way to use Redis with Vercel?

I spent a couple of hours yesterday trying to find a way to use Redis with Sanity, which currently runs on Vercel. According to Redis docs on security, it is not a good idea to expose a Redis instance directly to the internet:

Redis is designed to be accessed by trusted clients inside trusted environments. This means that usually it is not a good idea to expose the Redis instance directly to the internet or, in general, to an environment where untrusted clients can directly access the Redis TCP port or UNIX socket.

I wanted to use Digital Ocean's trusted sources to restrict the incoming connections to those coming from my Vercel server but looks like that won't be possible because of Vercel's use of dynamic IP addresses. According to Vercel docs:

To ensure your Vercel deployment is able to access the external resource, you should allow connections from all IP addresses. Typically this can be achieved by entering an IP address of (0.0.0.0).

While allowing connections from all IP addresses may be a concern, relying on IP allowlisting for security is generally ineffective and can lead to poor security practices.

To properly secure your database, we recommend using a randomly generated password, stored as an environment variable, at least 32 characters in length, and to rotate this password on a regular basi...

1 comment
feedback