AI & Semantic Search

Build semantic search without extra complexity

With UQL, semantic search is part of your normal ORM workflow. You can store embeddings and query by meaning without adding a separate search stack.

  • Native vector operators: Use $vector, $distance, and $project directly in queries.
  • Multi-database support: PostgreSQL (pgvector), MariaDB, SQLite (sqlite-vec), and MongoDB Atlas.
  • One query shape end-to-end: Use the same JSON query structure on backend and frontend.

If you are building RAG, search, or recommendation features, this gives you one type-safe API from database to UI.


  1. Create embeddings
    Convert text into vectors using your preferred model.

  2. Store vectors in your entities
    Add a typed vector field and index it for fast similarity search.

  3. Query by meaning
    Sort by vector distance to get the closest matches for a user query.

  4. Use the best matches in your LLM step
    Pass the most relevant results into Q&A, summarization, or recommendation prompts.
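
The "query by meaning" step boils down to a distance function over vectors. As a minimal illustration (not UQL code — the operators below run this inside the database), cosine distance can be sketched as:

```typescript
// Cosine distance between two equal-length vectors: 0 means identical
// direction, 1 means orthogonal, 2 means opposite. Lower is a closer match.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Sorting rows by this value ascending is exactly what "closest matches first" means.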


import { Entity, Id, Field, Index } from 'uql-orm';

@Entity()
@Index(['embedding'], { type: 'hnsw', distance: 'cosine', m: 16, efConstruction: 64 })
export class Article {
  @Id() id?: number;
  @Field() title?: string;
  @Field() category?: string;
  @Field({ type: 'vector', dimensions: 1536 })
  embedding?: number[];
}

If you use Postgres, UQL can handle pgvector setup and index options for you.

For single-operation scopes, prefer pool.withQuerier().

import { pool } from './uql.config.js';
import { Article } from './entities.js';

const embedding = await embed('What is UQL?'); // any embedding model

await pool.withQuerier((querier) =>
  querier.insertOne(Article, {
    title: 'What is UQL?',
    category: 'docs',
    embedding,
  })
);
import type { WithDistance } from 'uql-orm';

const queryEmbedding = await embed('TypeScript ORM with vector search');

const results = (await pool.withQuerier((querier) =>
  querier.findMany(Article, {
    $where: { category: 'docs' },
    $sort: {
      embedding: {
        $vector: queryEmbedding,
        $distance: 'cosine',
        $project: 'similarity',
      },
    },
    $limit: 10,
  })
)) as WithDistance<Article, 'similarity'>[];

for (const article of results) {
  console.log(article.title, article.similarity);
}

$project adds the computed score to each row (here: similarity), so your app can filter, rank, or inspect relevance.
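
Note that, despite the name, the projected similarity field holds a cosine distance, so lower values mean closer matches. If your UI expects a higher-is-better score, a small hypothetical helper (not part of UQL) can invert it:

```typescript
// Convert a cosine distance (0 = identical, 2 = opposite) into a
// 0..1 display score where higher means more relevant.
function toDisplayScore(cosineDist: number): number {
  return Math.max(0, Math.min(1, 1 - cosineDist));
}
```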

The query shape stays the same across PostgreSQL, MariaDB, SQLite, and MongoDB Atlas.


UQL queries are plain JSON, so the backend and frontend can share the same query format.

// api.ts (Express)
import express from 'express';
import { querierMiddleware } from 'uql-orm/express';
import { Article } from './entities.js';

const app = express();
app.use('/api', querierMiddleware({ include: [Article] }));

// client.ts (Browser)
import { HttpQuerier } from 'uql-orm/browser';
import { Article } from './entities.js';

const queryEmbedding = await embed('semantic query from user input');
const http = new HttpQuerier('/api');
const results = await http.findMany(Article, {
  $where: { category: 'science' },
  $sort: { embedding: { $vector: queryEmbedding } },
  $limit: 5,
});

This makes it easy to build search bars, recommendation panels, and related-content widgets without rewriting query logic for each layer.
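
Because the query is plain data, you can construct it once, serialize it, and send it over the wire unchanged. A minimal sketch (the query shape is taken from the examples above; the embedding values are placeholders):

```typescript
// Build the query as a plain object; it survives JSON serialization
// unchanged, which is what lets backend and frontend share one shape.
const query = {
  $where: { category: 'science' },
  $sort: { embedding: { $vector: [0.1, 0.2, 0.3] } },
  $limit: 5,
};

const wire = JSON.stringify(query);
const parsed = JSON.parse(wire);
```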


In production, combine normal filters with vector ranking so results stay relevant to user context.

import type { WithDistance } from 'uql-orm';

const queryEmbedding = await embed(userQuery);

const candidates = (await pool.withQuerier((querier) =>
  querier.findMany(Article, {
    $where: { category: 'docs' },
    $sort: {
      embedding: {
        $vector: queryEmbedding,
        $distance: 'cosine',
        $project: 'score',
      },
    },
    $limit: 30,
  })
)) as WithDistance<Article, 'score'>[];

Keep low-signal results out of your RAG context window:

const filtered = candidates.filter((row) => row.score <= 0.35);

With cosine distance, lower values are better matches. Tune this threshold from real logs and user feedback.
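
Rather than hard-coding 0.35, you can derive the cutoff from the distances you actually observe, for example keeping only the best quartile. A hypothetical helper (not part of UQL):

```typescript
// Return the distance value at the given percentile (0..1) of a sample,
// so the RAG filter can adapt to the score distribution you actually see.
function distanceAtPercentile(distances: number[], p: number): number {
  const sorted = [...distances].sort((a, b) => a - b);
  const idx = Math.min(sorted.length - 1, Math.floor(p * sorted.length));
  return sorted[idx];
}

// Example: keep only rows at or below the 25th-percentile distance.
const observed = [0.12, 0.18, 0.22, 0.31, 0.44, 0.52, 0.61, 0.75];
const cutoff = distanceAtPercentile(observed, 0.25);
const kept = observed.filter((d) => d <= cutoff);
```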

Use two passes when you need higher precision or tighter token budgets (optional for simpler setups):

  • First pass: retrieve a broader shortlist from the database with vector search.
  • Second pass: use a reranker model (or LLM) to reorder that shortlist for the exact query.
  • Final step: keep only the best few chunks as generation context.

On the ingestion side, prepare your content so retrieval has something useful to find:

  • Split content into chunks instead of embedding full documents at once.
  • Use a fixed chunk size (how much text per chunk) and fixed overlap (how much text repeats between chunks) so results are consistent and easier to tune.
  • Save metadata (documentId, section, URL, updatedAt) so you can cite sources and prefer fresher content.
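
The fixed-size-with-overlap strategy above can be sketched as follows (a simplified character-based version; production pipelines usually split on tokens or sentence boundaries instead):

```typescript
// Split text into chunks of `size` characters, where consecutive chunks
// share `overlap` characters, so no phrase is cut off without context.
function chunkText(text: string, size: number, overlap: number): string[] {
  if (overlap >= size) throw new Error('overlap must be smaller than size');
  const chunks: string[] = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break;
  }
  return chunks;
}
```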

Keep a small set of real user-style queries and define what “good” looks like for each one. Re-run this set whenever you change models, chunking, or indexes to confirm quality is improving instead of regressing.
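
One way to make that check executable is a small recall-at-k harness (illustrative; the cases and the fake retriever below are placeholders you would wire to your own pipeline):

```typescript
// For each query, record which document ids count as a "good" answer,
// then measure how often the retriever puts one of them in the top k.
type EvalCase = { query: string; goodIds: number[] };

function recallAtK(
  cases: EvalCase[],
  search: (query: string) => number[], // returns ranked document ids
  k: number,
): number {
  let hits = 0;
  for (const c of cases) {
    const topK = search(c.query).slice(0, k);
    if (topK.some((id) => c.goodIds.includes(id))) hits++;
  }
  return hits / cases.length;
}

// Toy fixture: a fake retriever with canned rankings.
const cases: EvalCase[] = [
  { query: 'what is uql', goodIds: [1] },
  { query: 'vector search', goodIds: [2, 3] },
];
const fakeSearch = (q: string) => (q.includes('uql') ? [1, 5, 9] : [7, 8, 9]);
const recall = recallAtK(cases, fakeSearch, 3);
```

Re-run the harness after every model, chunking, or index change and compare the number against the previous run.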