# Search

URL: https://whitepaper.designervenkat.online/docs/coding-tutorials/search
Markdown: https://whitepaper.designervenkat.online/llms.mdx/docs/coding-tutorials/search
Site: White Papers - Designer Venkat
Author: Designer Venkat
Language: en

How the full-text index works, how to extend it, and what to do when results look wrong.





Search is one of the few features users notice within seconds. A bad search experience teaches readers to fall back to Google with a `site:` filter; a good one becomes how they navigate the entire library.

## How the index works [#how-the-index-works]

When you run `npm run build`, Fumadocs walks every MDX file under `content/docs/` and extracts structured data:

1. **Title** from frontmatter
2. **Headings** — every `##`, `###`, with their text and anchor IDs
3. **Body** — prose between headings, with HTML tags stripped

This data ships as a JSON index served by `/api/search`. The client-side search component (mounted in the sidebar) fetches the index on first focus, then runs queries locally, so there is no network round trip per keystroke.
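The fetch-once, query-locally flow described above can be sketched as follows. This is an illustration of the idea, not Fumadocs' client code: `LazyIndex`, `IndexEntry`, and the substring matcher are hypothetical names invented for the example, and the loader is injected so the sketch assumes nothing about the real response format.

```ts
// Hypothetical shape of one index record; the real index format may differ.
interface IndexEntry {
  url: string;
  title: string;
  body: string;
}

// Loads the index at most once, then answers every query from memory.
class LazyIndex {
  private cache: Promise<IndexEntry[]> | null = null;
  private loader: () => Promise<IndexEntry[]>;
  loads = 0; // how many times the loader ran, kept for illustration

  constructor(loader: () => Promise<IndexEntry[]>) {
    this.loader = loader;
  }

  // Call this from the search input's focus handler.
  ensureLoaded(): Promise<IndexEntry[]> {
    if (this.cache === null) {
      this.loads += 1;
      this.cache = this.loader();
    }
    return this.cache;
  }

  // Naive case-insensitive substring match; no network call per keystroke.
  async search(query: string): Promise<IndexEntry[]> {
    const entries = await this.ensureLoaded();
    const q = query.trim().toLowerCase();
    if (q === "") return [];
    return entries.filter(
      (e) => e.title.toLowerCase().includes(q) || e.body.toLowerCase().includes(q),
    );
  }
}
```

Caching the promise rather than the resolved array means two quick keystrokes during the initial download share one request instead of starting two.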

<Callout type="info" title="Why local?">
  A typical documentation site has 50–500 pages. The index for that is roughly 100 KB–2 MB gzipped, small enough to ship to the client. Server-side search makes sense only above \~10,000 pages.
</Callout>
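To see where your own site falls in that range, you can measure the index directly. The helper below is a small sketch for a Node script; `indexSize` is a hypothetical name, and it works on any JSON-serializable value rather than on a specific Fumadocs type.

```ts
import { gzipSync } from "node:zlib";

// Returns the raw and gzipped byte size of a JSON-serializable index.
function indexSize(index: object): { raw: number; gzipped: number } {
  const json = Buffer.from(JSON.stringify(index), "utf8");
  return { raw: json.byteLength, gzipped: gzipSync(json).byteLength };
}
```

Feed it the same array you pass as `indexes` and compare the `gzipped` figure against the sizes quoted above.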

## Modes [#modes]

`createSearchAPI` accepts two modes. The trade-off is index size versus result quality.

<TypeTable
  type={{
    simple: {
      description: "Title-only matching. Smaller index, exact-match feel.",
      default: "-",
      type: "mode",
    },
    advanced: {
      description: "Title + headings + body. Larger index, fuzzy-aware ranking.",
      default: "recommended",
      type: "mode",
    },
  }}
/>

This project uses `advanced`:

```ts title="app/api/search/route.ts"
import { source } from "@/lib/source";
import { createSearchAPI } from "fumadocs-core/search/server";

export const { GET } = createSearchAPI("advanced", {
  indexes: source.getPages().map((page) => ({
    title: page.data.title,
    structuredData: page.data.structuredData,
    id: page.url,
    url: page.url,
  })),
});
```

`structuredData` comes from Fumadocs' `remark-structure` plugin, which is enabled by default — no extra config needed.

## Customizing relevance [#customizing-relevance]

The default ranking weighs title matches more than heading matches more than body matches. Two ways to influence it:
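One way to picture that ordering is a weighted score per page. The sketch below is illustrative only: the weights 3, 2, and 1 and the names `scorePage` and `PageFields` are assumptions made for the example, not values taken from Fumadocs' ranking.

```ts
interface PageFields {
  title: string;
  headings: string[];
  body: string;
}

// Hypothetical weights; any values with title > heading > body convey the same idea.
const WEIGHTS = { title: 3, heading: 2, body: 1 };

// Counts non-overlapping, case-insensitive occurrences of `term` in `text`.
function countMatches(text: string, term: string): number {
  const needle = term.toLowerCase();
  if (needle === "") return 0;
  return text.toLowerCase().split(needle).length - 1;
}

// Sums weighted match counts across the three fields.
function scorePage(page: PageFields, term: string): number {
  const headingHits = page.headings.reduce((n, h) => n + countMatches(h, term), 0);
  return (
    WEIGHTS.title * countMatches(page.title, term) +
    WEIGHTS.heading * headingHits +
    WEIGHTS.body * countMatches(page.body, term)
  );
}
```

With these weights, a page that mentions a term once in its title outranks a page that mentions it once in a heading or twice in the body.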

### Add tags [#add-tags]

Frontmatter `tags` can act as extra search terms, provided the frontmatter schema accepts the field and the index mapping passes it through:

```mdx
---
title: Consensus Under Network Partitions
description: ...
tags: [distributed-systems, cap, raft]
---
```

Queries for "raft" can then surface this paper even if "raft" doesn't appear in the title.
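The route shown earlier maps only `title`, `structuredData`, `id`, and `url`, so tags have to be carried into the index explicitly. One possible approach, sketched here with local stand-in types, is to append them to the indexed text. The `StructuredData` shape, `PageLike`, and `toIndexEntry` are assumptions for illustration; check the shape against your Fumadocs version. This makes tags matchable, but it does not by itself weight them above body text.

```ts
// Local stand-ins for the index types (assumed shape, not imported from Fumadocs).
interface StructuredData {
  headings: { id: string; content: string }[];
  contents: { heading: string | undefined; content: string }[];
}

interface PageLike {
  url: string;
  data: { title: string; tags?: string[]; structuredData: StructuredData };
}

// Returns an index entry whose searchable text also contains the page's tags.
function toIndexEntry(page: PageLike) {
  const tags = page.data.tags ?? [];
  const extra: StructuredData["contents"] =
    tags.length > 0 ? [{ heading: undefined, content: tags.join(" ") }] : [];
  return {
    title: page.data.title,
    structuredData: {
      headings: page.data.structuredData.headings,
      // Copy instead of mutating so the page's own data stays untouched.
      contents: [...page.data.structuredData.contents, ...extra],
    },
    id: page.url,
    url: page.url,
  };
}
```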

### Custom synonyms [#custom-synonyms]

For domain-specific abbreviations, expand them at index time:

```ts
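indexes: source.getPages().map((page) => ({
  title: page.data.title,
  structuredData: {
    ...page.data.structuredData,
    contents: [
      ...page.data.structuredData.contents,
      // Index the expansions as extra body text
      // (assumes synonyms() returns string[]).
      { heading: undefined, content: synonyms(page.data.title).join(" ") },
    ],
  },
  id: page.url,
  url: page.url,
})),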
```

Here, `synonyms()` is a helper you write yourself. It maps "LLM" → "large language model", "CRDT" → "conflict-free replicated data type", and so on. It is worth doing if your domain has ten or more such abbreviations.
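A minimal sketch of such a helper might look like this. The table contents and the choice to return an array of expansions are assumptions for the example; adapt both to your domain.

```ts
// Hypothetical abbreviation table; extend it with your domain's terms.
const EXPANSIONS = new Map<string, string>([
  ["LLM", "large language model"],
  ["CRDT", "conflict-free replicated data type"],
]);

// Returns the expansion of every known abbreviation that appears as a whole word in `text`.
function synonyms(text: string): string[] {
  const words = text.split(/[^A-Za-z0-9]+/).filter((w) => w !== "");
  const found = new Set<string>();
  for (const word of words) {
    const expansion = EXPANSIONS.get(word.toUpperCase());
    if (expansion !== undefined) found.add(expansion);
  }
  return Array.from(found);
}
```

Matching on whole words avoids expanding "CRDT" inside an unrelated longer token, and the `Set` keeps each expansion from being indexed twice.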

## Debugging [#debugging]

If search results look wrong, check three things:

1. **Is the page in the index?** Request `/api/search?query=` followed by a distinctive word from the page (an empty query may return nothing) and inspect the response. If the page isn't there, the build skipped it, usually because of a parse error in the MDX.
2. **Are headings being extracted?** Check the structured data in `page.data.structuredData`. Headings should appear in its `headings` array as `{ id: "...", content: "..." }` entries.
3. **Is the client cache stale?** The browser caches the index. Hard refresh (Cmd+Shift+R on macOS, Ctrl+Shift+R on Windows and Linux) to bust it.
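The first check can be scripted. The sketch below assumes the endpoint returns a JSON array whose items carry a `url` string, possibly with a `#anchor` suffix; that shape, along with the names `SearchHit`, `pageInResults`, and `isIndexed`, is an assumption for illustration. The fetcher is injected so the same function works in a browser console or a test.

```ts
// Assumed minimal shape of one search result; only `url` is used here.
interface SearchHit {
  url: string;
}

// True when any hit points at `pageUrl`, ignoring a trailing "#anchor".
function pageInResults(hits: SearchHit[], pageUrl: string): boolean {
  return hits.some((hit) => hit.url.split("#")[0] === pageUrl);
}

// Queries the search endpoint through an injected fetcher and reports whether the page shows up.
async function isIndexed(
  fetchJson: (url: string) => Promise<unknown>,
  query: string,
  pageUrl: string,
): Promise<boolean> {
  const data = await fetchJson(`/api/search?query=${encodeURIComponent(query)}`);
  if (!Array.isArray(data)) return false;
  // Drop anything that does not look like a hit before comparing URLs.
  const hits = data.filter(
    (item): item is SearchHit =>
      typeof item === "object" && item !== null && typeof (item as SearchHit).url === "string",
  );
  return pageInResults(hits, pageUrl);
}
```

In a browser console on the running site, something like `isIndexed((u) => fetch(u).then((r) => r.json()), "raft", "/docs/your-page")` would run the check against the live endpoint.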

<Callout type="warn" title="Don't index draft content">
  If you have unpublished work in `content/docs/_drafts/`, the leading underscore tells Fumadocs to ignore the folder. Without it, drafts show up in search and confuse readers.
</Callout>
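If you would rather not depend on the naming convention alone, drafts can also be filtered out explicitly before the index is built. This is a sketch under the assumption that each page object exposes its source path in a `path` property; that property name and the helpers `isDraftPath` and `withoutDrafts` are hypothetical.

```ts
// True when any folder or file segment of a content path starts with an underscore.
function isDraftPath(path: string): boolean {
  return path.split("/").some((segment) => segment.startsWith("_"));
}

// Keeps only pages whose source path has no underscore-prefixed segment.
function withoutDrafts<T extends { path: string }>(pages: T[]): T[] {
  return pages.filter((page) => !isDraftPath(page.path));
}
```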

## Server-side search [#server-side-search]

If the index grows past 5MB, switch to server-side search. The pattern:

1. Ship the index to a search service (Algolia, Meilisearch, Typesense)
2. Replace `/api/search` with a proxy to that service
3. Use Fumadocs' `createSearchClient` with a custom fetcher

Typesense is a lightweight option among the three: it is self-hostable and fast, and its typo-tolerant engine is competitive with Algolia's on most documentation queries.
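As a rough sketch of step 2 with Typesense, the proxy needs two pieces: a function that builds the search request and one that maps the response back to the items the client renders. The collection name, the `url` and `title` document fields, and the `{ id, url, type, content }` result shape are assumptions about your setup; the `documents/search` path and the `q` and `query_by` parameters follow Typesense's search API.

```ts
// Assumed connection settings; host, collection, and field names are placeholders.
interface TypesenseConfig {
  host: string; // e.g. "http://localhost:8108"
  collection: string; // hypothetical collection name
  queryBy: string; // comma-separated fields to search, e.g. "title,content"
}

// Only the parts of the response this sketch reads.
interface TypesenseResponse {
  hits?: { document: { id: string; url: string; title: string } }[];
}

// Builds the documents/search URL for a query.
function buildSearchUrl(config: TypesenseConfig, query: string): string {
  const params = new URLSearchParams({ q: query, query_by: config.queryBy });
  const collection = encodeURIComponent(config.collection);
  return `${config.host}/collections/${collection}/documents/search?${params.toString()}`;
}

// Maps Typesense hits to page-level results for the search UI (assumed result shape).
function toResults(response: TypesenseResponse) {
  return (response.hits ?? []).map((hit) => ({
    id: hit.document.id,
    url: hit.document.url,
    type: "page" as const,
    content: hit.document.title,
  }));
}
```

Inside the `/api/search` route you would call `fetch(buildSearchUrl(config, query), { headers: { "X-TYPESENSE-API-KEY": key } })` on the server, so the API key never reaches the browser, and return `toResults(await res.json())`.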
