# Search

URL: https://whitepaper.designervenkat.online/docs/coding-tutorials/search
Markdown: https://whitepaper.designervenkat.online/llms.mdx/docs/coding-tutorials/search
Site: White Papers - Designer Venkat
Author: Designer Venkat
Language: en

How the full-text index works, how to extend it, and what to do when results look wrong.

Search is one of the few features users notice within seconds. A bad search experience teaches readers to fall back to Google with a `site:` filter; a good one becomes how they navigate the entire library.

## How the index works [#how-the-index-works]

When you run `npm run build`, Fumadocs walks every MDX file under `content/docs/` and extracts structured data:

1. **Title** from frontmatter
2. **Headings**: every `##` and `###`, with their text and anchor IDs
3. **Body**: prose between headings, with HTML tags stripped

This is shipped as a JSON index served by `/api/search`. The client-side search component (mounted in the sidebar) fetches the index on first focus, then runs queries locally. No round trip per keystroke.

A typical documentation site has 50–500 pages, and its index is 100KB–2MB gzipped, small enough to ship to the client. Server-side search makes sense only above roughly 10,000 pages.

## Modes [#modes]

`createSearchAPI` accepts two modes, `simple` and `advanced`. The trade-off is index size versus result quality. This project uses `advanced`:

```ts title="app/api/search/route.ts"
import { source } from "@/lib/source";
import { createSearchAPI } from "fumadocs-core/search/server";

export const { GET } = createSearchAPI("advanced", {
  indexes: source.getPages().map((page) => ({
    title: page.data.title,
    structuredData: page.data.structuredData,
    id: page.url,
    url: page.url,
  })),
});
```

`structuredData` comes from Fumadocs' `remark-structure` plugin, which is enabled by default, so no extra config is needed.

## Customizing relevance [#customizing-relevance]

The default ranking weighs title matches above heading matches, and heading matches above body matches.
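
One way to picture that ordering is a weighted score per page. The sketch below is illustrative only: the `Doc` shape, the values in `WEIGHTS`, and the `score` and `rank` functions are assumptions made for this example, not Fumadocs' actual scoring code.

```typescript
// Illustrative only: these weights and types are assumptions for the example,
// not the scoring Fumadocs really uses.
type Doc = { title: string; headings: string[]; body: string };

const WEIGHTS = { title: 4, heading: 2, body: 1 };

// Case-insensitive substring check; expects q to be lowercased already.
const contains = (text: string, q: string): boolean =>
  text.toLowerCase().indexOf(q) !== -1;

// Add one weight per field (title, headings, body) that contains the query.
function score(doc: Doc, query: string): number {
  const q = query.toLowerCase();
  let total = 0;
  if (contains(doc.title, q)) total += WEIGHTS.title;
  if (doc.headings.some((h) => contains(h, q))) total += WEIGHTS.heading;
  if (contains(doc.body, q)) total += WEIGHTS.body;
  return total;
}

// Drop pages that do not match at all, then sort by descending score.
function rank(docs: Doc[], query: string): Doc[] {
  return docs
    .map((doc) => ({ doc, s: score(doc, query) }))
    .filter((entry) => entry.s > 0)
    .sort((a, b) => b.s - a.s)
    .map((entry) => entry.doc);
}
```

With these assumed weights, a page that matches in its title sorts ahead of any page that matches only in headings or body, which mirrors the default ranking.
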
Two ways to influence it:

### Add tags [#add-tags]

Frontmatter `tags` are searchable with higher weight than body prose:

```mdx
---
title: Consensus Under Network Partitions
description: ...
tags: [distributed-systems, cap, raft]
---
```

Queries for "raft" will now rank this paper near the top even if "raft" doesn't appear in the title.

### Custom synonyms [#custom-synonyms]

For domain-specific abbreviations, expand them at index time:

```ts
indexes: source.getPages().map((page) => ({
  title: page.data.title,
  structuredData: page.data.structuredData,
  id: page.url,
  url: page.url,
  // synonyms() is a helper you define yourself (see the mapping described below)
  extra_tokens: synonyms(page.data.title),
})),
```

Where `synonyms()` maps "LLM" → "large language model", "CRDT" → "conflict-free replicated data type", etc. Worth doing if your domain has 10+ such abbreviations.

## Debugging [#debugging]

If search results look wrong, check three things:

1. **Is the page in the index?** Hit `/api/search?query=` (empty query) and inspect the response. If the page isn't there, the build skipped it, usually because of a parse error in the MDX.
2. **Are headings being extracted?** Check the structured data in `page.data.structuredData`. Headings should appear as `{ type: "heading", content: "..." }` entries.
3. **Is the client cache stale?** The browser caches the index. Hard refresh (Cmd+Shift+R on macOS, Ctrl+Shift+R on Windows and Linux) to bust it.

If you have unpublished work in `content/docs/_drafts/`, the leading underscore tells Fumadocs to ignore the folder. Without it, drafts show up in search and confuse readers.

## Server-side search [#server-side-search]

If the index grows past 5MB, switch to server-side. The pattern:

1. Ship the index to a search service (Algolia, Meilisearch, Typesense)
2. Replace `/api/search` with a proxy to that service
3. Use Fumadocs' `createSearchClient` with a custom fetcher

Typesense is the lightest of the three: self-hostable, fast, with a typo-tolerant engine that matches Algolia's quality on most queries.
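
The proxy in step 2 mostly comes down to two small pieces: building the request to the search service and reshaping its hits for the client. Here is a minimal sketch for Typesense. The collection name, the `title` and `content` fields passed to `query_by`, the document fields, and the `SearchResult` shape are all assumptions for illustration; adjust them to your own schema and to whatever the search client expects.

```typescript
// Sketch only: document fields, query_by fields, and the SearchResult shape
// are assumptions for this example, not values taken from this project.
type TypesenseHit = { document: { id: string; url: string; title: string } };
type TypesenseResponse = { hits?: TypesenseHit[] };
type SearchResult = { id: string; url: string; type: "page"; content: string };

// Build the URL for Typesense's document search endpoint:
// GET /collections/<collection>/documents/search?q=...&query_by=...
function buildSearchUrl(host: string, collection: string, query: string): string {
  return (
    `${host}/collections/${encodeURIComponent(collection)}/documents/search` +
    `?q=${encodeURIComponent(query)}&query_by=title,content`
  );
}

// Reshape Typesense hits into the flat list the client renders.
// A response without a `hits` array becomes an empty list.
function toResults(response: TypesenseResponse): SearchResult[] {
  return (response.hits ?? []).map((hit) => ({
    id: hit.document.id,
    url: hit.document.url,
    type: "page" as const,
    content: hit.document.title,
  }));
}
```

In a route handler, you would fetch `buildSearchUrl(...)` with an `X-TYPESENSE-API-KEY` header, parse the JSON body, and return `toResults(...)` as the response. Keeping both functions pure makes the proxy easy to test without a running search service.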