~/krishna_dhakal
#Azure#Search#OpenAI#Enterprise

Semantic Search over SharePoint with Azure OpenAI

> May 20, 2024

Enterprise document search has always been painful. Traditional keyword search misses context and synonyms. After joining Upland Software, I tackled this head-on.


> The Problem


2 million+ SharePoint documents. Users searching for things like "quarterly budget projections" but documents are titled "Q3 Financial Outlook FY2023".


> The Solution


Azure OpenAI embeddings + Azure AI Search with vector fields:


import { SearchClient, AzureKeyCredential } from "@azure/search-documents";
import { OpenAIClient } from "@azure/openai";

async function embedQuery(query: string): Promise<number[]> {
  const response = await openAIClient.getEmbeddings("text-embedding-ada-002", [query]);
  return response.data[0].embedding;
}

async function semanticSearch(query: string) {
  const vector = await embedQuery(query);
  return await searchClient.search("*", {
    vectorSearchOptions: {
      queries: [{ vector, fields: ["contentVector"], kNearestNeighborsCount: 10 }]
    }
  });
}

> Scale Challenges


  • **Incremental indexing** — crawl only changed documents using SharePoint change tokens.
  • **Rate limiting** — batch embedding calls and respect Azure OpenAI TPM limits.
  • **Cost control** — cache embeddings for unchanged content.

> Results


Search relevance improved dramatically and user satisfaction scores jumped after rollout.