Build

Rough steps will be

Take input
Generate search queries
Map through each query and
- Search the web for a relevant result
- Analyze the result for learnings and follow-up questions
- If depth > 0, follow-up with a new query

e.g. Let's say you're researching "Electric Cars" with depth = 2 and breadth = 3

Level 0 (Initial Query): "Electric Cars"
│
├── Level 1 (depth = 1):
│   ├── Sub-query 1: "Tesla Model 3 specifications"
│   ├── Sub-query 2: "Electric car charging infrastructure"
│   └── Sub-query 3: "Electric vehicle battery technology"
│
└── Level 2 (depth = 2):
    ├── From Sub-query 1:
    │   ├── "Model 3 range capacity"
    │   └── "Model 3 pricing"
    │
    ├── From Sub-query 2:
    │   ├── "Fast charging stations in US"
    │   └── "Home charging installation"
    │
    └── From Sub-query 3:
        ├── "Lithium ion battery lifespan"
        └── "Solid state batteries"

Let's get started!

Start by creating a function to generate search queries.

import { openai } from '@ai-sdk/openai'
import { generateObject } from 'ai'
import { z } from 'zod'
import 'dotenv/config'
 
const mainModel = openai('gpt-4o')
 
const generateSearchQueries = async (query: string, n: number = 3) => {
  const {
    object: { queries },
  } = await generateObject({
    model: mainModel,
    prompt: `Generate ${n} search queries for the following query: ${query}`,
    schema: z.object({
      queries: z.array(z.string()).min(1).max(5),
    }),
  })
  return queries
}

Add a main function to run this function.

const main = async () => {
  const prompt = 'What do you need to be a D1 shotput athlete?'
  const queries = await generateSearchQueries(prompt)
}
 
main()

Add a function to search the web.

import Exa from 'exa-js'
 
const exa = new Exa(process.env.EXA_API_KEY)
 
type SearchResult = {
  title: string
  url: string
  content: string
}
 
const searchWeb = async (query: string) => {
  const { results } = await exa.searchAndContents(query, {
    numResults: 1,
    livecrawl: 'always',
  })
  return results.map(
    (r) =>
      ({
        title: r.title,
        url: r.url,
        content: r.text,
      }) as SearchResult
  )
}

Create a function to orchestrate search.

const searchAndProcess = async (query: string) => {
  const pendingSearchResults: SearchResult[] = []
  const finalSearchResults: SearchResult[] = []
  await generateText({
    model: mainModel,
    prompt: `Search the web for information about ${query}`,
    system:
      'You are a researcher. For each query, search the web and then evaluate if the results are relevant and will help answer the following query',
    maxSteps: 5,
    tools: {
      searchWeb: tool({
        description: 'Search the web for information about a given query',
        parameters: z.object({
          query: z.string().min(1),
        }),
        async execute({ query }) {
          const results = await searchWeb(query)
          pendingSearchResults.push(...results)
          return results
        },
      }),
      evaluate: tool({
        description: 'Evaluate the search results',
        parameters: z.object({}),
        async execute() {
          const pendingResult = pendingSearchResults.pop()!
          const { object: evaluation } = await generateObject({
            model: mainModel,
            prompt: `Evaluate whether the search results are relevant and will help answer the following query: ${query}. If the page already exists in the existing results, mark it as irrelevant.
 
            <search_results>
            ${JSON.stringify(pendingResult)}
            </search_results>
            `,
            output: 'enum',
            enum: ['relevant', 'irrelevant'],
          })
          if (evaluation === 'relevant') {
            finalSearchResults.push(pendingResult)
          }
          console.log('Found:', pendingResult.url)
          console.log('Evaluation completed:', evaluation)
          return evaluation === 'irrelevant'
            ? 'Search results are irrelevant. Please search again with a more specific query.'
            : 'Search results are relevant. End research for this query.'
        },
      }),
    },
  })
  return finalSearchResults
}

Don't forget to add the system prompt!

Update the main function:

const main = async () => {
  const prompt = 'What do you need to be a D1 shotput athlete?'
  const queries = await generateSearchQueries(prompt)
 
  for (const query of queries) {
    console.log(`Searching the web for: ${query}`)
    const searchResults = await searchAndProcess(query)
  }
}
 
main()

Create the function to generate learnings:

const generateLearnings = async (query: string, searchResult: SearchResult) => {
  const { object } = await generateObject({
    model: mainModel,
    prompt: `The user is researching "${query}". The following search result were deemed relevant.
    Generate a learning and a follow-up question from the following search result:
 
    <search_result>
    ${JSON.stringify(searchResult)}
    </search_result>
    `,
    schema: z.object({
      learning: z.string(),
      followUpQuestions: z.array(z.string()),
    }),
  })
  return object
}

Update the main function:

const main = async () => {
  const prompt = 'What do you need to be a D1 shotput athlete?'
  const queries = await generateSearchQueries(prompt)
 
  for (const query of queries) {
    console.log(`Searching the web for: ${query}`)
    const searchResults = await searchAndProcess(query)
    for (const searchResult of searchResults) {
      console.log(`Processing search result: ${searchResult.url}`)
      const learnings = await generateLearnings(query, searchResult)
    }
  }
}
 
main()

Refactor main function to allow for recursion:

const deepResearch = async (
  query: string,
  depth: number = 1,
  breadth: number = 3
) => {
  const queries = await generateSearchQueries(query)
 
  for (const query of queries) {
    console.log(`Searching the web for: ${query}`)
    const searchResults = await searchAndProcess(query)
    for (const searchResult of searchResults) {
      console.log(`Processing search result: ${searchResult.url}`)
      const learnings = await generateLearnings(query, searchResult)
      // call deepResearch recursively with decrementing depth and breadth
    }
  }
}
 
const main = async () => {
  const prompt = 'What do you need to be a D1 shotput athlete?'
  const research = await deepResearch(prompt)
}
 
main()

Create an accumulated research object to store all learnings and follow-up questions.

type Learning = {
  learning: string
  followUpQuestions: string[]
}
 
type Research = {
  query: string | undefined
  queries: string[]
  searchResults: SearchResult[]
  learnings: Learning[]
  completedQueries: string[]
}
 
const accumulatedResearch: Research = {
  query: undefined,
  queries: [],
  searchResults: [],
  learnings: [],
  completedQueries: [],
}

Update accumulatedResearch throughout process

const deepResearch = async (
  prompt: string,
  depth: number = 2,
  breadth: number = 2
) => {
  if (!accumulatedResearch.query) {

    accumulatedResearch.query = prompt 
  } 
 
  const queries = await generateSearchQueries(prompt, breadth)
  accumulatedResearch.queries = queries 
 
  for (const query of queries) {
    console.log(`Searching the web for: ${query}`)
    const searchResults = await searchAndProcess(query)
    accumulatedResearch.searchResults.push(...searchResults) 
    for (const searchResult of searchResults) {
      console.log(`Processing search result: ${searchResult.url}`)
      const learnings = await generateLearnings(query, searchResult)
      accumulatedResearch.learnings.push(learnings) 
      accumulatedResearch.completedQueries.push(query) 
 
      // call deepResearch recursively with decrementing depth and breadth
    }
  }
}

Add recursion.

const deepResearch = async (
  prompt: string,
  depth: number = 2,
  breadth: number = 2
) => {
  if (!accumulatedResearch.query) {
    accumulatedResearch.query = prompt
  }
 
  if (depth === 0) {
    return accumulatedResearch
  }
 
  const queries = await generateSearchQueries(prompt, breadth)
  accumulatedResearch.queries = queries
 
  for (const query of queries) {
    console.log(`Searching the web for: ${query}`)
    const searchResults = await searchAndProcess(query)
    accumulatedResearch.searchResults.push(...searchResults)
    for (const searchResult of searchResults) {
      console.log(`Processing search result: ${searchResult.url}`)
      const learnings = await generateLearnings(query, searchResult)
      accumulatedResearch.learnings.push(learnings)
      accumulatedResearch.completedQueries.push(query)
 
      const newQuery = `Overall research goal: ${prompt}
        Previous search queries: ${accumulatedResearch.completedQueries.join(', ')}

        Follow-up questions: ${learnings.followUpQuestions.join(', ')}
        `
      await deepResearch(newQuery, depth - 1, Math.ceil(breadth / 2)) 
    }
  }
  return accumulatedResearch
}

Update searchAndProcess to take in previous queries.

const searchAndProcess = async (
  query: string,
  accumulatedSources: SearchResult[] 
) => {
  const pendingSearchResults: SearchResult[] = []
  const finalSearchResults: SearchResult[] = []
  await generateText({
    model: mainModel,
    prompt: `Search the web for information about ${query}`,
    system:
      'You are a researcher. For each query, search the web and then evaluate if the results are relevant and will help answer the following query',
    maxSteps: 5,
    tools: {
      searchWeb: tool({
        description: 'Search the web for information about a given query',
        parameters: z.object({
          query: z.string().min(1),
        }),
        async execute({ query }) {
          const results = await searchWeb(query)
          pendingSearchResults.push(...results)
          return results
        },
      }),
      evaluate: tool({
        description: 'Evaluate the search results',
        parameters: z.object({}),
        async execute() {
          const pendingResult = pendingSearchResults.pop()!
          const { object: evaluation } = await generateObject({
            model: mainModel,
            prompt: `Evaluate whether the search results are relevant and will help answer the following query: ${query}. If the page already exists in the existing results, mark it as irrelevant.
 
            <search_results>
            ${JSON.stringify(pendingResult)}
            </search_results>
 
            <existing_results>
            ${JSON.stringify(accumulatedSources.map((result) => result.url))}
            </existing_results>
 
            `,
            output: 'enum',
            enum: ['relevant', 'irrelevant'],
          })
          if (evaluation === 'relevant') {
            finalSearchResults.push(pendingResult)
          }
          console.log('Found:', pendingResult.url)
          console.log('Evaluation completed:', evaluation)
          return evaluation === 'irrelevant'
            ? 'Search results are irrelevant. Please search again with a more specific query.'
            : 'Search results are relevant. End research for this query.'
        },
      }),
    },
  })
  return finalSearchResults
}

Update deepResearch function

const deepResearch = async (
  prompt: string,
  depth: number = 2,
  breadth: number = 2
) => {
  if (!accumulatedResearch.query) {
    accumulatedResearch.query = prompt
  }
 
  if (depth === 0) {
    return accumulatedResearch
  }
 
  const queries = await generateSearchQueries(prompt, breadth)
  accumulatedResearch.queries = queries
 
  for (const query of queries) {
    console.log(`Searching the web for: ${query}`)
    const searchResults = await searchAndProcess(

      query, 
      accumulatedResearch.searchResults 
    ) 
    accumulatedResearch.searchResults.push(...searchResults)
    for (const searchResult of searchResults) {
      console.log(`Processing search result: ${searchResult.url}`)
      const learnings = await generateLearnings(query, searchResult)
      accumulatedResearch.learnings.push(learnings)
      accumulatedResearch.completedQueries.push(query)
 
      const newQuery = `Overall research goal: ${prompt}
        Previous search queries: ${accumulatedResearch.completedQueries.join(', ')}
 
        Follow-up questions: ${learnings.followUpQuestions.join(', ')}
        `
      await deepResearch(newQuery, depth - 1, Math.ceil(breadth / 2))
    }
  }
  return accumulatedResearch
}

Create generateReport function

const generateReport = async (research: Research) => {
  const { text } = await generateText({
    model: openai('o3-mini'),
    prompt:
      'Generate a report based on the following research data:\n\n' +
      JSON.stringify(research, null, 2),
  })
  return text
}

Update main function

const main = async () => {
  const research = await deepResearch(
    'What do you need to be a D1 shotput athlete?'
  )
  console.log('Research completed!')
  console.log('Generating report...')
  const report = await generateReport(research)
  console.log('Report generated! report.md')
  fs.writeFileSync('report.md', report)
}
 
main()

Update system prompt:

const SYSTEM_PROMPT = `You are an expert researcher. Today is ${new Date().toISOString()}. Follow these instructions when responding:
  - You may be asked to research subjects that is after your knowledge cutoff, assume the user is right when presented with news.
  - The user is a highly experienced analyst, no need to simplify it, be as detailed as possible and make sure your response is correct.
  - Be highly organized.
  - Suggest solutions that I didn't think about.
  - Be proactive and anticipate my needs.
  - Treat me as an expert in all subject matter.
  - Mistakes erode my trust, so be accurate and thorough.
  - Provide detailed explanations, I'm comfortable with lots of detail.
  - Value good arguments over authorities, the source is irrelevant.
  - Consider new technologies and contrarian ideas, not just the conventional wisdom.
  - You may use high levels of speculation or prediction, just flag it for me.
  - Use Markdown formatting.`
 
const generateReport = async (research: Research) => {
  const { text } = await generateText({
    model: openai('o3-mini'),
    system: SYSTEM_PROMPT,
    prompt:
      'Generate a report based on the following research data:\n\n' +
      JSON.stringify(research, null, 2),
  })
  return text
}