How to Integrate Bright Data With Vercel AI SDK

Vercel AI SDK is a TypeScript toolkit for building AI applications with React, Next.js, Vue, Svelte, Node.js, and more. It provides a unified API for working with different AI providers and includes utilities for streaming, function calling, and building conversational interfaces. The @brightdata/ai-sdk package gives you drop-in tools for web scraping, search, and structured dataset collection — no manual wiring required.

Steps to Get Started

1. Prerequisites

2. Installation

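Install the Bright Data AI SDK package alongside the Vercel AI SDK. The examples on this page also import the OpenAI provider (@ai-sdk/openai), so it is included here:
npm install @brightdata/ai-sdk ai zod @ai-sdk/openai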
Set your API key as an environment variable:
.env.local
BRIGHTDATA_API_KEY=your_api_key_here

3. Import and Use

Import the tools you need directly from @brightdata/ai-sdk and pass them to any Vercel AI SDK call. No additional setup files or wrappers needed — each tool is a factory function that reads your API key automatically from BRIGHTDATA_API_KEY.
import { scrape } from '@brightdata/ai-sdk'
import { generateText, stepCountIs } from 'ai'
import { openai } from '@ai-sdk/openai'

const { text } = await generateText({
  model: openai('gpt-4o'),
  prompt: 'Scrape https://news.ycombinator.com and summarize the top 5 stories',
  tools: {
    scrape: scrape(),
  },
  stopWhen: stepCountIs(5),
})

console.log(text)
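
Because each tool factory reads BRIGHTDATA_API_KEY from the environment, it can help to fail fast at startup when the variable is missing rather than on the first tool call. The sketch below is one possible guard. requireApiKey is a hypothetical helper and is not part of @brightdata/ai-sdk.

```typescript
// Hypothetical helper (not part of @brightdata/ai-sdk): checks that the
// API key variable is present and non-empty before any tool is created.
type Env = Record<string, string | undefined>

function requireApiKey(env: Env, name = 'BRIGHTDATA_API_KEY'): string {
  const value = env[name]?.trim()
  if (!value) {
    throw new Error(`Missing ${name}. Set it in .env.local or your deployment environment.`)
  }
  return value
}
```

In a Node.js or Next.js server context you would typically call requireApiKey(process.env) once, before building the tools object.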

4. Usage Examples

Create an API route that uses Bright Data tools with any AI provider:
app/api/chat/route.ts
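import { openai } from '@ai-sdk/openai'
import { convertToModelMessages, stepCountIs, streamText, type UIMessage } from 'ai'
import { scrape, search } from '@brightdata/ai-sdk'

// Allow streaming responses for up to 60 seconds
export const maxDuration = 60

export async function POST(req: Request) {
  const { messages }: { messages: UIMessage[] } = await req.json()

  const result = streamText({
    model: openai('gpt-4o'),
    // useChat sends UI messages; convert them to model messages first.
    // The await is harmless if your SDK version returns the array synchronously.
    messages: await convertToModelMessages(messages),
    tools: {
      scrape: scrape(),
      search: search(),
    },
    // Let the model chain up to 10 tool-calling steps
    stopWhen: stepCountIs(10),
  })

  // stopWhen and stepCountIs are AI SDK 5 APIs, so use the matching v5 response helper
  return result.toUIMessageStreamResponse()
}
Then use it in your component (install @ai-sdk/react if it is not already a dependency):
app/page.tsx
'use client'

import { useState } from 'react'
import { useChat } from '@ai-sdk/react'

export default function Chat() {
  // In AI SDK 5, useChat no longer manages the input field, so keep it in local state
  const [input, setInput] = useState('')
  const { messages, sendMessage } = useChat()

  return (
    <div className="flex flex-col h-screen">
      <div className="flex-1 overflow-y-auto p-4">
        {messages.map((m) => (
          <div key={m.id} className="mb-4">
            <strong>{m.role === 'user' ? 'You: ' : 'AI: '}</strong>
            {/* Messages are made of parts; render only the text parts here */}
            {m.parts.map((part, i) =>
              part.type === 'text' ? <span key={i}>{part.text}</span> : null
            )}
          </div>
        ))}
      </div>
      <form
        onSubmit={(e) => {
          e.preventDefault()
          if (!input.trim()) return
          sendMessage({ text: input })
          setInput('')
        }}
        className="p-4 border-t"
      >
        <input
          value={input}
          onChange={(e) => setInput(e.target.value)}
          placeholder="Try: 'Scrape https://example.com' or 'Search for best laptops 2025'"
          className="w-full p-2 border rounded"
        />
      </form>
    </div>
  )
}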

Available Tools

All tools are factory functions — call them with an optional config object (or no arguments at all to use env defaults):
| Tool | Import | Description |
| --- | --- | --- |
| scrape() | import { scrape } from '@brightdata/ai-sdk' | Scrape any website and return clean markdown. Bypasses anti-bot protection and CAPTCHAs. |
| search() | import { search } from '@brightdata/ai-sdk' | Search Google, Bing, or Yandex with anti-bot bypass. Returns markdown results. |
| amazon_product() | import { amazon_product } from '@brightdata/ai-sdk' | Get detailed Amazon product info: price, ratings, reviews, and specs. |
| linkedin_profile() | import { linkedin_profile } from '@brightdata/ai-sdk' | Fetch LinkedIn profile data including experience, education, and skills. |
| linkedin_jobs() | import { linkedin_jobs } from '@brightdata/ai-sdk' | Search LinkedIn job postings by location and keyword. |
| instagram_profile() | import { instagram_profile } from '@brightdata/ai-sdk' | Get Instagram profile info and recent posts. |
| facebook_profile() | import { facebook_profile } from '@brightdata/ai-sdk' | Collect Facebook profile data from a profile URL. |
| chatgpt() | import { chatgpt } from '@brightdata/ai-sdk' | Query ChatGPT via Bright Data’s dataset API with optional web search mode. |

Tool Reference

scrape(options?)

Scrape any public webpage and get back clean markdown (or HTML).
import { scrape } from '@brightdata/ai-sdk'

const tool = scrape({
  api_key: process.env.BRIGHTDATA_API_KEY, // optional; defaults to the BRIGHTDATA_API_KEY env var
  data_format: 'markdown', // or 'html'
  country: 'us',           // default proxy country (2-letter code)
})
LLM input schema:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | The URL to scrape |
| country | string | No | 2-letter country code for the proxy exit node |
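
If you call the tool from your own code rather than leaving argument selection entirely to the model, a small pre-flight check can catch malformed input early. The following is a minimal sketch based only on the schema above. checkScrapeInput and ScrapeInput are hypothetical names and not part of @brightdata/ai-sdk, and the tool itself may apply different or stricter validation.

```typescript
// Hypothetical pre-flight check mirroring the scrape input schema above.
// Not part of @brightdata/ai-sdk.
interface ScrapeInput {
  url: string
  country?: string
}

function checkScrapeInput(input: ScrapeInput): string[] {
  const problems: string[] = []
  // Require an absolute http(s) URL with no whitespace
  if (!/^https?:\/\/\S+$/i.test(input.url)) {
    problems.push('url must be an absolute http(s) URL')
  }
  // country is optional, but when present it should be a 2-letter code
  if (input.country !== undefined && !/^[a-z]{2}$/i.test(input.country)) {
    problems.push('country must be a 2-letter code')
  }
  return problems
}
```

An empty array means the input passed both checks; otherwise each entry describes one problem.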

search(options?)

Search the web via Google, Bing, or Yandex.
import { search } from '@brightdata/ai-sdk'

const tool = search({
  api_key: process.env.BRIGHTDATA_API_KEY, // optional; this is already the default
  search_engine: 'google', // 'google' | 'bing' | 'yandex'
  data_format: 'markdown', // or 'html'
  country: 'us',
})
LLM input schema:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| query | string | Yes | The search query |
| search_engine | string | No | 'google' (default), 'bing', or 'yandex' |
| country | string | No | 2-letter country code for localized results |
| data_format | string | No | 'markdown' (default) or 'html' |
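
When the search engine comes from user input or configuration, it can be useful to narrow it to the three documented values before passing it to search(). The sketch below assumes only the values listed in the table above. SearchEngine and toSearchEngine are hypothetical names, not exports of @brightdata/ai-sdk.

```typescript
// Hypothetical helper: narrows arbitrary input to one of the documented
// search engines, falling back to the documented default ('google').
type SearchEngine = 'google' | 'bing' | 'yandex'

const SEARCH_ENGINES: readonly SearchEngine[] = ['google', 'bing', 'yandex']

function toSearchEngine(value: string | undefined): SearchEngine {
  const normalized = (value ?? '').trim().toLowerCase()
  const match = SEARCH_ENGINES.find((engine) => engine === normalized)
  return match ?? 'google'
}
```

The result could then be passed as the search_engine option, for example search({ search_engine: toSearchEngine(userChoice) }), where userChoice is whatever value your application collects.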

amazon_product(options?)

Retrieve structured Amazon product data.
import { amazon_product } from '@brightdata/ai-sdk'

const tool = amazon_product({ api_key: process.env.BRIGHTDATA_API_KEY }) // api_key is optional
LLM input schema:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | Amazon product URL (must include /dp/ or /gp/product/) |
| zipcode | string | No | ZIP code for location-specific pricing |
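
The URL requirement above is simple to check before a request is made. This sketch tests only for the two path segments named in the table and does not verify that the host is an Amazon domain. hasAmazonProductPath is a hypothetical helper, not part of @brightdata/ai-sdk.

```typescript
// Hypothetical helper: true when the URL contains one of the product path
// segments the amazon_product tool expects (/dp/ or /gp/product/).
function hasAmazonProductPath(url: string): boolean {
  return url.includes('/dp/') || url.includes('/gp/product/')
}
```

A search results URL, for instance, contains neither segment and would be rejected by this check.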

linkedin_profile(options?)

Collect detailed LinkedIn profile data for one or more profiles.
import { linkedin_profile } from '@brightdata/ai-sdk'

const tool = linkedin_profile({
  api_key: process.env.BRIGHTDATA_API_KEY, // optional; this is already the default
  format: 'json', // 'json' | 'jsonl'
})
LLM input schema:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| urls | string[] | Yes | Array of LinkedIn profile URLs (min 1) |
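
Since urls must contain at least one entry, it may be worth cleaning a list before it reaches the tool. The sketch below trims entries, drops blanks and duplicates, and enforces the minimum of one. normalizeProfileUrls is a hypothetical helper, not part of @brightdata/ai-sdk, and the example URLs in the usage note are placeholders.

```typescript
// Hypothetical helper: trims, de-duplicates, and enforces the "min 1"
// rule from the schema above. Not part of @brightdata/ai-sdk.
function normalizeProfileUrls(urls: string[]): string[] {
  const cleaned = urls.map((u) => u.trim()).filter((u) => u.length > 0)
  // A Set preserves insertion order, so the first occurrence of each URL wins
  const unique = Array.from(new Set(cleaned))
  if (unique.length === 0) {
    throw new Error('At least one LinkedIn profile URL is required')
  }
  return unique
}
```

For example, passing two copies of https://www.linkedin.com/in/example-a plus an empty string would yield a single-element array.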

linkedin_jobs(options?)

Search LinkedIn job postings by location and keyword.
import { linkedin_jobs } from '@brightdata/ai-sdk'

const tool = linkedin_jobs({ api_key: process.env.BRIGHTDATA_API_KEY }) // api_key is optional
LLM input schema:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| location | string | Yes | Job location, e.g. "New York, NY" |
| keyword | string | No | Job title or search keyword |
| country | string | No | 2-letter country code |

instagram_profile(options?)

Fetch Instagram profile info and recent posts.
import { instagram_profile } from '@brightdata/ai-sdk'

const tool = instagram_profile({ api_key: process.env.BRIGHTDATA_API_KEY }) // api_key is optional
LLM input schema:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | Instagram profile URL |

facebook_profile(options?)

Collect Facebook profile data.
import { facebook_profile } from '@brightdata/ai-sdk'

const tool = facebook_profile({
  api_key: process.env.BRIGHTDATA_API_KEY, // optional; this is already the default
  format: 'json', // 'json' | 'jsonl'
})
LLM input schema:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | Facebook profile URL |

chatgpt(options?)

Query ChatGPT via Bright Data’s ChatGPT dataset API with optional web search.
import { chatgpt } from '@brightdata/ai-sdk'

const tool = chatgpt({
  api_key: process.env.BRIGHTDATA_API_KEY, // optional; this is already the default
  format: 'json', // 'json' | 'jsonl'
})
LLM input schema:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| prompt | string | Yes | Prompt to send to ChatGPT |
| additional_prompt | string | No | Optional follow-up prompt |
| country | string | No | 2-letter country code |
| require_sources | boolean | No | Fail when no sources are found |
| web_search | boolean | No | Enable ChatGPT web search mode |

Example Output

Scraping Example

Prompt: “Scrape https://example.com and tell me what it’s about”
AI Response:
I've scraped the website. Here's what I found:

# Example Domain

This domain is for use in illustrative examples in documents. You may use this
domain in literature without prior coordination or asking for permission.

The website appears to be a placeholder domain used for documentation and
examples. It's maintained by IANA (Internet Assigned Numbers Authority) and
serves as a standard example domain that can be referenced in documentation
without needing permission.

Search Example

Prompt: “Search for best mechanical keyboards 2025”
AI Response:
I found several highly-rated mechanical keyboards for 2025:

1. **Keychron Q1 Pro** — Premium 75% layout, hot-swappable switches,
   wireless. ~$200.

2. **Wooting 60HE** — Analog switches with adjustable actuation points,
   popular among gamers.

3. **GMMK Pro** — 75% gasket-mounted, extensive customization options.

Amazon Product Example

Prompt: “Get info about https://www.amazon.com/dp/B0D2Q9397Y”
AI Response:
Product: Logitech MX Master 3S Wireless Mouse
Price: $99.99
Rating: 4.6/5 (8,234 reviews)

Key Features:
- 8K DPI sensor
- Quiet clicks
- USB-C charging
- Connects to up to 3 devices

Verdict: Excellent choice for productivity users. Premium price, but
justified by ergonomics and multi-device support.

Best Practices

  1. API key management — Use BRIGHTDATA_API_KEY in your environment; avoid hardcoding keys.
  2. Error handling — All tools catch errors internally and return a descriptive string, so LLM loops won’t crash.
  3. Data format — Use markdown for scraping to get clean, LLM-friendly content.
  4. Multi-step agents — Set stopWhen: stepCountIs(N) to let the model chain tool calls autonomously.
  5. Country targeting — Pass a 2-letter country code to get geo-specific results or pricing.
  6. Async datasets — For large dataset jobs (many LinkedIn profiles, etc.), consider setting async: true in the underlying SDK client to avoid timeouts.

Environment Variables

.env.local
BRIGHTDATA_API_KEY=your_api_key_here
Get your API key from the Bright Data Dashboard.