The Prompt API  |  AI on Chrome  |  Chrome for Developers

Thomas Steiner

Alexandra Klepper

Published: May 20, 2025, Last Updated: September 21, 2025

With the Prompt API, you can send natural language requests to Gemini Nano in the browser.

There are several ways you can use the Prompt API. For example, you can build:

  • AI-powered search: Answer questions based on the content of a web page.
  • Personalized news feed: Create a feed that dynamically categorizes articles and lets users filter for that content.
  • Custom content filters: Analyze news articles and automatically blur or hide content based on user-defined topics.
  • Calendar event creation: Develop a Chrome Extension that automatically extracts event details from web pages, so users can create calendar entries in just a few steps.
  • Seamless contact extraction: Build an extension that extracts contact information from websites, making it easy for users to contact a business or add details to their contacts list.

These are just some possibilities, and we’re excited to see what you create.

Review hardware requirements

The following requirements apply to developers and to users operating features built with these APIs in Chrome. Other browsers may have different operating requirements.

The Language Detector and Translator APIs work in Chrome on desktop. These APIs do not work on mobile devices.

The Prompt API, Summarizer API, Writer API, Rewriter API, and Proofreader API work in Chrome when the following conditions are met:

  • Operating system: Windows 10 or 11; macOS 13+ (Ventura and onwards); Linux; or ChromeOS (platform 16389.0.0 and later) on Chromebook Plus devices. Chrome for Android, iOS, and ChromeOS on non-Chromebook Plus devices are not yet supported by the APIs backed by Gemini Nano.
  • Storage: At least 22 GB of free space on the volume that contains your Chrome profile.
  • GPU or CPU: The built-in models can run with either a GPU or a CPU.
    • GPU: Strictly more than 4 GB of VRAM.
    • CPU: 16 GB of RAM or more and 4 or more CPU cores.
    • Note: The Prompt API with audio input requires a GPU.
  • Network: Unlimited data or an unmetered connection.

The exact size of Gemini Nano may vary as the browser updates the model. To determine the current size, go to chrome://on-device-internals.

Use the Prompt API

The Prompt API uses the Gemini Nano model in Chrome. While the API is built into Chrome, the model is downloaded separately the first time the API is used. Before using this API, acknowledge Google’s Generative AI Prohibited Use Policy.

To determine if the model is ready for use, call
LanguageModel.availability().

const availability = await LanguageModel.availability({
  // The same options in `prompt()` or `promptStreaming()`
});

To trigger the model download and create the language model session, check for user activation. Then, call the create() function.

const session = await LanguageModel.create({
  monitor(m) {
    m.addEventListener('downloadprogress', (e) => {
      console.log(`Downloaded ${e.loaded * 100}%`);
    });
  },
});


If the response to availability() was downloading, listen for download progress and inform the user, as the download may take time.
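As a sketch of how an app might surface the availability states to users, the helper below (its name and messages are illustrative, not part of the API) maps each value that LanguageModel.availability() can return to a user-facing message:

```javascript
// Map each availability state returned by LanguageModel.availability()
// to a user-facing status message. (Helper name is illustrative.)
function describeAvailability(state) {
  switch (state) {
    case 'available':
      return 'Model is ready to use.';
    case 'downloadable':
      return 'Model must be downloaded before first use.';
    case 'downloading':
      return 'Model download is in progress.';
    default:
      // Covers 'unavailable'.
      return 'Model is unavailable on this device.';
  }
}

// In a supporting browser (sketch):
// const state = await LanguageModel.availability();
// statusElement.textContent = describeAvailability(state);
```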

Use on localhost

All built-in AI APIs are available on localhost in Chrome. Set the following flags to Enabled:

  • chrome://flags/#optimization-guide-on-device-model
  • chrome://flags/#prompt-api-for-gemini-nano-multimodal-input

Then, click Relaunch or restart Chrome. If you encounter errors, troubleshoot localhost.

Model parameters

The params() function informs you of the language model's parameters. The returned object has the following fields:

  • defaultTopK: Default top-K value.
  • maxTopK: Maximum top-K value.
  • defaultTemperature: Default temperature.
  • maxTemperature: Maximum temperature.

// Only available when using the Prompt API for Chrome Extensions.
await LanguageModel.params();
// {defaultTopK: 3, maxTopK: 128, defaultTemperature: 1, maxTemperature: 2}

Create a session

Once the Prompt API can run, you create a session with the create() function.

const session = await LanguageModel.create();

Create a session with the Prompt API for Chrome Extensions

When you use the Prompt API for Chrome Extensions, each session can customize topK and temperature using an optional options object. The default values for these parameters are returned by LanguageModel.params().

// Only available when using the Prompt API for Chrome Extensions.
const params = await LanguageModel.params();
// Initializing a new session must either specify both `topK` and
// `temperature` or neither of them.
// Only available when using the Prompt API for Chrome Extensions.
const slightlyHighTemperatureSession = await LanguageModel.create({
  // Raise the temperature slightly, clamped to the maximum the model allows.
  temperature: Math.min(params.defaultTemperature * 1.2, params.maxTemperature),
  topK: params.defaultTopK,
});


The create() function also takes an optional options object with a signal field, which lets you pass an AbortSignal to destroy the session.

const controller = new AbortController();
stopButton.onclick = () => controller.abort();

const session = await LanguageModel.create({
  signal: controller.signal,
});

Add context with initial prompts

With initial prompts, you can provide the language model with context about previous interactions, for example, to allow the user to resume a stored session after a browser restart.

const session = await LanguageModel.create({
  initialPrompts: [
    { role: 'system', content: 'You are a helpful and friendly assistant.' },
    { role: 'user', content: 'What is the capital of Italy?' },
    { role: 'assistant', content: 'The capital of Italy is Rome.' },
    { role: 'user', content: 'What language is spoken there?' },
    {
      role: 'assistant',
      content: 'The official language of Italy is Italian. [...]',
    },
  ],
});

Limit responses by prefix

You can add an "assistant"-role message, in addition to the previous roles, to continue or elaborate on the model's previous responses. For example:

const followup = await session.prompt([
  {
    role: "user",
    content: "I'm nervous about my presentation tomorrow"
  },
  {
    role: "assistant",
    content: "Presentations are tough!"
  }
]);


In some cases, instead of requesting a new response, you may want to prefill part of the "assistant"-role response message. This can help guide the language model to use a specific response format. To do this, add prefix: true to the trailing "assistant"-role message. For example:

const characterSheet = await session.prompt([
  {
    role: 'user',
    content: 'Create a TOML character sheet for a gnome barbarian',
  },
  {
    role: 'assistant',
    content: '```toml\n',
    prefix: true,
  },
]);

Add expected inputs and outputs

The Prompt API has multimodal capabilities and supports multiple languages. Set the expectedInputs and expectedOutputs modalities and languages when you create your session.

  • type: The expected modality.
    • For expectedInputs, this can be text, image, or audio.
    • For expectedOutputs, the Prompt API allows text only.
  • languages: An array to set the expected language or languages. The Prompt API accepts "en", "ja", and "es". Support for additional languages is in development.
    • For expectedInputs, set the system prompt language and one or more expected user prompt languages.
    • Set one or more expectedOutputs languages.

const session = await LanguageModel.create({
  expectedInputs: [
    { type: "text", languages: ["en" /* system prompt */, "ja" /* user prompt */] }
  ],
  expectedOutputs: [
    { type: "text", languages: ["ja"] }
  ]
});


You may get a "NotSupportedError" DOMException if the model encounters an unsupported input or output.

Multimodal capabilities

With these capabilities, you can:

  • Allow users to transcribe audio messages sent in chat applications.
  • Describe an image uploaded to your website for use in a caption or alt text.

Take a look at the MediaRecorder Audio Prompt demo to use the Prompt API with audio input and the Canvas Image Prompt demo to use the Prompt API with image input.

The Prompt API accepts audio and image inputs in addition to text.

The following snippet shows a multimodal session: the first prompt passes two images (an image Blob and an HTMLCanvasElement) for the model to compare, and the second lets the user respond to the critique with an audio recording (an AudioBuffer).

const session = await LanguageModel.create({
  expectedInputs: [
    { type: "text", languages: ["en"] },
    { type: "audio" },
    { type: "image" },
  ],
  expectedOutputs: [{ type: "text", languages: ["en"] }],
});

const referenceImage = await (await fetch("reference-image.jpeg")).blob();
const userDrawnImage = document.querySelector("canvas");

const response1 = await session.prompt([
  {
    role: "user",
    content: [
      {
        type: "text",
        value:
          "Give a helpful artistic critique of how well the second image matches the first:",
      },
      { type: "image", value: referenceImage },
      { type: "image", value: userDrawnImage },
    ],
  },
]);
console.log(response1);

const audioBuffer = await captureMicrophoneInput({ seconds: 10 });

const response2 = await session.prompt([
  {
    role: "user",
    content: [
      { type: "text", value: "My response to your critique:" },
      { type: "audio", value: audioBuffer },
    ],
  },
]);
console.log(response2);

Append messages

Inference may take some time, especially with multimodal prompts. It can be useful to send predetermined prompts in advance to populate the session, so the model can get a head start on processing.

While initialPrompts is useful at session creation, the append() method can be used in addition to the prompt() or promptStreaming() methods to provide additional contextual prompts after the session has been created.

For example:

const session = await LanguageModel.create({
  initialPrompts: [
    {
      role: 'system',
      content:
        'You are a skilled analyst who correlates patterns across multiple images.',
    },
  ],
  expectedInputs: [{ type: 'image' }],
});

fileUpload.onchange = async () => {
  await session.append([
    {
      role: 'user',
      content: [
        {
          type: 'text',
          value: `Here's one image. Notes: ${fileNotesInput.value}`,
        },
        { type: 'image', value: fileUpload.files[0] },
      ],
    },
  ]);
};

analyzeButton.onclick = async (e) => {
  analysisResult.textContent = await session.prompt(userQuestionInput.value);
};


The promise returned by append() fulfills once the prompt has been validated, processed, and appended to the session. The promise is rejected if the prompt cannot be appended.

Pass a JSON Schema

Add a responseConstraint field to the prompt() or promptStreaming() method and pass a JSON Schema as its value. You can then use structured output with the Prompt API.

In the following example, the JSON Schema ensures that the model responds with true or false to classify whether a given message is about pottery.

const session = await LanguageModel.create();

const schema = {
  "type": "boolean"
};

const post = "Mugs and ramen bowls, both a bit smaller than intended, but " +
  "that happens with reclaim. Glaze crawled the first time around, but " +
  "pretty happy with it after refiring.";

const result = await session.prompt(
  `Is this post about pottery?\n\n${post}`,
  {
    responseConstraint: schema,
  }
);
console.log(JSON.parse(result));
// true


Your implementation may include the JSON Schema or a regular expression as part of the message sent to the model, which uses some of the context window. You can measure how much of the context window it will use by passing the responseConstraint option to session.measureInputUsage().
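As an illustration, a small helper (hypothetical, not part of the API) can check whether a measured prompt, including any schema overhead, still fits in the session's remaining quota:

```javascript
// Check whether a prompt of `inputUsage` tokens fits in what's left of the
// session's context window. In the browser, `inputUsage` would come from the
// session's usage-measurement method described above, and the other two
// values from session.contextUsage and session.contextWindow.
function promptFits(inputUsage, contextUsage, contextWindow) {
  return inputUsage <= contextWindow - contextUsage;
}

// Sketch of browser usage:
// const usage = /* measured token usage of the prompt plus its schema */;
// if (!promptFits(usage, session.contextUsage, session.contextWindow)) {
//   console.warn('This prompt may overflow the context window.');
// }
```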

You can avoid sending the schema to the model with the omitResponseConstraintInput option. If you do so, we recommend that you include some guidance in the prompt:

const result = await session.prompt(`
  Summarize this feedback into a rating between 0-5. Only output a JSON
  object { rating }, with a single property whose value is a number:
  The food was delicious, service was excellent, will recommend.
`, { responseConstraint: schema, omitResponseConstraintInput: true });

Prompt the model

You can prompt the model with either the prompt() or the promptStreaming() function.

Non-streamed output

If you expect a short result, you can use the prompt() function, which returns the response once it's available.

// Start by checking if it's possible to create a session based on the
// availability of the model, and the characteristics of the device.
const available = await LanguageModel.availability({
  expectedInputs: [{type: 'text', languages: ['en']}],
  expectedOutputs: [{type: 'text', languages: ['en']}],
});

if (available !== 'unavailable') {
  const session = await LanguageModel.create();

  // Prompt the model and wait for the whole result to come back.
  const result = await session.prompt('Write me a poem!');
  console.log(result);
}

Streamed output

If you expect a longer response, you should use the promptStreaming() function, which lets you show partial results as they arrive from the model. The promptStreaming() function returns a ReadableStream.

const available = await LanguageModel.availability({
  expectedInputs: [{type: 'text', languages: ['en']}],
  expectedOutputs: [{type: 'text', languages: ['en']}],
});
if (available !== 'unavailable') {
  const session = await LanguageModel.create();

  // Prompt the model and stream the result:
  const stream = session.promptStreaming('Write me an extra-long poem!');
  for await (const chunk of stream) {
    console.log(chunk);
  }
}

Stop running a prompt

Both prompt() and promptStreaming() accept an optional second parameter with a signal field, which lets you stop a running prompt.

const controller = new AbortController();
stopButton.onclick = () => controller.abort();

const result = await session.prompt('Write me a poem!', {
  signal: controller.signal,
});

Session management

Each session keeps track of the context of the conversation. Previous interactions are taken into account for future interactions until the session's context window is full.

Each session has a maximum number of tokens that it can process. Check your progress toward this limit with the following:

console.log(`${session.contextUsage}/${session.contextWindow}`);


It's possible to send a prompt that causes the context window to overflow. In that case, the earliest parts of the conversation with the language model are removed, one prompt-and-response pair at a time, until enough tokens are available to process the new prompt. The exception is the system prompt, which is never removed.

You can detect overflow by listening for the contextoverflow event on the session:

session.addEventListener("contextoverflow", () => {
  console.log("We've gone past the context window, and some inputs will be dropped!");
});


If it's not possible to remove enough tokens from the conversation history to process the new prompt, the prompt() or promptStreaming() call fails with a QuotaExceededError exception and nothing is removed. The QuotaExceededError has the following properties:

  • requested: How many tokens the input requires.
  • contextWindow: How many tokens were available.
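As a sketch, a handler could catch this exception and read those properties for logging. The formatting helper and its message below are illustrative, not part of the API:

```javascript
// Format a QuotaExceededError-like object for logging.
// (Helper name and message wording are illustrative.)
function formatQuotaError(err) {
  return `Prompt requires ${err.requested} tokens, but only ` +
    `${err.contextWindow} fit in the context window.`;
}

// Sketch of browser usage:
// try {
//   await session.prompt(veryLongPrompt);
// } catch (err) {
//   if (err.name === 'QuotaExceededError') {
//     console.error(formatQuotaError(err));
//   } else {
//     throw err;
//   }
// }
```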

Learn more about session management.

Clone a session

To conserve resources, you can clone an existing session with the clone() function. This creates a fork of the conversation in which the context and initial prompts are preserved.

The clone() function takes an optional options object with a signal field, which lets you pass an AbortSignal to destroy the cloned session.

const controller = new AbortController();
stopButton.onclick = () => controller.abort();

const clonedSession = await session.clone({
  signal: controller.signal,
});

End a session

Call destroy() to free resources if you no longer need a session. When a session is destroyed, it can no longer be used, and any ongoing execution is aborted. If you intend to prompt the model often, you may want to keep the session around, as creating a session can take some time.

await session.prompt(
  "You are a friendly, helpful assistant specialized in clothing choices."
);

session.destroy();

// The promise is rejected with an error explaining that
// the session is destroyed.
await session.prompt(
  "What should I wear today? It is sunny, and I am choosing between a " +
  "t-shirt and a polo."
);

Demos

We’ve created several demo web applications to explore the many use cases for the Prompt API.

To test the Prompt API in Chrome extensions, install the demo extension. The extension source code is available on GitHub.

Performance strategy

The Prompt API is still being developed for the web. As we build this API, refer to our best practices on session management for optimal performance.

Permissions Policy, iframes, and Web Workers

By default, the Prompt API is available only to top-level windows and to their same-origin iframes. Access to the API can be delegated to cross-origin iframes using the Permissions Policy allow="" attribute:




Due to the complexity of establishing a responsible document for each worker in order to check the Permissions Policy status, the Prompt API is currently unavailable in Web Workers.

Participate and share feedback

Your input can directly impact how we build and implement future versions of this API and all underlying AI APIs.


