How to run Gemini Nano locally in your browser

Community Article · Published July 11, 2024

In this tutorial, we'll learn how to enable the new Built-in AI feature in Google Chrome to run Gemini Nano via the experimental Prompt API. The goal of this API is to give web developers a simple way to access browser-provided language models and perform on-device inference for privacy-preserving use cases.

(Link to demo)

Installation

Before you can access the API, you need to do the following:

  1. Upgrade to Chrome Dev / Canary version 127 or higher.

  2. Enable the following experimental flags:

    • Set chrome://flags/#prompt-api-for-gemini-nano to "Enabled"


    • Set chrome://flags/#optimization-guide-on-device-model to "Enabled BypassPerfRequirement"


    • Go to chrome://components and click "Check for Update" on "Optimization Guide On Device Model"


      If you do not see "Optimization Guide On Device Model" listed, you may need to wait 1-2 days before it shows up (this was the case for me).

To verify that everything is working correctly, open the browser console (Shift + CTRL + J on Windows/Linux or Option + ⌘ + J on macOS) and run the following code:

await ai.canCreateTextSession()


You should see "readily" logged to the console, meaning we can now get started with the Prompt API! 🥳 If you don't see this, or run into any other issues, please refer to the Built-in AI Early Preview Program documentation.
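
If you'd like a slightly more defensive check before relying on the API (for example, on a Chrome version without the flags enabled), a small feature-detection snippet like the following sketch can help. The exact set of status strings other than "readily" may vary, so treat this as illustrative rather than definitive:

// Sketch: guard against the Prompt API being unavailable before using it.
if (globalThis.ai?.canCreateTextSession) {
  const status = await ai.canCreateTextSession();
  if (status === 'readily') {
    console.log('Gemini Nano is ready to use.');
  } else {
    console.log(`Model not ready yet (status: "${status}").`);
  }
} else {
  console.log('The Prompt API is not available in this browser.');
}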

Basic usage

The simplest way to get started is to create a session and prompt it with some input text:

// Create a new text session (with default parameters)
const session = await ai.createTextSession();

// Prompt the model and wait for the result.  
const result = await session.prompt("Write me a poem");
console.log(result); // " In the realm of words, a tale unfolds, ..."

To use the model with 🤗 Transformers.js, you can install our experimental branch from GitHub with:

npm install xenova/transformers.js#chrome-built-in-ai

Followed by:

import { pipeline } from '@xenova/transformers';

const generator = await pipeline('text-generation', 'Xenova/gemini-nano');
const output = await generator('Write me a poem.');

We also support passing a list of chat messages, for example:

const messages = [
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'Write me a poem.' },
];
const output = await generator(messages);

You can specify the temperature and top_k parameters by passing values for each when calling the generator:

const output = await generator(messages, { temperature: 0.6, top_k: 5 });

Finally, to enable streaming, you can do the following:

import { RawTextStreamer } from '@xenova/transformers';

const streamer = new RawTextStreamer((text) => {
  // Do something with the text
  console.log(text);
});

// ... same as before
const output = await generator(messages, { streamer });

Advanced usage

To perform text streaming with the Prompt API, you can do:

// Create a new text session (with default parameters)
const session = await ai.createTextSession();

// Prompt the model and stream the result.
const stream = session.promptStreaming("Write me a long poem");
for await (const chunk of stream) {
  console.log(chunk);
}

Currently in Chromium, promptStreaming() returns a ReadableStream whose chunks successively build on each other, leading to repeated output across chunks. This is not intended behavior (see issue). If you'd only like to output the newly-generated text, you can modify the code as follows:

// Create a new text session (with default parameters)
const session = await window.ai.createTextSession();

// Prompt the model and stream the result:
const stream = session.promptStreaming("Write me a long poem");
let previous = '';
for await (const chunk of stream) {
  console.log(chunk.slice(previous.length));
  previous = chunk;
}

Session options

Sessions can be customized by specifying values for temperature and topK.

const session = await ai.createGenericSession({ temperature: 0.6, topK: 5 });

You can retrieve the default values for these parameters with:

const defaults = await ai.defaultTextSessionOptions();
// e.g., { temperature: 0.8, topK: 3 }
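
If you only want to tweak one parameter, one possible approach (a sketch on my part; the current implementation may require both values to be supplied together) is to start from the defaults and override just the value you care about:

// Start from the default options and override only topK.
const defaults = await ai.defaultTextSessionOptions();
const session = await ai.createGenericSession({
  temperature: defaults.temperature,
  topK: 10,
});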

Destroying a session

You can call .destroy() on a session to free its resources.

session.destroy();
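
For short-lived sessions, one pattern (my own suggestion, not something the API mandates) is to pair creation and destruction in a try/finally block so resources are released even if prompting throws:

const session = await ai.createTextSession();
try {
  const result = await session.prompt("Write me a haiku");
  console.log(result);
} finally {
  // Free the session's resources even if prompt() rejects.
  session.destroy();
}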

Control tokens and chat template

Gemini Nano uses the special <ctrl23> control token to separate rounds of conversation. By default, this token is automatically added by the Prompt API to the end of the prompt, but it can be added manually for multi-round conversations or few-shot prompting. For example:

[Example article #1]
Summary of this article: [example summary #1]<ctrl23>

[Example article #2]
Summary of this article: [example summary #2]<ctrl23>

[Example article #3]
Summary of this article: [example summary #3]<ctrl23>

[Article to be summarized]
Summary of this article:
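
As a sketch, the few-shot layout above could be assembled in JavaScript like this (the articles and summaries below are placeholders for your own data):

// Build a few-shot summarization prompt, ending each completed round with <ctrl23>.
const session = await ai.createTextSession();

const examples = [
  { article: '[Example article #1]', summary: '[example summary #1]' },
  { article: '[Example article #2]', summary: '[example summary #2]' },
];
const articleToSummarize = '[Article to be summarized]';

const prompt = examples
  .map((ex) => `${ex.article}\nSummary of this article: ${ex.summary}<ctrl23>`)
  .join('\n\n')
  + `\n\n${articleToSummarize}\nSummary of this article:`;

// The API appends the final <ctrl23> for us.
const summary = await session.prompt(prompt);
console.log(summary);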

Although the model doesn't have an official chat template, you can get pretty good results by structuring the prompt as a conversation between a user and an assistant. For example:

// Create a new text session
const session = await ai.createTextSession();

// Prompt the model and wait for the result.  
const result = await session.prompt(
`You are a helpful assistant.<ctrl23>
User:
What is the capital of France?
Model:
The capital of France is Paris.<ctrl23>
User:
Which river runs through that city?
Model:
`);
console.log(result); // " The Seine River runs through Paris."

Another example, with a different system prompt:

// Prompt the model and wait for the result.  
const result = await session.prompt(
`Talk like a pirate.<ctrl23>
User:
Tell me a story about the kraken.
Model:
`);
console.log(result); // Arr! Be ye warned, landlubbers! For in the depths of the enigmatic ocean, where the tumultuous waves meet the abyss, lurks the dread of the kraken! This monstrous cephalopod ...
Full output:

Arr! Be ye warned, landlubbers! For in the depths of the enigmatic ocean, where the tumultuous waves meet the abyss, lurks the dread of the kraken! This monstrous cephalopod be as large as a majestic galleon, its tentacles writhing with perilous tendrils capable of ensnaring and tearing!

Once, when the sun hung low and the moon cast its pale light upon the ocean, brave sailors bold set forth to explore the enigmatic realm of the kraken. Their heart a-thrill with daring, they sought to unravel the mysteries of this enigmatic creature, venturing into the heart of the ocean where the kraken held sway.

Guided by the flickering lantern of their brave hearts, they plunged into the abyss, their senses heightened by the eerie whispers of the ocean, As the darkness enveloped them, they felt the ancient tendrils of the kraken reaching out, testing their resolve.

In the midst of the chaos, the brave mariners fought valiantly, their weapons gleaming in the moonlight. The kraken, unrelenting in its monstrous might, unleashed its tendrils upon the terrified crew, tearing at flesh and bone.

But through the darkest of times, courage prevailed. With every resounding cry of the kraken, the brave mariners fought with renewed strength, refusing to yield to the merciless beast.

In the end, their unwavering spirit triumphed over the monstrous kraken, and they emerged victorious, their hearts resounding with the resounding cheers of freedom!

From that day forth, the name of the kraken became synonymous with the indomitable spirit of those who dared to face the enigmatic depths, proving that even in the darkest of oceans, the human spirit can prevail!
