gemma-7b-it
Model ID: @hf/google/gemma-7b-it
Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights, pre-trained variants, and instruction-tuned variants.
Properties
Task Type: Text Generation
Max batch prefill tokens: 2048
Max input tokens: 1512
Max total tokens: 2048
Use the Playground
Try out this model with the Workers AI Model Playground. It requires no setup or authentication and is an instant way to preview and test a model directly in the browser.
Launch the Model Playground
Code Examples
Worker
curl
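The code panels for these two tabs were not captured. The sketches below follow Cloudflare's documented Workers AI invocation patterns; the `AI` binding name, the REST endpoint path, and the prompt text are illustrative assumptions, not copied from this page.

A Worker using the Workers AI binding:

```javascript
export default {
  async fetch(request, env) {
    // env.AI is the Workers AI binding configured in wrangler.toml
    const response = await env.AI.run("@hf/google/gemma-7b-it", {
      prompt: "What is the origin of the phrase Hello, World",
    });
    return Response.json(response);
  },
};
```

The equivalent call against the REST API (requires your account ID and an API token with Workers AI permission):

```sh
curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run/@hf/google/gemma-7b-it \
  -X POST \
  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  -d '{ "prompt": "What is the origin of the phrase Hello, World" }'
```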
Prompting
Part of getting good results from text generation models is asking questions correctly. LLMs are usually trained with specific predefined templates, which should then be used with the model’s tokenizer for better results when doing inference tasks.
We recommend using unscoped prompts for inference with LoRA.
Unscoped prompts
You can use unscoped prompts to send a single question to the model without worrying about providing any context. Workers AI will automatically convert your { prompt: } input to a reasonable default scoped prompt internally so that you get the best possible prediction.
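How Workers AI scopes a bare prompt internally is not public; as a rough sketch of the idea, the snippet below wraps an unscoped prompt in Gemma's published instruction markers. The function name and the choice of template are assumptions for illustration only.

```javascript
// Sketch (assumption): wrap a bare prompt in the Gemma chat-turn markers,
// approximating the "reasonable default scoped prompt" described above.
function toScopedPrompt(userPrompt) {
  return (
    "<start_of_turn>user\n" +
    userPrompt +
    "<end_of_turn>\n<start_of_turn>model\n"
  );
}

console.log(toScopedPrompt("What is Cloudflare?"));
```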
You can also use unscoped prompts to construct the model chat template manually. In this case, you can use the raw parameter. Here’s an input example of a Mistral chat template prompt:
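The example input itself was not captured here. The following is a plausible reconstruction using Mistral's `[INST]` markers; the system and user text are illustrative, not from the original page.

```javascript
// Input sketch: a manually constructed Mistral-style chat template.
// raw: true tells Workers AI not to apply any template of its own.
const input = {
  prompt:
    "<s>[INST]You are a friendly assistant[/INST]</s>\n" +
    "[INST]Tell me a joke about Cloudflare[/INST]",
  raw: true,
};

console.log(JSON.stringify(input, null, 2));
```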
Responses
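The response body was not captured. For Workers AI text generation models, the result is a JSON object carrying the generated text under a `response` key; the value below is a placeholder, not real model output.

```json
{
  "response": "<generated text>"
}
```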
API Schema
The following schema is based on JSON Schema