diff --git a/.cspell-wordlist.txt b/.cspell-wordlist.txt index 30fede370..2e94b445a 100644 --- a/.cspell-wordlist.txt +++ b/.cspell-wordlist.txt @@ -98,3 +98,8 @@ ocurred libfbjni libc gradlew +AEROPLANE +DININGTABLE +POTTEDPLANT +TVMONITOR +sublist \ No newline at end of file diff --git a/RELEASE.md b/RELEASE.md index ea66070c7..e281667b5 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -10,7 +10,7 @@ The release process of new minor version consists of the following steps: 4. Create a new release branch `release/{MAJOR}.{MINOR}`and push it to the remote. 5. Stability tests are performed on the release branch and all fixes to the new-found issues are pushed into the main branch and cherry-picked into the release branch. This allows for further development on the main branch without interfering with the release process. 6. Once all tests are passed, tag the release branch with proper version tag `v{MAJOR}.{MINOR}.0` and run `npm publish`. -7. Create versioned docs by running from repo root `(cd docs && yarn docusaurus docs:version {MAJOR}.{MINOR}.x)` (the 'x' part is intentional and is not to be substituted). +7. Create versioned docs by running from repo root `(cd docs && yarn docusaurus docs:version {MAJOR}.{MINOR}.x)` (the 'x' part is intentional and is not to be substituted). Also, make sure that all the links in `api-reference` are not broken. 8. Create a PR with the updated docs. 9. Create the release notes on GitHub. 10. Update README.md with release video, if available. diff --git a/apps/computer-vision/app/ocr_vertical/index.tsx b/apps/computer-vision/app/ocr_vertical/index.tsx index 3e78ead6b..f298a3d5c 100644 --- a/apps/computer-vision/app/ocr_vertical/index.tsx +++ b/apps/computer-vision/app/ocr_vertical/index.tsx @@ -1,7 +1,7 @@ import Spinner from '../../components/Spinner'; import { BottomBar } from '../../components/BottomBar'; import { getImage } from '../../utils'; -import { useVerticalOCR, VERTICAL_OCR_ENGLISH } from 'react-native-executorch'; +import { useVerticalOCR, OCR_ENGLISH } from 'react-native-executorch'; import { View, StyleSheet, Image, Text, ScrollView } from 'react-native'; import ImageWithBboxes2 from '../../components/ImageWithOCRBboxes'; import React, { useContext, useEffect, useState } from 'react'; @@ -16,7 +16,7 @@ export default function VerticalOCRScree() { height: number; }>(); const model = useVerticalOCR({ - model: VERTICAL_OCR_ENGLISH, + model: OCR_ENGLISH, independentCharacters: true, }); const { setGlobalGenerating } = useContext(GeneratingContext); diff --git a/docs/README.md b/docs/README.md index aaba2fa1e..398d39a7c 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1,41 +1,112 @@ -# Website +# React Native ExecuTorch Documentation -This website is built using [Docusaurus 2](https://docusaurus.io/), a modern static website generator. +This directory contains the source code for the documentation website of React Native ExecuTorch, built with [Docusaurus](https://docusaurus.io/). -### Installation +## Getting Started -``` -$ yarn -``` +### Prerequisites -### Local Development +- Node.js (v20+) +- Yarn -``` -$ yarn start +### Installation + +Navigate to the `docs` directory and install dependencies: + +```bash +cd docs +yarn install ``` -This command starts a local development server and opens up a browser window. Most changes are reflected live without having to restart the server. +### Running Locally -### Build +Start the development server. 
**Note:** This command automatically generates the API reference from the TypeScript source before starting the site. +```bash +yarn start ``` -$ yarn build -``` +The site will open at `http://localhost:3000/react-native-executorch/`. -This command generates static content into the `build` directory and can be served using any static contents hosting service. +## Building +To build the static files for production: + +```bash +yarn build +``` +The output will be generated in the `build/` directory. -### Deployment +## Deployment Using SSH: -``` +```bash $ USE_SSH=true yarn deploy ``` Not using SSH: -``` +```bash $ GIT_USER= yarn deploy ``` If you are using GitHub pages for hosting, this command is a convenient way to build the website and push to the `gh-pages` branch. + +## How API Reference Generation Works + +We use `docusaurus-plugin-typedoc` to automatically generate documentation from the TypeScript source code. + +1. **Source:** The plugin reads `../packages/react-native-executorch/src/index.ts`. + +2. **Generation:** On `yarn start` or `yarn build`, it generates Markdown files into `docs/06-api-reference`. + +3. **Sidebar Integration:** Docusaurus strips the number prefix (`06-`) from URLs but requires the folder name for configuration. We use a specific sidebar setup to ensure breadcrumbs work while keeping the UI clean. + +### Sidebar Configuration (`sidebars.js`) +The "API Reference" section is configured with specific settings to handle breadcrumbs correctly: + +- `collapsed: true`: Keeps the huge list of API files closed by default. + +- **`link` property**: Points to `api-reference/index`. This ensures clicking the text "API Reference" redirects to the Overview page. + +- **`items` array**: We nest the autogenerated items inside. This is **critical** because it creates the parent-child relationship needed for breadcrumbs (e.g., `Home > API Reference > LLMType`). + +### Troubleshooting Sidebar: + +If you change the folder structure, ensure `dirName` matches the physical folder (`06-api-reference`), but `id` uses the Docusaurus ID (`api-reference/index`). + +## Releasing & Versioning + +We use Docusaurus versioning to "freeze" documentation for older releases. + +### Creating a New Version + +When you release a new npm version (e.g., `1.0.0`), you must snapshot the documentation. Run this command from the `docs/` folder: + +```bash +yarn docusaurus docs:version 1.0.0 +``` +### Why is this important? + +1. It copies the current contents of `docs/` (including the **currently generated API reference**) into `versioned_docs/version-1.0.0`. +2. This ensures that if the TypeScript API changes in the future, the docs for v1.0.0 remain accurate to that version. + +### Working on "Next" + +Edit files in `docs/` as usual. These changes appear in the "Next" version of the site (the default view), reflecting the current `main` branch. + +## Troubleshooting + +### "Duplicate ID" Errors +If you see build errors about duplicate IDs, it usually means an old generation artifact is conflicting with a new one. Run: + +```bash +yarn clear +``` +This deletes the `.docusaurus` cache and `build` folder. + +### API Docs Not Updating + +TypeDoc runs strictly at startup. If you modify comments in the TypeScript package: + +1. Stop the server. +2. Run `yarn start` again to regenerate the Markdown files. 
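## Appendix: Sidebar Configuration Sketch

The "Sidebar Configuration (`sidebars.js`)" section above describes the API Reference category in prose. Below is a minimal sketch of what that category entry can look like; the sidebar name (`docsSidebar`) and the surrounding items are illustrative placeholders, so check the actual `sidebars.js` in this directory for the real values.

```js
// Minimal sketch of the "API Reference" category (illustrative, not copied from the repo).
const sidebars = {
  docsSidebar: [
    // ...the rest of the documentation sidebar...
    {
      type: 'category',
      label: 'API Reference',
      collapsed: true, // keep the long list of generated API files closed by default
      link: {
        type: 'doc',
        id: 'api-reference/index', // Docusaurus ID: the "06-" number prefix is stripped
      },
      items: [
        {
          type: 'autogenerated',
          dirName: '06-api-reference', // must match the physical folder name on disk
        },
      ],
    },
  ],
};

module.exports = sidebars;
```

Nesting the autogenerated items inside the category is what creates the parent-child relationship that breadcrumbs such as `Home > API Reference > LLMType` rely on.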
\ No newline at end of file diff --git a/docs/docs/01-fundamentals/01-getting-started.md b/docs/docs/01-fundamentals/01-getting-started.md index 9c70651b0..e95754631 100644 --- a/docs/docs/01-fundamentals/01-getting-started.md +++ b/docs/docs/01-fundamentals/01-getting-started.md @@ -95,6 +95,5 @@ yarn run expo: -d If you want to dive deeper into ExecuTorch or our previous work with the framework, we highly encourage you to check out the following resources: - [ExecuTorch docs](https://pytorch.org/executorch/stable/index.html) -- [Native code for iOS](https://medium.com/swmansion/bringing-native-ai-to-your-mobile-apps-with-executorch-part-i-ios-f1562a4556e8?source=user_profile_page---------0-------------250189c98ccf---------------) -- [Native code for Android](https://medium.com/swmansion/bringing-native-ai-to-your-mobile-apps-with-executorch-part-ii-android-29431b6b9f7f?source=user_profile_page---------2-------------b8e3a5cb1c63---------------) -- [Exporting to Android with XNNPACK](https://medium.com/swmansion/exporting-ai-models-on-android-with-xnnpack-and-executorch-3e70cff51c59?source=user_profile_page---------1-------------b8e3a5cb1c63---------------) +- [React Native RAG](https://blog.swmansion.com/introducing-react-native-rag-fbb62efa4991) +- [Offline Text Recognition on Mobile: How We Brought EasyOCR to React Native ExecuTorch](https://blog.swmansion.com/bringing-easyocr-to-react-native-executorch-2401c09c2d0c) \ No newline at end of file diff --git a/docs/docs/01-fundamentals/02-loading-models.md b/docs/docs/01-fundamentals/02-loading-models.md index 8763d9614..96be9784f 100644 --- a/docs/docs/01-fundamentals/02-loading-models.md +++ b/docs/docs/01-fundamentals/02-loading-models.md @@ -36,6 +36,10 @@ useExecutorchModule({ The downloaded files are stored in documents directory of your application. ::: +## Predefined Models + +Our library offers out-of-the-box support for multiple models. To make things easier, we created aliases for our models exported to the `pte` format. For a full list of aliases, check out the [API Reference](../06-api-reference/index.md#models---classification). + ## Example The following code snippet demonstrates how to load model and tokenizer files using `useLLM` hook: diff --git a/docs/docs/01-fundamentals/03-frequently-asked-questions.md b/docs/docs/01-fundamentals/03-frequently-asked-questions.md index 35b0b1b1b..9216c615f 100644 --- a/docs/docs/01-fundamentals/03-frequently-asked-questions.md +++ b/docs/docs/01-fundamentals/03-frequently-asked-questions.md @@ -10,7 +10,18 @@ Each hook documentation subpage (useClassification, useLLM, etc.) contains a sup ### How can I run my own AI model? -To run your own model, you need to directly access the underlying [ExecuTorch Module API](https://pytorch.org/executorch/stable/extension-module.html). We provide an experimental [React hook](../03-hooks/03-executorch-bindings/useExecutorchModule.md) along with a [TypeScript alternative](../04-typescript-api/03-executorch-bindings/ExecutorchModule.md), which serve as a way to use the aforementioned API without the need of diving into native code. In order to get a model in a format runnable by the runtime, you'll need to get your hands dirty with some ExecuTorch knowledge. For more guides on exporting models, please refer to the [ExecuTorch tutorials](https://pytorch.org/executorch/stable/tutorials/export-to-executorch-tutorial.html). Once you obtain your model in a `.pte` format, you can run it with `useExecuTorchModule` and `ExecuTorchModule`.
+To run your own model, you need to directly access the underlying [ExecuTorch Module API](https://pytorch.org/executorch/stable/extension-module.html). We provide a [React hook](../03-hooks/03-executorch-bindings/useExecutorchModule.md) along with a [TypeScript alternative](../04-typescript-api/03-executorch-bindings/ExecutorchModule.md), which serve as a way to use the aforementioned API without diving into native code. In order to get a model in a format runnable by the runtime, you'll need to get your hands dirty with some ExecuTorch knowledge. For more guides on exporting models, please refer to the [ExecuTorch tutorials](https://pytorch.org/executorch/stable/tutorials/export-to-executorch-tutorial.html). Once you obtain your model in a `.pte` format, you can run it with `useExecuTorchModule` and `ExecuTorchModule`. + +### How does React Native ExecuTorch work under the hood? + +The general workflow for each functionality in our library goes like this: + +- You call a function from TypeScript +- TypeScript calls a C++ function (such as model inference or data processing) via JSI +- C++ returns the result back to TypeScript via JSI +- You get the results in TypeScript + +Using JSI gives us **zero-copy data transfer** and **fast, low-level C++ execution**. ### Can you do function calling with useLLM? diff --git a/docs/docs/03-hooks/01-natural-language-processing/useLLM.md b/docs/docs/03-hooks/01-natural-language-processing/useLLM.md index 56deff82a..6f75e3258 100644 --- a/docs/docs/03-hooks/01-natural-language-processing/useLLM.md +++ b/docs/docs/03-hooks/01-natural-language-processing/useLLM.md @@ -23,13 +23,19 @@ description: "Learn how to use LLMs in your React Native applications with React React Native ExecuTorch supports a variety of LLMs (checkout our [HuggingFace repository](https://huggingface.co/software-mansion) for model already converted to ExecuTorch format) including Llama 3.2. Before getting started, you’ll need to obtain the .pte binary—a serialized model, the tokenizer and tokenizer config JSON files. There are various ways to accomplish this: -- For your convenience, it's best if you use models exported by us, you can get them from our [HuggingFace repository](https://huggingface.co/software-mansion). You can also use [constants](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/constants/modelUrls.ts) shipped with our library. -- Follow the official [tutorial](https://github.com/pytorch/executorch/blob/release/0.7/examples/demo-apps/android/LlamaDemo/docs/delegates/xnnpack_README.md) made by ExecuTorch team to build the model and tokenizer yourself. +- For your convenience, it's best to use models exported by us; you can get them from our [HuggingFace repository](https://huggingface.co/collections/software-mansion/llm). You can also use [constants](../../06-api-reference/index.md#models---llm) shipped with our library. +- Follow the official [tutorial](https://docs.pytorch.org/executorch/stable/llm/export-llm.html) made by the ExecuTorch team to export an arbitrarily chosen LLM model. :::danger Lower-end devices might not be able to fit LLMs into memory. We recommend using quantized models to reduce the memory footprint. ::: +## API Reference + +- For the detailed API reference for `useLLM`, see: [`useLLM` API Reference](../../06-api-reference/functions/useLLM.md). +- For all LLM models available out of the box in React Native ExecuTorch, see: [LLM Models](../../06-api-reference/index.md#models---llm).
+- For useful LLM utility functions, please refer to the following link: [LLM Utility Functions](../../06-api-reference/index.md#utilities---llm). + ## Initializing In order to load a model into the app, you need to run the following code: @@ -42,123 +48,28 @@ const llm = useLLM({ model: LLAMA3_2_1B });
-The code snippet above fetches the model from the specified URL, loads it into memory, and returns an object with various functions and properties for controlling the model. You can monitor the loading progress by checking the `llm.downloadProgress` and `llm.isReady` property, and if anything goes wrong, the `llm.error` property will contain the error message. +The code snippet above fetches the model from the specified URL, loads it into memory, and returns an object with various functions and properties for controlling the model. You can monitor the loading progress by checking the [`llm.downloadProgress`](../../06-api-reference/interfaces/LLMType.md#downloadprogress) and [`llm.isReady`](../../06-api-reference/interfaces/LLMType.md#isready) properties, and if anything goes wrong, the [`llm.error`](../../06-api-reference/interfaces/LLMType.md#error) property will contain the error message. ### Arguments -**`model`** - Object containing the model source, tokenizer source, and tokenizer config source. - -- **`modelSource`** - `ResourceSource` that specifies the location of the model binary. - -- **`tokenizerSource`** - `ResourceSource` pointing to the JSON file which contains the tokenizer. +`useLLM` takes an [`LLMProps`](../../06-api-reference/interfaces/LLMProps.md) object that consists of: -- **`tokenizerConfigSource`** - `ResourceSource` pointing to the JSON file which contains the tokenizer config. +- The [model source](../../06-api-reference/interfaces/LLMProps.md#modelsource), [tokenizer source](../../06-api-reference/interfaces/LLMProps.md#tokenizersource), and [tokenizer config source](../../06-api-reference/interfaces/LLMProps.md#tokenizerconfigsource). +- An optional [`preventLoad`](../../06-api-reference/interfaces/LLMProps.md#preventload) flag which prevents auto-loading of the model. -**`preventLoad?`** - Boolean that can prevent automatic model loading (and downloading the data if you load it for the first time) after running the hook. +Need more details? Check the following resources: -For more information on loading resources, take a look at [loading models](../../01-fundamentals/02-loading-models.md) page. +- For detailed information about `useLLM` arguments, check this section: [`useLLM` arguments](../../06-api-reference/functions/useLLM.md#parameters). +- For more information on loading resources, take a look at the [loading models](../../01-fundamentals/02-loading-models.md) page. +- For available LLM models, please check out the following list: [LLM Models](../../06-api-reference/index.md#models---llm). ### Returns -| Field | Type | Description | -| ------------------------ | ---------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -| `generate()` | `(messages: Message[], tools?: LLMTool[]) => Promise` | Runs model to complete chat passed in `messages` argument. Returns the generated response. It doesn't manage conversation context. | -| `interrupt()` | `() => void` | Function to interrupt the current inference. | -| `response` | `string` | State of the generated response. This field is updated with each token generated by the model. | -| `token` | `string` | The most recently generated token. | -| `isReady` | `boolean` | Indicates whether the model is ready.
| -| `isGenerating` | `boolean` | Indicates whether the model is currently generating a response. | -| `downloadProgress` | `number` | Represents the download progress as a value between 0 and 1, indicating the extent of the model file retrieval. | -| `error` | string | null | Contains the error message if the model failed to load. | -| `configure` | `({chatConfig?: Partial, toolsConfig?: ToolsConfig, generationConfig?: GenerationConfig}) => void` | Configures chat and tool calling. See more details in [configuring the model](#configuring-the-model). | -| `sendMessage` | `(message: string) => Promise` | Function to add user message to conversation. Returns the generated response. After model responds, `messageHistory` will be updated with both user message and model response. | -| `deleteMessage` | `(index: number) => void` | Deletes all messages starting with message on `index` position. After deletion `messageHistory` will be updated. | -| `messageHistory` | `Message[]` | History containing all messages in conversation. This field is updated after model responds to `sendMessage`. | -| `getGeneratedTokenCount` | `() => number` | Returns the number of tokens generated in the last response. | - -
-Type definitions - -```typescript -const useLLM: ({ - model, - preventLoad, -}: { - model: { - modelSource: ResourceSource; - tokenizerSource: ResourceSource; - tokenizerConfigSource: ResourceSource; - }; - preventLoad?: boolean; -}) => LLMType; - -interface LLMType { - messageHistory: Message[]; - response: string; - token: string; - isReady: boolean; - isGenerating: boolean; - downloadProgress: number; - error: string | null; - configure: ({ - chatConfig, - toolsConfig, - generationConfig, - }: { - chatConfig?: Partial; - toolsConfig?: ToolsConfig; - generationConfig?: GenerationConfig; - }) => void; - getGeneratedTokenCount: () => number; - generate: (messages: Message[], tools?: LLMTool[]) => Promise; - sendMessage: (message: string) => Promise; - deleteMessage: (index: number) => void; - interrupt: () => void; -} - -type ResourceSource = string | number | object; - -type MessageRole = 'user' | 'assistant' | 'system'; - -interface Message { - role: MessageRole; - content: string; -} -interface ChatConfig { - initialMessageHistory: Message[]; - contextWindowLength: number; - systemPrompt: string; -} - -interface GenerationConfig { - temperature?: number; - topp?: number; - outputTokenBatchSize?: number; - batchTimeInterval?: number; -} - -// tool calling -interface ToolsConfig { - tools: LLMTool[]; - executeToolCallback: (call: ToolCall) => Promise; - displayToolCalls?: boolean; -} - -interface ToolCall { - toolName: string; - arguments: Object; -} - -type LLMTool = Object; -``` - -
- ## Functional vs managed You can use functions returned from this hooks in two manners: -1. Functional/pure - we will not keep any state for you. You'll need to keep conversation history and handle function calling yourself. Use `generate` (and rarely `forward`) and `response`. Note that you don't need to run `configure` to use those. Furthermore, `chatConfig` and `toolsConfig` will not have any effect on those functions. +1. Functional/pure - we will not keep any state for you. You'll need to keep conversation history and handle function calling yourself. Use [`generate`](../../06-api-reference/interfaces/LLMType.md#generate) and [`response`](../../06-api-reference/interfaces/LLMType.md#response). Note that you don't need to run [`configure`](../../06-api-reference/interfaces/LLMType.md#configure) to use those. Furthermore, [`chatConfig`](../../06-api-reference/interfaces/LLMConfig.md#chatconfig) and [`toolsConfig`](../../06-api-reference/interfaces/LLMConfig.md#toolsconfig) will not have any effect on those functions. 2. Managed/stateful - we will manage conversation state. Tool calls will be parsed and called automatically after passing appropriate callbacks. See more at [managed LLM chat](#managed-llm-chat). @@ -166,7 +77,7 @@ You can use functions returned from this hooks in two manners: ### Simple generation -To perform chat completion you can use the `generate` function. The `response` value is updated with each token as it's generated, and the function returns a promise that resolves to the complete response when generation finishes. +To perform chat completion you can use the [`generate`](../../06-api-reference/interfaces/LLMType.md#generate) function. The [`response`](../../06-api-reference/interfaces/LLMType.md#response) value is updated with each token as it's generated, and the function returns a promise that resolves to the complete response when generation finishes. ```tsx const llm = useLLM({ model: LLAMA3_2_1B }); @@ -194,13 +105,13 @@ return ( ### Interrupting the model -Sometimes, you might want to stop the model while it’s generating. To do this, you can use `interrupt()`, which will halt the model and update the response one last time. +Sometimes, you might want to stop the model while it’s generating. To do this, you can use [`interrupt`](../../06-api-reference/interfaces/LLMType.md#interrupt), which will halt the model and update the response one last time. -There are also cases when you need to check if tokens are being generated, such as to conditionally render a stop button. We’ve made this easy with the `isGenerating` property. +There are also cases when you need to check if tokens are being generated, such as to conditionally render a stop button. We’ve made this easy with the [`isGenerating`](../../06-api-reference/interfaces/LLMType.md#isgenerating) property. :::warning If you try to dismount the component using this hook while generation is still going on, it will result in crash. -You'll need to interrupt the model first and wait until `isGenerating` is set to false. +You'll need to interrupt the model first and wait until [`isGenerating`](../../06-api-reference/interfaces/LLMType.md#isgenerating) is set to false. ::: ### Reasoning @@ -264,34 +175,31 @@ return ( ### Configuring the model -To configure model (i.e. change system prompt, load initial conversation history or manage tool calling) you can use -`configure` function. 
It accepts object with following fields: - -**`chatConfig`** - Object configuring chat management, contains following properties: - -- **`systemPrompt`** - Often used to tell the model what is its purpose, for example - "Be a helpful translator". - -- **`initialMessageHistory`** - An array of `Message` objects that represent the conversation history. This can be used to provide initial context to the model. +To configure the model (i.e. change the system prompt, load initial conversation history, manage tool calling, or set generation settings) you can use +the [`configure`](../../06-api-reference/classes/LLMModule.md#configure) method. [**`chatConfig`**](../../06-api-reference/interfaces/LLMConfig.md#chatconfig) and [**`toolsConfig`**](../../06-api-reference/interfaces/LLMConfig.md#toolsconfig) are only applied to managed chats, i.e. when using [`sendMessage`](../../06-api-reference/classes/LLMModule.md#sendmessage) (see: [Functional vs managed](../../03-hooks/01-natural-language-processing/useLLM.md#functional-vs-managed)). It accepts an object with the following fields: -- **`contextWindowLength`** - The number of messages from the current conversation that the model will use to generate a response. The higher the number, the more context the model will have. Keep in mind that using larger context windows will result in longer inference time and higher memory usage. +- [`chatConfig`](../../06-api-reference/interfaces/LLMConfig.md#chatconfig) - Object configuring chat management that contains: + - [`systemPrompt`](../../06-api-reference/interfaces/ChatConfig.md#systemprompt) - Often used to tell the model what its purpose is, for example: "Be a helpful translator". -**`toolsConfig`** - Object configuring options for enabling and managing tool use. **It will only have effect if your model's chat template support it**. Contains following properties: + - [`initialMessageHistory`](../../06-api-reference/interfaces/ChatConfig.md#initialmessagehistory) - An array of `Message` objects that represents the conversation history. This can be used to provide initial context to the model. -- **`tools`** - List of objects defining tools. + - [`contextWindowLength`](../../06-api-reference/interfaces/ChatConfig.md#contextwindowlength) - The number of messages from the current conversation that the model will use to generate a response. Keep in mind that using larger context windows will result in longer inference time and higher memory usage. -- **`executeToolCallback`** - Function that accepts `ToolCall`, executes tool and returns the string to model. +- [`toolsConfig`](../../06-api-reference/interfaces/LLMConfig.md#toolsconfig) - Object configuring options for enabling and managing tool use. **It will only have an effect if your model's chat template supports it**. Contains the following properties: + - [`tools`](../../06-api-reference/interfaces/ToolsConfig.md#tools) - List of objects defining tools. -- **`displayToolCalls`** - If set to true, JSON tool calls will be displayed in chat. If false, only answers will be displayed. + - [`executeToolCallback`](../../06-api-reference/interfaces/ToolsConfig.md#executetoolcallback) - Function that accepts a [`ToolCall`](../../06-api-reference/interfaces/ToolCall.md), executes the tool, and returns a string to the model. -**`generationConfig`** - Object configuring generation settings. + - [`displayToolCalls`](../../06-api-reference/interfaces/ToolsConfig.md#displaytoolcalls) - If set to `true`, JSON tool calls will be displayed in chat. If `false`, only answers will be displayed.
-- **`outputTokenBatchSize`** - Soft upper limit on the number of tokens in each token batch (in certain cases there can be more tokens in given batch, i.e. when the batch would end with special emoji join character). +- [`generationConfig`](../../06-api-reference/interfaces/LLMConfig.md#generationconfig) - Object configuring generation settings with the following properties: + - [`outputTokenBatchSize`](../../06-api-reference/interfaces/GenerationConfig.md#outputtokenbatchsize) - Soft upper limit on the number of tokens in each token batch (in certain cases there can be more tokens in a given batch, i.e. when the batch would end with a special emoji join character). -- **`batchTimeInterval`** - Upper limit on the time interval between consecutive token batches. + - [`batchTimeInterval`](../../06-api-reference/interfaces/GenerationConfig.md#batchtimeinterval) - Upper limit on the time interval between consecutive token batches. -- **`temperature`** - Scales output logits by the inverse of temperature. Controls the randomness / creativity of text generation. + - [`temperature`](../../06-api-reference/interfaces/GenerationConfig.md#temperature) - Scales output logits by the inverse of temperature. Controls the randomness / creativity of text generation. -- **`topp`** - Only samples from the smallest set of tokens whose cumulative probability exceeds topp. + - [`topp`](../../06-api-reference/interfaces/GenerationConfig.md#topp) - Only samples from the smallest set of tokens whose cumulative probability exceeds topp. ### Sending a message @@ -310,8 +218,8 @@ return