v0.9.2
1. Prerequisites
You should already have:
- An Android project created in Android Studio. You may create an empty project with the wizard. The LEAP Android SDK is Kotlin-first, and we recommend using it only from Kotlin.
- Kotlin Android plugin v2.2.0 or above and Android Gradle Plugin v8.12.0 or above, which the LEAP Android SDK needs in order to build. Declare them in build.gradle.kts.
- A working Android device that supports the arm64-v8a ABI, with developer mode enabled. We recommend 3 GB or more of RAM to run the models.
- A minimum SDK of API 31. Declare it in build.gradle.kts.
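The version requirements above might be declared as in the following sketch. It assumes the plugins are declared directly in the Gradle Kotlin DSL rather than through a version catalog, and that the project uses the usual split between a project-level and an app-level build.gradle.kts. Neither detail is stated by this guide, so adapt it to your project layout.

```kotlin
// Project-level build.gradle.kts (assumed layout):
// pin the minimum plugin versions the SDK requires.
plugins {
    id("com.android.application") version "8.12.0" apply false
    id("org.jetbrains.kotlin.android") version "2.2.0" apply false
}

// App-level build.gradle.kts (assumed to be $PROJECT_ROOT/app/build.gradle.kts):
// set the minimum supported API level.
android {
    defaultConfig {
        minSdk = 31
    }
}
```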
2. Import the LeapSDK
Add the following dependencies to $PROJECT_ROOT/app/build.gradle.kts:
3. Getting and Loading Models
The SDK uses GGUF manifests for loading models. This is the recommended path for all new projects because of its superior inference performance and better default generation parameters. Legacy ExecuTorch bundle support remains available for existing projects and is described in the Legacy section below.
Loading from GGUF Manifest
The LEAP Edge SDK supports directly downloading LEAP models in GGUF format. Given the model name and quantization method (both of which you can find in the LEAP Model Library), the SDK automatically downloads the necessary GGUF files along with generation parameters for optimal performance.
The LeapDownloader.loadModel suspend function loads a model and returns a model runner instance for invoking the model. Loading a model is a heavy I/O operation, so the function takes some time to finish, but it is safe to call on the main thread. Because it is a suspend function, it must be called from a coroutine scope.
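The calling pattern described above (a suspend function that is slow but main-safe, invoked from a coroutine scope) can be illustrated with plain kotlinx.coroutines. This is a minimal sketch and not the SDK's API: FakeModelRunner and loadModelStub are hypothetical stand-ins for the model runner and for LeapDownloader.loadModel, whose real parameters are not shown in this guide.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext

// Hypothetical stand-in for the SDK's model runner type.
class FakeModelRunner(val modelName: String)

// Hypothetical stand-in for a main-safe loader: the slow work is moved to
// the IO dispatcher, so the caller's thread is suspended, not blocked.
suspend fun loadModelStub(modelName: String): FakeModelRunner =
    withContext(Dispatchers.IO) {
        delay(100) // simulates the heavy I/O of reading model files
        FakeModelRunner(modelName)
    }

fun main() = runBlocking {
    // A suspend function can only be called from a coroutine (or another
    // suspend function); runBlocking provides the scope in this demo.
    val runner = loadModelStub("example-model")
    println("Loaded ${runner.modelName}")
}
```

In an Android app the call would typically be made from a lifecycle-aware scope rather than runBlocking, so that the load is cancelled together with its owning screen.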
Legacy: ExecuTorch Bundles
Browse the Leap Model Library to find and download a model bundle that matches your needs.
Download and transfer bundle
Push the bundle file to the device using adb push. Assuming the downloaded model file is located at ~/Downloads/model.bundle, run the following commands:
Loading from local bundle file
The LeapClient.loadModel suspend function loads a model bundle file and returns a model runner instance for invoking the model. Loading a model is a heavy I/O operation, so the function takes some time to finish, but it is safe to call on the main thread. Because it is a suspend function, it must be called from a coroutine scope.
4. Generate content with the model
To generate content, create a conversation object from the model runner:
Call the Conversation.generateResponse function to invoke the generation. Its return value is a Kotlin asynchronous flow of MessageResponse, which can be processed with Kotlin flow operators:
- The onEach callback is called each time the model generates a chunk of content.
- The onCompletion callback is called when the generation is done. At that point, conversation.history contains the latest message generated by the model.
- The catch callback is called if an exception is thrown during the generation.
launch method:
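The operator chain described in this section can be sketched with plain kotlinx.coroutines. This is an illustration under stated assumptions rather than SDK code: fakeGenerateResponse is a hypothetical stand-in for Conversation.generateResponse that emits plain strings instead of MessageResponse values, and generateToString is a helper introduced only for this example.

```kotlin
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.catch
import kotlinx.coroutines.flow.collect
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.onCompletion
import kotlinx.coroutines.flow.onEach
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical stand-in for Conversation.generateResponse: emits the reply
// in chunks. The prompt is ignored here; a real model would use it.
fun fakeGenerateResponse(prompt: String): Flow<String> = flow {
    for (chunk in listOf("Hello", ", ", "world")) emit(chunk)
}

// Collects the streamed chunks into one string using the three callbacks.
suspend fun generateToString(prompt: String): String {
    val output = StringBuilder()
    fakeGenerateResponse(prompt)
        .onEach { chunk -> output.append(chunk) }        // one call per chunk
        .onCompletion { cause ->                          // cause is null on success
            if (cause == null) println("Generation done")
        }
        .catch { e -> println("Generation failed: ${e.message}") }
        .collect()                                        // starts the flow
    return output.toString()
}

fun main() = runBlocking {
    // launch returns a Job, so the generation could be cancelled via job.cancel().
    val job = launch { println(generateToString("Say hello")) }
    job.join()
}
```

A flow is cold, so nothing runs until it is collected. Collecting inside launch gives the caller a Job handle, which is one way to let a user stop a long generation.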