Prompt Roles
LFM2 models use a structured conversation format with three prompt roles:

- `system` (optional) - Sets assistant behavior, context, and instructions. Use it for personality, task context, output format, or constraints.
- `user` - Contains the question, instruction, or request from the user.
- `assistant` - Provides a partial response for the model to continue from. Useful for multi-turn conversations, few-shot prompting, or prefilling structured outputs (e.g., a JSON opening brace).
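These roles map directly onto the message list consumed by a chat template. A minimal sketch (the message contents are illustrative, and the `apply_chat_template` call in the comment follows the standard Hugging Face transformers API):

```python
# A minimal conversation using the three prompt roles, expressed as the
# standard "messages" list that chat templates consume.
messages = [
    {"role": "system", "content": "You are a concise technical assistant."},
    {"role": "user", "content": "What does temperature control in text generation?"},
]

# With Hugging Face transformers, this list would typically be rendered
# into the model's prompt format via:
#   tokenizer.apply_chat_template(messages, add_generation_prompt=True)

print([m["role"] for m in messages])
```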
Additional examples: few-shot prompting and prefill
Multi-turn conversations / few-shot prompting: Continue a previous conversation, or provide example interactions to guide the model's behavior. The model learns from the conversation history and applies those patterns to new inputs.

Prefill for structured output: Start the assistant turn with a specific format or structure (e.g., a JSON opening brace) to guide the model toward structured outputs.
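Both patterns can be sketched in a single message list, assuming an OpenAI-style chat format (the extraction task and its contents are made up for illustration):

```python
# Few-shot examples as prior user/assistant turns, followed by an
# assistant prefill ("{") that the model continues from, steering it
# toward JSON output.
few_shot_messages = [
    {"role": "user", "content": "Extract the city: 'Flights to Paris are delayed.'"},
    {"role": "assistant", "content": '{"city": "Paris"}'},
    {"role": "user", "content": "Extract the city: 'It is raining in Tokyo today.'"},
    {"role": "assistant", "content": '{"city": "Tokyo"}'},
    {"role": "user", "content": "Extract the city: 'Snow closed schools in Oslo.'"},
    # Prefill: a partial assistant response for the model to continue.
    {"role": "assistant", "content": "{"},
]

# With transformers, pass continue_final_message=True to
# tokenizer.apply_chat_template so the final assistant turn is left
# open for the model to complete rather than closed.
print(few_shot_messages[-1]["content"])
```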
Text Models
Control text generation behavior, balancing creativity, determinism, and quality:

- `temperature` (0.0-2.0) - Randomness control. Lower (0.1-0.7) = more deterministic; higher (0.8-1.5) = more creative.
- `top_p` (0.0-1.0) - Nucleus sampling. Lower (0.1-0.5) = focused; higher (0.7-0.95) = diverse.
- `top_k` - Limits sampling to the top-k tokens. Lower (10-50) = high-probability tokens only; higher (50-100) = more diverse.
- `min_p` (0.0-1.0) - Filters out tokens with probability below `min_p * max_probability`. Maintains quality while allowing diversity.
- `repetition_penalty` (1.0+) - Reduces repetition. 1.0 = no penalty; >1.0 penalizes repeated tokens.
- `max_tokens` / `max_new_tokens` - Maximum number of tokens to generate.
Recommended Settings Text
For LFM2.5 text models: `temperature=0.1`, `top_k=50`, `top_p=0.1`, `repetition_penalty=1.05`

An alternative, more creative configuration: `temperature=0.3`, `min_p=0.15`, `repetition_penalty=1.05`
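Collected into a generation-kwargs dict for the Hugging Face `generate` API (a sketch; `do_sample=True` and the `max_new_tokens` cap are assumptions, not part of the recommended settings above):

```python
# Recommended LFM2.5 sampling settings as keyword arguments for
# model.generate(**inputs, **gen_kwargs) in transformers.
gen_kwargs = {
    "do_sample": True,          # assumption: enable sampling so temperature/top_p apply
    "temperature": 0.1,
    "top_k": 50,
    "top_p": 0.1,
    "repetition_penalty": 1.05,
    "max_new_tokens": 512,      # illustrative cap, tune per task
}
print(gen_kwargs["temperature"])
```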
Vision Models
LFM2-VL models use a variable-resolution encoder to control the quality/speed tradeoff by adjusting how images are tokenized.

Image Token Management
Control image tokenization with:

- `min_image_tokens` - Minimum tokens for image encoding
- `max_image_tokens` - Maximum tokens for image encoding
- `do_image_splitting` - Split large images into 512×512 patches
Adjust `min_image_tokens` and `max_image_tokens` to balance quality vs. speed.
Example configurations:
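Two hypothetical configurations trading speed against quality (the token counts and config names are illustrative, not official presets):

```python
# Fewer image tokens and no splitting -> faster, coarser encoding.
fast_config = {
    "min_image_tokens": 64,
    "max_image_tokens": 128,
    "do_image_splitting": False,
}

# More image tokens plus 512x512 splitting -> slower, higher fidelity.
quality_config = {
    "min_image_tokens": 64,
    "max_image_tokens": 256,
    "do_image_splitting": True,
}
print(quality_config["max_image_tokens"] > fast_config["max_image_tokens"])
```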
Recommended Settings Vision
For vision models: `temperature=0.1`, `min_p=0.15`, `repetition_penalty=1.05`, `min_image_tokens=64`, `max_image_tokens=256`, `do_image_splitting=True`
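As a sketch, the vision recommendations split into sampling kwargs (passed to `generate`) and image kwargs (passed to the processor); the exact call sites depend on your inference stack, so treat the split below as an assumption:

```python
# Sampling parameters for generation (transformers supports min_p
# in recent versions).
vision_sampling = {
    "do_sample": True,          # assumption: enable sampling
    "temperature": 0.1,
    "min_p": 0.15,
    "repetition_penalty": 1.05,
}

# Image tokenization parameters, typically given to the processor.
vision_image_kwargs = {
    "min_image_tokens": 64,
    "max_image_tokens": 256,
    "do_image_splitting": True,
}
print(sorted(vision_image_kwargs))
```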
Liquid Nanos (task-specific models like LFM2-Extract, LFM2-RAG, LFM2-Tool, etc.) may have special prompting requirements and different generation parameters. For the best usage guidelines, refer to the individual model cards on the Liquid Nanos page.