1.2B parameter model optimized for Retrieval-Augmented Generation
LFM2-1.2B-RAG is optimized for answering questions grounded in provided context documents. It excels at extracting relevant information from retrieved documents while avoiding hallucination.
Use temperature=0 (greedy decoding) for best results. Provide relevant documents in the system prompt.
System Prompt Format:
```text
The following documents may provide you additional information to answer questions:
<document1>
[Document content here]
</document1>
<document2>
[Document content here]
</document2>
```
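If your retriever returns a list of document strings, a small helper can assemble them into this format. A minimal sketch; `build_rag_system_prompt` is a hypothetical name, not part of any Liquid AI tooling:

```python
# Hypothetical helper: wraps retrieved document strings in numbered
# <documentN> tags, matching the system prompt format shown above.
def build_rag_system_prompt(documents: list[str]) -> str:
    header = ("The following documents may provide you additional "
              "information to answer questions:")
    tagged = "".join(
        f"<document{i}>{doc}</document{i}>"
        for i, doc in enumerate(documents, start=1)
    )
    return header + tagged
```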
Example with Transformers:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "LiquidAI/LFM2-1.2B-RAG"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

system_prompt = """The following documents may provide you additional information to answer questions:
<document1>The centre, which was created in 1906, has been instrumental in advancing
agriculture research. The library at the Agriculture Canada research centre
in Lethbridge serves 48 scientists and 85 technicians, along with many
visiting staff and students.</document1>"""

user_input = "How many individuals were reported to be served by the library at the Agriculture Canada research centre in Lethbridge?"

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_input},
]

# Apply the chat template and append the assistant turn so the model
# answers the question rather than continuing the user's message.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Greedy decoding (do_sample=False), as recommended for this model.
outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
# Output: The library serves 48 scientists and 85 technicians, along with many visiting staff and students.
```
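For interactive use you can stream tokens as they are generated instead of waiting for the full response. A minimal sketch using Transformers' `TextStreamer`, reusing `model`, `tokenizer`, and `inputs` from the example above:

```python
from transformers import TextStreamer

# Print tokens to stdout as they are generated; skip_prompt hides the
# echoed input, and skip_special_tokens is forwarded to decode().
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(inputs, max_new_tokens=256, do_sample=False, streamer=streamer)
```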
Example with llama.cpp (GGUF):

```bash
llama-cli -m ./LFM2-1.2B-RAG-GGUF/LFM2-1.2B-RAG-Q4_K_M.gguf \
  -sys "The following documents may provide information: <doc1>The capital of France is Paris.</doc1>" \
  -p "What is the capital of France?" \
  --temp 0
```
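If you prefer to drive the GGUF build from Python, the llama-cpp-python bindings expose a chat-completion API. A sketch under the assumption that the quantized file has already been downloaded to the path shown (the path and `n_ctx` value are illustrative):

```python
from llama_cpp import Llama

# Illustrative local path; download the GGUF file first.
llm = Llama(
    model_path="./LFM2-1.2B-RAG-GGUF/LFM2-1.2B-RAG-Q4_K_M.gguf",
    n_ctx=4096,  # assumed context size for this sketch
)

result = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "The following documents may provide information: <doc1>The capital of France is Paris.</doc1>"},
        {"role": "user", "content": "What is the capital of France?"},
    ],
    temperature=0,  # greedy decoding, as recommended for this model
)
print(result["choices"][0]["message"]["content"])
```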