1.2B parameter model optimized for Retrieval-Augmented Generation
LFM2-1.2B-RAG is optimized for answering questions grounded in provided context documents. It excels at extracting relevant information from retrieved documents while avoiding hallucination.
Use temperature=0 (greedy decoding) for best results. Provide relevant documents in the system prompt.
System Prompt Format:
```text
The following documents may provide you additional information to answer questions:
<document1>
[Document content here]
</document1>
<document2>
[Document content here]
</document2>
```
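If your retriever returns a list of document strings, a small helper can assemble them into this format. A minimal sketch; `build_rag_system_prompt` is a hypothetical name, not part of any Liquid AI tooling:

```python
# Hypothetical helper: wraps retrieved document strings in numbered
# <documentN> tags, matching the system prompt format shown above.
def build_rag_system_prompt(documents: list[str]) -> str:
    header = ("The following documents may provide you additional "
              "information to answer questions:")
    tagged = "".join(
        f"<document{i}>{doc}</document{i}>"
        for i, doc in enumerate(documents, start=1)
    )
    return header + tagged
```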
Example with Transformers:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "LiquidAI/LFM2-1.2B-RAG"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

system_prompt = """The following documents may provide you additional information to answer questions:
<document1>The centre, which was created in 1906, has been instrumental in advancing
agriculture research. The library at the Agriculture Canada research centre
in Lethbridge serves 48 scientists and 85 technicians, along with many
visiting staff and students.</document1>"""

user_input = "How many individuals were reported to be served by the library at the Agriculture Canada research centre in Lethbridge?"

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_input},
]

# Apply the chat template and append the assistant turn so the model
# answers the question rather than continuing the user's message.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Greedy decoding (do_sample=False), as recommended for this model.
outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
# Output: The library serves 48 scientists and 85 technicians, along with many visiting staff and students.
```
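For interactive use you can stream tokens as they are generated instead of waiting for the full response. A minimal sketch using Transformers' `TextStreamer`, reusing `model`, `tokenizer`, and `inputs` from the example above:

```python
from transformers import TextStreamer

# Print tokens to stdout as they are generated; skip_prompt hides the
# echoed input, and skip_special_tokens is forwarded to decode().
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(inputs, max_new_tokens=256, do_sample=False, streamer=streamer)
```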
Example with llama.cpp (GGUF):

```bash
llama-cli -m ./LFM2-1.2B-RAG-GGUF/LFM2-1.2B-RAG-Q4_K_M.gguf \
  -sys "The following documents may provide information: <doc1>The capital of France is Paris.</doc1>" \
  -p "What is the capital of France?" \
  --temp 0
```
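If you prefer to drive the GGUF build from Python, the llama-cpp-python bindings expose a chat-completion API. A sketch under the assumption that the quantized file has already been downloaded to the path shown (the path and `n_ctx` value are illustrative):

```python
from llama_cpp import Llama

# Illustrative local path; download the GGUF file first.
llm = Llama(
    model_path="./LFM2-1.2B-RAG-GGUF/LFM2-1.2B-RAG-Q4_K_M.gguf",
    n_ctx=4096,  # assumed context size for this sketch
)

result = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "The following documents may provide information: <doc1>The capital of France is Paris.</doc1>"},
        {"role": "user", "content": "What is the capital of France?"},
    ],
    temperature=0,  # greedy decoding, as recommended for this model
)
print(result["choices"][0]["message"]["content"])
```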