simple — Plain text completion¶
The smallest possible program: load a model, generate a completion, print the result. Use it as a starting point for a one-shot CLI tool or as a template when you want full control over the sampler chain.
Run¶
The first positional argument is the path to a GGUF model.
What it does¶
use llama_crab::{Llama, LlamaParams};
fn main() -> Result<(), Box<dyn std::error::Error>> {
let mut llama = Llama::load(
LlamaParams::new("model.gguf")
.with_n_ctx(2048)
.with_n_gpu_layers(99),
)?;
let resp = llama.create_completion("Once upon a time", 64)?;
println!("{}", resp.text);
Ok(())
}
Expected output¶
The actual text depends on the model. The point of the example is the shape of the program: a few lines, no ceremony, all the defaults applied.
Customising the call¶
Pass [CompletionOptions] to the high-level helper to expose the
rest of the sampler chain:
use llama_crab::{CompletionOptions, Llama, LlamaParams};
fn main() -> Result<(), Box<dyn std::error::Error>> {
let mut llama = Llama::load(LlamaParams::new("model.gguf"))?;
let resp = llama.create_completion_with_options(
"Once upon a time",
CompletionOptions::new(64)
.with_temperature(0.7)
.with_top_p(0.9, 1)
.with_stop_sequence("\n\n"),
)?;
print!("{}", resp.text);
Ok(())
}
See the text completion guide for the full menu of options.
Using a custom sampler chain¶
For full control, build a [SamplerChain] and call
create_completion_with_sampler:
use llama_crab::sampling::SamplerChain;
use llama_crab::{CompletionOptions, Llama, LlamaParams};
fn main() -> Result<(), Box<dyn std::error::Error>> {
let mut llama = Llama::load(LlamaParams::new("model.gguf"))?;
let mut sampler = SamplerChain::new()
.temp(0.7)
.top_p(0.9, 1)
.min_p(0.05, 1)
.penalties(64, 1.1, 0.0, 0.0)
.build();
let resp = llama.create_completion_with_sampler(
"Once upon a time",
CompletionOptions::new(64),
&mut sampler,
)?;
print!("{}", resp.text);
Ok(())
}
See the sampling strategies guide for the full menu of samplers and recommended chains.