Installation

llama-crab is distributed on crates.io and can also be built from a Git checkout. The default features compile the CPU (OpenMP) backend and, on Apple Silicon, the Metal backend, so most users can add the dependency and start building.

1. Add the dependency

From crates.io:

Cargo.toml
[dependencies]
llama-crab = "0.1"

From Git:

Cargo.toml
[dependencies]
llama-crab = { git = "https://github.com/DominguesM/llama-crab", branch = "main" }

From a local checkout:

Cargo.toml
[dependencies]
llama-crab = { path = "../llama-crab" }

Pin the llama.cpp version

The crate pins llama.cpp to a specific commit, so two builds of the same llama-crab version always produce the same native library. You can see the pinned commit on the README badge or through cargo tree -p llama-crab-sys.

2. Pick a backend

The default features give you a working binary on the most common platforms, but you almost always want to be explicit:

Metal:

Cargo.toml
[dependencies]
llama-crab = { version = "0.1", default-features = false, features = ["metal", "openmp"] }

CUDA:

Cargo.toml
[dependencies]
llama-crab = { version = "0.1", default-features = false, features = ["cuda", "openmp"] }

ROCm:

Cargo.toml
[dependencies]
llama-crab = { version = "0.1", default-features = false, features = ["rocm", "openmp"] }

Vulkan:

Cargo.toml
[dependencies]
llama-crab = { version = "0.1", default-features = false, features = ["vulkan", "openmp"] }

CPU only:

Cargo.toml
[dependencies]
llama-crab = { version = "0.1", default-features = false, features = ["openmp"] }

Mobile: see the dedicated Mobile distribution guide.

See the Cargo features reference for the complete list of features and what each one toggles.

3. System requirements

The build script compiles llama.cpp from source. Make sure the following are available before running cargo build:

macOS:

# Xcode Command Line Tools
xcode-select --install

# CMake (Homebrew, or use the one from CLT if present)
brew install cmake

Debian/Ubuntu (apt):

sudo apt update
sudo apt install -y build-essential cmake

Fedora (dnf):

sudo dnf install -y gcc gcc-c++ cmake make

Windows:

# Install Visual Studio 2022 with the "Desktop development with C++"
# workload, then:
winget install Kitware.CMake

First build is slow

Compiling every llama.cpp backend takes ~3 minutes on a 16-core machine the first time. Subsequent builds are cached. To shorten the cold build, disable the backends you don't need (see step 2).

4. Verify the toolchain

After the install, run a quick cargo build to make sure CMake, the compiler and the C++ standard library are all reachable:

cargo new hello-crab --bin
cd hello-crab
# Add the dependency shown in step 1, then:
cargo build --release

A successful build prints something like:

   Compiling llama-crab-sys v0.1.300 (...)
   Compiling llama-crab v0.1.300 (...)
    Finished `release` profile [optimized] [..]

You're ready to write your first program.
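
If the build fails before any Rust code compiles, a missing native tool is a likely cause. The following is a minimal sketch, using only the Rust standard library, of a pre-flight check that looks for CMake on PATH and reads its version. The function names are illustrative and not part of llama-crab, and the parser assumes the first line of `cmake --version` has the form `cmake version X.Y.Z`.

```rust
use std::process::Command;

/// Parse the first line of `cmake --version` output, e.g. "cmake version 3.28.1".
/// Returns (major, minor) when the line has the expected shape, None otherwise.
fn parse_cmake_version(output: &str) -> Option<(u32, u32)> {
    let first = output.lines().next()?;
    let version = first.trim().strip_prefix("cmake version ")?;
    let mut parts = version.split('.');
    let major = parts.next()?.parse::<u32>().ok()?;
    let minor = parts.next()?.parse::<u32>().ok()?;
    Some((major, minor))
}

/// Run `cmake --version` and return the parsed version, or None when CMake
/// cannot be started (for example, not on PATH) or prints something unexpected.
fn cmake_version() -> Option<(u32, u32)> {
    let out = Command::new("cmake").arg("--version").output().ok()?;
    if !out.status.success() {
        return None;
    }
    parse_cmake_version(&String::from_utf8_lossy(&out.stdout))
}

fn main() {
    match cmake_version() {
        Some((major, minor)) => println!("found CMake {major}.{minor}"),
        None => eprintln!("CMake not found on PATH; see step 3"),
    }
}
```

The same pattern can be repeated for the C++ compiler on your platform. The check only confirms that the tool starts; it does not prove the build script will accept that version.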

Optional: download a model

The rest of the guide assumes you have a GGUF file on disk. The easiest way to grab a known-good one is the helper script:

./scripts/download_models.sh smol
# → models/qwen2.5-0.5b-instruct-q4_k_m.gguf
./scripts/download_models.sh bge
# → models/bge-small-en-v1.5-q4_k_m.gguf
./scripts/download_models.sh gemma4
# → models/gemma-4-E4B-it-Q4_K_M.gguf
# → models/mmproj-gemma-4-E4B-it-BF16.gguf

See scripts/download_models.sh for the full list of supported targets.
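
Before pointing the library at a download, it can help to confirm that the file really is GGUF rather than a truncated transfer or an HTML error page. The sketch below relies on the GGUF container starting with the 4-byte magic `GGUF` followed by a 32-bit format version, read here as little-endian (the common case; big-endian files would need the opposite byte order). It uses only the standard library, and `gguf_version` and `probe_gguf` are hypothetical helper names, not llama-crab APIs.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Interpret the first 8 bytes of a file as the start of a GGUF header:
/// the magic "GGUF", then a little-endian u32 format version.
/// Returns None when the magic does not match.
fn gguf_version(header: &[u8; 8]) -> Option<u32> {
    if !header.starts_with(b"GGUF") {
        return None;
    }
    Some(u32::from_le_bytes([header[4], header[5], header[6], header[7]]))
}

/// Read only the first 8 bytes of `path` and check them.
/// A file shorter than 8 bytes surfaces as an UnexpectedEof I/O error.
fn probe_gguf(path: &Path) -> io::Result<Option<u32>> {
    let mut header = [0u8; 8];
    File::open(path)?.read_exact(&mut header)?;
    Ok(gguf_version(&header))
}

fn main() -> io::Result<()> {
    // Default path is the `smol` target listed above; pass another path as the first argument.
    let path = std::env::args()
        .nth(1)
        .unwrap_or_else(|| "models/qwen2.5-0.5b-instruct-q4_k_m.gguf".to_string());
    match probe_gguf(Path::new(&path))? {
        Some(version) => println!("{path}: GGUF version {version}"),
        None => println!("{path}: not a GGUF file"),
    }
    Ok(())
}
```

This inspects the header only; it does not validate tensor data or check that the model is one the library can load.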

Next steps