RubyFlow: The Ruby and Rails community linklog


Run LLMs natively in Ruby with Rust + GPU support

Red Candle is a Ruby gem that lets you run large language models such as Llama 2, Llama 3, Mistral, and Gemma directly inside your Ruby process, with the heavy lifting done in Rust.

It uses Magnus to interface with Hugging Face’s Candle crate, giving your Ruby app direct access to the LLM through FFI, with no Python or separate server needed.
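
For example, installing the gem and streaming a chat-style completion from a quantized model might look like the sketch below. This is illustrative only: the class and method names are assumptions, and the model id is just an example, so check the repository README for the exact API.

    # gem install red-candle
    require "candle"

    # Load a quantized GGUF model from Hugging Face.
    # The model id and gguf_file below are illustrative.
    llm = Candle::LLM.from_pretrained(
      "TheBloke/Mistral-7B-Instruct-v0.2-GGUF",
      gguf_file: "mistral-7b-instruct-v0.2.Q4_K_M.gguf"
    )

    # Stream tokens to stdout as they are generated, all inside
    # the current Ruby process.
    llm.generate("Summarize what Magnus does in one sentence.") do |token|
      print token
    end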

Red Candle supports:

  • Chat completions (including streaming)
  • Embeddings
  • Reranking
  • Named entity recognition
  • Hardware acceleration (Metal and CUDA)
  • Both safetensors and quantized GGUF model formats
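
The embeddings support means you can generate vectors for semantic search without leaving Ruby. A rough sketch, again with assumed names (the EmbeddingModel class and the model id are guesses; consult the README for the real interface):

    require "candle"

    # Load a small sentence-embedding model (model id is illustrative).
    embedder = Candle::EmbeddingModel.from_pretrained(
      "sentence-transformers/all-MiniLM-L6-v2"
    )

    # Returns a dense float vector you can store or compare,
    # e.g. for cosine-similarity search.
    vector = embedder.embedding("Ruby can talk to Rust through Magnus.")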

GitHub: https://github.com/assaydepot/red-candle
