Bobbie-model

If you’ve been following the open-source LLM space, you’ve likely memorized the specs of Llama 3, Mixtral, and Qwen. But a new contender has been quietly gaining traction in the "small model" category: Bobbie. Bobbie is not just another incremental fine-tune. It represents a thoughtful experiment in .

A quick start with Hugging Face Transformers:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# "bobbie-model" is the model id used elsewhere in this post
tokenizer = AutoTokenizer.from_pretrained("bobbie-model")
model = AutoModelForCausalLM.from_pretrained("bobbie-model")

messages = [{"role": "user", "content": "Summarize this 20k token document..."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
# decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(output[0][inputs.shape[1]:], skip_special_tokens=True))
```

Bobbie works out-of-the-box with vLLM 0.6.0+ under the model id `bobbie-model`.
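Since vLLM 0.6.0 serves models behind an OpenAI-compatible API, using Bobbie there amounts to starting `vllm serve bobbie-model` and POSTing a standard chat-completions request. A minimal sketch of that request body, reusing the model id and prompt from this post — the `/v1/chat/completions` endpoint is standard vLLM behavior, not something this post specifies:

```python
import json

# Request body for vLLM's OpenAI-compatible /v1/chat/completions endpoint.
# "bobbie-model" is the model id mentioned in this post; the exact Hub path
# may differ.
payload = {
    "model": "bobbie-model",
    "messages": [
        {"role": "user", "content": "Summarize this 20k token document..."}
    ],
    "max_tokens": 512,
    "temperature": 0.7,
}

body = json.dumps(payload, indent=2)
print(body)
```

With the server running (it listens on port 8000 by default), POSTing this body to `http://localhost:8000/v1/chat/completions` returns a standard chat-completion response.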

Published: April 13, 2026 | Reading time: 10 minutes

Who am I?

My name is Patrick McKenzie (better known as patio11 on the Internets).

Twitter: @patio11
HN: patio11

Bits about Money

I write Bits about Money, a monthly-ish newsletter on the intersection of tech and finance.

Complex Systems

I host the Complex Systems podcast, a weekly conversation about the technical and human factors underlying infrastructure.