Do you host your own AI?

SuspiciousCarrot78@aussie.zone · 2 days ago

Do you host your own AI?

SuspiciousCarrot78@aussie.zone · 1 day ago

Llama.cpp or death!

tristynalxander@mander.xyz · 19 hours ago

It’s not that hard to use llama.cpp directly anyway. Why would I use a wrapper when I can just run a python script?

BlackLaZoR@lemmy.world · 1 hour ago

I use LMStudio, because it has quality of life improvements like nice GUI and huggingface search engine. Also they have Vulkan backend that at least on 7900XTX is ~10% faster than rocm (on LLama 3 8b Q4_0 it gets 115Tokens/s vs 105 on rocm)

brucethemoose@lemmy.world · 21 hours ago

Or exllama! Vllm, sglang, Lorax. Koboldcpp, Aphrodite, text-generation-webui, LM Studio, powerinfer, ktransformers, mlc-LLM, really whatever floats your boat. Just not ollama, specifically.