License for delta weights: CC-BY-NC-SA-4.0.

Model type: StableVicuna-13B is an auto-regressive language model based on the LLaMA transformer architecture. StableVicuna-13B is a Vicuna-13B v0 model fine-tuned using reinforcement learning from human feedback (RLHF) via Proximal Policy Optimization (PPO) on various conversational and instructional datasets.

Quantisation variants:

- 5-bit (q5_0): higher accuracy, higher resource usage and slower inference.
- 5-bit: higher accuracy than q5_0, but again higher resource usage and slower inference.

To run in llama.cpp, I use the following command line; adjust for your tastes and needs:

```
main -t 18 -m 4_0.bin --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -r "# Human:" -p "# Human: write a story about llamas # Assistant:"
```

Change `-t 18` to the number of physical CPU cores you have. For example, if your system has 8 cores/16 threads, use `-t 8`.

GGML models can be loaded into text-generation-webui by installing the llama.cpp module, then placing the ggml model file in a model folder as usual. Further instructions here: text-generation-webui/docs/.

Note: at this time text-generation-webui may not support the May 12th updated quantisation methods. Thireus has written a great guide on how to update it to the latest llama.cpp code, which may help with getting the files working in text-generation-webui sooner.
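To illustrate the thread-count advice above (8 cores/16 threads means `-t 8`), here is a minimal Python sketch that derives a `-t` value from a logical CPU count. It assumes two hardware threads per physical core, the common hyperthreading case; the function name is illustrative, and real topology detection would need something like `/proc/cpuinfo` or `psutil`:

```python
import os


def estimate_threads(logical_cpus: int, threads_per_core: int = 2) -> int:
    """Estimate physical cores from a logical CPU count.

    Assumes a fixed number of hardware threads per core (2 on most
    x86 CPUs with hyperthreading enabled); this is an assumption,
    not a guarantee, so verify against your actual CPU topology.
    """
    return max(1, logical_cpus // threads_per_core)


# Use the OS-reported logical CPU count to pick a -t value for llama.cpp.
logical = os.cpu_count() or 1
print(f"main -t {estimate_threads(logical)} -m 4_0.bin -c 2048")
```

On a machine without hyperthreading, pass `threads_per_core=1` so all cores are used.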
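The `-r "# Human:"` flag in the command above is a reverse prompt: llama.cpp stops generating when the model emits the human turn marker, which keeps the assistant from writing both sides of the conversation. A simplified Python sketch of that stop-string behaviour (the function is illustrative, not llama.cpp's actual internals):

```python
def truncate_at_reverse_prompt(generated: str, reverse_prompt: str = "# Human:") -> str:
    """Cut model output at the first occurrence of the reverse prompt.

    Mimics (in simplified form) what llama.cpp's -r option does:
    generation halts once the stop string appears, so only the text
    before it is returned to the user.
    """
    idx = generated.find(reverse_prompt)
    return generated if idx == -1 else generated[:idx]
```

For example, `truncate_at_reverse_prompt("Once there was a llama. # Human: more")` returns only the story text before the `# Human:` marker.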