Researchers from UC Berkeley, Carnegie Mellon, Stanford, and UC San Diego today unveiled Vicuna-13B, an open-source alternative to GPT-4 that reportedly achieves 90% of ChatGPT’s quality for a training cost of around $300, following Stanford University’s launch of its ChatGPT-style model Alpaca for $600. The model was fine-tuned from LLaMA on user-shared conversations collected from ShareGPT.
In addition, the model weights were made public alongside the launch. The release is expected to interest companies and individuals looking to apply recent language models to natural language processing applications.
How Does Vicuna Perform?
Preliminary findings for Vicuna are promising. Because it was trained on roughly 70,000 user-shared ChatGPT conversations, the model produces answers with greater depth and structure. Its answers were rated better than those of well-known alternatives such as LLaMA and Stanford Alpaca in more than 90% of test cases.
Early evaluations using GPT-4 as a judge show that Vicuna-13B achieves more than 90% of the quality of both OpenAI’s ChatGPT and Google’s Bard, while outperforming other models such as LLaMA and Stanford Alpaca in more than 90% of cases. The findings have sparked considerable interest in the field of natural language processing, particularly among businesses looking to capitalise on recent advances in AI. The researchers note in a blog post that building an evaluation system for chatbots is still a work in progress.
Vicuna-13B is an open-source chatbot with 13B parameters, developed by researchers at Berkeley, Carnegie Mellon, Stanford, and the University of California, San Diego, and trained by fine-tuning the LLaMA foundation model on user-shared conversations. The training data, approximately 70,000 conversations, was collected from ShareGPT through its public APIs. ShareGPT is a Google Chrome extension that lets users share links to their ChatGPT conversations with one click.
The researchers say that, before using the data, they convert any HTML back into markdown and filter out conversations that are inappropriate or of poor quality. To fit within the model’s maximum context length, lengthy conversations are split into smaller segments.
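The segmentation step can be illustrated with a minimal sketch. It is not the researchers' code: the turn format, the `count_tokens` helper, and the `split_conversation` function are hypothetical names introduced here, and whitespace word counts stand in for a real tokenizer.

```python
from typing import Dict, List

# Hypothetical turn shape: {"role": "human" or "assistant", "text": "..."}
Turn = Dict[str, str]


def count_tokens(text: str) -> int:
    """Stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def split_conversation(turns: List[Turn], max_tokens: int) -> List[List[Turn]]:
    """Greedily pack consecutive turns into segments of at most max_tokens.

    A single turn that is longer than max_tokens is kept whole in a segment
    of its own; a real pipeline would also need to truncate or drop it.
    """
    segments: List[List[Turn]] = []
    current: List[Turn] = []
    used = 0
    for turn in turns:
        n = count_tokens(turn["text"])
        # Close the current segment when this turn would exceed the budget.
        if current and used + n > max_tokens:
            segments.append(current)
            current, used = [], 0
        current.append(turn)
        used += n
    if current:
        segments.append(current)
    return segments
```

Splitting only at turn boundaries keeps each question together with the text of a whole reply, at the cost of segments that are sometimes shorter than the context limit allows.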
The training recipe builds on Stanford’s Alpaca, to which the team added enhancements such as:
1. Memory optimization
2. Multi-round conversations
3. Cost reduction
GPT-4 for Chatbot Evaluation
The Vicuna team acknowledges that evaluating chatbots is difficult and has proposed a GPT-4-based evaluation framework to simplify the process. The team designed eight question categories to test different aspects of chatbot performance and found that GPT-4 can produce relatively consistent scores, along with detailed explanations of those scores.
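A headline figure such as "90% of ChatGPT's quality" can be derived from per-question judge scores in more than one way. The sketch below shows one plausible aggregation, assuming the judge returns a numeric score for each model on each question. The function names and the score values are invented for illustration and are not taken from the Vicuna evaluation.

```python
from typing import List, Tuple

# Each pair is (candidate_score, baseline_score) for one question,
# as assigned by the judge model. Values used below are made up.
ScorePair = Tuple[float, float]


def relative_quality(pairs: List[ScorePair]) -> float:
    """Candidate's total score as a percentage of the baseline's total."""
    candidate_total = sum(c for c, _ in pairs)
    baseline_total = sum(b for _, b in pairs)
    if baseline_total == 0:
        raise ValueError("baseline total score is zero")
    return 100.0 * candidate_total / baseline_total


def win_rate(pairs: List[ScorePair]) -> float:
    """Fraction of questions where the candidate scores strictly higher."""
    if not pairs:
        raise ValueError("no score pairs given")
    wins = sum(1 for c, b in pairs if c > b)
    return wins / len(pairs)


if __name__ == "__main__":
    # Hypothetical scores for four questions.
    example = [(9.0, 9.0), (8.0, 9.0), (7.0, 8.0), (9.0, 10.0)]
    print(f"relative quality: {relative_quality(example):.1f}%")
    print(f"win rate: {win_rate(example):.2f}")
```

The two numbers answer different questions: the ratio of totals measures how close the candidate is to the baseline overall, while the win rate counts how often it comes out ahead on individual questions.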
The proposed evaluation framework is not yet a rigorous method, because large language models such as GPT-4 are prone to hallucination. More research is required, since how best to evaluate chatbots remains an open question.
Like other large language models, Vicuna struggles with mathematical and reasoning-based tasks. It may also fail to identify itself correctly or to ensure that its outputs are factually accurate. In addition, it has not been sufficiently optimized for safety or to reduce toxicity and bias.
Despite this, Vicuna offers a starting point for further research into addressing these limitations.