
Google-backed Slang Labs to use hybrid model of LLMs – The Financial Express


With numerous large language models (LLMs), including India-specific ones, launched recently, Google-backed Slang Labs is opting for a hybrid model to take the best from each LLM. In the first half of next year, it will also release its own versions of some open source LLMs, optimised for specific domains and for India.

The company offers voice assistants that can be embedded inside popular apps, such as e-commerce or banking apps. The firm’s clients include Nykaa, ICICI Direct, Tata Digital, Fresho from Bigbasket, and others.

Currently, the company uses OpenAI for its voice assistant. Kumar Rangarajan, co-founder of Slang Labs, said the company has started fine-tuning open source LLMs, such as Meta’s LLaMA and the LLM from France-based generative AI startup Mistral AI, to eventually have a hybrid model of LLMs for its voice assistant, CONVA.

“There are three layers for LLM – the first is called the base LLM, where it is generally trained with a lot of internet data and different language data for general purpose. This model has a good understanding but not trained to become a good assistant. If you ask questions, it will not be able to answer them in a proper way.

“While it has a lot of knowledge, it is not that smart to answer correctly, because it is very poor in following instructions,” Rangarajan said.


Building the base model is an expensive proposition, as the bulk of the cost goes into it. The next layer is the pre-training layer, where the system learns which answers are correct and which are wrong. It learns which answer to prefer when there are multiple candidates. A number of techniques are used to make sure the model gives the right answer.
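As a rough illustration of how a model can learn which answer to prefer when there are several, the sketch below computes a pairwise preference loss of the kind commonly used in preference tuning. It is an assumption made for illustration, not a description of Slang Labs' method; the function name and the example scores are hypothetical.

```python
import math


def preference_loss(score_preferred: float, score_rejected: float) -> float:
    """Pairwise (Bradley-Terry style) loss: -log(sigmoid(margin)).

    The loss is small when the preferred answer scores well above the
    rejected one, and large when the ranking is the wrong way round.
    Both scores are hypothetical outputs of some scoring model.
    """
    margin = score_preferred - score_rejected
    # Numerically stable form of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical scores for two candidate answers to the same question.
correct_ranking = preference_loss(2.0, 0.0)  # preferred answer scored higher
wrong_ranking = preference_loss(0.0, 2.0)    # preferred answer scored lower
print(correct_ranking < wrong_ranking)
```

Training on many such pairs pushes the scores of preferred answers above those of rejected ones, which is one way a system can come to tell better answers from worse.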

The third layer is fine-tuning, where the LLM, already trained to answer properly, is made suitable for specific use cases. “People like us or other companies like us can take this lower-level model and build and optimise it for particular purpose or use cases. We are taking base models from LLaMA and Mistral and pre-training and fine-tuning them,” explained Rangarajan.


