Nous Research has just unveiled its latest and most impressive creation to date—the Nous-Hermes-2 Mixtral 8x7B. This groundbreaking flagship Large Language Model (LLM) represents a significant leap forward, being the company’s first model to be fine-tuned using Reinforcement Learning from Human Feedback (RLHF). It’s also the first to surpass the renowned Mixtral Instruct across a wide array of popular benchmarks, setting a new standard for AI performance.
Today marks the release of two distinct configurations of Nous-Hermes-2: the SFT (Supervised Fine-Tuning) only model and the enhanced SFT+DPO (Direct Preference Optimization) model, alongside a QLoRA adapter designed specifically for the DPO variant. Both models are now available to the public via HuggingFace, offering users the opportunity to test both and determine the best fit for their specific applications.
Advancements in Nous-Hermes-2
Here’s how it compares to Mixtral Instruct.
From Twitter, an example of the model writing code for data visualization:
The Nous-Hermes-2 Mixtral 8x7B model comes in two variants: SFT+DPO and SFT-only. The SFT-only model has undergone Supervised Fine-Tuning (SFT) alone, while the SFT+DPO model adds a Direct Preference Optimization (DPO) stage on top of the supervised fine-tuning. These two configurations allow users to choose between the SFT-only or the combined SFT+DPO model based on their specific requirements and performance preferences.
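To give a sense of what the DPO stage adds on top of SFT, here is a minimal numpy sketch of the standard DPO objective (this is the generic Direct Preference Optimization loss, not Nous's actual training code; the toy log-probabilities below are made up for illustration):

```python
import numpy as np

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss: push the policy to prefer the chosen response
    over the rejected one, relative to a frozen reference model,
    with the preference margin scaled by the temperature beta."""
    # Log-ratios of the policy vs. the frozen reference model
    chosen_ratio = logp_chosen - ref_logp_chosen
    rejected_ratio = logp_rejected - ref_logp_rejected
    margin = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(margin)), written stably as softplus(-margin)
    return np.logaddexp(0.0, -margin)

# Toy numbers: the policy already slightly prefers the chosen answer,
# so the loss is small but nonzero.
loss = dpo_loss(logp_chosen=-10.0, logp_rejected=-12.0,
                ref_logp_chosen=-11.0, ref_logp_rejected=-11.5)
```

The key point is that DPO needs no reward model or sampling loop: it optimizes preference pairs directly, which is what makes it a lightweight alternative to classic RLHF pipelines.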
Nous also released a QLoRA adapter that can be attached to, or merged into, any Mixtral-based model, potentially bringing the benefits of the DPO training to other Mixtral fine-tunes, and perhaps even the base model. In other words, you may be able to improve the performance of other Mixtral fine-tunes by adding the QLoRA adapter, even if you're not using the SFT+DPO variant of Nous-Hermes-2.
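Mechanically, "merging" a LoRA-style adapter means folding its low-rank weight update back into each base weight matrix. A minimal numpy sketch of that arithmetic (illustrative only, with made-up toy dimensions; in practice you would load the adapter with the `peft` library rather than doing this by hand):

```python
import numpy as np

def merge_lora(W, A, B, alpha, r):
    """Fold a low-rank LoRA update into a base weight matrix:
    W' = W + (alpha / r) * B @ A, with A of shape (r, d_in)
    and B of shape (d_out, r)."""
    return W + (alpha / r) * (B @ A)

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 16, 2           # toy dimensions; real ranks vary
W = rng.normal(size=(d_out, d_in))  # frozen base weight
A = rng.normal(size=(r, d_in))      # adapter down-projection
B = np.zeros((d_out, r))            # adapter up-projection (zero-init)

# With B initialized to zero, merging is a no-op -- the adapter only
# changes the model's behavior once B has been trained.
merged = merge_lora(W, A, B, alpha=16, r=r)
```

Because the merged matrix has the same shape as the original, the resulting model runs with no extra inference cost, which is why adapter merging is an attractive way to transfer the DPO training to other fine-tunes.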
The advent of Nous-Hermes-2 Mixtral 8x7B marks a milestone in the progress of open-source AI, illustrating the rapid advancements being made each day. This significant release from the Nous team not only meets but surpasses the capabilities of the best open-source model on the market. With its superior performance in 10-shot MMLU, it sets a new bar for the industry, and while showcasing 5-shot MMLU would have been a valuable addition, the current achievements are no less impressive. In my experience, the DPO version seems better.
The model’s use of ChatML as the prompt format and the integration of system prompts for steerability highlight the forward-thinking approach of Nous Research. This not only enhances the model’s versatility but also makes it incredibly user-friendly. The seamless transition for developers and researchers currently using OpenAI APIs to Nous-Hermes-2 is a testament to the thoughtful engineering and user-centric design of the new model.
It’s clear that the gap between proprietary and open-source AI solutions is narrowing with each passing day. The Nous team’s commitment to innovation and openness is not just commendable but a driving force in the democratization of AI technology. Users across the globe can now harness the power of cutting-edge language models, thanks to the relentless efforts of researchers and developers in pushing the boundaries and expanding what’s possible in the realm of AI. With Nous-Hermes-2, the future of open-source AI looks brighter than ever.