Exploring local LLMs: Mistral, Llama, and CodeLlama
When it comes to local LLMs, each model presents new opportunities. Witt Langstaff dives in to explore different models and the pros and cons of each.
Over the weekend, I started a deeper dive into the world of locally running LLMs. What’s the draw? The ability to run an AI model completely offline, without relying on third-party services, is a game-changer, especially for commercial applications where control is key or in scenarios where reliance on external AI systems presents a vulnerability. The independence gained here opens up a ton of possibilities for AI usage, offering a blend of power, privacy, and personalized control.
Exploring the capabilities
Each local model brings something different to the table. Mistral and CodeLlama, for instance, are fantastic at writing JavaScript, Python, and other languages. But they're not limited to code generation. They also excel at the tasks we've come to admire in ChatGPT, from inspired writing to meticulous document analysis to even coming up with recipes. The best part? All of this happens offline, in our own local setup.
The importance of resources
As the saying goes, "Knowledge is power." Hugging Face is at the forefront of providing publicly available models. Their "Alignment Handbook" is a treasure trove that offers robust recipes for aligning language models with human and AI preferences. It's a resource that underscores the value of "Supervised Fine-Tuning," a method for training LLMs that approaches the capabilities of custom GPTs.
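To give a feel for what supervised fine-tuning looks like in practice, here's a minimal sketch using Hugging Face's trl library and the UltraChat dataset that the Alignment Handbook's SFT recipes build on. The exact arguments and dataset handling vary between trl releases, so treat this as an illustration rather than a drop-in script:

```python
from datasets import load_dataset
from trl import SFTTrainer

# Chat-style dataset used in the Alignment Handbook's SFT recipes.
dataset = load_dataset("HuggingFaceH4/ultrachat_200k", split="train_sft")

# trl can load the base model from its name and handle tokenization.
# In a real run you'd also pass a config with batch size, learning
# rate, sequence length, etc., and you'd need serious GPU resources
# for a 7B-parameter model.
trainer = SFTTrainer(
    model="mistralai/Mistral-7B-v0.1",
    train_dataset=dataset,
)
trainer.train()
```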
TTS demos: A glimpse into the future
Text-to-speech demos, particularly those running in Google Colaboratory, are like windows into the future. They showcase an aspect of AI that's both intriguing and unnervingly lifelike at times, and at others… just unnerving. The "laughing with throat clearing" demo, for example, is as captivating as it is unsettling.
Additional resources for LLM enthusiasts
LMStudio is hands down the best way I've found to get different LLMs running locally, offering a straightforward way to get started. You can import models directly from Hugging Face, and they run in containers, which keeps everything nice and neat.
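One feature worth calling out: LMStudio can expose whatever model you've loaded through a local OpenAI-compatible server (by default at http://localhost:1234/v1), so existing OpenAI client code can be pointed at it with a one-line change. A minimal sketch, assuming the server is enabled and a model is loaded:

```python
from openai import OpenAI

# Point the standard OpenAI client at LMStudio's local server.
# The api_key is a placeholder; the local server doesn't check it.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="local-model",  # LMStudio routes to whichever model is loaded
    messages=[{"role": "user", "content": "Write a haiku about running offline."}],
)
print(response.choices[0].message.content)
```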
Ollama.ai is another great resource for loading LLMs locally, but it's geared more toward people who are comfortable working with the command line. It's a great jumping-off point for those interested in venturing further, such as running the vision-capable LLaVA model on their local machine. Though not on par with GPT-4 Vision, it puts visual AI models within our local reach.
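Beyond the command line (`ollama run llava` will pull and start a model), Ollama also serves a simple REST API on port 11434, which makes it easy to wire local models into Python scripts. A minimal sketch, assuming Ollama is running and the mistral model has been pulled:

```python
import requests

# Ollama's local REST API; stream=False returns one JSON object
# instead of a stream of partial responses.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "mistral",
        "prompt": "Explain what a local LLM is in one sentence.",
        "stream": False,
    },
)
print(response.json()["response"])
```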
Conclusion
The journey into local LLMs is a wild ride: it's challenging and rewarding, and I'm consistently surprised at what can be accomplished, all while completely offline. Stringing together LLMs and various Python libraries takes time, planning, research, and in some cases a lot of effort. But the reward? Privacy, independence, and a less restrictive AI experience, unfettered by the constraints of cloud-based services.
Did you enjoy this article? Read more like it on the Edgar Allan blog.
Say hi to Edgar Allan on LinkedIn, X (Twitter), YouTube, and TikTok. We’d love to hear from you!
Take a look at the work Edgar Allan has done by checking out our case studies.