In the field of artificial intelligence, few innovations have been as disruptive as Large Language Models (LLMs), renowned for their ability to understand and generate human-like text. These models, powered by cutting-edge technologies, are reshaping industries and the way we interact with technology. While the Generative Pre-trained Transformer (GPT) series, developed by OpenAI, stands as a testament to LLMs’ potential, it is the accessibility and versatility of platforms like Hugging Face that are truly driving this revolution.
What is Hugging Face’s Model Hub?
Traditionally, tech giants operated within proprietary frameworks, guarding their advanced AI models closely. However, a growing realization of the benefits of open-source development has led companies to release some of their most sophisticated models to the wider community. Meta, Microsoft, and others have embraced this new era of openness, contributing pre-trained models to Hugging Face’s Model Hub.
Hugging Face’s Model Hub serves as a central repository for pre-trained AI models, datasets, and resources. This open platform is democratizing AI by enabling developers, researchers, and businesses to access, fine-tune, and deploy these models for various applications. The involvement of industry giants in open-sourcing their models on this platform adds a new layer of significance to its ecosystem.
The integration of models from major tech companies into Hugging Face’s Model Hub enables collaborative innovation. Developers and researchers from around the world can now build on these models, creating novel applications, improving performance, and tailoring solutions to specific needs. This level of collaboration has the potential to accelerate AI advancements at an unprecedented pace, while simultaneously leveling the playing field for smaller players who may not have had access to such resources before.
One of the most exciting aspects is integrating LLMs into production pipelines. There are several ways to achieve this, but in this article we want to present the solution powered by Helicon, Radicalbit’s MLOps platform.
Helicon manages the whole life cycle of a model in production, from data processing before inference, through model serving, to performance monitoring. One of its latest features is the integration of the Hugging Face Model Hub into Helicon, so that LLMs can be deployed with no code and a few clicks.
How to run a Hugging Face Model in Helicon
Here’s what you need to run a Hugging Face model:
- An active Helicon subscription (if you don’t have one or you want to start a free trial, click here);
- 2 minutes on your hands.
First of all, you need to go to the MLOps section of the platform and create a New Model, choosing a Name and providing a Description.
Helicon gives you two options for uploading models: MLflow and Hugging Face. We’re going to roll with the Hugging Face option (we will publish a dedicated blog post about MLflow deployment).
Helicon has a direct line to the Hugging Face APIs. To get your model, choose the Task and the Model Repository name, and you’re done. For this demonstration, we will import a fill-mask model called bert-base-uncased, a variant of BERT (Bidirectional Encoder Representations from Transformers) developed by Google (here’s the paper: https://arxiv.org/abs/1810.04805), able to fill the masked words in a sentence with the most probable ones.
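To get a feel for what bert-base-uncased does before serving it, here is a minimal sketch using the Hugging Face transformers library directly (assuming transformers and a backend such as PyTorch are installed; Helicon wires up the same model without requiring any of this code):

```python
from transformers import pipeline

# Load bert-base-uncased as a fill-mask pipeline; the weights are
# downloaded from the Hugging Face Model Hub on first use.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Ask the model to propose the most probable tokens for the mask.
predictions = fill_mask("Paris is the [MASK] of France.")

# Each prediction carries the candidate token and its probability.
for p in predictions:
    print(f"{p['token_str']}: {p['score']:.3f}")
```

The pipeline returns the top candidates sorted by probability, which is exactly the behavior the deployed model exposes through Helicon’s serving layer.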
All the previous steps have taken about 30 seconds of your time; the remaining 90 will be spent deploying the model and making it available for inference. Press the Serve button and enjoy your Hugging Face model!
A few considerations before concluding.
There are plenty of tools and solutions to deploy a Hugging Face model, so why should you choose Helicon?
Helicon provides an entire ecosystem of features and perks for your machine learning models, such as versioning, performance monitoring in production, drift and data integrity detection, and much more. In addition, it exploits the power of Pipelines to pre-process and post-process your data right before and after inference, an essential step to make raw data ready for the model and the predictions suitable for your custom use cases.
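As an illustration of the kind of pre- and post-processing a Pipeline handles, here is a minimal, hypothetical sketch. The function names and the dummy model are ours for illustration, not Helicon’s actual API:

```python
def preprocess(raw_text: str) -> str:
    """Make raw input ready for the model: trim whitespace, normalize case."""
    return raw_text.strip().lower()

def model_inference(text: str) -> float:
    """Stand-in for the served model: returns a dummy score in [0, 1]."""
    return min(len(text) / 100.0, 1.0)

def postprocess(score: float) -> dict:
    """Turn the raw prediction into a response fit for the use case."""
    return {"score": round(score, 2), "label": "long" if score > 0.5 else "short"}

def serve(raw_text: str) -> dict:
    """Chain pre-processing, inference and post-processing, as a Pipeline does."""
    return postprocess(model_inference(preprocess(raw_text)))

print(serve("  Hello Hugging Face!  "))
```

The value of keeping these three stages separate is that the model itself stays untouched while the input cleaning and the output formatting evolve with your use case.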
One last mention concerns Applications, through which it is possible to build and expose a service containing pre-processing, inference and post-processing all in one shot, accessible via a single HTTP call.
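To give an idea of what consuming such a service looks like from a client, here is a hedged sketch of an HTTP call. The endpoint URL and payload shape are illustrative assumptions, not Helicon’s actual API:

```python
import json
from urllib import request

# Hypothetical Application endpoint and payload: adjust both to the URL
# and schema that your own Helicon Application actually exposes.
APP_URL = "https://example.com/applications/fill-mask/invoke"
payload = {"input": "Paris is the [MASK] of France."}

req = request.Request(
    APP_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Uncomment to actually call the deployed service:
# with request.urlopen(req) as resp:
#     print(json.loads(resp.read()))
```

From the caller’s perspective, pre-processing, inference and post-processing are invisible: one request goes in, one prediction comes out.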
Do you want to know more about Helicon? Visit our website and book your free demonstration!