The Build or Buy Dilemma for Generative AI Development was a hot topic in our panel discussion from Startup Boston Week 2023. To continue that discussion, we’re exploring some of the pros and cons startups have to evaluate and what factors go into making an informed decision.
For those new to this discussion, the quick summary is that ‘Buying’ for Generative AI refers to the black-box, closed-source commercial offerings from Google, Microsoft, AWS and OpenAI. The ‘Build’ choice is based on using open-source models like Vicuna, LLaMA or Falcon.
How Should Startups Choose Between Open Source and Closed Source Generative AI Models?
There are both external and internal factors to use in the evaluation process. As with any build versus buy decision, you’ll look at industry pressures and the need to be competitive, whether that is being first to market or playing catch up.
Startup leaders also have to look at the limitations coming from the commercial providers. They’re called “black box” for a reason, so there are a few unknowns to consider. And finally, for internal factors, leaders have to evaluate the readiness of their tech and their people. Regardless of a startup’s maturity, the process to evaluate LLMs and which approach is the best fit is going to depend on business goals and priorities.
Through experience with many clients, Gravitate suggests assessing priorities in three key areas:
Speed to Market
Cost and Resources
IP or Potential AI Assets
Choosing the Right AI Model: When Speed to Market is Priority
For some organizations, being first to market with a new capability is the priority. To build up velocity within your AI product development, we are seeing leaders choose an off-the-shelf option like ChatGPT.
Since the model is established, it gives the team more room to focus on integrating it into an existing product or spinning up a new prototype quickly. For those teams that choose an open-source model, there is a larger investment required upfront, and that tends to slow down how quickly the AI solution can be delivered. If speed to market is still a factor, but business reasons require an open-source solution, bringing on additional resources, even establishing a second AI-focused team, can achieve both goals.
Sometimes speed to market is really about playing catch up to the competition, where the bar has already been set. This may be a time to look at how AI can be used differently to deliver a unique solution. In that case, quality or customization matters more than speed. Teams evaluating both open-source and commercial options may find it’s hard to compare these elements, but they are still good metrics to include in the big picture.
Choosing the Right AI Model: Comparing Costs & Resources
Historically, when companies have chosen the out-of-the-box solution, they have been looking to skip some of the upfront costs. For startups working with generative AI for the first time, it may make sense to go with the commercial platforms so they do not need as many employees to build out a large language model from scratch. But the costs don’t stop there. Using the big models from Google or AWS could have drawbacks when it’s time to grow. The models are priced per thousand input and/or output tokens. When a business grows its user base, the number of queries sent to the model grows, and the related fees grow with it.
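To make the per-token pricing point concrete, here is a minimal sketch of how monthly fees scale with the user base. Every number in it (prices per thousand tokens, queries per user, tokens per query) is a hypothetical placeholder, not a quote from any provider, and the function name is our own.

```python
def monthly_api_cost(users, queries_per_user, input_tokens, output_tokens,
                     price_in_per_1k, price_out_per_1k):
    """Estimate monthly spend for a model billed per 1,000 tokens.

    All arguments are illustrative assumptions supplied by the caller.
    """
    queries = users * queries_per_user
    # Input and output tokens are often priced separately.
    cost_per_query = ((input_tokens / 1000) * price_in_per_1k
                      + (output_tokens / 1000) * price_out_per_1k)
    return queries * cost_per_query


# Hypothetical prices and usage; replace with your provider's real rates.
for users in (1_000, 10_000, 100_000):
    cost = monthly_api_cost(users, queries_per_user=30,
                            input_tokens=500, output_tokens=250,
                            price_in_per_1k=0.002, price_out_per_1k=0.004)
    print(f"{users:>7,} users -> ${cost:,.2f} per month")
```

Because the fee is linear in query volume, ten times the users means roughly ten times the bill, which is why a price that looks negligible in a prototype deserves a second look before scaling.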
Even if a company has the resources to build a model, training the model comes with a separate cost. CNBC reported earlier this year that analysts and technologists estimated the cost of training GPT-3 at $4 million, and that a comparable training run for LLaMA, based on dedicated prices from AWS, is estimated at $2.4 million. Going with an open-source AI solution therefore carries the potential downside of greater upfront costs. Maintaining the tech stack for a custom large language model also requires MLOps and the related infrastructure. You can get additional insights on Generative AI development readiness from this discussion that Managing Director Qiuyan Xu participated in with other startup leaders and AI tech providers.
Finally, when assessing overall cost, we also have to include the team and skills. For businesses starting out with fewer resources, choosing a closed-source option is a good fit. Since the majority of the heavy data engineering and data science work is already done, the team can focus on development and implementation. The other challenge for companies that want to use open source is skillset. Developing a custom LLM requires a much higher level of data science expertise, and operating a full team with specialized skills will have a much higher FTE cost.
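One way to weigh usage-based fees against the fixed cost of a team and infrastructure is a simple break-even estimate. The sketch below is a rough illustration under stated assumptions: the fixed monthly cost, the per-query costs, and the function name are all hypothetical, and it ignores one-time training costs.

```python
def break_even_queries(fixed_monthly_cost, api_cost_per_query,
                       self_hosted_cost_per_query=0.0):
    """Monthly query volume at which self-hosting costs the same as an API.

    Returns None when self-hosting never catches up, i.e. its per-query
    cost is not lower than the API's.
    """
    saving_per_query = api_cost_per_query - self_hosted_cost_per_query
    if saving_per_query <= 0:
        return None
    return fixed_monthly_cost / saving_per_query


# Hypothetical figures: team plus infrastructure at $40,000 per month,
# API at $0.002 per query, self-hosted inference at $0.0005 per query.
volume = break_even_queries(40_000, 0.002, 0.0005)
print(f"Break-even at about {volume:,.0f} queries per month")
```

Below the break-even volume the commercial option is cheaper month to month; above it, the fixed cost of the in-house team starts to pay for itself.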
Cost is a huge factor in every business decision, and when investing in AI development, the key is to have a roadmap and strategy to guide your short-term and long-term trade-offs. Many short-term goals are about getting to market quickly, but beyond that, the startup world is increasingly working towards healthy, profitable business models, rather than pursuing growth without considering cost up front.
Choosing the Right AI Model: When AI Assets & Intellectual Property are Priority
Since commercial AI solutions are closed-source, there is very little that can be done to customize them. Without customization, there’s not much room for creating a unique asset that could become part of a company’s IP.
For companies in a burgeoning field, owning the original assets and IP is a key consideration. For startups that want the flexibility to develop and customize models for a highly specialized domain, using open-source options is the way to go.
Reputation: The X-Factor When Choosing Open-Source vs Closed-Source Models
A brand’s reputation is an additional X-factor for some businesses to consider during their evaluation of the right AI model. Accountability in the data as part of the customer experience puts a new requirement on brands. If a company can analyze and adjust models as customer input comes in, that can build trust and positively impact the brand.
Currently, we are finding that commercial AI models achieve greater accuracy for certain tasks and can handle more complex computations. However, the lack of transparency from black-box options brings additional risk. Bias and discrimination are inherent in models built by humans. There’s a whole body of research focused on ways to reduce bias and improve diversity across inputs and datasets. For certain industries, such as healthcare or governance and security, it’s more imperative to be able to analyze and explain how decisions are made within the model. This is where building on top of an open-source platform can be a significant advantage.
Building with closed-source AI models can raise another ethical dilemma for the business, especially in industries where humans are directly impacted. If the system is making decisions that depend on a sense of justice, as is possible in criminal justice or law enforcement, the outcomes can have significant ethical implications. The same could be true for AI-powered solutions in the employment, human resources or human behavior industries.
A company building up its own models with open source AI options still faces these same risks of bias and discrimination. Companies taking on these responsibilities internally point to data governance and transparency with customers as key pillars for success.
How to Feel Prepared to Choose the Right Generative AI Model for Your Startup
Still looking for other inputs to help you make a decision? There’s a great talk hosted by Stanford University where Ilya Sutskever shares his thoughts about shifting OpenAI to closed source.
You can also read this recap of a few examples of successful generative AI implementations and download this Generative AI Quick Reference Guide to keep learning.