TECHnalysis Research Blog

November 7, 2023
The Rapidly Evolving State of Generative AI

By Bob O'Donnell

As someone who’s researched, written about, and closely tracked the evolution of generative AI (GenAI) and how it’s being deployed in real-world business environments, it never ceases to amaze me how quickly the landscape around us is changing. Ideas and concepts that seemed years away just a few months ago—such as the ability to run foundation models directly on client devices—are already here. At the same time, some of our early expectations around how the technology might evolve and be deployed are shifting as well—and the implications could be big.

In the case of basic technological development and deployment of GenAI, for example, there’s been a growing recognition that the two-step process of model training and model inferencing isn’t happening in the way we were led to believe. In particular, it turns out that only a handful of companies are building their own foundation models and training them from scratch. Instead, the vast majority of work being done is the customization of existing models.

While some might argue that the difference between training and customizing things like large language models (LLMs) is one of semantics, in truth the distinction has much bigger implications. For one, this trend highlights the fact that only the largest companies have the resources and money not only to build these models from scratch but also to maintain and evolve them. It is companies like Microsoft, Google, Amazon, Meta, IBM, and Salesforce, along with the companies they’re choosing to invest in and partner with, such as OpenAI and Anthropic, that are doing the majority of the model creation work. Sure, there are plenty of startups and other smaller companies that are toiling away at creating their own foundation models, but there are increasing questions about how viable those types of business models are in the long run. In other words, the market is increasingly looking like yet another case of big tech companies getting bigger.

The reasons for this go beyond the typical factors of skill set availability, experience with the technology, and trust in big brand names. Indeed, because of the extensive reach and influence that GenAI tools are already starting to have (and which are predicted to expand even further), there are increasing concerns about legal issues and related factors. To put it simply, if large organizations are going to start depending on a tool that will likely have a profound impact on their business, they need to know that there’s a big company behind that tool that they can place the blame on in case something goes wrong. This is very different from many other new technology products that were often brought into organizations via startups and other small companies. The reach that GenAI is expected to have is simply too deep into an organization to be entrusted to anyone but a large, well-established tech company.

And yet, despite this concern, one of the other surprising developments in the world of GenAI has been the rapid adoption and usage of open-source models from places like Hugging Face. Both tech suppliers and businesses are partnering with Hugging Face at an incredibly rapid pace because of the speed at which new innovations are being introduced into the open models that they house.

So, how does one reconcile these seemingly incongruous, incompatible developments? It turns out that many of the models on Hugging Face are not entirely new ones but instead are customizations of existing models. So, for example, you can find models that use something like Meta’s open-source and increasingly popular Llama 2 model as a baseline but are then adapted to a particular use case. As a result, businesses can feel comfortable using something that stems from a large tech company but offers the unique value that other open-source developers have added. It’s one of the many examples of the unique opportunities and benefits enabled by separating the “engine” from the application, which GenAI is allowing developers to do.

From a market perspective, this means that the largest tech organizations will likely battle it out to produce the best “engines” for GenAI, but other companies and open-source developers can then leverage those engines for their own work. The implications of this, in turn, are likely to be large when it comes to things like pricing, packaging, licensing, business models, and the money-making side of GenAI. At this early stage, it’s unclear exactly what those implications will be. One likely development, however, is the separation of these core foundation model engines and the applications or model customizations that sit on top of them when it comes to creating products—certainly something worth watching.

Interestingly, this separation of models from applications might also impact how foundation models run directly on devices. One of the challenges of this exercise is that foundation models require a great deal of memory to function efficiently. Also, many people believe that client devices are going to need to run multiple foundation models simultaneously in order to perform all the various tasks that GenAI is expected to enable. The problem is, while PC and smartphone memory specs have certainly been on the rise over the last few years, it’s still going to be challenging to load multiple foundation models into memory at the same time on a client device. One possible solution is to select a single foundation model that ends up powering multiple independent applications. If this proves to be the case, it raises interesting questions about partnerships between device makers and foundation model suppliers and the ability to differentiate amongst them.
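To make the memory constraint concrete, here is a rough back-of-the-envelope sketch. It assumes a 7-billion-parameter model (the size of the smallest Llama 2 variant) and counts only the memory needed to hold the weights; the function names, the 16 GB device, and the 4 GB reserved for the operating system and other applications are illustrative assumptions, not measurements. Real usage would be higher once runtime working memory is included.

```python
def model_memory_gb(params_billions, bits_per_weight):
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def models_that_fit(device_ram_gb, params_billions, bits_per_weight, reserved_gb=4.0):
    """How many copies of such a model fit in RAM at once.

    reserved_gb is an assumed allowance for the OS and other apps.
    """
    per_model = model_memory_gb(params_billions, bits_per_weight)
    available = max(device_ram_gb - reserved_gb, 0.0)
    return int(available // per_model)


# Hypothetical 16 GB client device, 7B-parameter model.
fp16_size = model_memory_gb(7, 16)      # 16-bit weights
int4_size = model_memory_gb(7, 4)       # 4-bit quantized weights
fp16_count = models_that_fit(16, 7, 16)
int4_count = models_that_fit(16, 7, 4)

print(f"16-bit: {fp16_size:.1f} GB per model, {fp16_count} fit")
print(f"4-bit:  {int4_size:.1f} GB per model, {int4_count} fit")
```

Under these assumptions, a single 16-bit copy of the model already exceeds the available memory, and even aggressive 4-bit quantization leaves room for only a few models at once, which is why sharing one foundation model across several applications looks attractive.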

In addition to shifts in model training, there have been some intriguing developments in the world of inference. In particular, rapidly growing technologies like RAG (Retrieval-Augmented Generation) provide a powerful way to customize models by leveraging an organization’s own data. Basically, RAG provides a mechanism to perform a typical query to an LLM, but the answer is generated from an organization’s own cache of original content. Putting it another way, RAG leverages the learned skills of a fully trained model in terms of what rules it uses to select the content. It then builds its response by combining its own logic and basic language understanding with the unique material of the organization running the tool.

The beauty of this approach is twofold. First, it offers a significantly easier and less resource-intensive way of customizing a model. Second, it simultaneously reduces the potential for hallucinations and other content problems by generating its response from the custom data set only and not the much wider set of content used to first build and train the model. As a result, the RAG approach is being quickly adopted by many organizations and looks to be a key enabler for future developments. What’s also interesting about it is that it changes the nature of how inferencing is done and shifts the focus of where the computing resources are required from the cloud to the data center and/or client devices.
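The RAG flow described above can be sketched in a few lines. This is a deliberately simplified illustration, not how any particular product implements it: retrieval here is a plain word-overlap (cosine) score rather than the learned embeddings real systems typically use, the sample documents are invented, and the final step only assembles the prompt that would be sent to an LLM rather than calling one.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True
    )
    return ranked[:k]


def build_prompt(query, passages):
    """Combine the retrieved passages and the question into one LLM prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


# Invented stand-in for an organization's own content.
DOCS = [
    "Expense reports are due on the fifth business day of each month.",
    "The VPN client must be updated before connecting from home.",
    "Travel bookings over 2000 dollars need director approval.",
]

query = "When are expense reports due?"
top = retrieve(query, DOCS, k=1)
prompt = build_prompt(query, top)
print(prompt)
```

The model itself is untouched here. Only the prompt changes, which is why this kind of customization is so much lighter than retraining and why the answer stays anchored to the organization’s own material.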

Of course, given the rapid evolution of the GenAI world, it’s certainly possible that much of what I’ve argued here may be irrelevant or a moot point by the middle of next year. Still, it seems clear that important shifts are already occurring, and it’s going to be important for industry players to start shifting their messaging around those changes. Switching from the focus on training and inferencing of models to one that highlights model customization, for example, seems overdue based on the realities of today’s marketplace. Along similar lines, providing more information around technologies like RAG and their potential influence on the inferencing process also seems critical to help educate the market.

There’s no longer much doubt about the impact that GenAI is expected to make on businesses of all sizes. The path to reach that level of impact and the pace at which it will be achieved, however, are still far from defined. In that light, any efforts that the tech industry can make to better educate people about how GenAI is evolving, including through better, more refined messaging, are going to be extremely important. The process won’t be easy, but let’s hope more companies are willing to take on the challenge.

Here's a link to the original column:

Bob O’Donnell is the president and chief analyst of TECHnalysis Research, LLC, a market research firm that provides strategic consulting and market research services to the technology industry and professional financial community. You can follow him on LinkedIn at Bob O’Donnell or on Twitter @bobodtech.