Business opportunities under the "emergence" of large models: who will win the "Hundred Models War"

2023-07-12 02:28:31

Source: The Paper

Author: The Paper reporter Huan Yanhong intern Chen Xiaorui

Image source: Generated by Unbounded AI tool

Generative AI is reshaping the artificial intelligence industry. In just half a year, China's large models have made the leap from technology to product and then to business, and have begun to penetrate vertical industries. During the World Artificial Intelligence Conference on July 6, The Paper invited Wang Danjun, vice president of Beisen Talent Management Research Institute, Yu Zhonghai, deputy general manager of the research department of CICC, and Zhang Jie, vice president of Zhongguancun Kejin Technology, to its Metaverse chat room to discuss business innovation opportunities under the "emergence" of large models. The dialogue was hosted by Shao Wen, a reporter from The Paper. During the one-hour dialogue, the guests discussed a number of related issues, including:

How do general-purpose large models and vertical field large-scale models find their own value in future business scenarios?

How can enterprises seize the business opportunities of large models? What is the key point to seize the opportunity?

The past and future of large models

**The Paper Technology: The domestic market is now in a "hundred-model war". Why do you think domestic large models are developing so fast? What impact does this have on your field?**

**Wang Danjun (Vice President of Beisen Talent Management Research Institute):** From a technical point of view, large models did not appear only in the past two years. Academic research on them started many years ago. It is just that with the growth in data volume and the improvement in technology in recent years, and especially after OpenAI released ChatGPT, they have attracted widespread attention.

In every field, people are thinking about how to apply this new technology to their own domain and how to improve their business model.

For example, our Beisen Talent Management Research Institute belongs to the human resources industry, and the customers we serve are the HR (human resources) teams of enterprises. We think that applying large models to human resources may produce disruptive value in the future. It can improve everyone's work efficiency, and it can help our partners, whether HR staff or employees, to better guide employee development. It can also help companies answer many human resource management questions, including those we often ask when recruiting: who is more suitable, who is likely to deliver better performance, and how salary should be determined. I think the development and application of large models will transform the human resources industry in a very short period of time.

**Yu Zhonghai (Deputy General Manager of Research Department of CICC):** Dean Wang mentioned a very important point: large-model technology did not appear overnight. In 2017, Google published a famous paper called "Attention Is All You Need", which proposed the Transformer architecture.

In fact, practitioners in China's artificial intelligence industry discovered the potential of large-model technology very early on. We wrote a related research report in 2021, and at that time we were already struck by how far AI models had developed. Now ChatGPT, as a concrete product, has made large models known to ordinary people. Whether or not you understand the technology, you can feel its appeal as soon as you try it, and that is why large models have become popular.

The fact that large models could be applied across so many industries so quickly also depends on the democratization of technology, which is a very important concept we have put forward. Thanks to the development of the open source community and the natural spread of knowledge, many companies have quickly built their own large models.

In addition, I would like to add that this time the entire technology circle and even the whole of society have formed a joint force, a shared sense that "this is very important", which has rarely been seen in the past.

**Zhang Jie (Vice President of Zhongguancun Kejin Technology):** I think the first year of the large model can be traced back to the launch of GPT-3 in 2020. Why has it developed so fast in the past six months? I think there are three reasons:

The first is that some companies and universities, such as Baidu, Huawei, and Tsinghua University, had already accumulated a great deal. They had pre-trained language models or large models before, so the last six months have been a version iteration for them;

The second is that enterprises of all kinds, including some investment institutions, have increased their investment;

The third is open source. The open source community has driven enterprises to launch a variety of large domain models.

As for the impact of large models on our field, the benefit runs both ways, for customers and for us. Zhongguancun Kejin is a leading provider of conversational AI technology solutions. Large models reduce our cost of customizing and developing models for different scenarios, which is good news for us. For our customers, the lower cost of intelligent projects has led many small and medium-sized enterprises to consider launching such projects, and the entire market has grown in size.

**The Paper Technology: Recently, Tencent Cloud and OpenAI have each proposed a large-model store for customized model software, somewhat similar to an app store. How do you understand this trend?**

**Wang Danjun:** Whether you are an investor or a practitioner in this field, I believe everyone expects to see this trend, because it encourages more people to take part in co-creation. We believe that basic service providers will definitely move toward the store model in the future, because it brings them closer to customers. The large-model store also lets the users receiving these services make better use of large models to create new application scenarios and service value.

**Yu Zhonghai:** This is actually an issue that the capital market is very concerned about. I think the development of large models will come to resemble China's public cloud and private cloud markets.

The public cloud was called IaaS (Infrastructure as a Service) at the time. For large models, we call it MaaS (Model as a Service). There is a well-known overseas company called Hugging Face, which puts many models on the Internet, and you can use different AI models just by calling its API (application programming interface).

However, there are also many enterprises that, because of requirements such as security compliance and implementation in narrow scenarios, need to deploy their own large model and fine-tune it with their own knowledge. This may be like the private cloud market in China.

Broadly speaking, large models are actually not suitable for everyone to use directly, including ChatGPT, which is actually packaged into a product based on the large model, so that every ordinary person can experience it. Therefore, making applications based on large models is actually a very broad market, which can generate many business models, such as digital employees, and the concept of app store just mentioned by the host.

In fact, many people also regard ChatGPT as a traffic portal. As far as I know, some large mobile phone manufacturers may also end up doing search. In that case the picture is completely different: the addressable market shifts from enterprise IT spend to marketing spend. Chinese enterprises may spend an average of 1%-3% of revenue on IT, but may spend 30% of revenue on marketing. So once you have traffic, the business model can change a lot, which is very interesting.

**Zhang Jie:** I think OpenAI and Tencent Cloud have different emphases. OpenAI's focus may still be on the basic large model. It relies on ecosystem partners to make up for the basic model's lack of accuracy or professionalism in fragmented scenarios. Tencent Cloud, by contrast, does not overemphasize its own basic model, but stresses building large models for vertical industries.

Their different emphases point to a common problem: the basic large model cannot be used directly in specific industries or scenarios that demand high professionalism, and it needs an adaptation layer. I think there may be a new business model here, namely that more and more knowledge can be turned into assets in the future. Previously, everyone who used AI followed a different technical paradigm, so the only assets that could be shared were raw data. After obtaining the raw data, each party processed it with different technical means, and the intermediate forms of the data could not be traded.

However, the large model basically unifies many AI tasks, and the entire research paradigm and processing flow converge. This makes many intermediate products tradeable, such as labeled data, instruction-alignment data, prompts, prompt chains composed of prompts, and even robots or assistants trained for specific scenarios. All of these can be traded and turned into assets. This may be a new business model in the future.

General large model VS vertical field large model

**The Paper Technology: Just now, the three of you mentioned two different business paths, the domain large model and the general large model. Can you talk in detail about the opportunities and challenges of these two business models?**

**Wang Danjun:** In the future, business around large models will definitely become a business ecosystem, and the core feature of that ecosystem is that it will be layered. First of all, some vendors will provide basic services, because large models require a data foundation and computing power. The large model serves as the underlying engine. Although it has strong capabilities, it does not understand the knowledge of specific domains, service models, or customer needs.

Along with that come large models for vertical domains. Some of these are the domain vendors' own large models, and some are trained on top of the basic large models. In layman's terms, the goal is to turn these models into experts. For example, in the financial field there may be services dedicated to helping investors make business decisions, and in the human resources field there may be services dedicated to helping companies make talent decisions.

There may be many subdivided fields. These subdivisions may also require industry accumulation, training, and understanding of how to solve customer problems, and even get involved in the user experience level of the interactive interface. These accumulations are actually some of the capabilities behind the app store we talked about earlier.

**Yu Zhonghai:** From our point of view, these two types of markets are actually quite different.

First of all, it is also important to build a good data flywheel for the general large model.

Take an example. Suppose I am teaching you table tennis and tell you how to hit the ball. I serve, you miss, and you know that was not the right way to hit it. You try another way, this time you return the ball, and you remember the correct way. I keep serving and you keep returning, adjusting your muscle memory little by little until you find the best way. Training a model is a similar process. Every time you talk to ChatGPT, there is a like or dislike button under the answer. Sometimes you say nothing and, disappointed, simply regenerate the answer. Or, every time you use Midjourney to draw, it shows four images; if you pick the third one, enlarge it, and download it, it knows that this one seems better, leans a little more in that direction in the future, and the model learns a little. Without data feedback, the model stays where it is and cannot improve. This is the data flywheel we emphasize, and it is why staking out a position in usage scenarios matters. For a general-purpose large model, the most important thing is that people use it, which is also an advantage for China: Chinese people like to embrace technology and are more accepting of large models, so there are a lot of people to give feedback.
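The feedback loop described here can be sketched in a few lines. This is only an illustration under stated assumptions, not how ChatGPT or Midjourney actually process feedback: the signal names (`like`, `select`, `regenerate`, `dislike`), their weights, and the response identifiers are all made up for the example.

```python
from collections import defaultdict

# Hypothetical weights for explicit and implicit feedback signals.
# The values are illustrative assumptions, not taken from any real product.
SIGNAL_WEIGHTS = {
    "like": 1.0,
    "select": 0.5,       # e.g. picking one of four generated images
    "regenerate": -0.5,  # the user silently asked for another answer
    "dislike": -1.0,
}


def aggregate_feedback(events):
    """Sum the weighted signals per response to get a rough preference score."""
    scores = defaultdict(float)
    for response_id, signal in events:
        scores[response_id] += SIGNAL_WEIGHTS[signal]
    return dict(scores)


def preference_pairs(scores):
    """Build (preferred, rejected) pairs that a later training round could learn from."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [
        (better, worse)
        for i, (better, better_score) in enumerate(ranked)
        for worse, worse_score in ranked[i + 1:]
        if better_score > worse_score
    ]


# Made-up usage events for one drawing session.
events = [
    ("image_3", "select"),
    ("image_3", "like"),
    ("image_1", "regenerate"),
    ("image_2", "dislike"),
]
scores = aggregate_feedback(events)
pairs = preference_pairs(scores)
print(scores)
print(pairs)
```

The point of the sketch is that every use of the product yields a small ranking signal; with no users there are no events, and the list of preference pairs stays empty.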

For large domain models, I think what is more important is know-how (the technical know-how and expertise required to engage in a certain industry or do a certain job).

The key to the large domain model is to meet the needs of customers and understand the know-how of customers in a certain deep domain. Enterprises like Zhongguancun Kejin have their advantages here. They understand the customer's know-how and understand the customer's pain points in specific field scenarios.

**Zhang Jie:** Thank you for your recognition. As I see it, the general large model and the domain large model are in an upstream and downstream relationship, not a competitive one. The general model is large and comprehensive, while the domain model is a small but beautiful business.

The general large model is characterized by "three highs": high investment, high return, and high uncertainty. Training a basic large model places high demands on algorithms, data, and talent, and also on computing power, which must be at least at the 10,000-card level.

Why is the uncertainty so high? The shortlisted players are basically big Internet companies or technology giants, and the technical bar is very high if the trained model is to have general and emergent capabilities. To gain a foothold in the market, a model must at least outperform the open-source models in order to survive. In the future, I think several basic large models will coexist, but there may not be too many.

The domain large model may be a small but beautiful track. Training a large domain model does not require very high computing power. In our experiments, it takes only one week to iterate a large domain model, and two cards are enough for real-time inference. On the data side, we still need to build some barriers of our own, which is the data flywheel effect that Mr. Yu just mentioned.

The most important point of competition is talent. The large domain model has flattened the technology chain of the entire industry. Your talents must not only understand algorithms, but also know how to engineer algorithms well, understand scenarios, understand business, and even understand product design. It is necessary to stimulate the compound ability of talents, and then stimulate the initiative and creativity of talents, which is a test of organizational ability.

In which field is the large model most useful?

**The Paper Technology: What are the three most promising vertical application directions?**

**Yu Zhonghai:** We have looked at many directions, and the most promising one now is AI Answer, a term we coined ourselves to refer to a category of products. For example, suppose you ask me which brand of stroller is better to buy. I believe everyone has been in a similar situation: you ask three friends, read five Xiaohongshu articles, and search ten posts on the Internet. After reading them all, you narrow it down to three models, look at the prices, and make a decision.

The same is true of many questions you put to artificial intelligence. A large model has a certain memory and knows that these things are related, but it hallucinates. So what companies often do is run queries against an internal knowledge base and then summarize the results. There is also question answering, summarization, or comparison based on a document, such as an insurance policy with hundreds of pages of clauses, or answers based on a search. We collectively refer to these products as AI Answer. In addition, we also pay attention to opportunities such as office work and AI learning machines.

**Zhang Jie:** I think we can look at it by industry first, and then by scenario. The ideal industry for large domain models is one with high knowledge density, strong process logic, and sensitivity to security and privacy. For example, finance, medical care, law, and government affairs may be ideal industries for deploying large domain models in the future.

In terms of scenarios, the starting point may be empowering internal employees, for example with knowledge Q&A, followed by training and sparring. After empowering internal employees, the next step is to-C (customer-facing) services, or empowering sales and customer service staff with marketing and customer service assistants. In short, when applying large models, you can start with scenarios that have relatively high fault tolerance.

**Wang Danjun:** We also believe that, in everyone's experience so far, large models solve the most problems in knowledge-intensive industries. In the To B field, where Beisen and Zhongguancun Kejin operate, some companies are indeed more likely to benefit. For example, in an enterprise's human resource management, everyone hopes that large models can take over tedious, repetitive tasks, such as answering employees' common questions. Beisen and Zhongguancun Kejin are also discussing cooperation in this area, hoping to build digital employees on top of large-model infrastructure and have them answer the questions that do not seem to require human experience and knowledge.

In addition, we have found some high-value areas, such as corporate leadership coaching. An enterprise has managers at multiple levels, and each of them faces various challenges every day but has few opportunities to ask others for advice. We think AIGC can provide very good help in this scenario, and it can also meet personal requirements for privacy.

**The Paper Technology: Microsoft recently announced cooperation with OpenAI to open up the application ecosystem. Zhongguancun Kejin is a representative of large-model application vendors in China. How is the rollout of your products and applications progressing? Will every company have its own exclusive model in the future?**

**Zhang Jie:** I think not every company needs its own exclusive model, because large-model technology can be divided into three layers from shallow to deep:

On the first layer, if an industry or enterprise does not have that much knowledge or documentation, or the enterprise is not particularly specialized, there is no need for a dedicated model, and a general large model can be used directly. On top of the general large model, it is enough to tune the prompts to the requirements of the specific scenario. At this stage, what enterprises need is a prompt management tool with a good user experience and a low barrier to entry. Business experts may need to accumulate useful prompts for specific scenarios. For more complicated cases, prompts are grouped into small sets, the sets are linked into chains, and the chains follow a sequential logic, so a prompt chain management tool is enough.

The second layer is a little more complicated. Some enterprises have a lot of fragmented knowledge. At this time, prompt words or prompt chains may not be enough. It is necessary to add a knowledge base on top of the general large model to manage low-frequency and long-tail knowledge. When the general large model faces specific problems, go to the knowledge base to find answers. This is the technical route of general large model + knowledge base + prompt engineering.

On the third layer, enterprises with a large amount of domain knowledge need to build their own exclusive large domain model and feed the general knowledge of the domain into the brain of the large model. The model then keeps its general common sense and reasoning ability while also gaining the domain's professional knowledge and skills, turning it from a liberal arts student into a domain expert. This is the route of large domain model + domain knowledge base + prompt engineering.
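The first two layers can be illustrated with a small sketch. Everything in it is an assumption made for illustration: `fake_model` is a stand-in that only echoes its prompt (no real model API is called), the keyword lookup is a deliberately naive substitute for a real knowledge-base search, and the knowledge-base entry is invented. The third layer, training a domain model, is not shown.

```python
from typing import Callable, Dict, List


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a large-model call; it only echoes the prompt."""
    return f"[model output for: {prompt}]"


def run_prompt_chain(templates: List[str], user_input: str,
                     model: Callable[[str], str]) -> str:
    """Layer 1: run prompts in sequence, feeding each output into the next template."""
    text = user_input
    for template in templates:
        text = model(template.format(input=text))
    return text


def answer_with_knowledge_base(question: str, knowledge_base: Dict[str, str],
                               model: Callable[[str], str]) -> str:
    """Layer 2: look up long-tail knowledge first, then place it in the prompt."""
    words = set(question.lower().split())
    # Naive keyword match; a real system would use a proper search index.
    facts = [text for keyword, text in knowledge_base.items() if keyword in words]
    context = "\n".join(facts) if facts else "(no matching entry)"
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return model(prompt)


# Layer 1: a two-step prompt chain (templates are illustrative).
chain = ["Summarize the request: {input}", "Draft a reply to: {input}"]
reply = run_prompt_chain(chain, "Customer asks about refund rules", fake_model)

# Layer 2: the same model, grounded in a made-up knowledge-base entry.
kb = {"leave": "Employees receive 10 days of annual leave."}
grounded = answer_with_knowledge_base("How much leave do I get", kb, fake_model)

print(reply)
print(grounded)
```

In both cases the model itself is untouched; only the prompts and the material placed in them change, which is why these layers need tooling rather than training.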

Zhongguancun Kejin has application practice in all three of the situations above. For some time we have been working on three technical difficulties, and after solving them we built two sets of assistant applications. What are the three technical difficulties?

The first is the hallucination problem that often occurs in large models, that is, confidently talking nonsense. Solving it requires a domain knowledge base, so that when the large model is asked a question it looks up the relevant knowledge there. Building a knowledge base often means processing documents in rich text formats, so the large model needs to understand the document layout first, and then the content inside.

The second is preventing forgetting when training large domain models. Otherwise, when you feed new knowledge into the model's brain, it forgets the old knowledge. Efficient fine-tuning techniques are used here, so that the model keeps its original general knowledge while also learning the common sense of the domain.

The third is how to make users use it more conveniently and at low cost. Whether it is a large domain model, a domain knowledge base, or prompt words, it cannot be accomplished overnight, and requires continuous feedback and iteration. This requires customers to have maintenance capabilities, and product design and engineering packaging must be done well at the tool level.

**The Paper Technology: I would like to ask Mr. Yu, from an investment perspective: compared with the investment logic of the mobile Internet and hard technology eras, what are the core criteria for selecting projects and the key points in investment decisions in this round of the large-model boom?**

**Yu Zhonghai:** This is also a question that capital has been exploring. Over the past six months, the large-model market has gone through several rounds of evolution. At the beginning, many people treated it as a thematic investment: AI-related companies were hyped, and companies with real underlying large-model capabilities were highly valued. Later, the "Hundred Models War" appeared, and people felt the large-model landscape seemed a bit too complicated. They found that computing power looked like the beneficiary, the one selling shovels to the gold diggers, so the valuations of computing-power companies rose a lot. By May and June of this year, we proposed that "AI applications are entering their Cambrian explosion". Many companies have launched their own products, and AI applications have started to rise again.

The investment rhythm for large models is also gradual. At first, any AI company could rise in value. Then companies came out with applications, and people began to examine whether those applications had paying customers and real orders. I think that in another year or so, people will ask whether an application has barriers and whether the business is sustainable in the long run. What we are most optimistic about is the application layer. We attach great importance to enterprises that can truly connect large models with enterprise applications, because they can truly create value.

**The Paper Technology: The next question is for Dean Wang. From the perspective of talent, large models are now building a new production relationship of human-machine collaboration. What impact will this have on talent management in enterprises?**

**Wang Danjun:** This is a particularly good question. At the social level, everyone is very keen to discuss "Will my work be replaced by a large model in the future?" I suggest looking at this question from two sides:

On the one hand, what everyone discussed more is that the ability of large-scale models is getting stronger and stronger, and the application of vertical fields is getting deeper and deeper, which may replace the work of many general employees in the future. In recent years, even without the background of large-scale model development, everyone is very concerned about reducing costs and increasing efficiency, which involves the proportion of labor costs in enterprises. If some new technologies can be used to reduce labor costs, it will be very valuable to the enterprise, which is an important reason for its sustainable development.

On the other hand, the emergence of large models and the emergence of more vertical field tools in the future will be a blessing for employees. On the positive side, if employees can make better use of human-machine collaboration, it can create more value for the company.

We predict that there will be some changes in the organizational form in the future. Most companies now have a functional structure. Different functions serve the same customers. There are many problems of collaboration and communication between functions, which is a great consumption for the company. But in the future, when individual employees or groups have all-round capabilities with the help of tools, the organizational unit will inevitably be smaller, and it will be more agile, flexible, and responsive.

**The Paper Technology: I would like to ask the three of you what capabilities companies need most in the context of the Hundred Models War.**

**Zhang Jie:** For basic large-model vendors, what they need most is particularly strong and unique technical capability: they must outperform open-source large models and also have some unique technology in multimodality or complex reasoning. For domain large-model vendors, compound talent is most needed, and organizational capability is very important, because the company must bring out its people's initiative and creativity, connect them with cutting-edge business opportunities, and let them know what customers need. In addition, there must be engineering and algorithmic capabilities.

**Yu Zhonghai:** I think data is very important, especially for companies that want to use large models. Data will soon become a barrier. For companies like Zhongguancun Kejin that provide large-model applications, the greatest advantage lies in data.

**Wang Danjun:** Actually, the vast majority of enterprises are more likely to be consumers of large models in the future than creators of them. Actively embracing large models may greatly accelerate their digital management process. Both organizations and individuals may need to embrace change. Leaders at all levels need to become more aware of this new opportunity, and employees need to maintain continuous, self-driven learning.

**The Paper Technology: In this process, enterprises in vertical fields have the advantage of industry know-how. Is it easier for them to break through the data bottlenecks of large models?**

**Zhang Jie:** I think industry know-how needs to be looked at in detail, and there may be several forms that require special attention.

The most obvious form of industry know-how is data in documents, which is a symbolic expression. How to extract prompts or prompt chains from documents is a type of problem that needs to be dealt with at the current stage. Many companies have a large number of documents, many of them in PPT or PDF format, and both their format and their content need to be understood.

Beyond this type of know-how, there is also the knowledge hidden in the minds of industry experts or scenario experts. They need a tool that lets them sum it up in words as templates. Such templates are industry know-how at a higher density.

There is also a relatively hidden type of industry know-how, which is the collective wisdom buried in system logs. By mining these logs, scattered knowledge can be turned into structured, chain-like knowledge. Then thousands of chains are summarized into the best practice for a specific scenario. For example, salespeople make a lot of phone calls every day, and millions of conversation logs accumulate daily. By sorting them according to outcomes such as a completed order, a lost order, or a hang-up, the best-practice sales scripts for marketing new products can be distilled.

These three types are all very good industry know-how. There are also some technical challenges here, such as how to extract prompts and prompt chains from documents and logs. doc2 and log2 are the focus of research worth investing in.
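The log-mining idea, sorting conversation logs by outcome and distilling the most common successful sequence, can be sketched as follows. This is a toy illustration under stated assumptions: the log format, the step names, and the outcome labels are hypothetical, and real conversation logs would first need to be segmented into steps.

```python
from collections import Counter


def best_practice_chain(call_logs, success_outcome="order_completed", top_n=1):
    """Count the step sequences of successful calls and return the most common ones."""
    chains = Counter(
        tuple(log["steps"]) for log in call_logs if log["outcome"] == success_outcome
    )
    return chains.most_common(top_n)


# Made-up call logs: each one is a sequence of steps plus the call's outcome.
call_logs = [
    {"steps": ["greet", "ask_need", "offer_discount"], "outcome": "order_completed"},
    {"steps": ["greet", "pitch_product"], "outcome": "hung_up"},
    {"steps": ["greet", "ask_need", "offer_discount"], "outcome": "order_completed"},
    {"steps": ["greet", "pitch_product", "offer_discount"], "outcome": "order_completed"},
]

top = best_practice_chain(call_logs)
print(top)
```

Here the sequence that most often ends in a completed order surfaces as the candidate best practice; with millions of logs the same counting would simply run over far more chains.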

**Yu Zhonghai:** Mr. Zhang Jie once said something that impressed me deeply: recording the orbits of the eight planets in the solar system for tens of millions of years is not as good as one abstract law of gravity, which better explains the nature of things.

I think this is know-how. Companies in vertical fields have a lot of experience in their fields. Let me give you an example. After artificial intelligence became popular, everyone talked about a new profession, the prompt engineer, with an annual salary of one million.

These are the engineers who write prompts. So why are they so valuable? I use AI painting tools myself. Once I wanted to draw a man in a suit, but I tried many keywords and none of them worked. Later, I typed "tie", and the person immediately put on a suit. You will find that this is the magic word.

The large model is a bit like a probability game. It holds many prior probabilities, and in the scenario just now, "tie" is the word that corresponds strongly to "suit". The engineer knows which keywords can steer the model, and he knows what kind of output we want. This is his value. If we carry this over to application scenarios, it is exactly the value of companies in vertical fields: they understand both artificial intelligence and industry know-how.

**Wang Danjun:** At present, it seems that after the basic large-model companies, vertical-field companies are the second group that the market will watch closely. I think the next period will be very challenging for them. Their experience may not translate directly into new creativity simply because a new technology is available. They need to spend more time learning first: understanding the underlying logic of how large models work, how the models can help them, how the models can help mine their knowledge, and how the models can help them serve customers. I think they will face this challenge first for some time to come. Only on that foundation can they find new opportunities to create value for customers with these models.

**The Paper Technology: There have been some new trends recently, such as the slowdown in the number of ChatGPT visitors. It has not actually become a super portal. There is a view that ChatGPT cannot represent the future of GPT, but is just a product presented to the market by OpenAI and Microsoft. What do the three of you think about this view?**

**Yu Zhonghai:** This point of view is indeed quite interesting. Personally, I am relatively optimistic, because I often use large-model products myself. I would also like to add that the current ChatGPT is not its final form. I think Google Assistant in Google's next-generation Android system will be supported by Google's own large model, because this is also a very important traffic portal.

**Wang Danjun:** Many people in the industry are very optimistic. For a new product at this stage of development, some differentiation is natural. End users are interested in such tools, willing to try them, and will continue to use them. Even if frequency of use and attention are declining, user behavior, once formed, is hard to change.

**Zhang Jie:** I think the slowdown in ChatGPT's growth was bound to happen sooner or later. After all, traffic has a ceiling: sooner or later it peaks, or the public's enthusiasm for chat scenarios declines. Chatting is not a high-frequency, must-have need, but it has given us particularly good inspiration.

On the one hand, there are many applications around Chat, such as generating pictures, videos, reports, PPT, etc. through chat, such as ChatDoc, ChatPdf, and ChatBI. Another scenario is the large domain model just mentioned, such as financial GPT, legal GPT, and medical GPT. These may all be some new business entrances in the future.
