Natural language processing (NLP) is a critical part of the digital transformation. NLP enables user-friendly interaction between humans and machines by allowing computers to understand human languages. The intelligent chatbot is an essential application of NLP: it interprets users’ utterances and responds in understandable sentences, simulating human-to-human conversation for problem solving or Q&A in specific applications. This research studies emerging technologies for NLP-enabled intelligent chatbot development using a systematic patent analytic approach. Several intelligent text-mining techniques are applied, including document term frequency analysis for key terminology extraction, clustering for identifying subdomains, and Latent Dirichlet Allocation (LDA) for finding the key topics of the patent set. The Derwent Innovation (DI) database is the main source for global intelligent chatbot patent retrieval.
Despite the global impact of COVID-19, almost 80% of global artificial intelligence (AI) projects have maintained or even increased their R&D investment since the beginning of the pandemic. AI-based systems are now widely adopted for decision making, which has a profound impact on individuals and society. These so-called intelligent systems are mostly driven by machine learning (ML) or deep learning (DL) algorithms, with their models trained and tested on big data [
The applications of intelligent chatbots have increased rapidly in recent years. Much research delves into the details of AI and DL algorithms for chatbot solutions and applications in pursuit of high efficiency and intelligence. Even though chatbot development seems to be booming, a thorough review of the chatbot development life cycle and its key technologies is still greatly needed. Furthermore, with the popularity of the Internet and social platforms, a digitally transformed environment in which smart chatbots serve as human-machine communication interfaces has become widespread. More and more applications offer “life” services through voice-interactive assistants; that is, smart chatbots, which hold regular conversations and provide online services interactively with users, are becoming a trend [
A chatbot is a computer program that allows computers to mimic human communication and conversation. At first, chatbots could only answer standard questions whose questions and answers were known and stored in the system. With technological advances, computers have gradually become able to answer free-form questions like a human, to the point of passing a Turing test, which brings them closer to human intelligence [
An NLP-enabled chatbot is a complex system. Once the front-end user inputs an utterance, the natural language understanding (NLU) module of the chatbot infers the user’s intent from the natural language expression. Next, the dialogue management module finds content that can answer the user’s request; in this process, different types of databases may be accessed. Finally, the natural language generation (NLG) module converts the collected content into a human-readable expression as the response to the user [
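The three-stage flow described above (NLU, dialogue management, NLG) can be illustrated with a deliberately small sketch. The intent names, keyword sets, and canned knowledge base are hypothetical and rule-based; they are assumptions for illustration only, and a production chatbot would typically use trained models at each stage.

```python
# Hypothetical intents, keywords, and answers, used only for illustration.
INTENT_KEYWORDS = {
    "opening_hours": {"open", "hours", "close"},
    "billing": {"bill", "invoice", "payment"},
}
KNOWLEDGE_BASE = {
    "opening_hours": {"open": "9:00", "close": "18:00"},
    "billing": {"contact": "billing desk"},
}


def understand(utterance):
    """NLU: pick the intent whose keywords overlap most with the utterance."""
    tokens = set(utterance.lower().replace("?", " ").split())
    scores = {name: len(tokens & kws) for name, kws in INTENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def manage(intent):
    """Dialogue management: look up content that can answer the request."""
    return intent, KNOWLEDGE_BASE.get(intent, {})


def generate(intent, content):
    """NLG: turn the retrieved content into a readable sentence."""
    if intent == "opening_hours":
        return f"We are open from {content['open']} to {content['close']}."
    if intent == "billing":
        return f"Please contact the {content['contact']}."
    return "Sorry, I did not understand that."


def respond(utterance):
    """Run the utterance through NLU, dialogue management, and NLG in turn."""
    intent, content = manage(understand(utterance))
    return generate(intent, content)
```

Calling `respond("When do you open?")` routes the utterance through all three stages; an utterance with no matching keyword falls through to the fallback sentence.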
NLP technology is an important branch of AI. It studies the use of computer software, such as machine learning (ML) programs, to process natural language intelligently. Basic NLP technology is developed mainly around seven levels of language: phonemes (language pronunciation patterns), morphology (how letters form words and how word forms change), vocabulary (the relationships between words), syntax (how words form sentences), semantics (the meaning of language expressions), pragmatics (semantic interpretation in different contexts), and discourse (how sentences are combined into paragraphs).
As AI drives the transformation of the digital economy, companies should pay more attention to intellectual property (IP) innovation and management. The latest trends in chatbot development can therefore be expected to emerge from collective patent information. Through the patent layout (or landscape), important technology development trends can be evaluated, the development directions of major international manufacturers can be identified, and international technology benchmarks can serve as a reference for subsequent R&D investment decisions [
According to statistics from the World Intellectual Property Organization (WIPO), more than 80% of emerging technologies with commercial value are patented, which shows that patent databases contain comprehensive domain knowledge. The purpose of a patent database is not only to support prior art searches but also to provide a wealth of information for future R&D. For example, once key patents are found, technology development trends can be extrapolated, the technical contents of domain patents can be analyzed, and the core countries, assignees, and inventors of the key technologies can be identified. By making good use of such patent information, companies can develop various business and management strategies [
In order to understand the latest emerging technologies of chatbots, this study takes “natural language-enabled chatbots” as the domain for relevant patent technology exploration. Thus, the overall chatbot technological development trends can be discovered and future research directions can be suggested.
Investigating natural language-enabled chatbots first requires a well-constructed knowledge ontology. The global patent management landscape map and the technology function matrix (TFM) are then presented, followed by a discussion of the analytical results that shows the technology trends found and verified against the matching literature. In this study, text-mining tools such as clustering and topic modeling are used. Saura [
Past patent reviews are usually analyzed by experts. However, with the increasing number of patents and the development of information technology [
Govindarajan et al. [
In a knowledge-based economy, the economic status of a country depends on the production, distribution, and use of knowledge and information. The latest trend of economic growth in various countries mainly depends on the individual’s innovative technological knowledge, which is an important reason why intellectual property has attracted attention. Information related to intellectual assets, such as technical insight and legal status, cannot be obtained from any other literature search except for the patent database. Thus, the importance of the patent database can be revealed [
Derwent World Patents Index (DWPI) and the smart search function are two major features of DI. DWPI is produced by experts who read the entire official patent disclosure and then translate it, rewrite the key abstract, correct errors, and normalize patent holder names; the result is considered the essence of the patent content. The DWPI rewritten items include novelty, use, advantage, technical focus, detailed description, drawing description, activity, and mechanism. Every DI operation simultaneously searches the official patent publications and the DWPI value-added database to obtain more complete results, which is a unique feature of DI. Smart search analyzes the input string semantically, automatically expands keywords, and then goes through multiple calculation steps, including weighting of classification numbers and of citations, to find patents related to the input technical description. Grammar matters little here, because smart search removes conjunctions, prepositions, and similar words from the description and retains only the technical keywords. Therefore, whether the words in the technical description are accurate, and whether they are mixed with too many unnecessary technical conditions, influences the search results more than grammar does. If the keywords retained by smart search are not as expected, or the first results do not meet the requirements, the query can be adjusted manually by adding new keywords in the search pane or removing possible noise so that smart search recalculates the results. After several adjustments, the results come closer to what is needed. Smart search is thus an iterative process whose purpose is to find potential targets quickly; to retrieve all related patents without omission, a general patent search technique is more suitable [
An ontology map for a specific domain connects the relevant subjects and key terms and provides a knowledge-rich structure that can serve as the basis for analyzing technologies in depth. Weng et al. [
Patent documents contain important research results. However, they are lengthy and rich in technical terms, so analysis requires a lot of manpower, and there is an urgent need for automatic tools to assist patent engineers or decision makers in patent analysis. The importance of patent mining is thus seen. Patent-mining technology includes text segmentation, abstract extraction, feature selection, term association, cluster generation, topic identification, and information mapping [
In a patent analysis of drones, LDA identified the three most active technology development themes: communication technology, power supply, and navigation systems [
To further focus on the patent development context of a specific technical field and find a technical minefield or a technical blue ocean zone, it is necessary to analyze the technical location and function of each patent through a more detailed TFM, and further explore in-depth strategies, such as technological innovation or avoiding development conflicts [
In industry practice, researchers usually read the collected patents one by one and classify them by technical field and effect according to their professional judgment. This manual classification consumes a lot of time, and it is difficult to obtain a comprehensive review by interpreting a large number of patent documents. Many recent studies have tried to find a more efficient way to construct a TFM. Yang and Ren [
Table
Comparison for studies related to patent and technology mining techniques.
Part | Task | Method | Proposed framework | [ | [ | [ | [ | [ | [ | [ | [ | [ | [ | [ | [ | [ | [ | [ | [ | [ |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Preprocessing | Data processing | Text preprocessing | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |||||||||
Processing | Key term extraction | TF/TF-IDF | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ||||||
Skip-gram | ✓ | ✓ | ✓ | ✓ | ||||||||||||||||
Patent management map | Yearly trend | ✓ | ✓ | ✓ | ||||||||||||||||
Assignee | ✓ | ✓ | ||||||||||||||||||
CPC/IPC | ✓ | ✓ | ✓ | ✓ | ||||||||||||||||
Postprocessing | Text-mining-based approach | Clustering | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ||||||||||||
Topic modeling | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ||||||||||||||
Classification | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ||||||||||||
Visualized approach | Semantic tree analysis | ✓ | ||||||||||||||||||
Node-relation graph | ✓ | ✓ | ✓ | ✓ | ||||||||||||||||
TFM | ✓ | ✓ | ✓ | ✓ | ✓ | |||||||||||||||
Purpose | Prior art patent search | ✓ | ||||||||||||||||||
Classification | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ||||||||||||||
Ontology construction | ✓ | ✓ | ✓ | ✓ | ✓ | |||||||||||||||
Finding emerging technologies | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
In the preprocessing part, the use of natural language processing for text preprocessing is mentioned in most articles, and the corresponding algorithms, tools, or kits are quite mature. Although some articles did not specifically mention this part, it is believed that this part, as a relatively mature part, should have been implemented. Two main tasks, key term extraction and patent management map, are included in the processing part. The TF-IDF method is widely used in the key term extraction task and can almost be regarded as a standard configuration. Skip-gram is an important method to study the contextual relationship, and it is often used in the research that uses the contextual relationship as the vectorization method. Patent management map, or patent map analysis, is a statistically-based data analysis method that has been widely used, with a database and business intelligence tools to visualize patent portfolios. Patent management map only involves data sorting and presentation, which does not conform to the current general definition of text-mining. Therefore, it is hardly mentioned in the research of patent analysis by text-mining in recent years. Among them, only the patent classification code will be referenced as a benchmark to verify whether the results of the text-mining-based approach are valid and consistent. The postprocessing part contains two parts: text-mining-based approach and visualized approach. The main methods of the former are clustering, topic modeling, and classification; the latter is mostly based on the expression of node-relation graph. Although TFM is less common, it is still one of the good visualization tools for exploring emerging technologies.
The main purpose of these studies is focused on classification, ontology construction, and finding emerging technologies. Classification is very basic, and the patent data itself already have classification codes, such as IPC or CPC. Researchers who use classification methods in postprocessing parts have a clear aim at classification. Ontology construction aims to clarify the technical details and scope of a specific field, and clustering and topic modeling methods can achieve this goal well. Both classification and ontology construction only obtain and analyze existing data, but in order to explore emerging technologies, it is necessary to find rules or discover changes in trends from the data.
The framework proposed in this study completely includes the three parts of preprocessing, processing, and postprocessing. In addition, this research also performed patent management map analysis and compared the results with text-mining to explore emerging technologies and verify the ideas and conclusions put forward in this research.
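Because a patent management map only involves sorting and counting bibliographic fields, its core can be sketched in a few lines. The records below are hypothetical placeholders (the assignee names and the year values are invented, and the CPC codes are merely examples of the classification format), and `management_map` is an illustrative helper rather than the tool used in this study.

```python
from collections import Counter

# Hypothetical records for illustration; not taken from the study's data set.
patents = [
    {"year": 2018, "assignee": "Company A", "cpc": ["G06F40/30", "G10L15/22"]},
    {"year": 2019, "assignee": "Company B", "cpc": ["G06F40/30"]},
    {"year": 2019, "assignee": "Company A", "cpc": ["G10L15/22", "H04L51/02"]},
    {"year": 2020, "assignee": "Company A", "cpc": ["G06N3/08"]},
]


def management_map(records):
    """Count patents per year, per assignee, and per CPC subclass."""
    yearly = Counter(r["year"] for r in records)
    assignees = Counter(r["assignee"] for r in records)
    # The first four characters of a CPC code identify its subclass (e.g. G06F).
    # Each patent is counted once per subclass, even if it lists several codes in it.
    subclasses = Counter(
        sub for r in records for sub in {code[:4] for code in r["cpc"]}
    )
    return yearly, assignees, subclasses


yearly, assignees, subclasses = management_map(patents)
```

The three counters correspond to the yearly trend, assignee, and CPC/IPC rows of the comparison table and can be passed directly to any charting tool.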
Figure
The ontology construction process flow.
The four levels are patent retrieval, patent clustering and target domain selection, topic modeling, and keyword generation. The two aspects are research process and ontology construction. At level 1, key terms about natural language-enabled chatbots are identified, and smart search on DI is used for patent retrieval. The 50 most relevant patents are then skimmed to check whether they match the subject of this study. If not, the search query is adjusted and the retrieval is repeated until the records are in line with the subject. At level 2, the DWPI title, DWPI abstract, and independent claims are used for k-means clustering, and the silhouette score is used to evaluate the appropriate number of clusters. After clustering, normalized TF-IDF (NTF-IDF) is used to identify the key words and key phrases, which are again checked against the subject. If they do not match, the process returns to level 1 and the search query is adjusted; this is repeated until suitable target domains are found. At level 3, topics for each domain are found in two different ways: the LDA model is used for the NLP, model, and system domains, while manual induction is used for the applied scenarios domain. To discover deeper topics or concepts at this level, the patent search conditions are reset for each domain before the LDA method is applied. After each execution, the results are examined to determine whether the topics of the domain are clearly identified; if not, the patent search conditions are adjusted again. The topics of each domain are determined through this iterative process. Finally, sorting out the key words and key phrases from levels 2 and 3 completes the construction of level 4.
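As a rough illustration of the level-3 topic modeling step, the following toy collapsed Gibbs sampler fits an LDA model to tokenized documents. It is a minimal sketch and not the implementation used in this study, which this section does not specify; the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the sample documents are all assumptions made for illustration.

```python
import random


def lda_gibbs(docs, n_topics, n_iter=100, alpha=0.1, beta=0.01, top_n=3, seed=0):
    """Toy LDA via collapsed Gibbs sampling.

    docs is a list of token lists. Returns the top_n words of each topic and
    the document-topic count matrix.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]              # n_dk
    topic_word = [[0] * n_words for _ in range(n_topics)]   # n_kw
    topic_total = [0] * n_topics                            # n_k
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        row = []
        for w in doc:
            k = rng.randrange(n_topics)
            row.append(k)
            doc_topic[d][k] += 1
            topic_word[k][index[w]] += 1
            topic_total[k] += 1
        assign.append(row)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k, wi = assign[d][i], index[w]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wi] -= 1
                topic_total[k] -= 1
                # p(z = t | rest) is proportional to
                # (n_dt + alpha) * (n_tw + beta) / (n_t + V * beta).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wi] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wi] += 1
                topic_total[k] += 1
    topics = []
    for k in range(n_topics):
        ranked = sorted(vocab, key=lambda w: -topic_word[k][index[w]])
        topics.append(ranked[:top_n])
    return topics, doc_topic


# Hypothetical tokenized abstracts, for illustration only.
sample_docs = [
    ["speech", "recognition", "audio", "speech", "voice"],
    ["voice", "audio", "speech", "input"],
    ["entity", "semantic", "syntax", "word", "entity"],
    ["word", "semantic", "phrase", "syntax"],
]
sample_topics, sample_doc_topic = lda_gibbs(sample_docs, n_topics=2)
```

Each row of the returned document-topic matrix shows how many tokens of a document were assigned to each topic, and the top words per topic are what an analyst would inspect to name the topic. In line with the iterative process described above, unclear topics would prompt an adjustment of the search conditions and another run.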
Smart search on DI provides a semantic search tool that offers a quick path to related patents from simple search terms. The algorithm behind it replicates the strategies used by expert searchers to provide a manageable result set that matches the user’s intent. With smart search, it is not necessary to list all probably related terms before searching; the records found are related to the technology described by the input terms even if they do not contain those exact terms. Smart search automatically sorts the result set by relevance score so that the content best matching the search terms is shown first.
To obtain a well-constructed ontology, the main purpose is to cover as wide a range of technologies in the field as possible rather than to focus on specific technologies, which would cause some emerging technologies to be missed. Smart search has the advantage of intelligence, but its limit of 1,000 records corresponds to only about 450 to 550 DWPI families on average, which is small relative to the number of patents related to NLP chatbots. The results of the patent search are used as the data source for the clustering task at level 2. To use more patents for clustering, traditional patent search, which searches patents directly from a user-supplied list of terms, was also tried. Although traditional search finds more patents, it misses emerging technologies or applications that are not widely discussed or are not yet recognized. After several rounds of trials, this study selected the 508 DWPI families found by smart search as the results of level 1; the search is shown in Table
Search query for clustering.
Search type | DI query | Result |
---|---|---|
Smart search | SSTO = (“natural language processing” “natural language understanding” “NLP” “NLU” “chatbot” “VIRTUAL ASSISTANT” “INTELLIGENT ASSISTANT” “automated conversational interface”) | 508 DWPI families |
At level 2, the patents obtained from the previous level are clustered, and target domains are discovered from the results. The process begins by extracting the words in each patent document and vectorizing them with NTF-IDF, so that the resulting numeric vectors can be used for k-means clustering. After that, the top words and top n-gram phrases of each cluster are counted, and the target domains are selected from them.
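The vectorize, cluster, and rank steps above can be sketched with the standard library alone. Several assumptions apply: NTF-IDF is approximated here by smoothed TF-IDF scaled to unit length, since its exact formula is not given in this section; k-means uses a deterministic farthest-point initialization so the example is reproducible; documents are assumed to be non-empty token lists; and all function names and sample documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn non-empty token lists into unit-length (L2-normalized) TF-IDF vectors."""
    vocab = sorted({t for d in docs for t in d})
    df = Counter(t for d in docs for t in set(d))
    n = len(docs)
    vectors = []
    for d in docs:
        tf = Counter(d)
        raw = [tf[t] * (math.log((1 + n) / (1 + df[t])) + 1) for t in vocab]
        norm = math.sqrt(sum(x * x for x in raw))
        vectors.append([x / norm for x in raw])
    return vocab, vectors


def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, n_iter=100):
    """Lloyd's algorithm with farthest-point initialization (an assumption)."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(dist2(v, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(n_iter):
        new = [min(range(k), key=lambda c: dist2(v, centroids[c])) for v in vectors]
        if new == labels:  # assignments stopped changing
            break
        labels = new
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def top_terms(vocab, centroid, n=2):
    """Rank vocabulary terms by their weight in a cluster centroid."""
    ranked = sorted(range(len(vocab)), key=lambda i: -centroid[i])
    return [vocab[i] for i in ranked[:n]]


# Hypothetical tokenized patent texts, for illustration only.
sample = [
    ["speech", "recognition", "audio"],
    ["speech", "audio", "voice"],
    ["medical", "billing", "code"],
    ["billing", "code", "clinical"],
]
vocab, vectors = tfidf_vectors(sample)
labels, centroids = kmeans(vectors, k=2)
```

Ranking the terms of each centroid with `top_terms` gives the kind of per-cluster word list from which target domains are chosen; 2-gram phrases could be handled the same way by tokenizing into bigrams first.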
This study chooses the DWPI title, DWPI abstract, and independent claims as the source attributes for clustering. Patent documents may come from different countries, be written in different languages, and cover a large number of attributes. A patent exists to protect the inventor’s intellectual property or to serve as part of an enterprise’s knowledge portfolio. Unlike academic articles, patents are not written to be easily understood, and some information may even be deliberately obscured in the title, which hinders patent mining. The DWPI title and DWPI abstract provided by the DI database address these problems. DI employs editors with scientific and engineering backgrounds to read every patent and rewrite its title and abstract in easy-to-understand text, yielding the DWPI title and DWPI abstract, respectively. The editors remove legal jargon, use American spelling, and deliberately select the most informative drawing instead of simply taking the one on the front page. In addition, many studies have shown that the value of a patent is greatly affected by the number of its independent claims, so independent claims are also included as a source attribute for clustering.
After the patent documents are retrieved and vectorized, k-means is performed to reveal how the documents cluster in the vector space. The appropriate number of clusters is obtained by calculating the silhouette score, which rewards large distances between clusters and small distances within clusters. In this study, the 508 patents are grouped into 13 clusters, and the top 10 words and 2-gram phrases of each cluster are extracted through NTF-IDF (see Table
Top 10 words and 2-gram phrases in each cluster.
Cluster | Size | Top 10 words and 2-gram phrases |
---|---|---|
1 | 20 | Assistant, automate, user, input, language, natural, client, human, computer, processor |
Automate assistant, natural language, automated assistant, virtual assistant, assistant client, automate summarization, human computer, computer dialogue, input corpus, telephone call | ||
2 | 27 | Engine, language, natural, user, medical, code, processing, billing, generate, clinical |
Natural language, medical billing, billing code, language processing, language understand, patient encounter, clinical patient, free text, question answer, processing engine | ||
3 | 45 | User, request, response, language, natural, processing, action, query, input, generate |
Natural language, user request, action structure, language processing, language request, response user, speech input, dynamic training, computer readable, request text | ||
4 | 31 | Word, language, natural, phrase, computer, processing, plurality, sentence, processor, clause |
Natural language, target word, language processing, word clause, word phrase, neural network, input question, numeric code, program instruction, user interface | ||
5 | 46 | Text, language, natural, processing, user, processor, process, semantic, information, computer |
Natural language, language processing, language text, language understand, text interest, input text, semantic segment, information processing, touch operation, text natural | ||
6 | 52 | Plurality, language, entity, natural, generate, computer, associate, processing, document, name |
Natural language, name entity, language processing, computer readable, language input, cluster classification, reduced aggregation, flow diagram, machine learn, neural network | ||
7 | 64 | User, interface, language, NLP, natural, display, input, query, information, computer |
Natural language, user interface, language processing, user input, user query, graphical user, processing NLP, voice apparatus, real time, computer readable | ||
8 | 30 | Speech, user, input, recognition, processing, language, determine, computer, audio, natural |
Natural language, speech recognition, speech processing, language processing, input audio, speech input, computer implement, user profile, language understand, automatic speech | ||
9 | 52 | Input, user, intent, language, natural, determine, processor, generate, NLU, computer |
Natural language, language input, user input, user intent, voice input, language understand, computer readable, input determine, transitory computer, user interface | ||
10 | 35 | Communication, user, language, natural, input, interface, voice, computer, processor, call |
Natural language, voice input, language processing, communication interface, input communication, text communication, voice communication, user input, phone call, communication channel | ||
11 | 27 | Application, user, language, natural, NLU, input, computer, associate, plurality, processor |
Natural language, online application, speech word, user online, language processing, language input, part speech, language understand, dimensional vector, structured natural | ||
12 | 35 | Information, language, module, natural, user, entity, service, input, generate, obtain |
Natural language, touch screen, language understanding, language understand, language processing, object hovering, understanding module, component process, target conversation, question answer | ||
13 | 44 | Language, natural, program, processor, structure, computer, instruction, user, analysis, expression |
Natural language, program instruction, language expression, frame structure, language processing, computer readable, language understand, semantic structure, language story, computer program |
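The silhouette criterion used to settle on the number of clusters can be made concrete with a short sketch. For each point, a is the mean distance to the other members of its own cluster and b is the smallest mean distance to any other cluster; the point's score is (b - a) / max(a, b). The function below is an illustrative standard-library version using Euclidean distance (the distance measure used in this study is not stated here), and the sample points are hypothetical.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes a score of 0
        # Mean distance to the other members of the same cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # Mean distance to the nearest other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical 2-D points forming two well-separated groups.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_score(points, [0, 0, 1, 1])  # compact, well-separated clusters
bad = silhouette_score(points, [0, 1, 0, 1])   # clusters that straddle both groups
```

To choose the number of clusters, the score would be computed for each candidate k after clustering, and the k with the highest mean score retained.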
The top 10 words and top 10 2-gram phrases of the 13 clusters give a total of 260 terms. After the technical details of each term are examined individually, the terms are classified into 13 subdomains, which are combined to form 4 domains: NLP, model, system, and applied scenarios (see Table
Domains, subdomains, and terms.
Domain | Subdomain | Cluster | Key words/phrases |
---|---|---|---|
NLP | Cognition | 3, 13 | Action, expression, action structure, frame structure, semantic structure, automate summarization |
NLP | Named entity recognition | 6 | Entity, name, name entity |
NLP | Linguistics (syntactic, semantic, morphology) | 4, 5 | Part speech, word, phrase, sentence, semantic, plurality, document, clause, expression, target word, word clause, semantic segment, semantic structure, language expression |
NLP | NLU | 9, 11, 12 | NLU, user intent, user request, language understand, language understanding, understanding module |
NLP | Response | 3, 12 | Generate, question answer, response user |
NLP | Speech recognition | 8 | Speech, voice, audio, speech recognition, voice input, speech input, speech processing, input audio, speech word, automatic speech |
Model | Model | 6 | Engine, machine learn, program, neural network, cluster classification, processing engine, dynamic training, dimensional vector, reduced aggregation |
System | User interface | 7, 11 | Human, display, call, user interface, user input, touch operation, online application, phone call, telephone call, user online, graphical user, real time, human computer, touch screen, object hovering |
System | Medium | 9 | Computer, voice apparatus, computer readable, voice input, transitory computer |
System | Communication or channel | 10 | Communication, communication channel, communication interface, text communication, voice communication |
Applied scenarios | Personal | 1 | Service, assistant, automate, client, automated assistant, virtual assistant, assistant client |
Applied scenarios | Medical | 2 | Medical, billing, clinical, medical billing, billing code, patient encounter, clinical patient |
Skip | Skip | All | User, NLP, natural, language, interface, computer, analysis, query, recognition, process, processor, processing, structure, program, code, application, request, response, input, obtain, associate, service, instruction natural language, language processing, processing NLP, language input, input text, text interest, flow diagram, language text, determine, computer program, language input, information, input determine, input corpus, computer dialogue, free text, user query, language input, language request, input question, text natural, module, user input, user profile, component process, language story, structured natural, program instruction, frame structure, information processing, numeric code, computer implement |
This research takes natural language-enabled chatbots as its subject. Many NLP-related words, such as “NLP,” “natural language,” and “processing,” appear frequently in every cluster and therefore do not help to distinguish domains. In addition, many chatbot-related words, such as “processor,” “request,” “input,” and “module,” are very generic, which also makes domain exploration more difficult. These terms are skipped during domain selection. One preprocessing step before clustering is to vectorize the patent documents. Although those skipped terms in Table
The “NLP” domain contains cognition, named entity recognition (NER), linguistics (which includes syntax, semantics, and morphology), natural language understanding (NLU), response, and speech recognition. Nine clusters (3, 4, 5, 6, 8, 9, 11, 12, and 13) are distributed in the NLP domain. Cluster 3 involves two subdomains, cognition and response. For the cognition subdomain, the representative patent US9361884B2 (assignee: Nuance Communications Inc.) proposes a human-machine dialogue system that incorporates an NLU engine and a dialogue manager to provide an NLP application that identifies and resolves anaphora. For the response subdomain, patent US10417266B2 (assignee: Apple Inc.) proposes systems and processes for operating an intelligent automated assistant to provide a set of predicted responses. Clusters 4 and 5 focus on linguistics. Patent US20200327284A1 (assignee: ServiceNow Inc.) in cluster 4 proposes an agent automation system whose processor is configured to assign word vectors to nodes, encoding the semantic meaning of the word or phrase each node represents. The system generates an annotated utterance tree by using a combination of rule-based and ML-based components; the tree represents the syntactic structure of the utterance, and its nodes include word vectors that represent semantic meanings. The annotated utterance tree is used as a basis for intent or entity extraction. Patent EP3111338A1 in cluster 5 also uses automated text annotation for the construction of NLU grammars. Patent US10789426B2 in cluster 5 describes a device for processing natural language text with a context-specific linguistic model. Patent US10304444B2 (assignee: Amazon Tech Inc.) applies NLU to the music field, using a hierarchical organization of intents and entity types, together with trained models associated with those hierarchies, so that commands and entity types may be determined for incoming text queries without necessarily determining a domain for the incoming text. Although cluster 6 is mainly concentrated in the “model” domain, it also contains many terms related to “named entity.” A representative patent, US10755046B1 (assignee: Narrative Science), describes an NLP system for conversational inferencing with a four-step parsing process.
Cluster 8 focuses on speech recognition. Patents US10446147B1 and US20200118564A1 (assignee: Amazon Tech Inc.) describe a speech recognition system that provides a contextual voice user interface. Patents US9245525B2, US9741347B2, and US10049676B2 describe an interactive response system that mixes HSR subsystems with ASR subsystems to facilitate the overall capability of user interfaces.
Cluster 9 concerns NLU, and patent US9761225B2 (assignee: Nuance Communications Inc.) is representative. US9761225B2 proposes a method for identifying and resolving anaphora in a multimodal conversational dialogue application for smartphones, in which multiple NLU interpretation selection models may be generated. These may include a generic model and one or more specialized NLU interpretation selection models, each of which may be specific to a particular set of NLU interpretation types. A semantic reranking mechanism is applied in this method. Cluster 11 also concerns NLU capability but focuses more on the follow-up actions, which are more related to the “system” domain. Cluster 12 focuses on knowledge extraction in NLU. The representative patent is US10762113B2, which uses conversational knowledge graphs in virtual assistants to process natural language input received as queries from users at the virtual assistant’s NLU system. Cluster 13 also belongs to the cognition subdomain. Patents US9965461B2, US9594745B2, US9569425B2, and US20140249801A1 in cluster 13 (assignee: The Software Shop Inc.) describe a method for improving the efficiency of syntactic and semantic analysis.
The “model” domain, concentrated in cluster 6, has no subdomain, and its number of key words is relatively low. A possible reason is that neural networks are mainly mathematical algorithms and computers are only the carriers of the mathematical operations, so they do not by themselves make a technical contribution. In this case, the field in which technologies and functions are closely integrated has become an important basis for judging technical character. If AI is used only to analyze business data and no technical problem is solved, the invention is likely to be regarded as lacking technical ideas, and it is difficult to overcome the grounds for rejection by re-applying or amending [
“System” domain contains user interface, medium, and communication or channel subdomains, in which four clusters, cluster 7, 9, 10, and 11, are distributed.
As for “applied scenarios,” concentrated in clusters 1 and 2, terms such as “virtual assistant,” “medical,” and “billing” are found. In cluster 1, three patents assigned to Google LLC are representative of virtual assistants in the “personal” subdomain. Patent US20200320136A1 proposes a method that uses distributed state machines for human-to-computer dialogues with automated assistants to protect private data. Patent US20200050788A1 describes a system for assembling responses from remote automated assistants. Patent KR2020131299A proposes a method for generating Internet of things-based notifications by the automated assistant client of a client device. In cluster 2, three patents assigned to Nuance Communications Inc. are representative of medical billing and coding in the “medical” subdomain. Medical billing and coding are two closely related aspects of the modern health care industry. Both practices are involved in the immensely important reimbursement cycle, which ensures that health care providers are paid for the services they perform [
Three domains were found from the clustering results. It is particularly important to emphasize that a natural language-enabled chatbot is mostly composed of the three domains: NLP, model, and system. Since most of the related patents contain these three parts at the same time, assigning each patent to exactly one domain is difficult and would not be meaningful.
According to the ontology construction process (see Figure
Topics in each domain.
Domain/method | Search type | Query | Input patent size | Topics |
---|---|---|---|---|
NLP/LDA | Smart | SSTO = (“natural language processing” “linguistics” “natural language generation” “natural language understanding” “speech recognition”) | 570 | Linguistics, conversation, speech recognition, knowledge |
Model/LDA | CTB | CTB = ((chatbot) or (automated adj conversation | 2,535 | Features, voice device, question answer, classification, graph, automatic service |
System/LDA | Smart | SSTO=(“natural language processing” “natural language understanding” “NLP” “NLU” “chatbot” “automated conversational interface”) AND SSTO = (“user interface” “medium” “communication” “channel” “immersive technology” “computer vision”) | 534 | User interface, dialogue management, infrastructure |
Applied scenarios/manual | CTB | CTB = (((chatbot | 31 | Engineering, e-commerce |
Keywords for each topic.
Domain | Topic | Keywords |
---|---|---|
NLP | Linguistics | Personality, AI, discourse, syntactic |
NLP | Conversation | NLU, semantic, NLG, intent |
NLP | Speech recognition | Audio signal, processor, channel |
NLP | Knowledge | Entity, ontology, semantic, cognitive identification |
Model | Features | Semantic, vector representation, image recognition |
Model | Voice device | Storage, server, control module |
Model | Question answer | Pair, retrieval, RNN |
Model | Classification | Segmentation, convolutional, encoder |
Model | Graph | Entity, ontology, intent |
Model | Automatic service | Call, recommendation |
System | User interface | Portable, network, wireless, digital |
System | Dialogue management | NLG, processor, ML |
System | Infrastructure | Channel, cloud, communication |
This research seeks the application fields of NLP chatbots, but many experts describe natural speech-related technologies or system frameworks for conversation management, which are not discussed in this section. The application scenarios are divided mainly into engineering applications and e-commerce applications. The patent search results show that natural language-enabled chatbots are widely used in e-commerce, while engineering applications are hard to find. In total, 44 patents were reviewed manually and classified by topic and scenario. These patents are listed by applied scenario in Table
The manual induction result by applied scenario.
Topics | Scenario | Publication numbers |
---|---|---|
E-commerce | Medical | US20170323060A1, WO2020061562A1, US10679345B2, US20200185102A1, JP2020518047A, US20200027535A1, CN111612752A, US10754925B2, US10319004B2 |
E-commerce | Health | CN109591024A, US10748644B2 |
E-commerce | Driver assistant | KR2020000621A, WO2020069517A3, US10752212B2, CN111145731A, US20200216089A1, US10543931B2, US10573299B2, US20200135183A1, EP3606797A2 |
E-commerce | Exercise | KR2173553B1, US20200114207A1 |
E-commerce | Education | US10223934B2 |
E-commerce | Emotion | US10579742B1, CN111312394A |
E-commerce | Smart home | IN202041050057A, CN110654738A |
E-commerce | Customer service | CN108282587B, IN201821029643A, CN111902878A |
E-commerce | Smart assistant | US10748526B2, US10747958B2, US10733375B2, EP3753017A1 |
E-commerce | Entertainment | EP3566399A4 |
Engineering | Robot | CN111645073A, CN111267097A, JP2020526402A, JP06792132B2, US20200306958A1 |
Engineering | Programming | US10843080B2 |
Engineering | Manufacturing | CN107632845B, DE102018212503A1 |
Engineering | Quality control | WO2020181365A1 |
Some representative patents under the e-commerce topic are as follows. Patent US20170323060A1 describes a system for facilitating automated natural language understanding for medical documentation of patients, in which a processor presents a set of medical billing codes for user review in a graphical user interface (GUI) before the coding of an encounter is finalized. Patent KR2020000621A describes a conversation system for grasping user attention in various situations in a vehicle by using a mobile device. The system has a storage unit for storing situation information collected from the vehicle. A dialogue management module obtains a factor value of the action factor used to perform an action corresponding to a dangerous situation when an input processor obtains an action corresponding to the starting situation from the storage unit. The input processor generates a dialogue to perform the action corresponding to the dangerous situation by using the acquired factor value and generates a conversation message. A result processor generates a conversation response corresponding to a delivered starter message. Patent US10223934B2 proposes a method for monitoring and analyzing the language environment, vocalization, and development of a key child, which provides metrics associated with the key child’s language environment and development relatively quickly and cost-effectively. The method is used to promote improvement of the language environment and the key child’s language development and to track the development of the child’s language skills. The key child’s language environment and language development are monitored without placing artificial limitations on the key child’s activities or requiring a third-party observer.
Some representative patents under the engineering topic are as follows. Patent JP06792132B2 defines an information-processing apparatus that is used in manipulator control systems and NLP systems and offers high versatility. The apparatus has groups of processing modules, each equipped with several processing modules with specific processing capabilities. These processing modules have a neural network with a hierarchical structure, and information is processed by sending and receiving information signals between processing modules across several hierarchical levels. Patent CN111267097A proposes a natural language-based assisted programming method for industrial robots, which involves parsing language instructions, matching the parsing result, and combining the coordinate output to generate the final robot auxiliary code. The multiattention mechanism model adopted by the method improves recognition accuracy and solves the problem that current methods cannot accurately recognize objects in an industrial environment. The modular programming solution reduces programming complexity for engineers and effectively improves development efficiency. Patent US10843080B2 describes a system for facilitating automated program synthesis from natural language. The system lets a user rely on the familiar grammatical requirements for forming a proper sentence in a native language instead of memorizing the rules or required constructs of a potentially complicated programming language. The system employs fuzzy grammar matching to reduce complexity at a slight cost in accuracy. It also allows the user or developer to express an idea in a different manner to better reflect the user’s original intent.
Patent DE102018212503A1 defines communication and control systems, which have control devices for operating a machine based on a software communication chatbot, for filling beverages in bottling plants. The chatbot recognizes voice input and text input from an operator and outputs or displays information about the operating state of the machine. The systems realize production conversion of energy in an automatic manner and order completion in a rapid manner and improve media efficiency and scheduling efficiency. Patent WO2020181365A1 proposes an apparatus for 360-degree assistance for a quality control system (QCS) scanner with mixed reality (MR) and ML technology. The apparatus has an optical sensor, a display, and a processor that receives diagnostic information from a server related to a field device in an industrial process control and automation system. The processor identifies an issue of the field device based on the diagnostic information, detects, using the optical sensor, the field device corresponding to the identified issue, and guides the user, using the display, to the location and the scanner portion of the field device related to the issue. The processor then shows on the display the steps or actions necessary to resolve the issue and connects the user, through a cloud server, to modules for installation, commissioning, AMC, and training for a QCS as per the selected person.
In this section, the ontology map of NLP chatbot is drawn based on the previous outputs. A four-level ontology includes subject, domains, topics, and key phrases in a top-to-bottom sequence. Under the subject of NLP chatbot, the domains are NLP, model, system, and applied scenarios. The third level has the topics under each domain. For NLP domain, there are speech recognition, linguistics, conversation, and knowledge. For domain of model, topics are feature, graph, voice device, question answering, classification, and automatic service. For domain of system, the topics are infrastructure, dialogue management, and user interface. For applied scenarios, e-commerce and engineering are the two main topics. The fourth level has the key phrases under each topic. It is noticed that some key terms are shared by multiple topics. The ontology map of NLP chatbot is shown in Figure
NLP chatbot ontology.
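The observation that some key terms are shared by multiple topics can be checked mechanically. The following is a minimal sketch, not the authors' tooling: it transcribes the domain, topic, and keyword levels from the keyword table above into a nested dictionary (the applied scenarios domain is omitted because no key phrases are listed for it) and reports the key phrases that occur under more than one topic. The names `ONTOLOGY` and `shared_key_phrases` are illustrative assumptions.

```python
from collections import defaultdict

# Domain -> topic -> key phrases, transcribed from the keyword table above.
ONTOLOGY = {
    "NLP": {
        "Linguistics": ["personality", "AI", "discourse", "syntactic"],
        "Conversation": ["NLU", "semantic", "NLG", "intent"],
        "Speech recognition": ["audio signal", "processor", "channel"],
        "Knowledge": ["entity", "ontology", "semantic", "cognitive identification"],
    },
    "Model": {
        "Features": ["semantic", "vector representation", "image recognition"],
        "Voice device": ["storage", "server", "control module"],
        "Question answer": ["pair", "retrieval", "RNN"],
        "Classification": ["segmentation", "convolutional", "encoder"],
        "Graph": ["entity", "ontology", "intent"],
        "Automatic service": ["call", "recommendation"],
    },
    "System": {
        "User interface": ["portable", "network", "wireless", "digital"],
        "Dialogue management": ["NLG", "processor", "ML"],
        "Infrastructure": ["channel", "cloud", "communication"],
    },
}


def shared_key_phrases(ontology):
    """Return {key phrase: [(domain, topic), ...]} for phrases under 2+ topics."""
    seen = defaultdict(list)
    for domain, topics in ontology.items():
        for topic, phrases in topics.items():
            for phrase in phrases:
                # Compare case-insensitively so "Entity" and "entity" match.
                seen[phrase.lower()].append((domain, topic))
    return {p: locs for p, locs in seen.items() if len(locs) > 1}


shared = shared_key_phrases(ONTOLOGY)
for phrase, locations in sorted(shared.items()):
    print(phrase, "->", locations)
```

On the table as transcribed, “semantic” is the most widely shared phrase, appearing under Conversation, Knowledge, and Features.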
Related patents are searched by entering keywords related to NLP and chatbots on the DI database, and patent management map analysis is conducted (see Table
Search query for patent management analysis.
Search type | DI query | Result |
---|---|---|
Claim/title/abstract | CTB=((chatbot) or (automated adj conversation | 12,840 DWPI families |
Since 2017, 10,480 patents have been published, accounting for 82% of the 12,840 patents of the past decade; the 8,099 patents published since 2019 alone account for 62%. The annual growth rate in the number of patents was as high as 44% in 2014 but fell to 6% in 2015, the lowest rate of the past decade. From 2016 the annual growth rate rose sharply, peaking at 105% in 2019 before falling back to 66% in 2020. Whether the slower growth in 2020 is related to the impact of COVID-19 cannot be determined, but it may signal that the technology related to natural language-enabled chatbots is gradually maturing.
However, a single slowdown cannot support any conclusion without further data or evidence. IPC is a standard taxonomy developed and administered by WIPO for classifying patents and patent applications; it covers all areas of technology and is currently used by industrial property offices around the world. The annual IPC analysis shows that, by 2018, all of the IPC classifications in the patent set had already appeared. In other words, the 8,099 patents of 2019 and 2020, which account for 62% of the past decade’s total, introduced no new IPC classification, suggesting that no new technology has been produced.
The top six 4-character IPCs, each with more than 1,000 patents, are G06F (electric digital data processing), G06N (computer systems based on specific computational models), G06Q (data processing systems or methods), G10L (speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding), H04L (transmission of digital information), and G06K (recognition of data), with 8,870, 3,144, 2,413, 2,176, 1,364, and 1,258 patents, respectively (see Figure
Top 10 IPCs (4 characters).
G06F accounts for 8,870 of the 12,840 patents. Therefore, the complete IPC classification in G06F was further explored. Among the top 10 IPCs listed (see Figure
Top 10 IPCs in G06F.
In addition to statistics on the number of patents, the fluctuations in the number in recent years are also worthy of attention. Based on the annual growth rate of all patents, when the growth rate of an IPC is higher than average, it represents greater momentum; conversely, when the growth rate of an IPC is lower than average, it may imply that the technology has entered the mature stage early. The four 4-character IPCs with the largest number were selected for this analysis (see Figure
Annual patent growth rate under top IPCs.
G06F holds an overwhelming 69% of all patents, but its annual growth rate lags well behind the average. In 2014, the total number of patents related to natural language-enabled chatbots rose sharply by 44.37%, while G06F grew by only 41.40%, slightly below the average. Since 2016, during the period of rapid growth in patent numbers, the growth rate of G06F has not been outstanding. Even when the average growth rate peaked at 104.49% in 2019, G06F was 14.92% below the average. By contrast, the annual growth rate of G06N is striking. In 2014, it was 43.86% higher than the average, and from 2016 to 2020 it was 73.74%, 26.14%, 89.49%, 52.84%, and 74.29% higher than the average, respectively. The growth rates of G06Q and G10L fluctuate around the average and have not yet shown a clear trend.
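The comparison between an IPC's growth and the average growth can be made concrete with a short calculation. The sketch below is illustrative only: it assumes the annual growth rate is the usual year-over-year ratio, (N_t − N_{t−1}) / N_{t−1}, and that “higher than the average” means a difference in percentage points; neither definition is stated explicitly in the text. The yearly counts are hypothetical, and the function names are not from the source.

```python
def annual_growth(counts):
    """Year-over-year growth rates in percent for a list of yearly counts.

    A year following a zero count has no defined growth rate and yields None.
    """
    rates = []
    for prev, curr in zip(counts, counts[1:]):
        rates.append(None if prev == 0 else (curr - prev) / prev * 100.0)
    return rates


def gap_to_average(ipc_counts, total_counts):
    """Percentage-point gap between an IPC's growth and the overall growth."""
    ipc = annual_growth(ipc_counts)
    total = annual_growth(total_counts)
    return [None if a is None or b is None else a - b for a, b in zip(ipc, total)]


# Hypothetical yearly counts (NOT taken from the patent data set).
total_counts = [1000, 1500, 3000]
ipc_counts = [100, 200, 300]

print(annual_growth(total_counts))               # [50.0, 100.0]
print(gap_to_average(ipc_counts, total_counts))  # [50.0, -50.0]
```

A positive gap corresponds to an IPC with greater momentum than the patent set as a whole, and a negative gap to an IPC that may be maturing earlier.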
In general, the average annual growth rate began to slow after peaking in 2019, and no new IPC appeared after 2018; both observations indicate that the development of natural language-enabled chatbots has entered a mature stage. It is worth noting, however, that patents classified under G06N are still growing rapidly.
Assignee analysis helps to identify the main players in the market, and the results show that all of them are technology giants. IBM, ranked first, has 1,358 patents, roughly as many as the second- through tenth-ranked assignees combined. The well-known technology giants Apple Inc. and Facebook Inc. are ranked 16th and 17th, respectively. Although they are not in the top 10, they are also listed in the table because of their influence (see Table
Top 10 assignees.
Top | Assignee | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 | 2020 | Total |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | IBM | 12 | 14 | 22 | 24 | 26 | 129 | 231 | 313 | 498 | 1,358 | |
2 | Microsoft Technology Licensing LLC | 0 | 0 | 0 | 0 | 12 | 14 | 23 | 39 | 125 | 285 | |
3 | Amazon Tech Inc. | 0 | 2 | 0 | 0 | 8 | 16 | 15 | 35 | 57 | 197 | |
4 | Google LLC | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 15 | 36 | 185 | |
5 | Samsung Electronics Co. Ltd. | 2 | 4 | 4 | 3 | 5 | 6 | 8 | 19 | 27 | 152 | |
6 | Nuance Communications Inc. | 4 | 4 | 9 | 14 | 12 | 12 | 17 | 15 | 15 | 23 | 125 |
7 | Accenture Global Solutions Ltd. | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 7 | 77 | 123 | |
8 | Beijing Baidu Netcom SCI & TEC | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 6 | 10 | 109 | |
9 | Microsoft Corp | 21 | 21 | 14 | 17 | 8 | 2 | 4 | 3 | 2 | 2 | 94 |
10 | Univ Kunming Science and Tech | 0 | 1 | 0 | 0 | 1 | 3 | 1 | 15 | 23 | 91 | |
16 | Apple Inc. | 0 | 1 | 0 | 3 | 1 | 1 | 6 | 10 | 19 | 36 | 77 |
17 | Facebook Inc. | 0 | 0 | 0 | 2 | 1 | 3 | 5 | 4 | 19 | 38 | 72 |
IBM’s patents began to grow rapidly in 2016, when they were concentrated in the two categories G06F 17/30 and G06F 17/27, showing that IBM focused on information retrieval and grammar analysis in NLP. In 2019, the number of patents of Microsoft, Amazon, Accenture, and Univ Kunming Science and Tech began to grow significantly. In addition to G06F 17/27, Amazon and Microsoft use speech recognition technology based on natural language models in human-machine dialogue, which is mainly reflected in the two IPCs G10L 15/18 and G10L 15/22. In 2020, the number of patents of Google, Samsung, and Baidu increased rapidly at the same time. In addition to the two speech recognition categories G10L 15/18 and G10L 15/22 seen in 2019, both Google and Samsung have more patents in G06F 3/16, which focuses on the conversion between speech and digital information. Google and Baidu, on the other hand, applied for many patents in G06N 3/08, which covers computer systems based on learning methods. Baidu also has a large number of patents in G06F 40/30 for semantic analysis. Google and Baidu are both Internet service companies that started as search engines, and Google and Samsung are close partners in the Android camp. The sharply increasing number of patents assigned to these three companies, which are quite close to the end user, might imply that this technology field has reached the stage of maturity and mass application. The IPC distribution of Apple Inc.’s patents in 2019 and 2020 shows that they are highly concentrated in the speech recognition-related G10L 15/18, G10L 15/22, and G06F 3/16, similar to Google. Google and Apple both began filing large numbers of patents on speech recognition and on the conversion between speech and digital information in 2019. The clues can also be seen from their products.
The Google Nest Mini launched in November 2019 and the Apple HomePod launched in August 2019 show the development path from smart speaker to smart home. With the maturity of natural language technology and IoT, the use of natural language to control objects in daily life will gradually replace the previous methods of operating through buttons or limited system interfaces. While other companies focus on deepening NLP-related technologies or developing speech recognition applications, Facebook Inc. has paid more attention to electric communication techniques, including H04L 12/58 and H04L 29/08. The two IPC codes represent message switching systems and transmission control procedures in network communication, respectively.
A Technology Function Matrix (TFM), which investigates the corresponding relation between technologies and functions on patent amount, is a critical approach for patent data analytics. The domain of NLP, model, and system, which is introduced before in Section
This research uses the TF-IDF-based TFM automatic construction method. After defining the technologies and functions, an unstructured text description that best represents each technology or function must be prepared. These text descriptions are transformed into a set of vectors through unsupervised learning, which acts as an agent for each technology or function. Then, specific fields are selected from each patent, converted into a vector, and compared with each technology and function through similarity, and a threshold is used to determine whether the patent can be classified as the technology or function. Thus, the text description of each technology or function is very important. Sections
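The classification step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the text descriptions, the patent texts, the smoothed IDF formula, the cosine similarity measure, and the threshold value of 0.3 are all hypothetical choices, and the function names (`fit_idf`, `tfidf`, `cosine`, `build_tfm`) are introduced only for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def fit_idf(docs):
    """Smoothed inverse document frequency learned from the description texts."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}


def tfidf(text, idf):
    """Sparse TF-IDF vector; terms outside the fitted vocabulary are ignored."""
    tf = Counter(t for t in tokenize(text) if t in idf)
    return {t: c * idf[t] for t, c in tf.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def build_tfm(patents, technologies, functions, threshold=0.3):
    """Count patents whose text is similar to both a technology and a function."""
    idf = fit_idf(list(technologies.values()) + list(functions.values()))
    tech_vecs = {k: tfidf(d, idf) for k, d in technologies.items()}
    func_vecs = {k: tfidf(d, idf) for k, d in functions.items()}
    tfm = {(t, f): 0 for t in technologies for f in functions}
    for text in patents:
        vec = tfidf(text, idf)
        techs = [t for t, tv in tech_vecs.items() if cosine(vec, tv) >= threshold]
        funcs = [f for f, fv in func_vecs.items() if cosine(vec, fv) >= threshold]
        for t in techs:
            for f in funcs:
                tfm[(t, f)] += 1
    return tfm


# Hypothetical descriptions and patent abstracts, for illustration only.
technologies = {
    "T01": "speech recognition converts spoken audio signal into text",
    "T10": "cloud computing server infrastructure with remote storage",
}
functions = {
    "F1": "information extraction retrieves entities and facts from text",
    "F6": "automated control operates devices and machines",
}
patents = [
    "A speech recognition method that converts audio into text "
    "for information extraction of entities",
    "A cloud computing server with remote storage that operates devices "
    "by automated control",
]

tfm = build_tfm(patents, technologies, functions)
for cell, count in sorted(tfm.items()):
    print(cell, count)
```

Because a patent is counted in every cell whose technology and function both pass the threshold, one patent can contribute to several cells, which is why the quality of each text description and the choice of threshold matter so much.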
13 TFM technologies, listed below in Table
TFM technologies.
ID | Domain | Technology |
---|---|---|
T01 | NLP | Speech recognition |
T02 | NLP | Named entity recognition |
T03 | NLP | Natural language understanding |
T04 | NLP | Natural language generation |
T05 | Model | Feature engineering |
T06 | Model | Recurrent neural network |
T07 | Model | Convolutional neural network |
T08 | Model | Transformer model |
T09 | System | Speech-generating device |
T10 | System | Cloud computing |
T11 | System | Voice activity detection |
T12 | System | Human-computer interaction |
T13 | System | Immersive technologies |
Nine TFM functions, namely information extraction, dialogue management, context prediction, recommendation system, algorithm efficiency, automated control, communication, user experience, and virtual assistant, are listed in Table
TFM functions.
ID | Function |
---|---|
F1 | Information extraction |
F2 | Dialogue management |
F3 | Context |
F4 | Recommendation system |
F5 | Algorithm efficiency |
F6 | Automated control |
F7 | Communication |
F8 | User experience |
F9 | Virtual assistant |
For finding emerging trend of natural language-enabled chatbot, year 2020 patents are used as the source for TFM. The 13 × 9 TFM result is obtained through the automated process described before (see Table
The TFM result.
ID | Domain | Technology | F1 | F2 | F3 | F4 | F5 | F6 | F7 | F8 | F9 |
---|---|---|---|---|---|---|---|---|---|---|---|
T01 | NLP | Speech recognition | 703 | 673 | 343 | 307 | 452 | 484 | 630 | ||
T02 | NLP | Named entity recognition | 412 | 484 | 301 | 339 | 852 | 134 | 318 | 217 | |
T03 | NLP | Natural language understanding | 809 | 503 | 514 | 195 | 177 | 627 | 87 | 249 | 188 |
T04 | NLP | Natural language generation | 989 | 724 | 827 | 308 | 348 | 759 | 105 | 378 | 272 |
T05 | Model | Feature engineering | 571 | 377 | 306 | 436 | 372 | 470 | 136 | 299 | 190 |
T06 | Model | Recurrent neural network | 569 | 386 | 613 | 257 | 244 | 332 | 190 | 110 | 161 |
T07 | Model | Convolutional neural network | 317 | 251 | 446 | 287 | 327 | 208 | 149 | 98 | 144 |
T08 | Model | Transformer model | 514 | 422 | 714 | 308 | 452 | 448 | |||
T09 | System | Speech-generating device | 622 | 384 | 792 | 998 | 963 | ||||
T10 | System | Cloud computing | 341 | 422 | 237 | 358 | 213 | 465 | 487 | 452 | 458 |
T11 | System | Voice activity detection | 309 | 377 | 241 | 283 | 120 | 379 | 780 | 557 | 705 |
T12 | System | Human-computer interaction | 685 | 858 | 509 | 512 | 260 | 626 | 803 | 850 | |
T13 | System | Immersive technologies | 307 | 417 | 211 | 272 | 123 | 211 | 439 |
The function to which speech recognition is most often applied is information extraction (F1). The accuracy of speech recognition is the key to determining whether it can be applied commercially, and good information extraction ability is a necessary condition. Although speech recognition technology has gradually matured, a large number of patents in this field still pursue better recognition and information extraction capabilities.
Google LLC’s patent US10431206B2 uses a hierarchical recurrent neural network (HRNN) structure to handle the task of multiaccent speech recognition. Patent CN110033766A proposes a complex multiple deep neural network architecture, including a single-layer one-way RNN model, a binary bidirectional RNN model, a binary bidirectional LSTM (BiLSTM) model, and other network structures, in pursuit of faster speed and lower energy consumption. Patent EP3497630B1 uses a CNN architecture, which allows better signal propagation and long-range dependency learning, thus improving output quality.
In addition, speech recognition and automated control functions (F6) are combined with each other to form the application of speech-driven automated control. When receiving speech data from the client, speech recognition and NLU model stored in the cloud are used to interact with other devices in the cloud space, such as unmanned aerial vehicles (UAVs), robots, augmented reality (AR), and virtual reality (VR) devices, through AI modules and 5G network technology.
In order to improve the accuracy of NER, preprocessing is very important. Patent CN110990525A proposes a sentiment-based information extraction method that achieves good performance in the field of financial sentiment information extraction through preprocessing and feature extraction modules. Data labeling and feature engineering are the two main steps in preprocessing. Patent CN111783466A proposes a named entity recognition method for Chinese medical record field, in which the label uses two-layer conditional random field (CRF) classification to determine the final output label thus improving the accuracy of NER and reducing the time consumed by training. There is similar research in literature studies. In view of the insufficient representation of potential features of Chinese characters, Han et al. [
The transformer model is widely used to improve the accuracy of the information extraction function (F1). Patent CN110941698A proposes a method based on the bidirectional encoder representation on BERT CNN, which generates rich contextual semantic information of word vectors, thereby effectively supporting service similarity calculation to find the most accurate target service, and achieving accurate retrieval of target services.
As for dialogue management function (F2), patent CN111274362A proposes a dialogue generation method based on the transformer architecture, which involves obtaining a vectorized representation of words, and generating a reply based on a comprehensive semantic vector and a copy mechanism, which is used to solve the NLG based on background domain knowledge dialogue. Patent US20200372341A1 proposes a pipelined natural language question answering system based on the BERT model, which involves receiving an input text of a natural language question and provides an answer to the natural language question considering context.
The transformer model is used in the context (F3) function to improve the accuracy of NLP. Patent CN110737764A proposes a method for generating personalized dialogue content based on a multiround dialogue model. The transformer model effectively learns the sequence relationships in natural language dialogue and can predict the generated content, which reduces the probability of generic replies and increases the diversity of dialogue content. Patent CN111708882A proposes a method for complementing missing Chinese text information based on a transformer encoder. The method starts by manually preprocessing Chinese text documents, dividing the text into a large corpus of short sentences, and converting it into the smallest unit of BERT vectors. Since the purpose is to find the missing words and sentences in an article, the training method randomly generates noise that hides words in the complete article to simulate omissions. In turn, to be able to fill in the missing words, the model must have text generation capabilities. Through repeated deletion and generation of information, the accuracy of Chinese natural language processing tasks is further improved.
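The idea of training on randomly hidden words can be illustrated with a small masking routine. This is a generic sketch of random token masking, not the procedure claimed in CN111708882A: the `[MASK]` placeholder, the default masking probability of 0.15, and the function name `mask_tokens` are assumptions made for illustration.

```python
import random

MASK = "[MASK]"  # assumed placeholder token, not taken from the patent


def mask_tokens(tokens, mask_prob=0.15, seed=None):
    """Randomly hide tokens to simulate omissions in a complete text.

    Returns the masked sequence and a dict mapping each masked position to
    the original token, which serves as the target the model must generate.
    """
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets[i] = tok
        else:
            masked.append(tok)
    return masked, targets


tokens = "the model must fill in the missing words".split()
masked, targets = mask_tokens(tokens, mask_prob=0.5, seed=0)
print(masked)
print(targets)
```

Each (masked sequence, targets) pair forms one training example: the model sees the sequence with gaps and is scored on how well it regenerates the hidden tokens.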
Speech-generating devices (T09) are highly related to the three functions of information extraction (F1), dialogue management (F2), and context (F3), with 1,190, 1,141, and 1,123 patents, respectively. T09 is also closely related to the speech recognition technology T01, but the classification of T09 in the “system” domain means that its description focuses more on the hardware or system framework, so that the boundaries between F1, F2, and F3 are blurred for T09. These large numbers of patents show that, with the maturity of Internet technology and mobile devices, earlier information retrieval systems have begun to be replaced by chatbots. While NLP technology was immature, rule-based chatbots could exert little influence. As NLP and speech recognition technologies have matured, speech-generating devices have developed rapidly and have been combined with chatbot applications, and task-oriented retrieval systems have begun to be replaced by speech query systems. Patent CN110111766A claims a multifield multitask system, which solves the problem of multidomain multitask switching in the dialogue system. The complex multitask dialogue system integrates a speech recognition module, a domain confidence state tracking module, a dialogue managing module, an NLG module, and a speech synthesis module so that semantic-level information can be shared between domains. Patent JP2020098308A proposes a voice inquiry system for information provision, in which the chatbot servers and the smart speaker operation server each use a DL model to accept a spoken question, infer the corresponding answer, and output it as speech.
The next step after the speech query system is speech-driven remote control, an idea supported by the 1,006 patents related to the automated control function. Patent US10748529B1 (assignee: Apple Inc.) proposes a voice-based digital assistant for home automation of voice-activated controllable devices, such as a TV, speaker, or camera. Speech-driven automated control is no longer uncommon, but its applications focus on devices without safety hazards, such as home devices. This also means that speech-driven automated control is still at an auxiliary stage and cannot replace existing functions. It is believed, however, that people will eventually want many functions that require physical contact to be replaced by voice control, and the first obstacle to overcome is noise. Because the sound source is not restricted, the device may receive unexpected sounds and trigger actions at any time. Therefore, a gateway may be required to avoid unexpected actions caused by noise. Patent US20140214414A1 proposes a communication system for use in automatic speech recognition applications, which can transmit commands through a wireless network to modify the gateway’s noise reduction processing state.
When it comes to smart homes, in addition to speech control, there are other automatic control methods through HCI. Patent CN110932953A proposes a smart home control method and device, which can receive the user’s control command for the target home, log in to the target home in the target network, perform control intelligently, and return a result message. This solution realizes uniform control of multiple home devices across different manufacturers and communication protocols.
The TFM shows that HCI technology is widely used to improve user experience (F8), with 909 patents at this intersection. Most people use chatbots to meet needs such as information retrieval or specific operational tasks, so it is most important to meet those needs in as few conversation turns as possible. Many patents also aim to shorten dialogues and improve dialogue efficiency, such as CN112015879A, CN110990594A, CN111488433A, and CN110827831A.
In addition to the HCI methods of contact and voice, the use of gaze tracking to help virtual assistants more accurately grasp the text or dialogue paragraph the user is paying attention to is an emerging application.
As mentioned in Section
The three-dimensional matrix.
Nine topics, including medical data, smart cities, IoT, data privacy, sustainable strategies, CRM, personalization, social media listening, and ML models, are identified as latent topics for future research based on data-driven strategies [
AI is making huge progress, with rapidly improving algorithms managing massive amounts of data; however, it is still not a knowledge-driven technology. The knowledge behind a natural language-enabled chatbot is very important for dialogue with humans. The early development of chatbots was mostly dominated by a single domain. It has been observed that more research has been directed towards open domain [
With the rapid development of the semantic web, a large amount of structured data has been made available on the web in the form of knowledge bases. Making these data accessible and useful to end users is one of the main goals of chatbots based on linked data [
Related patents in recent years have also focused on how a knowledge framework can improve NLU capabilities and on integrating the KG into the chatbot’s knowledge base. Patent US10733375B2 (assignee: Apple Inc.) provides a system and process for operating intelligent automated assistants. The process is based on a knowledge framework and can improve the validity of NLU: it analyzes the mapping between domain attributes and words in the natural language input, matches the analysis results against the data in the knowledge base, and determines the output response according to a ranking mechanism. Patent EP3362972A1 proposes a system for authoring visual representations of text-based natural language documents. It provides a user interface containing a document area, which enables the user to interactively generate visual representation information that accurately depicts the underlying source text. The system generates a node graph of at least one of the parse trees, the entity information, or the relational phrase information, and processes the document to determine relational phrase information indicating that a portion of the text has a relationship to at least one of a subject, verb, or object in the sentence that includes that portion. The system also generates another visual representation that links the nodes and the relations. Patent WO2020160264A1 proposes a method of identifying relevant data sets using training models related to topics of interest, involving access to one or more sources, each of which contains information systems and related methods used to organize, represent, find, discover, and access data. The embodiment represents information and data in the form of a data structure called a “feature graph.” The feature graph includes nodes and edges, where edges are used to “connect” nodes to one or more other nodes.
The nodes in the feature graph can represent variables, that is, measured objects, features, or factors. An edge in the feature graph may represent a measure of the statistical association between a node and one or more other nodes, retrieved from one or more sources. Data sets that represent or support the statistically correlated or measured variables are “linked” to form the “feature graph.” Patent US10762113B2 (assignee: Cisco) proposes the use of conversational knowledge graphs in virtual assistants to process natural language input. After receiving the user’s natural language query, the method retrieves contextual information from the conversational knowledge graph according to the intent, calls the corresponding back-end service, and obtains the response once the service has been performed. Finally, the response is translated into natural language and provided to the user. Similar studies appear in the literature. Zhong et al. [
Patent US20200317093A1 proposes a query response system that converts natural language queries into standard queries using neural networks, with a processor that determines the relevance of documents and returns those determined to be relevant. The application describes a system and method in which a received natural language query is converted into a standard query using a sequence-to-sequence neural network model. In some cases, the sequence-to-sequence model is associated with the layer of interest. The system performs searches using the standard queries and can return various documents. The documents obtained by the search are scored based at least in part on their conditional entropy, which is determined from the natural language query and the documents.
The importance of AI and deep learning algorithms to chatbots is obvious. However, this kind of emerging technology is less visible in patent documents. Architectures commonly used in chatbots include LSTM, transformer, and RNN. Interestingly, a bidirectional mechanism is applied to almost all of them. Chatbot-related articles using bidirectional architectures have appeared in large numbers since 2019, accounting for more than 80% of the articles from all years (see Table
Number of chatbot-related articles using bidirectional architectures.
Search terms | No. of results (all years) | No. of results (since 2019) |
---|---|---|
Chatbot Bi-RNN | 30 | 25 |
Chatbot BiLSTM | 422 | 347 |
Chatbot BERT | 1,290 | 1,150 |
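The claim that articles since 2019 account for more than 80% of all results can be checked directly from the counts in the table. The following is a minimal sketch of that arithmetic; the dictionary layout and the helper name `share_since_2019` are illustrative choices, not part of the original study.

```python
# Article counts copied from the table above: (all years, since 2019).
counts = {
    "Chatbot Bi-RNN": (30, 25),
    "Chatbot BiLSTM": (422, 347),
    "Chatbot BERT": (1290, 1150),
}


def share_since_2019(total, recent):
    """Return the percentage of articles that were published since 2019."""
    return 100.0 * recent / total


shares = {
    term: share_since_2019(total, recent)
    for term, (total, recent) in counts.items()
}

for term, pct in shares.items():
    print(f"{term}: {pct:.1f}%")
```

Each of the three search terms yields a share above 80%, which is consistent with the statement in the text.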
Patent CN111267097A proposes a natural language-based assisted programming method for industrial robots, which parses language instructions, matches the analysis results, and combines the coordinate output to generate the final robot code. The invention generates executable code for the robot according to the language instructions and an image of the environment, and it is divided into three parts. First, an LSTM-based bidirectional recurrent neural network (Bi-RNN) and a fast regional convolutional neural network (F-RCNN) extract the features of the language instructions and of the factory environment. Second, an “attention mechanism” model serves as the alignment algorithm and correctly matches the machine translation of the instruction to the machine environment, so as to identify the specified object and output the coordinate points of the object. Third, the model output is used to generate operations that match the CoBlox modular programming model.
The technical development of DL in NLP has been quite mature. Although academic research is constantly pursuing better performance, it is already more than enough at the applied level. When applying any framework commonly used today, even with little training data, a chatbot is able to be perceived satisfactory by users [
Patent CN108282587B proposes a mobile customer service dialogue management method for the communication industry based on state tracking and policy guidance, which adopts a deep Q-network-based strategy optimization method to select the best action strategy. The method establishes a dialogue problem guiding strategy based on the partially observable Markov decision process (POMDP) model and applies an action to the user’s dialogue environment state through the internal action of the POMDP model, so that the state of the conversation environment changes and a certain return is obtained. The likelihood of executing a series of strategies is measured based on the cumulative returns obtained, which turns the problem into a strategy selection problem. A deep reinforcement learning algorithm for optimizing the problem guiding strategy is then constructed on the dialogue problem guiding strategy obtained from the POMDP model, and a deep Q-network (DQN)-based strategy optimization method is adopted to select the best action strategy.
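The idea of choosing dialogue actions by cumulative return can be illustrated with a much simpler stand-in. The sketch below uses tabular Q-learning instead of a deep Q-network and a fully observed toy environment instead of a POMDP; the slot-filling task, the reward values, and all names (`step`, `train`, `N_SLOTS`) are hypothetical and are not taken from the patent.

```python
import random

ACTIONS = ["ask", "answer"]
N_SLOTS = 2  # hypothetical: number of details the agent must collect first


def step(state, action):
    """Toy dialogue environment: return (next_state, reward, done)."""
    if action == "ask":
        # Each question costs one turn and fills one missing slot.
        return min(state + 1, N_SLOTS), -1.0, False
    # Answering ends the dialogue; it only succeeds once all slots are filled.
    return state, (10.0 if state == N_SLOTS else -10.0), True


def train(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_SLOTS + 1) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            # Move the estimate toward reward plus discounted future return.
            q[(state, action)] += alpha * (
                reward + gamma * best_next - q[(state, action)]
            )
            state = nxt
    return q


q_table = train()
policy = {
    s: max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(N_SLOTS + 1)
}
print(policy)
```

In this toy setting the learned policy asks until both slots are filled and only then answers. A DQN replaces the table with a neural network that estimates the same action values, and a POMDP formulation would replace the observed state with a belief over hidden user states.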
Chatbot has developed towards an integrated conversation system, where in the context of multiperson conversations, speech segmentation and speaker recognition algorithms have been the main research topics in recent years [
Patent CN111768768A proposes a voice processing method in the fields of AI, DL, NLP, and voice interaction that performs noise reduction on voice data sent by peripheral control equipment. The specific implementation scheme is as follows: in response to a voice recognition interface call request sent by the peripheral control device, start the voice recognition process; acquire the type of the peripheral control device; determine the target voice noise reduction mode according to that type; in that mode, perform noise reduction on the voice data sent by the peripheral control device; and finally perform voice recognition on the denoised voice data to generate text data. In this way, the method reduces the noise that other operations of the peripheral control device introduce into the voice data.
Interactive Smart Agents (ISAs), which are controlled by users through natural language dialogues, are becoming a part of life, especially in smart home scenarios [
Patent KR2020131299A (assignee: Google LLC) proposes a method of associating multiple remote automation assistant components through IoT devices, combined with voice recognition modules that monitor and send voice data. Patent US10543931B2 proposes a method for monitoring audible and message alerts received during flight in an aircraft IoT cockpit, which includes subsequently marking a cascaded message alert to associate it with a display element. After a plurality of alerts is received, including at least one audible alert or message alert, a first NLP task converts the audible alert into a text alert whose structure is consistent with the format used for aggregation, or into a cascaded message alert, to which a second NLP task is applied to identify the context.
According to the A-TFM results in Section
The study conducts a comprehensive patent review on emerging technologies of natural language-enabled chatbots. The contribution of this study is addressed in Section
The contribution of this study has three aspects. First, a patent analytic framework is proposed and shown to be effective. Second, emerging technologies are identified. Third, application trends are addressed.
The patent analytic framework starts with patent-based ontology construction, followed by the patent management map and TFM, and ends with the case study. The four-level hierarchical structure of the ontology is constructed with text-mining approaches, such as the k-means clustering algorithm and LDA topic modeling, to reduce human interference during the process. The ontology map can serve as the basis for strategic and sustainable R&D planning, from which researchers can quickly understand the development trends of key technologies and identify technology gaps. It is worth noting that in some past patent analysis articles, detailed patent query conditions were designed first, and the subsequent analyses were based on them [
The emerging technologies are summarized as follows. Knowledge is the basis of natural language-enabled chatbots, and the feature graph is a feature generation framework that has recently attracted attention. DL is the core method, and most DL algorithms are mature. In recent years, patents have focused on combining various DL algorithms so that they capture each other’s advantages and offset each other’s shortcomings. In speech technology, noise reduction is the recent focus of speech recognition. Sounds, including voices and the noise of operating equipment, are captured from the device and converted into refined text data through the integration of DL and NLP technologies. Furthermore, context is found to be the main research subject, whether in the exploration of the knowledge base or in the logic of the algorithm. Previous NLP research focused on unstructured text, but in recent years it has clearly turned to messages in dialogue. In unstructured text, term frequency-based methods can give good results, but messages in dialogue rely on a large number of pronouns and on the continuity and relevance of the context, so anaphora is more complicated. Moreover, applying NLP to daily conversations means facing a larger and broader domain and knowledge base. For this reason, chatbots of various specific domains are being integrated with each other to form more complete and powerful systems, which makes communication technology and system integration very important as well.
As for the application trend, the increasing number of patents shows the rapid development of NLP chatbots in recent years, and the macroscopic patent trend analysis reveals how this development unfolded. Patents related to natural language-enabled chatbots first appeared in 2014 and have grown rapidly since 2016. At first, they were mainly based on NLP and knowledge bases. By 2018, speech recognition and communication technology had matured, and a large number of applications began to appear in 2019. These applications are concentrated in Silicon Valley’s technology giants, and they have brought significant improvements to people’s lives. Natural language-enabled chatbots are widely used in e-commerce, with a focus on customer service and medical consulting. With the popularization of 5G network technology, more and more voice-driven applications, such as speech-driven automated control for IoT and system integration, together with immersive human-computer interaction interfaces, provide a better user experience. Beyond e-commerce, more applications across the product life cycle have begun to be observed. The application scenarios of natural language-enabled chatbots have clearly begun to shift from e-commerce to engineering applications, such as product design, engineering asset management, smart manufacturing, and workshop management. As an emerging smart system architecture using AI, the natural language-enabled chatbot has become a service integration solution through the integration of devices, algorithms, and network communication technologies. It is also expected to continue to affect traditional information system architectures in the future.
At present, the application of chatbot is still focused on personal assistants and customer services, and these application scenarios are limited to a very limited field of knowledge
From the early rule-based dialogue interaction systems to natural language interaction, coupled with the maturity of voice recognition technology, chatbots can now provide good dialogue quality in chit-chat and single-round dialogue. The bottleneck of service provision has shifted from system development to the establishment of in-depth domain knowledge bases. Many Internet service providers offer convenient application frameworks for setting up a chatbot as an automated customer service or personal service assistant. The success of a chatbot service depends on whether it accurately interprets the user’s context or intended question and possesses the knowledge base needed to fully support the context and provide accurate replies.
The limitation of a chatbot’s focus on a single domain has begun to be noticed, so the practice of integrating chatbots from multiple domains into a chatbot advisory group has appeared in recent patents and research. With these changes in chatbot system structure, knowledge from multiple domains is integrated into a complex system. In recent years, the strategy of focusing on data-driven innovation has led to new products and business models in emerging and developing digital markets. However, when knowledge is explored from data, user privacy is an issue that needs to be treated with caution [
To sum up, the role of the chatbot is shifting from simple information provision to complex information integration and versatile decision support, which means that reasoning, automatic dialogue, and interface control must be addressed. Patents on the control of electronic devices for smart homes or cars also support this idea.
The three main motivations of chatbot usage imply the importance of social media to the development of chatbot, the potential of chatbot, and immersive technology in the entertainment industry, and the issues of chatbot implementation [
As a platform on which people initiate conversations, social media has become the main chatbot interface for end users. The integration of social media and chatbots in e-commerce sites continues to grow and evolve rapidly.
The second most important application motivation is entertainment, which is rarely addressed in patent documents. The realism of chatbot is still insufficient, but it can already provide rich and interesting interaction. In terms of industrial development process, VR is at a similar stage. The VR experience itself is very attractive, just like an exciting game, so the user experience when creating a virtual environment is far more important than the degree of realism [
The third most important application motivation is social services, such as social care for the elderly living alone. In the 3D-TFM proposed in this research, some patents for chatbot applications in social services and education scenarios have indeed been observed. The Turing Test was proposed in 1950 as a method to examine how a machine behaves like a person [
The first limitation is that the data source selected for this study is patent documents from the DI collective global database
The smart search feature of the DI database uses natural language processing and deep learning methods to help find related patents that match the user’s domain description. Compared with traditional field search, this feature helps identify related patents faster and more accurately. Nonetheless, it makes a comprehensive patent set dependent on the paid DI database. The second limitation is that, even though data-driven ontology construction methods are investigated in this study, domain experts still need to be involved in the entire operation of the framework for two main purposes: key term extraction and result verification. When searching for patents in a specific domain, the relevant terms appear in a large number of patent documents. Although the TF-IDF vectorization mechanism considers both the frequency of a term and its uniqueness across all documents, the clustering results show that each cluster still contains a large number of common terms. In the results of topic modeling, these general terms are the main topics corresponding to the clustering results, which indirectly confirms the validity of the method of this research. However, even though the ontology is constructed from patent documents through a data-driven method, domain experts are still needed to verify its correctness. In addition, in the construction of the TFM, this research also explores the scenarios in which these technologies and functions are applied. Terms related to these scenarios are mentioned in the patent data but account for only a small number of words. This is also a limitation of TF-based text-mining methods.
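The behavior of term weighting described above can be made concrete with a small example. The sketch below assumes the classic formulation, tf(t, d) × log(N / df(t)); the study does not state which TF-IDF variant or library it used, and the three toy documents are hypothetical rather than drawn from the patent set.

```python
import math
from collections import Counter

# Hypothetical toy "patent abstracts"; not taken from the study's data set.
docs = [
    "chatbot speech recognition noise reduction speech",
    "chatbot knowledge graph query knowledge",
    "chatbot dialogue policy education dialogue dialogue",
]


def tf_idf(corpus):
    """Return one {term: weight} dict per document (classic TF-IDF)."""
    tokenized = [doc.split() for doc in corpus]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


weights = tf_idf(docs)
for doc_weights in weights:
    # Show the three highest-weighted terms of each document.
    print(sorted(doc_weights.items(), key=lambda kv: -kv[1])[:3])
```

In this variant, a term that occurs in every document (here "chatbot") receives a weight of zero, while a scenario term mentioned only once (here "education") receives a smaller weight than a frequently repeated technical term in the same document (here "dialogue"). Smoothed variants keep domain-wide terms at a small non-zero weight, which is one way such terms can remain visible in clustering results.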
Future research will solve the problems mentioned above. The first is to expand the source of data. In addition to patent data, Ribeiro-Navarrete et al. [
The patent analysis method proposed in this research is used to explore the emerging technologies and trends of natural language-enabled chatbot, which can reach high consistency with the hints given in academic research. The methodology of this research is not restricted by a specific domain, so the authors hope that this methodology can be used as a reference for researchers to explore more emerging technologies and trends in other fields, so as to demonstrate the contribution of this research.
The authors declare that they have no conflicts of interest.
This research was partially supported by research grant funded by the Ministry of Science and Technology (grant no. MOST-108-2221-E-007-075-MY3). The authors also express their gratitude to Yi-An Su for helping refine the illustrations in the paper.