PQPS: Prior-Art Query-Based Patent Summarizer Using RBM and Bi-LSTM

ive summarization system (QEASS),” in Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, pp. 301–305, Kolkata, India, January 2019.
[41] R. Nallapati, F. Zhai, and B. Zhou, “Summarunner: a recurrent neural network based sequence model for extractive summarization of documents,” in Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, February 2017.
[42] Y. Zhang, M. J. Er, R. Zhao, and M. Pratama, “Multiview convolutional neural networks for multidocument extractive summarization,” IEEE Transactions on Cybernetics, vol. 47, no. 10, pp. 3230–3242, 2016.
[43] J. Cheng and M. Lapata, “Neural summarization by extracting sentences and words,” 2016, https://arxiv.org/abs/1603.07252.
[44] N. Ramesh, B. Zhou, C. Gulcehre, and B. Xiang, “Abstractive text summarization using sequence-to-sequence RNNS and beyond,” 2016, https://arxiv.org/abs/1602.06023.
[45] K. Cho, B. Van Merriënboer, C. Gulcehre et al., “Learning phrase representations using RNN encoder-decoder for statistical machine translation,” 2014, https://arxiv.org/abs/1406.1078.
[46] I. Sutskever, O. Vinyals, and Q. V. Le, “Sequence to sequence learning with neural networks,” in Advances in Neural Information Processing Systems, pp. 3104–3112, MIT Press, Cambridge, MA, USA, 2014.
[47] A. M. Rush, S. Chopra, and J. Weston, “A neural attention model for abstractive sentence summarization,” 2015, https://arxiv.org/abs/1509.00685.
[48] S. Chopra, M. Auli, and A. M. Rush, “Abstractive sentence summarization with attentive recurrent neural networks,” in Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 93–98, San Diego, CA, USA, June 2016.
[49] A. J. C. Trappey, C. V. Trappey, and C.-Y. Wu, “Automatic patent document summarization for collaborative knowledge systems and services,” Journal of Systems Science and Systems Engineering, vol. 18, no. 1, pp. 71–94, 2009.
[50] A. J. C. Trappey and C. V. Trappey, “An R&D knowledge management method for patent document summarization,” Industrial Management & Data Systems, vol. 108, no. 2, pp. 245–257, 2008.
[51] N. Bouayad-Agha, G. Casamayor, G. Ferraro, S. Mille, V. Vidal, and L. Wanner, “Improving the comprehension of legal documentation: the case of patent claims,” in Proceedings of the 12th International Conference on Artificial Intelligence and Law, pp. 78–87, Barcelona, Spain, June 2009.
[52] Y.-H. Tseng, C.-J. Lin, and Y.-I. Lin, “Text mining techniques for patent analysis,” Information Processing & Management, vol. 43, no. 5, pp. 1216–1247, 2007.
[53] E. Y. Igde, S. Aydogan, F. E. Boran, and D. Akay, “Linguistic summarization of structured patent data,” International Journal of Computer and Information Engineering, vol. 11, no. 9, pp. 1062–1065, 2017.
[54] N. B. Bynagari, “The difficulty of learning long-term dependencies with gradient flow in recurrent nets,” Engineering International, vol. 8, no. 2, pp. 127–138, 2020.
[55] Y. Bengio, P. Simard, and P. Frasconi, “Learning long-term dependencies with gradient descent is difficult,” IEEE Transactions on Neural Networks, vol. 5, no. 2, pp. 157–166, 1994.
[56] A. Vaswani, N. Shazeer, N. Parmar et al., “Attention is all you need,” in Advances in Neural Information Processing Systems, pp. 5998–6008, MIT Press, Cambridge, MA, USA, 2017.
[57] D. Bahdanau, K. Cho, and Y. Bengio, “Neural machine translation by jointly learning to align and translate,” 2014, https://arxiv.org/abs/1409.0473.
[58] M.-T. Luong, H. Pham, and C. D. Manning, “Effective approaches to attention-based neural machine translation,” 2015, https://arxiv.org/abs/1508.04025.
[59] A. See, P. J. Liu, and C. D. Manning, “Get to the point: summarization with pointer-generator networks,” 2017, https://arxiv.org/abs/1704.04368.
[60] Y. Wu, M. Schuster, Z. Chen et al., “Google’s neural machine translation system: bridging the gap between human and machine translation,” 2016, https://arxiv.org/abs/1609.08144.
[61] T. L. Griffiths and M. Steyvers, “Finding scientific topics,” Proceedings of the National Academy of Sciences, vol. 101, no. 1, pp. 5228–5235, 2004.
[62] W. Zhang, R. A. J. Clark, Y. Wang, and W. Li, “Unsupervised language identification based on latent Dirichlet allocation,” Computer Speech & Language, vol. 39, pp. 47–66, 2016.
[63] J. Chang, J. Boyd-Graber, S. Gerrish, C. Wang, and D. M. Blei, “Reading tea leaves: how humans interpret topic models,” in Proceedings of the Advances in Neural Information Processing Systems, Vancouver, Canada, 2009.
[64] C. Sievert and K. Shirley, “LDAvis: a method for visualizing and interpreting topics,” in Proceedings of the Workshop on
Mobile Information Systems


Introduction
The importance of innovative technology development has been well established in many industrial sectors. Enterprises assess their inventions in terms of intellectual property rights (IPRs), primarily through their patents. With the rapid advancement of various technologies worldwide, patent search and analysis have become an essential task for both the government and the private sector [1]. Enterprises use these legal and technical documents (patents) to track state-of-the-art technology, reveal business trends, and inspire novel solutions [2][3][4][5]. Patent rights last for around 20 years and give the inventor the right to use the invention commercially. The patentable subject matter and patentability restrictions differ by region. An enterprise's patent attorneys, inventors, and researchers devote a significant amount of time and resources to finding the right patents in order to discover new technological developments and focus their research in that direction [6]. They also perform this prior-art search to ensure that the current innovation does not infringe on established technology and intellectual property. The search is usually performed to ensure the invention's originality. Prior art is publicly available evidence that the invention already exists.
This search technique is effective in assessing the invention's novelty and non-obviousness, identifying potentially related and competing art, and finally determining the patented invention's strength and scope. The majority of traditional prior-art search techniques are keyword-based. Patent examiners or patent analysts usually frame the patent search query from the patent application document by considering term frequency. The priority date and classification codes are typically included in this frequency-based keyword search technique. Because of the ambiguous and nonstandard language of patents, the documents found by a keyword-based prior-art search are insufficient to invalidate the claims. The formulated query is therefore expanded with terms or phrases from external resources such as International Patent Classification (IPC) code definitions [7], a thesaurus [8], or a knowledge base [9] to boost the retrieval rate and to cope with the term mismatch problem.
Patent citations, in addition to patent textual fields and classification codes, have been shown to improve retrieval rates [10]. They represent the relationships among patents. Citation links assist in the discovery of more critical and valuable documents by granting authority to a cited or citing text. Approaches based on citations [11] include bibliographic coupling (BC), co-citation, and direct citation. In co-citation, two documents are related if they are cited together by one or more documents, while in BC, a document pair is related if both documents cite one or more of the same documents. The more citations a bibliographically coupled pair shares, the stronger its bibliographic coupling. BC is retrospective, while co-citation is forward looking. This paper makes use of BC to enhance the retrieved prior-art patent search set, which consists of thousands of documents. The result set has many irrelevant documents. Searching through the entire set to find the relevant ones is tedious and time consuming. Ranking the documents by relevance, incorporating the patent characteristics, is therefore more practical and improves precision.
Furthermore, since the lexicon varies widely across these documents, manually reading each patent document in the retrieved prior-art search set and identifying its prominent information is difficult. The development of text summarizers for technical and legal documents has been prioritized to address this problem. Summarization aims to create succinct and insightful summaries of retrieved patent document collections while preserving the documents' meaning. Automatically producing summaries from large text corpora has long interested researchers in information retrieval and natural language processing. These summaries provide a gist (condensed version) of the text that emphasizes only the most relevant points [12]. Automatic summarization is classified as extractive or abstractive depending on how the summary is produced. In extractive summarization, the most important sentences or paragraphs are selected and assembled to form a summary. In contrast, abstractive summarization generates new, meaningful sentences. The proposed PQPS focuses on generating effective summaries of prior-art search results. The prior-art patent documents are retrieved with a search query obtained by expanding the initial query with information from the knowledge base. The prior-art set obtained using this method lacks some relevant documents and may include irrelevant documents. Topic modeling approaches and citation analysis are used to further enhance the prior-art result set. Extractive and abstractive summaries are generated from the resultant set. The PQPS combines extractive and abstractive techniques because patents are lengthy, which makes it challenging to obtain the gist while retaining the essential information. The main contributions of this paper are as follows:
(1) Filtering the patent result set of the base query processor through Latent Dirichlet Allocation (LDA), as it contains many irrelevant documents.
(2) Enhancing the filtered patent set through bibliographic coupling.
(3) Ranking the retrieved prior-art search patent set based on structural similarity.
(4) Generating extractive summaries with a stacked RBM.
(5) Employing the Seq2Seq model with pretrained embeddings and attention for generating abstractive summaries.
The rest of the paper is laid out as follows. Section 2 reviews existing work on search query formulation, patent citation analysis, and patent summarization. Section 3 outlines the background of the models and techniques employed for text summarization. The detailed flow of the proposed system is presented in Section 4. Sections 5-7 discuss the methodology of the proposed system in detail. The experimental results obtained as part of this work are detailed in Section 8. Finally, Section 9 concludes this paper and discusses future work.

Related Work
This section reviews the challenges associated with prior-art search along three dimensions. First, we focus on query formulation and expansion techniques for prior-art search. Second, we consider the methods that improve the retrieval rate through citations. Finally, we present the techniques for summarizing patent documents.

Prior-Art Search Query Processing.
Prior-art search query formulation and expansion are the foci of research on enhancing prior-art search and retrieval. Most previous search queries relied on patent terms from various textual fields [13][14][15][16][17][18]. Because patentees use abstract or generic terms to maximize their scope of protection, this keyword-based query formulation technique falls short, and the vocabulary mismatch problem persists. The method frequently necessitates additional research on the patent application domain. To address this issue, authors have expanded queries with external resources such as thesauri and domain-independent knowledge bases (WordNet, Wikipedia, and Wiktionary) [9,19,20] and domain-dependent knowledge bases (IPC and domain ontology) [7,9,21]. Expansion with domain-independent knowledge bases improves precision, but recall drops due to the lack of contextual information. IPC definitions were also utilized to expand queries [7]. Although this enhances recall in the chemical area, the results were not consistent across topics.
The proposed system creates a domain ontology and expands the query with terms and phrases from it to address the vocabulary mismatch problem.

Patent Retrieval through Citations.
Patent citations are essential for establishing relationships between patents and demonstrating technical developments and evolutions [22]. In that work, the authors employed in-text citations from both patent and non-patent literature, together with additional metadata, as a source for key extraction. Mahdabi and Crestani used a similar approach, expanding the prior-art search query with the term distribution of publications from the citation network [23]. Fuji et al., on the other hand, used citation connections to re-rank patent publications [24]. The authors used textual data and citation linkages to score and rank the patents.
The proposed PQPS differs from prior systems in that it uses a bibliographic coupling network of patent citations to find missing relevant patents.

Patent Summarization.
Advancements in machine learning and artificial intelligence have simplified many tasks. One of the major tasks made easier for humans through these techniques is automatic text summarization. Several approaches for text summarization have been developed to date. These summarization systems need to produce a concise summary while representing the information presented in the source document. Based on the way the summaries are generated, summarization techniques fall into two categories: extractive and abstractive. Extractive summarization techniques [25] select sentences from the source document, while abstractive techniques [26] generate a summary resembling a human-crafted one by considering the whole document. The most common techniques used for extractive summary generation are statistics-based, topic-based, discourse-based, and graph-based methods. Statistical techniques use statistical features [27,28] such as sentence location [29,30], sentence centrality, word or proper noun frequency [31], title similarity, and sentence bushy direction. Individual sentence scores are computed based on assigned feature weights, and sentences with high scores are more likely to be included in the generated summary. Topic-based methods, on the other hand, identify the terms that characterize the document's topic and use signatures or templates to score the sentences. In graph-based approaches [32,33], sentences are represented as nodes, and an edge is formed between two nodes if there is a relationship between them. Many machine learning techniques have been used for summarization, including latent semantic analysis (LSA), Bayesian models [34], topic models, and hidden Markov models (HMMs) [35]. External knowledge bases, such as Wikipedia [36] and ontologies [37,38], have also been used for text summarization to identify meaningful sentences by mapping them to concepts in the ontology.
Recently, text summarization has advanced rapidly with deep learning techniques such as RBM [39,40], recurrent neural networks (RNNs) [41], and convolutional neural networks (CNNs) [39,42]. Some researchers viewed text summarization as a sequence labeling task [43] and generated the summary accordingly.
SummaRuNNer [41], proposed by Nallapati et al., treats summarization as a sequence labeling task: it estimates the probability that each sentence should be included in the summary and adds sentences until the summary length is reached. The abstractive summarization task has recently received ample attention because of its ability to generate fluent, human-like summaries [44]. This task has mostly been performed with the many-to-many Seq2Seq model, which was first introduced by Cho et al. [45] and Sutskever et al. [46]. Rush et al. [47] proposed an abstractive sentence summarization model comprising a local attention-based encoder and a neural network language model decoder. Chopra et al. [48] proposed a conditional RNN decoder with a convolutional attention-based encoder for sentence summarization.
This model outperformed other state-of-the-art models on the Gigaword corpus. These summarization models mainly focused on news articles or the CNN/Daily Mail datasets.
Even though text summarization has garnered much attention in recent years, the summaries generated for patent documents are far from human-derived summaries, and only a few research studies [49][50][51][52][53] address the problem of patent text summarization. These works either rely on metrics to retrieve the sentences or paragraphs to be included in the summary using an ontology [49] or focus on the patent document's claim section [51]. They have used metrics of discourse summarization.
These methods are insufficient because patents contain many recurring abstract terms such as "apparatus," "methods," "means," and "device." Additionally, focusing just on the claim section results in the inclusion of embodiments of the invention in the generated summary. The proposed PQPS system is novel because it combines extractive and abstractive techniques for generating patent summaries through deep learning, namely RBM and Bi-LSTM, respectively.

Encoder-Decoder Architecture
This section provides an overview of the deep learning models used for abstractive summarization, such as RNN, LSTM, and GRU. The encoder-decoder architecture is based on the sequence-to-sequence model [46]. Text summarization is a many-to-many sequence problem, where the input sequence (paragraph or document) is mapped to another sequence (summary) of a different length. The encoder and decoder are the two primary components of this approach. They are stacks of recurrent neural network units. The encoder reads the entire input sequence and generates a context vector as an internal representation. At each timestep, the decoder reads the context vector and generates the next part of the output summary. In the following subsections, we look at how different deep learning models can be integrated with this framework to generate abstractive summaries.

Recurrent Neural Network.
The input text is processed in sequential order by the RNN through feedback loops. These loops distribute data among the various nodes and make predictions based on the gathered information. Thus, the RNN preserves the order of the input words in the sequence. Whenever a new input is received, a prediction is made by considering the output of the preceding states. During training, the RNN computes gradients at each timestep using the backpropagation through time (BPTT) algorithm. This network performs well with shorter sequences. With a lengthy input sequence, it suffers from the vanishing gradient problem [54,55] during backpropagation: the gradient becomes smaller and smaller until the weight updates become insignificant. Another major issue with longer sequences is the cost of training and evaluation due to computation and memory constraints [56].
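The vanishing gradient effect can be illustrated with a minimal sketch. It assumes a scalar RNN of the form h_t = tanh(w * h_{t-1} + x_t), which is a simplification chosen for illustration and not the network used in this paper; the function name `gradient_through_time` is hypothetical. The gradient of the current state with respect to the initial state is a product of per-step factors, each bounded by |w|, so it shrinks geometrically when |w| < 1.

```python
import math


def gradient_through_time(w, inputs, h0=0.0):
    """Track d h_t / d h_0 for a scalar RNN h_t = tanh(w * h_{t-1} + x_t).

    Assumption: a one-dimensional toy RNN, used only to show how the
    chain-rule product behaves as the sequence grows.
    """
    h = h0
    grad = 1.0
    grads = []
    for x in inputs:
        h = math.tanh(w * h + x)
        # Chain rule: d h_t / d h_{t-1} = w * (1 - tanh^2(.)) = w * (1 - h_t^2)
        grad *= w * (1.0 - h * h)
        grads.append(grad)
    return grads


grads = gradient_through_time(w=0.5, inputs=[0.1] * 50)
print(abs(grads[0]), abs(grads[-1]))  # the last value is many orders of magnitude smaller
```

With a recurrent weight of 0.5, every factor has magnitude at most 0.5, so after 50 steps the gradient magnitude is below 0.5^50.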

Gated RNN.
Long short-term memory (LSTM) and gated recurrent unit (GRU) networks handle the problem of vanishing gradients using their gates, which control the information passing between the hidden states. Both networks are RNN variants; the LSTM maintains independent hidden and cell states. Figure 1 depicts the differences between the two networks, RNN and LSTM. The LSTM has three gates, as shown in the diagram: forget, input, and output gates. The forget gate (equation (1)) is a single-layered architecture with sigmoid activation. This activation function assists in determining whether to preserve the information or discard it.
With the information available, the input gate attempts to learn new information (equations (2) and (3)) and quantifies the significance of the information carried (equation (4)). Based on this significance, the information is stored in the cell state.
The information is passed from the current timestep to the next through the output gate, as given by equation (5). As stated in these equations, the value of the hidden state is determined by passing through the sigmoid and tanh functions. This hidden state h_t (equation (6)) is used for prediction. GRU is quite similar to LSTM; however, it lacks a memory unit. It is also less complex, with only two gates, namely, the reset and update gates.
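The gate computations described above can be sketched as a single LSTM timestep. This sketch follows the widely used standard LSTM formulation; its notation and ordering are assumptions and may not match equations (1)-(6) of this paper term by term. The helper names `affine` and `lstm_step` and the `params` layout are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, U, b, x, h):
    """Return W @ x + U @ h + b using plain Python lists."""
    return [
        sum(w * xi for w, xi in zip(w_row, x))
        + sum(u * hi for u, hi in zip(u_row, h))
        + bias
        for w_row, u_row, bias in zip(W, U, b)
    ]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM timestep (standard formulation, assumed for illustration).

    params maps each of "f", "i", "g", "o" to a tuple (W, U, b).
    """
    f = [sigmoid(z) for z in affine(*params["f"], x, h_prev)]    # forget gate
    i = [sigmoid(z) for z in affine(*params["i"], x, h_prev)]    # input gate
    g = [math.tanh(z) for z in affine(*params["g"], x, h_prev)]  # candidate information
    o = [sigmoid(z) for z in affine(*params["o"], x, h_prev)]    # output gate
    # The cell state keeps part of the old memory and adds gated new information.
    c = [ft * cp + it * gt for ft, cp, it, gt in zip(f, c_prev, i, g)]
    # The hidden state is the output gate applied to the squashed cell state.
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c


# Usage: one input feature, one hidden unit, all weights and biases zero.
zero = ([[0.0]], [[0.0]], [0.0])
params = {name: zero for name in ("f", "i", "g", "o")}
h, c = lstm_step([1.0], [0.0], [2.0], params)
```

With zero weights every gate outputs 0.5 and the candidate is 0, so the new cell state is half of the previous one, which shows the forget gate scaling the stored memory.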

Bidirectional RNN.
During prediction, a unidirectional RNN considers only the preceding part of the sequence, which can introduce noise. As a result, future predictions suffer, lowering the quality of the summary. To address this issue, a bidirectional RNN processes the input sequence in both forward and backward directions, i.e., the input sequence is fed in normal time order to one network and in reversed order to another network. At each timestep, the outputs of the two networks are concatenated and transmitted to the next level. Thus, the network carries information about both the preceding and the following parts of the sequence when constructing a summary, which enhances the quality of the summary generated.
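The forward and backward passes and their per-timestep concatenation can be sketched as follows. The recurrent cell is passed in as a function, so the sketch makes no assumption about whether it is an RNN, LSTM, or GRU cell; the names `run_rnn` and `bidirectional` are hypothetical, and states are represented as plain Python lists.

```python
def run_rnn(step, inputs, h0):
    """Apply a recurrent step function over the inputs and collect every state."""
    states = []
    h = h0
    for x in inputs:
        h = step(x, h)
        states.append(h)
    return states


def bidirectional(step_fwd, step_bwd, inputs, h0):
    """Concatenate forward and backward states at each timestep."""
    forward = run_rnn(step_fwd, inputs, h0)
    # Feed the reversed sequence, then re-align the states with the original order.
    backward = run_rnn(step_bwd, inputs[::-1], h0)[::-1]
    return [f + b for f, b in zip(forward, backward)]  # list concatenation


# Usage with a toy "cell" that keeps a running sum of its inputs.
running_sum = lambda x, h: [h[0] + x]
print(bidirectional(running_sum, running_sum, [1, 2, 3], [0]))
# [[1, 6], [3, 5], [6, 3]]
```

In the toy output, the first component of each state summarizes everything up to that position, and the second component summarizes everything from that position onward.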

Network with Attention.
The performance of Seq2Seq models can be improved with a better network structure. In the basic encoder-decoder network, a single context vector is passed as input from the encoder to the decoder. However, if the input sequence is lengthy, this single vector will not capture its complete essence. As a result, multiple context vectors are derived, each focusing on certain parts of the input sequence [57]. Luong et al. [58] distinguished between local attention and global attention. Local attention considers only a few hidden states of the encoder when determining the attended context vector, whereas global attention considers all hidden states.
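A minimal sketch of global attention is given below. It assumes dot-product scoring between the decoder state and each encoder hidden state, which is only one of several possible scoring functions and is not necessarily the one used in PQPS; the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def global_attention(decoder_state, encoder_states):
    """Return attention weights and the attended context vector.

    Assumption: dot-product scoring over ALL encoder hidden states
    (global attention); local attention would restrict the states considered.
    """
    scores = [
        sum(d * e for d, e in zip(decoder_state, h_enc))
        for h_enc in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # The context vector is the attention-weighted sum of the encoder states.
    context = [
        sum(w * h_enc[k] for w, h_enc in zip(weights, encoder_states))
        for k in range(dim)
    ]
    return weights, context


weights, context = global_attention([1.0, 0.0], [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
```

In this usage example the first encoder state aligns best with the decoder state, so it receives the largest weight and dominates the context vector.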

Beam Search.
Beam search techniques are frequently employed in conjunction with the decoder in tasks such as multiple language generation, text summarization, and machine translation [59,60]. Decoding the sequences entails searching over all the potential sequences and ranking them based on their likelihood. Because the vocabulary in these tasks often consists of thousands or even millions of words, this search becomes intractable (NP-complete). Heuristic approaches are therefore used; they return one or more approximate output sequences, which may or may not be sufficient as the size of the input rises. These algorithms decode sequences using probabilities together with greedy or beam search. In greedy search, the single best candidate is chosen at each timestep based on likelihood. However, keeping only one top candidate may result in a suboptimal solution. In contrast, beam search keeps several candidates at each timestep.
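The difference between greedy and beam search can be sketched with a toy next-token model. The model `toy_model`, its probabilities, and the function name `beam_search` are hypothetical and chosen only so that the greedy choice at the first step leads to a less likely overall sequence; greedy search corresponds to a beam width of 1.

```python
import math


def beam_search(next_token_probs, beam_width, length):
    """Keep the beam_width most likely prefixes at every timestep.

    next_token_probs(prefix) returns a dict mapping token -> probability.
    Scores are sums of log-probabilities to avoid numerical underflow.
    """
    beams = [((), 0.0)]
    for _ in range(length):
        candidates = []
        for prefix, log_prob in beams:
            for token, p in next_token_probs(prefix).items():
                if p > 0:
                    candidates.append((prefix + (token,), log_prob + math.log(p)))
        candidates.sort(key=lambda cand: cand[1], reverse=True)
        beams = candidates[:beam_width]
    return beams


def toy_model(prefix):
    """Hypothetical two-step language model used only for illustration."""
    table = {
        (): {"a": 0.6, "b": 0.4},
        ("a",): {"x": 0.55, "y": 0.45},
        ("b",): {"x": 0.9, "y": 0.1},
    }
    return table[prefix]


greedy_best = beam_search(toy_model, beam_width=1, length=2)[0]
beam_best = beam_search(toy_model, beam_width=2, length=2)[0]
print(greedy_best[0], math.exp(greedy_best[1]))
print(beam_best[0], math.exp(beam_best[1]))
```

Greedy search commits to "a" at the first step and ends with the sequence ("a", "x") at probability 0.33, whereas a beam of width 2 also keeps "b" and finds ("b", "x") at probability 0.36.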

PQPS: Prior-Art Query-Based Patent Summarizer
The functioning of the proposed PQPS is described in Figure 2. The query processor retrieves the initial set of patents based on the query built with knowledge bases (domain ontology and WordNet) and the novel patent application document.
Though this retrieved set contains multiple relevant documents, it can also include irrelevant documents and miss some relevant documents due to information overload.
The PQPS system filters the irrelevant documents using LDA and uses a bibliographic coupling network on citations to improve the retrieval efficiency. The resultant document set is then ranked using structural similarity metrics. The PQPS then addresses the high workload of the patent analyst by summarizing the ranked patents in both an extractive and an abstractive manner using deep learning techniques. A detailed explanation of each module is given in the following sections.

Query Processor
The query processor builds an initial query from the patent application document issued by the patent analyst. The initial query is built by extracting noun phrases from different textual fields: title, abstract, technical field, and description. The candidate noun phrases are selected for the initial query based on term frequency-inverse field frequency (TF-IFF) scoring. Patent documents are lengthy and verbose, and each patent has its own lexicon; therefore, vocabulary mismatch arises. To rectify this mismatch, the PQPS document retrieval system uses knowledge bases such as a domain ontology and WordNet to enrich the initial query with semantically related concepts and terms. The domain ontology-based query expansion uses a smart device domain ontology to expand the domain-related concepts, while the WordNet-based query expansion relies on WordNet, a lexical database for English. The document retrieval system's Google patent search employs the Google prior-art search API to retrieve patents from the initial query. All the documents obtained by these three systems are passed to the citation analyzer module for further processing. More details about this query processor are given in our previous work [9].

Citation Analyzer
Patent analysis conducted on the output of the query processor module reveals that irrelevant documents were retrieved in addition to relevant documents. Some relevant documents were also missed by the retrieval due to the prevailing vocabulary mismatch problem.
The citation analyzer module adopts a filtering mechanism based on LDA and bibliographic coupling to reduce the retrieval of irrelevant patents and to further enhance the retrieval of relevant documents.

Topic Filterer.
The topic filterer finds the abstract topics using LDA, an unsupervised topic model. The central intuition behind using LDA for document filtering is that it characterizes each document by its words, and related documents are then clustered to form a topic. It is based on the assumption that each document in the collection is a mixture of topics; therefore, the document is assigned to the topic with the highest strength. The filterer analyzes the title, abstract, and description of the relevant patent set. The fields are preprocessed, and LDA with collapsed Gibbs sampling [61] is employed. This is a Markov chain Monte Carlo approach in which the model parameters are drawn from the posterior distribution at each iteration.

Identification of Number of Topics.
The number of topics is usually determined based on the statistical measure perplexity [62], which indicates the predictive quality of the model. A low perplexity value indicates better performance. However, according to Chang et al. [63], perplexity is not correlated with human judgments. Therefore, the PQPS adopts a trial-and-error approach that tries different values for the number of topics and compares them by coherence value. The topics obtained, along with their main keywords and manually generated category names, are detailed in Table 1 for a sample patent application entitled "Bluetooth beacon attendance system based on smartphone and application method." For the patents retrieved for this sample, the trial-and-error approach set the number of topics to 45.

Relevance with Novel Patent Application.
The filterer uses the topic probability distribution of each document to filter out the irrelevant retrieved patent documents. LDAvis [64], an interactive tool, is used to interpret and visualize these distributions, as shown in Figure 3. Here, the topics are represented as circles whose centres are determined by computing the distance between the topics. The size of a circle depicts the topic's prevalence in the corpus.
The intertopic distance is computed using the Jensen-Shannon divergence, a symmetric measure of the dissimilarity between two probability distributions. Based on the intertopic distance, the clusters closest to the topic clusters of the sample patent application are chosen as relevant documents, and the remaining clusters are filtered out. Figure 3 represents the intertopic distance using the LDAvis tool for the sample patent application titled "Bluetooth beacon attendance system based on smartphone and application method." This figure focuses on the 44th topic and its closeness to other related topics. As closeness represents similarity, only closely linked topics are taken into account for further processing. In this case, only the documents that belong to the topics highlighted by a red box are chosen as relevant document clusters.
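The Jensen-Shannon divergence between two topic-word distributions can be sketched as follows. The base-2 logarithm is an assumption made here so that the value lies between 0 and 1; the function names are hypothetical, and the example distributions are made up for illustration.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, and bounded by 1 with log base 2."""
    # The mixture m is positive wherever p or q is positive, so no division by zero occurs.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


topic_a = [0.7, 0.2, 0.1]
topic_b = [0.6, 0.3, 0.1]
topic_c = [0.1, 0.1, 0.8]
print(js_divergence(topic_a, topic_b) < js_divergence(topic_a, topic_c))  # True
```

A smaller divergence means two topics are closer, so `topic_b` would be kept as a neighbour of `topic_a` before `topic_c`.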

Bibliographic Coupling-Based Patent Retriever.
After filtering, the citations for each relevant patent are obtained through Open Patent Services (OPS), a European Patent Office (EPO) web service.
This service provides access to the EPO's raw data through an XML interface.
All the citation links for each patent in the filtered set are extracted through this web service and stored in a database. With the data available, we build a citation graph in which each patent document acts as a vertex and there is a directed edge between two vertices if one patent document cites or is cited by the other. Bibliographic coupling helps to retrieve relevant documents that have not been previously retrieved because of information overload. BC groups the patent documents in this citation graph that refer to the same set of cited patent documents. The fundamental idea is that if a document d_1 is cited by another document d_2, then d_1 is in some way related and essential to d_2. This relatedness helps to identify missing relevant documents for the patent application document. The BC strength represents the number of common citations. This BC strength is computed for each pair of patent document and application document, and patents with a BC strength greater than a threshold are identified as missing relevant patents and added to the newly retrieved set.
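The BC strength computation and thresholding can be sketched as follows. The patent identifiers, the threshold value, and the function names are hypothetical; the citation lists are assumed to have already been fetched and stored.

```python
def bc_strength(refs_a, refs_b):
    """Bibliographic coupling strength: the number of cited documents in common."""
    return len(set(refs_a) & set(refs_b))


def find_missing_relevant(application_refs, candidate_refs, threshold):
    """Return candidates whose BC strength with the application exceeds the threshold.

    candidate_refs maps a patent id to the list of documents it cites.
    """
    return [
        patent_id
        for patent_id, refs in candidate_refs.items()
        if bc_strength(application_refs, refs) > threshold
    ]


# Hypothetical citation data for illustration only.
application_refs = ["P1", "P2", "P3", "P4"]
candidate_refs = {
    "C1": ["P1", "P2", "P3"],  # shares three citations
    "C2": ["P4", "P9"],        # shares one citation
    "C3": ["P7", "P8"],        # shares none
}
print(find_missing_relevant(application_refs, candidate_refs, threshold=1))  # ['C1']
```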
Since a patent encompasses numerous subject areas, it may cite another document for any of these topics or subject areas. Not all of these references and topics are relevant to the patent application document, so the newly retrieved set may contain a few irrelevant documents. Thus, the newly retrieved patent set is filtered based on its cosine similarity with the patent application document and a threshold value.
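The cosine similarity filter can be sketched as follows. Representing each document as a raw term-frequency vector over whitespace-separated tokens is an assumption made for illustration, since the document representation is not specified here; the function names, sample texts, and threshold are likewise hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between term-frequency vectors of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_by_similarity(application_text, candidates, threshold):
    """Keep only candidates whose similarity to the application reaches the threshold."""
    return [
        patent_id
        for patent_id, text in candidates.items()
        if cosine_similarity(application_text, text) >= threshold
    ]


application = "bluetooth beacon attendance system for smartphone"
candidates = {
    "C1": "attendance system using bluetooth beacon and smartphone",
    "C2": "hydraulic pump with pressure valve",
}
print(filter_by_similarity(application, candidates, threshold=0.3))  # ['C1']
```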

Structural Relevance-Based Patent Ranker.
The documents are ranked based on their relevance to the search query terms. In a prior-art search, the entire patent application is compressed into a query and the patent documents are verbose, so relevance metrics alone are not sufficient to order the patents. Structural similarity, an inherent feature of patents, is therefore incorporated into the relevance evaluation. Our analysis of the importance of different textual fields (title, abstract, background, and description) in our previous work [9] found that different fields have varying influences. The terms from the description field have more similarities than those from the abstract and title fields. This phenomenon occurs because the description field contains technical terminology. Consequently, the similarity of each field with the source document is given a different weight. The relevance estimator assigns the field weights in the following order: w_title < w_abs < w_desc. Here, w_title denotes the weight of terms from the title field, w_abs denotes the weight of the abstract field, and w_desc denotes the weight of words from the description section. The structural relevance score calculated with these textual fields is given by the following equation: Here, SR(Q_m, d_i) is the structural relevance score, and sim_title(Q_m, d_i) is the similarity between the semantically enriched query, Q_m, and the document title. Similarly, sim_abs(Q_m, d_i) and sim_desc(Q_m, d_i) represent the similarity between the semantically enriched query Q_m and the document abstract and description, respectively. l(title_d_i), l(abs_d_i), and l(desc_d_i) represent the length of the document title, abstract, and description, respectively.

Patent Summarizer
The patent summarizer creates a summary through a unified model that combines state-of-the-art extractive and abstractive approaches. It comprises two neural network modules: a summary extractor and an abstractive summary generator. The summary extractor encodes each document, extracts sentences from it, and clusters the individual summaries; the abstractive summary generator then paraphrases each summary cluster.

RBM-Based Extractive Patent Summarizer.
The RBM-based extractive patent summarizer (RBM-EPS) takes as input a document set D with multiple related patent documents. RBM-EPS encompasses three submodules, each of which is described in detail below.

Patent Feature Extractor.
The first step towards extractive summarization is identifying prominent features for sentence selection. The patent feature extractor relies on hand-crafted features that correspond to the syntactic and semantic information of patent document sentences. Many of these features are widely used by summarizers for sentence selection [38, 40, 65–67], and their measures are normalized to the range of 0 to 1 for practical usage.
The features that are extracted in this module are detailed in Tables 2 and 3.
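As an illustration of how such hand-crafted features can be computed, the sketch below implements two of them: the co-occurrence-based title and search query feature ($FE_1$) and the cue phrase feature ($FE_{13}$). It assumes whitespace tokenization, natural logarithms, and set sizes for $|T|$ and $|SQ|$. The function names and the short cue phrase list are illustrative, and the final normalization to the range 0 to 1 is not shown.

```python
import math

# A short, illustrative cue phrase list; the full list is not given in the text.
CUE_PHRASES = ("in particular", "in summary", "as a result", "as a consequence")

def title_query_overlap(title, query, sentence):
    """FE_1: word overlap of a sentence with the title and the search query."""
    t_words = set(title.lower().split())
    sq_words = set(query.lower().split())
    s_words = set(sentence.lower().split())
    denominator = math.log(len(t_words)) + math.log(len(sq_words))
    if denominator <= 0:
        return 0.0
    return (len(t_words & s_words) + len(sq_words & s_words)) / denominator

def cue_phrase_scores(sentences):
    """FE_13: cue phrases in each sentence divided by cue phrases in the document."""
    counts = [sum(s.lower().count(cp) for cp in CUE_PHRASES) for s in sentences]
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]

fe1 = title_query_overlap(
    "bluetooth beacon attendance system",
    "smartphone attendance beacon",
    "the beacon records attendance on the smartphone",
)
fe13 = cue_phrase_scores([
    "In summary, the beacon is encrypted.",
    "The phone scans it.",
    "As a result, attendance is stored; in particular, in real time.",
])
```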

Stacked RBM.
This system makes use of a restricted Boltzmann machine (RBM), a non-deterministic generative model, to extract the salient sentences. An RBM is a two-layer network with an input layer of visible nodes ($m$ nodes) and an output layer of hidden nodes ($n$ nodes). The two layers of a single RBM unit form a complete bipartite graph, as seen in the workflow of PQPS (Figure 2). Connections exist only between the nodes of the two layers and not among the nodes within a layer: the $i$th input node ($I_i$) is connected to the $h$th hidden node ($H_h$) by a weight ($w_{ih}$). Furthermore, all the nodes (visible and hidden) have a constant bias, represented as $a_i$ and $b_h$ for the visible and hidden layers, respectively. This system stacks RBMs to create a deep architecture. The first unit is a Gaussian-Bernoulli RBM [67], and the second is a Bernoulli-Bernoulli RBM.
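To make the bipartite structure concrete, the following sketch computes the activation probability of each hidden node from a visible feature vector, using the standard rule for Bernoulli hidden units, $P(H_h = 1 \mid I) = \sigma(b_h + \sum_i I_i w_{ih})$. The sizes (14 visible and 28 hidden nodes) follow the configuration reported in the evaluation; the random initialization and the function name are assumptions for illustration, and training is not shown.

```python
import math
import random

def hidden_probabilities(visible, weights, hidden_bias):
    """P(H_h = 1 | I) = sigmoid(b_h + sum_i I_i * w_ih) for every hidden node h."""
    probs = []
    for h, b_h in enumerate(hidden_bias):
        activation = b_h + sum(v_i * weights[i][h] for i, v_i in enumerate(visible))
        probs.append(1.0 / (1.0 + math.exp(-activation)))
    return probs

random.seed(0)
m, n = 14, 28  # visible nodes (one per feature) and hidden nodes
# weights[i][h] is w_ih; there are no visible-visible or hidden-hidden links.
weights = [[random.gauss(0.0, 0.01) for _ in range(n)] for _ in range(m)]
hidden_bias = [0.0] * n
sentence_features = [random.random() for _ in range(m)]  # features scaled to [0, 1]
probs = hidden_probabilities(sentence_features, weights, hidden_bias)
```

Because no hidden node depends on another hidden node, all hidden probabilities can be computed independently given the visible layer, which is what makes RBM inference tractable.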

Summary Aggregator.
The summarized documents are classified into three groups, strongly related, moderately related, and weakly related, based on their Word Mover's Distance (WMD) [71] score. WMD (equation (8)) measures the dissimilarity between documents using word embeddings and also takes the bag-of-words representation into account. Here, $d_i$ represents the extractive summary of the $i$th patent and $dis_i$ indicates the WMD score between the search query or the source document and the extractive summary. It uses pretrained word2vec embeddings [27].
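The grouping step can be sketched as a simple binning of precomputed WMD scores. The two cut-off values below are placeholders, since the text does not state the boundaries between the groups, and the function name is hypothetical. A lower distance means a closer match.

```python
def group_by_wmd(distances, strong_max=0.8, medium_max=1.2):
    """Bin extractive summaries by their WMD to the query (cut-offs are assumed)."""
    groups = {"strong": [], "medium": [], "weak": []}
    for patent_id, distance in distances.items():
        if distance <= strong_max:
            groups["strong"].append(patent_id)
        elif distance <= medium_max:
            groups["medium"].append(patent_id)
        else:
            groups["weak"].append(patent_id)
    return groups

# Hypothetical WMD scores dis_i for three extractive summaries d_i.
groups = group_by_wmd({"d1": 0.55, "d2": 1.05, "d3": 1.90})
```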

Bi-LSTM-Based Abstractive Patent Summarizer.
Abstractive patent summary generation uses a sequence-to-sequence (Seq2Seq) network [46], an encoder-decoder architecture. In this many-to-many sequence problem, the encoder parses the input sequence $x = (x_1, x_2, \ldots, x_M)$, creates a hidden sequence $h^e = (h^e_1, h^e_2, \ldots, h^e_T)$, and forwards it to the decoder. The decoder uses this hidden representation as context information and generates the summary sequence $y = (y_1, y_2, \ldots, y_S)$. Here $M$ and $S$ represent the number of encoder tokens (input document length) and the number of decoder tokens (summary length), respectively. For encoding, PQPS uses a Bi-LSTM because it captures context better by preserving information in both directions, backward (past) and forward (future). The structure of the Bi-LSTM-based abstractive patent summarizer is represented in Figure 4. Here, a three-layered bidirectional long short-term memory network (stacked Bi-LSTM) forms the encoder, and a single-layered LSTM is used as the decoder along with an embedding layer. In addition to this basic structure, the model includes an attention mechanism for effective summarization, and each component is explored in detail below. Beam width determines the number of sequences kept in memory at each time step $t$. The target word ($w$) for time step $t$ is predicted based on probability scores. ConceptNet NumberBatch [30] pretrained embeddings are chosen for the embedding layer for two reasons. Firstly, they are built on other pretrained embeddings such as GloVe [72] and word2vec [73], and secondly, they combine embeddings with knowledge bases such as WordNet and DBpedia.
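The role of the beam width during decoding can be sketched with a small, generic beam search. Here `step_fn` stands in for the trained decoder: it is a hypothetical callback that returns next-token probabilities for a partial sequence. The toy probability table is invented for illustration and is unrelated to the trained model.

```python
import math

def beam_search(step_fn, start_token, end_token, beam_width=10, max_len=20):
    """Keep the beam_width most probable partial sequences at each time step."""
    beams = [([start_token], 0.0)]  # (sequence, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq[-1] == end_token:
                candidates.append((seq, score))  # finished sequences carry over
                continue
            for token, prob in step_fn(seq).items():
                candidates.append((seq + [token], score + math.log(prob)))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(seq[-1] == end_token for seq, _ in beams):
            break
    return beams[0][0]

# Toy next-token distribution keyed by the last generated token.
TOY_MODEL = {
    "<s>": {"bluetooth": 0.6, "a": 0.4},
    "bluetooth": {"beacon": 0.9, "</s>": 0.1},
    "a": {"beacon": 1.0},
    "beacon": {"</s>": 1.0},
}

def toy_step(seq):
    return TOY_MODEL[seq[-1]]

best = beam_search(toy_step, "<s>", "</s>", beam_width=2)
```

With a beam width of 1 the procedure reduces to greedy decoding; larger widths trade memory and time for a wider search.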

Table 2: Patent features for RBM-EPS (feature with description).

Title and search query similarity: the patent title usually reflects the innovativeness or the main theme. This similarity feature helps to retain the sentences that are relevant and related to the patent title and the search query provided by the patent analyst. To compute it, both co-occurrence-based [40] and similarity-based features are incorporated:
$FE_1 = \frac{|T \cap sen_i| + |SQ \cap sen_i|}{\log|T| + \log|SQ|}$, $FE_2 = \frac{sim(T, sen_i) + sim(SQ, sen_i)}{\log|T| + \log|SQ|}$.
Here $T$ is the set of words in the title, $SQ$ is the set of words in the search query, $sen_i$ is the set of words in the $i$th sentence, $|T|$ is the title length, $|SQ|$ is the search query length, and $sim(T, sen_i)$ and $sim(SQ, sen_i)$ are computed using cosine similarity.

Sentence field position: the first and last sentences in a paragraph or section provide meaningful and prominent information to the reader [68] and are therefore likely to be part of the summary.

Term frequency-inverse field frequency (TF-IFF): this feature is based on the frequency of a term in the field $f$. Here $n_{t,f}$ is the number of occurrences of term $t$ in the field $f_j$, $\sum_k n_{k,f}$ is the size of the field $f_j$, $|F|$ is the number of fields in the document, and $1 + |\{j : t \in f_j\}|$ represents the field frequency.

Term frequency-inverse concept frequency (TF-ICF): this is similar to TF-IFF, but it measures the importance of noun phrases to the concepts in the smart device ontology hierarchy rather than field-wise. A term in the corpus may not belong to any concept; in that case, its ICF value is set to 1. Here $|C|$ is the total number of concepts in the smart device ontology and $1 + |\{c : t \in C\}|$ represents the concept frequency with respect to the term.

Sentence length: sentences in patents are generally long. This feature is included to discard very short and very long sentences from the summary. $FE_8$ retrieves sentences that are close to the mean sentence length in the document.
$FE_7 = t * t$, $FE_8 = \ln(t - |t - t(sen_i)/\sigma|)$.
Here $t(sen_i)$ refers to the number of terms in sentence $sen_i$, $t$ denotes the mean over all sentences in a document, and $\sigma$ is the standard deviation.

Sentence centrality: this measure retains sentences that are close to each other. Sentence centrality is computed by considering unigrams and bigrams, and the average score is treated as the feature score.
$SC_{ij} = \frac{sim(sen_i, sen_j)}{\max(sim(sen_i, sen_j))}$, where $FE_9$ uses the cosine similarity measure and $FE_{10}$ uses the Jaccard similarity measure.

Thematic words ($FE_{11}$): thematic words are words related to the topic of the document, and their frequency is higher. Sentences with these words are likely to be informative. The top 20 most frequent phrases are chosen as thematic words, and the sentence score is calculated accordingly.

[32] computes the score based on the number of overlapping words between sentences.

Cue phrases: phrases such as "in particular," "in summary," "as a result," "as a consequence," and so on are usually followed by important information, so they are good indicators for estimating sentence salience [69].
$FE_{13} = SCP_i = cp_i / tcp$.
Here, $cp_i$ denotes the number of cue phrases in sentence $i$ and $tcp$ indicates the total number of cue phrases in the document.

Sentence semantic relatedness score (SSR): this score computes the relatedness between the search query and the phrases in a sentence with respect to the smart device ontology using Lin's measure [70]. A higher SSR score denotes that they are more related; if there is no semantic match, SSR is set to 0.
$FE_{14} = SSR_i = \frac{\sum_{p} sim_{lin}(SQ, p_{j,i})}{tp_i}$, $sim_{lin}(c_1, c_2) = \frac{2 \cdot IC(LCS(c_1, c_2))}{IC(c_1) + IC(c_2)}$, $IC(c) = 1 - \frac{\log(hyponyms(c) + 1)}{\log(max\_nodes)}$.
Here $\sum_{p} sim_{lin}(SQ, p_{j,i})$ represents the summation of Lin's similarity between the search query and phrase $j$ in sentence $i$, and $tp_i$ denotes the total number of phrases in sentence $i$. $c_1$ and $c_2$ are two ontologically related concepts, $LCS(c_1, c_2)$ represents their least common subsumer (the most specific common ancestor), $hyponyms(c)$ denotes the number of hyponyms (concepts subsumed by) of a concept $c$, and $max\_nodes$ denotes the maximum number of nodes (concepts) in the ontology.

Attention Mechanism.
In a simple Seq2Seq model, the encoder usually returns a fixed-length context vector, which cannot retain important information when the input sequence is very long, as in the case of a patent document. To deal with this, Bahdanau et al. [57] developed an alignment mechanism that, at each time step, focuses on the crucial parts of the text and generates a context vector ($c_t$). The context vector is obtained by computing the attention distribution ($\alpha^e_{t,M}$) over the entire sequence of tokens, given the encoder hidden state ($h^e_M$) and the decoder state ($h^d_t$) at time step $t$. The alignment scores ($s^e_{t,M} = s(h^e_M, h^d_t)$) are computed using additive attention. Additive attention linearly combines the encoder and decoder hidden states through the weight matrices $v^T_{align}$ and $w_{align}$.
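A minimal sketch of additive attention follows. It assumes the common concatenation form, in which each score is $v^T_{align} \tanh(w_{align}[h^e; h^d_t])$, followed by a softmax over the encoder positions and a weighted sum of the encoder states. The exact scoring equation is not reproduced in the text, so this form, the function name, and the toy dimensions are assumptions.

```python
import math

def additive_attention(encoder_states, decoder_state, w_align, v_align):
    """Return the attention weights and the context vector c_t for one decoder step."""
    scores = []
    for h_e in encoder_states:
        combined = h_e + decoder_state  # list concatenation [h_e; h_d]
        hidden = [math.tanh(sum(w * x for w, x in zip(row, combined)))
                  for row in w_align]
        scores.append(sum(v * z for v, z in zip(v_align, hidden)))
    # Softmax over encoder positions gives the attention distribution.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # The context vector is the attention-weighted sum of encoder states.
    dim = len(encoder_states[0])
    context = [sum(a * h[k] for a, h in zip(alphas, encoder_states))
               for k in range(dim)]
    return alphas, context

# Toy example: three encoder states of size 2 and a decoder state of size 1.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.5]
w_align = [[0.2, -0.1, 0.3], [0.0, 0.4, -0.2]]  # shape (2, 3)
v_align = [1.0, -1.0]
alphas, context = additive_attention(encoder_states, decoder_state, w_align, v_align)
```

The weights always sum to 1, so the context vector stays within the range spanned by the encoder states regardless of the input length.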

Query Processor.
Experiments with the query processor are carried out on the textual fields of smart device patents collected through the Google Patents search engine. The dataset for this experimentation includes 753 smartphone patents, 478 smartwatch patents, and 421 smarthome patents. The query processor experiments analyze the following aspects of PQPS:
(i) Influence of patent textual fields on the prior-art search query: the system examines the patent textual fields (title, abstract, background, technical field, summary, description, and claims) and determines the impact of each on the prior-art search query. The results show that terms from the description field yield better results in the prior-art search query than terms from other fields.
(ii) Document retrieval system: it encompasses a domain ontology-based query expansion system (DOQES), a WordNet-based query expansion system (WQES), and the Google Patent Search System (GPSS) for retrieving patents.
The prior-art search query for DOQES and WQES is built through query expansion of the initial query with the smart device domain ontology and WordNet, respectively. GPSS constructs a prior-art search query automatically. The retrieval efficiency of the three subsystems in terms of mean average precision (MAP) and recall is shown in Table 4. The results show that DOQES performs better than WQES, which in turn performs better than GPSS. This difference in retrieval performance is due to the number and quality of the search terms. A more detailed analysis of these two aspects of the query processor system is presented in [9].

Citation Analyzer.
The evaluation of the PQPS citation analyzer focuses on three aspects of its submodules: topic-based filtering of the retrieved patent document set, identification of missing relevant patent documents through bibliographic coupling (BC), and patent ranking based on structural relevance. Each of these aspects is examined in the following subsections.

Topic-Based Filtering of Retrieved Patent Set.
The topic filterer processes around 1000 patents at a time, as retrieved by the document retrieval system. Only the title and abstract fields are considered for filtering. Even though all these patents were retrieved in response to a specific query, they cover a wide range of topics. This can be observed from the token and vocabulary frequencies shown in Table 5 for the patent sets retrieved for various queries. The number of topics for each retrieved patent set must be selected before generating the topic model used to filter out irrelevant documents. The filterer computes the coherence score to determine the number of topics. It employs a trial-and-error approach to discover the best model by constructing multiple LDA models with the number of topics ranging from 10 to 120. After comparing the coherence scores of these models, the model with the optimum coherence score is picked. The coherence scores of multiple LDA models for the sample prior-art search query over 5 iterations, and their mean scores for different numbers of topics, are shown in Figure 5. Here, the coherence score varies from 0.22 to 0.28, increasing as the number of topics increases. The optimal model is the one with the highest coherence score before a significant drop or flattening. The optimal coherence score is attained when the number of topics is 45. The relevant topics are then chosen based on the intertopic distance: the topic clusters that are close to the principal relevant topic, and their documents, are considered relevant.
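The selection rule ("highest coherence before a significant drop or flattening") can be sketched as follows. The sketch assumes that the coherence scores of the repeated LDA runs are already available and treats a gain below `min_gain` as flattening. The `min_gain` value, the function name, and the sample scores are invented for illustration and are not the measured values behind Figure 5.

```python
from statistics import mean

def pick_num_topics(coherence_runs, min_gain=0.002):
    """Pick the topic count whose mean coherence is followed by a drop or plateau.

    coherence_runs maps a topic count to the coherence scores of repeated runs.
    """
    counts = sorted(coherence_runs)
    means = [mean(coherence_runs[k]) for k in counts]
    for idx in range(len(counts) - 1):
        if means[idx + 1] - means[idx] < min_gain:
            return counts[idx]
    return counts[-1]  # coherence kept improving over the whole range

# Illustrative scores only (two runs per setting).
sample_runs = {
    15: [0.22, 0.23],
    30: [0.25, 0.25],
    45: [0.27, 0.28],
    60: [0.27, 0.275],
    75: [0.27, 0.27],
}
chosen = pick_num_topics(sample_runs)
```

A rule this simple is sensitive to noise at small topic counts, which is one reason to average several runs per setting before comparing.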
IPC codes were used to inspect the filtered-out patents to see whether any relevant patent documents were among them. The filtered-out set does not contain any relevant documents. Furthermore, removing the filtered-out documents significantly reduces the dataset size for further processing. Table 6 shows the retrieved patent set size statistics before and after filtering for sample patent applications.
This table also lists a sample set of IPC codes for manually investigated patents that had been filtered in and filtered out. For instance, the IPC codes of the filtered-in patents for the patent application "Bluetooth beacon attendance system based on smartphone and application method" are G07C1, H04L29, H04B5, H04M1, G06Q50, G06Q10, and so on. These IPC codes, in turn, are assigned to patents that specify "time or attendance registers' registration or indication or recording," "arrangements related to the transmission of digital information," "near-field transmission system," "telephonic communication substation equipment," "data processing systems for specific business sectors," and "administration and management of data processing systems," respectively. These topics are highly relevant to the patent application. The IPC codes of the filtered-out retrieved patents, on the other hand, specify cardboard or indoor games, measuring the diagnostic devices and exploring or analyzing the materials through specific methods, security arrangements for protecting computers, and so on. This result confirms that the patents related to these topics are irrelevant and can be filtered out.

Figure 5: Deciding the number of topics for the sample patent application "Bluetooth beacon attendance system based on smartphone and application method" based on coherence score.

Identifying Missing Relevant Patents through Bibliographic Coupling.
The citing and cited patents are retrieved for the topic-filtered patents. The date range considered for this citation data collection is from the priority date of the sample patent application to 2020/11/31. Because of the abundance of patent applications and granted patents, relevant patents can be overlooked in citations. As a result, the BC strength between patent pairs is examined to ensure that no relevant patents are excluded from processing. The BC strength, as previously stated, represents the correlation between the patents. For instance, there are 6,265 patents connected through citations for the sample patent application "Bluetooth beacon attendance system based on smartphone and application method." Among these 6,265 patent citations, 3,494 bibliographically coupled patent pairs were identified. The mean BC strength is computed and set as the threshold value, and pairs with low BC strength are excluded. Around 263 patents with BC strength greater than the threshold (3) were retrieved as relevant patents, and overall, the query processor module and citation analyzer together yield 1,337 patents (1,074 + 263). These figures, along with the statistics of other patent applications, are tabulated in Table 7.
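The thresholding on BC strength can be sketched as below, using the standard definition of bibliographic coupling strength as the number of references two documents share. The mean of the non-zero strengths is used as the threshold, following the description above. The function names and the toy citation data are hypothetical.

```python
from itertools import combinations
from statistics import mean

def coupling_strengths(references):
    """BC strength per patent pair: the number of cited references they share."""
    strengths = {}
    for a, b in combinations(sorted(references), 2):
        shared = len(references[a] & references[b])
        if shared:
            strengths[(a, b)] = shared
    return strengths

def strongly_coupled(strengths):
    """Keep the pairs whose BC strength exceeds the mean BC strength."""
    if not strengths:
        return {}, 0
    threshold = mean(strengths.values())
    kept = {pair: s for pair, s in strengths.items() if s > threshold}
    return kept, threshold

# Toy citation data: patent id -> set of cited references.
references = {
    "A": {"r1", "r2", "r3", "r4"},
    "B": {"r1", "r2", "r3", "r4", "r5"},
    "C": {"r1", "r9"},
    "D": {"r7"},
}
strengths = coupling_strengths(references)
kept, threshold = strongly_coupled(strengths)
```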

Ranking Patents Based on Structural Relevance-Based Patent Ranker.
The ranker computes patent similarity based on an inherent patent characteristic, namely structural similarity. All patents filed with the USPTO and WIPO have a defined structure that includes the necessary textual fields: title, abstract, description, and background. During the query processor experiments in our previous work [9], we found that the textual fields have varying effects on the generation of prior-art search queries and on retrieval. As a result, weights are applied to the description, abstract, and title fields in decreasing order, with values of 0.75, 0.5, and 0.25, respectively.

Patent Summarizer

Dataset.
Experiments for the extractive summarization methods are carried out with a smart device patent document set. This patent document set encompasses patents from the smarthome, smartwatch, and smartphone domains and is retrieved by the query processor module and citation analyzer module shown in Figure 2.
These patent documents are collected through the Google search Application Programming Interface (API) using the expanded search query and citation analysis. Here, the detailed description of the patent is used as input, and the summary field acts as the reference summary. This document set consists of 500 documents for each search query.
Abstractive summarization models are trained using the BIGPATENT [28] dataset. This dataset comprises 1.3 million patent documents grouped under nine categories based on the Cooperative Patent Classification (CPC). Each patent embodiment is used as the input, and the abstract written by the applicant serves as the gold standard summary. The average length of the gold standard summary is around 100 words. It is difficult to retain such long sequences in memory and generate a summary of this length. Therefore, the abstractive summarizer uses only the first two sentences of the abstract as the gold standard summary. Random patents under the technology categories "g" and "h" are chosen for training and validation.
All the patent documents considered were preprocessed to remove digits and special symbols, and the texts were converted to lowercase. Among the 1.3 million documents, the models were trained with 17,743 documents and validated with 7,605 documents. On average, the documents chosen for training and validation have 100 sentences and 45 words per sentence. Therefore, patent documents with fewer than 50 sentences are not considered during training and validation. The statistics of this dataset are summarized in Table 8. The average extractive length for training and validation was 756 and 687, respectively. Similarly, the average human-crafted summary length was 40 words for both training and validation. These summarization models were tested with the results of our summary extractor module.
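The preprocessing and the minimum-length filter can be sketched in a few lines. The regular expressions, the naive sentence splitter, and the function names are assumptions for illustration; the 50-sentence cut-off is the one stated above.

```python
import re

def preprocess(text):
    """Remove digits and special symbols, collapse whitespace, and lowercase."""
    letters_only = re.sub(r"[^A-Za-z\s]", " ", text)
    return re.sub(r"\s+", " ", letters_only).strip().lower()

def split_sentences(text):
    """Naive splitter on sentence-ending punctuation (run before preprocess)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def long_enough(document, min_sentences=50):
    """Keep only documents with at least min_sentences sentences."""
    return len(split_sentences(document)) >= min_sentences

cleaned = preprocess("Bluetooth 4.0 chip (smartphone)!")
keep_long = long_enough("A sentence. " * 60)
keep_short = long_enough("One. Two.")
```

Sentence counting has to happen on the raw text, because the cleaning step removes the punctuation that marks sentence boundaries.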
(2) Components are writing tool, a Bluetooth beacon, a smartphone mobile attendance application system, a smartphone Bluetooth 4.0 chip, a smartphone Bluetooth 4.0 recognition system, a smartphone information transmitting system, a remote attendance recording and information storage system, and a remote attendance management server system.
(3) According to the Bluetooth beacon attendance system, based on the Bluetooth beacon sensing technique, an attendance scanning device is completed by virtue of the co-operation of a smartphone with the Bluetooth 4.0 during employee attendance, the information which is confirmed to be qualified is transmitted to the remote attendance management server system to be stored and to generate an attendance result and attendance abnormality in real time, the attendance abnormality prompting and attendance record inquiry can be acquired by an employee from the smartphone mobile attendance application system in real time, so that the appealing statement can be made by the employee.

LSA summary (state-of-the-art method)
(1) The work attendance relevant information that instrument writes encryption in blank Bluetooth beacon.
(2) Employee can obtain turn out for work abnormity prompt and attendance record inquiry in real time in smart mobile phone movable attendance checking application system and carries out complaint explanation.
(3) Employee can obtain turn out for work abnormity prompt and attendance record inquiry in real time in smart mobile phone movable attendance checking application system and carries out complaint explanation.

TextRank summary (state-of-the-art method)
(1) And write instrument, Bluetooth beacon, smart mobile phone movable attendance checking application system, intelligent mobile phone Bluetooth.
(2) And write instrument, Bluetooth beacon, smart mobile phone movable attendance checking application system, intelligent mobile phone Bluetooth.
(3) The work attendance relevant information that instrument writes encryption in blank Bluetooth beacon, step user opens the smart mobile phone movable attendance checking application system on mobile phone, input customer id, log in after user name and password, select "the Bluetooth attendance checking function" in "movable attendance checking," by the Bluetooth beacon that mobile phone is pressed close to make, call the signal content that intelligent mobile phone Bluetooth.

Free Summarizer (state-of-the-art method)
(1) Recognition system, smart mobile phone information transmitting system, long-range attendance record information storage system, and long-range attendance management server system, wherein Bluetooth of mobile phone.
(2) Recognition system connects smart mobile phone information transmitting system via smart mobile phone movable attendance checking application system, attendance record is exported to long-range attendance record information storage system by smart mobile phone information transmitting system, preserve attendance record by attendance record information storage system and result exported to long-range attendance management server system, long-range attendance management server system generates checking-in result, and by the inquiry of this checking-in result input smart mobile phone movable attendance checking application system reminding user.
(3) Step user opens the smart mobile phone movable attendance checking application system on mobile phone, input customer id, log in after user name and password, select "the Bluetooth attendance checking function" in "movable attendance checking," by the Bluetooth beacon that mobile phone is pressed close to make, call the signal content that intelligent mobile phone Bluetooth.

ROUGE-1 (unigrams), ROUGE-2 (bigrams), and ROUGE-L (longest common subsequence) are used here for evaluation. For these metrics, it presents results in terms of precision, recall, and F-score.
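For reference, ROUGE-N precision, recall, and F-score can be computed from clipped n-gram overlap as in the sketch below. It assumes whitespace tokenization and lowercasing, with no stemming or stopword handling, so the numbers will differ from those of a full ROUGE toolkit. The function names and the example sentences are illustrative.

```python
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n(candidate, reference, n=1):
    """Return (precision, recall, F-score) based on clipped n-gram overlap."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # each n-gram counted at most min(cand, ref) times
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

candidate = "bluetooth beacon system for attendance"
reference = "bluetooth beacon attendance system based on a smartphone"
p1, r1, f1 = rouge_n(candidate, reference, n=1)
p2, r2, f2 = rouge_n(candidate, reference, n=2)
```

A candidate that is shorter than the reference tends to score higher on precision than on recall, as in this example.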

Effect of Extractive Summarizer.
Our summary extractor module comprises two layers of RBM with 14 perceptrons in the input layer. The hidden layer's size is twice that of the input layer, as this helps in discovering latent factors. The final hidden layer is a softmax layer with 2 neurons, which represent the two classes: whether or not a sentence is to be included in the summary. The learning rate for this model is fixed at 0.1. The quality of the summary generated by our summary extractor module is tested by applying it to the smart device patent document set, and the results are compared with the state-of-the-art methods latent semantic analysis (LSA) and TextRank. The results are also compared with a summarization tool, Free Summarizer (http://freesummarizer.com/). Excerpts (first 3 sentences) of the summaries generated by the extractive models stacked RBM, LSA, TextRank, and Free Summarizer for a test document are shown in Table 9.
As observed in Table 9, the extractive summary generated using stacked RBM is consistent and well organized. On the other hand, LSA and TextRank produce redundant entries. In LSA, sentences 2 and 3 are redundant. Similarly, in TextRank, redundancy is observed in sentences 1 and 2. Even if the redundant entries are eliminated, the remaining sentences are not consistent, and it is difficult to determine the main objective or topic of the document. Free Summarizer, on the other hand, produces non-redundant summaries, but the sentences are systematically organized. Table 10 reports the ROUGE scores obtained by the different extractive summarization models and tools. Among these models, LSA, TextRank, and Free Summarizer have a 50% compression rate, while stacked RBM has an average compression rate of around 60%. As seen from Table 10, our summary extractor module using stacked RBM achieves much better results on ROUGE-2 than on the other metrics (ROUGE-1 and ROUGE-LCS). Stacked RBM outperforms all other methods and tools for extractive summarization for two reasons. The foremost reason is the feature extractor. The features are extracted by considering semantics, sentence saliency, redundancy, and coherence with the source document and the prior-art search query. Semantic importance is computed with the smart device domain ontology, and sentences with concepts related to the domain ontology are given priority. Redundancy is eliminated through similarity computation among sentence pairs. Coherence with the prior-art search is achieved with the title and search query similarity feature. Secondly, RBM discovers more latent factors than the other methods. As observed from Table 10, the precision score for all ROUGE metrics is higher than the recall. This is because the obtained summaries are shorter than the gold standard extractive summary. A possible solution is to generate the summary from the input text with a limit on the number of sentences, rather than extracting salient sentences based on features.
Although the metrics indicate that stacked RBM performs better and can extract relevant and prominent sentences from the patents, it is essential to have domain experts evaluate the candidate summaries qualitatively. The readability of the generated summaries is qualitatively analyzed by patent analysts and domain experts from academia. Two patent analysts and five computer science domain experts independently assessed 50 candidate summaries against the gold standard and the input patent text. The assessors concentrated on the informativeness, readability, and validity of the generated summary. Informativeness evaluates whether the generated summary is relevant and whether it conveys the overall content of the input text. Readability checks the uniformity, coherence, and understandability of the summary. Finally, validity evaluates whether the generated summary can be used as is. These factors are scored in the range of 0 to 5, with 5 being more coherent, readable, and informative, and 1 being unreasonable and not an effective replacement for the summary. Table 11 lists the average scores of the three factors (informativeness, readability, and validity) for the extractive models. The results show that stacked RBM attains good scores on all three factors compared to the other methods. The average execution time of each algorithm is presented in Figure 6. The figure shows that the execution time of stacked RBM is higher than that of the other models, as it involves evaluating a number of internal parameters. With Free Summarizer, on the other hand, the average execution time is static as it is web-based, and it does not consume much time. Although stacked RBM takes more time, the quality of the generated summary justifies the cost.
Effect of Abstractive Summarizer.
All the experiments with LSTM and Bi-LSTM are carried out with a latent dimension of 512 and an embedding size of 128. The Seq2Seq model with single-layered LSTM learns embeddings from the training documents, while the Seq2Seq model with stacked LSTM (2 layers) and attention and our Seq2Seq model with stacked Bi-LSTM (3 layers) and attention use the pretrained ConceptNet NumberBatch embedding. To avoid overfitting and to further improve the performance of the model, dropout is employed. The LSTM and Bi-LSTM layers in the encoder have a dropout of 0.3, while a dropout of 0.2 is employed at the decoder. A dropout of 0.3 or 0.2 means that 30% or 20% of the neurons are randomly dropped during training. Adam [75] with parameters $\beta_1 = 0.9$, $\beta_2 = 0.999$, and $\epsilon = 10^{-8}$ and learning rate $\eta = 0.001$ was used for optimization in all abstractive summarization experiments. Adam was chosen as it combines the properties of other stochastic gradient optimization algorithms such as RMSProp and AdaGrad. The LSTM models were trained for 50 epochs, while the Bi-LSTM model was trained for 100 epochs. For all the models, early stopping was triggered when the validation loss did not improve for 5 epochs (patience = 5). Also, to avoid the exploding gradients problem, gradient clipping with a threshold of 5 is applied. A beam width of 10 was used in the models, which means that the 10 most probable candidate sequences are kept at each time step while generating the target word. All these abstractive summarization models were trained in a Google Colab notebook with a Tesla T4 GPU environment. Because the models are stochastic, each was initialized with these parameters and run 10 times, and the average scores are presented in Table 12.
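Two of the training safeguards, early stopping with a patience of 5 and gradient clipping with a threshold of 5, can be sketched independently of any deep learning framework. The text does not say whether clipping is applied to the gradient norm or to individual values; the sketch assumes norm clipping. The function names and the sample loss curve are illustrative.

```python
import math

def early_stopping_epoch(validation_losses, max_epochs=100, patience=5):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses."""
    best_loss, best_epoch, wait = math.inf, 0, 0
    losses = validation_losses[:max_epochs]
    for epoch, loss in enumerate(losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch  # no improvement for `patience` epochs
    return len(losses), best_epoch

def clip_by_norm(gradient, threshold=5.0):
    """Rescale the gradient so that its L2 norm does not exceed the threshold."""
    norm = math.sqrt(sum(g * g for g in gradient))
    if norm <= threshold:
        return list(gradient)
    return [g * threshold / norm for g in gradient]

# Invented validation losses: the best value is reached at epoch 3.
losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.74, 0.75, 0.6]
stop_epoch, best_epoch = early_stopping_epoch(losses)
clipped = clip_by_norm([6.0, 8.0])
```

In this example training stops at epoch 8, so the lower loss at epoch 9 is never seen; this is the usual trade-off of a fixed patience.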
As observed from Table 12, the Seq2Seq model using stacked Bi-LSTM with attention and ConceptNet pretrained embedding achieves better performance than the other models. This can be seen from the ROUGE scores, where stacked Bi-LSTM improves on stacked LSTM by 5.7% on ROUGE-1, 3.6% on ROUGE-2, and 4% on ROUGE-LCS. A sample summary generated by these models is shown in Table 13. In the table, the text discusses a Bluetooth-based attendance management system using a smartphone, including the components involved and the working model of the system. The findings in Table 13 show that the Seq2Seq model using LSTM lacks the main keywords and repeats common keywords, while the Seq2Seq model with stacked LSTM produces a better summary than that of LSTM. The summary of stacked LSTM also has a richer vocabulary than that of LSTM. This improvement is due to the ConceptNet embedding. In contrast, the Seq2Seq model with LSTM learns embeddings from the available training data, which is much smaller than the corpus behind the pretrained embeddings. Compared with these two models, the summary generated by the Seq2Seq model using Bi-LSTM, attention, and ConceptNet embedding is more logical. Though the summary does not contain all the keywords present in the reference summary, such as "recognition system," "application system," "chip," and so on, it is understandable and covers the main concepts of the text.

Conclusion
In this paper, we presented PQPS, an extractive and abstractive summarizer for patents. The summarizer is search query based: it extracts prominent terms from the patent application document and expands them with domain-dependent and domain-independent knowledge bases. PQPS filters irrelevant documents using LDA-based topic modeling and enhances relevant patent retrieval through bibliographic coupling to further improve the retrieval efficiency. PQPS proposes a ranking model that ranks the resultant retrieval set by assigning weights to different fields of the patent. Finally, it uses the deep learning models stacked RBM and Bi-LSTM to summarize the ranked set of patents extractively and abstractively.
Evaluation results for the PQPS modules support the effectiveness of the proposed approach. Around 1600 patent applications from the smartphone, smartwatch, and smarthome domains were tested with the PQPS system. The PQPS query processor module uses domain-dependent and domain-independent ontologies for prior-art search query generation and expansion, retrieving roughly 1000 prior-art patents. The retrieval efficiency of the query processor's submodules was evaluated: queries expanded with the domain ontology improve recall of relevant documents by around 28% and 56% over the WordNet-based query expansion system and the Google prior-art search system, respectively. LDA-based patent document filtering excludes extraneous documents using the coherence score and the intertopic distance map. The results are manually reviewed using IPC, and patents that may have been missed due to information overload are retrieved using bibliographic coupling (BC). The resulting patent set is extractively summarized with stacked RBM. The average ROUGE-1, ROUGE-2, and ROUGE-LCS recall scores for stacked RBM were 0.46, 0.68, and 0.46, respectively, which were better than those of other state-of-the-art models such as LSA, TextRank, and the Free Summarizer tool. Abstractive patent summary generation using seq-seq Bi-LSTM with Num-Batch embedding and attention surpasses other models, with average recall of 0.399, 0.252, and 0.35 for ROUGE-1, ROUGE-2, and ROUGE-LCS, respectively. As part of future work, we intend to update the summarization model with more sentences.

Extractive summary (input): the invention relates to a Bluetooth beacon attendance system based on a smartphone. Components are writing tool, a Bluetooth beacon, a smartphone mobile attendance application system, a smartphone Bluetooth 4.0 chip, a smartphone Bluetooth 4.0 recognition system, a smartphone information transmitting system, a remote attendance recording and information storage system, and a remote attendance management server system. According to the Bluetooth beacon attendance system, based on the Bluetooth beacon sensing technique, an attendance scanning device is completed by virtue of the co-operation of a smartphone with the Bluetooth 4.0 during employee attendance, the information which is confirmed to be qualified is transmitted to the remote attendance management server system to be stored and to generate an attendance result and attendance abnormality in real time, the attendance abnormality prompting and attendance record inquiry can be acquired by an employee from the smartphone mobile attendance application system in real time, so that the appealing statement can be made by the employee. Above-mentioned technical matters of the present invention is mainly solved by following technical proposals: based on the Bluetooth Beacon attendance checking system of smart mobile phone, comprise Bluetooth of mobile phone 4.0 and write instrument, Bluetooth Beacon, smart mobile phone movable attendance checking application system, intelligent mobile phone Bluetooth 4.0 chip, intelligent mobile phone Bluetooth 4.0 recognition system, smart mobile phone information transmitting system, long-range attendance record information storage system and long-range attendance management server system, wherein Bluetooth of mobile phone 4.0 writes instrument by the data of turning out for work of encryption write Bluetooth Beacon, this information is inputted intelligent mobile phone Bluetooth 4.0 chip via Bluetooth signal by Bluetooth Beacon, the output terminal of intelligent mobile phone Bluetooth 4.0 chip connects intelligent mobile phone Bluetooth 4.0.

Reference summary: Bluetooth beacon attendance system based on a smartphone. The Bluetooth beacon attendance system comprises a mobile phone Bluetooth4.0 writing tool, a Bluetooth beacon, a smartphone mobile attendance application system, a smartphone Bluetooth 4.0 chip, a smartphone Bluetooth 4.0 recognition system, a smartphone information transmitting system, a remote attendance recording and information storage system, and a remote attendance management server system.

Stacked LSTM + attention: Bluetooth for smartphone attendance Bluetooth chip information system Bluetooth storage LSTM Bluetooth attendance for system Bluetooth Bluetooth attendance for for

Stacked Bi-LSTM + attention: it is a Bluetooth beacon system for attendance using phone. It has Bluetooth beacon and transmitter server and generates attendance.
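To make the ROUGE recall figures concrete, the following is a minimal sketch of ROUGE-N recall: the clipped count of reference n-grams that also occur in the candidate, divided by the number of n-grams in the reference. The helper names are hypothetical, tokenization is assumed to be plain lowercase whitespace splitting, and the two strings are shortened, punctuation-free versions of the first sentences of the reference summary and the Bi-LSTM output. Evaluation toolkits usually add stemming and other preprocessing, so this sketch is not expected to reproduce the scores reported in the paper.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(reference, candidate, n=1):
    """Clipped n-gram overlap divided by the reference n-gram count."""
    ref = ngram_counts(reference.lower().split(), n)
    cand = ngram_counts(candidate.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # A missing n-gram in a Counter counts as 0, so min() clips correctly.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


reference = "Bluetooth beacon attendance system based on a smartphone"
candidate = "it is a Bluetooth beacon system for attendance using phone"

r1 = rouge_n_recall(reference, candidate, n=1)
r2 = rouge_n_recall(reference, candidate, n=2)
print(f"ROUGE-1 recall: {r1:.3f}")  # 5 of 8 reference unigrams matched
print(f"ROUGE-2 recall: {r2:.3f}")  # 1 of 7 reference bigrams matched
```

On this pair, five of the eight reference unigrams and one of the seven reference bigrams appear in the candidate, which shows why ROUGE-2 recall is typically much lower than ROUGE-1 recall for short abstractive outputs.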

Data Availability
The patent data used to support the findings of this study were collected through the Google patent search API and Open Patent Services. They can be crawled and retrieved.

Conflicts of Interest
The authors declare that they have no conflicts of interest.