Article Generative Text Secret Sharing with Topic-Controlled Shadows

Secret image sharing has been extensively researched. However, in the social network environment, shadow images are subject to compression or noise pollution during uploading and transmission, which makes it challenging to recover secrets losslessly. Texts are better suited for transmission in social networks as shadows because of their broad variety of application scenarios and inherent robustness. Through a (k, n) threshold secret sharing technique, a secret is encrypted as n shadows, where any k or more shadows can recover the secret, while fewer than k shadows reveal no information about the secret. In this article, we propose a generative text secret sharing scheme with topic-controlled shadows, which encrypts a secret message as a number of semantically natural shadow texts and controls the topics of shadow texts using bag-of-words models during text generation by the language model. This study also proposes two goal programming models to improve the shadow texts' topic relevance and fluency. The shadow texts of the proposed scheme satisfy loss tolerance, semantic comprehensibility, topic controllability, and robustness. An ablation study, comparative test, and anti-detection experiment verify the effectiveness of the proposed scheme.


Introduction
Protecting sensitive information from malicious interference is essential when it is transmitted over public channels. Shannon [1] summarizes three basic information security systems: the encryption system, the privacy system, and the concealment system. By creating a secret key, the encryption system protects the confidentiality of the message itself. The purpose of privacy systems is to prevent unauthorized users from accessing confidential messages. To protect the existence of confidential messages, the concealment system transmits them through open channels using different types of carriers. The secret sharing (SS) of (k, n) threshold [2,3] satisfies both encryption and privacy requirements. It encrypts a secret message as n shares (shadows), which are distributed to different participants. Any k shadows can recover the secret message, while fewer than k shadows cannot obtain anything. It supports a wide range of applications, including access control, password transmission, distributed storage systems, blockchain security, and cloud computing security [4][5][6][7].
the text are robust when transmitted in a public channel. These indicate that the text may be a more suitable form of data to be transmitted as the shadow in public channels than images. Since the value range of an image's pixels is [0, 255], secret image sharing (SIS) can directly establish a mapping relationship from the shared value to the pixel value, thus combining the shared values into an image form. In contrast, text, as a sequence of words with semantic relevance and syntactic rules, cannot form a direct correspondence between shared values and words. Yang et al. [13] proposed a generative text steganography scheme in which they encode candidate words in the text generation process of a language model (LM). They then choose the corresponding words for output according to the secret bits to be embedded, thus completing the mapping of binary bits to word space.
With the help of this mapping method, this article proposes to encode the candidate words with a perfect binary tree in the generation process of shadow text and then determine the output words according to the shared values. Since the generation process is constrained by the language model, each shadow text is a fluent and natural utterance, so the scheme also satisfies the defining characteristic of the concealment system, namely imperceptibility.
Social networks' complex and open characteristics provide an excellent camouflage environment for transmitting shadow texts. The language characteristics of each social account differ according to its fields of interest, professional direction, and so on. The concealment and security of transmitting shadow texts through social networks can be further enhanced if the semantics of shadow texts are effectively controlled in line with the social accounts' characteristics. Controllable text generation (CTG) can control text characteristics, such as emotion and style, while preserving the content [14][15][16]. CTG involves modeling $P(x \mid \alpha)$, where $\alpha$ is the target attribute and $x$ is the sample to generate. A Plug and Play Language Model (PPLM) for controllable language generation was proposed by Dathathri et al. [17]; it combines a pretrained generative model $P(x)$ with attribute models. We propose using this CTG method to control the topic of each shadow text through the bag of words (BoW) associated with different topic words, thus making the shadow texts more suitable for social network scenarios.
In this article, we propose a generative text secret sharing with topic-controlled shadows (GTSS) scheme, which shares a secret message as n shadow texts, each of which can have a different topic, such that any k shadow texts can recover the secret message. This article's motivations and contributions are summarized as follows:
(i) In response to the problems faced by secret image sharing schemes, such as shadow images easily suffering from compression and noise pollution during transmission and being susceptible to suspicion, this article proposes to use texts as shadows, which are more suitable for robust and covert transmission in the social network environment.
(ii) To address the problem that the shared values cannot be directly linked to the word space, this article proposes to encode the candidate words with a perfect binary tree in the text generation process to establish the mapping from the shared value space to the word space.
(iii) Because social network users differ in how they write, we propose to control the generated shadow texts' topics with BoW so that they can be better concealed in social network scenarios.
(iv) Most importantly, we propose two goal programming models, which deeply integrate secret sharing, encoding, and controllable text generation techniques and can enhance the topic relevance and text fluency, respectively.
Compared with existing generative text steganography schemes, GTSS has certain advantages in both generated text quality and detection resistance.

Preliminaries and Related Work
Preliminaries and related work regarding the proposed scheme are presented here. First, we introduce the definition of SS and the matrix-theory-based SS scheme used by GTSS. Then, we introduce the method of mapping binary bits to word space. Finally, we introduce the transformer principle and the transformer-based controllable text generation method.

Secret Sharing Based on Matrix Theory.
Secret sharing can be defined as follows [18]. Share: a randomized algorithm that outputs a sequence $(s_1, \ldots, s_n)$ of shares based on the input message $m \in M$.
Reconstruct: a deterministic algorithm that outputs a message based on a collection of k or more shares.
$M$ is the message space and $k$ is the threshold. The users are numbered $\{1, 2, \ldots, n\}$, and user $i$ holds share $s_i$. Let $U \subseteq \{1, \ldots, n\}$ denote a subset of users. The set of shares belonging to the users in $U$ is $\{s_i \mid i \in U\}$. If $|U| \ge k$, then $U$ is authorized; otherwise it is unauthorized. Secret sharing aims to allow authorized sets of users/shares to recover the secret, while unauthorized sets cannot.
Yu et al. [19] proposed an SIS scheme modulo 256, which removes the restriction of traditional SS schemes that the modulus (denoted by $p$) must be a prime number and makes the shared value space correspond exactly to the range of grayscale image pixel values. They first construct an $n \times k$ sharing matrix $K$ such that the determinant of any $k \times k$ submatrix is odd. After that, the secret pixel value to be shared is placed in the first element of the vector $a = (a_0, a_1, \ldots, a_{k-1})^T$, and the remaining elements of $a$ take random values from $[0, p-1]$. Then, the $n$ shared values $s = (s_1, s_2, \ldots, s_n)^T$ are obtained by matrix multiplication modulo $p$:

$$ s = K a \bmod p. \tag{2} $$

Each shared value $s_i$ corresponds to one row vector $K[i]$, and recovery can be performed when $k$ pairs $(s_i, K[i])$ are obtained. During the recovery process, the recovery matrix $K'$ is constructed from the $k$ available row vectors $K[i]$ and is therefore a $k \times k$ submatrix of $K$. The vector $a$ can be recovered by (3), and its first element is the secret pixel value.

$$ a = K'^{-1} s' \bmod p = \frac{K'^{*}}{|K'|}\, s' \bmod p, \tag{3} $$

where $K'^{-1}$ is the inverse of the recovery matrix $K'$, which is obtained by dividing the adjoint matrix $K'^{*}$ by the determinant $|K'|$, and $s'$ consists of $k$ shared values. Because $|K'|$ is odd and $p$ is a power of two, $|K'|$ is invertible modulo $p$, so the division is well defined.
In this article, we encode the candidate pool using a perfect binary tree of height $h$. The integer form of a codeword lies in $[0, 2^h - 1]$; if we take $p = 2^h$, the shared value space corresponds exactly to the codeword space. The candidate pool size (CPS) is $2^h$. GTSS therefore generalizes the SS scheme above by choosing the modulus $2^h$, so that the shared values never exceed the CPS, which ensures the feasibility of information embedding and the correctness of extraction. In addition, the scheme shares $l$ ($l \le k$) secret values at a time: the first $l$ elements of $a$ are secret values, and the remaining $k - l$ elements are chosen from $[0, p-1]$. Sharing $l$ secret values at a time significantly improves the efficiency of the scheme.
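The sharing and recovery steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`det`, `adjugate`, `is_valid_sharing_matrix`, `share`, `recover`) and the example matrices in the usage note are hypothetical, and the determinant is computed by plain Laplace expansion, which is only practical for the small thresholds considered here.

```python
import itertools
import random


def det(m):
    """Integer determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def adjugate(m):
    """Adjoint matrix (transposed cofactors): m * adjugate(m) = det(m) * I."""
    k = len(m)
    if k == 1:
        return [[1]]
    adj = [[0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            minor = [row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i]
            adj[j][i] = (-1) ** (i + j) * det(minor)
    return adj


def is_valid_sharing_matrix(K, k):
    """True if every k x k submatrix (choice of k rows) has an odd determinant."""
    return all(det([K[i] for i in rows]) % 2 == 1
               for rows in itertools.combinations(range(len(K)), k))


def share(secrets, K, p, rng):
    """Compute s = K a mod p, with the l secrets first and random filler after."""
    k = len(K[0])
    a = list(secrets) + [rng.randrange(p) for _ in range(k - len(secrets))]
    return [sum(x * y for x, y in zip(row, a)) % p for row in K]


def recover(shares, rows, p, l):
    """Compute a = adj(K') * |K'|^-1 * s' mod p and return the first l entries."""
    d_inv = pow(det(rows) % p, -1, p)  # exists because |K'| is odd and p = 2**h
    adj = adjugate(rows)
    a = [d_inv * sum(x * y for x, y in zip(adj_row, shares)) % p for adj_row in adj]
    return a[:l]
```

For instance, with the assumed (2, 3) matrix `K = [[1, 2], [2, 1], [1, 1]]`, `h = 2`, and `p = 4`, calling `share([3], K, 4, random.Random(0))` yields three shared values, and `recover` applied to any two of them together with the matching rows of `K` returns `[3]`.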

Mapping Method of Binary Bits to Word Space.
The natural language processing field typically considers text to be a sequence of words organized according to their semantic associations and syntactic properties. The chain rule of probability [20,21] can be used to describe the joint probability distribution of word sequences, which is expressed as follows:

$$ P(X) = P(x_1, x_2, \ldots, x_N) = P(x_1)\, P(x_2 \mid x_1) \cdots P(x_N \mid x_1 x_2 \cdots x_{N-1}), \tag{4} $$

where $P(X)$ is the generation probability of the word sequence $x_1, x_2, \ldots, x_N$, and $P(x_N \mid x_1 x_2 \cdots x_{N-1})$ represents the conditional probability of generating word $x_N$ when $x_1 x_2 \cdots x_{N-1}$ are given. Conditional probability measures the degree of fitness between $x_N$ and the preceding text. Generally, the generated text is more reasonable if the conditional probability is higher. In general, multiple candidate words $x_N$ are available for a given string $x_1 x_2 \cdots x_{N-1}$, each of which lets the generated text conform to grammatical and semantic rules.
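To make the chain rule concrete, the short Python sketch below multiplies per-step conditional probabilities into a sequence probability. The function names and the three example probabilities are hypothetical; the sum is accumulated in log space, a common choice (assumed here, not taken from the article) to avoid numerical underflow on long sequences.

```python
import math


def sequence_log_prob(cond_probs):
    """Sum of log conditional probabilities, i.e. log P(X) under the chain rule."""
    return sum(math.log(p) for p in cond_probs)


def sequence_prob(cond_probs):
    """P(X) as the product of the conditional probabilities of each word."""
    return math.exp(sequence_log_prob(cond_probs))


# Hypothetical values for P(x1), P(x2 | x1), P(x3 | x1 x2)
example = [0.5, 0.4, 0.25]
joint = sequence_prob(example)  # approximately 0.5 * 0.4 * 0.25
```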
To achieve the mapping of secret bits to words, Yang et al. [13] proposed a fixed-length coding (FLC) based on the perfect binary tree and a variable-length coding (VLC) based on the Huffman tree. The FLC scheme is simpler to implement and more time-efficient [22]. Figure 1 illustrates FLC schematically, with a candidate pool size of 4 and a perfect binary tree of height 2. At the t-th time step, the FLC scheme first inputs the prefix text into the LM to get the candidate words and their probability distribution at the (t + 1)-th time step, then keeps a candidate pool of fixed size in descending probability order and encodes the words using a perfect binary tree. The codewords of word_1, word_2, word_3, and word_4 are then 00, 01, 10, and 11, respectively. Therefore, the corresponding candidate word (word_2) can be chosen based on the secret bits (01) to be embedded. The output word is appended to the prefix for embedding the secret bits in the next time step.
Using the method discussed above, each word in the generated text can carry a certain number of secret bits. We first encrypt the secret value into shared values by SS and then complete the mapping from the shared values to words in shadow texts using perfect binary tree encoding.
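The fixed-length mapping can be illustrated with a small Python sketch. It assumes the next-word distribution is available as a plain dictionary from word to probability (in GTSS it would come from the language model); the function names and the toy distribution are hypothetical. Because a perfect binary tree of height h assigns the codewords 0 to 2^h - 1 in order, the codeword of a word is simply its rank in the candidate pool.

```python
def candidate_pool(probs, h):
    """Top 2**h words in descending probability; list index equals codeword value."""
    ranked = sorted(probs, key=probs.get, reverse=True)
    return ranked[: 2 ** h]


def embed(probs, value, h):
    """Pick the word whose codeword equals the integer value to embed."""
    return candidate_pool(probs, h)[value]


def extract(probs, word, h):
    """Recover the embedded integer from an observed word."""
    return candidate_pool(probs, h).index(word)


# Toy next-word distribution (hypothetical values)
probs = {"cat": 0.40, "dog": 0.30, "bird": 0.20, "fish": 0.05, "ant": 0.05}
word = embed(probs, int("01", 2), h=2)           # codeword 01 selects the 2nd candidate
bits = format(extract(probs, word, h=2), "02b")  # back to the bit string
```

The receiver must rebuild exactly the same candidate pool, which is why both sides need the same language model, prefix, and pool size.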

Transformer-Based Controllable Text Generation.
On the basis of traditional text generation, controllable text generation adds control over some key information, styles, attributes, etc., making generated text meet some of our expectations.
To create a conditional generative model, Dathathri et al. used a transformer [23] to model the natural language distribution and proposed PPLM [17] to sample from $P(x \mid \alpha) \propto P(\alpha \mid x) P(x)$. Equation (5) summarizes the recurrent interpretation of transformers [24]:

$$ o_{t+1}, H_{t+1} = \mathrm{LM}(x_t, H_t). \tag{5} $$

The history matrix $H_t$ consists of the key-value pairs from time steps 0 to $t$. The word $x_{t+1}$ is then sampled according to

$$ x_{t+1} \sim P_{t+1} = \mathrm{Softmax}(T o_{t+1}), $$

where $T$ is a linear transformation that maps the logit vector $o_{t+1}$ to a vector of vocabulary size.
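In the next time step, $H_t$ can be adjusted to increase the probability of the more relevant words in the candidate pool. Based on the conditioned attribute model $P(\alpha \mid x)$, the history $H_t$ can be shifted toward higher log-likelihood (LL) of attribute $\alpha$ and of the distribution of the unmodified language model $P(x)$. With $\Delta H_t$ being the update to $H_t$, $(H_t + \Delta H_t)$ increases the likelihood that the generated text possesses the target attribute. The initial value of $\Delta H_t$ is zero, and the attribute model $P(\alpha \mid x)$ is rewritten as $P(\alpha \mid H_t + \Delta H_t)$ by PPLM. Gradient updates are then made to $\Delta H_t$ as follows:

$$ \Delta H_t \leftarrow \Delta H_t + \beta \, \frac{\nabla_{\Delta H_t} \log P(\alpha \mid H_t + \Delta H_t)}{\lVert \nabla_{\Delta H_t} \log P(\alpha \mid H_t + \Delta H_t) \rVert^{\gamma}}, \tag{6} $$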

Security and Communication Networks
where $\beta$ indicates the update step size and $\gamma$ represents the scaling coefficient for the normalization term. This updating step can be repeated $m$ times; the value of $m$ is usually between 3 and 10. Afterward, the updated logits $\tilde{o}_{t+1}$ are obtained by a forward pass through the LM with the updated history, $\tilde{o}_{t+1}, H_{t+1} = \mathrm{LM}(x_t, \tilde{H}_t)$, where $\tilde{H}_t = H_t + \Delta H_t$. Using the modified $\tilde{o}_{t+1}$ at time step $t + 1$, a new probability distribution $\tilde{P}_{t+1}$ can be generated. GTSS chooses BoW as the attribute model to modify $H_t$. After that, the modified probability distribution $\tilde{P}_{t+1}$ is arranged in descending order, and the first $2^h$ words are encoded by a perfect binary tree. The corresponding shadow words are then selected for output according to the shared values, thus completing the generation of shadow texts that satisfy specific topics. The specific method and detailed algorithm are described in the next section.

GTSS Methodology
In this section, we first define text secret sharing and introduce the basic idea of GTSS, followed by a detailed description of the sharing algorithm and the recovery algorithm, after which we analyze the applicability of GTSS. Table 1 lists the main notations used in this article.

The Basic Idea.
This scheme uses BoW models corresponding to specific topics as attribute models. A BoW is a set of keywords $\{word_1, \ldots, word_m\}$ that specify a topic. Equation (7) can be used to represent $\log P(\alpha \mid x)$:

$$ \log P(\alpha \mid x) = \log \left( \sum_{i=1}^{m} P_{t+1}[word_i] \right), \tag{7} $$

where $P_{t+1}$ represents the conditional probability distribution of the language model output at time $t + 1$. By (6), we can calculate $\Delta H_t$ and modify $H_t$ to obtain the modified conditional probability distribution $\tilde{P}_{t+1}$.
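The effect of the BoW attribute model can be illustrated with a deliberately simplified Python sketch. Note the assumption: PPLM shifts the transformer history H_t and then runs a new forward pass, whereas the code below applies the same normalized gradient-ascent rule directly to a small logit vector, which is a swapped-in simplification that needs no neural network. The names `softmax`, `bow_grad`, and `shift_logits`, the default values of `beta`, `gamma`, and `m`, and the toy logits are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bow_mass(logits, bow_ids):
    """Probability mass on BoW words; its log is the BoW log-likelihood."""
    probs = softmax(logits)
    return sum(probs[i] for i in bow_ids)


def bow_grad(logits, bow_ids):
    """Gradient of log(sum of BoW probabilities) with respect to the logits."""
    probs = softmax(logits)
    mass = sum(probs[i] for i in bow_ids)
    return [(probs[j] / mass if j in bow_ids else 0.0) - probs[j]
            for j in range(len(logits))]


def shift_logits(logits, bow_ids, beta=0.5, gamma=1.0, m=3):
    """Apply m normalized gradient steps of size beta to an accumulated shift."""
    delta = [0.0] * len(logits)
    for _ in range(m):
        shifted = [o + d for o, d in zip(logits, delta)]
        grad = bow_grad(shifted, bow_ids)
        norm = math.sqrt(sum(g * g for g in grad)) or 1.0  # guard a zero gradient
        delta = [d + beta * g / norm ** gamma for d, g in zip(delta, grad)]
    return [o + d for o, d in zip(logits, delta)]


logits = [2.0, 1.0, 0.5, 0.0]  # toy vocabulary of four words
bow_ids = {2, 3}               # indices of the words that belong to the BoW
before = bow_mass(logits, bow_ids)
after = bow_mass(shift_logits(logits, bow_ids), bow_ids)
```

In this toy setting `after` is larger than `before`: the update moves probability mass toward the topic words, which is what later pushes them into the top of the candidate pool.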
We define text secret sharing as follows: a secret message is shared as n shadow texts; each shadow text is a natural, fluent sentence and can have a specific topic. The original secret message can be recovered from any k shadow texts, while fewer than k shadow texts cannot complete the recovery.
To achieve this function, we propose using SS based on matrix theory to share the secret message, controlling the topic of each shadow text by BoW, and completing the mapping of shared values to word space using perfect binary tree coding. GTSS includes two parts: the sharing algorithm and the recovery algorithm. The sharing algorithm includes three modules: the secret sharing module, the mapping module, and the goal programming model. The recovery algorithm includes the reconstruct module and the inverse mapping module. We explain them separately below.

The Sharing Algorithm.
A schematic diagram of the sharing phase is shown in Figure 2, where we consider a scheme with a (2, 2) threshold, $h = 2$, $l = 1$, and $p = 2^h$. We assume that the secret message is a piece of secret text, although the scheme can share arbitrary binary bits. The secret text is encoded into a binary bitstream, then sliced into units of $h$ bits each, which are converted into secret integer values. An $n \times k$ sharing matrix $K$ is generated before sharing such that the determinant of any $k \times k$ submatrix is odd, after which the secret values are successively placed in the first $l$ positions of $a = (a_0, a_1, \ldots, a_{k-1})^T$, and the remaining $k - l$ elements take values in $[0, p - 1]$. The secret sharing module performs matrix multiplication to obtain the shared values $s = (s_1, s_2, \ldots, s_n)^T$. The mapping module continuously generates text using the language model. By using the BoW corresponding to a specific topic, the mapping module modifies the probability distribution to increase the probability of the more topic-compatible words in the candidate pool. The mapping module then uses a perfect binary tree to encode the candidate words, and the corresponding words are chosen based on the shared values. The goal programming model (GPM) guides all the above processes.
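The slicing of the secret text into h-bit units can be sketched as follows. The article does not state the text encoding or how a final short unit is handled, so UTF-8 encoding and zero-padding of the last unit are assumptions of this sketch, as are the function names; the decoder additionally assumes h is at most 8 so that padding never adds a whole byte.

```python
def text_to_units(text, h):
    """Encode text as bits (UTF-8, assumed), slice into h-bit units, return integers."""
    bits = "".join(f"{byte:08b}" for byte in text.encode("utf-8"))
    bits += "0" * (-len(bits) % h)  # assumed zero-padding of the last unit
    return [int(bits[i:i + h], 2) for i in range(0, len(bits), h)]


def units_to_text(units, h):
    """Inverse of text_to_units for h <= 8: drop padding bits and decode."""
    bits = "".join(f"{u:0{h}b}" for u in units)
    usable = len(bits) - len(bits) % 8
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode("utf-8")


units = text_to_units("Secret message.", h=2)  # 15 bytes = 120 bits = 60 units
```

Each resulting integer lies in [0, 2^h - 1] and can be placed directly into the first l positions of the vector a.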
For different applications, we propose a goal programming model GPM-topic to optimize topic relevance and another goal programming model GPM-ppl to improve text quality. Equation (8) expresses GPM-topic.

$$ \max_{a_l, \ldots, a_{k-1}} \prod_{i=1}^{n} \tilde{P}(w_i \mid prefix_i) \quad \text{s.t.} \quad \begin{cases} s = K a \bmod p, \\ w_i = M(s_i), & i = 1, \ldots, n, \\ a_j \in [0, p - 1], & j = l, \ldots, k - 1. \end{cases} \tag{8} $$

The conditional probability distribution $\tilde{P}(w_i \mid prefix_i)$ represents the likelihood that the next word $w_i$ will be generated given the preceding words $prefix_i$ of the i-th shadow text. The probability distribution $\tilde{P}$ is modified by $BoW_i$ to increase the relevance to $topic_i$. The constraints in GPM-topic are the operations of the secret sharing module and the mapping module. Using the perfect binary tree, the mapping module $M(\cdot)$ completes the mapping of the shared value $s_i$ to the word space generated by the LM. For one set of secret values, the combination of shared values is not unique because the remaining $k - l$ elements of $a$ take values in $[0, p - 1]$. By varying the last $k - l$ elements of $a$, we can produce different shadow word combinations. To generate more appropriate shadow texts, GPM-topic exploits this freedom to find the combination of shadow words with the largest product of probabilities, which is the most relevant to the topic.
Table 1: Main notations used in this article.
$x_t$: The t-th word in a word sequence.
$\alpha$: The desired controllable attribute.
$P_{t+1}$: The conditional probability distribution of the output of the language model at time $t + 1$.
$\tilde{P}_{t+1}$: The modified conditional probability distribution according to a specific topic.

The mapping module modifies the original probability distribution $P_{t+1}$ with BoW to get $\tilde{P}_{t+1}$, which has a higher likelihood of fitting the topic. The language model is trained on thousands of natural texts to fit the natural language probability distribution. Therefore, the modification of the probability distribution affects the fluency of the generated text, which is the price of topic control. Equation (9) shows the perplexity (ppl), a metric for evaluating generated text [25][26][27].

$$ ppl = P(x_1, x_2, \ldots, x_N)^{-\frac{1}{N}} = \left( \prod_{i=1}^{N} P(x_i \mid x_1 x_2 \cdots x_{i-1}) \right)^{-\frac{1}{N}}. \tag{9} $$
We can see that increasing the conditional probability decreases the perplexity and improves the quality of the word sequence. As shown in (10), we propose GPM-ppl as a method for improving shadow text quality.

$$ \max_{a_l, \ldots, a_{k-1}} \prod_{i=1}^{n} P(w_i \mid prefix_i) \quad \text{s.t.} \quad \begin{cases} s = K a \bmod p, \\ w_i = M(s_i), & i = 1, \ldots, n, \\ a_j \in [0, p - 1], & j = l, \ldots, k - 1. \end{cases} \tag{10} $$

The conditional probability distribution $P(w_i \mid prefix_i)$ represents the likelihood that the next word $w_i$ will be generated given the preceding words $prefix_i$ of the i-th shadow text, and it is the original probability distribution obtained by the LM. The rest is the same as GPM-topic. To reduce the ppl and improve the text quality, we have to find the combination of shadow words with the largest product of original probabilities. In this way, each shadow word matches the original distribution given its preceding words more closely, resulting in a reduced ppl of the shadow texts. At the same time, this weakens the ability to select words related to the topic and ultimately reduces the shadow texts' topic relevance. Therefore, the choice of GPM should be based on real-world application requirements.
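One straightforward way to realize the two goal programming models is an exhaustive search over the free elements of the vector a, sketched below in Python. Exhaustive enumeration is an assumption of this sketch (the text only says the elements are adjusted), and the function name `gpm_select`, the `pools` data layout, and the toy numbers are hypothetical. Each entry `pools[i][c]` holds the word with codeword `c` for shadow text `i` together with the probability used in the objective: supplying the topic-modified probabilities corresponds to GPM-topic, and supplying the original LM probabilities corresponds to GPM-ppl.

```python
import itertools
import math


def gpm_select(secrets, K, p, pools):
    """Try every filler for a_l..a_{k-1}; keep the word combination with the
    largest product of probabilities (compared as a sum of logs)."""
    k = len(K[0])
    best = None
    for free in itertools.product(range(p), repeat=k - len(secrets)):
        a = list(secrets) + list(free)
        shares = [sum(x * y for x, y in zip(row, a)) % p for row in K]
        score = sum(math.log(pools[i][s][1]) for i, s in enumerate(shares))
        if best is None or score > best[0]:
            words = [pools[i][s][0] for i, s in enumerate(shares)]
            best = (score, shares, words)
    return best[1], best[2]


# Toy (2, 2) example with h = 2, p = 4, l = 1 (all values hypothetical)
K = [[1, 2], [2, 1]]
pools = [
    [("a0", 0.40), ("a1", 0.30), ("a2", 0.20), ("a3", 0.10)],
    [("b0", 0.50), ("b1", 0.25), ("b2", 0.15), ("b3", 0.10)],
]
shares, words = gpm_select([3], K, 4, pools)
```

The search visits p^(k-l) candidates per step, so its cost grows quickly with the tree height h and with the number of free elements k - l.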
Algorithm 1 shows the details of the sharing method, by which we can generate n natural and topic-controlled shadow texts based on the input secret bitstream. The generated shadow texts can then be sent through open channels, carrying confidential messages with high robustness and concealment.

The Recovery Algorithm.
The secret message can be recovered when k or more shadow texts are obtained. Figure 3 shows the schematic diagram of the recovery phase. Based on the same text generation process as in the sharing phase, the inverse mapping module calculates the conditional probability distribution for the next time step. A perfect binary tree is used to encode the candidate pool. To obtain the shared values, no selection of shadow words is needed as in the sharing stage; instead, the corresponding codewords are looked up from the already determined shadow words. After that, the reconstruct module places the obtained shared values into the vector $s'$ and left-multiplies it by the inverse of the recovery matrix to get the vector $a$, whose first $l$ elements are the secret values. The recovery process is shown in detail in Algorithm 2. For convenience of representation, the k shadow texts obtained are assumed to be the first k of the n shadow texts.

Algorithm 1: The sharing phase.
Input: Secret bitstream $B = \{0, 0, 1, \ldots, 1, 0\}$; (k, n) threshold; the number $l$ of secret units to be shared at one time; the height $h$ of the perfect binary tree (so that CPS $= p = 2^h$); the topics of the shadow texts $topic_1, \ldots, topic_n$; the bags of words $BoW_1, BoW_2, \ldots, BoW_n$ related to $topic_i$; the initial words of each shadow text $prefix_1, prefix_2, \ldots, prefix_n$.
Output: n shadow texts $shadow_1, shadow_2, \ldots, shadow_n$.
(1) Slice $B$ per $h$ bits and transform each unit into integer form to get the secret values Secrets;
(2) Construct the sharing matrix $K$ such that the determinant of any $k \times k$ submatrix is odd;
(3) for each $prefix_i$ do
(4)     Input $prefix_i$ into LM to get $H_t^i$ of $shadow_i$;
(5)     $shadow_i \leftarrow prefix_i$;
(6) $i \leftarrow 0$;
    Create a vector $a$ whose first $l$ values are $l$ secret values and whose remaining $k - l$ values come from $[0, p - 1]$;
(10) $s \leftarrow K a$;
(11) for each $s_i$ in $s$ do
(12)     Based on $BoW_i$, $\Delta H_t^i$ can be obtained through (6);
         Arrange $\tilde{P}_{t+1}$ in descending order, encode the words using the perfect binary tree, and determine the word $w_i$ for output by the shared value $s_i$;
(17) else
(18)     Add $w_i$ to $shadow_i$;
(19) $i \leftarrow i + l$;
(20) return $shadow_1, shadow_2, \ldots, shadow_n$
In contrast to images or videos, texts are not compressed or distorted during transmission. Texts, therefore, have excellent robustness, making shadow texts very suitable for transmission in many scenarios. For example, shadow texts can be transmitted by instant messaging software such as Telegram and Skype, or by posting them on social media platforms such as Twitter and Facebook. The receiver can then obtain the shadow texts by browsing and copying them from these platforms and use the recovery algorithm to get the secret message.
The topic of each shadow text does not need to be transmitted but can be obtained from the shadow text itself. The topic can be identified manually from the semantics of the shadow text, or it can be determined by which BoW has the highest number of words in the text.

Theoretical Analysis.
First, we analyze the algorithm's input and output text length.
Since GTSS uses GPT-2 [28] as the generation model, the maximum sequence length it can process at one time is 1024 tokens. Therefore, the maximum length of both the input prefix text and the output shadow text is 1024.
The design idea of GTSS is that $l$ consecutive secret units are shared into $n$ shared values, each corresponding to one word in one of the $n$ shadow texts. When all the secret units are shared, the generation of shadow text ends. Assuming that the length of the secret bitstream $B$ to be shared is $|B|$, then $B$ is divided into $|B|/h$ secret units, and GTSS shares $l$ secret units at a time, so a total of $|B|/(h \cdot l)$ sharing processes are performed, and each shadow text gains $|B|/(h \cdot l)$ words. The actual length of the final generated shadow text is the length of the prefix text plus the number of subsequently generated words, that is, $|prefix| + |B|/(h \cdot l)$.
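The length formula above can be checked with a few lines of Python. The function name and the prefix length of 4 words are hypothetical, and rounding up with `math.ceil` when the bit length is not an exact multiple of h or the unit count is not a multiple of l is an assumption of this sketch.

```python
import math


def shadow_length(num_bits, h, l, prefix_len):
    """Words in one shadow text: prefix length plus one word per sharing round."""
    units = math.ceil(num_bits / h)  # secret units of h bits each
    rounds = math.ceil(units / l)    # l units are shared per round
    return prefix_len + rounds


# "Secret message." has 15 characters, i.e. 120 bits at 8 bits per character
length_h2 = shadow_length(120, h=2, l=1, prefix_len=4)
length_h3 = shadow_length(120, h=3, l=2, prefix_len=4)
```

With h = 2 and l = 1 each shadow text needs 60 generated words, while h = 3 and l = 2 needs only 20, which shows how a larger tree height and a larger l shorten the shadow texts.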
Then we discuss the computational complexity of the algorithm. Since this article proposes a secret sharing scheme, the complexity analysis focuses on the secret sharing module and the reconstruction module.
Each sharing process multiplies an $n \times k$ matrix by a $k \times 1$ vector, as shown in (2), and a total of $|B|/(h \cdot l)$ sharing processes are performed, so the time complexity of the secret sharing module in GTSS is $O(k \cdot n \cdot |B|/(h \cdot l))$. In GTSS, $n = k$ or $n = k + 1$, so the time complexity is $O(k^2 \cdot |B|/(h \cdot l))$.
In the part of the shadow text after the prefix, one word corresponds to one shared value. So k shared values can be extracted from the words at the corresponding positions in the k shadow texts, and then $l$ units of the secret message can be recovered. The number of words corresponding to shared values in a shadow text is $|B|/(h \cdot l)$, and GTSS uses matrix multiplication in the recovery phase, where a $k \times k$ matrix is multiplied by a $k \times 1$ vector, as shown in (3), so the time complexity of the reconstruction module is $O(k \cdot k \cdot |B|/(h \cdot l)) = O(k^2 \cdot |B|/(h \cdot l))$.

Application Analysis.
The GTSS scheme proposed in this article has two main application scenarios: multi-channel transmission and access control for the secret message. The application scenario diagrams are shown in Figure 4, where $ST_i$ represents the i-th shadow text.
Social networks are public channels, and each social platform has staff monitoring it, so a suspicious account might be deleted or banned. Whenever this happens during transmission, the secret message is lost. Therefore, we can use GTSS to share the secret message as multiple shadow texts with different topics and transmit them through different social accounts or even different social platforms. Thanks to the loss tolerance of secret sharing, even if some of the shadow texts are lost, the secret message can be reconstructed as long as the receiver gets k shadow texts.

In the traditional secret image sharing scheme, n participants hold n shadow images, and k or more participants use the shadow images in hand to recover the secret message. Since an image is a digital medium, a storage device is needed to keep it. In contrast, the shadow of GTSS is in the form of text, whose carrier can be a simple sheet of paper or even the participants' memory, so it is not tied to a storage device and is easy to remember and manage.

Experiments
This section evaluates the proposed GTSS scheme regarding shadow text quality, topic relevance, and anti-detection capability, conducts an ablation study to verify each module's effectiveness, and compares it with text steganography schemes regarding embedding rate and perplexity.

Algorithm 2: The recovery phase.
Input: k shadow texts $shadow_1, shadow_2, \ldots, shadow_k$; the k row vectors corresponding to $shadow_i$; the number $l$ of secret units to be shared at one time; the height $h$ of the perfect binary tree; the topics of the shadow texts $topic_1, topic_2, \ldots, topic_k$; the bags of words $BoW_1, BoW_2, \ldots, BoW_k$ related to $topic_i$.
Output: Original secret bitstream $B$.
(1) Combine the k row vectors to obtain the recovery matrix $K'$ and calculate the inverse matrix $K'^{-1}$ according to (3);
(2) for each shadow text $shadow_i$ do
(3)     Input the prefix of $shadow_i$ into LM to get $H_t$;
(4)     while not the end of $shadow_i$ do
(5)         Based on $BoW_i$, $\Delta H_t^i$ can be obtained through (6);
            Input the last word of $shadow_i$ and $H_t^i$ into LM, then $o_{t+1}, H_{t+1} \leftarrow \mathrm{LM}(x_t, H_t)$;
            Arrange $\tilde{P}_{t+1}$ in descending order and encode the words by a perfect binary tree;
(10)        Extract the codeword corresponding to $x_{t+1}$ and transform it to an integer value, which is the shared value $s_i$. Then add $s_i$ to $Shares_i$;
(11) for each $s_i$ in each $Shares_i$ do
(12)     Combine $s_1, s_2, \ldots, s_k$ into the vector $s'$ and calculate $a$ according to $a \leftarrow K'^{-1} \cdot s'$; the first $l$ elements are $l$ secret values, which are added to Secrets;
(13) Transform each value in Secrets into the binary form of $h$ bits to obtain $B$;
(14) return $B$

The Microsoft COCO [29] dataset for object recognition, captioning, and segmentation is used to evaluate GTSS's performance. Our corpus is the portion of the dataset used for image captions, which contains 591,753 sentences. Perplexity (ppl), as shown in (9), is used to evaluate the fluency of shadow text. A smaller ppl indicates that the statistical distribution of the generated shadow text is closer to that of natural text and hence that the text quality is higher. Since we use BoW as the attribute model to control the topic by adjusting the conditional probability distribution at each time step, we evaluate topic relevance (TR) to $topic_i$ using the percentage of words belonging to $BoW_i$ in the shadow text, as shown in the following equation:

$$ TR_i = \frac{N_{BoW_i}}{N}, $$

where $TR_i$ describes the topic relevance between $shadow_i$ and $topic_i$, $N$ is the length of the shadow text, and $N_{BoW_i}$ is the number of words in the shadow text that belong to $BoW_i$.
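Both evaluation metrics are simple to compute once the per-word probabilities and the bag of words are available. The Python sketch below is illustrative only: the function names are hypothetical, perplexity is computed from a list of conditional probabilities in its standard per-word form, and splitting on whitespace, lowercasing, and stripping punctuation before counting BoW words are assumptions, since the article does not specify its tokenization.

```python
import math


def perplexity(cond_probs):
    """Per-word perplexity: exp of the negative mean log conditional probability."""
    n = len(cond_probs)
    return math.exp(-sum(math.log(p) for p in cond_probs) / n)


def topic_relevance(shadow_text, bow):
    """Fraction of words in the shadow text that belong to the bag of words."""
    words = [w.strip(".,!?;:\"'").lower() for w in shadow_text.split()]
    words = [w for w in words if w]
    return sum(w in bow for w in words) / len(words)


# Hypothetical inputs
ppl = perplexity([0.25, 0.25, 0.25, 0.25])
tr = topic_relevance("The rocket reached orbit.", {"rocket", "orbit", "planet"})
```

A sequence whose words all have conditional probability 0.25 has a perplexity of 4, and a four-word sentence with two BoW words has a topic relevance of 0.5.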

Language Model.
A transformer-based GPT-2 model [28] with 345M parameters is used for text generation.

Some Examples.
The hyperparameters of GTSS include (k, n), $l$, $topic_i$, $h$, and $prefix_i$. Below we show some examples of shadow texts under different values of these parameters (Tables 2-5). The secret text to share is "Secret message." The target topics of the shadow texts are colored and bracketed (e.g., [science]). The words of BoW are highlighted brightly (e.g., evolution). A softer highlight is used for words related to the topic but not in BoW (e.g., brain). The prefix of every sentence is underlined (e.g., It has been shown).
To further demonstrate the scalability of the shadow text generated by GTSS in terms of topic control, we add a sample with the (2, 2) threshold, where the topics of the first shadow text are restricted to "space" and "military," and the topics of the second shadow text are restricted to "technology" and "science." As shown in Table 6, the shadow texts generated by GTSS can satisfy multiple topics at the same time.

Ablation Study.
An ablation study was conducted on five variants:
B: the baseline with no topic control method, where $a_l, \ldots, a_{k-1}$ are randomly selected;
BP: the variant with no topic control method under the constraint of GPM-ppl;
BT: the variant with the topic control method, where $a_l, \ldots, a_{k-1}$ are randomly selected;
BTP: the GTSS scheme with the topic control method under the constraint of GPM-ppl;
BTT: the GTSS scheme with the topic control method under the constraint of GPM-topic.
In order to identify the average perplexity and topic relevance of each shadow text, we randomly select sentences from "Microsoft Coco" as the secret texts. Tables 7 and 8 show the experimental results.
Following are the conclusions we can draw from the above experimental results.
(i) Through GTSS's topic control mechanism, words that match a specific topic are more likely to be selected when creating shadow texts, resulting in shadow texts that satisfy the topic.
(ii) Because of the topic control, the modified probability distribution does not match the training samples, so the BT method, which lacks the optimization of a goal programming model, has the poorest text quality.
(iii) The shadow texts of the BP method optimized by GPM-ppl possess the lowest perplexity and highest quality. In comparison with both BT and BTT, the GPM-ppl optimized BTP method has a lower perplexity.
(iv) The shadow texts of the BTT method optimized by GPM-topic possess the highest topic relevance.

Comparative Experiment.
Although we design a text secret sharing scheme in this article, the shared value is mapped to word space in generating shadow text, which inevitably afects the normal text generation process so that it will have a certain impact on the concealment of shadow text. In this section, to examine the concealment of shadow text, we conduct experiments on GTSS and two classical text steganography schemes Bins [25] and FLC [13] in terms of embedding rate (ER) and perplexity (ppl). Te embedding rate is the average number of efective secret bits carried by each word in the text. Te Bins scheme divides the word space into blocks and then encodes the blocks. In the text generation process, the corresponding block is determined according to the secret bitstream, and the appropriate word is selected for output, thereby completing the embedding of the secret bits. Terefore, the ER of the Bins scheme is related to the size of block block size. Te larger the block size, the more secret bits each word can carry, and the larger the ER, and ER � block size. Te FLC scheme performs perfect binary tree coding on the candidate pool and then output the codeword according to the secret bits, so ER � tree height. GTSS will share l units of the secret message (a total of l × h bits) into n shared values at a time, and the length of each unit of the secret message is h bits, and then map the shared values to the word space through the perfect binary tree. Each word in the shadow text corresponds to a shared value, so ER � h · (l/n) in GTSS.
It can be concluded from the tables and figures that, for the same scheme, perplexity tends to increase as the embedding rate increases, so the quality of the text declines. For GTSS and the FLC scheme, as h and the tree height increase, the candidate word space becomes larger and words with small conditional probability are more likely to be selected, so the overall perplexity increases. For the Bins scheme, as the block size increases, the number of words in each block decreases, so a block may contain no word that fits the preceding text, and the text quality decreases.
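The link between selecting low-probability words and rising perplexity can be made concrete with a small sketch. It assumes the usual definition of perplexity as the exponential of the mean negative log-probability of the generated words; the helper name `perplexity` and the probability lists are illustrative and are not experimental data.

```python
import math


def perplexity(token_probs):
    """exp of the mean negative log-probability of the chosen words."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


# Made-up conditional probabilities of the words in two short texts.
likely = [0.5, 0.5, 0.5, 0.5]        # every word was a probable choice
unlikely = [0.5, 0.05, 0.5, 0.05]    # half the words were improbable choices

print(perplexity(likely), perplexity(unlikely))
```

The second text has the higher perplexity even though only half of its words were forced to low-probability candidates, which matches the trend described for larger candidate pools.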
For GTSS at the same embedding rate, the higher the threshold (k, n), the better the text quality. This is because ER = h · (l/n) in GTSS and we choose l = n − 1 in the experiment, so $\lim_{n\to\infty} h \cdot (l/n) = \lim_{n\to\infty} h \cdot ((n-1)/n) = h$; the higher the threshold, the closer the embedding rate is to h. For equal h, that is, for an equal candidate word space size, the high-threshold GTSS scheme has a higher embedding rate, so at the same ER the text generated by the high-threshold scheme is of higher quality. The quality of the text generated by GTSS-BTP is better than that of the Bins scheme under all tested thresholds.
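The effect of the threshold on the embedding rate can be checked numerically. The sketch below only evaluates the formula ER = h · (l/n) with l = n − 1 as stated above; the function name `gtss_embedding_rate` and the choice h = 4 are illustrative assumptions.

```python
from fractions import Fraction


def gtss_embedding_rate(h, l, n):
    """ER = h * (l / n): l secret units of h bits are shared into n values."""
    return h * Fraction(l, n)


h = 4  # bits per secret unit, chosen only for illustration
for n in (2, 4, 6, 100):
    er = gtss_embedding_rate(h, n - 1, n)
    print(n, er, float(er))
```

With h fixed, the rate rises from h/2 at n = 2 towards h as n grows, so a high-threshold scheme reaches a given ER with a smaller h and hence a smaller candidate pool.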
Under the (4, 4) and (6, 6) thresholds, when ER is relatively large, the text quality of GTSS-BTP is better than that of the FLC scheme. At the (2, 2) threshold of GTSS-BTT, the text quality drops sharply when the ER is greater than 4. At high thresholds, the text quality of GTSS-BTT still outperforms the Bins scheme and is not much different from that of the FLC scheme.
GTSS has an advantage over existing text steganography schemes in terms of the ppl of the generated text at higher thresholds under the same embedding rate.
TP is the number of positive samples predicted correctly, TN is the number of negative samples predicted correctly, FP is the number of negative samples predicted incorrectly, and FN is the number of positive samples predicted incorrectly. The closer the Accuracy of a scheme is to 0.5, or the smaller the values of Precision, Recall, and F1-score, the more resistant the generated steganographic text or shadow text is to detection. The related results are shown in Table 12.
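For reference, the four metrics can be computed from these counts as sketched below. The formulas are the standard definitions of Accuracy, Precision, Recall, and F1-score, assumed here to be the ones used in the experiment; the function name `detection_metrics` and the counts are made up and do not reproduce Table 12.

```python
def detection_metrics(tp, tn, fp, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no samples")
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a detector never predicts positive
    # or when there are no positive samples.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical detector that does no better than guessing on 200 texts.
print(detection_metrics(tp=50, tn=50, fp=50, fn=50))
```

A detector with counts like these has an Accuracy of 0.5, which is the situation in which the generated text is hardest to distinguish from ordinary text.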
We can see that the Accuracy of Bins, FLC, and GTSS is close to 50%, indicating that these schemes have some resistance to detection. However, on the remaining three metrics, Precision, Recall, and F1-score, the variants of GTSS perform better than Bins and FLC, which indicates that the shadow text generated by GTSS resists detection better than the text generated by the Bins and FLC schemes.

Conclusions
A text secret sharing scheme is proposed in this article, in which the secret message is divided into n topic-controlled and fluent shadow texts, and any k shadow texts can reconstruct the secret message. First, we encrypt the secret message using matrix theory to obtain shared values. Then we use BoW to modify the conditional probability distribution in order to increase the probability of words that match the topic. Shadow texts are generated by mapping the shared values into word space using the perfect binary tree. Most importantly, we propose two goal programming models that deeply integrate secret sharing, encoding, and controllable text generation techniques. The two GPMs enhance the fluency and the topic relevance of the shadow text, respectively. We discuss two application scenarios of this scheme: multi-channel transmission and access control of the secret message. Our experimental section illustrates the effectiveness of GTSS through examples and ablation studies. Comparative and anti-detection experiments show that the text generated by GTSS has good quality and anti-detection ability. However, the proposed scheme still has some deficiencies that need to be addressed in the future.
(i) The SS scheme used in GTSS can only satisfy the (k, k) and (k, k + 1) thresholds, which limits the values of the threshold parameters in the current GTSS scheme. Operations in the finite field GF(p^n) are polynomial operations, which avoid the restriction in the prime field GF(p) that the modulus must be prime, so using GF(p^n) in GTSS may extend the threshold parameters.
(ii) A modified or deleted word in a shadow text will cause GTSS to fail to find the corresponding word in its candidate pool at some point, which in turn affects the extraction of shared values and the recovery of the secret message. There is an urgent need to improve the shadow text's ability to withstand word modification or deletion attacks.
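The finite-field remark in item (i) can be illustrated with a small sketch of polynomial arithmetic. It assumes the specific field GF(2^8) with the reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B, the one used by AES); the source does not say which field or polynomial a future GTSS variant would use, and the function name `gf256_mul` is hypothetical.

```python
def gf256_mul(a, b, modulus=0x11B):
    """Multiply two elements of GF(2^8), each stored as an 8-bit integer.

    Bits are polynomial coefficients over GF(2): addition is XOR, and the
    product is reduced modulo the irreducible polynomial `modulus`.
    """
    result = 0
    while b:
        if b & 1:          # lowest remaining term of b is present
            result ^= a    # add (XOR) the current multiple of a
        a <<= 1            # multiply a by x
        if a & 0x100:      # degree reached 8: reduce modulo the polynomial
            a ^= modulus
        b >>= 1
    return result


def gf256_add(a, b):
    """Addition in GF(2^8) is coefficient-wise XOR."""
    return a ^ b


print(hex(gf256_mul(0x57, 0x83)))
```

Every byte value is a valid field element here even though 256 is not prime, which is the property item (i) refers to: the element range no longer has to be tied to a prime modulus.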

Conflicts of Interest
The authors declare that they have no conflicts of interest.