A quantitative structure-activity relationship (QSAR) model of angiotensin-converting enzyme (ACE)-inhibitory peptides was built with an artificial neural network (ANN) approach based on structural and activity data of 58 dipeptides (including peptide activity, hydrophilic amino acid content, three-dimensional shape, size, and electrical parameters). The overall correlation coefficient of the predicted versus actual data points is
In recent years, some progress has been made in the bioinformatics study of functional peptide preparation, such as comparison of active peptide sequences in databases, selection of hydrolytic enzymes, simulated hydrolysis, and activity prediction of hydrolysates [
Besides comparing characterized peptide sequences in databases, peptide quantitative structure-activity relationship (QSAR) models could also be used in peptide bioinformatics study. QSAR models are mathematical functions that describe the relationship between activity and chemical structure expressed by variables. Such models are applied both to predict activity of untested chemical structures and to predict the chemical structure of compounds with specific activity [
An artificial neural network (ANN) is an interdisciplinary technique involving biology, mathematics, physics, electronics, and computer technology. It is an information processing system that imitates the structure and function of brain networks. An ANN can approximate nonlinear processes and can therefore avoid the shortcomings of linear models [
In this study, using the preparation of ACE-inhibitory peptides from defatted wheat germ protein (DWGP) as an example, a QSAR model was built with an ANN. The structural characteristics of the ACE-inhibitory peptides were investigated according to the model. Based on the analysis of these structural characteristics and the experimental results of DWGP digestion, an appropriate protease was selected to produce high-activity ACE-inhibitory peptides from DWGP isolates.
Defatted wheat germ protein was purchased from Man Tian Xue Flour Industry (Henan, China). Alcalase 2.4 LFG (2.670 AU/g) and medium temperature amylase 480 L (527.50 KNU/g) were purchased from Novo Co. (Shanghai, China). Angiotensin I-converting enzyme (ACE; EC 3.4.15.1) was purchased from Sigma Chemical Co. (St. Louis, MO, U.S.A.). N-(3-[2-Furyl]Acryloyl)-Phe-Gly-Gly (FAPGG) was purchased from Fluka Chemical Corp. (Milwaukee, WI, U.S.A.). All other reagents were of analytical grade.
The instruments used were as follows: thermostat-controlled water-bath (model HH), Jintan Zhongda Instruments Co., Ltd. (Jintan, Jiangsu, China); pH meter (model PHS-3C), Shanghai Precision & Scientific instrument Co., Ltd. (Shanghai, China); electrothermal blast drying oven, Shanghai Laboratory Instrument Works Co., Ltd. (Shanghai, China); Agilent 1100 HPLC, Agilent Technologies Inc. (Santa Clara, CA, U.S.A.); SPX-250B biochemistry incubator, Changzhou Guohua Electric Co., Ltd. (Changzhou, China); Multiskan Spectrum Microplate Reader, Thermo Scientific Inc. (Hudson, NH, U.S.A.).
DWGP isolates were prepared according to the method described by XIN Zhi-hong [
Ten grams of DWGP was dispersed in 1 L of distilled water and digested batchwise by Alcalase at pH 9.0 and 50°C or by Neutrase at pH 7.0 and 50°C, both at an enzyme/substrate mass ratio ([E]/[S]) of 8%. Samples were collected at 0.5, 1, 1.5, 2, 3, 4, and 5 h and were immediately heated in a boiling water bath for 10 min. After cooling, the samples were centrifuged at 10,000 r/min for 15 min, and the supernatants were diluted with distilled water to determine their ACE-inhibitory activities.
In this study,
Amino acid | Code | Z1 | Z2 | Z3 | Amino acid | Code | Z1 | Z2 | Z3 |
---|---|---|---|---|---|---|---|---|---|
Ala | A | 0.07 | −1.73 | 0.09 | His | H | 2.41 | 1.74 | 1.11 |
Val | V | −2.69 | −2.53 | −1.29 | Gly | G | 2.23 | −5.36 | 0.30 |
Leu | L | −4.19 | −1.03 | −0.98 | Ser | S | 1.96 | −1.63 | 0.57 |
Ile | I | −4.44 | −1.68 | −1.03 | Thr | T | 0.92 | −2.09 | −1.40 |
Pro | P | −1.22 | 0.88 | 2.23 | Cys | C | 0.71 | −0.97 | 4.13 |
Phe | F | −4.92 | 1.30 | 0.45 | Tyr | Y | −1.39 | 2.32 | 0.01 |
Trp | W | −4.75 | 3.65 | 0.85 | Asn | N | 3.22 | 1.45 | 0.84 |
Met | M | −2.49 | −0.27 | −0.41 | Gln | Q | 2.18 | 0.53 | −1.14 |
Lys | K | 2.84 | 1.41 | −3.14 | Asp | D | 3.64 | 1.13 | 2.36 |
Arg | R | 2.88 | 2.52 | −3.44 | Glu | E | 3.08 | 0.39 | −0.07 |
Fifty-eight ACE-inhibitory dipeptides and their activity data (50% inhibitory concentration on ACE, i.e., the IC50 value) were used in this study and are shown in Table
The ACE-inhibitory peptides’ sequences with
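Peptide | Log(1/IC50) | Z1 (N-terminal) | Z2 (N-terminal) | Z3 (N-terminal) | Z1 (C-terminal) | Z2 (C-terminal) | Z3 (C-terminal) |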
---|---|---|---|---|---|---|---|
AA | 3.21 | 0.07 | −1.73 | 0.09 | 0.07 | −1.73 | 0.09 |
AW | 5 | 0.07 | −1.73 | 0.09 | −4.75 | 3.65 | 0.85 |
DG | 1.85 | 3.64 | 1.13 | 2.36 | 2.23 | −5.36 | 0.3 |
GF | 3.2 | 2.23 | −5.36 | 0.3 | −4.92 | 1.3 | 0.45 |
GP | 3.35 | 2.23 | −5.36 | 0.3 | −1.22 | 0.88 | 2.23 |
GR | 2.49 | 2.23 | −5.36 | 0.3 | 2.88 | 2.52 | −3.44 |
GW | 4.52 | 2.23 | −5.36 | 0.3 | −4.75 | 3.65 | 0.85 |
GY | 3.68 | 2.23 | −5.36 | 0.3 | −1.39 | 2.32 | 0.01 |
IF | 3.03 | −4.44 | −1.68 | −1.03 | −4.92 | 1.3 | 0.45 |
IW | 5.7 | −4.44 | −1.68 | −1.03 | −4.75 | 3.65 | 0.85 |
IY | 5.43 | −4.44 | −1.68 | −1.03 | −1.39 | 2.32 | 0.01 |
RF | 3.64 | 2.88 | 2.52 | −3.44 | −4.92 | 1.3 | 0.45 |
RP | 1.1818 | 2.88 | 2.52 | −3.44 | −1.22 | 0.88 | 2.23 |
VG | 2.96 | −2.69 | −2.53 | −1.29 | 2.23 | −5.36 | 0.3 |
VW | 1.6 | −2.69 | −2.53 | −1.29 | −4.75 | 3.65 | 0.85 |
VY | 4.66 | −2.69 | −2.53 | −1.29 | −1.39 | 2.32 | 0.01 |
YG | 2.7 | −1.39 | 2.32 | 0.01 | 2.23 | −5.36 | 0.3 |
RW | 4.8 | 2.88 | 2.52 | −3.44 | −4.75 | 3.65 | 0.85 |
AY | 4.28 | −2.69 | −2.53 | −1.29 | −4.92 | 1.3 | 0.45 |
RP | 3.89 | −4.44 | −1.68 | −1.03 | −1.22 | 0.88 | 2.23 |
AF | 3.72 | 0.07 | −1.73 | 0.09 | −4.92 | 1.3 | 0.45 |
AP | 3.64 | 0.07 | −1.73 | 0.09 | −1.22 | 0.88 | 2.23 |
VP | 3.38 | −2.69 | −2.53 | −1.29 | −1.22 | 0.88 | 2.23 |
IG | 2.92 | −4.44 | −1.68 | −1.03 | 2.23 | −5.36 | 0.3 |
GI | 2.92 | 2.23 | −5.36 | 0.3 | −4.44 | −1.68 | −1.03 |
GM | 2.85 | 2.23 | −5.36 | 0.3 | −2.49 | −0.27 | −0.41 |
GA | 2.7 | 2.23 | −5.36 | 0.3 | 0.07 | −1.73 | 0.09 |
GL | 2.6 | 2.23 | −5.36 | 0.3 | −4.19 | −1.03 | −0.98 |
AG | 2.6 | 0.07 | −1.73 | 0.09 | 2.23 | −5.36 | 0.3 |
GH | 2.51 | 2.23 | −5.36 | 0.3 | 2.41 | 1.74 | 1.11 |
KG | 2.49 | 2.84 | 1.41 | −3.14 | 2.23 | −5.36 | 0.3 |
FG | 2.43 | −4.92 | 1.3 | 0.45 | 2.23 | −5.36 | 0.3 |
GS | 2.42 | 2.23 | −5.36 | 0.3 | 1.96 | −1.63 | 0.57 |
GV | 2.34 | 2.23 | −5.36 | 0.3 | −2.69 | −2.53 | −1.29 |
MG | 2.32 | −2.49 | −0.27 | −0.41 | 2.23 | −5.36 | 0.3 |
GK | 2.27 | 2.23 | −5.36 | 0.3 | 2.84 | 1.41 | −3.14 |
GE | 2.27 | 2.23 | −5.36 | 0.3 | 3.08 | 0.39 | −0.07 |
GT | 2.24 | 2.23 | −5.36 | 0.3 | 0.92 | −2.09 | −1.4 |
WG | 2.23 | −4.75 | 3.65 | 0.85 | 2.23 | −5.36 | 0.3 |
HG | 2.2 | 2.41 | 1.74 | 1.11 | 2.23 | −5.36 | 0.3 |
GQ | 2.15 | 2.23 | −5.36 | 0.3 | 2.18 | 0.53 | −1.14 |
GG | 2.14 | 2.23 | −5.36 | 0.3 | 2.23 | −5.36 | 0.3 |
QG | 2.13 | 2.18 | 0.53 | −1.14 | 2.23 | −5.36 | 0.3 |
SG | 2.07 | 1.96 | −1.63 | 0.57 | 2.23 | −5.36 | 0.3 |
LG | 2.06 | −4.19 | −1.03 | −0.98 | 2.23 | −5.36 | 0.3 |
GD | 2.04 | 2.23 | −5.36 | 0.3 | 3.64 | 1.13 | 2.36 |
TG | 2 | 0.92 | −2.09 | −1.4 | 2.23 | −5.36 | 0.3 |
EG | 2 | 3.08 | 0.39 | −0.07 | 2.23 | −5.36 | 0.3 |
PG | 1.77 | −1.22 | 0.88 | 2.23 | 2.23 | −5.36 | 0.3 |
LA | 3.51 | −4.19 | −1.03 | −0.98 | 0.07 | −1.73 | 0.09 |
KA | 3.42 | 2.84 | 1.41 | −3.14 | 0.07 | −1.73 | 0.09 |
RA | 3.34 | 2.88 | 2.52 | −3.44 | 0.07 | −1.73 | 0.09 |
YA | 3.34 | −1.39 | 2.32 | 0.01 | 0.07 | −1.73 | 0.09 |
FR | 3.04 | −4.92 | 1.3 | 0.45 | 2.88 | 2.52 | −3.44 |
HL | 2.49 | 2.41 | 1.74 | 1.11 | −4.19 | −1.03 | −0.98 |
DA | 2.42 | 3.64 | 1.13 | 2.36 | 0.07 | −1.73 | 0.09 |
EA | 2 | 3.08 | 0.39 | −0.07 | 0.07 | −1.73 | 0.09 |
DM | 2.7782 | 3.64 | 1.13 | 2.36 | −2.49 | −0.27 | −0.41 |
IP | 3.89 | −4.44 | −1.68 | −1.03 | −1.22 | 0.88 | 2.23 |
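The six descriptors in each row of the dipeptide table are the three per-residue values of the first amino acid followed by those of the second. A minimal sketch of that encoding is given below; the names `Z_SCALES` and `encode_dipeptide` are hypothetical and introduced only for illustration, and the values are copied from the amino acid descriptor table above.

```python
# Per-residue descriptors (three values each), copied from the amino acid table.
Z_SCALES = {
    "A": (0.07, -1.73, 0.09),   "H": (2.41, 1.74, 1.11),
    "V": (-2.69, -2.53, -1.29), "G": (2.23, -5.36, 0.30),
    "L": (-4.19, -1.03, -0.98), "S": (1.96, -1.63, 0.57),
    "I": (-4.44, -1.68, -1.03), "T": (0.92, -2.09, -1.40),
    "P": (-1.22, 0.88, 2.23),   "C": (0.71, -0.97, 4.13),
    "F": (-4.92, 1.30, 0.45),   "Y": (-1.39, 2.32, 0.01),
    "W": (-4.75, 3.65, 0.85),   "N": (3.22, 1.45, 0.84),
    "M": (-2.49, -0.27, -0.41), "Q": (2.18, 0.53, -1.14),
    "K": (2.84, 1.41, -3.14),   "D": (3.64, 1.13, 2.36),
    "R": (2.88, 2.52, -3.44),   "E": (3.08, 0.39, -0.07),
}


def encode_dipeptide(seq):
    """Return six descriptors: three for residue 1, then three for residue 2."""
    if len(seq) != 2:
        raise ValueError("expected a dipeptide given as two one-letter codes")
    first, second = seq.upper()
    return list(Z_SCALES[first] + Z_SCALES[second])


aw = encode_dipeptide("AW")  # matches the AW row of the dipeptide table
```

Encoding every peptide this way yields a fixed-length input vector, which is what allows a network with six input neurons to be used.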
Because the input parameters have quite different physical meanings, the sample data were normalized with the following formula to accelerate network convergence and reduce the risk of overfitting:
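The normalization formula itself is not reproduced in this text. As an illustration only, the sketch below assumes column-wise min-max scaling to [0, 1], which is a common choice before network training; the authors' actual formula may differ. The function name `min_max_normalize` is hypothetical.

```python
def min_max_normalize(rows):
    """Scale each column of `rows` (list of equal-length lists) to [0, 1].

    Assumption: min-max scaling; this is not confirmed as the paper's formula.
    """
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    scaled = [
        [
            (x - lo) / (hi - lo) if hi > lo else 0.0  # constant column maps to 0
            for x, lo, hi in zip(row, lows, highs)
        ]
        for row in rows
    ]
    return scaled, lows, highs


# Descriptor rows of AA, AW, and DG from the dipeptide table.
rows = [
    [0.07, -1.73, 0.09, 0.07, -1.73, 0.09],
    [0.07, -1.73, 0.09, -4.75, 3.65, 0.85],
    [3.64, 1.13, 2.36, 2.23, -5.36, 0.30],
]
scaled, lows, highs = min_max_normalize(rows)
```

The minima and maxima are returned so that test samples can be scaled with the statistics of the training samples rather than their own.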
Thirty-nine dipeptides were randomly selected as training samples for the neural network model; the rest were test samples. Each of two peptides corresponding to 6
Diagram of BP-ANN model of ACE-inhibitory peptides’ QSAR.
A three-layer BP neural network model was built using the MATLAB (Matrix Laboratory) neural network toolbox. The transfer functions of the neurons in the hidden layer and the output layer were the Tansig and Purelin functions, respectively. Because a BP neural network may converge slowly or become trapped in a local minimum, the following settings were applied: (1) the network was trained with the gradient descent with momentum algorithm (Traingdm); (2) the training goal (mean square error) was set to 10⁻²; (3) the number of training steps was limited to 6000. The number of hidden layer neurons was determined through repeated verification.
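The configuration described above (tanh hidden layer, linear output, gradient descent with momentum, an MSE goal of 10⁻², and at most 6000 steps) can be sketched in plain Python as follows. This is an illustrative reimplementation, not the MATLAB toolbox code: the learning rate, momentum coefficient, weight initialization, and the classical momentum update are assumptions, and `train_bp` and `predict` are hypothetical names. The toy data are made up.

```python
import math
import random


def train_bp(X, y, n_hidden=7, lr=0.05, momentum=0.9,
             goal=1e-2, max_epochs=6000, seed=0):
    """Batch gradient descent with momentum on a tanh-hidden, linear-output net."""
    rng = random.Random(seed)
    n_in, n = len(X[0]), len(X)
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    vW1 = [[0.0] * n_in for _ in range(n_hidden)]  # momentum (velocity) terms
    vb1, vW2, vb2 = [0.0] * n_hidden, [0.0] * n_hidden, 0.0
    history = []
    for _ in range(max_epochs):
        gW1 = [[0.0] * n_in for _ in range(n_hidden)]
        gb1, gW2, gb2, sq_err = [0.0] * n_hidden, [0.0] * n_hidden, 0.0, 0.0
        for x, t in zip(X, y):
            h = [math.tanh(sum(w * xi for w, xi in zip(W1[j], x)) + b1[j])
                 for j in range(n_hidden)]
            err = sum(w * hj for w, hj in zip(W2, h)) + b2 - t
            sq_err += err * err
            gb2 += err
            for j in range(n_hidden):
                gW2[j] += err * h[j]
                delta = err * W2[j] * (1.0 - h[j] ** 2)  # tanh derivative
                gb1[j] += delta
                for i in range(n_in):
                    gW1[j][i] += delta * x[i]
        mse = sq_err / n
        history.append(mse)
        if mse <= goal:  # training goal reached
            break
        scale = lr / n  # gradient of 0.5 * MSE, averaged over samples
        for j in range(n_hidden):
            vW2[j] = momentum * vW2[j] - scale * gW2[j]
            W2[j] += vW2[j]
            vb1[j] = momentum * vb1[j] - scale * gb1[j]
            b1[j] += vb1[j]
            for i in range(n_in):
                vW1[j][i] = momentum * vW1[j][i] - scale * gW1[j][i]
                W1[j][i] += vW1[j][i]
        vb2 = momentum * vb2 - scale * gb2
        b2 += vb2
    return (W1, b1, W2, b2), history


def predict(params, x):
    """Forward pass for one sample."""
    W1, b1, W2, b2 = params
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    return sum(w * hj for w, hj in zip(W2, h)) + b2


# Toy data (made up for illustration): 20 samples with 6 inputs in [0, 1].
rng = random.Random(1)
X = [[rng.random() for _ in range(6)] for _ in range(20)]
y = [1.0 + x[0] - x[3] for x in X]
params, history = train_bp(X, y, max_epochs=500)
```

The momentum term reuses part of the previous update, which tends to smooth the trajectory and can help the search move through shallow local minima, the concern raised in the paragraph above.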
N-(3-[2-Furyl]Acryloyl)-Phe-Gly-Gly (FAPGG) was used as the substrate in the ACE-inhibition assay. The reagents were added sequentially for the test reaction according to Table
Reagents used in determination of ACE inhibiting activity.
Reagent | Blank (μL) | Sample (μL) |
---|---|---|
ACE (0.1 U/mL) | 10 | 10 |
FAPGG (1 mmol/L)¹ | 50 | 50 |
HEPES buffer² | 40 | 0 |
Sample | 0 | 40 |
¹FAPGG (1.0 mmol/L): prepared with 0.08 M HEPES buffer (pH 8.3) containing 0.3 M NaCl.
²HEPES buffer: 1.910 g HEPES and 1.755 g NaCl dissolved in double-distilled water, pH adjusted with NaOH, made up to 100 mL with double-distilled water, and stored at 4°C.
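As a quick consistency check, the masses given in the HEPES buffer footnote can be converted to molar concentrations and compared with the 0.08 M HEPES and 0.3 M NaCl stated for the FAPGG solution. The molar masses used below (238.30 g/mol for HEPES free acid and 58.44 g/mol for NaCl) are standard values that do not appear in the original text, and the helper `molarity` is hypothetical.

```python
HEPES_MW = 238.30  # g/mol, HEPES free acid (assumed form of the reagent)
NACL_MW = 58.44    # g/mol


def molarity(mass_g, molar_mass, volume_ml):
    """Concentration in mol/L for a solute mass dissolved to a final volume."""
    return mass_g / molar_mass / (volume_ml / 1000.0)


hepes_m = molarity(1.910, HEPES_MW, 100.0)  # close to 0.08 mol/L
nacl_m = molarity(1.755, NACL_MW, 100.0)    # close to 0.3 mol/L
```

Both values agree with the stated buffer composition to within rounding.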
The degree of hydrolysis (DH) was measured by the pH-stat method. Cleavage of peptide bonds during protein digestion lowers the pH of the hydrolysate, so alkali solution was added to the hydrolysate to keep the pH constant. From the amount of alkali consumed, the degree of hydrolysis of the protein and the number of cleaved peptide bonds can be calculated according to the following formula:
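The formula is not reproduced in this text. A widely used pH-stat expression (Adler-Nissen) is DH = B × N_b / (α × M_p × h_tot) × 100%, and the sketch below assumes that form; it is not confirmed as the authors' equation. The function names and all numerical inputs in the example (alkali volume, normality, pK, and h_tot) are hypothetical placeholders, not values from this study.

```python
def alpha_dissociation(ph, pk):
    """Average degree of dissociation of the alpha-amino groups at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pk - ph))


def degree_of_hydrolysis(base_ml, base_normality, protein_g, h_tot, ph, pk):
    """DH in percent, assuming the Adler-Nissen pH-stat expression.

    base_ml * base_normality gives mmol of alkali consumed; h_tot is the total
    number of peptide bonds in the substrate in mmol per g of protein.
    """
    alpha = alpha_dissociation(ph, pk)
    return base_ml * base_normality / (alpha * protein_g * h_tot) * 100.0


# Placeholder inputs for illustration only (not measured values).
dh = degree_of_hydrolysis(base_ml=2.0, base_normality=0.5,
                          protein_g=10.0, h_tot=8.0, ph=9.0, pk=7.1)
```

Because α approaches 1 well above the pK, the correction matters more for the Neutrase digestion at pH 7.0 than for the Alcalase digestion at pH 9.0.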
The amino acid composition of DWGP was determined by RP-HPLC with o-phthalaldehyde (OPA) precolumn derivatization [
In this study, networks with 4–10 hidden layer neurons were used to build the QSAR model, and each configuration was modeled five times in order to identify the optimal number of hidden layer neurons. Network convergence speed rises as the number of neurons increases, but too many or too few hidden layer neurons will decrease the generalization performance of the model. Provided that network convergence is guaranteed, a smaller number of neurons is preferred. The correlation coefficients
The correlation coefficient
Prediction results of the (6-7-1) BP network model on the prediction set.
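The selection of the hidden layer size described above can be sketched as follows: compute the correlation coefficient between actual and predicted activities for each run, average over the repeated runs, and prefer the smallest network whose mean is close to the best. The tolerance-based rule, the function names, and the numbers in the example are assumptions made for illustration; they are not the authors' procedure or results.

```python
import math
from statistics import mean


def pearson_r(actual, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mp = mean(actual), mean(predicted)
    cov = sum((a - ma) * (p - mp) for a, p in zip(actual, predicted))
    var_a = sum((a - ma) ** 2 for a in actual)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def pick_hidden_size(r_by_size, tol=0.01):
    """Smallest hidden layer size whose mean R is within `tol` of the best."""
    mean_r = {n: mean(rs) for n, rs in r_by_size.items()}
    best = max(mean_r.values())
    return min(n for n, r in mean_r.items() if best - r <= tol)


# Placeholder R values per hidden layer size (made up, not from the paper).
example = {4: [0.80, 0.82], 7: [0.92, 0.93], 10: [0.93, 0.92]}
chosen = pick_hidden_size(example)
```

Preferring the smaller network among near-equivalent candidates follows the stated principle that fewer neurons are preferred once convergence is guaranteed.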
The back-stepping method was used to identify the descriptor with the greatest impact on activity. The steps are as follows: (1) find the hidden layer neuron with the greatest impact on the output (activity); (2) find out which input neuron (specific
In Figure
The parametric diagram of BP network model.
After the most influential hidden layer neurons were identified, the input layer neurons with the greatest impact on them were sought. In Figure
Table
Neuron weights from the input layer to the hidden layer.
Hidden neuron | Input 1 | Input 2 | Input 3 | Input 4 | Input 5 | Input 6 |
---|---|---|---|---|---|---|
(1) | −0.3515 | 0.28331 | 0.28286 | −0.44043 | 0.037431 | −0.085447 |
(2) | 0.27249 | 0.041643 | −0.9822 | −0.087255 | 0.47412 | −0.18318 |
(3) | 0.57416 | −0.14092 | 0.60775 | −0.08924 | 0.11108 | −0.5922 |
(4) | 0.24392 | 0.21597 | −0.20115 | 0.24221 | −0.07429 | −0.54316 |
(5) | 0.23817 | 0.07212 | 0.68217 | −0.00285 | 0.26846 | −0.47522 |
(6) | −0.30588 | 0.21639 | 0.036361 | −0.41676† | −0.2514‡ | −0.23479 |
(7) | −0.090315 | −0.058403 | 0.0745 | −0.32458† | −0.24483‡ | −0.10795 |
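The second back-stepping step can be illustrated directly on the weight table: for a given hidden neuron, the inputs are ranked by the absolute value of their connection weights. The sketch below applies this to hidden neurons (6) and (7), the rows carrying the † and ‡ marks in the table. Treating these two rows as the ones of interest, and reading the columns as inputs 1 to 6 in order, are assumptions; `rank_inputs` is a hypothetical helper.

```python
# Input-to-hidden weights copied from the table above.
# Rows: hidden neurons (1)-(7); columns: inputs 1-6 (assumed order).
W_INPUT_HIDDEN = [
    [-0.3515, 0.28331, 0.28286, -0.44043, 0.037431, -0.085447],
    [0.27249, 0.041643, -0.9822, -0.087255, 0.47412, -0.18318],
    [0.57416, -0.14092, 0.60775, -0.08924, 0.11108, -0.5922],
    [0.24392, 0.21597, -0.20115, 0.24221, -0.07429, -0.54316],
    [0.23817, 0.07212, 0.68217, -0.00285, 0.26846, -0.47522],
    [-0.30588, 0.21639, 0.036361, -0.41676, -0.2514, -0.23479],
    [-0.090315, -0.058403, 0.0745, -0.32458, -0.24483, -0.10795],
]


def rank_inputs(weights_row):
    """Input indices (1-based) sorted by decreasing absolute weight."""
    order = sorted(range(len(weights_row)),
                   key=lambda i: abs(weights_row[i]), reverse=True)
    return [i + 1 for i in order]


flagged_hidden = (6, 7)  # assumption: the rows marked in the table
ranking = {h: rank_inputs(W_INPUT_HIDDEN[h - 1]) for h in flagged_hidden}
```

For both rows the largest absolute weight belongs to input 4. If inputs 4 to 6 are the descriptors of the second (C-terminal) residue, as in the dipeptide table, this agrees with the importance attributed to the C-terminal residue. Comparing raw weights in this way presumes that the inputs were normalized to a common scale.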
The DWGP contains 42.84% hydrophobic amino acids (Table
Amino acid composition of wheat germ protein isolates (g/100 g protein).
Amino acid | Content |
---|---|
Asp + Asn | 8.40 |
Glu + Gln | 15.28 |
Ser | 4.40 |
His | 3.15 |
Gly | 6.19 |
Thr | 3.94 |
Arg | 9.79 |
Ala | 6.41 |
Tyr | 2.97 |
Cys | 0.39 |
Val | 7.20 |
Met | 1.70 |
Phe | 5.22 |
Ile | 4.94 |
Leu | 8.07 |
Lys | 6.33 |
Pro | 5.63 |
Trp | 0.69 |
Hydrophobic amino acids | 42.84 |
Aromatic amino acids | 8.89 |
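The two summary rows can be checked against the individual entries of the table. The sketch below assumes that the hydrophobic class comprises Ala, Val, Leu, Ile, Met, Phe, Trp, Pro, and Tyr and that the aromatic class comprises Phe, Tyr, and Trp; the text does not list the class members, so both sets are assumptions chosen because they reproduce the reported totals to within rounding.

```python
# Amino acid content of the wheat germ protein isolate (g/100 g protein),
# copied from the table above.
CONTENT = {
    "Asp+Asn": 8.40, "Glu+Gln": 15.28, "Ser": 4.40, "His": 3.15, "Gly": 6.19,
    "Thr": 3.94, "Arg": 9.79, "Ala": 6.41, "Tyr": 2.97, "Cys": 0.39,
    "Val": 7.20, "Met": 1.70, "Phe": 5.22, "Ile": 4.94, "Leu": 8.07,
    "Lys": 6.33, "Pro": 5.63, "Trp": 0.69,
}

# Assumed class membership (not stated in the text).
HYDROPHOBIC = ["Ala", "Val", "Leu", "Ile", "Met", "Phe", "Trp", "Pro", "Tyr"]
AROMATIC = ["Phe", "Tyr", "Trp"]

hydrophobic_total = sum(CONTENT[aa] for aa in HYDROPHOBIC)
aromatic_total = sum(CONTENT[aa] for aa in AROMATIC)
```

Under these assumptions the sums differ from the tabulated 42.84 and 8.89 by about 0.01, which is consistent with rounding of the individual entries.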
Neutrase (a neutral protease) tends to hydrolyze proteins into peptides whose C-terminal residues are hydrophobic amino acids, such as Tyr, Trp, or Phe. Alcalase (an alkaline protease) tends to hydrolyze proteins into peptides whose C-terminal residues are uncharged amino acids with large side chains (aromatic and aliphatic amino acids), such as Ile, Leu, Val, Met, Phe, Tyr, or Trp. Moreover, hydrolysis is accelerated when the N-terminal residues of the peptides are hydrophobic amino acids [
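The cleavage preferences described above can be turned into a simple simulated digestion: cut the sequence after every residue that the protease prefers at the C-terminus. The sketch below is a strong simplification (real proteases cleave incompletely and with broader specificity), the residue sets are taken only from the examples named in the text, and the sequence is made up. The name `digest` is hypothetical.

```python
# Preferred C-terminal residues, taken from the examples named in the text.
ALCALASE_SITES = set("ILVMFYW")
NEUTRASE_SITES = set("YWF")


def digest(sequence, sites):
    """Cut `sequence` after every residue in `sites`; return the fragments."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in sites:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):  # keep the remaining C-terminal fragment
        peptides.append(sequence[start:])
    return peptides


toy = "GAVLKPFGRYE"  # made-up sequence, not a wheat germ protein
alcalase_peptides = digest(toy, ALCALASE_SITES)
neutrase_peptides = digest(toy, NEUTRASE_SITES)
```

With the broader residue set, the Alcalase rule produces more and shorter fragments than the Neutrase rule on the same sequence, and each fragment except possibly the last ends in one of the preferred residues.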
The degree of hydrolysis of DWGP hydrolysate treated with Alcalase and Neutrase.
The ACE-inhibitory activity of DWGP hydrolysate treated with Alcalase and Neutrase.
From Figure
Based on data on the activity, hydrophilic amino acid content, three-dimensional shape, size, and electrical parameters of 58 dipeptides, a quantitative structure-activity relationship (QSAR) model of ACE-inhibitory peptides was built with an ANN; the correlation coefficient was 0.928. Analysis of the ANN model showed that (1) the C-terminal residue is of primary importance to ACE-inhibitory activity; (2) proteins rich in hydrophobic amino acids are potentially good sources of ACE-inhibitory peptides; (3) for DWGP, Alcalase is a suitable protease for the preparation of ACE-inhibitory peptides.
The authors acknowledge support from grants of the China Postdoctoral Science Foundation (20100471386), the Jiangsu Postdoctoral Grant (0902029C), and the Research Foundation for Talented Scholars of Jiangsu University (08JDG032). This study is purely academic and received no financial support from any company.