Rapid Determination of Escherichia coli Concentration in Water Using Multiwavelength Transmission Spectroscopy

Introduction
Bacterial microbes, as primary pollutants in water, play a crucial role in assessing water quality and safety. Consumption of water contaminated with excessive concentrations of bacteria can lead to various infectious diseases, such as hepatitis, influenza, SARS, pneumonia, gastric ulcers, and respiratory illnesses [1,2]. E. coli is recognized as one of the principal contributors to water contamination. Therefore, rapid and accurate detection of bacterial concentration in water provides valuable insights for effective prevention and control of water pollution.
The methods for measuring bacterial concentration in water include standard plate counting [3], fluorescence microscopy [4], and digital image analysis. Although these methods yield relatively accurate results, they require complex sample preparation, are time-consuming, and involve expensive biological reagents, making them unsuitable for real-time and on-site detection. To achieve rapid and automated detection of bacterial concentration in water, researchers at home and abroad have explored spectrophotometry [5,6] and flow cytometry [7,8]. Spectrophotometry determines bacterial concentration from the absorbance at a specific wavelength, which is a fast and simple approach; however, the calculated concentration is influenced by how strongly the sample absorbs at the selected wavelength, leading to high detection limits or low accuracy. Flow cytometry quantitatively analyzes bacterial concentration based on scattering spectra measured at different angles and on fluorescence spectra from exogenous labeling. This method offers high measurement speed and accuracy but requires costly instrumentation as well as trained personnel for operation.
Multiwavelength transmission spectroscopy is a novel technique for rapidly determining bacterial concentration in water. It provides rich spectral information, improving the accuracy of the calculated results and lowering the detection limit for bacterial concentration. However, directly employing full-spectrum data for quantitative analysis poses challenges due to high computational requirements, slow processing speeds, and susceptibility to noise interference in certain spectral regions. Therefore, effective spectral bands for bacterial microbes are selected, and a quantitative model is established between the concentration and the spectral data of those bands, in order to achieve rapid and accurate measurement of bacterial concentration.
Currently, there are several linear quantitative modeling algorithms for spectroscopy, including multiple linear regression (MLR) [9], principal component regression (PCR) [10], and partial least squares regression (PLSR) [11]. In spectroscopic analysis, MLR suffers from noise interference and from limitations on the number of spectral bands, so the modeling bands must be selected manually on the basis of empirical knowledge or repeated trials, which entails a substantial workload. PCR compresses the full-spectrum information and selects a small number of independent bands to establish a regression model, but it fails to consider the correlation between the extracted principal components and the target substance to be measured. PLSR compresses the full-spectrum information and selects principal components that are highly correlated with the target substance, projecting them in the direction of the target concentration to be measured. Although PLSR is sensitive to the presence of anomalous data [12], if no abnormal data are found in the measured samples, it can be used effectively to extract quantitative information about the sample. It is therefore the most widely used and effective method in current spectroscopic linear quantitative modeling.
In this study, multiwavelength transmission spectroscopy combined with PLSR is employed to detect E. coli concentration in water. First, the spectral characteristics of E. coli suspensions at different concentrations are measured and analyzed. Then, the sensitivity, correlation, and detection limit of the bacterial spectra with respect to concentration are calculated at different wavelengths in order to select the optimal spectral bands. Finally, the bacterial concentration is calculated with the PLSR algorithm, and the feasibility and accuracy of the method are analyzed.

Theoretical Basis.
In multiwavelength transmission spectroscopy, light of different wavelengths passes through water containing bacterial cells. Because the bacteria absorb and scatter light, the transmitted light intensity is attenuated, and the degree of attenuation is related to the concentration and size of the bacteria. Therefore, bacterial concentration can be determined by analyzing multiwavelength transmission spectra.
Assuming that light scattering by each measured bacterial cell satisfies the condition of independent single scattering, according to Mie scattering theory, the optical density of the bacterial cell population at a given wavelength can be expressed as follows [13]:

$$\tau(\lambda) = N_p \, l \, \frac{\pi}{4} \int_0^\infty Q_{ext}(m(\lambda), D) \, D^2 f(D) \, dD \qquad (1)$$

where f(D) represents the particle size distribution of the bacteria, l is the path length, N_p is the number of cells per unit volume, and Q_ext(m(λ), D) represents the total extinction coefficient, which is a function of the incident light wavelength λ, the bacteria diameter D, and the relative refractive index m(λ). In order to better characterize the relationship between the measured spectral values and bacterial concentration, the bacterial suspension is assumed to consist of monodisperse cells. The transmission spectrum of the bacterial microorganisms can then be expressed as follows:

$$\tau(\lambda) = N_p \, l \, \frac{\pi}{4} \, Q_{ext}(m(\lambda), D_{32}) \, D_{32}^2 \qquad (2)$$

where D_32 represents the average equivalent particle size of the bacteria. As indicated by equation (2), the measured optical density at different wavelengths contains information about bacteria size and concentration. Furthermore, there is a linear relationship between the measured optical density and bacterial concentration. These observations provide a foundation for establishing a quantitative model between multiwavelength transmission spectra and the target bacterial concentration.
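The linear dependence of optical density on concentration in the monodisperse model can be illustrated with a minimal Python sketch. It assumes the standard turbidity form, optical density = N_p · l · (π/4) · Q_ext · D_32^2, and the values chosen for Q_ext and D_32 are hypothetical placeholders, not values measured in this study.

```python
import math


def optical_density(n_p, q_ext, d32_um, path_cm=1.0):
    """Turbidity-type optical density of a monodisperse suspension.

    n_p     : cells per ml (equivalently per cm^3)
    q_ext   : extinction coefficient at the wavelength of interest (assumed)
    d32_um  : equivalent particle diameter in micrometres (assumed)
    path_cm : optical path length in cm
    """
    d_cm = d32_um * 1e-4                        # micrometres to centimetres
    cross_section = math.pi / 4.0 * d_cm ** 2   # geometric cross-section, cm^2
    return n_p * path_cm * q_ext * cross_section


Q_EXT = 2.0    # hypothetical value, for illustration only
D32_UM = 1.5   # hypothetical value, for illustration only

od_low = optical_density(1.0e7, Q_EXT, D32_UM)
od_high = optical_density(2.0e7, Q_EXT, D32_UM)
ratio = od_high / od_low   # doubling the concentration doubles the optical density
```

With all other parameters fixed, the ratio depends only on the ratio of concentrations, which is the property the quantitative model relies on.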

Bacterial Sample Preparation and Transmission Spectral Measurement.
E. coli (CICC #10389) was obtained from the China Center of Industrial Culture Collection (CICC), stored on nutrient agar at 4 °C, and transferred monthly. The bacterial suspension was prepared by activating the bacteria, culturing them at 37 °C in beef extract peptone medium (pH 7.0) containing 0.3% beef extract, 1% peptone, 0.5% NaCl, and 2% agar, expanding the culture in liquid medium, centrifuging (H-1650, Jiangdong instrument), and washing in sterilized deionized water [14]. The bacterial suspension was divided into two parts: one part was used for bacterial counting, with the concentration determined by the plate counting method [3], and the other part was used for spectral measurements. A total of 57 sets of E. coli suspensions with different concentrations were prepared with deionized water. Spectra of the E. coli suspensions were recorded using a UV-Vis spectrophotometer (UV2550) in the range of 200-900 nm with a step size of 1 nm. The measurements were performed in 1 cm path-length quartz cuvettes at room temperature, with deionized water as the reference. E. coli is rod-shaped, approximately 2.0-6.0 µm in length and 1.1-1.5 µm in width. An interparticle distance greater than three times the particle diameter is the condition that ensures independent scattering [13]. To guarantee independent scattering, the particle concentration in the medium should not exceed the values provided in Table 1.

Journal of Spectroscopy
From the data in Table 1, the maximum particle concentration at different particle sizes can be calculated using a cubic Hermite function; for D = 6.0 µm, N_p = 1.162 × 10^9 cells/ml. To avoid multiple scattering, and under the assumption that the E. coli diameter is 6.0 µm (the maximum length of E. coli), we chose a conservative 10 × 10^8 cells/ml as our upper limit to provide enough separation between the cells in water. In addition, each cell suspension was drawn up and expelled several times with a clean pipette before the measurement, and the spectrum at each concentration was recorded as the average of 3 replicate measurements. The 57 measured spectra are shown in Figure 1, with the wavelength on the x-axis and the optical density on the y-axis.
It can be observed that the spectra of E. coli suspensions differ significantly with concentration: the intensity of the transmission spectra increases as the E. coli concentration increases. According to the similarity of the spectral patterns, the concentrations can be divided into three groups: low concentration (1.09 × 10^6 and 2.18 × 10^6 cells/ml), medium concentration (4.35 × 10^6, 8.70 × 10^6, 1.740 × 10^7, 3.480 × 10^7, and 6.960 × 10^7 cells/ml), and high concentration (1.3920 × 10^8 cells/ml). This indicates that bacterial concentration influences the spectral patterns. The main reason is that, in E. coli suspensions at low concentration, the scattering of light by bacterial cells is weak, while the internal chromophores (proteins, nucleic acids, etc.) absorb light strongly, so the spectra primarily reflect the light absorption characteristics of E. coli cells. As the concentration increases, the scattering of light by E. coli cells becomes stronger, and the spectra predominantly show the scattering characteristics of the cells, overshadowing the absorption characteristics of the intracellular chromophores.

Analysis of Variations in Spectra for E. coli Suspensions at Different Concentrations.
From the full spectrum in Figure 2, it can be observed that the spectra of bacterial suspensions exhibit higher optical density in the 200-230 nm wavelength range than in the 230-900 nm wavelength range. This is attributed to the joint contributions of scattering by E. coli cells and absorption by their internal chemical components (the twenty amino acids and the peptide bonds that constitute proteins). In the 230-320 nm wavelength range, the spectral characteristics of E. coli suspensions vary significantly among different concentrations. The spectral absorption peak lies in this range; however, its location differs distinctly among concentrations, as indicated in Table 2. Note that, as the E. coli concentration increases, the spectral absorption peak moves toward shorter wavelengths, resulting in a "blue shift" phenomenon, which is generally attributed to the absorption of nucleic acids and of certain aromatic amino acids (such as tryptophan, tyrosine, and phenylalanine) that constitute proteins [15,16]. As the concentration changes, the dominant contribution among the aromatic amino acids to light absorption also changes. When the concentration is low, the absorption of tryptophan may overshadow the absorption of the other amino acids. As the concentration increases, other amino acids (such as tyrosine and phenylalanine) begin to play a significant role in light absorption. Consequently, this "blue shift" phenomenon distorts the relationship between spectral data and bacterial concentration, so this band is not suitable for the measurement of E. coli concentration.
In the wavelength range of 320-900 nm, the spectra of E. coli suspensions at different concentrations exhibit remarkable similarity. As the concentration decreases, the optical density values decrease, indicating a strong correlation between spectral data and concentration. However, when the concentration reaches 1.09 × 10^6 cells/ml, the spectrum of E. coli overlaps with the spectrum of pure water at wavelengths greater than 480 nm. This phenomenon may be attributed to the weakened scattering and absorption of E. coli cells at lower concentrations, so that the measured optical density values approach the instrument detection limit. In the wavelength range of 480-900 nm, spectra at concentrations below 1 × 10^6 cells/ml are indistinguishable. Therefore, when utilizing multiwavelength spectroscopy for E. coli concentration quantification, it is crucial to consider the sensitivity and detection limit of the spectra at different wavelengths.

Sensitivity of E. coli Spectra to Concentration Variations.
To accurately quantify concentration, it is necessary to analyze the sensitivity of the optical density measurements at various wavelengths to concentration changes. According to equation (2), the optical density measurements are proportional to the concentration. Based on the aforementioned wavelength divisions, the relationship between optical density and concentration is depicted for the boundary wavelengths (200, 230, 320, and 900 nm) and a commonly used wavelength (600 nm), as shown in Figure 3. The spectra of five E. coli concentrations (1.3920 × 10^8, 6.960 × 10^7, 3.480 × 10^7, 1.740 × 10^7, and 8.70 × 10^6 cells/ml) are selected, and a least squares linear fit is performed on the five data points at each wavelength to obtain the linear function relating optical density to concentration:

$$\tau(\lambda) = m N_p + b \qquad (3)$$

where the parameter m represents the slope of the fitted line, indicating the sensitivity of the individual spectral line to changes in concentration, and b is the intercept. Table 3 provides the optimal fitting slopes and linear correlation coefficients for the five spectral features plotted in Figure 3. It can be observed that the slopes at 200 nm and 230 nm are larger than those at 320 nm, 600 nm, and 900 nm, indicating that the optical density values at 200 nm and 230 nm are more sensitive to changes in E. coli concentration. Figure 4 illustrates the slope spectrum of the E. coli suspension across the entire wavelength range. Note that in the region from 200 to 320 nm, the slope spectrum exhibits a sharp decrease, followed by a gradual increase, and then a subsequent decrease. This behavior is attributed to the strong absorption of incident light by the bacterial chromophores within this wavelength range. In the 320-900 nm range, the slope values decrease with increasing wavelength.

[Figure legend: optical density (a.u.) versus wavelength (nm) for suspensions at 1: 1.3920 × 10^8, 2: 6.960 × 10^7, 3: 3.480 × 10^7, 4: 1.740 × 10^7, 5: 8.70 × 10^6, 6: 4.35 × 10^6, 7: 2.18 × 10^6, 8: 1.09 × 10^6, and 9: 0.00 × 10^6 cells/ml.]

Once the slope of the best-fit line at each wavelength point is known, the detection limit of E. coli concentration at different wavelengths can be calculated by the following equation [17,18]:

$$N_L = \frac{K S_b}{M} \qquad (4)$$

where N_L is the detection limit of the tested bacterial concentration; K is set to 3; S_b represents the standard deviation of the deionized water spectrum obtained from 10 measurements; and M is the slope of the calibration curve for the tested bacteria.
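The two calculations described above, the least squares line at one wavelength and the detection limit derived from its slope, can be sketched in a few lines of Python. The concentrations are the five used for the fit in the text; the optical densities, the blank standard deviation, and the helper names `fit_line` and `detection_limit` are hypothetical and serve only to illustrate the arithmetic.

```python
def fit_line(x, y):
    """Ordinary least squares fit y = m*x + b; returns (m, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    m = sxy / sxx
    return m, mean_y - m * mean_x


def detection_limit(blank_sd, slope, k=3.0):
    """Detection limit N_L = K * S_b / M, with K = 3 as in the text."""
    return k * blank_sd / slope


# The five concentrations used for the fit (cells/ml).
conc = [1.3920e8, 6.960e7, 3.480e7, 1.740e7, 8.70e6]

# Hypothetical optical densities at a single wavelength, generated from an
# assumed slope of 3e-9 and an assumed intercept of 0.002.
od = [3.0e-9 * c + 0.002 for c in conc]

slope, intercept = fit_line(conc, od)

# Hypothetical blank standard deviation (deionized water) at this wavelength.
n_l = detection_limit(blank_sd=2.0e-4, slope=slope)
```

Repeating the fit at every wavelength from 200 to 900 nm would yield a slope spectrum and, combined with the blank standard deviation at each wavelength, a detection limit spectrum.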
Figure 5 illustrates the standard deviation spectrum of the UV-visible transmission spectra of deionized water over 10 consecutive measurements. By applying equation (4), the detection limit curve for E. coli concentration at different wavelengths is obtained, as shown in Figure 6(a). It can be observed that the calculated detection limit contains some zero-valued outliers. These outliers need to be removed; the result after eliminating them is presented in Figure 6(b). For ease of analysis, the detection limit spectrum with the outliers removed is processed with 10-point data smoothing, as shown in Figure 6(c). It is evident that, apart from slight fluctuations in localized spectral regions, the detection limit for measuring E. coli concentration gradually increases with increasing wavelength. The minimum detection limit (1.266 × 10^5 cells/ml) is observed at 277 nm, while the maximum (1.858 × 10^6 cells/ml) is observed at 897 nm. The detection limit of bacterial concentration is therefore wavelength-dependent, and selecting appropriate wavelengths can lower it. Additionally, when conducting quantitative analysis of low-concentration bacterial microorganisms, the sample concentration must exceed the lower limit of determination.
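The post-processing of the detection limit spectrum (removing zero-valued outliers, then smoothing over 10 points) could look roughly like the Python sketch below. The text does not specify the smoothing algorithm, so a plain moving average is assumed here; the input values are synthetic and the function names are illustrative.

```python
def drop_zero_outliers(wavelengths, limits):
    """Keep only the (wavelength, limit) pairs whose limit is non-zero."""
    kept = [(w, v) for w, v in zip(wavelengths, limits) if v != 0]
    return [w for w, _ in kept], [v for _, v in kept]


def moving_average(values, window=10):
    """Sliding-window mean; the output has len(values) - window + 1 points."""
    if window > len(values):
        raise ValueError("window is longer than the data")
    running = sum(values[:window])
    out = [running / window]
    for i in range(window, len(values)):
        running += values[i] - values[i - window]
        out.append(running / window)
    return out


# Synthetic detection limit spectrum (cells/ml) with two zero-valued outliers.
wavelengths = list(range(200, 230))
limits = [1.0e5 + 1000.0 * i for i in range(30)]
limits[5] = 0.0
limits[17] = 0.0

clean_w, clean_limits = drop_zero_outliers(wavelengths, limits)
smoothed = moving_average(clean_limits, window=10)
```

A moving average shortens the series by window - 1 points; other smoothers (for example Savitzky-Golay filtering) preserve the length and may be closer to what was actually used.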

Correlation Characteristics between Spectra and Concentration.
To achieve accurate quantification of bacterial concentration, it is not sufficient to consider only the sensitivity of the spectral lines at different wavelengths to changes in concentration. It is also necessary to determine the correlation coefficient between the optical density (τ(λ)) and the concentration (N_p). The correlation coefficient is given by the following equation [19]:

$$R^2 = \frac{\mathrm{Cov}(N_p, \tau)^2}{\mathrm{Var}(N_p) \, \mathrm{Var}(\tau)} \qquad (5)$$

where Cov(N_p, τ) is the covariance between N_p and τ(λ), Var(N_p) is the variance of the concentration, and Var(τ) is the variance of the optical density. The correlation coefficient (R^2) ranges from 0 to 1, where 0 indicates no linear correlation and 1 indicates perfect linear correlation. The calculated correlation coefficient spectrum for E. coli suspensions in the region from 200 to 900 nm is shown in Figure 7.
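As a small illustration of this definition, the Python sketch below computes R^2 at a single wavelength from the covariance and variances of concentration and optical density. The optical density values are made up for the example; only the concentrations are taken from the text.

```python
def r_squared(x, y):
    """Squared correlation: Cov(x, y)^2 / (Var(x) * Var(y))."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    return cov * cov / (var_x * var_y)


conc = [1.3920e8, 6.960e7, 3.480e7, 1.740e7, 8.70e6]   # cells/ml

# Hypothetical optical densities: one exactly linear, one distorted.
od_linear = [3.0e-9 * c + 0.002 for c in conc]
od_distorted = [0.42, 0.20, 0.12, 0.04, 0.035]

r2_linear = r_squared(conc, od_linear)        # essentially 1
r2_distorted = r_squared(conc, od_distorted)  # strictly below 1
```

Evaluating `r_squared` wavelength by wavelength gives a correlation spectrum of the kind plotted in Figure 7.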
As indicated in Figure 7, the correlation coefficient varies significantly within the 200-320 nm wavelength range. The reason is that some chemical components within the cells (nucleic acids, peptide bonds, and certain aromatic amino acids that make up proteins) absorb strongly in this wavelength region, so the spectra of bacterial suspensions at different concentrations differ in this band, for example through the blue shift of the spectral absorption peak (see Table 2), which affects the correlation between optical density and concentration. In the 320-550 nm wavelength range, the correlation coefficient increases with wavelength and ranges from 0.9988 to 0.9999. This indicates a strong linear relationship between optical density and concentration, and the sensitivity of the spectrum to concentration is also moderate. Within the 550-900 nm wavelength range, there is significant noise in the correlation coefficient spectrum, and the slope values remain in a lower range (8.775 × 10^-10 to 2.176 × 10^-9), indicating low sensitivity, so this band is not suitable for the measurement of E. coli concentration.
Achieving accurate quantification of bacterial concentration therefore requires satisfying two requirements: a high slope value, which ensures that the selected spectral line is sensitive to changes in concentration, and a high correlation coefficient, which ensures high quantitative accuracy. In the region from 320 to 550 nm, with the correlation coefficient threshold set to 0.9998, the optimal wavelength range that meets these requirements is approximately 388-550 nm, consisting of approximately 162 wavelength points. In the 388-550 nm region, the minimum R^2 value is 0.9998, and the slope m remains in a moderate range (2.176 × 10^-9 to 3.922 × 10^-9), greater than the slope values in the region from 550 to 900 nm. In addition, the chemical components in the cells show no absorption in this wavelength region and so do not interfere with the linear correlation between optical density and concentration, and the R^2 spectrum has little noise. Taken together, the spectral data of this band are considered the best for quantifying bacterial concentration.
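The band selection rule described above (keep the wavelengths within a search window whose R^2 meets a threshold) can be sketched as follows. The R^2 values here are a made-up step profile chosen only so that the example is easy to follow; they are not the measured spectrum, and `select_band` is an illustrative helper, not a function from the study.

```python
def select_band(wavelengths, r2, threshold, lo, hi):
    """Longest contiguous run within [lo, hi] where r2 >= threshold.

    Returns (first_wavelength, last_wavelength), or None if nothing qualifies.
    """
    runs = []
    current = []
    for w, value in zip(wavelengths, r2):
        if lo <= w <= hi and value >= threshold:
            current.append(w)
        elif current:
            runs.append(current)
            current = []
    if current:
        runs.append(current)
    if not runs:
        return None
    longest = max(runs, key=len)
    return longest[0], longest[-1]


# Made-up R^2 step profile on a 1 nm grid from 320 to 900 nm.
wavelengths = list(range(320, 901))
r2 = []
for w in wavelengths:
    if w < 388:
        r2.append(0.9990)
    elif w <= 550:
        r2.append(0.9999)
    else:
        r2.append(0.9995)

band = select_band(wavelengths, r2, threshold=0.9998, lo=320, hi=550)
```

In practice a slope criterion could be added as a second condition inside the loop, so that both sensitivity and correlation are checked at each wavelength.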

Calculation of E. coli Concentration Using Partial Least Squares Regression.
Here, the optimal wavelength range of 388-550 nm is selected to perform the bacterial concentration measurement. The spectrum contains a significant amount of redundant information: the correlation between adjacent wavelength points is higher than that between distant wavelengths, and a larger number of wavelengths results in longer data processing times. Therefore, the partial least squares regression (PLSR) method is used to compress the spectra in this region (388-550 nm), and a quantitative regression model is established from the spectral data of a few independent wavelengths.
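To make the compression step concrete, the sketch below implements a single-response PLS regression (PLS1) with the NIPALS algorithm in plain Python. It is an illustrative implementation under simplifying assumptions, not the software used in this study; the four-wavelength "spectra" are synthetic and generated from hypothetical slopes and offsets.

```python
def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS. X is a list of spectra, y a list of concentrations."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]   # centered spectra
    f = [v - y_mean for v in y]                                 # centered response
    comps = []
    for _ in range(n_components):
        # Weight vector: direction in spectral space most covariant with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = _dot(w, w) ** 0.5
        if norm == 0.0:
            break   # nothing left to explain
        w = [v / norm for v in w]
        t = [_dot(row, w) for row in E]                         # scores
        tt = _dot(t, t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = _dot(f, t) / tt
        # Deflate the spectra and the response before the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, load, q))
    return x_mean, y_mean, comps


def pls1_predict(model, x):
    """Predict the concentration for one spectrum x."""
    x_mean, y_mean, comps = model
    e = [xi - m for xi, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in comps:
        t = _dot(e, w)
        y_hat += q * t
        e = [ei - t * lj for ei, lj in zip(e, load)]
    return y_hat


# Synthetic spectra at four hypothetical wavelengths.
slopes = [3.9e-9, 3.3e-9, 2.7e-9, 2.2e-9]
offsets = [0.004, 0.003, 0.002, 0.001]


def synth_spectrum(c):
    return [s * c + o for s, o in zip(slopes, offsets)]


train_conc = [1.392e8, 6.96e7, 3.48e7, 1.74e7, 8.7e6, 4.35e6]
train_spectra = [synth_spectrum(c) for c in train_conc]

model = pls1_fit(train_spectra, train_conc, n_components=1)
predicted = pls1_predict(model, synth_spectrum(5.0e7))
```

Because the synthetic spectra are exactly linear in concentration, one latent component recovers the concentration of the unseen sample; real spectra contain noise, and the number of components would normally be chosen by cross-validation.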
To validate the superiority of the proposed method, 49 spectra in the wavelength range from 388 to 550 nm and their corresponding bacterial concentrations are selected. A mathematical model is established using the PLSR algorithm to describe the relationship between the spectra and the E. coli concentrations. Based on this model, 8 concentrations spanning three different orders of magnitude (10^8, 10^7, and 10^6) are predicted. In addition, a standard curve is constructed from the relationship between optical density at a single wavelength (600 nm) and concentration, and the concentrations are calculated using this standard curve. The two spectroscopic methods are compared with standard plate counting for determining E. coli concentration in water. The results are shown in Table 4. As indicated in Table 4, relative to plate counting, the concentrations calculated by the proposed method have a maximum relative error of 4.500% and an average relative error of 0.677%, both below 5%. In contrast, the concentrations calculated by single-wavelength spectroscopy have a maximum relative error of 36.958% and an average relative error of 13.355%. This indicates that, for calculating bacterial concentration, the accuracy and stability of multiwavelength spectroscopy are much better than those of the single-wavelength method.
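The error metrics used in this comparison are simple to compute; the Python sketch below shows the maximum and average relative error against a plate counting reference. The three concentration pairs are hypothetical and are not the values from Table 4.

```python
def relative_errors(reference, estimated):
    """Relative error of each estimate against its reference, in percent."""
    return [abs(e - r) / r * 100.0 for r, e in zip(reference, estimated)]


# Hypothetical plate counts and spectroscopic estimates (cells/ml).
plate_counts = [1.0e8, 2.0e7, 4.0e6]
estimates = [1.02e8, 1.9e7, 4.0e6]

errors = relative_errors(plate_counts, estimates)
max_error = max(errors)                   # worst-case deviation
mean_error = sum(errors) / len(errors)    # average deviation
```

Reporting both values is useful because the average can hide a single poorly predicted concentration that the maximum reveals.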

Conclusions
In this study, a rapid detection method for E. coli concentration in water is proposed, utilizing multiwavelength transmission spectroscopy combined with partial least squares regression. The results show that, with plate counting as the reference, E. coli concentrations estimated from multiwavelength transmission spectra outperform those from the single-wavelength method in terms of accuracy and stability; the average relative error and the maximum relative error are both less than 5%. This method offers advantages such as short detection time, simple operation, and accurate results, providing a new approach for the detection of bacterial concentration in water.

Figure 1: Fifty-seven measured multiwavelength transmission spectra from suspension of E. coli.

Figure 2: Multiwavelength transmission spectra of E. coli at 8 different concentrations; the inset shows the spectra of E. coli suspensions at low concentration.

Figure 4: Slope spectrum of the optical density varied with the concentration in the region of 200-900 nm.

Figure 6: Detection limit of E. coli concentration at different wavelengths. (a) The calculated detection limit; (b) after eliminating outliers; (c) after 10-point data smoothing.

Figure 7: Spectrum of the coefficient of determination R^2 between optical density and concentration in the region of 200 to 900 nm.

Table 1: Corresponding values of particle size and concentration that satisfy independent scattering.

Table 2: The absorption peak wavelength and optical density of the transmission spectra of E. coli at different concentrations (230-320 nm).

Table 4: Comparison of E. coli concentration in water between the optimal band, single wavelength, and serial dilution plate counting methods.