Automated Analysis of Crackles in Patients with Interstitial Pulmonary Fibrosis

Background. The crackles in patients with interstitial pulmonary fibrosis (IPF) can be difficult to distinguish from those heard in patients with congestive heart failure (CHF) and pneumonia (PN). Misinterpretation of these crackles can lead to inappropriate therapy. The purpose of this study was to determine whether the crackles in patients with IPF differ from those in patients with CHF and PN. Methods. We studied 39 patients with IPF, 95 with CHF and 123 with PN using a 16-channel lung sound analyzer. Crackle features were analyzed using machine learning methods including neural networks and support vector machines. Results. The IPF crackles had distinctive features that allowed them to be separated from those in patients with PN with a sensitivity of 0.82, a specificity of 0.88 and an accuracy of 0.86. They were separated from those of CHF patients with a sensitivity of 0.77, a specificity of 0.85 and an accuracy of 0.82. Conclusion. Distinctive features are present in the crackles of IPF that help separate them from the crackles of CHF and PN. Computer analysis of crackles at the bedside has the potential of aiding clinicians in diagnosing IPF more easily and thus helping to avoid medication errors.


Introduction
Crackles are a common finding in patients with interstitial pulmonary fibrosis (IPF). Their presence in a patient is often the first clue that the disease is present. Unfortunately, they can be misinterpreted as being due to congestive heart failure (CHF) or pneumonia (PN), and as a consequence patients may receive inappropriate therapy. On occasion, this can lead to serious, unwanted side effects such as dehydration due to the inappropriate administration of diuretics or an adverse reaction to an antibiotic that was not indicated in the first place. In an attempt to reduce these complications, we studied the sound patterns of patients with these diseases using a multichannel lung sound analyzer (STG16) to determine if such analysis could help differentiate IPF from CHF and PN.
Using advanced statistical techniques we compared features of IPF crackles to those in patients with CHF and PN. Our goal was to determine if there are features of the lung sounds in IPF patients that would help to distinguish them from the lung sounds of patients with CHF and PN.

Materials and Methods
Patients were selected for this study from a pool of patients who had undergone lung sound analysis as a part of a broader study of the correlation of disease processes with lung sounds patterns. To acquire patients into this study, we identified hospitalized patients and outpatients of a community teaching hospital who were diagnosed as having a specific cardiopulmonary disease or were considered to be normal by their caregivers. The studies were not made on consecutive patients; this is a convenience sample and we currently have over 1,000 patients for whom we have both the diagnosis and the lung sound analysis. The diagnostic category of each of the patients was that of the clinicians caring for these patients. The CHF and PN patients were inpatients in a teaching hospital, and diagnoses were confirmed by board certified specialists. The IPF patients were outpatients and were all seen by pulmonary specialists. There were 39 patients with IPF, 95 with CHF, and 123 with PN. All patients were examined using a multichannel lung sound analyzer (STG16).

ms
Highest peak (a) Negative polarity (c) Figure 1: The waveform of a typical crackle (a). The crackle analysis starts by identification of the crackle's highest deflection highest peak. The half-period to the left of the highest peak is marked as T 1 . The half-period to the right of the highest peak is marked as T 2 . Crackle pitch is calculated from 4 consecutive half-periods, with T 1 as a 1st half-period. The amplitude is determined separately for each half-period and marked as A 1 , A 2 , and A 3 . Crackle polarity (b) crackle polarity is defined positive if the highest peak is upward (c). Crackle polarity is defined negative if the highest peak is downward.  Crackle timing (timing) Crackle timing is defined as follows: 1 for early inspiration, 2 for mid-inspiration, 3 for late inspiration, 4 for early expiration, 5 for mid-expiration, 6 for late expiration Crackle transmission coefficient (CTC) The degree of crackling sound transmission through the ipsilateral chest, as calculated from crackle family observation by multiple microphones. The CTC has a value of 0% in the absence of any transmission and 100% when there is equal transmission to all ipsilateral channels see [7] for detailed description and discussion.

Amplitude
Amplitude of the highest peak (arbitrary units) Half period amplitude variability (%) (Standard deviation Crackle polarity (polarity) Direction of the highest peak, Figures 1(b) and 1(c) see [8] for detailed description and discussion The details of this device have been described [1]. In brief, patients are asked to lie on a soft foam pad, which has stethoscope chest pieces embedded in it. Each of these chest pieces contains a microphone. The sounds detected by these microphones are amplified, filtered, and input into a computer for analysis. In our usual practice, patients are asked to perform several breathing maneuvers: normal breathing, deeper than normal breathing, coughing, and a vital capacity maneuver. In this study, we chose the data obtained during the deeper than normal breathing maneuver. Crackles were defined in accordance with accepted criteria [2,3]. The STG software automatically identified crackles in all full breaths. The validation of the use of the device as a crackle counter has been reported [4]. A single recording lasted 20 seconds and typically contains a minimum of 3 breaths. To develop algorithms for testing, the crackle features shown in Table 1 were assessed.
Crackle features were calculated separately for inspiratory crackles and for expiratory crackles. Figure 1 demonstrates the process of calculating features of the crackle. In addition to these features, we combined the individual crackle features in the form of a median (median T1, median pitch, etc.) In addition to features based on individual crackle properties we captured information reflecting the distribution These 4 features count the total number of crackles observed in each quadrant of the chest. Together they add up to the total number of crackles per breath Percentage differences between crackle quadrants (6 total) Calculated from the 4 features described above, these features represent a comparison between quadrants. Each percentage is a pairwise comparison of all 6 possible combinations of quadrants Maximum distances (x,y,z) Distances between crackles in 3-dimensional space. There are separate features for x, y, and z planes. One feature also records a maximum distance across all 3 dimensions

Channel distances
These features are similar to those described above, except that they are defined based upon which channel microphone picked up the crackle. Distances are defined accordingly of the patient's crackles. Diseases differ in the pattern of crackles distribution over the chest. Distribution information required aggregation of data on a per-breath level and led to the development of aggregate crackle features shown in Table 2.
To perform classification and prediction we utilized supervised learning nonparametric classifiers: neural networks and support vector machines [5,6]. Supervised learning can teach the system to nonlinearly map the input features to the associated label of disease. We divided the data into a training set, used for feature extraction and model building, and a validation set, used for evaluation of the results. Validation data set performance indicates how well the features generalize to the unseen data. We used a fivefold cross-validation to increase the pool of validated data.
We used individual crackle features to distinguish crackles of IPF from CHF crackles and PN crackles. Once individual crackles were classified as IPF, CHF, or PN, majority voting was used to classify the patient into one of the three disorders. To incorporate features of crackle distribution, we performed majority voting among individual breaths during the single recording; for example, if a patient had 6 breaths, and 3 of them were classified as IPF, 2 as CHF, and 1 as PN, then the patient would be classified as having IPF. The final classification of IPF versus CHF and IPF versus PN was performed using this breath majority voting.
The study was approved by the Institutional Review Board of the Brigham and Women's/Faulkner Hospitals. The display of a single crackling event reveals that the crackling sound is transmitted differently in the three diseases. The right panels in Figure 2 show time-expanded sound waveforms that were recorded by the 14 microphones positioned over the posterior chest. The waveforms are superimposed on a body plot. Each waveform is positioned on the part of the body where the sound was recorded.

Results
In the patient with CHF, panel (d), a prominent crackle is seen on the tracing from channel 6 (indicated by a large triangle). At the same time the crackling sound was also detected in all ipsilateral microphones 1, 2, 3, 4, 5, and 7 (marked by triangles). The set of crackles generated by a single event and recorded by multiple microphones is referred to as a crackle family [7,8]. The crackle waveforms corresponding to the crackle family are shown in the stack mode in the insert in the upper-right corner. Notice that the crackle recorded by microphone 6 (the most prominent crackle or mother crackle) occurs earlier than the other crackles.
The crackle transmission coefficient was calculated for each crackle family. In the crackle family shown in the CHF patient (Figure 2(d)), the CTC was 50%. (The CTC has a value of 0% in the absence of any transmission and 100% when there is equal transmission to all ipsilateral channels.) In the crackle family shown in the PN patient, Figure 2(f), the CTC was 16%. In contrast, the crackle in the IPF patient was detected at only a single microphone (Figure 2(b)). The CTC of this crackle family was 1%. The low CTC is typical in IPF patients.
In addition to the CTC, note the difference in the pitch of the crackles shown in Figure 2, right panels: 588 Hz in IPF versus 218 Hz in CHF and 364 Hz in PN. Also note the difference in the number of zero crossings: 15 in IPF versus 5 in CHF and 5 in PN.
The observations in single patients shown in Figure 2 are supported by statistical analysis of all available data. Table 3 shows crackle rate and individual crackle features in IPF, CHF, and PN. Note that multiple individual crackle features are significantly different between IPF and the other two diseases. In order to perform classification of patients into one of the three diseases we utilized two statistical methods: neural networks and support vector machines. Table 4 presents the results of binary comparisons of individual crackles in IPF versus CHF and IPF versus PN. As seen in Table 4, the sensitivity, specificity, and overall accuracy are over 70%, consistent with the conclusion that individual IPF crackles have features that differ from those of patients with PN and CHF.   The accuracy increased to 83% (Table 5) on the application of majority voting to the classification of individual breaths based on crackle features. The addition of aggregate crackle features improved the accuracy to 86% (Table 6). Finally, we used majority voting to classify patients based  Table 7). The performance of perbreath and that of per-patient classification are quite similar suggesting that most breaths of the same patients are classified in a similar manner.

Discussion
This study shows that the crackles of IPF have features that help distinguish them from the crackles of patients with CHF and PN. As noted, we believe that the crackles of IPF are not infrequently misinterpreted. They are most commonly considered to be due to CHF, and diuretics are administered inappropriately. There is not much literature to support this observation, but it is our personal experience and an informal survey of clinicians confirmed this opinion.
In addition to providing evidence that helps in accurately identifying IPF crackles, computerized lung sounds analysis also quantifies them. It has long been noted that crackles of IPF become more widespread when the disease progresses. Thus crackle quantification can be important in assessing the severity of IPF, and this could be useful in providing evidence of response to therapy. We focused on the difference between crackles of IPF and those of CHF and PN. Baughman et al. took a different approach. They showed that the presence of crackles could help clinicians in distinguishing sarcoidosis from IPF [9]. Crackles were much less numerous in patients with sarcoidosis than in those with roentgenologically equivalent severity of changes due to IPF.
Among features that are significantly different between IPF and CHF/PN is the crackle pitch (P < .0000001). This is consistent with the commonly held believe that the crackles of IPF are generated in smaller airways than those of CHF and PN. The distinctive features of crackles of IPF have been long recognized. For example, the crackles of pulmonary fibrosis caused by asbestos, described in early as 1930 by Wood and Gloyne to be a prominent feature of this industrial disease, were described by Smither as "characteristic in their sound and distribution" [10,11]. He also pointed out that they are present first at the bases in the midaxillary line and then tend to spread to the posterior bases. As the disease progresses, crackles become audible higher on the chest. In one study a technician was able to screen workers for asbestosis by detecting crackles. The technician correctly identified all workers in whom the diagnosis was most certain, that is, those with all the clinical, physiological, and roentgenologic criteria used in the study [12]. Using time-expanded waveform analysis, Kawamura et al. studied 18 patients with IPF and 23 patients with crackles who did not have this disease. Two crackle parameters (initial deflection width and two cycle duration) were shorter in the IPF patients. This finding correlated with HRCT findings in these patients [13]. British investigators have reported that detecting crackles on time-expanded waveform analysis was equivalent to CT scans in detecting asbestosis [14]. Finnish investigators also showed a significant positive correlation with frequencies of lung sounds and pulmonary fibrosis detected on HRCT [15]. Of course in industrial settings, in contrast to ER's and ICU's, neither CHF nor PN crackles are likely to be confounding variables.
To perform classification and prediction, we utilized well-established supervised learning nonparametric classifiers: neural networks and support vector machines [5,6]. Neural networks (NNs) are the name for non-linear statistical data modeling tools. They are used to model complex relationships between inputs and outputs and are an attempt to build an architecture similar to the one of the human brain. NNs consist of an interconnected group of artificial neurons that learns and updates its internal structure using a connectionist approach to computation. NNs utilize a data-driven approach where changes in internal structure are based on external or internal information that flows through the network during the learning phase. In this study, we used a back propagation neural network. Support Vector Machines (SVM) are one of the newest methods in the supervised learning field. Generally speaking, a support vector machine seeks to create a hyperplane in a high-dimensional space that separates the two data classes. Not only does the hyperplane separate the data, but also it is oriented in such a fashion that creates the maximum "margin" on both sides of it ensuring the largest possible separation between the two classes. The algorithm proved to be fast and very efficient. We note here that both NN and SVM classification achieved similar results.
The technology for this study came about in part because there has been resurgence in interest in lung sounds. This has been stimulated by the development of computerized techniques. A number of investigations demonstrating the usefulness of computerized lung sound analysis have been reported [16][17][18][19][20][21]. While crackle pitch can be assessed by a clinician using an acoustic stethoscope, other crackle features that are significantly different between IPF and CHF/PN can only be gained with the use of a computerized stethoscope. And some crackle features such as crackle transmission coefficient can only be calculated with the use of a multichannel lung sound analyzer.
Computerized lung sound analysis can now be done at the bedside. The examinations are easy to do and can be performed in a few minutes. They have been shown to help in the detection of pneumonia [22]. Unfortunately, devices capable of doing this are not currently widely available. However, it is likely that this will change as the advantages of this technology become more widely known. Used in the context of a complete medical evaluation, we believe that this information could help avoid misinterpretation of IPF crackles and thus potentially decrease the occurrence of inappropriate treatment.