Socioeconomic Conditions and Risk of Mental Depression: An Empirical Analysis for Brazilian Citizens

The aim of this study is to investigate whether there is any relationship between socioeconomic factors and mental depression, specifically in connection with family income and education. Our empirical model was estimated using the database of three National Household Sampling Surveys (1998, 2003, and 2008) and their special supplements on the health status of the Brazilian population. Analyses for men and women were conducted separately. Family income proved to be a protective factor against depression for both men and women. With regard to the effects of education, the estimates indicate that the maximum risk of depression occurs when women have approximately four years of schooling and men have about eight. Above these levels, the risk decreases as schooling years increase.


Introduction
The quantity and quality of human capital stock are key to economic growth in a country ([1] and others). This stock depends, in turn, on the education and life expectancy of individuals, both of which are directly affected by health conditions.
In terms of health, there is increasing concern about the quality of mental health. Among the many mental disorders that an individual can develop in his or her life cycle, depression has gained prominence, especially in recent years.
According to data from the World Health Organization, approximately 121 million people worldwide are estimated to suffer from depression [2]. This type of chronic mental disorder causes severe damage to patients, their families, and also the country. Evidence found by Marcotte and Wilcox-Gök [3] shows that depression reduces the chances of getting a job, and for those who succeed despite being depressed, the income is usually lower. According to results by Ettner et al. [4], the chance of a depressed woman getting a job is 11% lower than for a woman who is not depressed. And when a woman with depression gets a job, her annual income is, on average, up to 50% lower. Lee et al. [5], Frank and Gertler [6], Cseh [7], and Jofre-Bonet et al. [8], among others, also noted that, on average, depression leads to lower earnings. Moreover, the social cost of depression is significantly high [2, 9-13].
The costs associated with depression and loss of wellbeing make it pertinent that economists remain attentive to this issue. We believe that using an economic approach, it is possible to cooperate with other fields already involved and engaged in research of causes behind mental depression.
In Brazil, the prevalence rate of mental depression (hereinafter depression) has been declining over the years. However, according to data from the Special Supplements on Health Research of the National Household Sampling Survey in 1998, 2003, and 2008, it is still high and worrying. In the study conducted in 1998, 4.96% of the Brazilian population had depressive symptoms; fortunately, in the 2003 and 2008 surveys, reductions to 4.08% and 4.13%, respectively, were detected. For reasons still obscure in the health literature, the prevalence rate of depression is much higher among women than among men. In 2008, it was seen that 2.25% of all men had depression, while 5.90% of all women had it. Moreover, the reduction in the prevalence of depression was lower among women than among men (−14.4% and −23.7%, resp.).

Economics Research International
Several researchers have been working to identify and understand the causes of depression. However, little is yet known about its determinants, particularly those related to the socioeconomic conditions of individuals and their families. Therefore, the aim of this study is to advance this investigation, with emphasis on the alleged effects of family income and education on depression risk.
Specifically, this study seeks empirical advances in the modelling of socioeconomic determinants of mental depression proposed in Santos and Kassouf [18]. 1 The improvements introduced here are (i) sample data covering a period marked by significant changes in socioeconomic conditions in Brazil (1998 to 2008), (ii) a sample composed of younger individuals (25 to 65 years old), (iii) a control for whether individuals had children, (iv) a test of a possible quadratic relation between depression risk and education, and (v) an investigation of the alleged effects of family wealth by deciles and quartiles of average family income, as an attempt to avoid the endogeneity problem.
After this introduction, in Section 2, we present the methodological procedures. Section 3 contains some conditional descriptive analyses of sample data used in the estimates of empirical models, which are presented and discussed in Section 4. Finally, Section 5 concludes the study.

Data and Sample.
The dataset that we used was formed from three National Household Sampling Surveys (1998, 2003, and 2008) and their special supplements on the health status of the Brazilian population, whose weights make them representative of the population from which they were drawn.
To make the sample suitable for empirical modelling, the following were excluded: (i) individuals under 25 or over 65 years old, (ii) indigenous people, (iii) individuals living in a home who were not members of the family (pensioners, domestic servants, and relatives of a domestic servant), and (iv) those who declared that their family income was zero. Moreover, some observations were lost in the estimates due to missing values in some variables. 2 Thus, the sample that was actually used in the estimations consists of 501,945 observations, including depressed and nondepressed individuals.
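The four exclusion rules can be sketched as a simple record filter. This is only an illustration: the field names (`age`, `race`, `household_role`, `family_income`) and the category labels are hypothetical placeholders, not the survey's actual variable codes.

```python
# Hypothetical field names and category labels; the survey's real codes differ.
EXCLUDED_ROLES = {"pensioner", "domestic_servant", "relative_of_domestic_servant"}


def keep_in_sample(person):
    """Return True if a record passes the four exclusion rules listed above."""
    if not 25 <= person["age"] <= 65:                # (i) under 25 or over 65
        return False
    if person["race"] == "indigenous":               # (ii) indigenous people
        return False
    if person["household_role"] in EXCLUDED_ROLES:   # (iii) not family members
        return False
    if person["family_income"] <= 0:                 # (iv) zero family income
        return False
    return True


def filter_sample(records):
    """Keep only the records that satisfy every rule."""
    return [p for p in records if keep_in_sample(p)]
```

Treating nonpositive income as excluded, rather than only exact zeros, is an assumption made here so that the logarithm of income is always defined.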
The database that was used offers at least five advantages over those used in most studies: (i) sample size; (ii) coverage period; (iii) control of many factors that have a potential bearing on depression risk; (iv) a variable of interest (depression) that is free from the bias which may arise when data provided by self-diagnosed individuals are used; and (v) identification of the effects of family income, education, and other variables on depression risk based on nondepressed individuals, which is not possible when only data for individuals with symptoms of depression are used.

Empirical Model.
In the specifications of the models to be estimated, the response variable is a dummy that assumes value 1 if the individual has symptoms of depression diagnosed by a physician or another health professional (depress = 1) and 0 otherwise (depress = 0).
Concerning explanatory variables, the main problem is that some of them are potentially endogenous with respect to depression, such as education, income, and physical diseases. To avoid this possibility, the following strategy was adopted: the lowest sampling age is 25 years old, as people have usually completed their studies at this age; the logarithm of average monthly family income was used as a proxy for family income, instead of personal income. This is one of two strategies used here, and it will serve only as a starting point for modelling and checking the robustness of the estimates. Personal income is certainly endogenous, mainly because the disease can lead to a lower income. The average family income can in turn also drop when at least one family member is sick. Therefore, using the average family income instead of personal income might not solve the endogeneity problem of income adequately. We therefore used a second strategy, which is considered more robust. It will be presented and discussed immediately after the common variables of both models are listed. Regarding a possible comorbidity, 3 we estimated regressions with and without disease-related variables to check the stability of the estimated coefficients.
Since there is no "canonical" model for the causes of mental depression, the model specification is based on the main empirical evidence found in the literature, especially in the health literature. The main studies are reported in Table 6 in the appendix.
The specification of the empirical model contains five sets of control variables: economic status of the family and schooling level of the individual, family characteristics, geographic and demographic characteristics, and presence of chronic physical diseases.
In the basic model (Model 1), the explanatory variable of interest is the logarithm of the average monthly family income in Brazilian Reals in 2008 (income). The other variables are (a) years of schooling (school) and the square of this variable (school2), to control for a possible nonlinear effect; (b) age of the individual (age) in years and the square of this variable (age2), also to control for a possible nonlinear effect; (c) a dummy variable for the individual's sex, which is 1 for females and 0 for males (women); (d) four dummy variables for skin color: white (base group), black, mulatto 4 and asian; (e) a dummy variable for marital status, which is 1 if the individual is married 5 and 0 if he or she is not (married); (f) a dummy variable for parenthood, which is 1 if the individual has children living in the same household and 0 if not (child); (g) a dummy variable for position in the family, which is 1 if the individual is the head of the household; (h) number of family members (famsize); (i) five dummy variables to control for possible regional differences: north, south, mid-west, northeast, and southeast (base group); (j) a dummy variable for residence location, which is 1 when the individual resides in an urban area and 0 if he or she resides in a rural area (urban); (k) a dummy variable if the individual has at least one of eleven chronic physical diseases (physd); 6 (l) three dummy variables for year, to control for possible fixed time effects (1998 as the base group).
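As a minimal sketch, the continuous regressors of Model 1 could be built for one individual as follows. The function name is hypothetical, the use of the natural logarithm is an assumption (the text does not state the base), and the dummy variables in items (c) to (l) are omitted for brevity.

```python
import math


def model1_continuous_terms(family_income, school, age):
    """Continuous regressors of Model 1 for one individual (dummies omitted)."""
    if family_income <= 0:
        raise ValueError("family income must be positive to take its logarithm")
    return {
        "income": math.log(family_income),  # natural log: an assumption
        "school": school,
        "school2": school ** 2,             # squared term for the nonlinear effect
        "age": age,
        "age2": age ** 2,
    }
```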
As mentioned above, we used the average monthly family income (hereinafter family income) to try and avoid the endogeneity problem. But the implicit assumption is made here that the income of other family members would not be affected by the fact that one of them is depressed. However, this assumption is unrealistic. It is not possible, therefore, to say that the average family income solves the endogeneity issue. Moreover, if the family is made up of only one individual, then there would be no difference between family income and personal income. This possibility could be rejected by deducting personal income from family income. But this would imply excluding individuals who live alone, since their personal income would be the average family income, that is, this new variable would be zero for them and we would thus lose the possibility of properly controlling for the effect of family size. It is, therefore, difficult to capture the effect of a possible feeling of loneliness, when one lives alone, on depression risk. Convinced that the cost of this strategy is greater than its benefit (robust estimates) and that the assumption above was being made anyway, we propose another strategy: substituting family income with ten dummy variables defined by the distribution deciles of this variable (Model 2), so that the possibility of endogeneity is relatively small. Table 1 shows the mean and standard deviation of income within each decile of the distribution of average family income.
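The decile strategy of Model 2 can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, the cut points are computed without survey weights, and the lowest group is omitted as the base category, so ten deciles yield nine indicator columns.

```python
from bisect import bisect_right
from statistics import quantiles


def income_group_dummies(incomes, n_groups=10):
    """Return one row of 0/1 indicators per income, omitting the lowest group.

    With n_groups=10 the groups are deciles; each row has n_groups - 1 columns.
    """
    # n_groups - 1 unweighted cut points (an assumption; weights are ignored).
    cuts = quantiles(incomes, n=n_groups, method="inclusive")
    rows = []
    for value in incomes:
        group = bisect_right(cuts, value)  # 0 for the lowest group
        rows.append([1 if group == g else 0 for g in range(1, n_groups)])
    return rows
```

Calling the same function with `n_groups=4` gives indicators for income quartiles instead of deciles.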
Using this strategy, the endogeneity problem will only arise if the effect of depression on individual income is strong enough to become part of another decile of the distribution of family income. This is possible, especially for those whose family income is close to the lower limit of the decile to which the individual belongs. However, this would only cause a relevant endogeneity problem if a significant proportion of sick individuals migrate from one decile to a lower decile of income distribution due to depression. A strategy to check the robustness of the estimates will be proposed in Section 4.
The specifications are slightly modified to check the stability of the coefficient estimates with regard to both signs and their magnitudes. Specifically, the change is made in the control variable defined in item (k) in the above-described list.
Obviously, other factors have a bearing on depression risk, such as traumas in life and professional or financial difficulties. While not observable, however, these factors are partially reflected in some observable variables making up the specification, such as family income, marital status, and the existence of other chronic diseases. Table 7 in the appendix provides a complete description of the variables, along with their means and standard deviations.

Estimation Procedures.
The empirical strategy is to estimate pooled probit models, weighting each observation by its associated sample expansion factor.
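To make the estimation criterion concrete, the following sketch evaluates the weighted log-likelihood of a probit model, in which P(depress = 1) is the standard normal CDF of a linear index. It is not the authors' code: the function name and the data layout are assumptions, and in practice the coefficients would be obtained by maximizing this function with a numerical optimizer.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def weighted_probit_loglik(beta, X, y, weights):
    """Weighted log-likelihood of a probit model.

    beta: coefficients; X: rows of regressors (include 1.0 for an intercept);
    y: 0/1 outcomes; weights: sample expansion factors.
    """
    total = 0.0
    for row, outcome, w in zip(X, y, weights):
        index = sum(b * x for b, x in zip(beta, row))
        p = _STD_NORMAL.cdf(index)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += w * (math.log(p) if outcome == 1 else math.log(1.0 - p))
    return total
```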
As will be seen in the next section, depression prevalence rates are significantly higher among women. It is therefore possible that the effects of alleged determinants of depression risk depend on individual gender. In this context, we will apply the likelihood ratio test 7 for testing the null hypothesis that the coefficients of the empirical model are the same for men and women.

Preliminary Analyses
Table 2 shows average depression prevalence rates (1998, 2003, and 2008) conditional on the individual characteristics controlled for in the empirical estimations. This analysis will provide a better understanding of the causal effects of these characteristics on depression risk, which will be addressed in the next section. 8 Depression risk is significantly higher among women than among men. In 2008, for example, the prevalence rate among women was about 5.5 percentage points (p.p.) higher than among men. Given this significant difference between men and women, we compared the prevalence of depression by gender.
When conditioned to family income levels, one observes that depression risk differs between the distribution deciles. A nonlinear relationship might exist between family income and depression risk.
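Conditional prevalence rates of the kind reported in Table 2 can be computed as weighted shares within each group. The sketch below assumes hypothetical record fields (`depress`, `weight`, and a grouping field such as an income decile or sex); it is an illustration, not the procedure documented by the survey.

```python
from collections import defaultdict


def weighted_prevalence(records, group_key):
    """Weighted share (in %) of depress == 1 within each group."""
    cases = defaultdict(float)
    totals = defaultdict(float)
    for r in records:
        totals[r[group_key]] += r["weight"]
        cases[r[group_key]] += r["weight"] * r["depress"]
    return {g: 100.0 * cases[g] / totals[g] for g in totals}
```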
The same possibility of a nonlinear relationship with depression risk is observed in connection with age and, especially, schooling. For both men and women, the data suggest that the relationship between each of these variables and depression risk follows a parabolic curve with its vertex pointing upwards. For women, the highest depression rate occurs among those with 1-3 years of schooling, above which the depression rate drops consistently. For men, the highest prevalence rates occur in the group with zero or less than 1 year of schooling. It is noteworthy that a significant number of people fall under these categories, that is, 13.18% and 13.22%, respectively. At the same time, the lowest prevalence rates are observed among people with 11-14 years of schooling or with 15 years of schooling or more, that is, 3.18% and 3.16%, respectively. The same phenomenon has been observed for women, for whom the depression prevalence rate is much lower among those with similar education levels, that is, 7.32% and 6.66%, respectively. 30.05% of all men and 33.27% of all women fall under these two schooling categories. This is an indication that education can be a relevant factor in determining mental depression.
Apparently, depression risk depends on an individual's race or color. While 4.21% of white men have depression, the disease affects approximately 3.32% of mulatto and slightly more than 2.95% of black and Asian men. Among women, relative differences are less pronounced, except in the case of Asian women, among whom the depression rate is about half that observed among women of other races.
Conditional statistics indicate that family characteristics are relevant in determining depression risk. Differences in prevalence rates of the disease were observed as a result of marital status, responsibility of being a parent, degree of responsibility within the family, and also family size. In the latter case, the rate decreases as the family size increases. Prevalence of depression is strikingly high among people living alone, especially women (17.05%). Living with a spouse appears to reduce depression risk for both men and women. The same cannot be said in regard to the responsibility of being a parent. Among men with children living in the household, the prevalence rate of depression is lower than among those who do not have children. Concerning responsibility within the family, the prevalence rate of depression among heads of household is much higher than among those playing other roles in the family structure. It should be observed, however, that this difference is much more significant for women.
Conditioning depression rate to geographical features, it was observed that the highest depression prevalence rate occurs in Brazil's south region, where it is approximately twice as high as those observed in the north and northeast regions, the country's warmest ones.
Finally, a large difference was observed in the depression prevalence rate between people who had and those who did not have any of the eleven physical diseases considered in the study. Of all the variables, this one shows the highest percentage difference between groups, especially among men.
Obviously, the statistical analyses presented in this section cannot support any assumptions regarding causality. However, in association with the related literature, they can shape expectations about signs of coefficients for the alleged relationships between these variables and depression risk.
On the one hand, expectations are that depression risk is lower in the highest deciles of income distribution, and that it will be lower if the individual has a spouse, and as the family size increases. On the other hand, depression risk is expected to be higher if the individual is the reference person in the family, if he or she lives in an urban environment, and if he or she has any chronic physical disease. Differences in depression risk are expected as a function of color or race and region of residence. Moreover, the risk of depression seems to increase at decreasing rates with age and schooling, to such a degree that at some point the depression function changes from increasing to decreasing as these variables rise. As for having children, coefficients with negative and positive signs are expected for men and women, respectively. Finally, we expect the statistically significant variables to have a relatively stronger effect on depression risk among women than among men.

Results and Discussion
The results of likelihood ratio tests (LR) led us to reject the null hypothesis that model coefficients are gender-independent. We, therefore, analyzed the equations for conditional depression risk separately for men and women. The results of the estimations for the two models are shown in Tables 3 and 4, respectively. They show the marginal effects of the variables and their levels of statistical significance. The robust standard errors of the estimates are shown in Tables 8 and 9 in the appendix.
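One common way to set up such a test compares the log-likelihood of a pooled model (the restricted model, with common coefficients) with the sum of the log-likelihoods of the separate models for men and women (the unrestricted model). The sketch below assumes this setup; the function names are hypothetical, and the critical value must be taken from a chi-square table with degrees of freedom equal to the number of coefficients allowed to differ.

```python
def lr_statistic(loglik_pooled, loglik_men, loglik_women):
    """LR = 2 * (unrestricted log-likelihood - restricted log-likelihood)."""
    return 2.0 * ((loglik_men + loglik_women) - loglik_pooled)


def reject_equal_coefficients(loglik_pooled, loglik_men, loglik_women,
                              critical_value):
    """Reject H0 (same coefficients for men and women) if LR exceeds the
    chi-square critical value supplied by the caller."""
    lr = lr_statistic(loglik_pooled, loglik_men, loglik_women)
    return lr > critical_value
```

For example, with made-up log-likelihoods of -1050 (pooled), -500 (men), and -540 (women), the statistic is 20, which exceeds the 5% critical value of 5.991 for two degrees of freedom.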
In general, the expected results suggested in the previous section are confirmed by the results of the estimation of the empirical models.
Initially, we emphasize two points. First, the results of the estimates of both models (1 and 2) in their two specifications (A and B) show that the estimated coefficients remained stable regardless of changes in the regressor sets. When compared on the basis of the same specification (A or B), both models also indicate stability in the estimated coefficients. This is the first evidence of the robustness of the estimates. However, in both models there is evidence that aggregating all diseases in a single variable is not the best way to control for the presence of other diseases, as the estimates suggest that the effect on depression risk varies according to the type of disease. Second, although family income was significant in both models, due to the endogeneity issues discussed in Section 2.2, the second model specification is apparently more appropriate. For these reasons, only the results of model 2B (Table 2, column B) are analyzed.
As expected, due to the complexity of the phenomenon under analysis, although most variables are statistically significant, their marginal effects on depression risk are relatively small. A clear exception lies in the marginal effects of the existence of chronic physical diseases, in which the magnitude of the effects is relatively high.
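For reference, marginal effects in a probit model are usually computed as sketched below: for a continuous regressor, the standard normal density at the linear index times the coefficient; for a dummy variable, the difference in predicted probability when the dummy switches from 0 to 1. The function names are hypothetical, and the point at which the index is evaluated (for example, at sample means) is an assumption, not something this section specifies.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def marginal_effect_continuous(index, coef):
    """dP/dx = phi(index) * coef for a continuous regressor."""
    return _STD_NORMAL.pdf(index) * coef


def marginal_effect_dummy(index_without, coef):
    """P(y = 1 | d = 1) - P(y = 1 | d = 0) for a 0/1 regressor.

    index_without is the linear index with the dummy set to 0.
    """
    return _STD_NORMAL.cdf(index_without + coef) - _STD_NORMAL.cdf(index_without)
```

Multiplying either result by 100 expresses the effect in percentage points.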
Estimates of the effects of family income show that it has a negative impact on depression risk. For men, we found that the effect is statistically significant only for those at least in the fifth decile of the family income distribution, while for women the effect can be detected slightly before, in the fourth decile. Figure 1 shows these estimates (model 2B), but without considering their respective statistical significance. It should be recalled that the first decile was the base group considered in the estimations.
The hypothesis that the relationship between education and depression risk is nonlinear is supported by the signs of the statistically significant coefficients for the school and school2 variables, which are positive and negative, respectively. Therefore, the relationship between education and depression risk follows a parabolic curve with its vertex pointing upwards. The estimates suggest that maximum risk occurs at around 4 years of schooling for women and at approximately 8 years of schooling for men. The same type of nonlinear relationship was observed for age.
To provide a graphic illustration of the apparent quadratic relationship between risk of depression and education, we calculated and showed, in Figure 2, the risk of depression by gender, considering the mean of each variable of model 2B and varying only the years of schooling (0 to 15+). The figure shows that the maximum point is between 4 and 5 years of schooling for women and between 8 and 9 for men.
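The schooling level at which risk peaks follows from the two coefficients: a quadratic term b1*x + b2*x^2 with b2 < 0 reaches its maximum at x = -b1 / (2*b2). Because the normal CDF is strictly increasing, the predicted probability peaks at the same point as the linear index. A minimal sketch, with a hypothetical function name and made-up coefficients in the example (they are not the paper's estimates):

```python
def turning_point(coef_linear, coef_squared):
    """Value of x at which coef_linear*x + coef_squared*x**2 is maximal."""
    if coef_squared >= 0:
        raise ValueError("a maximum requires a negative squared-term coefficient")
    return -coef_linear / (2.0 * coef_squared)
```

For instance, illustrative coefficients of 0.5 and -0.0625 give a peak at 4 years, and 0.5 and -0.03125 give a peak at 8 years.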
The results support the hypotheses that race or color, region of residence, and living in an urban or rural area have a bearing on depression risk. The estimates suggest that the risk of depression is lower for black, mulatto, or Asian individuals as compared to white ones. These results are similar to those found by Mezuk et al. [29], Riolo et al. [30], and Jackson et al. [31], among others. Our estimates indicate that depression risk is higher in urban areas. This has also been observed by Paykel et al. [32] and Wang [33] based on data from the UK and Canada, respectively.
For people living in the south, Brazil's coldest region, the risk of depression is higher than for those living in the southeast region by approximately 2.3 p.p. for women and about 0.7 p.p. for men. On the other hand, for people living in the north and northeast, the hottest regions of the country, the risk is lower than for those living in the southeast. These results are supported in the literature on seasonal and climate factors that act on the human organism and have a bearing on depression risk. In this context, Harmatz et al. [34], Oyane et al. [35], and Magnusson [36] have found evidence that depression risk is higher during winter.
With respect to family, the results indicate that having a spouse reduces depression risk by approximately 1.8 p.p. for men and about 0.8 p.p. for women. The risk of depression decreases as family size increases. Moreover, being the reference person in the family decreases depression risk for men, but for women it is the other way around, as Santos and Kassouf [18] had already observed.
Women can be head of a household if they live with their children or other family members but are single, separated, divorced, or widowed, if they are married but their spouse has physical or mental problems that prevent him from working and generating income, or if they are married but their spouse does not contribute, or contributes little, to the family income due to drug or alcohol addiction. In such cases, responsibility for the family usually falls on women, who may feel pressured by economic or cultural factors, making them more vulnerable to depression. For cultural reasons, it is plausible that men feel better when they are seen as the head of the household.
In many cases, mental and behavioral disorders arise as a result of other chronic diseases [2]. Depression is debilitating enough by itself, and it can have a much greater effect if associated with physical diseases [37]. Little is known about the causality between mental and/or behavioral disorders and physical ailments. Some studies, however, emphasize that some physical diseases can lead to a chronic state of depression, as suggested, for example, by evidence found by Barkow et al. [24]. It is important to note that severe chronic physical illness can have serious financial implications for patients and their families. Moreover, people with chronic physical health problems are often penalized by loss of productivity, restrictions on their leisure and physical activities, discrimination from society, and a strong sense of inadequacy. The combination of these factors can trigger the onset of depressive symptoms.
Both the estimations from the model that considers all the eleven chronic physical diseases aggregated into a single variable and those generated by estimation from the model using disaggregated diseases indicate that having a chronic physical illness implies higher depression risk, as Moussavi et al. [25], Aneshensel et al. [38], and Barkow et al. [24] also observed. It should be noted that the estimates suggest that all the eleven diseases considered in our study increase the likelihood of depressive symptoms emerging. It is worth mentioning that although this relationship is gender-independent, its effect is much stronger for women. As for specific diseases, the results suggest that the strongest effect is caused by cirrhosis, for both men and women. Estimates indicate that this disease increases depression risk by approximately 5.3 p.p. for men and by about 10.7 p.p. for women.
Regarding the controls applied for possible fixed time effects, it was observed that dummies for year have coefficients that are negative and statistically significant in relation to 1998. This indicates that depression risk among Brazilians decreased over the considered period. It was estimated that depression risk, regardless of gender, was about 1.0 p.p. lower in 2008 than in 1998.
It is plausible that the improvements in economic and social conditions observed in Brazil during the analyzed  (Table 10, The appendix). The estimates were time-variant in many cases.
To further check the robustness of the results, and specifically to assess the validity of the quadratic relationship suggested by the results of previous estimates, the same model was reestimated using dummy variables for years of schooling. The results of these estimates by gender are shown in Table 11 in the appendix. The percentage marginal effects of these dummies, regardless of statistical significance, are shown in Figure 3. All of this suggests that, if it actually exists, the relationship between depression risk and education is nonlinear. Although less obvious, given the small number of points, a quadratic relationship might exist, corroborating the results observed previously.
As mentioned in Section 2.2, the main strategy to try and avoid the problem of endogeneity of income was to use dummy variables for the income deciles. Implicitly, however, it was assumed that the effect of depression on at least one member of the family is not strong enough to cause any shifts from one decile to another. However, such a possibility exists, especially because deciles are narrow bands. Therefore, as a robustness check, model 2B was reestimated using dummy variables of the income quartiles rather than deciles. Table 5 shows the mean and standard deviation of income within each quartile of the distribution of average family income. The large difference in average income between quartiles supports the argument that it is very unlikely that a drop in income for families with at least one depressed member is sufficiently strong to result in a significant number of families shifting from one quartile to another. The results shown in Table 12 in the appendix are suggestive of the robustness of previous estimates.

Depression associated with chronic physical illnesses worsens health status more than depression alone.

Depression and gender
Aneshensel et al. [26]: primary data collected in Los Angeles, USA, from interviews in 1979; GLS. Family and work roles tend to be associated with reduced depression risk, with greater effects on men.
Newmann [27]: Psychiatric Evaluation Research Instrument, 1978, Wisconsin, USA; OLS. Women are more likely than men to suffer hardships associated with the absence of a spouse, social isolation, financial difficulties, and chronic health problems. However, none of these hardships has a significantly greater impact on depressive syndrome levels for women than for men.

Prevalence of depression has peaks in winter.

Final Considerations
We saw that the estimates remained stable regardless of changes in the set of regressors, apparently confirming their robustness. The nonlinear effect of education was observed in both specifications that were tested. With regard to income, regardless of how it was measured, a negative effect on depression risk was observed.
We observed that the effects of the determinants of depression risk are usually significantly stronger for women. An apparent exception is the negative effect of marital status, which is at least twice as strong for men.
Family income proved to be a protective factor against depression for both men and women. Moreover, by controlling for it with deciles of distribution, we observed that the magnitude of its effect grows as the income increases.
The results show that for people included in the group of the richest 10%, as opposed to those in the group of the poorest 10%, the risk of depression drops by 1.1 p.p. for men and by 0.9 p.p. for women.
We found that depression risk as a function of age follows a parabolic curve with its vertex pointing upwards, showing that this risk rises at decreasing rates as the age increases. Our estimates suggest that (in the case of individuals from 25 to 65 years old), ceteris paribus, the maximum depression risk occurs at the age of 43.8 for women and at the age of 47.8 for men.
The results of this study support those already observed in Santos and Kassouf [18], including with respect to one's degree of responsibility within the family. Being the head of a household increases the risk of depression for women, but reduces it for men.
We believe that the quadratic form was more appropriate to capture the relationship between education and depression risk than the strategy used before, since it allowed us to estimate at what level schooling results in the highest risk of depression. Our estimates indicate that the maximum risk occurs at around 4 years of schooling for women and at approximately 8 years of schooling for men.
Finally, the estimates indicated that regional and racial factors have a bearing on depression risk and that there is a significant positive relation between chronic physical diseases and depression risk.
Using a sample consisting of younger individuals (25 to 65) than those previously used (30 to 80), we detected stronger effects for statistically significant variables. Therefore, similarly to how models were estimated by gender, future studies should evaluate the possibility of conditioning estimates by age group also.
The findings of this study reflect the association between mental depression and various studied factors, especially those of a socioeconomic nature. However, although the robustness of our estimates was checked in the light of

2. The total number of excluded observations is 619,732. The lower age limit was applied to reduce the possibility of analyzing individuals who were still studying, while the upper limit was applied to avoid any bias in the estimates, considering that the incidence of chronic diseases is comparatively much higher in older age brackets; the second filter was applied because the number of individuals in these two categories is relatively small; the two last filters were applied to measure the average family income appropriately.
* denotes significance at 1%; ** denotes significance at 5%; *** denotes significance at 10%.
position within the family was one of "reference person" or "spouse."
6. Physical chronic diseases diagnosed by a doctor or health professional: chronic spine or back problems, arthritis or rheumatism, cancer, diabetes, bronchitis or asthma, hypertension, heart disease, chronic renal failure, tuberculosis, tendonitis or tenosynovitis, and cirrhosis.
7. Details of the test are in Greene [40].
8. All the statistics presented here are for the sample used in the empirical estimations. They are therefore different from what one might get from the sample without the filters mentioned in Section 2.1, such as, for instance, the depression rates cited in the first section.