Model settings and descriptive statistical analysis
This article mainly measures spatial effects. Spatial correlation manifests itself mainly in two forms: in the error term and in the spatial lag of the dependent variable. Before running the spatial regressions, we applied the LM test and found that the test statistics were significant at the 1% level, which supports the choice of a spatial econometric model. Both the LR test statistic and the Wald test statistic are significant at the 5% level, rejecting the null hypothesis that the spatial Durbin model (SDM) degenerates into the spatial lag model (SAR) or the spatial error model (SEM). Compared with the SEM and SAR, the SDM is therefore the better choice. The SDM is a combined extension of the SAR and SEM, and it reduces to either of them when the corresponding constraints are imposed on its coefficients. Its regression form is:
$$ y = X\upbeta + WX\updelta + \upvarepsilon $$
(1)
$$ y = \uplambda Wy + X\upbeta + WX\updelta + \upvarepsilon $$
(2)
In equations (1) and (2), \(W\) is the spatial weight matrix and \(WX\updelta\) represents the influence of the independent variables of neighboring provinces. \(\updelta\) is the vector of coefficients on the spatially lagged independent variables. \(\uplambda\) is the spatial autoregressive coefficient, which measures the influence of the dependent variable in neighboring provinces on local observations. \(\upvarepsilon\) is a vector of random error terms. The remainder of this section lists the explained variable, the core explanatory variable, and the selected control variables in turn.
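As a hedged illustration of how equation (2) nests the simpler models, the constraints examined by the LR and Wald tests can be written out as follows. The SEM form used here is the standard textbook specification and is an assumption, since the original does not write it out. Setting \(\updelta = 0\) in equation (2) gives the SAR model:
$$ y = \uplambda Wy + X\upbeta + \upvarepsilon $$
Setting \(\updelta = -\uplambda\upbeta\) gives
$$ y = \uplambda Wy + X\upbeta - \uplambda WX\upbeta + \upvarepsilon \quad\Longleftrightarrow\quad (I - \uplambda W)(y - X\upbeta) = \upvarepsilon $$
which is the SEM, \(y = X\upbeta + u\) with \(u = \uplambda Wu + \upvarepsilon\). Setting \(\uplambda = 0\) in equation (2) gives equation (1).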
(1) Explained variable: number of visits per person (\(vpc\)). Self-assessment is the main measure of population health in the academic literature. However, this method relies heavily on personal, subjective judgment and is not an objective, data-based measure. Therefore, this paper uses macro-level data to indicate the population's health level. Note that health includes not only physical health but also mental health, so changes in life expectancy and mortality rates alone are not enough to fully measure the population's level of health. Taking these factors into account, the mortality rate represents a quantitative aspect of the population's level of health and is measured as the number of deaths divided by the total population at the end of the year.
(2) Core explanatory variable: sports industry agglomeration (\(SI\)). As a comprehensive indicator, sports industry agglomeration is relatively complex to measure. In the previous section, we used the entropy weighting method to calculate the degree of agglomeration of China's interprovincial sports industry from 2006 to 2022 along three dimensions: the scale, speed, and quality of sports industry agglomeration. Detailed calculation results are presented in the previous section and are not repeated here.
(3) Control variables
Urbanization level (\(City\)): The urbanization level reflects the degree of civilization and socialization of a region, which affects the health of residents through population structure, labor migration, and social security. In this paper, we use the proportion of the urban population in the total population at the end of the year to represent the level of urbanization.
GDP per capita (\(GDP\)): China's economic development is uneven. Tertiary industries in economically developed regions are better developed than those in less developed regions. GDP per capita can represent the level of economic development of a region.
Consumer price index (\(cpi\)): The consumption level of residents can affect their health through dietary structure and consumption habits. The CPI is a relative number that reflects the trend and extent of price changes in the consumer goods and services purchased by urban and rural residents over a given period of time. In this paper, the CPI is used to represent the consumption level of residents.
Park green space area per person (\(ppg\)): The government's investment in the environment and greenery reflects its concern for the health of the population. In this paper, we measure it by park area per capita, calculated as regional park area/total population at the end of the year.
Educational investment (\(education\)): The importance people attach to health is influenced by the educational level of the population and is also related to the region's investment in education. The effect of the population's educational level on its health level operates with a lag. Therefore, this paper uses the intensity of educational investment as the measure, calculated as education budget expenditures/total fiscal budget expenditures.
Aging (\(old\)): The aging rate of a region represents its age structure and has a significant impact on resident mortality rates and the number of visits per resident. This paper follows the classification of the China Statistical Yearbook and defines the population aged 65 and over as the elderly population. Aging is measured as elderly population/total population.
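The entropy weighting method mentioned for the core explanatory variable can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' calculation: it assumes all indicators are positively oriented, uses min-max normalisation, and runs on hypothetical values for three indicators (scale, speed, quality). The function name `entropy_weights` and the sample data are invented for the example.

```python
import math

def entropy_weights(rows):
    """Entropy weighting for a list of observations (rows) over m indicators.

    Assumes every indicator is positively oriented (larger = more agglomeration).
    Returns (weights, composite_scores).
    """
    n, m = len(rows), len(rows[0])
    # Step 1: min-max normalise each indicator column to [0, 1].
    norm = [[0.0] * m for _ in range(n)]
    for j in range(m):
        col = [r[j] for r in rows]
        lo, span = min(col), max(col) - min(col)
        for i in range(n):
            norm[i][j] = (rows[i][j] - lo) / span if span > 0 else 0.0
    # Step 2: information entropy of each indicator.
    redundancy = []
    for j in range(m):
        total = sum(norm[i][j] for i in range(n))
        entropy = 1.0  # a constant column carries no information
        if total > 0:
            entropy = 0.0
            for i in range(n):
                p = norm[i][j] / total
                if p > 0:
                    entropy -= p * math.log(p)
            entropy /= math.log(n)
        redundancy.append(1.0 - entropy)
    # Step 3: weights are the normalised redundancies (equal weights if all are zero).
    s = sum(redundancy)
    weights = [d / s for d in redundancy] if s > 0 else [1.0 / m] * m
    # Step 4: composite score = weighted sum of normalised indicators.
    scores = [sum(weights[j] * norm[i][j] for j in range(m)) for i in range(n)]
    return weights, scores

# Hypothetical data: four provinces x (scale, speed, quality).
sample = [[10, 0.05, 0.6], [20, 0.08, 0.7], [15, 0.02, 0.9], [30, 0.04, 0.5]]
weights, scores = entropy_weights(sample)
```

Indicators whose values vary more across provinces have lower entropy and therefore receive larger weights in the composite agglomeration score.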
In the data used in the empirical tests, the degree of sports industry agglomeration measured in the previous section is the core explanatory variable, and the explained variable is the number of visits per person, with data taken from the China Health Statistics Yearbook. The data for the control variables are mainly obtained from the China Statistical Yearbook, China Industrial Statistics Yearbook, China Urban Statistics Yearbook, the website of the National Bureau of Statistics, and the previous year's China Economic and Social Development Statistics Database. Some indicators, such as urbanization and aging, are calculated from statistical data. Some of the missing data are filled in from provincial and municipal statistical yearbooks, and the rest are imputed by linear interpolation and linear trend techniques. Furthermore, considering the particular characteristics and data availability of the Tibet Autonomous Region, Hong Kong Special Administrative Region, Macau Special Administrative Region, and Taiwan, these four regions were excluded during data matching. Finally, balanced panel data were obtained for 30 provinces from 2006 to 2022. To reduce the effects of heteroskedasticity and of differences in the units of measurement of the variables on the results, this paper takes the logarithm of GDP per capita. Table 2 presents the descriptive statistics of all variables.
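The two preprocessing steps described above, linear interpolation of missing values and the logarithm of GDP per capita, can be sketched as follows. This is only an illustration on hypothetical numbers: the helper `interpolate_linear` is invented for the example, it fills interior gaps only, and it does not cover the linear trend technique the paper uses for other gaps.

```python
import math

def interpolate_linear(series):
    """Fill interior None gaps by linear interpolation between known neighbours.

    Leading or trailing gaps are left as None (they would need trend extrapolation).
    """
    filled = list(series)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled

# Hypothetical yearly series for one province with two missing years.
park_area = [10.0, None, None, 13.0, 14.0]
filled = interpolate_linear(park_area)

# Hypothetical GDP per capita values (yuan), log-transformed as in the text.
gdp_pc = [16000.0, 20000.0, 24000.0]
ln_gdp = [math.log(v) for v in gdp_pc]
```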
Spatial Durbin regression results
The Hausman test is performed based on the geographic weight matrix. The p-value of the Hausman test is 0.0012, indicating that a two-way (spatial and time) fixed effects specification is more appropriate for the SDM. Based on this, this article estimates a spatial Durbin model with two-way fixed effects. Model (1) represents the results of the spatial Durbin regression, and models (4), (5), and (6) represent the direct, indirect, and total effects of the variables, respectively.
As Table 3 shows, the core explanatory variable, sports industry agglomeration, passes the 1% significance test for the direct effect, indirect effect, and total effect, and its spatial regression coefficient is −6.783. This indicates that sports industry agglomeration reduces the number of visits per capita and has a positive spatial spillover effect on population health. It not only promotes the health of the population within a province but also benefits surrounding areas through spatial spillovers.
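The paper does not state how the direct, indirect, and total effects are computed. A common convention for the SDM is the partial-derivative decomposition, in which the effects of variable \(k\) are summarised from the matrix \((I - \uplambda W)^{-1}(I\upbeta_k + W\updelta_k)\). The sketch below assumes that convention; the function name `sdm_effects`, the four-unit weight matrix, and the coefficient values are hypothetical and are not taken from Table 3.

```python
import numpy as np

def sdm_effects(W, lam, beta_k, delta_k):
    """Average direct, indirect, and total effects of one regressor in an SDM.

    W       : (n, n) row-normalised spatial weight matrix with zero diagonal
    lam     : spatial autoregressive coefficient
    beta_k  : coefficient on the regressor itself
    delta_k : coefficient on the spatially lagged regressor
    """
    n = W.shape[0]
    identity = np.eye(n)
    # Matrix of partial derivatives dy_i / dx_jk.
    S = np.linalg.solve(identity - lam * W, beta_k * identity + delta_k * W)
    direct = np.trace(S) / n   # mean own-unit effect (diagonal)
    total = S.sum() / n        # mean row sum
    indirect = total - direct  # mean spillover effect (off-diagonal)
    return direct, indirect, total

# Hypothetical ring of four units, each with two equally weighted neighbours.
W = np.array([
    [0.0, 0.5, 0.0, 0.5],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.5, 0.0, 0.5, 0.0],
])
direct, indirect, total = sdm_effects(W, lam=0.3, beta_k=-2.0, delta_k=-1.0)
```

With a row-normalised \(W\), the total effect equals \((\upbeta_k + \updelta_k)/(1 - \uplambda)\), which is one reason the estimated coefficients alone should not be read as marginal effects.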
This empirical result is consistent with reality. First, sports industry agglomeration usually involves hosting sporting events and fitness activities and building sports facilities. These activities and facilities can raise the health awareness of local and surrounding residents and encourage them to participate more actively in sports, thereby reducing the incidence of disease. The construction of large-scale sports facilities, health clubs, parks, and other fitness facilities increases the accessibility and choice of physical activities for residents of the surrounding area. The services provided by these facilities can meet the fitness needs of residents and reduce the incidence of chronic diseases. Second, regions where the sports industry is concentrated tend to be economically advanced and often pay attention to living environments and community development, such as greening and leisure facilities. Improved living conditions and a positive community atmosphere help local and surrounding residents maintain a healthy lifestyle and reduce the incidence of disease. Finally, sports activities benefit not only physical health but also mental health. Participating in sports relieves stress, increases feelings of well-being, reduces psychological problems such as anxiety and depression, and reduces the need for treatment of mental health problems. Areas with a high concentration of sports industries can involve local residents in sports activities and attract residents from surrounding areas to participate as well.
Robustness test
Many factors influence the health level of the population, and it fluctuates with changes in the economic environment, the stage of development, and the consumption habits of residents. To ensure the reliability of the empirical study, this article tests the robustness of the spatial Durbin regression results by replacing the explanatory variable and the spatial weight matrix, and provides a brief analysis.
Replacing explanatory variables
Sports, culture, and entertainment belong to the tertiary industry, so the industrial structure (\(teeth\)) can also represent, to some extent, the agglomeration of sports industries within a region. To test the robustness of the above empirical model, this paper takes the industrial structure of each province from 2006 to 2022 as the explanatory variable. Many indicators are currently used to measure the sophistication of industrial structure. In general, the sophistication of industrial structure is characterized by changes in the relative scale of industries. In this paper, we use the ratio of the added value of the tertiary industry to the added value of the secondary industry to measure the sophistication of the industrial structure.
Replacing the spatial weight matrix
The spatial Durbin model in the preceding analysis uses a geographic weight matrix; here it is replaced by an economic weight matrix to test the robustness of the empirical results. An economic weight matrix measures the closeness of regions in terms of economic indicators and can reflect, to some extent, the spatial spillover effects of sports industry agglomeration.
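The paper does not specify how either weight matrix is constructed. As a hedged sketch, the code below uses two common conventions: inverse distance for the geographic matrix and the inverse absolute gap in GDP per capita for the economic matrix, both row-normalised. The function names and the three-region inputs are hypothetical, and the economic variant assumes no two regions have identical GDP per capita.

```python
def row_normalise(matrix):
    """Scale each row so that it sums to 1 (rows of zeros are left as zeros)."""
    out = []
    for row in matrix:
        s = sum(row)
        out.append([v / s if s > 0 else 0.0 for v in row])
    return out

def inverse_distance_weights(dist):
    """Geographic weights: w_ij = 1 / d_ij for i != j, zero diagonal."""
    n = len(dist)
    raw = [[0.0 if i == j else 1.0 / dist[i][j] for j in range(n)]
           for i in range(n)]
    return row_normalise(raw)

def economic_weights(gdp_pc):
    """Economic weights: w_ij = 1 / |gdp_i - gdp_j| for i != j, zero diagonal."""
    n = len(gdp_pc)
    raw = [[0.0 if i == j else 1.0 / abs(gdp_pc[i] - gdp_pc[j]) for j in range(n)]
           for i in range(n)]
    return row_normalise(raw)

# Hypothetical inputs for three regions.
distances = [[0.0, 100.0, 400.0],
             [100.0, 0.0, 200.0],
             [400.0, 200.0, 0.0]]
gdp_per_capita = [3.0, 5.0, 9.0]  # e.g. in 10,000 yuan

W_geo = inverse_distance_weights(distances)
W_eco = economic_weights(gdp_per_capita)
```

Re-estimating the same model with `W_eco` in place of `W_geo` is the kind of substitution the robustness check describes: if the sign and significance of the agglomeration coefficient survive, the result does not hinge on the definition of neighbourhood.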
Table 4 shows the results of the robustness test. In the regression using industrial structure as the explanatory variable, the regression coefficient of industrial structure in model (1) is −0.1075, which is significant at the 10% level and consistent in sign with the results in Table 3. Model (2) reports the spatial Durbin regression with the geographic weight matrix replaced by the economic weight matrix; the regression coefficient is −6.812 and passes the 1% significance test, which is consistent with the regression results for sports industry agglomeration in Table 3. Therefore, the above empirical results are robust.