Data source
Colorectal cancer incidence data were obtained from the Guangzhou Cancer Registry, which was established in 1998 and covers all permanent residents of Guangzhou City. In this registry, 120 qualified hospitals (those with tumor diagnosis and treatment qualifications) were requested to report all diagnosed cancer cases through a direct network reporting system. For each incident cancer case, the following information was registered: registered identification number (ID), medical ID, China Identity Card Number (unique for each resident), name, sex, birth date, occupation, ethnicity, permanent residential address, phone number, cancer site, ICD-10 code, basis for diagnosis, treatment, prognosis and pathologic report if available, ICD-O-3 code, hospital, and the diagnosing physician’s name. Since 2010, all cases have been distributed to Community Health Service Centers for follow-up. The physicians in the Community Health Service Centers checked and supplemented the data (particularly the resident’s address). All cancers of the colon and rectum (ICD-10 codes C18–20) were included in the analysis. Cases with a missing or incorrect address were excluded from the spatial and spatial–temporal analyses.
Population data were obtained from the 2010 National Population Census [5], which covers the population of every street in Guangzhou, and from the 2010–2014 Guangzhou Bureau of Public Health and Statistics report [6], which covers the population of every district in Guangzhou. The population of each street during 2011–2014 was calculated as the product of the district population in the corresponding year (2011–2014) and the proportion of the district's population residing on that street in 2010.
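The proportional allocation described above can be illustrated with a minimal sketch. The function name, street labels, and population figures are hypothetical and are used only to show the arithmetic; they are not data from the study.

```python
def estimate_street_populations(district_pop_by_year, street_pop_2010):
    """Split each year's district total across streets using 2010 census shares."""
    census_total = sum(street_pop_2010.values())
    # Share of the district's 2010 population living on each street.
    shares = {street: pop / census_total for street, pop in street_pop_2010.items()}
    return {
        year: {street: district_pop * share for street, share in shares.items()}
        for year, district_pop in district_pop_by_year.items()
    }


# Hypothetical example: one district made up of three streets.
street_pop_2010 = {"street_a": 50_000, "street_b": 30_000, "street_c": 20_000}
district_pop_by_year = {2011: 102_000, 2012: 104_000}
estimates = estimate_street_populations(district_pop_by_year, street_pop_2010)
```

This approach assumes that the relative distribution of residents among streets within a district stayed at its 2010 level throughout the study period.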
A map of the 167-street (town) administrative division in 2010 was obtained from the National Geographic Center of China. The World Geodetic System 1984 (WGS) coordinate system from the Defense Mapping Agency (DMA) was used. Although several streets were merged or divided between districts during 2010–2014, the 2010 street division was used throughout. The data were integrated in WGS, and ArcGIS 10.2 software (Esri, Redlands, CA, USA) was used for mapping and data visualization.
Spatial cluster analysis
A spatial cluster analysis of CRC cases was performed using spatial autocorrelation. A spatial cluster model of Guangzhou was established using the univariate Moran’s I tool in the OpenGeoDa 1.2.0 software (Luc Anselin, Urbana, IL, USA) as follows [7]:
$$I = \frac{{n\mathop \sum \nolimits_{i} \mathop \sum \nolimits_{j} W_{ij} \left( {X_{i} - \overline{X} } \right)\left( {X_{j} - \overline{X} } \right)}}{{\mathop \sum \nolimits_{i} \mathop \sum \nolimits_{j} W_{ij} \mathop \sum \nolimits_{i} \left( {X_{i} - \overline{X} } \right)^{2} }}$$
where \(X_{i}\) = the crude incidence of cancer for the ith street; \(\overline{X}\) = the mean crude incidence of cancer for all streets in the study area; \(X_{j}\) = the crude incidence of cancer for the jth street; \(W_{ij}\) = a weight parameter for the pair of streets i and j that represents proximity; and n = the number of streets.
Here, I > 0 indicates a clustered pattern (i.e., similar values are found together), I = 0 indicates a random pattern, and I < 0 indicates a dispersed pattern (i.e., high- and low-incidence values are scattered).
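As an illustration of the global Moran’s I formula, the following minimal sketch evaluates it directly for a toy set of four streets arranged in a line. The binary adjacency weights and the incidence values are assumptions chosen for the example; the weight scheme actually used in the analysis is not stated here, and no significance test is included.

```python
def global_morans_i(values, weights):
    """Global Moran's I for a list of values and a square weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [x - mean for x in values]
    # Numerator: weighted cross-products of deviations from the mean.
    cross = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    weight_sum = sum(sum(row) for row in weights)
    sum_sq = sum(d * d for d in dev)
    return n * cross / (weight_sum * sum_sq)


# Hypothetical weights: four streets in a row, neighbors share a border.
line_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
i_gradient = global_morans_i([1, 2, 3, 4], line_weights)      # similar values adjacent
i_alternating = global_morans_i([1, 4, 1, 4], line_weights)   # high and low interleaved
```

In this toy setting the smooth gradient gives a positive I (clustered pattern) and the alternating layout gives a negative I (dispersed pattern), matching the interpretation above.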
The cluster type and exact position were estimated using the univariate local Moran’s I tool (local indicators of spatial association, LISA) in the OpenGeoDa 1.2.0 software. The local Moran’s I (\(I_{i}\)) for the ith street was calculated according to Anselin’s report [8] as follows:
$$I_{i} = \frac{{\left( {X_{i} - \overline{X} } \right)\mathop \sum \nolimits_{j} W_{ij} \left( {X_{j} - \overline{X} } \right)}}{{S^{2} }}$$
where \(X_{i}\) = the crude incidence of cancer for the ith street; \(\overline{X}\) = the mean crude incidence of cancer for all streets in the study area; \(X_{j}\) = the crude incidence of cancer for the jth street; \(W_{ij}\) = a weight parameter for the pair of streets i and j that represents proximity; and S = the standard deviation of the crude incidence of cancer in the study area.
The local Moran’s I identifies statistically significant (at a 95% confidence level; P < 0.05) spatial clusters of streets with high or low crude cancer incidences. Clusters of streets with high crude cancer incidences (high–high) were considered “hot spots,” whereas clusters of streets with low crude cancer incidences (low–low) were considered “cold spots.” In addition, the local Moran’s I identifies streets with high crude cancer incidences that are surrounded mainly by streets with low crude cancer incidences (high–low), as well as streets with low crude cancer incidences that are surrounded mainly by streets with high crude cancer incidences (low–high).
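The local statistic and the four cluster types can be sketched as follows. This is a simplified illustration, not the software's implementation: it assumes the population variance for \(S^{2}\), uses hypothetical binary adjacency weights and incidence values, and omits the permutation-based significance test that decides which streets are actually reported as clusters.

```python
def local_morans_i(values, weights):
    """Return (I_i, cluster type) for each street; significance is not assessed."""
    n = len(values)
    mean = sum(values) / n
    dev = [x - mean for x in values]
    variance = sum(d * d for d in dev) / n  # assumption: population variance for S^2
    results = []
    for i in range(n):
        # Spatial lag: weighted sum of the neighbors' deviations from the mean.
        lag = sum(weights[i][j] * dev[j] for j in range(n))
        own = "high" if dev[i] > 0 else "low"
        neighbors = "high" if lag > 0 else "low"
        results.append((dev[i] * lag / variance, own + "-" + neighbors))
    return results


# Hypothetical example: four streets in a row with crude incidences per 100,000.
line_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
lisa = local_morans_i([10.0, 12.0, 2.0, 1.0], line_weights)
labels = [label for _, label in lisa]
```

Streets labeled high-high or low-low have a positive \(I_{i}\), and the two mixed types have a negative \(I_{i}\), because the sign depends on whether a street and its neighbors deviate from the mean in the same direction.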
Spatio-temporal scan
The spatio-temporal cluster detection test for CRC incidence was performed retrospectively using spatial scan statistics. The scan parameters were as follows: the study period was 2010 to 2014; the time interval was 1 year; the analysis type was retrospective space–time; the probability model was discrete Poisson; the maximum spatial cluster size was 10% of the population potentially at risk; the maximum temporal cluster size was 1 year; and the number of Monte Carlo simulations was set to 999. Then, the log-likelihood ratio (LLR) was calculated from the observed and expected incidences under the Poisson distribution in each scan window [9]. The likelihood ratio, whose logarithm is the LLR, is proportional to
$$\left( {\frac{n}{E}} \right)^{n} \left( {\frac{N - n}{N - E}} \right)^{N - n} I$$
where n = the number of cancer cases within the scan window; N = the total number of cancer cases in the population; E = the expected number of cancer cases under the null hypothesis; and I = 1 when the scan window has a larger number of cancer cases than expected if the null hypothesis were true, and 0 otherwise.
The high-incidence cluster areas exhibited the highest LLR values. Next, relative risk (RR) was calculated as the ratio of the incidence inside a cluster area to that outside the cluster area, and its significance was analyzed.
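For a single scan window, the LLR and RR can be computed as in the following minimal sketch. The function names and all counts are hypothetical. The expected count is taken as proportional to the window's share of the population, which assumes no covariate adjustment, and the Monte Carlo step used to obtain P values is not shown.

```python
import math


def poisson_llr(n, expected, total):
    """Log-likelihood ratio for one scan window under the discrete Poisson model."""
    if n <= expected:  # indicator I = 0: the window is not a high-incidence candidate
        return 0.0
    llr = n * math.log(n / expected)
    if total > n:  # skip the 0 * log(0) term when every case lies inside the window
        llr += (total - n) * math.log((total - n) / (total - expected))
    return llr


def relative_risk(n, total, pop_in, pop_total):
    """Incidence inside the window divided by incidence outside it."""
    inside = n / pop_in
    outside = (total - n) / (pop_total - pop_in)
    return inside / outside


# Hypothetical window: 150 of 1,000 cases among 100,000 of 1,000,000 residents.
total_cases, pop_total = 1000, 1_000_000
cases_in, pop_in = 150, 100_000
expected_in = total_cases * pop_in / pop_total  # 100 cases expected under the null
llr = poisson_llr(cases_in, expected_in, total_cases)
rr = relative_risk(cases_in, total_cases, pop_in, pop_total)
```

In a full scan, this calculation is repeated for every candidate window, and the window with the largest LLR is the most likely cluster.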
Statistical analysis
The age-standardized rate of incidence by the world standard population (ASRIW) was estimated using Segi’s 1964 world standard population. The descriptive analysis was carried out using Excel 2013 (Microsoft, Redmond, WA, USA). The spatial cluster analysis was performed by hypothesis testing of Z statistics for spatial aggregation indices using the OpenGeoDa 1.2.0 software. The spatio-temporal scan analysis was conducted using the SaTScan 9.4.2 software developed by the National Cancer Institute (NCI, Bethesda, MD, USA). A trend test of the average annual percentage change (AAPC) was performed using the Joinpoint Regression Program, version 4.0.4, also developed by the NCI. AAPC was calculated as follows:
$$\ln \left( {R_{y} } \right) = \alpha + \beta y,\,{\text{AAPC}} = \left( {e^{\beta } - 1} \right) \times 100,$$
where \(R_{y}\) is the incidence in year y, α is the intercept, and β is the slope of the regression of \(\ln (R_{y})\) on y.
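The AAPC formula can be illustrated with a minimal sketch that fits a single log-linear trend by ordinary least squares. This is a simplification: the Joinpoint Regression Program can fit several trend segments and provides the trend test, neither of which is reproduced here. The function name and the rates are hypothetical.

```python
import math


def aapc_log_linear(years, rates):
    """AAPC (%) from an ordinary least-squares fit of ln(rate) on year."""
    logs = [math.log(r) for r in rates]
    n = len(years)
    year_mean = sum(years) / n
    log_mean = sum(logs) / n
    # Slope beta of the regression ln(R_y) = alpha + beta * y.
    beta = sum((y - year_mean) * (l - log_mean) for y, l in zip(years, logs)) / sum(
        (y - year_mean) ** 2 for y in years
    )
    return (math.exp(beta) - 1) * 100


# Hypothetical rates that grow by exactly 5% per year from 2010 to 2014.
years = [2010, 2011, 2012, 2013, 2014]
rates = [20.0 * 1.05 ** k for k in range(5)]
aapc = aapc_log_linear(years, rates)
```

Because the example rates grow by a constant 5% each year, the fitted slope equals ln(1.05) and the function recovers an AAPC of about 5%.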