ZOU Guangui, REN Ke, JI Yin, Ding Jianyu, ZHANG Shaomin. Fault recognition based on principal component analysis and k-nearest neighbor algorithm[J]. COAL GEOLOGY & EXPLORATION, 2021, 49(4): 15-23. DOI: 10.3969/j.issn.1001-1986.2021.04.003

Fault recognition based on principal component analysis and k-nearest neighbor algorithm

  • Faults are geological structures that can cause disasters and threaten the safety of coal mines, and mapping their distribution is one of the main purposes of 3D seismic exploration in coal mines. In interactive (human-computer) fault interpretation, the reliability of the result depends to a certain extent on the interpreter's knowledge. We propose an algorithm based on principal component analysis (PCA) and the k-nearest neighbor (kNN) algorithm to detect the distribution of faults along target horizons. The Yangdong Coal Mine in the Fengfeng Mining Area was selected as the study area, and ten seismic attributes were extracted from 3D seismic data acquired over the mining area and processed with high precision. PCA was used to combine these ten seismic attributes into six integrated attributes. The attribute information was then combined with fault information at 139 points, determined from 15 wells and 3 roadways in the mining area, to construct a known data set. From these data, two data sets were constructed, with training-to-testing ratios of 9:1 for data set 1 and 3:7 for data set 2. Using these data sets and 10-fold cross-validation, the accuracy of fault recognition with the kNN algorithm was 87.75% for data set 1 and 71.63% for data set 2. This indicates that the accuracy of fault identification is closely related to the amount of training data: when the training samples outnumber the testing samples, the accuracy is higher. When the attributes obtained after dimensionality reduction via PCA were used as inputs to the kNN model, the classification accuracy rose to 89.23% for data set 1 and 73.79% for data set 2. This improvement arises because PCA reduces the dimensionality of the original input features, which reduces the amount of calculation required and increases the characterization capability of the features. The results show that combining PCA and kNN can effectively identify fault distribution and improve the efficiency of fault interpretation.
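The workflow summarized in the abstract (reduce ten attributes to six with PCA, classify with kNN, score with 10-fold cross-validation) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the data are synthetic stand-ins for the 139 labeled points, and the function names, the choice of k = 5, the Euclidean distance, and the standardization step before PCA are all assumptions not stated in the source.

```python
import numpy as np

def pca_fit(X, n_components):
    """Learn standardization statistics and principal directions from X."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # guard against constant attributes
    Z = (X - mean) / std
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return mean, std, Vt[:n_components]

def pca_transform(X, mean, std, components):
    """Project X onto the principal directions learned by pca_fit."""
    return ((X - mean) / std) @ components.T

def knn_predict(X_train, y_train, X_test, k):
    """Majority vote among the k nearest training points (binary labels 0/1)."""
    dist = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    votes = y_train[nearest]  # shape (n_test, k)
    return (votes.mean(axis=1) > 0.5).astype(int)

def cross_val_accuracy(X, y, n_folds=10, k=5, n_components=None, seed=0):
    """Mean accuracy over n_folds; PCA (if requested) is fit on each training fold only."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    scores = []
    for i in range(n_folds):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        X_tr, X_te = X[train], X[test]
        if n_components is not None:
            params = pca_fit(X_tr, n_components)
            X_tr = pca_transform(X_tr, *params)
            X_te = pca_transform(X_te, *params)
        pred = knn_predict(X_tr, y[train], X_te, k)
        scores.append((pred == y[test]).mean())
    return float(np.mean(scores))

# Synthetic stand-in data (NOT the Yangdong data): 139 points, 10 attributes,
# label 1 = fault, 0 = no fault, with an artificial shift between classes.
rng = np.random.default_rng(42)
n_points, n_attrs = 139, 10
y = rng.integers(0, 2, size=n_points)
X = rng.normal(size=(n_points, n_attrs)) + 1.5 * y[:, None]

acc_raw = cross_val_accuracy(X, y, k=5)                  # kNN on all 10 attributes
acc_pca = cross_val_accuracy(X, y, k=5, n_components=6)  # kNN on 6 PCA components
print(f"kNN accuracy: {acc_raw:.3f}, PCA+kNN accuracy: {acc_pca:.3f}")
```

Fitting PCA inside each training fold, rather than on the full data set, keeps the held-out points from influencing the projection. The accuracies printed for the synthetic data say nothing about the figures reported in the abstract.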