Journal of Systems Engineering and Electronics, 2023, Vol. 34, Issue (1): 36-46. doi: 10.23919/JSEE.2023.000034
• REMOTE SENSING •
Wei FENG1,2,3,*, Yijun LONG1,2,3, Shuo WANG1,2,3, Yinghui QUAN1,2,3
Received: 2022-09-06
Online: 2023-02-18
Published: 2023-03-03
Contact: Wei FENG
E-mail: wfeng@xidian.edu.cn; yjlong@stu.xidian.edu.cn; shuow@stu.xidian.edu.cn; yhquan@mail.xidian.edu.cn
Wei FENG, Yijun LONG, Shuo WANG, Yinghui QUAN. A review of addressing class noise problems of remote sensing classification[J]. Journal of Systems Engineering and Electronics, 2023, 34(1): 36-46.
Table 1 Class handling method: removal and correction
| Description | Noise removal | Noise correction |
| --- | --- | --- |
| Basic procedure | Detecting and eliminating noisy data through noise filters | Detecting and correcting the labels of the noisy instances with predicted labels |
| Non-ensemble method | SVM for outlier detection [ | Polishing: identify noisy data and replace its label by predicted labels [ |
| | Semi-supervised learning of probabilistic models for noise removal [ | Clustering: detect and group noisy data based on a neighborhood consistency constraint [ |
| | Scalable penalized regression for noise detection [ | |
| | The blame-based noise reduction algorithm [ | Density-based spatial clustering of applications with noise [ |
| Ensemble method | Bagging-majority vote filters [ | Active learning: active label correction (ALC-mislabeled and ALC-disagreement) to identify mislabeled data [ |
| | Identifying and eliminating data through subsets and error counts [ | |
| | Heterogeneous ensemble with four different base classifiers [ | Single-pass discarding and correcting [ |
| | Ensemble method based on the noise detection metric [ | |
| | Edge analysis: a boosting ensemble to detect noisy data based on the sum of the weights of weak classifiers [ | Classification: relabel the noisy data by the class that is most predicted [ |
| | Outlier removal boosting [ | |
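
The two strategies contrasted in Table 1 can be illustrated with a small sketch. The Python below is not the procedure of any cited paper. It assumes a hypothetical ensemble of three simple base classifiers (1-NN, 3-NN and nearest centroid), an interleaved 3-fold split, and a toy two-cluster data set with one deliberately mislabeled point. All function names are illustrative. An instance is flagged as noisy when more than half of the base classifiers, trained without it, disagree with its label; removal then drops the flagged instances, while correction relabels them with the most predicted class.

```python
from collections import Counter
import math


def knn_predict(train_X, train_y, x, k):
    """Majority label among the k training points closest to x."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]


def centroid_predict(train_X, train_y, x):
    """Label of the class whose mean vector is closest to x."""
    sums, counts = {}, Counter(train_y)
    for xi, yi in zip(train_X, train_y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for d, value in enumerate(xi):
            acc[d] += value
    centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
    return min(centroids, key=lambda c: math.dist(centroids[c], x))


# Hypothetical heterogeneous ensemble (an assumption made for illustration).
BASE_CLASSIFIERS = [
    lambda X, y, x: knn_predict(X, y, x, 1),
    lambda X, y, x: knn_predict(X, y, x, 3),
    centroid_predict,
]


def majority_vote_filter(X, y, n_folds=3):
    """Return (noisy_indices, votes) using held-out ensemble predictions."""
    votes = [None] * len(X)
    for fold in range(n_folds):
        # Interleaved split: instance i belongs to fold i % n_folds.
        train_idx = [i for i in range(len(X)) if i % n_folds != fold]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        for i in range(fold, len(X), n_folds):
            votes[i] = [clf(train_X, train_y, X[i]) for clf in BASE_CLASSIFIERS]
    # Flag an instance when more than half of the classifiers reject its label.
    noisy = [i for i, v in enumerate(votes)
             if sum(p != y[i] for p in v) > len(v) / 2]
    return noisy, votes


def remove_noise(X, y, noisy):
    """Noise removal: drop every flagged instance."""
    flagged = set(noisy)
    keep = [i for i in range(len(X)) if i not in flagged]
    return [X[i] for i in keep], [y[i] for i in keep]


def correct_noise(y, noisy, votes):
    """Noise correction: relabel flagged instances with the most predicted class."""
    fixed = list(y)
    for i in noisy:
        fixed[i] = Counter(votes[i]).most_common(1)[0][0]
    return fixed


# Toy data (assumed): two well-separated clusters, index 3 is mislabeled.
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.1, 0.0), (0.2, 0.3),
     (5.0, 5.0), (5.1, 5.2), (5.2, 5.1), (5.3, 5.3), (5.1, 5.0), (5.2, 5.3)]
y = ["a"] * 6 + ["b"] * 6
y[3] = "b"  # injected class noise

noisy, votes = majority_vote_filter(X, y)
X_removed, y_removed = remove_noise(X, y, noisy)
y_corrected = correct_noise(y, noisy, votes)
print("flagged:", noisy)
print("after removal:", len(y_removed), "instances")
print("after correction:", y_corrected)
```

On this toy data only the injected point is flagged. Removal leaves 11 instances, and correction restores the original label while keeping all 12. Replacing the majority threshold with a consensus rule (all classifiers must disagree) would make the filter more conservative.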
[1] FENG W, HUANG W J, BAO W X. Imbalanced hyperspectral image classification with an adaptive ensemble method based on SMOTE and rotation forest with differentiated sampling rates. IEEE Geoscience and Remote Sensing Letters, 2019, 16(12): 1879-1883. doi: 10.1109/LGRS.2019.2913387
[2] FENG W, QUAN Y H, DAUPHIN G, et al. Semi-supervised rotation forest based on ensemble margin theory for the classification of hyperspectral image with limited training data. Information Sciences, 2021, 575, 611-638. doi: 10.1016/j.ins.2021.06.059
[3] FENG W, DAUPHIN G, HUANG W J, et al. Dynamic synthetic minority over-sampling technique based rotation forest for the classification of imbalanced hyperspectral data. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2019, 12(7): 2159-2169. doi: 10.1109/JSTARS.2019.2922297
[4] HE F, WANG R, JIA W M. Fast semi-supervised learning with anchor graph for large hyperspectral images. Pattern Recognition Letters, 2020, 130, 319-326. doi: 10.1016/j.patrec.2018.08.008
[5] ZHU X Q, WU X D. Class noise vs. attribute noise: a quantitative study. Artificial Intelligence Review, 2004, 22(3): 177-210.
[6] ALGAN G, ULUSOY I. Image classification with deep learning in the presence of noisy labels: a survey. Knowledge-Based Systems, 2021, 215, 106771. doi: 10.1016/j.knosys.2021.106771
[7] GAMBERGER D, LAVRAC N, DZEROSKI S. Noise detection and elimination in data preprocessing: experiments in medical domains. Applied Artificial Intelligence, 2000, 14(2): 205-223. doi: 10.1080/088395100117124
[8] BRODLEY C E, FRIEDL M A. Identifying mislabeled training data. Journal of Artificial Intelligence Research, 1999, 11(1): 131-167.
[9] FENG W, DAUPHIN G, HUANG W J, et al. New margin-based subsampling iterative technique in modified random forests for classification. Knowledge-Based Systems, 2019, 182, 104845. doi: 10.1016/j.knosys.2019.07.016
[10] FENG W, QUAN Y H, DAUPHIN G. Label noise cleaning with an adaptive ensemble method based on noise detection metric. Sensors, 2020, 20(23): 6718. doi: 10.3390/s20236718
[11] GARCIA S, LUENGO J, HERRERA F. Dealing with noisy data. Data Preprocessing in Data Mining, 2015, 72, 107-145.
[12] LI P L, HE X H, CHENG X J, et al. An improved categorical cross entropy for remote sensing image classification based on noisy labels. Expert Systems with Applications, 2022, 205, 117296. doi: 10.1016/j.eswa.2022.117296
[13] VERBAETEN S, ASSCHE A V. Ensemble methods for noise elimination in classification problems. Proc. of the International Workshop on Multiple Classifier Systems, 2003, 317-325.
[14] MELLOR A, BOUKIR S, HAYWOOD A, et al. Using ensemble margin to explore issues of training data imbalance and mislabeling on large area land cover classification. Proc. of the International Conference on Image Processing, 2014, 5067-5071.
[15] WANG R Y, STOREY V C, FIRTH C P. A framework for analysis of data quality research. IEEE Trans. on Knowledge and Data Engineering, 1995, 7(4): 623-640. doi: 10.1109/69.404034
[16] CATAL C, ALAN O, BALKAN K. Class noise detection based on software metrics and ROC curves. Information Sciences, 2011, 181(21): 4867-4877. doi: 10.1016/j.ins.2011.06.017
[17] HERNANDEZ M A, STOLFO S J. Real-world data is dirty: data cleansing and the merge/purge problem. Data Mining and Knowledge Discovery, 1998, 2(1): 9-37. doi: 10.1023/A:1009761603038
[18] ZHU X Q, WU X B, CHEN Q J. Eliminating class noise in large datasets. Proc. of the 20th International Conference on Machine Learning, 2003, 920-927.
[19] PECHENIZKIY M, TSYMBAL A, PUURONEN S, et al. Class noise and supervised learning in medical domains: the effect of feature extraction. Proc. of the 19th IEEE International Symposium on Computer-Based Medical Systems, 2006, 708-713.
[20] FRENAY B, VERLEYSEN M. Classification in the presence of label noise: a survey. IEEE Trans. on Neural Networks and Learning Systems, 2013, 25(5): 845-869.
[21] SAEZ J A, CORCHADO E. ANCES: a novel method to repair attribute noise in classification problems. Pattern Recognition, 2022, 121, 108198. doi: 10.1016/j.patcog.2021.108198
[22] QUINLAN J R. Induction of decision trees. Machine Learning, 1986, 1(1): 81-106.
[23] AL-SABBAGH K W, STARON M, HEBIG R. Improving test case selection by handling class and attribute noise. Journal of Systems and Software, 2022, 183, 111093. doi: 10.1016/j.jss.2021.111093
[24] GUO G D, WANG H, BELL D A, et al. KNN model-based approach in classification. Proc. of the OTM Confederated International Conferences on the Move to Meaningful Internet Systems, 2003, 986-996.
[25] ALI K M, PAZZANI M J. Error reduction through learning multiple descriptions. Machine Learning, 1996, 24(3): 173-202.
[26] VAN DEN HOUT A, VAN DER HEIJDEN P G M. Randomized response, statistical disclosure control and misclassification: a review. International Statistical Review, 2002, 70(2): 269-288.
[27] BURGERT T, RAVANBAKHSH M, DEMIR B. On the effects of different types of label noise in multi-label remote sensing image classification. IEEE Trans. on Geoscience and Remote Sensing, 2022, 60, 5413713.
[28] SAEZ J A, LUENGO J, HERRERA F. Predicting noise filtering efficacy with data complexity measures for nearest neighbor classification. Pattern Recognition, 2013, 46(1): 355-364. doi: 10.1016/j.patcog.2012.07.009
[29] THONGKAM J, XU G D, ZHANG Y C, et al. Support vector machine for outlier detection in breast cancer survivability prediction. Proc. of the Asia-Pacific Web Conference, 2008, 99-109.
[30] SEGATA N, BLANZIERI E, DELANY S J, et al. Noise reduction for instance-based learning with a local maximal margin approach. Journal of Intelligent Information Systems, 2010, 35(2): 301-331. doi: 10.1007/s10844-009-0101-z
[31] TENG C M. Correcting noisy data. Proc. of the 16th International Conference on Machine Learning, 1999, 239-248.
[32] BARANDELA R, GASCA E. Decontamination of training samples for supervised pattern recognition methods. Pattern Recognition, 2000, 1876, 621-630.
[33] HUGHES N P, ROBERTS S J, TARASSENKO L. Semi-supervised learning of probabilistic models for ECG segmentation. Proc. of the 26th IEEE Annual International Conference, 2004, 1, 434-437.
[34] WANG Y K, SUN X, FU Y. Scalable penalized regression for noise detection in learning with noisy labels. Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, 346-355.
[35] DELANY S J, CUNNINGHAM P. An analysis of case-base editing in a spam filtering system. Advances in Case-Based Reasoning, 2004, 3155, 128-141.
[36] PRASAD M N, SOWMYA A. Multi-class unsupervised classification with label correction of HRCT lung images. Proc. of the International Conference on Intelligent Sensing and Information Processing, 2004, 51-56.
[37] ARAFA A, EL-FISHAWY N, BADAWY M, et al. RN-SMOTE: reduced noise SMOTE based on DBSCAN for enhancing imbalanced data classification. Journal of King Saud University-Computer and Information Sciences, 2022, 34(8): 5059-5074. doi: 10.1016/j.jksuci.2022.06.005
[38] BUSHRA A A, YI G M. Comparative analysis review of pioneering DBSCAN and successive density-based clustering algorithms. IEEE Access, 2021, 9, 87918-87935. doi: 10.1109/ACCESS.2021.3089036
[39] REBBAPRAGADA U, BRODLEY C E, SULLA-MENASHE D, et al. Active label correction. Proc. of the IEEE 12th International Conference on Data Mining, 2012, 1080-1085.
[40] HANCOX-LI L. Robustness in machine learning explanations: does it matter? Proc. of the Conference on Fairness, Accountability, and Transparency, 2020, 640-647.
[41] ABELLAN J, MASEGOSA A R. An experimental study about simple decision trees for bagging ensemble on datasets with classification noise. Symbolic and Quantitative Approaches to Reasoning with Uncertainty, 2009, 5590, 446-456.
[42] LI S K, XIA X B, GE S M, et al. Selective-supervised contrastive learning with noisy labels. Proc. of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, 316-325.
[43] DUDA R O, HART P E, STORK D G. Pattern classification. Hoboken: John Wiley & Sons, 2001.
[44] OKAMOTO S, YUGAMI N. An average-case analysis of the k-nearest neighbor classifier for noisy domain. Proc. of the 15th International Joint Conference on Artificial Intelligence, 1997, 1, 238-243.
[45] GUO L. Margin framework for ensemble classifiers: application to remote sensing data. Talence: University of Bordeaux, 2011.
[46] GUO L, BOUKIR S. Ensemble margin framework for image classification. Proc. of the IEEE International Conference on Image Processing, 2014, 4231-4235.
[47] SLUBAN B, GAMBERGER D, LAVRAC N. Ensemble-based noise detection: noise ranking and visual performance evaluation. Data Mining and Knowledge Discovery, 2013, 38, 265-303.
[48] KHOSHGOFTAAR T M, ZHONG S, JOSHI V. Enhancing software quality estimation using ensemble-classifier based noise filtering. Intelligent Data Analysis, 2005, 9(1): 3-27. doi: 10.3233/IDA-2005-9102
[49] QUINLAN J R. C4.5: programs for machine learning. San Francisco: Morgan Kaufmann Publishers Inc, 1993.
[50] MIRANDA A L B, GARCIA A C, CARVALHO L P, et al. Use of classification algorithms in noise detection and elimination. Proc. of the International Conference on Hybrid Artificial Intelligence Systems, 2009, 417-424.
[51] GUYON I, MATIC N, VAPNIK V. Discovering informative patterns and data cleaning. USAMA M F, GREGORY P S, PADHRAIC S, et al. ed. Advances in Knowledge Discovery and Data Mining. Menlo Park: American Association for Artificial Intelligence, 1996.
[52] WHEWAY V. Using boosting to detect noisy data. Proc. of the Pacific Rim International Conference on Artificial Intelligence Workshop Reader, 2001, 2112, 123-130.
[53] BREIMAN L. Arcing the edge. Berkeley: University of California, 1997.
[54] SCHAPIRE R E, FREUND Y, BARTLETT P, et al. Boosting the margin: a new explanation for the effectiveness of voting methods. The Annals of Statistics, 1998, 26(5): 1651-1686.
[55] KARMAKER A, KWEK S. A boosting approach to remove class label noise. International Journal of Hybrid Intelligent Systems, 2006, 3(3): 169-177. doi: 10.3233/HIS-2006-3305
[56] SLUBAN B, LAVRAC N. Relating ensemble diversity and performance: a study in class noise detection. Neurocomputing, 2015, 160, 120-131. doi: 10.1016/j.neucom.2014.10.086
[57] SHAO H C, WANG H C, SU W T, et al. Ensemble learning with manifold-based data splitting for noisy label correction. IEEE Trans. on Multimedia, 2021, 24, 1127-1140.
[58] CANTADOR I, DORRONSORO J R. Boosting parallel perceptrons for label noise reduction in classification problems. Proc. of the International Work Conference on the Interplay between Natural and Artificial Computation, 2005, 586-593.
[59] FREUND Y, SCHAPIRE R E. Experiments with a new boosting algorithm. Proc. of the 13th International Conference on Machine Learning, 1996, 148-156.
[60] CAO J J, KWONG S, WANG R. A noise-detection based adaboost algorithm for mislabeled data. Pattern Recognition, 2012, 45(12): 4451-4465. doi: 10.1016/j.patcog.2012.05.002
[61] VEZHNEVETS A, BARINOVA O. Avoiding boosting overfitting by removing confusing samples. Proc. of the European Conference on Machine Learning, 2007, 430-441.
[62] OZA N C. Aveboost2: boosting for noisy data. Proc. of the International Workshop on Multiple Classifier System-Multiple Classifier Systems, 2004, 3077, 31-40.
[63] DOMINGO C, WATANABE O. Madaboost: a modification of adaboost. Proc. of the 13th Annual Conference on Computational Learning Theory, 2000, 180-189.
[64] KIM Y D. Averaged boosting: a noise-robust ensemble method. Advances in Knowledge Discovery and Data Mining, 2003, 2637, 388-393.
[65] BARTLETT P L, JORDAN M I, MCAULIFFE J D. Convexity, classification, and risk bounds. Journal of the American Statistical Association, 2006, 101(473): 138-156. doi: 10.1198/016214505000000907
[66] KRIEGER A, LONG C, WYNER A. Boosting noisy data. Proc. of the 18th International Conference on Machine Learning, 2001, 274-281.
[67] ABELLAN J, CASTELLANO J G, MANTAS C J. A new robust classifier on noise domains: bagging of credal C4.5 trees. Complexity, 2017. doi: 10.1155/2017/9023970
[68] SABZEVARI M, MARTINEZ-MUNOZ G, SUAREZ A. Small margin ensembles can be robust to class-label noise. Neurocomputing, 2015, 160, 18-33. doi: 10.1016/j.neucom.2014.12.086
[69] SAEZ J A, GALAR M, LUENGO J, et al. Analyzing the presence of noise in multi-class problems: alleviating its influence with the one-vs-one decomposition. Knowledge and Information Systems, 2014, 38(1): 179-206. doi: 10.1007/s10115-012-0570-1
[70] COHEN W W. Fast effective rule induction. Proc. of the 12th International Conference on Machine Learning, 1995, 115-123.