MINISTRY OF EDUCATION AND TRAINING MINISTRY OF NATIONAL DEFENSE MILITARY TECHNICAL ACADEMY NGUYEN LONG A MULTI-OBJECTIVE EVOLUTIONARY ALGORITHM USING DIRECTIONS OF IMPROVEMENT AND APPLICATION THE THESIS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY IN MATHEMATICS Hanoi – 2014 e MINISTRY OF EDUCATION AND TRAINING MINISTRY OF NATIONAL DEFENSE MILITARY TECHNICAL ACADEMY A MULTI-OBJECTIVE EVOLUTIONARY ALGORITHM USING DIRECTIONS OF IMPROVEMENT AND APPLICATION Specialized in: Fundamentals of Mathematics for Informatics Code: 62 46 01 10 THE THESIS IS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY IN MATHEMATICS SUPERVISORS: 1. DR BUI THU LAM 2. DR NGUYEN VAN HAI Hanoi - 2014 e e Abstract A multi-objective optimization problem involves at least two conflicting objectives and it has a set of Pareto optimal solutions. Multi-objective evolutionary algorithms (MOEAs) use a population of solutions to approximate the Pareto optimal set in a single run.
MOEAs have attracted a lot of research attention during the past decade. They are still one of the hottest research areas in the field of Computational Intelligence and they are the main focus of this thesis. Firstly, the main concepts for multi-objective optimization are presented, then the thesis con- cerns about mentions the solving multi-objective optimization problems by multi-objective evolutionary algorithms. This thesis also conducts a survey on the usage of directorial infor- mation in search’s guidance.
Through the survey, the thesis indicates that there is a need to have more investigation on how to have an e↵ective guidance from both aspects: 1. Automatically guiding the evolutionary process to make the MOEA balanced between exploitation and exploration. Combining decision maker’s preference with directions of improvement to guide the MOEAs during optimal process toward the most preferred region in the objective space. To address this, the thesis builds up all its proposals based on a direction based multi- objective evolutionary algorithm (DMEA), the most recent one with a systematic way to maintain directions of improvement so some related issues on DMEA are raised and anal- ysed, hypothesised as primary research problems in this thesis.
At the highlighted chapters, the thesis discusses all the issues on using directions of improve- ment in DMEA through thesis’s contributions: 1. Design a new proposed direction based multi-objective evolutionary algorithm version e ii II (DMEA-II) with following improvement techniques: • Using an adaptive ratio between convergence and spread directions. • Using a Ray based density niching method for the main population. • Using a new Ray based density selection scheme for dominated solutions selection.
• Using a new parents selection scheme for the o↵springs perturbation. In order to validate the proposed algorithm, a series of experiments on a wide range of test problems was conducted. It obtained quite good results on primary performance metrics, including the generation distance (GD), the inverse generation distance (IGD), the hypervolume (HYP) and the two set coverage (SC). The analysis on the results indicates the better performance of DMEA-II in comparison with the most popular MOEAs.
Proposes an interactive method for DMEA-II as the second aspect of having an e↵ective guidance. An interactive method is introduced with three ray based approaches: Rays Replacement, Rays Redistribution, Value Added Niching. The experiments carried out a case study on several test problems and showed quite good results. Introduces a SpamAssassin based Spam Email Detection System that uses DMEA- II.
The proposed system helps users to have more good choices for the SpamAssassin system in configuration. e iii Acknowledgements The first of all, I would like to express my respectful thanks to my principal supervisor, Assoc. Bui Thu Lam for his directly guidance to my PhD progress. Bui has given me knowledge and passion as the motivation of this thesis.
His valued guidance has inspired much of the research in the thesis. I also wish to thank my co-supportive Assoc. Nguyen Van Hai for his suggestions and knowledge during my research, especially the relation between theories and real problems in work. I also would like to thank Prof.
Hussein Abbass, Assoc. Tran Quang Anh and Assoc. Dao Thanh Tinh for their invaluable support throughout my PhD. I feel lucky to work with such excellent people.
I also would like to thank all of my fellows in the Department of Software Technology and Evolutionary Computation research group for their assistance and support. Last but not least, I also would like to acknowledge the support of my family, especially my parents Dr. Nguyen Nghi, Truong Thi Hong, they worked hard and believed strongly in their children. I also would like to thanks my wife, sisters, brothers who always support me during my research.
e iv Originality Statement I hereby declare that this thesis is my own work, with my knowledge and belief the thesis has no material previously published or written by others. Any contributions made to the research by colleagues, with people in our research team at Le Quy Don Technical University or elsewhere, during my candidature is clearly acknowledged. I also declare that the intellectual content in this submission is the research results of my own work, except to the extent that assistance from others in conception or in style, presentation and linguistic expression is acknowledged. ev Contents Abstract ii List of Figures ix List of Tables xi Abbreviations xii 1 Introduction 1 1.4 Questions and Hypothesises.
10 2 Background concepts and Issues 13 2.1 Multi-objective problems .5 Weak Pareto Optimality .1 No-preference methods .3 An overview of Multi-objective Evolutionary Algorithms .1 Non-elitist methods .5 Search’s guidance in MOEAs .1 Technique of using guided directions .2 Advantages and disadvantages .1 Direction based multi-objective evolutionary algorithm (DMEA) .2 Issue 01: The disadvantages of the fixed ratio between types of directions 51 2.3 Issue 02: Lack of an efficient niching method for the main population .4 Issue 03: The disadvantages of using the weighted sum scheme .5 Issue 04: Using a ’hard’ niching method .6 Issue 05: Investigating on how the DM can interact with DMEA. 54 3 A guided methodology using directions of improvement 55 3.1 Using an adaptive ratio between convergence and spread directions .2 Using a Ray based density niching for the main population .3 Using a ray based density selection schemes .4 Direction based Multi-objective Evolutionary Algorithm-II .4 Results and Discussion .5 Analyzing e↵ects of di↵erent selection schemes for the perturbation. 86 4 A guided methodology using interaction with decision makers 87 4.2 A multi-point Interactive method for DMEA-II .3 Value Added Niching .5 Results and Discussion. 102 5 An application of DMEA-II for a spam email detection system 104 5.2 Spam email detection .3 An interactive method .6 Results and Discussion.
123 6 Conclusions and Future Work 124 6. 129 Publications 130 Appendix A Benchmark sets 132 e viii List of Figures 2.1 An illustration of optimal Pareto .2 An illustration of weak optimal Pareto .3 An illustration of the weighted-sum approach .4 An illustration of the ✏-constraint approach .5 An illustration of performance metrics .6 An illustration of descent directions .7 An illustration of Pareto descent directions .8 An illustration of determination directions in di↵erent cases .9 An illustration of di↵erential directions .10 An illustration of directional convergence and directional spread .11 An illustration of the movement of a centroid .12 An illustration of convergence and spread directions .13 An illustration of the ray system .14 An illustration of the performance of DMEA .1 An illustration of the Ray-based Density .2 The obtained non-dominated of DMEA and DMEA-II .3 Results on DTLZ2, UF1, UF3 and UF8 .4 Visualization of GD and IGD overtime for ZDT1, ZDT4 .5 The chart for DMEA-II and DMEA comparison on GD, IGD and HYP .6 The chart for DMEA-II and other MOEAs comparison on GD .7 The chart for DMEA-II and other MOEAs comparison on IGD .8 The chart for DMEA-II and other MOEAs comparison on HYP .9 The chart for DMEA-II and other MOEAs comparison on SC .10 Visualization of GD and IGD over time for ZDT1, ZDT2 .11 Visualization of GD and IGD over time for ZDT3, DTLZ3 .1 An illustration of altering the reference point .2 An illustration of the use reference direction approach .3 An illustration of the rays replacement approach .4 An illustration of the rays redistribution approach .5 An Illustration of the value added niching approach .6 A visualization of the interactive method on ZDT1 .7 A visualization of the interactive method on ZDT2 .8 A visualization of the interactive method on ZDT3 .9 A visualization of the interactive method on ZDT4 .10 A visualization of the interactive method on ZDT6 .1 An illustration of results with 30 and 100 rules for 272 emails .2 An illustration of results with 30 and 100 rules for 426 emails .3 An illustration of results with 30 and 100 rules for 286 multilingual emails .4 Results for the Rays Replacement approach with 30 rules .5 Results for the Rays Replacement approach with 50 rules .6 Results for the Rays Replacement approach with 100 rules .7 Results for the Rays Redistribution approach with 30 rules .8 Results for the Rays Redistribution approach with 50 rules .9 Results for the Rays Redistribution approach with 100 rules .10 Results for the Value Added Niching approach with 30 rules .11 Results for the Value Added Niching approach with 50 rules .12 Results for the Value Added Niching approach with 100 rules. 122 ex List of Tables 3.1 The main features of test problems .2 Common parameter settings .4 The average values of GD, IGD and HYP .5 The average value of GD .6 The average value of IGD .7 The average value of HYP .8 The comparison of DMEA-II and others on SC .9 The GD, IGD, HYP, SC results for DMEA-II and MOEA/D .10 The GD values of DMEA-II and DMEA-II* over the first 200 generations .11 The IGD values of DMEA-II and DMEA-II* over the first 200 generations .1 The main features of ZDT problems .2 The result of SOOA with 30 and 100 rules for 272 emails .3 The result of SOOA with 30 and 100 rules for 426 emails .4 The result of SOOA with 30 and 100 rules for 286 multilingual emails. 139 e xi Abbreviations Abbreviation Meaning EA Evolutionary Algorithm GA Genetic Algorithm ES Evolution Strategies EP Evolution Programming GP Genetic Programming MOP Multi-objective Optimization Problem MOEA Multi-objective Evolutionary Algorithm POF Pareto Optimal Front POS Pareto Optimal Set RD Ray based Density DMEA Direction based Multi-objective Evolutionary Algorithm DMEA-II Direction based Multi-objective Evolutionary Algorithm-II NSGA-II Non-Dominated Sorting Genetic Algorithm II SPEA2 Strength Pareto Evolutionary Algorithm 2 MOEA/D Multi-objective Evolutionary Algorithm Based on Decomposition MOGA Multi-objective Genetic Algorithm NPGA Niched Pareto Genetic Algorithm PAES Pareto-Archived Evolution Strategy MOPSO Multi-objective Particle Swarm Optimization PDE Pareto Di↵erential Evolution DM Decision Maker GD Generational Distance IGD Inverse Generational Distance HYP Hypervolume SC Two Set Converge SDR Spam Detection Rate FAR False Alarm Rate VSDSA Vietnamese spam detection based on SpamAssassin CD Convergence Direction SD Spread Direction DC Directorial Convergence DS Directorial Spread e xii ! BẢNG THUẬT NGỮ SỬ DỤNG TRONG LUẬN ÁN Tiếng Anh Tiếng Việt Evolutionary Algorithm Giải thuật tiến hóa Multi-objective Optimization Problem Bài toán tối ưu đa mục tiêu Multi-objective Evolutionary Algorithm Giải thuật tiến hóa Pareto Optimal Front Lớp tối ưu Pareto Pareto Optimal Set Tập tối ưu Pareto Directions of Improvement Hướng cải thiện Convergence Direction Hướng hội tụ Spread Direction Hướng tản mát Differential Direction Hướng vi phân Gradient Direction Hướng Gradient Generational Distance Khoảng cách thế hệ Inverse Generational Distance Khoảng cách thế hệ đảo Hypervolume Siêu diện tích Spam Detection Rate Tỷ lệ nhận dạng thư rác False Alarm Rate Tỷ lệ nhận dạng sai Decision Maker Người ra quyết định Reference point Điểm tham chiếu Reference region Vùng tham chiếu Spam Detection System Hệ thống lọc thư rác Interactive method Phương pháp tương tác ! xiii! e Chapter 1 Introduction 1.