Fast Neural Network Algorithm for Solving Classification Tasks

Fast neural network algorithms solve classification tasks efficiently, improving both accuracy and processing speed in machine learning.

University

Virginia Commonwealth University

Major

Computer Science

Document type

thesis

2012

Detailed table of contents

Dedication

Acknowledgment

1. AN INTRODUCTION TO NEURAL NETWORKS

1.1. Introduction

1.2. Artificial Neural Network

1.3. Architectures of Neural Network

1.4. Learning Methods

1.4.1. Supervised Learning

1.4.2. Unsupervised Learning

1.4.3. Reinforcement Learning

1.5. Applications of Supervised Learning in NN

1.6. Perceptron

1.7. Perceptron Learning Algorithm

1.8. Multilayer Perceptron (MLP)

1.9. Activation Functions

2. OVERVIEW OF THE EXPERIMENT

2.1. Contents of Experimental Chapters

3. EXPERIMENTAL PROCEDURES OF DEVELOPING FAST NN ALGORITHM

3.1. Least Mean Squares Algorithm (LMS)

3.2. Adapting Learning Rate and the Momentum Term

3.3. Error Back-Propagation Algorithm (EBP)

3.4. Fast Neural Network Algorithm

3.4.1. Batch Learning Technique

3.4.2. Batch EBP Algorithm

3.4.3. Summary of the Fast Neural Network Algorithm

3.4.4. Issues to be Considered

3.4.4.1. Labeling Desired Output

3.4.4.2. Using a Single Neuron in the OL of One Model/K OL Neurons Structure for Two-Class Data Sets

3.5. Experimental Data Sets

3.5.1. Scaling Raw Data

3.5.2. Shuffling the Scaled Data Set

4. EXPERIMENTAL NEURAL NETWORK STRUCTURES

4.1. The Differences between Neural Network Structures

4.1.1. One Model/K Output Layer Neurons Structure

4.1.2. K Separate Models/One Output Layer Neuron Structure

4.1.3. K Joint Models/One Output Layer Neuron

4.1.3.1. One Model/K OL Neurons

4.1.3.2. K Separate Models/One OL Neuron

4.1.3.3. K Joint Models/One OL Neuron

5. EXPERIMENTAL RESULTS AND DISCUSSION

5.1. Controlling the Experimental Environment

5.2. Comparison of Three Different MLP Structures

5.2.1. Comparison of Three Different MLP Structures in Term of Accuracy

5.2.2. Comparison of Three Different MLP Structures in Terms of Structure Size

5.2.3. Comparison of Three Different MLP Structures in Term of Time Consumption

5.3. Using a Neuron in the OL of One Model/K OL Neurons Structure for Two-Class Data Sets

List of Tables

List of Figures

Abbreviations

Abstract

Summary

I. Overview of fast neural network algorithms for classification

Fast neural network algorithms have become an important tool in machine learning, particularly for classification tasks. Because neural networks can learn from large data sets, they help improve the accuracy and efficiency of data classification. This article covers the main aspects of the algorithm, from network structure to practical applications.

1.1. Neural networks and their role in machine learning

A neural network is a computational model loosely inspired by how the human brain works. It learns from data and improves its performance over time. Applying neural networks in machine learning has opened up many new opportunities for classification applications.

1.2. History of neural network development

Neural networks have gone through many stages of development since the 1940s. The introduction of algorithms such as EBP (Error Back Propagation) significantly improved their classification ability, especially on complex problems.

II. Challenges in applying neural networks to classification

Although neural networks offer many benefits, applying them to classification tasks also poses challenges. Overfitting, the choice of a suitable network structure, and the optimization of the training algorithm all need to be considered.

2.1. Overfitting in neural networks

Overfitting occurs when a model fits the training data too closely, including its noise, and therefore performs poorly on new data. Adjusting the network structure and using techniques such as dropout can help reduce the problem.
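As a rough illustration of the dropout technique mentioned above, the following sketch applies "inverted" dropout to a list of hidden-layer activations. It is not taken from the thesis; the function name `dropout`, the activation values and the drop probability are assumptions chosen for the example.

```python
import random


def dropout(activations, drop_prob, rng):
    """Inverted dropout: zero each activation with probability drop_prob and
    rescale the survivors so the expected value of each unit is unchanged."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


rng = random.Random(0)            # fixed seed only to make the sketch repeatable
hidden = [0.5, 1.0, -0.3, 0.8]    # hypothetical hidden-layer activations
dropped = dropout(hidden, drop_prob=0.5, rng=rng)
print(dropped)
```

Dropout is applied only during training; at prediction time the activations are used unchanged, which is why the surviving values are rescaled here.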

2.2. Choosing a suitable network structure

The network structure must be chosen carefully to match the type of data and the classification problem. Experimenting with different architectures can help find the best-performing one.

III. Methods for developing a fast neural network algorithm

Developing a fast neural network algorithm requires optimization methods and algorithmic improvements. Techniques such as adjusting the learning rate and using newer optimization algorithms can speed up training.

3.1. Learning rate adjustment

The learning rate is a key factor in training a neural network. Adjusting it can improve the model's convergence and reduce training time.
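The effect of the learning rate can be seen on a toy problem. The sketch below minimizes the one-dimensional cost (w - 3)^2 by gradient descent with an optional momentum term; the cost function, the function name `minimize` and all numeric settings are assumptions chosen for illustration and do not come from the thesis.

```python
def minimize(grad, w0, learning_rate, momentum=0.0, steps=50):
    """Gradient descent with an optional momentum term."""
    w, velocity = w0, 0.0
    for _ in range(steps):
        # The velocity accumulates past gradients; momentum=0 gives plain descent.
        velocity = momentum * velocity - learning_rate * grad(w)
        w += velocity
    return w


def grad(w):
    """Gradient of the toy cost (w - 3)^2, whose minimum is at w = 3."""
    return 2.0 * (w - 3.0)


slow = minimize(grad, 0.0, learning_rate=0.01)
fast = minimize(grad, 0.0, learning_rate=0.1)
with_momentum = minimize(grad, 0.0, learning_rate=0.01, momentum=0.9)
diverged = minimize(grad, 0.0, learning_rate=1.1)

for name, w in [("slow", slow), ("fast", fast),
                ("with_momentum", with_momentum), ("diverged", diverged)]:
    print(f"{name:14s} w = {w:.4f}  distance from minimum = {abs(w - 3.0):.4f}")
```

With the same number of steps, the small learning rate is still far from the minimum, a larger one or added momentum gets much closer, and a learning rate that is too large makes the iteration diverge.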

3.2. Using newer optimization algorithms

Optimization algorithms such as Adam and RMSprop have been shown to improve the training speed and accuracy of neural networks. Applying them can improve model performance.
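For readers unfamiliar with Adam, here is a minimal sketch of its commonly published update rule on a one-dimensional toy cost. The cost function, the function name `adam_minimize` and the choice `learning_rate=0.1` are assumptions for this example; the remaining hyperparameters are the usual defaults. Nothing here is taken from the thesis, whose own chapters cover learning rate adaptation and momentum.

```python
import math


def adam_minimize(grad, w0, learning_rate=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=500):
    """Minimize a one-dimensional cost with the Adam update rule."""
    w, m, v = w0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1.0 - beta1) * g        # running mean of gradients
        v = beta2 * v + (1.0 - beta2) * g * g    # running mean of squared gradients
        m_hat = m / (1.0 - beta1 ** t)           # bias correction for the zero start
        v_hat = v / (1.0 - beta2 ** t)
        w -= learning_rate * m_hat / (math.sqrt(v_hat) + eps)
    return w


# Toy cost (w - 3)^2 with gradient 2 (w - 3); its minimum is at w = 3.
w_star = adam_minimize(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(w_star)
```

The step size is scaled by the running root mean square of the gradient, so each parameter effectively gets its own learning rate.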

IV. Practical applications of neural networks in classification

Fast neural networks have been applied in many fields, from image recognition to semantic analysis. These applications not only improve accuracy but also save time and resources.

4.1. Image recognition

Neural networks have been used to recognize and classify images with high accuracy. Applications in healthcare, security, and e-commerce are increasingly common.

4.2. Semantic analysis

In natural language processing, neural networks help analyze and understand the meaning of text, which improves information retrieval and classification.

V. Conclusion and the future of neural networks in classification

Fast neural networks are opening up new opportunities in classification. As the technology continues to develop, neural networks promise further improvements and new applications.

5.1. Future development trends

Current research focuses on improving the ability of neural networks to handle large and complex data. Advances in AI will continue to drive progress in this area.

5.2. Impact on other fields

Neural networks affect not only machine learning but also industries such as healthcare, finance, and education. Integrating neural networks into existing systems will bring significant changes.

25/07/2025

Document excerpt:

Fast neural network algorithm for solving classification tasks

Virginia Commonwealth University
VCU Scholars Compass
Theses and Dissertations
Graduate School
2012

FAST NEURAL NETWORK ALGORITHM FOR SOLVING CLASSIFICATION TASKS
Noor Albarakati
Virginia Commonwealth University

Follow this and additional works at: https://scholarscompass.edu/etd
Part of the Computer Sciences Commons
© The Author
Downloaded from https://scholarscompass.edu/etd/2740

This Thesis is brought to you for free and open access by the Graduate School at VCU Scholars Compass. It has been accepted for inclusion in Theses and Dissertations by an authorized administrator of VCU Scholars Compass. For more information, please contact libcompass@vcu.

Albarakati 2012 All Rights Reserved

A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science at Virginia Commonwealth University.

By Noor Mubarak Albarakati
Bachelor of Science, King Abdul-Aziz University, Saudi Arabia, 2005

Director: Dr. Vojislav Kecman, Associate Professor, Department of Computer Science

Committee Members: Dr. Vojislav Kecman, Dr. Kayvan Najarian, Dr. Rosalyn Hobson

Virginia Commonwealth University
Richmond, Virginia
April, 2012

Dedication

This thesis is dedicated first to my parents: my dear father (“May you rest in peace, dear father”) and my dear mother. Thank you, my mother, for your endless love, unconditional support and encouragement. I owe my deepest gratitude to my dear siblings for their love, affection and moral support, especially my sister Nahla and my brother Noseir, who have been kind, caring and very patient with me during the tough times I went through, encouraging me and boosting my energy to its maximum. I would also like to thank my best friend Mahsa Zahary, who was always willing to raise my morale and give me the best suggestions.

I never felt lonely working in my lab when she was around. “You will be fine” was her favorite sentence to calm my stress. My best thanks go to Reyhaneh Mogharabnia, who always made splendid short visits to my lab.

Acknowledgment

This thesis would not have been possible without the full supervision, encouragement, guidance and support, from the initial step of how to do research to the final step of documenting it in thesis format, of Dr. Vojislav Kecman, associate professor in the Computer Science department. He really helped me to fully understand the thesis subject, directed me during all my experiments, and taught me research methodologies and how to write them down academically. I am really glad to have had the opportunity to work under Dr. Vojislav Kecman's supervision.

I am so thankful to all those people who helped me while I was working on this thesis. First, I would like to show my gratitude to Robert Strack, who always readily agreed to help me whenever I faced a problem. I am also very grateful to the entirely patient Michael Paul Pfaffenberger for grammatically correcting my thesis. I would also like to acknowledge my employer back home, Yanbu University College, and the Saudi Arabian Cultural Mission, for their academic and financial support.

Table of Contents

List of Tables
List of Figures

1 AN INTRODUCTION TO NEURAL NETWORKS
Artificial Neural Network
Architectures of Neural Network
Applications of Supervised Learning in NN
Perceptron Learning Algorithm
Threshold Activation Functions
Linear Activation Functions
Nonlinear Activation Functions
Unipolar Logistic (Sigmoidal) Function
Bipolar Sigmoidal Function (Hyperbolic Tangent Function)
Learning and Generalization
Over-Fitting and Under-Fitting Phenomena
Bias and Variance Dilemma
Controlling Generalization Errors
Problem Statement and Previous Work

2 OVERVIEW OF THE EXPERIMENT
Contents of Experimental Chapters

3 EXPERIMENTAL PROCEDURES OF DEVELOPING FAST NN ALGORITHM
Least Mean Squares Algorithm (LMS)
Adapting Learning Rate and the Momentum Term
Error Back-Propagation Algorithm (EBP)
Fast Neural Network Algorithm
Batch Learning Technique
Batch EBP Algorithm
Summary of the Fast Neural Network Algorithm
Issues to be Considered
Labeling Desired Output
Using a Single Neuron in the OL of One Model/K OL Neurons Structure for Two-Class Data Sets
Experimental Data Sets
Scaling Raw Data
Shuffling the Scaled Data Set

4 EXPERIMENTAL NEURAL NETWORK STRUCTURES
The Differences between Neural Network Structures
One Model/K Output Layer Neurons Structure
K Separate Models/One Output Layer Neuron Structure
K Joint Models/One Output Layer Neuron
One Model/K OL Neurons
K Separate Models/One OL Neuron
K Joint Models/One OL Neuron

5 EXPERIMENTAL RESULTS AND DISCUSSION
Controlling the Experimental Environment
Comparison of Three Different MLP Structures
Comparison of Three Different MLP Structures in Term of Accuracy
Comparison of Three Different MLP Structures in Terms of Structure Size
Comparison of Three Different MLP Structures in Term of Time Consumption
Using a Neuron in the OL of One Model/K OL Neurons Structure for Two-Class Data Sets

List of Tables

3.1 Experimental Data Set Information
Experimental Fixed Parameters
Experimental Variable Parameters
Accuracy of Three MLP Structures
Number of HL Neurons of Three MLP Structures
The Accuracy of Using One or Two OL Neurons in Vote Data Set

List of Figures

1.2 Multilayer Perceptron of One Hidden Layer and One Output Layer
The Trade-off between Bias and Variance
Cross Validation Procedure
A HL Neuron J has a Connection with an OL Neuron K in Details
4.1 One Model/K OL Neurons Structure
K Separate Models/One OL Neuron Structure
4.3 K Joint Models/One OL Neuron Structure
The Scores of Ranking Three Different MLP Structures
5.2 The Accuracy of Different MLP Structures of Eleven Data Sets
5.3 The Accuracy of Different MLP Structures of Eleven Data Sets
5.4 The Structure Size of Three MLP Structures for Eleven Data Sets
5.5 Training time of Different MLP Structures of Eleven Data Sets

Abbreviations

ANN: Artificial Neural Network
ART: Adaptive Resonance Theory
EBP: Error Back Propagation
FNN: Feedforward Neural Network
HL: Hidden Layer
IL: Input Layer
LMS: Least Mean Square
MLP: Multilayer Perceptrons
MSE: Mean Square Error
NN: Neural Network
OL: Output Layer
OvA: One-versus-All
RBFN: Radial Basis Function Network
RNN: Recurrent Neural Network
SLP: Single Layer Perceptron
SOM: Self-Organizing Map
SVM: Support Vector Machine

Abstract

FAST NEURAL NETWORK ALGORITHM FOR SOLVING CLASSIFICATION TASKS

By Noor M. Albarakati

A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science at Virginia Commonwealth University.

Virginia Commonwealth University, 2012.

Major Director: Dr. Vojislav Kecman, Associate Professor, Department of Computer Science

Classification is one of several applications of neural networks (NN). The multilayer perceptron (MLP) is the most common neural network architecture used for classification tasks. It is known for its error back propagation (EBP) algorithm, which opened a new way of solving classification problems given a set of empirical data.

In this thesis, we performed experiments with three different NN structures in order to find the best MLP structure for the nonlinear classification of multiclass data sets. The learning algorithm developed here is the batch EBP algorithm, which uses all the data as a single batch when updating the NN weights. Batch EBP speeds up training significantly, which is why the title of the thesis refers to a 'fast NN …'. In batch EBP, when linear neurons are used in the output layer, the pseudo-inverse algorithm is used to calculate the output layer weights.
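The pseudo-inverse step can be sketched as follows; this is an illustration of the idea under stated assumptions, not the thesis's implementation. With the hidden layer weights fixed and a linear output neuron, the output layer weights that minimize the summed squared error solve a linear least-squares problem. The sketch forms the normal equations (H^T H) w = H^T d, which gives the same result as applying the pseudo-inverse when H has full column rank. The matrix `H`, the targets `d` and both helper names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.
    Assumes the square matrix a is non-singular."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def output_layer_weights(hidden_outputs, desired):
    """Least-squares weights of one linear output neuron, obtained from the
    normal equations (H^T H) w = H^T d. Assumes H has full column rank."""
    cols = len(hidden_outputs[0])
    hth = [[sum(row[i] * row[j] for row in hidden_outputs)
            for j in range(cols)] for i in range(cols)]
    htd = [sum(row[i] * t for row, t in zip(hidden_outputs, desired))
           for i in range(cols)]
    return solve_linear_system(hth, htd)


# Hypothetical hidden-layer outputs for four samples; the last column is the
# constant bias input 1.
H = [[0.1, 0.9, 1.0],
     [0.8, 0.2, 1.0],
     [0.5, 0.5, 1.0],
     [0.9, 0.7, 1.0]]
# Hypothetical desired outputs, generated here as 2*h1 - h2 + 0.5 so that the
# recovered weights can be checked by eye.
d = [-0.2, 1.9, 1.0, 1.6]

w = output_layer_weights(H, d)
print(w)  # approximately [2.0, -1.0, 0.5]
```

For K output neurons the same solve is repeated once per column of desired outputs. A production implementation would normally use an SVD-based pseudo-inverse, which also handles rank-deficient H.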

In this way one always finds the local minimum of the cost function for the given hidden layer weights. Three different MLP structures have been investigated for solving classification problems with K classes: one model/K output layer neurons, K separate models/one output layer neuron, and K joint models/one output layer neuron. The extensive series of experiments performed within the thesis showed that the best structure for solving multiclass classification problems is the K joint models/one output layer neuron structure.

1. AN INTRODUCTION TO NEURAL NETWORKS

1.1. Introduction

Machine learning is a significant part of almost all research and development today.

Gaining knowledge from empirical data is the core of machine learning. The knowledge is acquired by changing the structure of a model, its parameters, or both in order to improve its expected performance on future data [3]. These changes are made to accomplish an artificial intelligence task such as learning, decision making, prediction, recognition, diagnosis, planning, or control. Different approaches are used to learn from data, such as support vector machines (SVM), decision trees, clustering, Bayesian networks, genetic programming, and artificial neural networks.

This thesis discusses learning from experimental data by using artificial neural networks. In particular, it develops a fast neural network algorithm and tests several neural network structures in order to find the best approach for multiclass classification problems.

1.2. Artificial Neural Network

An artificial neural network (ANN), often simply called a neural network, is a parallel computational model that takes its structure and function from biological neural networks. A neuron is the basic artificial node in the NN. It applies an activation function to the weighted sum of its inputs to generate an output.
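A single neuron as just described can be sketched in a few lines: it forms the weighted sum of its inputs plus a bias and passes the result through an activation function. The input values, weights and function names below are assumptions chosen for illustration.

```python
import math


def neuron_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    u = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(u)


def linear(u):
    return u


def logistic(u):
    """Unipolar logistic (sigmoidal) function, with values in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-u))


x = [1.0, -2.0, 0.5]     # hypothetical inputs
w = [0.4, 0.1, -0.6]     # hypothetical weights
b = 0.1                  # hypothetical bias

# For these values the weighted sum is 0 up to floating-point rounding, so the
# logistic output is about 0.5 and the tanh output is about 0.
print(neuron_output(x, w, b, linear))
print(neuron_output(x, w, b, logistic))
print(neuron_output(x, w, b, math.tanh))
```

Swapping the activation function changes only the last step; the weighted sum is the same for linear and nonlinear neurons.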

An activation function can be linear or nonlinear. Neurons are connected to each other by weights $w_i$. The output of a nonlinear neuron is given by

$$o = f(u) = f\left(\sum_{i=1}^{n} w_i x_i + b\right) = f(\mathbf{w}^T \mathbf{x} + b) \qquad (1.1)$$

where $u$ is the input to the neuron and $o$ is its output, $f(u)$ is a known dependency (mapping or function) between input and output, $x_i$ is the $i$-th input, $w_i$ is the $i$-th weight, $n$ is the total number of inputs, and $b$ is a threshold or bias.

1.3. Architectures of Neural Network

Neural networks can basically be divided into feedforward neural networks and recurrent neural networks. A feedforward neural network (FNN) architecture consists of a finite number of layers, each containing a finite number of neurons, connected in a feedforward manner.
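A feedforward pass through such layers can be sketched by applying equation (1.1) to every neuron of one layer and feeding the results to the next layer. The sketch assumes one hidden layer with tanh neurons and a linear output layer; the layer sizes, weights and function names are hypothetical.

```python
import math


def layer_forward(inputs, weights, biases, activation):
    """Outputs of one layer. Each row of `weights` holds the weights of one
    neuron for all inputs to the layer."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def mlp_forward(x, hidden_w, hidden_b, output_w, output_b):
    """Forward pass: inputs -> tanh hidden layer -> linear output layer."""
    hidden = layer_forward(x, hidden_w, hidden_b, math.tanh)
    return layer_forward(hidden, output_w, output_b, lambda u: u)


# Hypothetical network with 2 inputs, 3 hidden neurons and 2 output neurons.
hidden_w = [[0.5, -0.5], [1.0, 1.0], [0.0, 0.3]]
hidden_b = [0.0, -1.0, 0.2]
output_w = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]
output_b = [0.0, 0.1]

y = mlp_forward([1.0, 1.0], hidden_w, hidden_b, output_w, output_b)
print(y)
```

Information flows in one direction only: no output is fed back to an earlier layer, and neurons within a layer do not exchange values.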

There is no feedback connection anywhere in the network, and no connection between neurons within a single layer. The layers are connected by network weights. The number of neurons in a single layer has to be sufficient to solve the problem, and the number of layers should be as small as possible to reduce the problem-solving time. FNNs are classified as fully connected layered FNNs or partially connected layered FNNs.

When each neuron connects to every neuron in the next layer of the network, the network is considered a fully connected layered FNN. Otherwise, the FNN is considered partially connected. Multilayer perceptrons (MLP) and radial basis function networks (RBFN) are the most commonly used fully connected layered FNNs. In a recurrent neural network (RNN), there is at least one feedback connection, which makes this type of network a dynamic NN.

The Hopfield model and the Boltzmann machine are the most popular RNNs.

1.4. Learning Methods

A neural network has to learn its parameters, such as its weights, from training data (the learning process) in order to predict, or estimate, the correct output for any new input (the generalization process). Learning methods are mostly classified into supervised, unsupervised and reinforcement learning.

1.4.1. Supervised Learning

Supervised learning is basically about having the data set as pairs of input and desired output (x, d). The error-correction rule is a learning technique used in supervised learning algorithms: it directly compares the desired output d with the actual network output o for a given input x in order to minimize the error between them (e = d - o). During the training phase, the network weights are adjusted by feeding the errors back to the network.
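The error-correction idea can be sketched for the simplest case, a single linear neuron trained sample by sample. Each weight is moved in proportion to the error e = d - o and to the input it multiplies. This is a delta-rule (LMS-style) illustration, not the thesis's EBP algorithm; the training pairs, learning rate and function name are assumptions.

```python
def train_linear_neuron(samples, learning_rate=0.1, epochs=500):
    """Error-correction training of one linear neuron.
    samples: list of (x, d) pairs, where x is a list of inputs and d is the
    desired output."""
    n = len(samples[0][0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, d in samples:
            o = sum(w * xi for w, xi in zip(weights, x)) + bias  # actual output
            e = d - o                                            # error
            weights = [w + learning_rate * e * xi
                       for w, xi in zip(weights, x)]
            bias += learning_rate * e
    return weights, bias


# Hypothetical training pairs generated from d = 2*x1 - x2 + 0.5.
samples = [([0.0, 0.0], 0.5), ([1.0, 0.0], 2.5),
           ([0.0, 1.0], -0.5), ([1.0, 1.0], 1.5)]

weights, bias = train_linear_neuron(samples)
print(weights, bias)  # approaches [2.0, -1.0] and 0.5
```

Because the data are exactly linear, the error can be driven to zero; with noisy data the weights would settle near the least-squares solution instead.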

Usually, the mean square error (MSE) is used as the cost function [3]. Two neural network applications that apply supervised learning algorithms are classification and regression.
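As a small worked example of the cost function, the sketch below computes the mean square error between desired outputs d and actual outputs o. The values are hypothetical, and the plain average of squared errors is assumed; some texts include an extra factor of 1/2.

```python
def mean_square_error(desired, actual):
    """Average of the squared errors e = d - o over all samples."""
    if len(desired) != len(actual):
        raise ValueError("desired and actual must have the same length")
    return sum((d - o) ** 2 for d, o in zip(desired, actual)) / len(desired)


d = [1.0, -1.0, 1.0, -1.0]   # hypothetical desired outputs
o = [0.5, -1.0, 1.0, 0.0]    # hypothetical network outputs

mse = mean_square_error(d, o)
print(mse)  # 0.3125: the errors are 0.5, 0, 0, -1, so (0.25 + 1.0) / 4
```

Training lowers this value by adjusting the weights so that the outputs o move toward the desired values d.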
