Thesis on an improved term weighting method for text categorization

Field of study

Computer Science

Uploaded by

Anonymous

Category

Thesis

2014

Detailed table of contents

ORIGINALITY STATEMENT

ABSTRACT

ACKNOWLEDGEMENTS

1. CHAPTER 1: INTRODUCTION

1.1. Motivation

1.2. Structure of this Thesis

2. CHAPTER 2: OVERVIEW OF TEXT CATEGORIZATION

2.1. Text Categorization tasks

2.1.1. Single-label and Multi-label Text Categorization

2.1.2. Flat and Hierarchical Text Categorization

2.2. Applications of Text Categorization

2.2.1. Automatic Document Indexing for IR Systems

2.2.2. Document Organization

2.2.3. Word Sense Disambiguation

2.2.4. Hierarchical Categorization of Web Pages

2.3. Machine learning approaches to Text Categorization

2.3.1. k Nearest Neighbor

2.3.2. Support Vector Machines

2.3.3. Performance Measures

3. CHAPTER 3: TERM WEIGHTING SCHEMES

3.1. Previous Term Weighting Schemes

3.1.1. Unsupervised Term Weighting Schemes

3.1.2. Supervised Term Weighting Schemes

3.2. Our New Term Weighting Scheme

3.2.1. Term Weighting Methods

3.2.2. Machine Learning Algorithm

3.2.3. Reuters News Corpus

3.2.4. 20 Newsgroups Corpus

3.2.5. Evaluation Measures

4. CHAPTER 4: RESULTS AND DISCUSSION

4.1. Results on the 20 Newsgroups corpus

4.2. Results on the Reuters News corpus

4.3. Discussion

4.4. Further Analysis

5. CHAPTER 5: CONCLUSION

List of Figures

List of Tables

List of Abbreviations