Antique Facial Video Restoration: Spatio-Temporal Re-rendering for Old Films

Computer science graduation thesis on restoring old portrait videos through frame re-rendering using spatio-temporal information, applying theory to practice.

University

Vietnam National University - Ho Chi Minh City University of Information Technology

Major

Bachelor of Computer Science

Type

Bachelor Thesis

2023

Detailed table of contents

THESIS DEFENSE COMMITTEE

Acknowledgements

Abstract

1. Introduction

1.1. Overview

1.2. Problem Definition

1.3. Challenges

1.4. Motivation

1.5. Objectives

1.6. Contributions

1.7. Dissertation Structure

2. Background

2.1. GAN

2.2. Generative Adversarial Training (GAN)

2.3. Intuition of GAN

3. Related work

4. Method

4.1. Spatio-temporal Re-rendering for Antique Facial Video Restoration

5. Experiment

5.1. Peak Signal-to-Noise Ratio (PSNR)

5.2. Structural Similarity Index Measure (SSIM)

5.3. Frechet Video Distance (FVD)

5.4. Future Work

References

List of figures

List of tables

Summary

I. Overview of Antique Facial Video Restoration

Antique facial video restoration is a research area that aims to recover historically valuable videos. Such videos are often degraded by old recording technology and poor storage conditions. Restoring them helps preserve cultural heritage and can also be applied in areas such as security. Spatio-temporal re-rendering, the approach proposed in the thesis, targets this problem.

1.1. The Concept of Video Restoration

Video restoration is the process of improving the image and audio quality of old videos. This includes removing noise, increasing resolution, and restoring color.

1.2. The Importance of Antique Facial Videos

Antique facial videos carry historical value and help us better understand the culture and people of the past. Restoring them preserves these values for future generations.

II. Challenges in Antique Facial Video Restoration

Antique facial video restoration faces several challenges, including poor image quality, distortion, and a lack of training data. These issues make restoration complex and time-consuming.

2.1. Image Quality Issues

Many antique videos have low image quality, with artifacts such as film grain noise and scratches. This makes it hard to recover facial details.

2.2. Lack of Training Data

One of the biggest challenges is the lack of suitable training data for facial video restoration models, which limits the effectiveness of restoration algorithms.

III. Spatio-Temporal Re-rendering in Video Restoration

Spatio-temporal Re-rendering for Antique Facial Video Restoration (STERR-GAN) is the model proposed in the thesis. It combines spatial and temporal information to improve the quality of antique facial videos, reducing flickering and improving restoration accuracy.

3.1. How STERR-GAN Works

STERR-GAN uses both temporal and spatial information to restore video. This improves the temporal stability and image quality of the restored video.
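To make the notion of temporal stability concrete, the sketch below measures flicker as the mean absolute change in average brightness between consecutive frames. It is only an illustration under stated assumptions: `flicker_score` and `temporal_smooth` are hypothetical helper names, the score is not the metric used in the thesis, and averaging neighbouring frames is a crude stand-in for temporal modelling, not the STERR-GAN architecture.

```python
import numpy as np


def flicker_score(frames: np.ndarray) -> float:
    """Mean absolute change in average brightness between consecutive frames.

    frames: array of shape (T, H, W) with values in [0, 1].
    Hypothetical illustrative measure, not the thesis's evaluation metric.
    """
    brightness = frames.reshape(frames.shape[0], -1).mean(axis=1)
    return float(np.abs(np.diff(brightness)).mean())


def temporal_smooth(frames: np.ndarray, radius: int = 1) -> np.ndarray:
    """Average each frame with its temporal neighbours.

    Assumption: a simple moving average stands in for a model that
    shares information across frames.
    """
    num_frames = frames.shape[0]
    out = np.empty_like(frames)
    for t in range(num_frames):
        lo, hi = max(0, t - radius), min(num_frames, t + radius + 1)
        out[t] = frames[lo:hi].mean(axis=0)
    return out


# A constant mid-gray clip: the ideal restored video has no flicker.
clean = np.full((8, 4, 4), 0.5)

# Simulate restoring each frame independently: brightness alternates
# between frames because no frame knows about its neighbours.
gains = np.where(np.arange(8) % 2 == 0, 1.2, 0.8).reshape(8, 1, 1)
per_frame = clean * gains

# Sharing information across neighbouring frames damps the oscillation.
smoothed = temporal_smooth(per_frame)

print("per-frame flicker:", flicker_score(per_frame))
print("smoothed flicker: ", flicker_score(smoothed))
```

In this toy clip the independently processed frames oscillate in brightness, while the temporally smoothed version has a lower score, which is the qualitative effect a model that uses temporal information aims for.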

3.2. Benefits of the Re-rendering Approach

Beyond restoring video effectively, an automated method saves time and cost compared with traditional restoration, in which artists manually retouch every frame.

IV. Practical Applications of Antique Facial Video Restoration

Antique facial video restoration has practical applications ranging from cultural heritage preservation to improving the quality of surveillance video. These applications carry cultural value and also matter for security.

4.1. Cultural Heritage Preservation

Restoring antique videos helps preserve cultural and historical values, allowing future generations to access and understand the past.

4.2. Improving Surveillance Video Quality

Facial video restoration can also be applied in security: improving the image quality of surveillance camera footage makes it easier to identify criminals.

V. Conclusion and Future of Antique Facial Video Restoration

Antique facial video restoration is a promising field with many challenges and opportunities. Further progress is expected as technology and new methods develop.

5.1. Development Potential

As AI and machine learning advance, the ability to restore antique facial videos will keep improving, opening up new opportunities.

5.2. Future Research Directions

Research in this area should focus on developing more effective restoration models and on building richer datasets to support training.

10/07/2025

Document excerpt

VIETNAM NATIONAL UNIVERSITY - HO CHI MINH CITY
UNIVERSITY OF INFORMATION TECHNOLOGY
FACULTY OF COMPUTER SCIENCE

BACHELOR THESIS

SPATIO-TEMPORAL RE-RENDERING FOR FACIAL VIDEO RESTORATION

Bachelor of Computer Science (Honors degree)

NGO HUU MANH KHANH - 19520125
NGO QUANG VINH - 19520354

Supervised by Dr. Nguyen Vinh Tiep

HO CHI MINH CITY, 2023

THESIS DEFENSE COMMITTEE

The thesis examination committee was established by decision of the Rector of the University of Information Technology.

Acknowledgements

The successful completion of this dissertation is the result of the invaluable support and assistance provided by many individuals. We are deeply grateful for their insightful feedback.

First of all, we would like to express our gratitude to our supervisor, Dr. Nguyen Vinh Tiep, for his dedicated direction, enthusiastic guidance, and invaluable instruction throughout this research. His advice and support were instrumental in helping us navigate the research process and complete this thesis. We would also like to express our sincere thanks to the Dean of the Faculty and all the teachers in the Faculty of Computer Science, University of Information Technology, for their support and for equipping us with the knowledge needed to complete this thesis.

We are also grateful to the Multimedia Laboratory (MMLab-UIT) for providing us with a conducive research environment and state-of-the-art equipment for this research. Furthermore, we would like to extend our appreciation to the researchers of the MMLab for their valuable feedback and critical questions, which helped us identify and correct mistakes and improve the quality of this thesis.

Abstract

Old facial films have great historical value, giving us a vivid picture of significant figures of the past. However, because they were captured with old camera technology, they are low-quality and exhibit visual artifacts such as pepper noise and stripes.

In addition, old films can be damaged by poor storage conditions. As a result, they are difficult or impossible to watch. There is a demand to restore and preserve these old films so that future generations can enjoy them. Beyond restoring old films, facial restoration can also be used for security purposes.

More specifically, surveillance cameras are installed in many public places to prevent crime, but their recordings are often low-quality because of limited camera resolution and poor lighting, making it difficult to identify people. Facial video restoration addresses this problem: it improves the quality of the faces in the video and makes it easier to identify people in the footage. Although the related problem of facial image restoration has been researched for a long time, facial video restoration remains less explored. Current facial image restoration models achieve impressive performance, and they can be applied directly to video by restoring each frame individually.

Nonetheless, this approach suffers from flickering, since these models are designed for image restoration and do not take temporal information into account. In this thesis, we propose Spatio-temporal Re-rendering for Antique Facial Video Restoration (STERR-GAN), a facial video restoration model that employs both temporal and spatial information. Experiments show that our model addresses the flickering problem and yields better results. In addition, to the best of our knowledge, datasets exist for facial image restoration and for video restoration, but no dataset is available for facial video restoration. We therefore introduce the VAR dataset (Video dataset for Antique Restoration), a new video restoration dataset for the facial domain.

We expect that this dataset will become a valuable resource for measuring the performance of future models and advancing research in this area.

Table of contents

List of figures
List of tables
1 Introduction
Overview
Objectives
Recurrent Neural Networks (RNN)
Bidirectional Recurrent Neural Networks
Handcrafted Features for Optical Flow Estimation
RAFT: Recurrent All-Pairs Field Transforms for Optical Flow
4 Method
Spatio-temporal Re-rendering for Antique Facial Video Restoration
5 Experiment
Peak Signal-to-Noise Ratio (PSNR)
Structural Similarity Index Measure (SSIM)
Frechet Video Distance (FVD)
Future Work
References

List of figures

Example of facial video restoration and some old film deterioration
Story of counterfeit money
Approximating the data distribution with a GAN
Backpropagation in Generator training
Backpropagation in Discriminator training
The progress in generating face images using GAN models
The architecture of the StyleGAN generator
Illustrative example with two factors of variation
Example of water droplet-like artifacts in StyleGAN images
The architecture of StyleGAN2
Example of "phase" artifacts
Some alternative network architectures of StyleGAN2
Illustration of Recurrent Neural Networks
Illustration of Bidirectional Recurrent Neural Networks
The architecture of RAFT
Example of optical flow estimation
Illustration of the GFP-GAN framework
Overview of DeepRemaster
Illustration of the source-reference attention layer
Illustration of the framework proposed by Wan et al.
Overview of Wan et al.
The collection process of the Video dataset for Antique Restoration (VAR)
Some samples from VAR
Visualization of the STERR-GAN framework

List of tables

5.1 Quantitative results of STERR-GAN, GFP-GAN and DeepRemaster
5.2 Ablation study of STERR-GAN

Chapter 1

Introduction

Practical Context

Motion pictures were first introduced in the late 19th century. Since then, a surprising number of films have been recorded and released. However, due to the technology of the time, these films were low-quality and exhibited visual artifacts such as pepper noise and stripes.

In addition, old films have degraded because of poor storage conditions. Together, these factors put the significant historical value of old videos at risk of being lost. Film restoration techniques have been created to bring these antique films back to life, but the process is laborious. Nowadays, video restoration is typically conducted digitally, with artists manually retouching each frame to remove blemishes, fix flickering, and perform colorization.

However, this process is extremely time-consuming and expensive, as it requires examining and repairing every single frame of the old film. As a result, there is a desire for an algorithm that can automate these tedious tasks, allowing old films to be restored and given a more modern appearance at a lower cost. Old film restoration, or more generally video restoration, has many real-life applications.

Preserving historical video footage

Preserving historical video footage is an essential application of video restoration technology.

Historical video footage refers to videos that capture important events, people, or cultural artifacts from the past. These recordings can be a valuable source of information and cultural heritage, and it is essential to preserve them for future generations. However, video recordings degrade over time due to factors such as wear and tear and exposure to heat and moisture. This can make the recordings difficult to view or use, as they may be blurry, distorted, or otherwise of poor quality.

In addition, many historical video recordings are stored in formats that are no longer widely used, such as VHS tapes or film reels, making the footage difficult to access or view. Video restoration techniques can be used to preserve and restore historical video footage, improving the quality of the video and making it possible to view and study these recordings in greater detail. This can involve various techniques, such as noise reduction, color correction, and image enhancement, to make the footage easier to view. By using video restoration techniques to improve the quality of these recordings, it is possible to preserve and share these essential pieces of history for future generations.

Enhancing the clarity of surveillance footage

Surveillance footage is typically captured by cameras that are placed in strategic locations to monitor and record activity in a particular area. This footage is often used for a variety of purposes, such as security, crime prevention, and investigation. However, surveillance footage can often be of low quality due to factors such as poor lighting, camera movement, and noise. This can make it difficult to identify people and objects in the footage, which can make it less useful for its intended purpose.

Video restoration techniques can improve the clarity of surveillance footage through noise reduction, color correction, and image enhancement. For example, some video restoration models have been proposed to remove noise or blur from footage, making it easier to see details such as facial features or license plate numbers. These techniques make surveillance footage more effective by making it easier to identify people and objects in the video, which is useful for security, crime prevention, and investigation.

1.2 Problem Definition

Facial Video Restoration is a subfield of video restoration that aims to restore high-quality faces from low-quality counterparts with various kinds of deterioration, such as low resolution, noise, blur, and compression artifacts. Fig. 1.1 illustrates an example of facial video restoration.

• Input: a sequence of old film frames containing a complex mixture of degradations, such as film grain noise (blue box) or scratches (red arrow).
• Output: the corresponding high-quality color video.

Fig. 1.1 Example of facial video restoration. The first row shows frames from the input video and the second row shows the restored frames, where T is the frame index in the video. Old movies suffer from many kinds of deterioration, such as scratches (a) and film grain noise (b), which make them challenging to restore to their original quality.

1.3 Challenges

Besides the common challenges of computer vision tasks, facial video restoration has its own difficulties:

• Lack of dataset. The training dataset is one of the primary difficulties we face in this work. A paired dataset is unavailable for our problem, and the previous work [51] uses a synthetic dataset. However, to the best of our knowledge, the data available for facial video restoration is insufficient.
• Keeping facial detail. The face contains many subtle details that are important for conveying emotions and expressions. It can be challenging to restore a video in a way that preserves these details while still improving the overall quality of the image. In addition, the appearance of the face can be affected by complex lighting conditions, such as shadows, highlights, and reflections. This can make it challenging to correct color and exposure issues in the facial region.
• Flickering problem. Flickering is an unwanted change in brightness or color across a restored video sequence. It can be particularly noticeable in high-motion or low-light scenes and can be distracting and unpleasant for viewers.
• Requires high computational resources. Complex image processing techniques must be applied to a large number of frames in a video, which is computationally intensive, especially for high-resolution videos.
• Old films contain a complex mixture of degradation. Due to poor storage conditions and old capture techniques, antique videos often contain many distortions. Therefore, comprehensively mitigating these issues in a single deep neural network is difficult.

1.4 Motivation

Our survey found substantial research on related problems, such as video restoration [56, 37, 5] and facial image restoration [27, 51]. On the other hand, although facial video restoration has many practical applications in preserving old films, security, and crime prevention, it remains less explored. Therefore, we choose Facial Video Restoration as the research topic of this thesis.

Topics

Research on conditional generative adversarial networks

Video and image restoration

Antique video re-rendering technology

Applications of video restoration