Using Image Pre-classification to Improve the Accuracy of Image Captioning Systems

Authors

  • Eng. Rasha Mualla

Keywords:

Deep Learning, Image Captioning Systems, FastText, ResNet50, CNN, LSTM, Classification effects, Classified datasets

Abstract

Deep learning for image description and captioning has recently become one of the most promising computer science applications. A captioning system consists of two parts: the image model and the text description model. In previous research, we studied the effect of using different languages and datasets on image description models. In this paper, we study the effect of classifying the image dataset on those models. To this end, a new combined dataset of 12,000 images is built from two international datasets (Flickr2k and MS-COCO). The designed models support both Arabic and English. For the description part, two scenarios are used: the first employs CNN and LSTM models, while the second uses ResNet50 and FastText as the image and text models, respectively. Training is applied to both indoor and outdoor classes. Test scenarios are applied in two cases and four ways, covering the word-by-word and the sentence-by-sentence models. The performance analysis shows that models trained on classified data achieve higher performance than those trained on unclassified data, for both repeating-based and non-repeating-based datasets, in all scenarios.
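To illustrate how such an encoder-decoder captioning architecture can be wired together, the sketch below combines a ResNet50 image encoder with an LSTM caption decoder whose word embeddings are initialised from FastText-style vectors. This is a minimal Python/Keras sketch, not the paper's implementation: parameters such as vocab_size, max_len, embed_dim and the fasttext_matrix placeholder are assumptions made for illustration only.

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

# Assumed dataset parameters (illustrative values, not from the paper).
vocab_size, max_len, embed_dim = 10000, 30, 300

# Image branch: ResNet50 pre-trained on ImageNet, used as a frozen feature extractor.
cnn = tf.keras.applications.ResNet50(weights="imagenet", include_top=False, pooling="avg")
cnn.trainable = False
image_in = layers.Input(shape=(224, 224, 3))
image_feat = layers.Dense(256, activation="relu")(cnn(image_in))

# Text branch: caption prefix embedded with pre-computed word vectors.
# Placeholder matrix; in practice this would hold FastText vectors for the vocabulary.
fasttext_matrix = np.random.rand(vocab_size, embed_dim).astype("float32")
caption_in = layers.Input(shape=(max_len,))
emb = layers.Embedding(
    vocab_size, embed_dim,
    embeddings_initializer=tf.keras.initializers.Constant(fasttext_matrix),
    trainable=False)(caption_in)
text_feat = layers.LSTM(256)(emb)

# Merge both branches and predict the next word of the caption.
merged = layers.add([image_feat, text_feat])
out = layers.Dense(vocab_size, activation="softmax")(layers.Dense(256, activation="relu")(merged))

model = Model(inputs=[image_in, caption_in], outputs=out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()

In a word-by-word setup of this kind, the decoder is fed the caption prefix and trained to predict the next token; sentence-level evaluation then compares the fully generated caption against the reference descriptions.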

Published

2021-09-15

How to Cite

Using Image Pre-classification to Improve the Accuracy of Image Captioning Systems. (2021). Damascus University Journal for Engineering Sciences, 37(2). https://journal.damascusuniversity.edu.sy/index.php/engj/article/view/1594