Building an image set for modeling image re-targeting using deep learning.

Mohammad A. Alsmirat, Ethar El-Qawasmeh, Mahmoud Al-Ayyoub, Yaser Jararweh

Simul. Model. Pract. Theory (2023)

Abstract

The displays of devices such as TVs, laptops, and mobile phones vary greatly in size, aspect ratio, and resolution. Creating different versions of the same media for these different systems can be costly in terms of storage. A novel solution to this problem is image re-targeting, which adapts an image to fit any device display, preserving the main content of the image without wasting additional storage. Re-targeting can be performed on the device itself or on an intermediate device such as an edge server. The main re-targeting techniques found in the literature are cropping, seam carving, and scale and stretch, but it is not easy to determine the best technique for a given image. In this paper, we propose using deep learning to model the decision of which re-targeting technique to use for which target size. To achieve this, we built and annotated, with a team of 10 people, a large image dataset that includes 2750 original images representing 6 categories. We applied the four most commonly used re-targeting techniques (cropping, scaling, seam carving, and optimized scale and stretch) to the original image set to create five different sizes: HDP (1600 × 900), HD (1280 × 720), SVGA (800 × 600), WVGA (852 × 480), and NTSC (720 × 480). The resulting image set, which includes the re-targeted images, contains 46750 images. We examined and summarized the annotators' perceptual views for reference and comparison purposes. Additionally, we created a deep learning model and trained it on the newly generated image set to recommend the best re-targeting technique for a given input image and target size. The model consists of 30 independent sub-models (five for each image category), all constructed from a pre-trained ResNet50. The sub-model used for a particular image is chosen based on both the input image category and the target size.
To accelerate the training process and avoid overfitting of the sub-models, we used transfer learning and froze the weights of the first 49 layers of the pre-trained model. We then replaced the last fully connected layer with three fully connected layers, the last of which is a classification layer. The results demonstrate that our image dataset is suitable for training deep learning models for image re-targeting.
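
The routing described in the abstract, one sub-model per combination of image category and target size, can be illustrated with a small sketch. This is not the authors' implementation: the category labels, the identifier format, and the function names `build_registry` and `select_submodel` are hypothetical, since the abstract states only that there are six categories and five target sizes. The sub-models are represented by string identifiers rather than actual networks.

```python
from itertools import product

# Target sizes named in the abstract, as (width, height) in pixels.
TARGET_SIZES = {
    "HDP": (1600, 900),
    "HD": (1280, 720),
    "SVGA": (800, 600),
    "WVGA": (852, 480),
    "NTSC": (720, 480),
}

# Hypothetical labels: the abstract says there are six categories
# but does not name them.
CATEGORIES = tuple(f"category_{i}" for i in range(1, 7))


def build_registry():
    """Map every (category, size name) pair to a sub-model identifier."""
    return {
        (category, size): f"resnet50_{category}_{size}"
        for category, size in product(CATEGORIES, TARGET_SIZES)
    }


def select_submodel(registry, category, width, height):
    """Pick the sub-model for an image category and a target resolution."""
    by_resolution = {wh: name for name, wh in TARGET_SIZES.items()}
    size_name = by_resolution.get((width, height))
    if size_name is None:
        raise ValueError(f"unsupported target size: {width}x{height}")
    return registry[(category, size_name)]


registry = build_registry()
print(len(registry))  # 30 sub-models: 6 categories x 5 target sizes
print(select_submodel(registry, "category_3", 1280, 720))  # resnet50_category_3_HD
```

In a full system each identifier would presumably resolve to a trained network whose classification layer scores the candidate re-targeting techniques; that mapping is an assumption and is left out here.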

Keywords

Image re-targeting, Image datasets, QoE, Human perceptual views, Deep learning, Transfer learning
