Rahul Sharma

Computer Vision

Video-to-text Summarizer for Content-moderators: Fast Insights

Around 90% of content moderators, also known as the bodyguards of the internet, have post-traumatic stress disorder (PTSD). Most parent companies offer them psychological counselling resources; however, the damage cannot be fixed entirely. According to a recent Financial Times article, third-party content moderators for Facebook are required to sign a form explicitly acknowledging that their job could cause PTSD. Additionally, content moderators are asked to sign non-disclosure agreements, which restrict them from sharing their trauma even with their loved ones. To avoid these negative psychological consequences, we have proposed a solution that takes videos as input and transforms them into an understandable textual description. This summary can then be used to flag inappropriate content in the video.

The solution is developed in five phases, as shown in the following figure: Video Sampling, Image Annotation, Image-to-Text Conversion, Text Summarization, and Analytics and Data Visualization.
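As a rough illustration of the first phase, the sketch below picks which frames of a video to pass on to the later phases. It assumes fixed-interval sampling (one frame every few seconds), which is only one possible strategy and is not necessarily the one used in this project. The function name `sample_frame_indices` and its parameters are hypothetical.

```python
from typing import List


def sample_frame_indices(total_frames: int, fps: float,
                         every_seconds: float = 1.0) -> List[int]:
    """Return the indices of frames to keep, one per `every_seconds` of video.

    Assumption: frames are evenly spaced in time, so a fixed stride in frame
    indices corresponds to a fixed interval in seconds.
    """
    if total_frames <= 0 or fps <= 0 or every_seconds <= 0:
        return []
    # Number of frames to skip between samples; at least 1 so we always advance.
    stride = max(1, round(fps * every_seconds))
    return list(range(0, total_frames, stride))


# A 10-second clip at 30 frames per second, sampled every 2 seconds.
indices = sample_frame_indices(total_frames=300, fps=30.0, every_seconds=2.0)
print(indices)
```

The resulting indices could then be handed to any video reader to extract the corresponding images for annotation and captioning. A larger interval reduces the number of frames to caption at the cost of possibly missing short events.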

The diagram below outlines the key deep learning encoder-decoder framework for image captioning.
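To make the decoder side of an encoder-decoder captioner more concrete, here is a minimal sketch of greedy decoding: the encoder's image features are fixed, and the decoder emits one word at a time, always choosing the highest-scoring next word until it produces an end marker. Everything here is illustrative. The `greedy_decode` function, the `step` callable, and the toy lookup table are hypothetical stand-ins for a trained model, and greedy decoding is assumed only for simplicity.

```python
from typing import Callable, Dict, List, Sequence

START, END = "<start>", "<end>"


def greedy_decode(
    features: Sequence[float],
    step: Callable[[Sequence[float], List[str]], Dict[str, float]],
    max_len: int = 20,
) -> List[str]:
    """Build a caption token by token, always taking the top-scoring token.

    `features` stands in for the encoder output for one image.
    `step` stands in for the decoder: given the features and the tokens so
    far, it returns a score for each candidate next token.
    """
    tokens = [START]
    for _ in range(max_len):
        scores = step(features, tokens)
        next_token = max(scores, key=scores.get)
        if next_token == END:
            break
        tokens.append(next_token)
    return tokens[1:]  # drop the start marker


# Toy decoder (hypothetical): scores depend only on the previous token.
TOY_TABLE = {
    "<start>": {"a": 0.9, "dog": 0.1},
    "a": {"dog": 0.7, "runs": 0.2, "<end>": 0.1},
    "dog": {"runs": 0.8, "<end>": 0.2},
    "runs": {"<end>": 0.95, "a": 0.05},
}


def toy_step(features: Sequence[float], tokens: List[str]) -> Dict[str, float]:
    return TOY_TABLE[tokens[-1]]


caption = greedy_decode([0.0], toy_step)
print(" ".join(caption))
```

In a real system the `step` function would be a trained neural decoder conditioned on image features, and a search strategy such as beam search could replace the greedy choice.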

The experimental results show that our predictive models can be utilized by medical practitioners to predict patients’ AD shift with 0.78 C-Index and 0.10 IBS. Our study also demonstrates that the selection of critical features can improve the effectiveness of probabilities at each time interval. This model can be extended to predict the time duration for disease shifts for other diseases such as breast cancer, Huntington’s disease, and scleroderma, to mention a few. We have trained our model on the MSVD and COCO datasets.

For more information about this project, you can read our article at Fast Insights: Artificial intelligence–enabled video-to-text summarizer.

June 2020 Rahul Sharma