ReBotNet: Fast Real-time Video Enhancement

Authors: Jeya Maria Jose Valanarasu, Rahul Garg, Andeep Toor, Xin Tong, Weijuan Xi, Andreas Lugmayr, Vishal M. Patel, Anne Menini


Most video restoration networks are slow, have a high computational load, and cannot be used for real-time video enhancement. In this work, we design an efficient and fast framework to perform real-time video enhancement for practical use cases like live video calls and video streams. Our proposed method, called Recurrent Bottleneck Mixer Network (ReBotNet), employs a dual-branch framework. The first branch learns spatio-temporal features by tokenizing the input frames along the spatial and temporal dimensions using a ConvNeXt-based encoder and processing these abstract tokens using a bottleneck mixer. To further improve temporal consistency, the second branch employs a mixer directly on tokens extracted from individual frames. A common decoder then merges the features from the two branches to predict the enhanced frame. In addition, we propose a recurrent training approach where the last frame's prediction is leveraged to efficiently enhance the current frame while improving temporal consistency. To evaluate our method, we curate two new datasets that emulate real-world video call and streaming scenarios, and show extensive results on multiple datasets where ReBotNet outperforms existing approaches with lower computational cost, reduced memory requirements, and faster inference time.
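
The recurrent idea in the abstract, where the previous frame's prediction is reused when enhancing the current frame, can be illustrated with a minimal sketch. This is not the authors' implementation: `Frame`, `blend_enhancer`, and `enhance_stream` are hypothetical names, the averaging function is only a toy stand-in for the network, and bootstrapping the first frame with its own input is an assumption the abstract does not specify.

```python
from typing import Callable, Iterable, Iterator, List, Optional

# Hypothetical stand-in for an image tensor: a flat list of pixel values.
Frame = List[float]


def blend_enhancer(current: Frame, previous_prediction: Frame) -> Frame:
    """Toy placeholder for the enhancement network (NOT ReBotNet itself).

    It averages the current frame with the previous prediction, only to
    show how the previous output enters the computation.
    """
    return [0.5 * c + 0.5 * p for c, p in zip(current, previous_prediction)]


def enhance_stream(
    frames: Iterable[Frame],
    enhance: Callable[[Frame, Frame], Frame] = blend_enhancer,
) -> Iterator[Frame]:
    """Enhance frames one at a time, feeding each prediction into the next step."""
    previous: Optional[Frame] = None
    for frame in frames:
        if previous is None:
            # Assumption: no earlier prediction exists for the first frame,
            # so the raw input frame is used in its place.
            previous = frame
        prediction = enhance(frame, previous)
        yield prediction
        # The prediction, not the raw input, is carried forward.
        previous = prediction


if __name__ == "__main__":
    video = [[0.0, 2.0], [4.0, 2.0], [0.0, 0.0]]
    for enhanced in enhance_stream(video):
        print(enhanced)
```

Because the loop is a generator that holds only one previous prediction, it processes a live stream frame by frame without buffering future frames, which is the property a video-call setting needs.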


