5_long_consistency
Fig.5. Video of a long style consistent translation video visual example (1680 frames).
Left: input semantic labels; Right: UVIT translated video in sunset scenario. All frames within the video share the same style code to keep style consistency.