Exploring Edge-aware Loss in Sketch Frame Interpolation

Kong Liang

Abstract

Sketch frame interpolation presents unique challenges compared to natural video: sparse, high-frequency line structures demand precise edge fidelity rather than smooth photometric consistency. We investigate whether a Sobel-based edge-aware loss (L_edge) can better serve this domain by explicitly supervising gradient similarity between predicted and ground-truth frames. Fine-tuning the AFI backbone on the STD-12K animation sketch dataset, we compare L_edge against a pixel-regression loss (L_1) and a style loss (L_style) on PSNR, SSIM, LPIPS, and Chamfer Distance. L_edge underperforms both baselines on all four metrics, and qualitative results reveal ringing artifacts, inconsistent line weights, and overly thick strokes. We attribute this to the Sobel operator’s indiscriminate response to all high-frequency transitions, including compression noise and soft shading, rather than only to the clean structural strokes that define sketch quality. We discuss more promising directions, including learned edge detectors conditioned on sketch structure and frequency-domain losses operating on stroke skeletons.
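To make the idea of supervising gradient similarity concrete, the following is a minimal, dependency-free sketch of a Sobel-based edge loss. It is an illustration under stated assumptions, not the formulation used in this work: the function names `sobel_magnitude` and `edge_loss`, the use of gradient magnitude, and the mean absolute difference are all hypothetical choices, and a training implementation would compute the same quantity on tensors inside a deep learning framework.

```python
# Hypothetical sketch of a Sobel-based edge loss (assumed form, not the paper's exact loss).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(img):
    """Sobel gradient magnitude at the interior pixels of a 2D grayscale image.

    `img` is a list of rows of floats; border pixels are skipped (no padding).
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    p = img[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * p
                    gy += SOBEL_Y[dy][dx] * p
            row.append((gx * gx + gy * gy) ** 0.5)
        out.append(row)
    return out


def edge_loss(pred, target):
    """Mean absolute difference between Sobel magnitudes (assumed L1 form)."""
    gp, gt = sobel_magnitude(pred), sobel_magnitude(target)
    diffs = [abs(a - b) for rp, rt in zip(gp, gt) for a, b in zip(rp, rt)]
    return sum(diffs) / len(diffs)


# Toy frames: a one-pixel vertical stroke, and the same stroke shifted by one pixel.
target = [[1.0 if x == 2 else 0.0 for x in range(6)] for _ in range(5)]
shifted = [[1.0 if x == 3 else 0.0 for x in range(6)] for _ in range(5)]

print(edge_loss(target, target))   # identical frames
print(edge_loss(shifted, target))  # misaligned stroke
```

In this toy setup the loss is zero for identical frames and positive for the shifted stroke. The same filter also returns a nonzero magnitude on any smooth intensity ramp, which is consistent with the sensitivity to soft shading discussed above.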