Motion estimation techniques are primarily used in applications to detect object movement across frames of video, 3D scene reconstruction, and video compression. Algorithms used to estimate motion are limited by the tradeoff between motion vector (MV) quality and complexity of computation. The two widely used approaches are those of optical flow computation, or block matching. Optical flow yields high motion vector quality but at the cost of very high computational complexity, but block matching techniques offer flexibility in the tradeoff between the two factors. Hybrid approaches to block matching further improve MV quality at a similar low computational cost.
This project is an implementation of Block Overlap Minimization from a motion estimation algorithm presented in "Joint Framework for Motion Validity and Estimation Using Block Overlap", M. Santoro et. al. [1], wherein final motion vectors are computed, based on the minimal combined effect (Energy) of SAD (sum of absolute differences), Smoothness of a local vector field, and Block Overlap Minimization. The n-level approach used in [1] takes motion vectors from a lowest resolution image and passes them on to the next level of resolution, and are thus modified at each subsequent level. This hierarchical structure helps to speed up the process of motion estimation.
At each level the following operations are performed. Frame 1 is divided into a grid of blocks. For each block in Frame 1, a corresponding match is found in Frame 2, the match being the minimum SAD valued block position found in Frame 2. The SAD is stored, and the difference in position coordinates is stored as motion vectors. This is the first step.
This project is an implementation of Block Overlap Minimization from a motion estimation algorithm presented in "Joint Framework for Motion Validity and Estimation Using Block Overlap", M. Santoro et. al. [1], wherein final motion vectors are computed, based on the minimal combined effect (Energy) of SAD (sum of absolute differences), Smoothness of a local vector field, and Block Overlap Minimization. The n-level approach used in [1] takes motion vectors from a lowest resolution image and passes them on to the next level of resolution, and are thus modified at each subsequent level. This hierarchical structure helps to speed up the process of motion estimation.
At each level the following operations are performed. Frame 1 is divided into a grid of blocks. For each block in Frame 1, a corresponding match is found in Frame 2, the match being the minimum SAD valued block position found in Frame 2. The SAD is stored, and the difference in position coordinates is stored as motion vectors. This is the first step.
The second step comprises of the smoothness validity metric, that compares the current spatial MV to its eight-connected neighbors, such that:
Here ‘R(vi)’ are the eight spatial candidates having MV ‘vi’, and ‘vj’ is the current block’s MV. This is particularly applicable because objects are represented by groups of pixels, and all pixels within a group should ideally have the same motion vector. At the end of this stage, the calculated the calculated energy ‘E’ of a candidate can be represented as:
Symbol lambda represents the Lagrange Multiplier, that is chosen to weigh the smoothness term, over the SAD term. Following is the comparison table indicating original and obtained values of endpoint error for the implementation of equation (3):
Table 1 (Dataset source: Middlebury College)
The next addition to the energy function is the ‘overlap term’. Overlap is generated when blocks get mapped from Frame 1 to Frame 2, the mapping is not completely unique. The following concept is illustrated in the figure below:
Fig. 1 (non unique mapping between frames)
Thus we can keep track of the number of overlapping blocks over each pixel. The function calculate_level_overlap() does the same operation and creates a single channel overlap image for the given level – and is invoked immediately after calcLevelBM() which performs block matching based on minimum SAD at that level. Each pixel within the range of one block size is incremented, in accordance with the block’s mapped location (original position + motion vector). This operation is looped to operate with all blocks. Thus the corresponding level’s overlap image is obtained. An illustration of the same is displayed below.
Fig. 2
(Level 2 overlap image. Scaled and shifted for display purposes. Dimensions not to scale)
(Level 2 overlap image. Scaled and shifted for display purposes. Dimensions not to scale)
The second function regarding overlap is the calculate_candidate_overlap(), and is invoked after the calculate_smoothness(). This function calculates the ‘overlap term’ for a particular candidate. Firstly, the effect of the current block’s MV is temporarily removed (restored at the end of the function’s key operations). Thus the mapped block’s pixel locations are decremented by 1 (in the overlap image). Then the candidate’s MV’s are assigned to the current block – corresponding pixel locations incremented by 1. The mapped block’s pixels in the overlap image are then summed up. The last two steps are incorporated in the same loop to save the program’s execution time. The sum is represented by ‘Lx’. The sum is then divided by the square of the block size and incremented by one. This is the value returned by the function. The new composite equation for finding energy is as follows:
Note that the +1 is added in the overlap term because, in the event that there is no overlap, the SAD term should still be added to the function. Following is a comparison table indicating original and obtained values of endpoint error for the implementation of equation (4).
Table 2
Thus the implementation of the energy minimization framework for block based motion estimation in general reduces the endpoint error with the inclusion of Block Overlap Minimization, and a further improvement in the results is obtained when the overlap term is not scaled by the SAD.
Thus the implementation of the energy minimization framework for block based motion estimation in general reduces the endpoint error with the inclusion of Block Overlap Minimization, and a further improvement in the results is obtained when the overlap term is not scaled by the SAD.
References:
[1] "Joint Framework for Motion Validity and Estimation Using Block Overlap", M. Santoro et. al.
[2] "Motion Estimation using Block Overlap Minimization", M. Santoro et. al.
[3] "Hierarchical Motion Estimation Algorithm Based on Pyramidal Successive Elimination", C. Lin et. al.
[1] "Joint Framework for Motion Validity and Estimation Using Block Overlap", M. Santoro et. al.
[2] "Motion Estimation using Block Overlap Minimization", M. Santoro et. al.
[3] "Hierarchical Motion Estimation Algorithm Based on Pyramidal Successive Elimination", C. Lin et. al.