10/26/06 | #20060239345 | USPTO Class 375

Method of signalling motion information for efficient scalable video compression

USPTO Application #: 20060239345
Title: Method of signalling motion information for efficient scalable video compression
Abstract: A method for incrementally coding and signalling motion information for a video compression system involving a motion adaptive transform and embedded coding of transformed video samples comprises the steps of: (a) producing an embedded bit-stream, representing each motion field in coarse to fine fashion; and (b) interleaving incremental contributions from said embedded motion fields with incremental contributions from said transformed video samples. A further embodiment of a method for estimating and signalling motion information for a motion adaptive transform based on temporal lifting steps comprises the steps of: (a) estimating and signalling motion parameters describing a first mapping from a source frame onto a target frame within one of the lifting steps; and (b) inferring a second mapping between either said source frame or said target frame, and another frame, based on the estimated and signalled motion parameters associated with said first mapping. (end of abstract)
Agent: Young & Basile, P.C. - Troy, MI, US
Inventors: David Taubman, Andrew Secker
USPTO Application #: 20060239345 - Class: 375240030 (USPTO)
Related Patent Categories: Pulse Or Digital Communications, Bandwidth Reduction Or Expansion, Television Or Motion Video Signal, Adaptive, Quantization
The Patent Description & Claims data below is from USPTO Patent Application 20060239345.



FIELD OF THE INVENTION

[0001] The present invention relates to efficient compression of motion video sequences and, in preferred embodiments, to a method for producing a fully scalable compressed representation of the original video sequence while exploiting motion and other spatio-temporal redundancies in the source material. The invention relates specifically to the representation and signalling of motion information within a scalable compression framework which employs motion adaptive wavelet lifting steps. Additionally, the present invention relates to the estimation of motion parameters for scalable video compression and to the successive refinement of motion information by temporal resolution, spatial resolution or precision of the parameters.

BACKGROUND OF THE INVENTION

[0002] For the purpose of the present discussion, the term "internet" will be used both in its familiar sense and also in its generic sense to identify a network connection over any electronic communications medium or collection of cooperating communications systems.

[0003] Currently, most video content which is available over the internet must be pre-loaded in a process which can take many minutes over typical modem connections, after which the video quality and duration can still be quite disappointing. In some contexts video streaming is possible, where the video is decompressed and rendered in real-time as it is being received; however, this is limited to compressed bit-rates which are lower than the capacity of the relevant network connections. The most obvious way of addressing these problems would be to compress and store the video content at a variety of different bit-rates, so that individual clients could choose to browse the material at the bit-rate and attendant quality most appropriate to their needs and patience. Approaches of this type, however, do not represent effective solutions to the video browsing problem. To see this, suppose that the video is compressed at bit-rates of R, 2R, 3R, 4R and 5R. Then storage must be found on the video server for all these separate compressed bit-streams, which is clearly wasteful. More importantly, if the quality associated with a low bit-rate version of the video is found to be insufficient, a complete new version must be downloaded at a higher bit-rate; this new bit-stream must take longer to download, which generally rules out any possibility of video streaming.

[0004] To enable real solutions to the remote video browsing problem, scalable compression techniques are essential. Scalable compression refers to the generation of a bit-stream which contains embedded subsets, each of which represents an efficient compression of the original video with successively higher quality. Returning to the simple example above, a scalable compressed video bit-stream might contain embedded sub-sets with the bit-rates of R, 2R, 3R, 4R and 5R, with comparable quality to non-scalable bit-streams, having the same bit-rates. Because these subsets are all embedded within one another, however, the storage required on the video server is identical to that of the highest available bit-rate. More importantly, if the quality associated with a low bit-rate version of the video is found to be insufficient, only the incremental contribution required to achieve the next higher level of quality must be retrieved from the server. In a particular application, a version at rate R might be streamed directly to the client in real-time; if the quality is insufficient, the next rate-R increment could be streamed to the client and added to the previous, cached bit-stream to recover a higher quality rendition in real time. This process could continue indefinitely without sacrificing the ability to display the incrementally improving video content in real time as it is being received from the server.
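The storage and download arithmetic behind this comparison can be sketched as follows. This is only an illustration: the function names are hypothetical, costs are expressed in units of R per unit of video duration, and the scalable stream is assumed to match the non-scalable streams in quality at each rate, as the text supposes.

```python
def storage_cost(rates, scalable):
    """Server storage, in units of R, needed to offer every rate in `rates`."""
    # Independent streams sit side by side on the server; embedded subsets
    # share their bits, so only the highest rate has to be stored.
    return max(rates) if scalable else sum(rates)


def upgrade_cost(current, target, scalable):
    """Data, in units of R, a client must fetch to move from `current` to `target`."""
    # A scalable stream needs only the increment; a non-scalable version
    # must be downloaded again in full.
    return target - current if scalable else target


rates = [1, 2, 3, 4, 5]  # the rates R, 2R, ..., 5R used in the text
print(storage_cost(rates, scalable=False), storage_cost(rates, scalable=True))
print(upgrade_cost(1, 2, scalable=False), upgrade_cost(1, 2, scalable=True))
```

With separate streams the server stores 15R and an upgrade from R to 2R costs a full 2R download, whereas the embedded stream stores 5R and the same upgrade costs only the extra R.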

[0005] The above application could be extended in a number of exciting ways. Firstly, if the scalable bit-stream also contains distinct subsets corresponding to different intervals in time, then a client could interactively choose to refine the quality associated with specific time segments which are of the greatest interest. Secondly, if the scalable bit-stream also contains distinct subsets corresponding to different spatial regions, then clients could interactively choose to refine the quality associated with specific spatial regions over specific periods of time, according to their level of interest. In a training video, for example, a remote client could interactively "revisit" certain segments of the video and continue to stream higher quality information for these segments from the server, without incurring any delay.

[0006] To satisfy the needs of applications such as that mentioned above, low bit-rate subsets of the video must be visually intelligible. In practice, this means that most of the bits devoted to a low bit-rate portion of the video are likely to contribute to the reconstruction of the video at a reduced frame rate, since attempting to recover the full frame rate video over a low bit-rate channel will result in unacceptable deterioration of the spatial details within each frame. In order to achieve smooth quality scalability within a compressed video sequence which also offers frame rate scalability, the details required to recover higher frame rates must contribute to the refinement of a model which involves motion sensitive temporal interpolation.

[0007] Without temporal interpolation, missing frames cannot be introduced into a low rate video sequence without first augmenting their spatial fidelity to a level commensurate with the frames already available, and this implies a large discontinuous jump in the amount of information which must be provided to the decoder in order to smoothly increase the reconstructed video quality. Continuing this line of argument, we see that motion information is important to highly scalable video compression; moreover, the motion itself must be represented in a manner which can be scaled, according to the temporal resolution (frame rate), spatial resolution and quality of the sample data.

Motion Adaptive Transforms Based on Wavelet Lifting

[0008] The present invention is best appreciated in the context of an earlier invention, which is the subject of WO02/50772. This earlier patent application describes a method for modifying the individual lifting steps in a lifting implementation of a temporal wavelet decomposition, so as to compensate for the effects of motion. This method has the following advantageous properties: 1) the motion sensitive transform may be perfectly inverted, in the absence of any compression artefacts; 2) the low temporal resolution subsets of the wavelet hierarchy offer high spatial fidelity so that the transform allows excellent frame rate scalability; 3) the high-pass temporal detail subbands produced by the transform have very low energy, allowing high compression efficiency; 4) in the absence of motion, the transform reduces to a regular wavelet decomposition along the temporal axis; and 5) in the presence of locally translational motion, the transform is equivalent to applying a regular wavelet decomposition along the motion trajectories.

[0009] To assist in the present discussion, we briefly summarise the key ideas behind this earlier invention. Any two-channel FIR subband transform can be described as a finite sequence of lifting steps [W. Sweldens, "The lifting scheme: A custom-design construction of biorthogonal wavelets," Applied and Computational Harmonic Analysis, vol 3, pp 186-200, April 1996]. It is instructive to begin with an example based upon the Haar wavelet transform. Up to a scale factor, this transform may be realised in the temporal domain, through a sequence of two lifting steps, as

$$h_k[n] = x_{2k+1}[n] - x_{2k}[n]$$

$$l_k[n] = x_{2k}[n] + \tfrac{1}{2}\, h_k[n]$$

where $x_k[n] \equiv x_k[n_1, n_2]$ denotes the samples of frame $k$ from the original video sequence and $h_k[n] \equiv h_k[n_1, n_2]$ and $l_k[n] \equiv l_k[n_1, n_2]$ denote the high-pass and low-pass subband frames.

[0010] $l_k[n]$ and $h_k[n]$ correspond to the scaled sum and the difference of each original pair of frames. An example is shown in FIG. 1A. Since motion is ignored, ghosting artefacts are clearly visible in the low-pass temporal subband, and the high-pass subband frame has substantial energy.

[0011] Now let $W_{k_1 \to k_2}$ denote a motion-compensated mapping of frame $k_1$ onto the coordinate system of frame $k_2$, so that $W_{k_1 \to k_2}(x_{k_1})[n] \approx x_{k_2}[n]$ for all $n$. The lifting steps are modified as follows.

$$h_k[n] = x_{2k+1}[n] - W_{2k \to 2k+1}(x_{2k})[n] \qquad (1)$$

$$l_k[n] = x_{2k}[n] + \tfrac{1}{2}\, W_{2k+1 \to 2k}(h_k)[n] \qquad (2)$$

Note that $W_{2k \to 2k+1}$ and $W_{2k+1 \to 2k}$ represent forward and backward motion mappings, respectively. The high-pass subband frames correspond to motion-compensated residuals. These will be close to zero in regions where the motion is accurately modelled. The result is shown in FIG. 1B.
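The effect of lifting steps (1) and (2) can be illustrated with a minimal sketch. It makes several simplifying assumptions that are not part of the application: frames are one-dimensional lists, the motion mapping W is a global circular translation (the hypothetical `warp` helper), and the motion is known exactly. The function names are illustrative only.

```python
def warp(frame, shift):
    """Hypothetical motion mapping: circular translation by `shift` samples."""
    n = len(frame)
    return [frame[(i - shift) % n] for i in range(n)]


def haar_forward(x_even, x_odd, shift):
    # Eq. (1): h_k = x_{2k+1} - W_{2k->2k+1}(x_{2k})
    h = [o - p for o, p in zip(x_odd, warp(x_even, shift))]
    # Eq. (2): l_k = x_{2k} + 1/2 * W_{2k+1->2k}(h_k), using the reverse shift
    l = [e + 0.5 * u for e, u in zip(x_even, warp(h, -shift))]
    return l, h


def haar_inverse(l, h, shift):
    # Undo the lifting steps in reverse order with the signs flipped.
    x_even = [a - 0.5 * u for a, u in zip(l, warp(h, -shift))]
    x_odd = [b + p for b, p in zip(h, warp(x_even, shift))]
    return x_even, x_odd


x0 = [0, 0, 8, 4, 0, 0, 0, 0]
x1 = warp(x0, 2)                           # the same content moved by 2 samples

l_mc, h_mc = haar_forward(x0, x1, shift=2)    # motion modelled correctly
l_no, h_no = haar_forward(x0, x1, shift=0)    # motion ignored
print(h_mc)   # all zeros: the residual vanishes
print(l_no)   # two half-amplitude copies of the object: ghosting
```

With the correct shift the high-pass frame is zero and the low-pass frame equals the even frame. With zero shift the transform reduces to the plain Haar steps and the low-pass frame shows the ghosting described for FIG. 1A. In both cases `haar_inverse` recovers the original pair exactly, because each lifting step is undone by subtracting the same quantity that was added.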

[0012] The framework described above is readily extended to any two-channel FIR subband transform, by motion-compensating the relevant lifting steps.

[0013] We demonstrate this in the important case of the biorthogonal 5/3 wavelet transform [D. Le Gall and A. Tabatabai, "Sub-band coding of digital images using symmetric short kernel filters and arithmetic coding techniques," IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 2, pp 761-764, April 1988]. As before, $x_{2k}[n]$ and $x_{2k+1}[n]$ denote the even and odd indexed frames from the original sequence. Without motion, the 5/3 transform may be implemented by alternately updating each of these two frame sub-sequences, based on filtered versions of the other sub-sequence. The lifting steps are

$$h_k[n] = x_{2k+1}[n] - \tfrac{1}{2}\left( x_{2k}[n] + x_{2k+2}[n] \right)$$

$$l_k[n] = x_{2k}[n] + \tfrac{1}{4}\left( h_{k-1}[n] + h_k[n] \right)$$

[0014] As before, we introduce motion warping operators within each lifting step, which yields the following

$$h_k[n] = x_{2k+1}[n] - \tfrac{1}{2}\left( W_{2k \to 2k+1}(x_{2k})[n] + W_{2k+2 \to 2k+1}(x_{2k+2})[n] \right) \qquad (3)$$

$$l_k[n] = x_{2k}[n] + \tfrac{1}{4}\left( W_{2k-1 \to 2k}(h_{k-1})[n] + W_{2k+1 \to 2k}(h_k)[n] \right) \qquad (4)$$
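Lifting steps (3) and (4) can be sketched in the same spirit. The assumptions here are illustrative and not taken from the application: frames are one-dimensional lists, motion is a global circular translation described by a hypothetical per-frame offset list `pos` (so that W from frame a to frame b shifts by `pos[b] - pos[a]`), the number of frames is even, and a missing neighbour at either end of the sequence is replaced by the nearest available one.

```python
def warp(frame, shift):
    """Hypothetical motion mapping: circular translation by `shift` samples."""
    n = len(frame)
    return [frame[(i - shift) % n] for i in range(n)]


def predict(even, k, pos):
    """1/2 * (W_{2k->2k+1}(x_{2k}) + W_{2k+2->2k+1}(x_{2k+2})), as in Eq. (3)."""
    j = k + 1 if k + 1 < len(even) else k   # assumption: reuse x_{2k} at the end
    a = warp(even[k], pos[2 * k + 1] - pos[2 * k])
    b = warp(even[j], pos[2 * k + 1] - pos[2 * j])
    return [0.5 * (u + v) for u, v in zip(a, b)]


def update(high, k, pos):
    """1/4 * (W_{2k-1->2k}(h_{k-1}) + W_{2k+1->2k}(h_k)), as in Eq. (4)."""
    j = k - 1 if k > 0 else k               # assumption: reuse h_0 at the start
    a = warp(high[j], pos[2 * k] - pos[2 * j + 1])
    b = warp(high[k], pos[2 * k] - pos[2 * k + 1])
    return [0.25 * (u + v) for u, v in zip(a, b)]


def forward_53(frames, pos):
    even, odd = frames[0::2], frames[1::2]
    high = [[x - p for x, p in zip(odd[k], predict(even, k, pos))]
            for k in range(len(odd))]
    low = [[x + u for x, u in zip(even[k], update(high, k, pos))]
           for k in range(len(even))]
    return low, high


def inverse_53(low, high, pos):
    # Reverse order: undo the update step first, then the prediction step.
    even = [[l - u for l, u in zip(low[k], update(high, k, pos))]
            for k in range(len(low))]
    odd = [[h + p for h, p in zip(high[k], predict(even, k, pos))]
           for k in range(len(high))]
    frames = [None] * (2 * len(low))
    frames[0::2], frames[1::2] = even, odd
    return frames


base = [0, 0, 8, 4, 0, 0, 0, 0]
pos = [0, 1, 2, 3]                          # the object moves one sample per frame
frames = [warp(base, p) for p in pos]
low, high = forward_53(frames, pos)
print(high)                                 # zero residuals when motion is exact
```

The inverse applies the same `predict` and `update` terms with the opposite sign, which is why the transform remains invertible whatever warping operator is used, including an inaccurate one.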

[0015] FIG. 2 demonstrates the effect of these modified lifting steps. The high-pass frames are now essentially the residual from a bidirectional motion-compensated prediction of the odd-indexed original frames. When the motion is adequately captured, these high-pass frames have little energy and the low-pass frames have excellent spatial fidelity.

Counting the Cost of Motion

[0016] In the example of the Haar transform, given above, two separate motion mapping operators, $W_{2k \to 2k+1}$ and $W_{2k+1 \to 2k}$, are required to process every pair of frames, $x_{2k}[n]$ and $x_{2k+1}[n]$. Their respective motion parameters must be transmitted to the decoder. To provide a larger number of temporal resolution levels, the transform is re-applied to the low-pass subband frames, $l_k[n]$, for which motion mapping operators $W_{4k \to 4k+2}$ and $W_{4k+2 \to 4k}$ are required for every four frames. Continuing in this way, an arbitrarily large number of temporal resolutions may be obtained, using $\tfrac{2}{2} + \tfrac{2}{4} + \tfrac{2}{8} + \cdots \approx 2$ motion fields per original frame.

[0017] For the example of the 5/3 transform, also given above, four motion mapping operators, $W_{2k \to 2k+1}$, $W_{2k \to 2k-1}$, $W_{2k+1 \to 2k}$ and $W_{2k-1 \to 2k}$, are required for every pair of frames (indexed by $k$), for just one level of temporal decomposition. Continuing the transformation to an arbitrarily large number of temporal resolutions involves approximately 4 motion fields per original video frame.
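The two motion-field counts follow from the same geometric series, which a short sketch makes explicit. The function name and its parameters are hypothetical; the only inputs taken from the text are the 2 fields per frame pair for the Haar transform and the 4 fields per frame pair for the 5/3 transform.

```python
def motion_fields_per_frame(fields_per_pair, levels):
    """Average number of motion fields per original frame after `levels` levels."""
    # Temporal level j (1-based) handles one frame pair for every 2**j
    # original frames, so it contributes fields_per_pair / 2**j per frame.
    return sum(fields_per_pair / 2 ** j for j in range(1, levels + 1))


for levels in (1, 3, 10):
    print(levels,
          motion_fields_per_frame(2, levels),   # Haar
          motion_fields_per_frame(4, levels))   # 5/3
```

The totals equal 2(1 - 2^-levels) and 4(1 - 2^-levels), so they approach 2 and 4 from below as the number of temporal levels grows.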

[0018] The cost of estimating, coding and transmitting the above motion fields can be substantial. Moreover, this cost may adversely affect the scalability of the entire compression scheme, since it is not immediately clear how to progressively refine the motion fields without destroying the subjective properties of the reconstructed video when the motion is represented with reduced accuracy.

[0019] The earlier application makes clear that any number of motion modelling techniques are compatible with the motion adaptive lifting transform, and it recommends the use of continuously deformable motion models such as those associated with triangular or quadrilateral meshes (see, for example, Y. Nakaya and H. Harashima, "Motion compensation based on spatial transformations," IEEE Trans. Circ. Syst. For Video Tech., Vol. 4, pp 339-367, June 1994). However, it presents no particular solution to the difficulties described above.

SUMMARY OF THE INVENTION
