08/09/07 | #20070183676 | USPTO Class 382

Video coding and decoding

USPTO Application #: 20070183676
Title: Video coding and decoding
Abstract: A video coding and decoding method, wherein a picture is first divided into sub-pictures corresponding to one or more subjectively important picture regions and to a background region sub-picture, which remains after the other sub-pictures are removed from the picture. The sub-pictures are formed to conform to predetermined allowable groups of video coding macroblocks (MBs). The allowable groups of MBs can be, for example, of rectangular shape. The picture is then divided into slices so that each sub-picture is encoded independently of other sub-pictures except for the background region sub-picture, which may be coded using other sub-pictures. The slices of the background sub-picture are formed in scan order, skipping over MBs that belong to another sub-picture. The background sub-picture is only decoded if the positions and sizes of all other sub-pictures can be reconstructed on decoding the picture.
(end of abstract)
Agent: Foley & Lardner LLP - San Diego, CA, US
Inventors: Miska Hannuksela, Ye-Kui Wang
USPTO Application #: 20070183676 - Class: 382243000 (USPTO)
Related Patent Categories: Image Analysis, Image Compression Or Coding, Shape, Icon, Or Feature-based Compression
The Patent Description & Claims data below is from USPTO Patent Application 20070183676.

FIELD OF THE INVENTION

[0001] This invention relates to video coding and decoding. It relates particularly, but not exclusively, to video coding and transmission over error-prone data connections.

BACKGROUND OF THE INVENTION

[0002] Video transmission requires coding of the video in a form that allows its transmission. Typically, this involves effective compression due to the vast amount of information contained in a stream of pictures that constitute a video to be transmitted.

[0003] ITU-T H.263 is an International Telecommunications Union (ITU) video coding recommendation which specifies the bit-stream syntax and the decoding of a bit-stream. In this standard, pictures are coded using luminance and two colour difference (chrominance) components (Y, CB and CR). The chrominance components are each sampled at half resolution along both co-ordinate axes compared to the luminance component.
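As a rough illustration of the sampling structure described above, the following sketch computes the dimensions of the three component planes of a picture. The function name `plane_sizes` and the 176x144 example size are assumptions chosen for illustration; they do not come from the application.

```python
def plane_sizes(luma_width, luma_height):
    """Return the (width, height) of the Y, CB and CR planes of one picture."""
    # Each chrominance plane is sampled at half the luminance resolution
    # along both co-ordinate axes.
    chroma = (luma_width // 2, luma_height // 2)
    return {"Y": (luma_width, luma_height), "CB": chroma, "CR": chroma}


# Example picture size (an assumption for illustration).
sizes = plane_sizes(176, 144)
total_samples = sum(w * h for (w, h) in sizes.values())
print(sizes, total_samples)
```

Under this sampling structure the two chrominance planes together add half as many samples as the luminance plane.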

[0004] Each coded picture, as well as the corresponding coded bit stream, is arranged in a hierarchical structure with four layers being, from top to bottom, a picture layer, a picture segment layer, a macroblock (MB) layer and a block layer. The picture segment layer can be either a group of blocks layer or a slice layer.

[0005] The picture layer data contains parameters affecting the whole picture area and the decoding of the picture data. By default, each picture is divided into groups of blocks. A group of blocks (GOB) typically comprises a row of macroblocks (16 consecutive pixel lines) or a multiple thereof. Data for each GOB consists of an optional GOB header followed by data for MBs. As an alternative to GOBs, so-called slices can be used, whereby each picture is divided into slices instead of GOBs. Data for each slice consists of a slice header followed by data for MBs.
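The GOB layout can be sketched as follows. This is a minimal illustration that assumes the picture height is a multiple of 16; the function name `gob_rows` and the parameter `rows_per_gob` are hypothetical.

```python
MB_SIZE = 16  # a macroblock row spans 16 pixel lines


def gob_rows(picture_height, rows_per_gob=1):
    """List (first_mb_row, last_mb_row) for each GOB, top to bottom."""
    mb_rows = picture_height // MB_SIZE  # assumes height is a multiple of 16
    return [
        (first, min(first + rows_per_gob, mb_rows) - 1)
        for first in range(0, mb_rows, rows_per_gob)
    ]


# A 144-line picture has 9 macroblock rows, hence 9 single-row GOBs.
print(gob_rows(144))
# With two macroblock rows per GOB the last GOB holds the remaining row.
print(gob_rows(144, rows_per_gob=2))
```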

[0006] The slices define regions within a coded picture. Each region is a number of MBs in a normal scanning order. There are no prediction dependencies across slice boundaries within the same coded picture. However, temporal prediction can generally cross slice boundaries unless ITU-T H.263 Annex R (Independent Segment Decoding) is used. Slices can be decoded independently from the rest of the picture data (except for the picture header). Consequently, slices improve error resilience in packet-lossy networks.
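A slice in this sense is simply a run of consecutive MB addresses in the normal scanning order. The sketch below partitions a picture into such runs; the function name `split_into_slices` and the slice lengths are assumptions for illustration only.

```python
def split_into_slices(mb_count, slice_lengths):
    """Partition MB addresses 0..mb_count-1 into consecutive scan-order runs."""
    if sum(slice_lengths) != mb_count:
        raise ValueError("slice lengths must cover every macroblock exactly once")
    slices, start = [], 0
    for length in slice_lengths:
        # Each slice is a contiguous range of MB addresses.
        slices.append(range(start, start + length))
        start += length
    return slices


# 99 macroblocks (an 11 x 9 MB grid) split into three slices of unequal length.
slices = split_into_slices(99, [40, 40, 19])
print([(s[0], s[-1]) for s in slices])
```

Because no prediction crosses slice boundaries within a picture, the loss of one run leaves the other runs of the same picture decodable.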

[0007] Each GOB or slice is divided into MBs. An MB relates to 16×16 pixels of luminance data and the spatially corresponding 8×8 pixels of chrominance data. In other words, an MB consists of four 8×8 luminance blocks and two spatially corresponding 8×8 chrominance blocks.
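The block structure of a macroblock can be made concrete with a small sketch that lists the top-left sample position of each of its six 8×8 blocks. The function name `macroblock_blocks` and the dictionary layout are hypothetical.

```python
def macroblock_blocks(mb_col, mb_row):
    """Return the top-left sample positions of the six 8x8 blocks of one MB."""
    x, y = mb_col * 16, mb_row * 16
    # Four 8x8 luminance blocks tile the 16x16 luminance area.
    luma = [(x + dx, y + dy) for dy in (0, 8) for dx in (0, 8)]
    # The chrominance planes are half resolution, so the same area maps to
    # a single 8x8 block in each chrominance plane.
    chroma = (x // 2, y // 2)
    return {"Y": luma, "CB": [chroma], "CR": [chroma]}


blocks = macroblock_blocks(mb_col=1, mb_row=2)
print(blocks)
```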

[0008] Rather than using regions formed of a number of MBs in the normal scan order, rectangular regions consisting of N×M macroblocks (N, M greater than or equal to one) and substituting slice and GOB structures were proposed for ITU-T H.263 by Sen-ching Cheung, "Proposal on using Region Layer in H.263+", ITU-T SG15 WP1 document LBC-96-213, July 1996. However, the proposal was not adopted for H.263.

[0009] In ITU-T H.263 Independent Segment Decoding mode (ITU-T H.263 Annex R), segment boundaries (as defined by the boundaries of the slices or the upper boundaries of the GOBs for which GOB headers are sent, or the boundaries of the picture, whichever bounds a region in the smallest way) are treated similarly to picture boundaries, which eliminates all error propagation from neighboring slices. For example, errors cannot be propagated due to motion compensation or de-blocking loop filtering from neighboring slices. Segment boundaries can only be changed at INTRA pictures, i.e. when no inter-coding is required.

[0010] The ISO/IEC standard draft 14496-2:1999(E), referred to as MPEG-4 visual or MPEG-4 video, is a standard draft that has a design centered around a basic unit of content called an audio-visual object (AVO). Examples of AVOs are a musician (in motion) in an orchestra, the sound generated by that musician, the chair she is sitting on, the (possibly moving) background behind the orchestra, and explanatory text for the current passage. In MPEG-4 video, each AVO is represented separately and becomes the basis for an independent stream.

[0011] The coding of natural two-dimensional motion video is a part of MPEG-4 video. MPEG-4 video is capable of coding both conventional rectangular video objects and arbitrarily shaped two-dimensional video objects. The basic video AVO is called a video object (VO). The VOs can be scalable, i.e. they may be split up, coded, and sent in two or more video object layers (VOL). One of these VOLs is called the base layer, which all terminals must receive in order to display any kind of video. The remaining VOLs are called enhancement layers, which may be expendable in case of transmission errors or restricted transmission capacity. In case of non-scalable video coding, one VOL per VO is coded.

[0012] A snapshot in time of a video object layer is called a video object plane (VOP). For a rectangular video, this corresponds to a picture or a frame. However, in general, the VOPs can have an arbitrary shape. Each VOP can be divided into video packets. Each VOP and video packet is further divided into macroblocks similarly to ITU-T H.263. The colour (YUV) information of the macroblock is coded similarly to ITU-T H.263, i.e., the macroblock is further divided into 8×8 blocks. In addition, if the VOP has an arbitrary shape, the shape of the macroblock is coded as explained in the next paragraph.

[0013] The MPEG-4 video VOs may be of any shape, and furthermore the shape, size, and position of the object may vary from one frame to the next. In terms of its general representation, a video object is composed of three colour components (YUV) and an alpha component. The alpha component defines the object's shape on a picture-by-picture basis. Binary objects form the simplest class of objects. They are represented by a sequence of binary alpha maps, i.e. 2-dimensional pictures where each pixel is either black or white. MPEG-4 video provides a binary shape only mode for compressing these objects. The compression process is defined exclusively by a binary shape encoder for coding the sequence of alpha maps. In addition to binary objects, a grey-level alpha map can be used to define the opacity of the object. The object boundary is coded using a binary alpha map, while the grey-level alpha information is coded similarly to texture coding using the DCT transform. In addition to the sequence of object shape and opacity definitions, the representation comprises the colours of all the pixels within the interior of the object shape. MPEG-4 video encodes these objects using a binary shape encoder and then a motion compensated discrete cosine transform (DCT)-based algorithm for the interior texture coding.

[0014] It is also known to be advantageous to segment a video bit-stream into portions of different priorities, for example by scalable video coding, data partitioning, or region-based coding discussed above.

[0015] Scalable video coding and data partitioning suffer, however, from dependencies between different coding elements. An enhancement layer, for example, cannot be decoded correctly if the base layer has not been received correctly. Correspondingly, a low-priority partition is of no use if the corresponding high-priority partition has not been received. This makes the use of scalable video coding and data partitioning disadvantageous in some cases. Scalable coding and data partitioning do not provide means to handle spatial regions of interest differently from subjectively less important areas. Moreover, many forms of scalable coding, such as conventional signal to noise ratio (SNR) and spatial scalability, suffer from a worse compression efficiency compared to non-scalable coding. In the region-based video coding, on the other hand, the GOBs or slices may contain macroblocks of different subjective importance. Thus, no prioritization of GOBs and slices is typically possible.

[0016] Coding of arbitrarily shaped objects is currently considered too complex for handheld devices. This is further exemplified by the fact that MPEG-4 video shape coding tools are typically excluded from mobile video communication services of the planned third generation mobile telephones.

SUMMARY OF THE INVENTION

[0017] It is an object of the invention to provide an alternative suitable for mobile communication which yet provides at least some of the advantages similar to those offered by MPEG-4 video.

[0018] According to a first aspect of the invention there is provided a method of video encoding comprising the steps of:

[0019] dividing a picture into a set of regular shaped coding blocks having a predetermined alignment in relation to the area of the picture, each coding block corresponding to at least one group of elementary coding elements;

[0020] determining at least one shape within a picture;

[0021] selecting at least one subset of the coding blocks defining at least one area covering the at least one determined shape;

[0022] determining as at least one separate coding object the selected at least one subset of the coding blocks;

[0023] determining as a background object the part of the picture that excludes the at least one separate coding object;

[0024] encoding the at least one separate coding object; and

[0025] encoding as one coding object the background object.
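The partitioning steps above can be sketched as follows. This is only an illustration under stated assumptions: the coding blocks are taken to be 16×16 macroblocks, each determined shape is given as a pixel bounding box, the covering areas are rectangular (the abstract names rectangles as one example of an allowable group) and do not overlap, and the picture dimensions are multiples of 16. The names `covering_mb_rect` and `partition_picture` are hypothetical, and the encoding steps themselves are not shown.

```python
MB_SIZE = 16


def covering_mb_rect(shape_box):
    """Smallest MB-aligned rectangle (col0, row0, col1, row1), inclusive,
    covering a pixel box (left, top, right, bottom) with right/bottom exclusive."""
    left, top, right, bottom = shape_box
    return (left // MB_SIZE, top // MB_SIZE,
            (right - 1) // MB_SIZE, (bottom - 1) // MB_SIZE)


def partition_picture(width, height, shape_boxes):
    """Return (objects, background) as lists of MB addresses in scan order."""
    cols, rows = width // MB_SIZE, height // MB_SIZE
    rects = [covering_mb_rect(box) for box in shape_boxes]
    objects = [[] for _ in rects]
    background = []
    for addr in range(cols * rows):  # normal scan order
        row, col = divmod(addr, cols)
        for i, (c0, r0, c1, r1) in enumerate(rects):
            if c0 <= col <= c1 and r0 <= row <= r1:
                objects[i].append(addr)  # MB belongs to a separate coding object
                break
        else:
            background.append(addr)  # everything else forms the background object
    return objects, background


# One shape with pixel bounding box (40, 20)-(90, 70) in a 176x144 picture.
objects, background = partition_picture(176, 144, [(40, 20, 90, 70)])
print(len(objects[0]), len(background))
```

Listing the background addresses in scan order naturally skips the MBs owned by a separate coding object, which matches the way the abstract describes the formation of background slices.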

[0026] It is an advantage of the invention that a background coding object can be determined as a unitary coding object that is defined as the part of the picture that does not belong to any separate coding object and that the separate coding objects need not conform to the shapes which they cover.

[0027] Preferably, the background coding object is coded using the at least one separate coding object.

[0028] The background object cannot be reconstructed without determination of the position, shape and size of each separate coding object. If any data packet carrying a separate coding object is lost, the background coding object cannot be decoded in any case. The determination of the position and size of the at least one separate coding object indicates the presence of video data of the at least one separate coding object. There is thus a high likelihood of successful prediction of a background coding object using the at least one separate coding object, so that it is typically reasonable to encode the background coding object using the at least one separate coding object.
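The decoding condition described above can be expressed as a simple check. This is a minimal sketch, not the decoder of the application: the function name `can_decode_background`, its parameters and the dictionary of reconstructed rectangles are all assumptions for illustration.

```python
def can_decode_background(object_count, reconstructed_geometry, background_received):
    """Return True only if the background object may be decoded.

    reconstructed_geometry maps a separate coding object's index to its
    reconstructed position and size (hypothetical representation).
    """
    # The background is whatever remains once every separate coding object
    # is placed, so the geometry of all of them must be known.
    all_known = all(i in reconstructed_geometry for i in range(object_count))
    return background_received and all_known


# Both objects located: the background can be decoded.
print(can_decode_background(2, {0: (2, 1, 5, 4), 1: (7, 5, 9, 7)}, True))
# Object 1 was lost: the background cannot be placed, so it is not decoded.
print(can_decode_background(2, {0: (2, 1, 5, 4)}, True))
```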
