Composite audio waveforms


A technique for aligning a plurality of media clips is provided. One or more intra-clip points of interest (POIs) are identified in at least a first media clip. When aligning a first point in the first media clip with a second point in a second media clip, the first point may be snapped to the second point, wherein at least one of the first point and second point is an intra-clip POI. When a snap occurs, at least one of a visual or audible indication is generated, such as a “pop” sound, a snap line, or automatically aligning the first point with the second point when the first point is within a specified number of pixels of the second point. Techniques for representing multiple channels of an audio clip as a single waveform and caching waveforms are also provided.

Apple Inc. - Cupertino, CA, US
Inventor: Keith D. Salvucci
USPTO Application #: 20120269364 - Class: 381119 (USPTO) - 10/25/12 - Class 381: Electrical Audio Signal Processing Systems And Devices > With Mixer


The Patent Description & Claims data below is from USPTO Patent Application 20120269364, Composite audio waveforms.


PRIORITY CLAIM

This application claims the benefit as a Continuation of application Ser. No. 11/325,886, filed Jan. 4, 2006 the entire contents of which is hereby incorporated by reference as if fully set forth herein, under 35 U.S.C. §120. The applicant(s) hereby rescind any disclaimer of claim scope in the parent application(s) or the prosecution history thereof and advise the USPTO that the claims in this application may be broader than any claim in the parent application(s), which claims the benefit of priority from U.S. Provisional Application No. 60/642,138, filed on Jan. 5, 2005, entitled “Composite Audio Waveforms with Precision Alignment Guides”; the entire content of which is incorporated by this reference for all purposes as if fully disclosed herein.

BACKGROUND

The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the described approaches qualify as prior art merely by virtue of their inclusion in this section.

Digital audio players or video players are capable of playing audio and video data from digital files, such as, for example, MP3, WAV, or AIFF files. Known digital audio or video players are capable of showing basic information about a media file, such as the name of the file and any status or progression information regarding the playback process if the audio or video file is being played back on a digital audio or video player. This same type of information is available to and displayed by video, audio, and movie editing software.

FIG. 1 illustrates digital movie editing software that shows the name 102 of an audio file 101, basic time length information 104 associated with audio file 101, and a progression status bar 106. While this information is useful and, indeed, necessary in digital media editing, it would be beneficial to be able to see more detailed information about audio data, such as audio intensity over time, via a visual representation.

Sometimes, audio data comprises multiple channels, such as a surround sound mix, which could have six or more channels. Thus, the additional detailed information, alluded to above, could be presented with six or more visual representations, each associated with one channel. Not only do such visual representations occupy much space on a computer display, but much of the information may not be useful unless the user is interested in working specifically on one or more of those channels (i.e., the type of digital media editing the user wants to perform may not require editing multiple channels).

Another problem associated with digital media editing is generating visual representations of media files, such as audio clips. Significant time and memory is required to read in all the audio data for a given audio clip and then generate a visual representation based on the audio data.

Lastly, many users of media editing software wish to align two or more media clips. For example, a user may wish to begin a video clip as soon as an audio clip begins. However, oftentimes an audio clip begins with silence and a video clip begins with blank video. Furthermore, there may be many places within a video and audio clip, other than where the audio begins, at which a user may wish to align the media clips. Thus, it is likely that simply aligning the beginning or ending (i.e., edges) of a video clip with an edge of an audio clip may not produce the desired results.

Because simply aligning the edges of media clips may not produce the desired results, editors of digital media may have to manually edit each clip, such as deleting “silence” at the beginning of a media clip, or manually align the media clips with a selection device, such as a mouse. Each of these latter techniques is prone to producing less-than-precise alignments, where too much or too little audio is deleted at the beginning of an audio clip (when manually editing) or where a video clip may not start exactly when audio begins (when manually aligning).

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is depicted by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:

FIG. 1 is a representative screenshot of digital movie editing software displaying basic information about an audio clip without any accompanying graphic waveforms;

FIG. 2 illustrates two individual audio waveforms, each representing one of two channels of audio data, that are coalesced to create the single composite waveform of FIG. 3;

FIG. 3 is a representative screenshot of digital movie editing software displaying information about an audio clip that includes a single composite waveform that represents two channels of audio data, according to an embodiment of the invention;

FIG. 4 is a representative screenshot illustrating a “timeline snap” feature, according to an embodiment of the invention; and

FIG. 5 is a block diagram that depicts a computer system upon which an embodiment of the invention may be implemented.

DETAILED DESCRIPTION

Overview

Techniques are described hereafter for providing detailed information about audio when editing digital media, for example. Additional information may be displayed while an audio clip is being played as well as when the audio clip is not being played. An example of information that may be displayed is information about volume intensity at different points in time in an audio clip. Certain points of interest within an audio clip may also be automatically identified for “snapping.”

Information about an audio clip could be used when editing movies or other video data to more accurately synchronize video data with audio data, for example. Additionally, graphical audio waveforms corresponding to audio in a video file may be used to detect potential problems in the video file, such as loud outbursts from crowds of people. Graphical audio waveforms may also be used to identify points in a soundtrack to place edit points. Other benefits of allowing users to view information about audio data in a file, such as graphical waveforms representing intensity levels over time, will be apparent to those skilled in the art.

Composite Audio Waveform

The techniques described herein may be implemented in a variety of ways. Performance of such techniques may be integrated into a system or a device, or may be implemented as a stand-alone mechanism. Furthermore, the approach may be implemented in computer software, hardware, or a combination thereof.

The techniques described herein provide users with a single visual waveform graphic that represents a characteristic produced collectively by multiple tracks of audio data in a media clip. Media data is digital data that represents audio or video and that can be played or generated by an electronic device, such as a sound card, video card, or digital video recorder. A media clip is an image, audio, or video file or any portion thereof. A single waveform that reflects a characteristic produced collectively by multiple tracks is referred to herein as a “composite audio waveform”.

A displayed composite audio waveform may help in a variety of ways, such as to help a user to synchronize a song or sound clip to match action in a video clip. Embodiments that make use of the composite audio waveform to synchronize audio and video are useful in digital movie editing software.

According to one embodiment, the composite audio waveform is a representation of the audio intensity (volume) produced by combining all tracks found within the audio media clip.

A composite audio waveform that reflects the collective intensity of all tracks within an audio clip may be used to see where an audio clip builds in intensity. Users of movie editing software may use the visual cues provided in the composite audio waveform to align video frames to the audio. For example, users may use composite audio waveforms to align video to audio events, such as a certain drumbeat or the exact beginning or end of the audio.

In one embodiment, users have an option of turning the visual display of graphic waveforms on or off. For example, an option to turn waveforms on or off may be a preferences option. In another embodiment, the displayed waveforms may be resized or zoomed in on, allowing a user to see more details of the waveform when desired. For example, a user could select a waveform and press up and down arrows to change the zoom.

According to one embodiment, the technique of generating and displaying a composite audio waveform may be applied to the audio within video clips, as well as to audio clips themselves. Audio may be extracted from video clips that include audio tracks. Extracting audio from a video clip allows users to move or copy the audio to a different place within a movie.

An individual audio clip may be composed of a number of channels, e.g., two channels for stereo data—one for the left speaker and one for the right speaker. In one embodiment, all channels are coalesced into a single waveform that represents all channels. That is, one waveform shows the combined audio intensity for all channels of an audio clip. A single waveform may show a cumulative intensity, for example, by summing the intensity volumes of the separate channels.
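The summing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `coalesce_channels`, the use of absolute sample amplitude as the "intensity," and the sample values are all hypothetical and are not taken from the application.

```python
def coalesce_channels(channels):
    """Sum per-sample intensity across channels into one composite series.

    Assumption: "intensity" is approximated by the absolute sample value.
    """
    if not channels:
        return []
    length = len(channels[0])
    if any(len(ch) != length for ch in channels):
        raise ValueError("all channels must have the same number of samples")
    # One output value per sample position: the combined intensity of all channels.
    return [sum(abs(ch[i]) for ch in channels) for i in range(length)]


# Hypothetical stereo clip: four samples per channel.
left = [0, 100, -200, 50]
right = [0, -20, 40, -50]
composite = coalesce_channels([left, right])
print(composite)  # [0, 120, 240, 100]
```

The same function accepts any number of channels, so a six-channel surround mix would reduce to a single series in the same way.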

FIG. 2 illustrates two individual audio waveforms, each representing one of two channels of audio data, that are combined to create the single composite waveform of FIG. 3. As shown, the sound in channel 202 is more intense at the beginning of the audio clip than the sound in channel 204. This could occur, for example, when a guitar is heard from the right speaker at the beginning of a song, but not from the left.

FIG. 3 illustrates digital movie editing software that includes the ability to display a waveform graphic 301 for audio clip 101 of FIG. 1, according to an embodiment of the invention. As shown, waveform graphic 301 indicates the average intensity of the audio data in the audio clip over time. Waveform graphic 301 represents a single composite waveform that represents audio from an audio clip having multiple audio channels (e.g., a stereo audio clip), where the multiple channels are coalesced into one composite waveform.

Summing, or coalescing, the two audio channels of FIG. 2 into a single composite waveform as in FIG. 3 makes editing movies with audio more user-friendly and less confusing, while simultaneously conserving screen space on the user's display. The process of coalescing multiple channels into a single composite waveform may include summing, decimating, and reducing the bit depth of audio samples as described in more detail herein.

Forming the Composite Waveform

In one embodiment, to form the composite waveform, data is decimated using a 128 point sinc function to reduce the amount of data being managed in the application. A 128 point sinc function is not required, and in alternative embodiments, various sinc functions may be used, such as a 400 point sinc function. Any sinc function or any other method for downsampling audio data may be used.

A sinc function sinc(x), or “sampling function,” is a function associated with digital signal processing and the theory of Fourier transforms. The full name of the function is “sine cardinal,” but it may be referred to as “sinc.” Sinc filters may be used in many applications of signal processing. In one embodiment, a sinc process is used to take multiple sequential samples of digital audio data and reduce them to one sample that is a weighted average of all of the samples.

For example, consider an audio file with one channel (i.e., a mono audio file). The goal is to reduce the total number of samples in the audio file but still have a usable, representative signal. A 128 point sinc function will take 128 audio samples at a time, and reduce them to one sample that represents all 128 audio samples. In embodiments of the present invention, this process is iterated over the entire audio file and creates a file (or creates data in memory) that is 1/128th the size of the original file.

Sinc filtering may be considered a “weighted average” using coefficients generated from a sinc function. In one embodiment of the present invention, the sinc function used is:

sinc(x)=sin(x)/x

where x is the absolute value of the distance from the center of the samples being filtered.
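One way to picture this block-wise weighted average is the following Python sketch. It is not the application's implementation: the names `sinc_weights` and `decimate` are hypothetical, and the scaling of x (chosen here so that each block spans only the main lobe of the sinc function, keeping every weight positive) is an assumption, since the text does not state how distance is scaled. Trailing samples that do not fill a whole block are simply dropped in this sketch.

```python
import math


def sinc(x):
    """sinc(x) = sin(x)/x, with the limiting value 1 at x = 0."""
    return 1.0 if x == 0 else math.sin(x) / x


def sinc_weights(points):
    """Normalized weights for one block of `points` samples.

    Assumption: distances from the block center are scaled so the block
    spans (-pi, pi), i.e. only the main lobe of the sinc function.
    """
    center = (points - 1) / 2.0
    scale = math.pi / (points / 2.0)
    raw = [sinc(abs(i - center) * scale) for i in range(points)]
    total = sum(raw)
    return [w / total for w in raw]  # weights sum to 1 (a weighted average)


def decimate(samples, points=128):
    """Reduce every `points` consecutive samples to one weighted average."""
    weights = sinc_weights(points)
    out = []
    for start in range(0, len(samples) - points + 1, points):
        block = samples[start:start + points]
        out.append(sum(w * s for w, s in zip(weights, block)))
    return out


weights = sinc_weights(128)
reduced = decimate([1.0] * 256, points=128)  # 256 samples in, 2 samples out
```

Because the weights are normalized, a constant input signal is reproduced unchanged at the lower sample rate, which is a quick sanity check for any weighting scheme of this kind.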

Decimating the data, such as by using a sinc function, reduces memory overhead and increases speed of processing and plotting. After decimating, the data is further reduced in size by changing the bit depth of the audio samples. For example, most audio files stored on computers use 16 bits to represent one sample of audio. In one embodiment of the present invention, the sample size is reduced to 8 bits of data by truncation and rounding techniques. Bit depth reduction alone lowers the memory footprint of the audio data by 50%. Thus, the overall data size of a 16 bit stereo file can be reduced to 1/1600th of its original size during the processing of the data prior to plotting and saving to disk. In summary, such memory savings come from coalescing stereo data (reduced to ½ the size), reducing bit depth (reduced to ½ the size again), and decimating the audio data by a 400 point sinc function (reduced additionally to 1/400th of the size).
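The bit depth reduction and the size arithmetic above can be checked with a short Python sketch. The function names `to_8_bit` and `overall_reduction` are hypothetical, and round-to-nearest with clamping is only one plausible reading of "truncation and rounding techniques"; the application does not specify the exact method.

```python
from fractions import Fraction


def to_8_bit(sample_16):
    """Map a signed 16-bit sample (-32768..32767) to signed 8 bits (-128..127).

    Assumption: round to nearest by adding half a step before dropping the
    low 8 bits, then clamp so the largest inputs do not overflow.
    """
    reduced = (sample_16 + 128) >> 8
    return max(-128, min(127, reduced))


def overall_reduction(channels=2, bits_in=16, bits_out=8, sinc_points=400):
    """Fraction of the original size left after all three reductions."""
    return (Fraction(1, channels)          # coalescing channels into one
            * Fraction(bits_out, bits_in)  # reducing bit depth
            * Fraction(1, sinc_points))    # decimating by the sinc filter


print(to_8_bit(32767), to_8_bit(-32768), to_8_bit(256))  # 127 -128 1
print(overall_reduction())                                # 1/1600
```

With the 128 point sinc function mentioned earlier in place of the 400 point one, the same calculation gives 1/512 of the original size.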

Caching and Saving the Waveform

Typically, displaying audio waveforms is a slow, computationally expensive process that consumes a significant amount of memory. Techniques for displaying audio waveforms can require (1) the audio data for the waveform to be read from disk, (2) the waveform to be calculated from the audio data, and (3) the waveform to be drawn to the screen, each time a particular audio waveform is to be displayed.

In some embodiments, caching is used to solve performance problems associated with known waveform display techniques. In one embodiment, the waveform is calculated from the audio data only once during the lifetime of a “project” in movie editing software, or other software that uses the techniques of the present invention. Because the waveform is only calculated once from the audio data, the waveform does not have to be recalculated from data read from disk every time the waveform is to be displayed. In one embodiment, the waveform is calculated from the audio data only once during the lifetime of the audio data.

To cache the waveform after it has been calculated from the audio data, the audio data is transformed into a digital image representing the waveform during a first session of an application. A session is the period of time a user interfaces with an application; in this case, a media editing application. The session begins when the user accesses the application and ends when the user quits, or closes, the application. Before the first session ends, the digital image is durably saved to persistent storage, such as hard disks, floppy disks, optical disks, or tapes.

Input is received, e.g., from a user, to initiate a second session of the application. Input is also received to load the digital image from persistent storage. The digital image is loaded by reading the digital image from the persistent storage and displaying the digital image on a computer display, e.g., via a graphical user interface. Consequently, the digital image only needs to be calculated and generated once.

More specifically, audio data is drawn once into a digital image, and the digital image is saved. Typically, the saved image is much smaller than the actual audio data that it represents and therefore much faster to load and display. Using common fast graphic routines, the saved image may be resized or cropped as needed within a user interface. The “previously-calculated” image may also be presented with faded opacity, while the waveform is being recalculated, to indicate to a user that waveform processing is in progress. Thus, displaying a waveform in this manner is much faster than known methods of recalculating and displaying audio waveforms.
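The draw-once, load-later pattern described above might look like the following Python sketch. Everything specific here is an assumption made for illustration: the function names, the choice of the binary PGM grayscale format for the saved image, the image height, and the temporary directory standing in for a project folder. The application does not name an image format or a storage layout.

```python
import os
import tempfile


def render_waveform_pgm(levels, height=32):
    """Draw 8-bit intensity levels (0..255) as bars in a binary PGM image."""
    width = len(levels)
    rows = []
    for y in range(height):
        # Row 0 is the top of the image; a pixel is lit when the bar reaches it.
        threshold = (height - y - 1) * 255 // height
        rows.append(bytes(255 if lvl > threshold else 0 for lvl in levels))
    header = f"P5 {width} {height} 255\n".encode("ascii")
    return header + b"".join(rows)


def load_or_render(cache_path, compute_levels):
    """Return the saved image if present; otherwise compute, draw, and save it."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return f.read()  # later sessions: no audio is read or recalculated
    image = render_waveform_pgm(compute_levels())
    with open(cache_path, "wb") as f:
        f.write(image)  # first session: persist the drawn waveform
    return image


calls = []


def compute_levels():
    """Hypothetical stand-in for the expensive read-and-calculate step."""
    calls.append(1)
    return [0, 128, 255]


with tempfile.TemporaryDirectory() as project_dir:
    cache_path = os.path.join(project_dir, "clip_waveform.pgm")
    first = load_or_render(cache_path, compute_levels)   # calculates and saves
    second = load_or_render(cache_path, compute_levels)  # loads the saved image
```

In this sketch the expensive step runs once, and the second call returns the identical bytes straight from disk, which mirrors the two sessions described in the text.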

Patent Info
Application #: US 20120269364 A1
Publish Date: 10/25/2012
Document #: 13540513
File Date: 07/02/2012
USPTO Class: 381119
Other USPTO Classes: (none listed)
International Class: 04B1/00
Drawings: 6