System and method for improving text input in a shorthand-on-keyboard interface
USPTO Application #: 20070094024
Abstract: A word pattern recognition system improves text input entered via a shorthand-on-keyboard interface. A core lexicon comprises commonly used words in a language; an extended lexicon comprises words not included in the core lexicon. The system only directly outputs words from the core lexicon. Candidate words from the extended lexicon can be outputted and simultaneously admitted to the core lexicon upon user selection. A concatenation module enables a user to input parts of a long word separately. A compound word module combines two common shorter words whose concatenation forms a long word.
Agent: Samuel A. Kassatly Law Office - San Jose, CA, US
Inventors: Per-Ola Kristensson, Shumin Zhai
Class: 704252000 (USPTO)
Related Patent Categories: Data Processing: Speech Signal Processing, Linguistics, Language Translation, And Audio Compression/decompression, Speech Signal Processing, Recognition, Word Recognition, Preliminary Matching
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application relates to the following co-pending U.S. patent applications: Ser. No. 10/325,197, titled "System and Method for Recognizing Word Patterns Based on a Virtual Keyboard Layout," Ser. No. 10/788,639, titled "System and Method for Recognizing Word Patterns in a Very Large Vocabulary Based on a Virtual Keyboard Layout," and Ser. No. 11/121,637, titled "System and Method for Issuing Commands Based on Pen Motion on a Graphical Keyboard," all of which are assigned to the same assignee as the present application, and are incorporated herein by reference.
FIELD OF THE INVENTION
[0002] The present invention generally relates to lexicon-based text entry and text prediction systems. More specifically, the present invention relates to text entry using shorthand-on-keyboard, an efficient method of entering words by drawing geometric patterns on a graphical on-screen keyboard.
BACKGROUND OF THE INVENTION
[0003] Shorthand on graphical keyboards (hereafter "shorthand-on-keyboard"), or Shorthand on a Keyboard as Graph (sokgraph), is an input method and system for efficiently entering text without a physical keyboard, typically using a stylus. Shorthand-on-keyboard enables the user to trace letter or functional keys on the graphical keyboard to enter words and commands into a computer. Experienced users partly or completely memorize the geometric patterns of frequently used words and commands on the keyboard layout and may draw these patterns based on memory recall using, for example, a digital pen.
[0004] Word-level recognition-based text entry systems, such as shorthand-on-keyboard and handwriting or speech recognition, as well as text prediction systems, all rely on some form of lexicon that defines the set of words that these systems recognize. The input of the user is matched against choices in the lexicon. Words not included in the lexicon are usually not automatically recognized. In such a case, a special mode has to be provided. For example, in shorthand-on-keyboard the user may initially check a candidate list (N-best list). If no choice on the candidate list is the intended word, the user decides whether the pattern drawn was incorrect. If the pattern drawn was correct, the user realizes that the intended word is not in the lexicon. The user then enters the new word in the lexicon by tapping the individual letters. Ideally, the lexicon comprises all words a particular user needs to write, no more, no less. A lexicon that is either too large or too small can introduce problems for the user.
[0005] A larger lexicon could present certain challenges, since it tends to reduce the recognition accuracy due to the likelihood of a greater number of distracters for each user input. In any language, there tends to be a core set of vocabulary that is common to all individuals. Beyond this core set, vocabulary tends to be specialized for a particular individual. For instance, an engineer may compose emails comprising highly technical terms and abbreviations for a particular field or business area. For other users, these specialized terms can be irrelevant and can introduce noise in the recognition process, making the recognition process less robust.
[0006] A smaller lexicon is typically a more robust lexicon in that user input is more likely to be correctly recognized, provided the intended word is in the lexicon. A smaller lexicon provides more flexibility and tolerance for the input of the user, allowing the input to be imprecise and inaccurate compared to the ideal form of the intended input choice. A further advantage of a small lexicon is that the search space is smaller. Consequently, a small lexicon allows reduction in the latency of a search. This is especially important in mobile devices where processing power is severely limited.
[0007] However, when a small lexicon does not contain the word the user needs, the user experience can be frustrating. A user does not know, prior to entry, whether a word is in the lexicon, causing uncertainty for the user. The lack of recognition of a word by a conventional system can occur either when the word is input incorrectly or when the word is not in the lexicon. Consequently, it can be difficult for the user to determine why a word is not recognized. In general, the user cannot know whether a word is in the lexicon except by repeatedly trying the word.
When the user is certain that the word is not in the lexicon, the user adds that word to the lexicon via an interface provided by the recognition system, by tapping as described earlier. A smaller lexicon requires a user to add words to the lexicon more often.
[0008] There are several conventional solutions to the lexicon size issue. A commonly used method is to use a large lexicon and then take advantage of higher-order language regularities, such as a word-level trigram model, to filter out highly unlikely candidates. The downside of a language model approach is generally the overhead of creating and making efficient use of a large language model. Moreover, a language model can introduce errors and mistakenly filter out the intended words. This is especially true if the language model is generic rather than customized to a particular user. In practice, efficient customization of a language model is difficult. Furthermore, a language model is difficult to integrate with a recognition technique that already has a high precision, such as shorthand-on-keyboard.
[0009] An alternative conventional approach creates a customized lexicon for a user by mining the written text generated by the user, for example, written emails and other documents. Although this approach does result in a lexicon more closely tailored to a specific user, a previously written corpus generated by a user may be too small to cover all of the desired words. Furthermore, in practice, it is difficult to write computer program code that can open and read the various email and document formats that the user may be using. This approach often requires the user to locate and select the previously written documents, which is inconvenient for the user. A customized lexicon may also be difficult to carry over across different devices.
[0010] Although these conventional solutions are adequate for their intended purpose, it is desirable to find a solution that enables a lexicon to have a relatively small number of irrelevant distracters to the user's desired input and yet allows easy access to almost all words the user may need, including more specialized words that are infrequently used by most users. Overall, there is a desire to include all words possibly needed by the user in a very large lexicon. However, a very large lexicon implies that more words match the pattern drawn on the keyboard given the same matching threshold, reducing the signal-to-noise ratio in the input system. Consequently, a larger lexicon corresponds to less flexibility and robustness for the user. Thus, there is a need for a lexicon configuration for a shorthand-on-keyboard system that balances ease of use with flexibility and robustness.
[0011] Another challenge to a conventional shorthand-on-keyboard input method is a requirement of entering text exactly at the word level, one word at a time. Some words are long. For relatively new users, it can be cognitively difficult to draw a long word with shorthand-on-keyboard in one stroke. This difficulty is particularly acute in some European languages in which long compound words are more common than in English. Furthermore, a user can find entry more convenient if common affixes can be drawn as a separate stroke from the stem of the word. For example, to write the word "working" with shorthand-on-keyboard, the user may wish to draw the pattern of w-o-r-k on a graphical keyboard, then draw i-n-g and combine the two into one word. Thus, there is a need for an effective system and method to automatically combine partial words on the keyboard ("sokgraphs") into one word as intended by the user.
[0012] What is therefore needed is a system, a computer program product, and an associated method for improving text input in a shorthand-on-keyboard interface.
The need for such a solution has heretofore remained unsatisfied.
SUMMARY OF THE INVENTION
[0013] The present invention satisfies this need, and presents a system, a computer program product, and an associated method (collectively referred to herein as "the system" or "the present system") for improving text input in a shorthand-on-keyboard interface. The present system comprises a core lexicon and an extended lexicon. The core lexicon comprises commonly used words in a language. The core lexicon typically comprises approximately 5,000 to 15,000 words, depending on the application of the present system. The extended lexicon comprises words not included in the core lexicon. The extended lexicon comprises approximately 30,000 to 100,000 words.
[0014] The core lexicon allows the present system to target commonly used words in identifying a gesture as a highest-ranked candidate word, providing the more robust recognition performance associated with a smaller lexicon. Only words from the core lexicon can be directly outputted in the present system. Additional candidate words are available from the extended lexicon, allowing a user to find lesser-known words on the candidate list, but only through menu selection. The present system enhances word recognition accuracy without sacrificing selection of words from a large lexicon. The core lexicon provides more flexibility and tolerance for the input of the user, allowing the input to be imprecise and inaccurate relative to the ideal form of the intended input choice.
[0015] The present system further comprises a recognition module, a pre-ranking module, and a ranking module. The recognition module generates an N-best list of candidate words corresponding to an input pattern. The pre-ranking module ranks the N-best candidate words according to predetermined criteria.
The ranking module adjusts the ranking of the N-best list of candidate words to place words drawn from the core lexicon higher than words drawn from the extended lexicon, generating a ranked list of word candidates. Only words in the core lexicon are presented as output by the present system. The present system lists candidate words found in the extended lexicon only in the N-best list; these words require user selection to become output. Once selected by a user from the N-best list, a word from the extended lexicon is admitted to the core lexicon.
[0016] More specifically, in a preferred embodiment, only words in the core lexicon are outputted by the recognition system. Words in the extended lexicon can only be listed in the N-best list and need explicit user selection to be outputted. Once selected, a word in the extended lexicon is also admitted to the core lexicon.
[0017] The present system reduces the overhead inflicted upon the user when the word gestured by the user is not in the vocabulary of the core lexicon. Instead of being unsure whether the word is missing from the lexicon or the system misrecognized the input, the user can scan the N-best list and select the desired candidate word.
[0018] The present system further comprises a concatenation module and a compound word module. The concatenation module enables a user to input parts of a long word separately; the present system automatically combines words and parts of words that are partial "sokgraphs" into the one word intended by the user. Word parts can be stems, such as "work", and affixes, such as "ing" or "pre". The compound word module combines two or more common shorter words whose concatenation forms a long word, such as short+hand in English. The concatenation of several short words into one compound word is more common in some European languages such as Swedish or German.
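The interplay of the ranking module, direct output, and admission described in paragraphs [0015] to [0017] can be illustrated with a minimal Python sketch. It is not taken from the application: the names `Lexicons`, `rank_candidates`, `direct_output`, and `select_candidate` are hypothetical, the pre-ranked N-best list is assumed to arrive as a plain list of words, and removing an admitted word from the extended lexicon is an assumption made to keep the two lexicons disjoint.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Lexicons:
    """Hypothetical container for the core and extended lexicons."""
    core: set[str] = field(default_factory=set)
    extended: set[str] = field(default_factory=set)

    def admit(self, word: str) -> None:
        """Admit a user-selected extended-lexicon word to the core lexicon.

        Assumption: the word is also removed from the extended lexicon so
        that the extended lexicon holds only words not in the core lexicon.
        """
        if word in self.extended:
            self.extended.discard(word)
            self.core.add(word)


def rank_candidates(pre_ranked: list[str], lex: Lexicons) -> list[str]:
    """Place core-lexicon candidates ahead of extended-lexicon candidates.

    sorted() is stable, so the pre-ranking order is kept within each group.
    """
    return sorted(pre_ranked, key=lambda w: w not in lex.core)


def direct_output(ranked: list[str], lex: Lexicons) -> Optional[str]:
    """Return the top candidate only if it comes from the core lexicon."""
    if ranked and ranked[0] in lex.core:
        return ranked[0]
    return None


def select_candidate(word: str, ranked: list[str], lex: Lexicons) -> str:
    """Model explicit user selection of a word from the N-best list."""
    if word not in ranked:
        raise ValueError(f"{word!r} is not on the candidate list")
    lex.admit(word)  # no effect for words already in the core lexicon
    return word


# Usage: an extended-lexicon word is listed but never output directly.
lex = Lexicons(core={"the", "then"}, extended={"thane"})
ranked = rank_candidates(["thane", "the", "then"], lex)
top = direct_output(ranked, lex)
chosen = select_candidate("thane", ranked, lex)
```

After the selection, "thane" belongs to the core lexicon in this sketch, so a later gesture that ranks it first would be output directly.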
[0019] The present system allows user interaction to adjust the concatenation of a word 1 and a word 2 and the decoupling of a combined word. When the user clicks on a concatenated word, for example "smokefree", a menu option such as "split to 'smoke free'" or an equivalent option is given to the user. Alternatively, a pen trace motion, such as a downward motion crossing the word "smokefree", can be defined as a split command. For concatenable words on which no action was taken due to low confidence, a menu option is embedded in word 1 and word 2. When the user clicks on word 1, the option "snap to right" or an equivalent option is selectable. When the user clicks on word 2, the option "snap to left" or an equivalent option is selectable. Alternatively, a pen gesture, such as a circle crossing both word 1 and word 2, is defined as the command to join the two words as one concatenated long word.
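The joining of separately entered parts in paragraph [0018] and the split command in paragraph [0019] can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the helper names `concatenate` and `split_options` are hypothetical, a join is accepted purely by lexicon lookup of the concatenated string (the confidence-based decision mentioned in the text is not modeled), and only two-way splits are considered.

```python
from typing import Optional


def concatenate(part1: str, part2: str, lexicon: set[str]) -> Optional[str]:
    """Join two separately entered parts if the result is a known word.

    Assumption: plain string concatenation followed by a lexicon lookup.
    """
    joined = part1 + part2
    return joined if joined in lexicon else None


def split_options(word: str, lexicon: set[str]) -> list[tuple[str, str]]:
    """List the ways a word decomposes into two known words.

    Each pair could back a menu entry such as "split to 'smoke free'".
    """
    return [
        (word[:i], word[i:])
        for i in range(1, len(word))
        if word[:i] in lexicon and word[i:] in lexicon
    ]


# Usage with a small hypothetical lexicon.
lexicon = {"work", "working", "smoke", "free", "smokefree"}
joined = concatenate("work", "ing", lexicon)    # stem + affix
splits = split_options("smokefree", lexicon)    # candidates for a split menu
```

In this sketch the affix "ing" does not need to be a lexicon entry itself; only the joined result is checked, which matches the stem-plus-affix example of "work" and "ing" forming "working".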