xvoice Correction Box Specification

Author: Jessica Hekman (jphekman@arborius.net)
Version: 0.2

Caveats: I have no attachment to any command or variable names or to any implementation specifics; I expect them to be changed to fit xvoice conventions. I am moderately attached to what I've called "user triggers" (i.e., the actual phrases users would say to trigger various commands), but these are only suggestions as well. Feel free to change them; I'm always willing to discuss them and explain why I feel certain defaults are better.

This specification draws heavily from the DragonDictate UI, which I and others find really excellent. I am happy to provide screenshots of relevant parts of the DragonDictate correction box to help illustrate what I am describing.

Correction Box

The correction box consists of a horizontal set of <correction_word_buffer> word boxes. The value of <correction_word_buffer> is better determined by an xvoice developer who can see how much space each box takes up and how long the correction box should be. The number should probably become user-settable at some point.

Above each word box is the phrase "Word n," numbering each word to help users specify words by number. Possibly this phrase should be in a slightly smaller font size than the font used for the actual words. Numbering starts on the right with 1 and proceeds left.

Word buffer

The correction box assumes that xvoice has maintained some record of previously-dictated words. If such functionality already exists, this spec should be modified to interact with that functionality instead of the proposed functionality.

I'm not sure how many words should be buffered. DragonDictate can remember up to 25 or so. Keep in mind that future versions of the correction box are likely to save alternate word possibilities in addition to the chosen word.

I will assume that the word buffer is essentially a list of word objects and correction boundary objects. Each word object contains two strings: current_value, which represents the word which has been sent to the application (in some cases the string is empty--for example, if the "word" was actually a command), and corrected_value, which is initially null. (At a future date, word objects may also contain lists of alternate word choices provided by the engine.)

Correction boundary objects contain no information. For more information about them, see the section called "Correction boundaries."

For the purposes of this spec, the "beginning" of the word buffer is array element 0, and the "end" is array element <size-of-array - 1>.
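
As a rough illustration of the word buffer described above, here is a minimal Python sketch. The class names (WordObject, CorrectionBoundary, WordBuffer), the max_entries parameter, and the rule that the oldest entries are dropped when the buffer is full are all assumptions made for the example; the spec itself leaves the buffer size and the eviction behavior open.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class WordObject:
    # Text already sent to the application; empty if the "word" was a command.
    current_value: str
    # Replacement text chosen by the user; None until a correction is made.
    corrected_value: Optional[str] = None


@dataclass
class CorrectionBoundary:
    # Carries no information; it only marks a position in the buffer.
    pass


BufferEntry = Union[WordObject, CorrectionBoundary]


class WordBuffer:
    def __init__(self, max_entries: int = 25):
        # 25 is only a placeholder; the spec does not fix the buffer size.
        self.max_entries = max_entries
        self.entries: List[BufferEntry] = []

    def append(self, entry: BufferEntry) -> None:
        # New entries go on the "end" (highest index) of the buffer.
        self.entries.append(entry)
        # Assumption: when full, the oldest entry falls off the "beginning".
        if len(self.entries) > self.max_entries:
            del self.entries[0]
```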

The following non-user-level commands can be performed on a word object:

Word boxes

A word box is a rectangle, designed to hold one word. Its exact size is best determined by playing around with different values and seeing how they look; it should probably be about 20 characters wide by default (wider for particularly long words or command phrases).

Each word box corresponds to exactly one object in the word list. Word boxes are in the same order as objects in the word list, with the leftmost box corresponding to the "beginning" of the word buffer and the rightmost box corresponding to the "end." Word boxes display the value of the <current_value> variable of their corresponding word object.

A user can spell or type the correct word into a word box. When that box subsequently loses focus, it should compare the string in its buffer with the <current_value> of its word object. If the strings differ, it should copy the value of its buffer to the <corrected_value> slot of its word object.
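
The focus-loss behavior might look roughly like the following Python sketch. WordBox, its text attribute, and on_focus_lost are hypothetical names; a real implementation would hook into whatever focus event the toolkit used by xvoice provides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WordObject:
    current_value: str
    corrected_value: Optional[str] = None


class WordBox:
    def __init__(self, word: WordObject):
        self.word = word
        # The box starts out showing the word as it was sent to the application.
        self.text = word.current_value

    def on_focus_lost(self) -> None:
        # Record a correction only if the user actually changed the text.
        if self.text != self.word.current_value:
            self.word.corrected_value = self.text
```

Leaving corrected_value untouched when the text is unchanged keeps it null for words the user never edited.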

In possible future versions of the correction box, alternate word choices will be offered, which the user can select or edit.

Correction boundaries

The following events cause a correction boundary to be inserted at the end of the word buffer:

Note that, when either of these events is triggered by a spoken command, a word object corresponding to that command will be inserted on the end of the word buffer, as with any recognized word or phrase, and only then will the correction boundary be inserted.
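
The ordering in the note above can be sketched in Python as follows. The function name insert_correction_boundary and its spoken_command flag are hypothetical, and the sketch assumes that a command's word object has an empty current_value, as described in the "Word buffer" section.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WordObject:
    current_value: str
    corrected_value: Optional[str] = None


class CorrectionBoundary:
    pass


def insert_correction_boundary(buffer: List[object], spoken_command: bool) -> None:
    if spoken_command:
        # The command itself is recorded first; it sent no text to the
        # application, so its current_value is empty.
        buffer.append(WordObject(current_value=""))
    # Only then is the boundary placed at the end of the buffer.
    buffer.append(CorrectionBoundary())
```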

Commands available from xvoice

bringUpCorrectionBox
default user triggers provided: "oops," "bring up correction box"
switches focus to correction box; loads correction box vocabulary

scratch n
default user triggers provided: "correction n," "scratch n," "correction" (->"correction 1"), "scratch that" (->"scratch 1")
attempt to delete last n words from currently focused application:
  1. call setCorrectedValue("") on last n objects in word list
  2. call triggerCorrectionProcess
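
A minimal Python sketch of the two steps above. It assumes that "last n objects" means the last n word objects (correction boundaries are skipped) and that the correction process is passed in as a callable; both choices, and the snake_case names, are illustrative only.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class WordObject:
    current_value: str
    corrected_value: Optional[str] = None

    def set_corrected_value(self, value: str) -> None:
        self.corrected_value = value


class CorrectionBoundary:
    pass


def scratch(buffer: List[object], n: int,
            trigger_correction_process: Callable[[], None]) -> None:
    words = [e for e in buffer if isinstance(e, WordObject)]
    # Step 1: an empty corrected value means "delete this word".
    if n > 0:
        for word in words[-n:]:
            word.set_corrected_value("")
    # Step 2: let the correction process apply the change to the application.
    trigger_correction_process()
```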

triggerCorrectionProcess
default user triggers provided: none
the following is a fairly high-level algorithm. I will happily be more specific if that's helpful.
  1. from the word list, select list of words after the last correction boundary ("words-since-boundary")
  2. from "words-since-boundary," select the list of words from (and including) the earliest word whose <corrected_value> has been set and differs from its <current_value> through the end of the list ("words-to-correct")
  3. change focus to the last focused application
  4. count the characters in the <current_value> of each word object in "words-to-correct," plus the appropriate number of spaces between words. Send that many backspace characters.
  5. starting from the beginning of "words-to-correct," send the characters corresponding to the result of calling getCorrectionCharacters on each word object (with the appropriate number of spaces between each); call commitCorrection on each word object.
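
The algorithm above can be sketched in Python as follows. Several things here are assumptions rather than part of the spec: the behavior of getCorrectionCharacters (return the corrected text if one is set, otherwise the current text) and commitCorrection (make the correction the new current value), the one-space-between-words rule, the send callable that stands in for delivering keystrokes, and the reading of "words-to-correct" as running from the earliest changed word through the end of the buffer, since that is the text that has to be erased and retyped. Step 3 (refocusing the application) is left to the caller.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

BACKSPACE = "\b"


@dataclass
class WordObject:
    current_value: str
    corrected_value: Optional[str] = None

    def get_correction_characters(self) -> str:
        # Assumed: the corrected text if there is one, else the original.
        if self.corrected_value is None:
            return self.current_value
        return self.corrected_value

    def commit_correction(self) -> None:
        # Assumed: the correction becomes the new current value.
        self.current_value = self.get_correction_characters()
        self.corrected_value = None

    def is_changed(self) -> bool:
        return (self.corrected_value is not None
                and self.corrected_value != self.current_value)


class CorrectionBoundary:
    pass


def join_words(values: List[str]) -> str:
    # Assumed spacing rule: one space between non-empty words.
    return " ".join(v for v in values if v)


def trigger_correction_process(buffer: List[object],
                               send: Callable[[str], None]) -> None:
    # Step 1: keep only the entries after the last correction boundary.
    start = 0
    for i, entry in enumerate(buffer):
        if isinstance(entry, CorrectionBoundary):
            start = i + 1
    since_boundary = [e for e in buffer[start:] if isinstance(e, WordObject)]

    # Step 2: the earliest changed word and everything after it is retyped.
    first = next((i for i, w in enumerate(since_boundary) if w.is_changed()), None)
    if first is None:
        return  # nothing to correct
    to_correct = since_boundary[first:]

    # Step 4: erase the old text, counting the spaces between words.
    old_text = join_words([w.current_value for w in to_correct])
    send(BACKSPACE * len(old_text))

    # Step 5: send the replacement text, then commit each word.
    send(join_words([w.get_correction_characters() for w in to_correct]))
    for w in to_correct:
        w.commit_correction()
```

For example, if the user dictated "I here you" and corrected "here" to "hear," this sketch sends eight backspaces (erasing "here you") followed by "hear you."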

Commands available from correction box

The numbers and keys vocabularies should be enabled in the correction box by default.

focusToWord n
default user triggers provided: "word n"
moves focus to the word box corresponding to n
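
Because word numbering starts at the right with 1, mapping "word n" to a box takes a small conversion. The Python sketch below assumes the boxes are stored left to right in a 0-indexed list; box_index_for_word and box_count are hypothetical names.

```python
def box_index_for_word(n: int, box_count: int) -> int:
    """Return the left-to-right index (0-based) of the box labeled "Word n"."""
    if not 1 <= n <= box_count:
        raise ValueError(f"word {n} is not in the correction box")
    # "Word 1" is the rightmost box, so numbers grow as indices shrink.
    return box_count - n
```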

close commit?=true|false
default user triggers provided: "okay" (close commit?=true), "cancel" (close commit?=false)
closes correction box, returns focus to previous application, and, if commit?==true, triggers correction process