xvoice Correction Box Specification
Author: Jessica Hekman (jphekman@arborius.net)
Version: 0.2
Caveats: I have no attachment to any command or variable names,
or any implementation specifics;
I expect them to be changed to fit xvoice conventions. I have some
moderate attachment to what I've called "user triggers" (i.e., the
actual phrases which users would say to trigger various commands), but
of course these are just suggestions as well; if you would like to
change them, feel free, but remember that I'd always be willing to
discuss them to explain why I feel certain defaults are better.
This specification draws heavily from the DragonDictate UI, which I and
others find really excellent. I am happy to provide screenshots of
relevant parts of the DragonDictate correction box to help illustrate
what I am describing.
Correction Box
The correction box consists of a horizontal row of
<correction_word_buffer> word boxes. The value of
<correction_word_buffer> is best determined by an xvoice
developer who can see how much space each box takes up and how
long the correction box can reasonably be. The number should
probably become user-settable at some point.
Above each word box is the phrase "Word n," numbering each word
to help users specify words by number. Possibly this phrase should be
in a slightly smaller font size than the font used for the actual
words. Numbering starts on the right with 1 and proceeds left.
Word buffer
The correction box assumes that xvoice has maintained some record of
previously-dictated words. If such functionality already exists, this
spec should be modified to interact with that functionality instead of
the proposed functionality.
I'm not sure how many words should be buffered. DragonDictate can
remember up to 25 or so. Keep in mind that future versions of the
correction box are likely to save alternate word possibilities in
addition to the chosen word.
I will assume that the word buffer is essentially a list of word
objects and correction boundary objects. Each word object contains two
strings: current_value, which
represents the word which has been sent to the application
(in some cases the string is empty--for example, if the "word" was
actually a command), and corrected_value, which is initially
null. (At a future date, word objects may also contain lists of
alternate word choices provided by the engine.)
Correction boundary objects contain no information. For more
information about them, see the section called "Correction
boundaries."
For the purposes of this spec, the "beginning" of the word buffer is
array element 0, and the "end" is array element <size-of-array -
1>.
The following non-user-level commands can be performed on a word
object:
- setCorrectedValue(value:String) : sets word object's
<corrected_value> slot to <value>
- commitCorrection : if <corrected_value> != null,
copies <corrected_value> to
<current_value> and sets <corrected_value>
to null.
- getCorrectionCharacters : if <corrected_value> !=
null, return it; else, return <current_value>
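
As a minimal sketch of the word object and its three commands, the
following Python models the two strings and the operations listed above.
The class name WordObject and the snake_case method names are
illustrative assumptions, not xvoice conventions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WordObject:
    """Hypothetical word object; all names here are illustrative only."""
    current_value: str                      # text already sent to the application
    corrected_value: Optional[str] = None   # pending correction, None if there is none

    def set_corrected_value(self, value: str) -> None:
        # Record a pending correction without touching current_value.
        self.corrected_value = value

    def commit_correction(self) -> None:
        # Promote the pending correction, if any, and clear the slot.
        if self.corrected_value is not None:
            self.current_value = self.corrected_value
            self.corrected_value = None

    def get_correction_characters(self) -> str:
        # Prefer the pending correction; fall back to what was sent.
        if self.corrected_value is not None:
            return self.corrected_value
        return self.current_value


word = WordObject("there")
word.set_corrected_value("their")
pending = word.get_correction_characters()
word.commit_correction()
```

After the commit, the word object holds the corrected text in
current_value and its corrected_value slot is empty again.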
Word boxes
A word box is a rectangle, designed to hold one word. Its exact size
is best determined by playing around with different values and seeing
how they look; it should probably be about 20 characters wide by
default (wider for particularly long words or command phrases).
Each word box corresponds to exactly one object in the word list. Word
boxes are in the same order as objects in the word list, with the
leftmost box corresponding to the "beginning" of the word buffer and
the rightmost box corresponding to the "end." Word boxes display the
value of the <current_value> variable of their
corresponding word object.
A user can spell or type the correct word into a word box. When that
box subsequently loses focus, it should compare the string in its
buffer with the <current_value> of its word object. If
the strings differ, it should copy the value of its buffer to the
<corrected_value> slot of its word object.
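
The focus-loss rule above can be sketched as follows. WordBox, its text
attribute, and on_focus_lost are hypothetical names; a real
implementation would hook into whatever event the chosen toolkit
provides.

```python
from typing import Optional


class WordObject:
    """Hypothetical stand-in for a word object in the word buffer."""
    def __init__(self, current_value: str) -> None:
        self.current_value = current_value
        self.corrected_value: Optional[str] = None


class WordBox:
    """Hypothetical word box; `text` is the box's editable buffer."""
    def __init__(self, word: WordObject) -> None:
        self.word = word
        self.text = word.current_value   # the box initially shows current_value

    def on_focus_lost(self) -> None:
        # Only record a correction when the user actually changed the text.
        if self.text != self.word.current_value:
            self.word.corrected_value = self.text


box = WordBox(WordObject("wreck"))
box.text = "recognize"      # the user types or spells the right word
box.on_focus_lost()
```

Note that current_value is left alone at this point; it changes only
when the correction is committed.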
In possible future versions of the correction box, alternate word
choices will be offered, which the user can select or edit.
Correction boundaries
The following events cause a correction boundary to be inserted at the
end of the word buffer:
- recognition of a command
- changing of focus from the current application (either by a
spoken command or by other means)
Note that, when either of these events is triggered by a spoken
command, a word object corresponding to that command will be inserted
on the end of the word buffer, as with any recognized word or phrase,
and only then will the correction boundary be inserted.
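
To illustrate the ordering, here is a small sketch of how recognized
phrases and boundaries might be appended to the word buffer. The
function names are hypothetical, and giving every command an empty
current_value is a simplifying assumption.

```python
class Word:
    """Hypothetical word object; only current_value matters here."""
    def __init__(self, current_value: str) -> None:
        self.current_value = current_value


class CorrectionBoundary:
    """Marker object; carries no information."""


def record_recognition(buffer: list, text: str, is_command: bool) -> None:
    # Assumption: a command sends no text to the application.
    buffer.append(Word("" if is_command else text))
    # The boundary goes in only after the command's own word object.
    if is_command:
        buffer.append(CorrectionBoundary())


def record_focus_change(buffer: list) -> None:
    # Focus changed by non-spoken means: boundary only, no word object.
    buffer.append(CorrectionBoundary())


buffer: list = []
record_recognition(buffer, "hello", is_command=False)
record_recognition(buffer, "save file", is_command=True)
record_recognition(buffer, "world", is_command=False)
kinds = [type(item).__name__ for item in buffer]
```

Here kinds lists a word, the command's word, the boundary, and then the
word dictated after the command.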
Commands available from xvoice
- bringUpCorrectionBox
- default user triggers provided: "oops," "bring
up correction box"
- switches focus to correction box; loads correction box vocabulary
- scratch n
- default user triggers provided: "correction n,"
"scratch n," "correction" (->"correction 1"), "scratch that"
(->"scratch 1")
- attempt to delete last n words from currently
focused application:
- call setCorrectedValue("") on last n objects in word
list
- call triggerCorrectionProcess
- triggerCorrectionProcess
- default user triggers provided: none
- the following is a fairly high-level algorithm. I will
happily be more specific if that's helpful.
- from the word list,
select list of words after the last correction boundary
("words-since-boundary")
- from "words-since-boundary,"
select the earliest word for which <corrected_value> is non-null
and differs from <current_value>, together with every word
after it ("words-to-correct")
- change focus to the last focused application
- count the characters in the <current_value> of each
word object in "words-to-correct," plus the appropriate number
of spaces between words. Send that many backspace characters.
- starting from the beginning of "words-to-correct,"
send the characters corresponding to the result of calling
getCorrectionCharacters on each word object
(with the appropriate number of
spaces between each); call commitCorrection on each word object.
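
The triggerCorrectionProcess algorithm could be sketched roughly as
below. This is only one reading of the steps: it takes
"words-to-correct" to run from the earliest changed word through the
end of the buffer, since those are the words the backspaces must pass
over, and it assumes exactly one space between non-empty words. All
names are hypothetical, and the application is modeled as a plain
string rather than a focused window.

```python
from typing import List, Optional, Tuple


class Word:
    def __init__(self, current_value: str) -> None:
        self.current_value = current_value
        self.corrected_value: Optional[str] = None


class Boundary:
    """Correction boundary marker."""


def join_words(values: List[str]) -> str:
    # Assumption: exactly one space between non-empty words.
    return " ".join(v for v in values if v)


def plan_correction(buffer: list) -> Tuple[int, str]:
    """Return (number of backspaces, replacement text) for the buffer."""
    # Keep only the words after the last correction boundary.
    last = max((i for i, item in enumerate(buffer)
                if isinstance(item, Boundary)), default=-1)
    since_boundary = buffer[last + 1:]
    # Find the earliest word with a pending, differing correction.
    changed = [i for i, w in enumerate(since_boundary)
               if w.corrected_value is not None
               and w.corrected_value != w.current_value]
    if not changed:
        return 0, ""
    to_correct = since_boundary[changed[0]:]
    # Backspace over the old text, spaces included.
    backspaces = len(join_words([w.current_value for w in to_correct]))
    # Build the new text, committing each word as it is emitted.
    new_values = []
    for w in to_correct:
        if w.corrected_value is not None:
            w.current_value, w.corrected_value = w.corrected_value, None
        new_values.append(w.current_value)
    return backspaces, join_words(new_values)


buffer = [Word("old"), Boundary()]
buffer += [Word(t) for t in "I went to there house".split()]
buffer[5].corrected_value = "their"          # fix "there"
app_text = "I went to there house"
backspaces, replacement = plan_correction(buffer)
app_text = app_text[:len(app_text) - backspaces] + replacement
```

The sketch deliberately ignores the space that precedes the first
corrected word, so a "scratch" that empties the trailing words would
leave that space behind; a real implementation would need to decide
how to handle it.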
Commands available from correction box
The numbers and keys vocabularies should be enabled in the correction
box by default.
- focusToWord n
- default user triggers provided: "word n"
- moves focus to the word box corresponding to n
- close commit?=true|false
- default user triggers provided: "okay" (close commit?=true),
"cancel" (close commit?=false)
- closes correction box, returns focus to previous application, and,
if commit?==true,
triggers correction process
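
Because word boxes are numbered from the right while the word buffer is
indexed from the left, focusToWord needs a small index conversion. The
sketch below shows one way to do it; the function name is hypothetical.

```python
from typing import List


def box_index_for_word_number(n: int, box_count: int) -> int:
    """Map the spoken number in "word n" to a left-to-right box index."""
    if not 1 <= n <= box_count:
        raise ValueError(f"no word box numbered {n}")
    # Word 1 is the rightmost box, so count back from the right edge.
    return box_count - n


boxes: List[str] = ["I", "went", "to", "there", "house"]
focused = boxes[box_index_for_word_number(2, len(boxes))]
```

With five boxes, "word 2" lands on the second box from the right, which
here holds "there."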