SALT Interactors
Metrics |
To measure the usability of the software, we employed two types of metrics: preference metrics and performance metrics. Preference metrics are subjective measurements of how the test subject evaluated the software: whether it was easy to use, whether the subject enjoyed the experience, and so on. Performance metrics are objective measurements of the results of the test subject’s interaction with the software: how often errors occurred, how often the test subject spoke a word that was not understood, and so on. |
Preference Metrics |
Preference metrics were measured on a scale from 1 to 5. In phase 1 of the tests, the test subject was simply asked to give a numerical rating for each. In phase 2, we assigned specific words to each of the numeric measurements. Not all preference metrics were used with all interactors.
Clarity – The test subject was told, “Rate how well you understood what the computer wanted you to do.” In phase 2, the value assignments were: 1=very poorly, 2=somewhat poorly, 3=ok, 4=well, 5=very well.
Effectiveness – The test subject was told, “Rate how well you were able to accomplish your task.” In phase 2, the value assignments were: 1=very poorly, 2=somewhat poorly, 3=ok, 4=well, 5=very well.
Ease of use – The test subject was told, “Rate how easy this program was to use.” In phase 2, the value assignments were: 1=very difficult, 2=somewhat difficult, 3=ok, 4=easy, 5=very easy.
Ease of Multiple Selection – The test subject was told, “Rate how easy it was to select multiple items.” In phase 2, the value assignments were: 1=very difficult, 2=somewhat difficult, 3=ok, 4=easy, 5=very easy.
Confirmation – The test subject was told, “Rate how you liked the confirmation of your choices at the end of the dialog.” In phase 2, the value assignments were: 1=did not like at all, 2=liked somewhat, 3=neutral, 4=liked, 5=liked a lot.
Deselection – The test subject was told, “Rate how well changing your selections at the end of the program worked.” In phase 2, the value assignments were: 1=very poorly, 2=somewhat poorly, 3=ok, 4=well, 5=very well.
Prompt Speed – The test subject was asked, “Were the options provided too slow, too fast, or just right?” In phase 2, the value assignments were: 1=very slowly, 2=somewhat slowly, 3=just right, 4=somewhat quickly, 5=very quickly.
Speech Preference – The test subject was asked, “Do you prefer to say the items you want immediately, or select them from a list of options?” In phase 2, the value assignments were: 1=I strongly prefer to speak the items, 2=I slightly prefer to speak the items, 3=I like having both options, 4=I slightly prefer to select from a list, 5=I strongly prefer to select from a list.
Second Prompt – The test subject was told, “Rate how helpful the second prompt was.” This metric was not used in the second round of testing. |
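The phase 2 value assignments amount to a lookup from a numeric rating to a word. The following is only an illustrative sketch of that mapping for three of the metrics; the names `PHASE2_LABELS` and `label_for` are hypothetical and are not taken from the project's software.

```python
# Illustrative sketch only: these names are hypothetical, not part of the
# project's code. The label text is copied from the phase 2 value assignments.
PHASE2_LABELS = {
    "clarity": ["very poorly", "somewhat poorly", "ok", "well", "very well"],
    "ease_of_use": ["very difficult", "somewhat difficult", "ok", "easy", "very easy"],
    "prompt_speed": ["very slowly", "somewhat slowly", "just right",
                     "somewhat quickly", "very quickly"],
}


def label_for(metric: str, rating: int) -> str:
    """Return the phase 2 wording for a rating on the 1-to-5 scale."""
    if not 1 <= rating <= 5:
        raise ValueError("rating must be between 1 and 5")
    # Ratings start at 1, list indices start at 0.
    return PHASE2_LABELS[metric][rating - 1]
```

For example, `label_for("clarity", 3)` returns `"ok"`. The remaining metrics would follow the same pattern with their own five labels.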
Performance Metrics |
The following performance metrics were used for these tests. Not all metrics were used with all interactors.
Time per iteration – The time (in milliseconds) it took to make a single choice in the interactor.
Error Rate – The rate at which the interactor returned incorrect results. This is distinct from the bad response rate because it judges the end result of the interactor execution, not any intermediate data. The error rate was measured as a percentage of the total number of results returned from the interactor.
Bad Response Rate – The rate at which the test subject uttered a word or phrase that the interactor did not understand. The bad response rate was measured as a percentage of the total number of times the interactor was listening for a response.
No Response Rate – The rate at which the test subject said nothing when the interactor expected a response. The no response rate was measured as a percentage of the total number of times the interactor was listening for a response.
Second-level No Rate – The rate at which test subjects answered in the negative when asked to confirm their response. The second-level no rate was measured as a percentage of the total number of results returned from the interactor. |
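The rate definitions above differ mainly in their denominators: some are percentages of results returned, others are percentages of listening attempts. The sketch below makes that explicit. It assumes the raw counts have already been tallied for one interactor; the `InteractorLog` type and its field names are hypothetical, and averaging the per-choice times is an assumption, since the text does not say how times were summarized.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class InteractorLog:
    """Hypothetical tally of one interactor's test session (illustrative names)."""
    results_returned: int            # total results returned by the interactor
    incorrect_results: int           # results that were wrong
    listens: int                     # times the interactor listened for a response
    bad_responses: int               # utterances the interactor did not understand
    no_responses: int                # listens during which the subject said nothing
    confirmation_nos: int            # negative answers to a confirmation prompt
    iteration_times_ms: List[float]  # time for each single choice, in milliseconds


def percentage(count: int, total: int) -> float:
    """Return count as a percentage of total, or 0.0 when total is zero."""
    return 100.0 * count / total if total else 0.0


def performance_metrics(log: InteractorLog) -> Dict[str, float]:
    times = log.iteration_times_ms
    return {
        # Assumption: per-choice times are summarized by their mean.
        "time_per_iteration_ms": sum(times) / len(times) if times else 0.0,
        # Measured against the number of results returned.
        "error_rate_pct": percentage(log.incorrect_results, log.results_returned),
        "second_level_no_rate_pct": percentage(log.confirmation_nos, log.results_returned),
        # Measured against the number of times the interactor was listening.
        "bad_response_rate_pct": percentage(log.bad_responses, log.listens),
        "no_response_rate_pct": percentage(log.no_responses, log.listens),
    }
```

With 1 incorrect result out of 20 returned, the error rate is 5 percent; with 5 misunderstood utterances over 50 listens, the bad response rate is 10 percent.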
Phase 1 Metrics |
Performance Metrics |
Multiple List Selection with Deselecting Different Prompts for Different Responses (n-ary prompt) |
Preference Metrics |
Phase 2 Metrics |
Performance Metrics |
Multiple List Selection with Deselecting 3 Double Yes/No (using Speech) 4 |
Preference Metrics |
An excellent definition of the word metric could be found on:
|
Portland State University
Capstone Summer 2004
|