Contextual Computing Group
Stuff for HTK....
We've gotten HTK to work for non-speech data on the Linux machines. These scripts have been tested on Red Hat Linux. The current configuration is designed to take data from the "findlaser" program (which uses "captain crunch").
These files can be found on /net/hm41/projects/htk_scripts.
Below are the two directories:
- clean/
    - commands
        contains a list of commands
    - datafiles
        contains a list of your data files to be used in this run
    - hmmdefs/
        directory that contains your HMM definitions - there are several in there, but you may need to create your own, depending on your data set; don't forget to change which definition you use in the setupdata script
    - *.txt
        text files that contain your data sets
    - setupdata
        the script file that runs everything
- utils/
    - parsedata.c --> parsedata
        takes in a data file and separates out the data sets, which are delimited by "MARK" lines in the file
    - helene_prepare.c --> prepare
        reads in the commands, creates a sentence file, and makes the ext and lab files; assumes each sentence is a single-command sentence
    - create_grammar.c --> create_grammar
        reads in the commands from the command file and creates a grammar and dictionary; assumes a simple, single-gesture grammar
    - random.c --> random
        takes in a list of files and randomly splits them into a training set and a testing set
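The splitting step that parsedata performs can be sketched roughly as follows. This is a Python illustration of the idea (splitting a capture file wherever a "MARK" line appears), not the actual C program; the exact delimiter handling is an assumption:

```python
# Rough sketch of what parsedata does: split a data file into
# separate data sets wherever a line reading "MARK" appears.
# Illustration only - the real parsedata.c may handle edge cases
# differently.

def split_on_mark(lines):
    """Return a list of data sets, one per MARK-delimited segment."""
    sets, current = [], []
    for line in lines:
        if line.strip() == "MARK":
            if current:            # close the segment in progress
                sets.append(current)
            current = []
        else:
            current.append(line.rstrip("\n"))
    if current:                    # trailing segment with no final MARK
        sets.append(current)
    return sets
```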
Getting started
First, copy utils and clean to the directory that you are going to be working in. Then you can set up your files to run however you want. The setupdata script in clean/ assumes that the utils directory is in a parallel directory. You will need to edit the following files:
- commands - each command that you will be using should be listed on its own line in this file
- datafiles - each data file that you will be using should be listed on its own line in this file
- setupdata
    - the default setting is to use hmmdefs/4state-1-noskip-4vec; you can change this to whatever you want (there are some HMM definitions already in the directory, or you can make your own)
    - any HTK settings you want to change can easily be modified in the setupdata script
    - currently the script only goes through 9 training iterations; you can change this
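For illustration, the two list files might look like this (the command and file names below are made up, not from an actual data set). A hypothetical commands file:

```
left
right
stop
```

and a hypothetical datafiles file:

```
run1.txt
run2.txt
run3.txt
```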
The script randomizes which data sets end up in the training set and which end up in the testing set. I recommend running the tests multiple times to compare the performance.
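The kind of randomized split performed by random can be sketched as below. This is a Python sketch, not the actual random.c; the 50/50 split ratio is an assumption:

```python
import random

# Sketch of a randomized train/test split: shuffle the list of data
# files and divide it in two. The split fraction is an assumption -
# the actual random.c may use a different ratio.

def split_train_test(files, train_fraction=0.5, seed=None):
    """Return (training_set, testing_set) as two disjoint lists."""
    shuffled = list(files)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Because the split is random, repeated runs give different partitions, which is why running the tests several times gives a fairer picture of performance.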
Running the scripts
- "copy -r clean" to your test directory and change to that directory
- running "script" will keep a log of everything that passes by
- run "setupdata" - it will take a while and print a lot of stuff to the screen really fast, thats why we ran script
- when setupdata is done, type "exit" - this will end the scripting
- the output from all of the training and testing will be in the file "typescript"
Troubleshooting
- checking your HTK ext files - you can use the command "HList -h filename.ext" to show the contents of your ext files; it should print a nice header (Sample bytes, Sample kind, Num comps, Sample period, Num Samples, File format) followed by a table of your input
- check to make sure that the files datafile, training-extfiles, all-extfiles, and testing-extfiles contain only valid data files
- you can view the generated grammar in the file "grammar"
- you can view the generated dictionary in the file "dict"
- you can view the generated word lattice in the file "word.lattice"
- you can view your MLF file in the file "result"
- the generated hmms will be in directories hmm.0/, hmm.1/, etc.
- for each piece of data you should have the files name, name.ext, name.sent, and name.lab, where name is the name of the data piece
    - name will contain the text version of the data parsed from the main data file
    - name.ext will be the HTK data file
    - name.lab will contain the time information and will be one line, like "0 194000 commandname"
    - name.sent will contain the sentence, which in our case consists of a single gesture command
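As a quick sanity check for the four-files-per-data-piece rule, something like the following can report what is missing. This is a hypothetical helper, not part of the scripts:

```python
import os

# Hypothetical helper: for each data piece name, report which of the
# four expected files (name, name.ext, name.sent, name.lab) are
# missing from the working directory.
EXTENSIONS = ["", ".ext", ".sent", ".lab"]

def missing_files(name, directory="."):
    """Return the expected files for `name` that do not exist."""
    return [name + ext for ext in EXTENSIONS
            if not os.path.exists(os.path.join(directory, name + ext))]
```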
- there are some grep tricks in the script for file generation, so if you get weird files, check out the script - you might be inadvertently picking up the wrong filenames. I'll try to fix this, or at least make it cleaner, later.
Credits
This stuff is an amalgam of resources from various places.