212 lines
		
	
	
		
			7.7 KiB
		
	
	
	
		
			HTML
		
	
	
	
	
	
		
		
			
		
	
	
			212 lines
		
	
	
		
			7.7 KiB
		
	
	
	
		
			HTML
		
	
	
	
	
	
|  | <! $Id: segment-nbest.1,v 1.10 2019/09/09 22:35:37 stolcke Exp $> | ||
|  | <HTML> | ||
|  | <HEADER> | ||
|  | <TITLE>segment-nbest</TITLE> | ||
|  | <BODY> | ||
|  | <H1>segment-nbest</H1> | ||
|  | <H2> NAME </H2> | ||
|  | segment-nbest - rescore and segment N-best lists using hidden segment N-gram model | ||
|  | <H2> SYNOPSIS </H2> | ||
|  | <PRE> | ||
|  | \fsegment-nbest\fP [ <B>-help</B> ] <I>option</I> ... <I>nbest-file-list</I> ... | ||
|  | </PRE> | ||
|  | <H2> DESCRIPTION </H2> | ||
|  | <B> segment-nbest </B> | ||
|  | processes a series of consecutive N-best lists from a speech | ||
|  | recognizer | ||
|  | and applies a hidden segment N-gram language model to them. | ||
|  | The language model is a standard backoff N-gram model in ARPA | ||
|  | <A HREF="ngram-format.5.html">ngram-format(5)</A> | ||
|  | modeling sentence segmentation using the boundary tags <s> and </s>. | ||
|  | The program reads in all N-best lists and outputs the  | ||
|  | hypotheses that have the highest aggregate (combined acoustic  | ||
|  | and language model) score. | ||
|  | Hypothesized sentence boundaries are marked by <s> tags. | ||
|  | <H2> OPTIONS </H2> | ||
|  | <P> | ||
|  | Each filename argument can be an ASCII file, or a  | ||
|  | compressed file (name ending in .Z or .gz), or ``-'' to indicate | ||
|  | stdin/stdout. | ||
|  | <DL> | ||
|  | <DT><B> -help </B> | ||
|  | <DD> | ||
|  | Print option summary. | ||
|  | <DT><B> -version </B> | ||
|  | <DD> | ||
|  | Print version information. | ||
|  | <DT><B>-order</B><I> n</I><B></B><I></I><B></B><I></I><B></B> | ||
|  | <DD> | ||
|  | Set the maximal N-gram order to be used, by default 3. | ||
|  | NOTE: The order of the model is not set automatically when a model | ||
|  | file is read, so the same file can be used at various orders. | ||
|  | <DT><B>-debug</B><I> level</I><B></B><I></I><B></B><I></I><B></B> | ||
|  | <DD> | ||
|  | Set the debugging output level (0 means no debugging output). | ||
|  | Debugging messages are sent to stderr. | ||
|  | <DT><B>-lm</B><I> file</I><B></B><I></I><B></B><I></I><B></B> | ||
|  | <DD> | ||
|  | Read the N-gram model from | ||
|  | <I>file</I>.<I></I><I></I><I></I> | ||
|  | <DT><B> -tolower </B> | ||
|  | <DD> | ||
|  | Map all vocabulary to lowercase. | ||
|  | Useful if case conventions for N-best lists and language model differ. | ||
|  | <DT><B>-mix-lm</B><I> file</I><B></B><I></I><B></B><I></I><B></B> | ||
|  | <DD> | ||
|  | Read a second, standard N-gram model for interpolation purposes. | ||
|  | <DT><B>-lambda</B><I> weight</I><B></B><I></I><B></B><I></I><B></B> | ||
|  | <DD> | ||
|  | Set the weight of the main model when interpolating with | ||
|  | <B>-mix-lm</B>.<B></B><B></B><B></B> | ||
|  | Default value is 0.5. | ||
|  | <DT><B>-bayes</B><I> length</I><B></B><I></I><B></B><I></I><B></B> | ||
|  | <DD> | ||
|  | Interpolate the second and the main model using posterior probabilities | ||
|  | for local N-gram-contexts of length | ||
|  | <I>length</I>.<I></I><I></I><I></I> | ||
|  | The  | ||
|  | <B> -lambda </B> | ||
|  | value is used as a prior mixture weight in this case. | ||
|  | <DT><B>-bayes-scale</B><I> scale</I><B></B><I></I><B></B><I></I><B></B> | ||
|  | <DD> | ||
|  | Set the exponential scale factor on the context likelihood in conjunction | ||
|  | with the | ||
|  | <B> -bayes </B> | ||
|  | function. | ||
|  | Default value is 1.0. | ||
|  | <DT><B>-nbest-files</B><I> list</I><B></B><I></I><B></B><I></I><B></B> | ||
|  | <DD> | ||
|  | Specifies a list of N-best files. | ||
|  | The file | ||
|  | <I> list </I> | ||
|  | should contain a list of filenames, one per line, | ||
|  | each corresponding to an N-best file in one of the formats | ||
|  | described in  | ||
|  | <A HREF="nbest-format.5.html">nbest-format(5)</A>. | ||
|  | The N-best files should correspond to consecutive speech waveforms | ||
|  | in the order listed. | ||
|  | <DT><B> -fb-rescore </B> | ||
|  | <DD> | ||
|  | Perform Forward-backward rescoring. | ||
|  | This generates new N-best lists | ||
|  | as output whose LM scores reflect the posterior probability of each | ||
|  | hypothesis. | ||
|  | The default is to perform Viterbi rescoring and output only the | ||
|  | best combined hypothesis. | ||
|  | <DT><B>-write-nbest-dir</B><I> dir</I><B></B><I></I><B></B><I></I><B></B> | ||
|  | <DD> | ||
|  | Write rescored N-best lists to directory  | ||
|  | <I> dir </I> | ||
|  | instead of to stdout. | ||
|  | The filenames from the input are preserved. | ||
|  | <DT><B>-max-nbest</B><I> n</I><B></B><I></I><B></B><I></I><B></B> | ||
|  | <DD> | ||
|  | Limits the number of hypotheses read from each N-best list to the first | ||
|  | <I>n</I>.<I></I><I></I><I></I> | ||
|  | <DT><B>-max-rescore</B><I> m</I><B></B><I></I><B></B><I></I><B></B> | ||
|  | <DD> | ||
|  | Only choose among the top  | ||
|  | <I> m </I> | ||
|  | hypotheses of each list (after reordering hypotheses, see below). | ||
|  | This is an effective way to limit the quadratic computation  | ||
|  | of the Viterbi or forward/backward dynamic programming. | ||
|  | <DT><B> -no-reorder </B> | ||
|  | <DD> | ||
|  | Do not reorder the hypotheses before limiting the computation to | ||
|  | the top | ||
|  | <I>m</I>.<I></I><I></I><I></I> | ||
|  | By default the hypotheses will first be sorted according to the  | ||
|  | acoustic and language model scores recorded in the N-best lists. | ||
|  | <DT><B>-rescore-lmw</B><I> weight</I><B></B><I></I><B></B><I></I><B></B> | ||
|  | <DD> | ||
|  | Specifies the language model weight to be use in combining | ||
|  | acoustic and language model scores to select the best hypotheses. | ||
|  | <DT><B>-rescore-wtw</B><I> weight</I><B></B><I></I><B></B><I></I><B></B> | ||
|  | <DD> | ||
|  | Specifies the word transition weight to be used in selecting the | ||
|  | best hypotheses. | ||
|  | <DT><B>-noise</B><I> noise-tag</I><B></B><I></I><B></B><I></I><B></B> | ||
|  | <DD> | ||
|  | Designate | ||
|  | <I> noise-tag </I> | ||
|  | as a vocabulary item that is to be ignored by the LM. | ||
|  | (This is typically used to identify a noise marker.) | ||
|  | <DT><B>-noise-vocab</B><I> file</I><B></B><I></I><B></B><I></I><B></B> | ||
|  | <DD> | ||
|  | Read several noise tags from | ||
|  | <I>file</I>,<I></I><I></I><I></I> | ||
|  | instead of, or in addition to, the single noise tag specified by | ||
|  | <B>-noise</B>.<B></B><B></B><B></B> | ||
|  | <DT><B>-decipher-lm</B><I> model-file</I><B></B><I></I><B></B><I></I><B></B> | ||
|  | <DD> | ||
|  | Designates the N-gram backoff model (typically a bigram) that was used by the | ||
|  | Decipher(TM) recognizer in computing composite scores. | ||
|  | Used to compute acoustic scores from the composite scores if the | ||
|  | N-best lists are in "NBestList1.0" format. | ||
|  | <DT><B>-decipher-lmw</B><I> weight</I><B></B><I></I><B></B><I></I><B></B> | ||
|  | <DD> | ||
|  | Specifies the language model weight used by the recognizer. | ||
|  | Used to compute acoustic scores from the composite scores. | ||
|  | <DT><B>-decipher-wtw</B><I> weight</I><B></B><I></I><B></B><I></I><B></B> | ||
|  | <DD> | ||
|  | Specifies the word transition weight used by the recognizer. | ||
|  | Used to compute acoustic scores from the composite scores. | ||
|  | <DT><B>-stag</B><I> string</I><B></B><I></I><B></B><I></I><B></B> | ||
|  | <DD> | ||
|  | Use | ||
|  | <I> string </I> | ||
|  | to mark segment boundaries in the output. | ||
|  | Default is the start-of-sentence symbol defined in the language model (<s>). | ||
|  | <DT><B>-bias</B><I> b</I><B></B><I></I><B></B><I></I><B></B> | ||
|  | <DD> | ||
|  | Make a segment boundary a priori more likely by a factor of | ||
|  | <I>b</I>.<I></I><I></I><I></I> | ||
|  | If | ||
|  | <I> b </I> | ||
|  | is 0, the dynamic program algorithm is restricted to never consider | ||
|  | hidden sentence boundaries; this is useful when | ||
|  | <B> segment-nbest </B> | ||
|  | is used merely for its ability to apply the LM across N-best boundaries. | ||
|  | <DT><B>-start-tag</B><I> string</I><B></B><I></I><B></B><I></I><B></B> | ||
|  | <DD> | ||
|  | Insert a tag  | ||
|  | <I> string </I> | ||
|  | at the front of every N-best hypothesis read in. | ||
|  | <DT><B>-end-tag</B><I> string</I><B></B><I></I><B></B><I></I><B></B> | ||
|  | <DD> | ||
|  | Insert a tag  | ||
|  | <I> string </I> | ||
|  | at the end of every N-best hypothesis read in. | ||
|  | This and the previous option are useful if the LM marks acoustic | ||
|  | waveform boundaries with a special tag. | ||
|  | </DD> | ||
|  | </DL> | ||
|  | <P> | ||
|  | <B> segment-nbest </B> | ||
|  | will also process any command line arguments following the options | ||
|  | as lists of N-best lists, as with the  | ||
|  | <B> -nbest-files </B> | ||
|  | option. | ||
|  | Each  | ||
|  | <I> nbest-file-list </I> | ||
|  | will be processed in turn, | ||
|  | with individual output delimited by a line of the form | ||
|  | <PRE> | ||
|  | 	<nbestfile <I>nbest-file-list</I>> | ||
|  | </PRE> | ||
|  | <H2> SEE ALSO </H2> | ||
|  | <A HREF="ngram-count.1.html">ngram-count(1)</A>, <A HREF="segment.1.html">segment(1)</A>, <A HREF="ngram-format.5.html">ngram-format(5)</A>, <A HREF="nbest-format.5.html">nbest-format(5)</A>. | ||
|  | <BR> | ||
|  | A. Stolcke, ``Modeling Linguistic Segment and Turn Boundaries for N-best | ||
|  | Rescoring of Spontaneous Speech,'' <I>Proc. Eurospeech</I>, 2779-2782, 1997. | ||
|  | <H2> BUGS </H2> | ||
|  | N-gram models of arbitrary order can be used, but the context at the  | ||
|  | beginning of a hypothesis never extends beyond the words from the preceding | ||
|  | N-best list. | ||
|  | <H2> AUTHOR </H2> | ||
|  | Andreas Stolcke <stolcke@icsi.berkeley.edu> | ||
|  | <BR> | ||
|  | Copyright (c) 1997-2004 SRI International | ||
|  | </BODY> | ||
|  | </HTML> |