Android Local Speech Recognition Engine PocketSphinx: Language Modeling

2012-07-30

idngram2lm -vocab_type 0 -idngram weather.idngram -vocab weather.tmp.vocab -arpa weather.arpa

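If there are no errors, the file weather.tmp.DMP is generated in the directory. (The idngram2lm command above itself writes the ARPA-format model weather.arpa named by -arpa; the DMP file is the binary form of that model.)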
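
Before copying the model to the phone, it can be worth checking that weather.arpa actually contains n-grams. Below is a minimal sketch in Python, assuming the usual ARPA layout (a `\data\` header followed by `ngram N=count` lines). The function name `read_arpa_counts` is my own and is not part of CMUCLMTK or PocketSphinx.

```python
import re
from typing import Dict, Iterable


def read_arpa_counts(lines: Iterable[str]) -> Dict[int, int]:
    # Hypothetical helper: returns {n-gram order: count} taken from the
    # header section of an ARPA language model.
    counts: Dict[int, int] = {}
    in_header = False
    for raw in lines:
        line = raw.strip()
        if line == "\\data\\":
            # The header starts at the literal marker \data\
            in_header = True
            continue
        if not in_header:
            continue
        match = re.fullmatch(r"ngram\s+(\d+)\s*=\s*(\d+)", line)
        if match:
            counts[int(match.group(1))] = int(match.group(2))
        elif line:
            # The first non-blank line that is not a count (for example
            # the 1-grams section marker) ends the header.
            break
    return counts
```

To use it, open the file in text mode and pass the file object, for example `read_arpa_counts(open("weather.arpa", encoding="utf-8"))`; the encoding is an assumption and should match whatever the training text used. An empty result, or a zero count for 1-grams, suggests the earlier steps went wrong.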

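According to the official site, you can also submit the txt file online at http://www.speech.cs.cmu.edu/tools/lmtool.html and have the server generate the DMP file. When I tried it, however, I could not reach the page. Perhaps too many people were using it and CMU shut the service down?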

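While looking things up, I came across more blog posts on this topic, so I will cite them here as well: http://www.cnblogs.com/huanghuang/archive/2011/07/14/2106579.html, http://archive.cnblogs.com/a/2111834/, and http://www.cnblogs.com/huanghuang/archive/2011/07/18/2109101.html. These three should cover the subject quite thoroughly.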

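#1 cs3230524 2012-04-17   Why can't I get recognition to work? The server cannot generate the DMP file, even when I access it from behind a proxy to get past the firewall.
Log: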
INFO: cmd_ln.c(691): Parsing command line:

Current configuration:
[NAME]          [DEFLT]         [VALUE]
-agc            none            none
-agcthresh      2.0             2.000000e+00
-alpha          0.97            9.700000e-01
-ascale         20.0            2.000000e+01
-aw             1               1
-backtrace      no              no
-beam           1e-48           1.000000e-48
-bestpath       yes             yes
-bestpathlw     9.5             9.500000e+00
-bghist         no              no
-ceplen         13              13
-cmn            current         current
-cmninit        8.0             8.0
-compallsen     no              no
-debug                          0
-dict
-dictcase       no              no
-dither         no              no
-doublebw       no              no
-ds             1               1
-fdict
-feat           1s_c_d_dd       1s_c_d_dd
-featparams
-fillprob       1e-8            1.000000e-08
-frate          100             100
-fsg
-fsgusealtpron  yes             yes
-fsgusefiller   yes             yes
-fwdflat        yes             yes
-fwdflatbeam    1e-64           1.000000e-64
-fwdflatefwid   4               4
-fwdflatlw      8.5             8.500000e+00
-fwdflatsfwin   25              25
-fwdflatwbeam   7e-29           7.000000e-29
-fwdtree        yes             yes
-hmm
-input_endian   little          little
-jsgf
-kdmaxbbi       -1              -1
-kdmaxdepth     0               0
-kdtree
-latsize        5000            5000
-lda
-ldadim         0               0
-lextreedump    0               0
-lifter         0               0
-lm
-lmctl
-lmname         default         default
-logbase        1.0001          1.000100e+00
-logfn
-logspec        no              no
-lowerf         133.33334       1.333333e+02
-lpbeam         1e-40           1.000000e-40
-lponlybeam     7e-29           7.000000e-29
-lw             6.5             6.500000e+00
-maxhmmpf       -1              -1
-maxnewoov      20              20
-maxwpf         -1              -1
-mdef
-mean
-mfclogdir
-min_endfr      0               0
-mixw
-mixwfloor      0.0000001       1.000000e-07
-mllr
-mmap           yes             yes
-ncep           13              13
-nfft           512             512
-nfilt          40              40
-nwpen          1.0             1.000000e+00
-pbeam          1e-48           1.000000e-48
-pip            1.0             1.000000e+00
-pl_beam        1e-10           1.000000e-10
-pl_pbeam       1e-5            1.000000e-05
-pl_window      0               0
-rawlogdir
-remove_dc      no              no
-round_filters  yes             yes
-samprate       16000           1.600000e+04
-seed           -1              -1
-sendump
-senlogdir
-senmgau
-silprob        0.005           5.000000e-03
-smoothspec     no              no
-svspec
-tmat
-tmatfloor      0.0001          1.000000e-04
-topn           4               4
-topn_beam      0               0
-toprule
-transform      legacy          legacy
-unit_area      yes             yes
-upperf         6855.4976       6.855498e+03
-usewdphones    no              no
-uw             1.0             1.000000e+00
-var
-varfloor       0.0001          1.000000e-04
-varnorm        no              no
-verbose        no              no
-warp_params
-warp_type      inverse_linear  inverse_linear
-wbeam          7e-29           7.000000e-29
-wip            0.65            6.500000e-01
-wlen           0.025625        2.562500e-02

INFO: cmd_ln.c(691): Parsing command line:
\
-nfilt 20 \
-lowerf 1 \
-upperf 4000 \
-wlen 0.025 \
-transform dct \
-round_filters no \
-remove_dc yes \
-feat 1s_c_d_dd \
-svspec 0-12/13-25/26-38 \
-agc none \
-cmn current \
-cmninit 54,-1,2 \
-varnorm no

Current configuration:
[NAME]          [DEFLT]         [VALUE]
-agc            none            none
-agcthresh      2.0             2.000000e+00
-alpha          0.97            9.700000e-01
-ceplen         13              13
-cmn            current         current
-cmninit        8.0             54,-1,2
-dither         no              no
-doublebw       no              no
-feat           1s_c_d_dd       1s_c_d_dd
-frate          100             100
-input_endian   little          little
-lda
-ldadim         0               0
-lifter         0               0
-logspec        no              no
-lowerf         133.33334       1.000000e+00
-ncep           13              13
-nfft           512             512
-nfilt          40              20
-remove_dc      no              yes
-round_filters  yes             no
-samprate       16000           8.000000e+03
-seed           -1              -1
-smoothspec     no              no
-svspec                         0-12/13-25/26-38
-transform      legacy          dct
-unit_area      yes             yes
-upperf         6855.4976       4.000000e+03
-varnorm        no              no
-verbose        no              no
-warp_params
-warp_type      inverse_linear  inverse_linear
-wlen           0.025625        2.500000e-02

INFO: acmod.c(242): Parsed model-specific feature parameters from /sdcard/Android/data/test/hmm/tdt_sc_8k/feat.params
INFO: feat.c(684): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(163): Using subvector specification 0-12/13-25/26-38
INFO: mdef.c(520): Reading model definition: /sdcard/Android/data/test/hmm/tdt_sc_8k/mdef
INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
INFO: bin_mdef.c(330): Reading binary model definition: /sdcard/Android/data/test/hmm/tdt_sc_8k/mdef
INFO: bin_mdef.c(507): 70 CI-phone, 65021 CD-phone, 3 emitstate/phone, 210 CI-sen, 5210 Sen, 11271 Sen-Seq
INFO: tmat.c(205): Reading HMM transition probability matrices: /sdcard/Android/data/test/hmm/tdt_sc_8k/transition_matrices
INFO: acmod.c(117): Attempting to use SCHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /sdcard/Android/data/test/hmm/tdt_sc_8k/means
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /sdcard/Android/data/test/hmm/tdt_sc_8k/variances
INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(294): 256x13
INFO: ms_gauden.c(354): 0 variance values floored
INFO: s2_semi_mgau.c(908): Loading senones from dump file /sdcard/Android/data/test/hmm/tdt_sc_8k/sendump
INFO: s2_semi_mgau.c(932): BEGIN FILE FORMAT DESCRIPTION
INFO: s2_semi_mgau.c(1027): Using memory-mapped I/O for senones
INFO: s2_semi_mgau.c(1304): Maximum top-N: 4 Top-N beams: 0 0 0
INFO: phone_loop_search.c(105): State beam -230231 Phone exit beam -115115 Insertion penalty 0
INFO: dict.c(306): Allocating 4107 * 20 bytes (80 KiB) for word entries
INFO: dict.c(321): Reading main dictionary: /sdcard/Android/data/test/lm/test.dic
ERROR: "dict.c", line 194: Line 1: Phone 'EH' is mising in the acoustic model; word '<S>不好</S>' ignored
ERROR: "dict.c", line 194: Line 2: Phone 'EH' is mising in the acoustic model; word '<S>好</S>' ignored
ERROR: "dict.c", line 194: Line 3: Phone 'EH' is mising in the acoustic model; word '<S>小赖</S>' ignored
ERROR: "dict.c", line 194: Line 4: Phone 'EH' is mising in the acoustic model; word '<S>祝东芝</S>' ignored
INFO: dict.c(212): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(324): 0 words read
INFO: dict.c(330): Reading filler dictionary: /sdcard/Android/data/test/hmm/tdt_sc_8k/noisedict
INFO: dict.c(212): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(333): 7 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(404): Allocating 70^3 * 2 bytes (669 KiB) for word-initial triphones
INFO: dict2pid.c(131): Allocated 59080 bytes (57 KiB) for word-final triphones
INFO: dict2pid.c(195): Allocated 59080 bytes (57 KiB) for single-phone word triphones
INFO: ngram_model_arpa.c(477): ngrams 1=6, 2=8, 3=4
INFO: ngram_model_arpa.c(135): Reading unigrams
INFO: ngram_model_arpa.c(516):        6 = #unigrams created
INFO: ngram_model_arpa.c(195): Reading bigrams
INFO: ngram_model_arpa.c(533):        8 = #bigrams created
INFO: ngram_model_arpa.c(534):        3 = #prob2 entries
INFO: ngram_model_arpa.c(542):        3 = #bo_wt2 entries
INFO: ngram_model_arpa.c(292): Reading trigrams
INFO: ngram_model_arpa.c(555):        4 = #trigrams created
INFO: ngram_model_arpa.c(556):        2 = #prob3 entries
INFO: ngram_search_fwdtree.c(99): 0 unique initial diphones
INFO: ngram_search_fwdtree.c(147): 0 root, 0 non-root channels, 8 single-phone words
INFO: ngram_search_fwdtree.c(186): Creating search tree
INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 8 single-phone words
INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 128
ERROR: "ngram_search_fwdtree.c", line 336: No word from the language model has pronunciation in the dictionary
INFO: ngram_search_fwdtree.c(338): after: 0 root, 0 non-root channels, 7 single-phone words
INFO: ngram_search_fwdflat.c(156): fwdflat: min_ef_width = 4, max_sf_win = 25
INFO: pocketsphinx.c(299): zslog use ngs
#2 wushaoliu3000 2012-05-21   I'm even worse off: I can't even get the APK file to run. I would appreciate some pointers from the author.
