The corpus is as follows:
<B> <overlap /> hi </B>
<B> yeah <overlap /> all right </B>
<B> I think that's the topic I'm interested in or I'd like to tell you about which is a country I have visited and: it is a country which has impressed me . obviously (erm) . to describe my visit and say why I found that country particularly impressive </B>
<B> (eh) . well it was . Mexico (eh) that I visited for<?> some four years ago and it was .. around Christmas time . so . it was really . (er) . quite . an experience for me . (eh) it was not my first flight but it was my first overseas flight . (erm) . just at Christmas that is December the twenty-fourth and we had turkey <starts laughing> and champagne <stops laughing> </B>
<B> on the plane yes so it started from there no it started in Spain . cos I (eh) . left from here I went to see my friend in Spain <overlap /> and then I spent some time </B>
<B> with her there and I left . I I took the . the flight (em) . on Christmas day so this was the first time and . (eh) . it still is my first time that I've . (eh) ... had one and the same day twice </B>
<B> <overlap /> <XXX> . </B>
But what gets extracted is blocks of letters rather than words, as shown below:
1 33 h e r e
2 25 w e l l
3 24 t h e r
4 23 y e a h
5 22 t h a t
6 21 t h i s
7 20 v e r y
8 18 d t h e
9 18 n d t h
10 18 t h e s
11 17 t h i n
12 14 a n d t
13 14 i g h t
14 13 e t h e
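I don't know what is causing this. I have tried several different Language encodings settings and the result is always the same.
I hope someone can offer some guidance. Many thanks!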
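
To show what I mean: rows such as "d t h e" and "n d t h" look like 4-grams of single letters that run across word boundaries, which is what you would get if every letter were counted as its own token. Below is a minimal Python sketch of that difference. It is only my own illustration, not how the tool works internally; the function names (`strip_markup`, `word_tokens`, `char_tokens`, `ngrams`) are made up, and it assumes that tags in angle brackets and fillers in parentheses should be ignored.

```python
import re
from collections import Counter


def strip_markup(text):
    """Replace <...> tags and (...) fillers with spaces (an assumption about the markup)."""
    text = re.sub(r"<[^>]*>", " ", text)
    text = re.sub(r"\([^)]*\)", " ", text)
    return text


def word_tokens(text):
    """Word-level tokens: runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", strip_markup(text).lower())


def char_tokens(text):
    """Letter-level tokens: every letter is a token; spaces and punctuation are dropped."""
    return re.findall(r"[a-z]", strip_markup(text).lower())


def ngrams(tokens, n):
    """Count n-grams, joining the tokens of each n-gram with a space."""
    return Counter(
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )


sample = "<B> yeah <overlap /> all right </B>"

# Letter-level 4-grams resemble the output I am getting ("y e a h", "i g h t", ...).
print(ngrams(char_tokens(sample), 4).most_common())

# Word-level 2-grams are what I expected ("yeah all", "all right").
print(ngrams(word_tokens(sample), 2).most_common())
```

If that reading is right, the problem would be in how tokens are defined rather than in the encoding, but I am not sure.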