Time Expression and Named Entity Recognition by Xiaoshi Zhong & Erik Cambria

Author: Xiaoshi Zhong & Erik Cambria
Language: eng
Format: epub
ISBN: 9783030789619
Publisher: Springer International Publishing


We find that time expressions have a loose structure, which shows up mainly in two ways. First, many time expressions consist of loose collocations. For example, the time token “September” can form a time expression by itself, form “September 2006” when followed by another time token, or form “1 September 2006” when preceded by a numeral and followed by another time token. Second, some time expressions can change their word order without changing their meaning. For example, “September 2006” can be written as “2006 September” with the same meaning. In terms of position within a time expression, the time token “September” may appear as (i) the beginning or (ii) an inside word when time expressions are modeled with the BIO scheme; under the BILOU scheme it may appear as (1) a unit-word time expression, or as the (2) beginning, (3) inside, or (4) last word of a multi-word time expression.
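The positional variation described here can be made concrete with a small sketch. The helper names (`bio_tags`, `bilou_tags`, `positions_of`) are hypothetical and not taken from the book, and the sketch assumes each time expression has already been extracted and tokenized, so only tags inside an expression are assigned (no O tags).

```python
def bio_tags(tokens):
    """Tag the tokens of one time expression under the BIO scheme."""
    return ["B" if i == 0 else "I" for i in range(len(tokens))]


def bilou_tags(tokens):
    """Tag the tokens of one time expression under the BILOU scheme."""
    if not tokens:
        return []
    if len(tokens) == 1:
        return ["U"]  # unit-word expression
    # First word is B, last word is L, everything in between is I.
    return ["B"] + ["I"] * (len(tokens) - 2) + ["L"]


def positions_of(token, expressions, tagger):
    """Collect every tag that `token` receives across the expressions."""
    tags = set()
    for expr in expressions:
        for tok, tag in zip(expr, tagger(expr)):
            if tok == token:
                tags.add(tag)
    return tags


# The example expressions from the text, pre-tokenized by hand.
expressions = [
    ["September"],
    ["September", "2006"],
    ["1", "September", "2006"],
    ["2006", "September"],
]

bio_positions = positions_of("September", expressions, bio_tags)
bilou_positions = positions_of("September", expressions, bilou_tags)
print(sorted(bio_positions))
print(sorted(bilou_positions))
```

On these four expressions, “September” receives both BIO tags and all four in-expression BILOU tags, which is the inconsistency the text describes.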

Table 3.6 presents the percentages of distinct time tokens and distinct modifiers that appear in different positions within time expressions. “Different positions” here means both positions under the BIO scheme, and at least two of the four positions under the BILOU scheme. In every dataset, more than 53.5% of distinct time tokens appear in different positions under the BIO scheme, and more than 61.4% do so under the BILOU scheme. More than 27.5% of distinct modifiers appear in different positions. When the BIO or BILOU scheme is used to model time expressions, a token that appears in different positions receives inconsistent tags, and this inconsistency makes it difficult for statistical models to learn time expressions. We therefore need to explore a more appropriate tagging scheme (see Sect. 5.1 for details).

Table 3.6 Percentage of distinct time tokens and distinct modifiers that appear in different positions within time expressions
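A statistic of the kind reported in Table 3.6 could be computed along the following lines. This is a minimal sketch rather than the authors' procedure: the function name `multi_position_percentage`, the toy expressions, and the hand-picked `time_tokens` set are all assumptions made for illustration, and the resulting figure says nothing about the real datasets.

```python
from collections import defaultdict


def bio_tags(tokens):
    """BIO tags for the tokens of one time expression."""
    return ["B" if i == 0 else "I" for i in range(len(tokens))]


def bilou_tags(tokens):
    """BILOU tags for the tokens of one time expression."""
    if not tokens:
        return []
    if len(tokens) == 1:
        return ["U"]
    return ["B"] + ["I"] * (len(tokens) - 2) + ["L"]


def multi_position_percentage(expressions, tagger, vocabulary):
    """Percentage of distinct tokens in `vocabulary` seen with >= 2 tags."""
    positions = defaultdict(set)
    for expr in expressions:
        for tok, tag in zip(expr, tagger(expr)):
            if tok in vocabulary:
                positions[tok].add(tag)
    if not positions:
        return 0.0
    multi = sum(1 for tags in positions.values() if len(tags) >= 2)
    return 100.0 * multi / len(positions)


# Toy data (hypothetical): pre-tokenized time expressions and a
# hand-picked set of time tokens; "1" and "last" are left out.
expressions = [
    ["September"],
    ["September", "2006"],
    ["1", "September", "2006"],
    ["2006", "September"],
    ["last", "week"],
]
time_tokens = {"September", "2006", "week"}

bio_pct = multi_position_percentage(expressions, bio_tags, time_tokens)
bilou_pct = multi_position_percentage(expressions, bilou_tags, time_tokens)
print(f"BIO: {bio_pct:.1f}%  BILOU: {bilou_pct:.1f}%")
```

Passing a set of modifiers as `vocabulary` instead would give the corresponding figure for modifiers.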


