2019-11-05: Integrating Dictionary Feature into A Deep Learning Model for Disease Named Entity Recognition https://arxiv.org/abs/1911.01600v1
Instead of hand-engineering token features, character-level embeddings are used to represent the orthographic features of tokens, in addition to word embeddings and dictionary information.
In recent years, Deep Learning (DL) models have become important due to
their demonstrated success at solving complex learning problems. DL models
have been applied effectively to various Natural Language Processing (NLP)
tasks such as Part-of-Speech (PoS) tagging and Machine Translation (MT).
Disease Named Entity Recognition (Disease-NER) is a crucial task that aims to
extract disease Named Entities (NEs) from text. In this paper, a DL model
for Disease-NER using dictionary information is proposed and evaluated on
the National Center for Biotechnology Information (NCBI) disease corpus and
the BC5CDR dataset. Word embeddings trained over general-domain texts as well as
biomedical texts have been used to represent input to the proposed model. This
study also compares two different Segment Representation (SR) schemes, namely
IOB2 and IOBES, for Disease-NER. The results illustrate that using dictionary
information, pre-trained word embeddings, character embeddings and a CRF with
global score improves the performance of the Disease-NER system.
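
To make the difference between the two SR schemes concrete, the following is a minimal sketch of converting IOB2 tags to IOBES. It is not taken from the paper: the function name `iob2_to_iobes`, the `Disease` label, and the example tag sequence are illustrative assumptions. In IOB2, every entity starts with `B-` and continues with `I-`; IOBES additionally marks the last token of a multi-token entity with `E-` and a single-token entity with `S-`.

```python
def iob2_to_iobes(tags):
    """Convert a list of IOB2 tags to IOBES tags (illustrative helper)."""
    iobes = []
    for i, tag in enumerate(tags):
        if tag == "O":
            iobes.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        # The entity continues only if the next tag is I- with the same label.
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        continues = next_tag == "I-" + label
        if prefix == "B":
            # Single-token entities become S-, otherwise B- is kept.
            iobes.append(("B-" if continues else "S-") + label)
        elif prefix == "I":
            # The last token of a multi-token entity becomes E-.
            iobes.append(("I-" if continues else "E-") + label)
        else:
            raise ValueError(f"not an IOB2 tag: {tag!r}")
    return iobes


# Hypothetical tag sequence: a three-token mention, an outside token,
# then a single-token mention.
iob2 = ["B-Disease", "I-Disease", "I-Disease", "O", "B-Disease"]
print(iob2_to_iobes(iob2))
```

IOBES gives the tagger explicit boundary information at the cost of a larger tag set, which is the trade-off a comparison of the two schemes measures.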