Sequence labeling has long been one of the most discussed topics in linguistics and computational linguistics. Challenges such as dependency parsing, word sense disambiguation, and sequence labeling arose with the formal definition of syntax.
Sequence labeling is a Natural Language Processing (NLP) task that assigns each token (word) in a sequence a class from a label set C. The classification can be independent (each word is labeled on its own) or dependent (each word's label depends on the other words in the sequence).
In this post, we will examine how words depend on each other and how we classify them from a syntactic perspective.
Words And Their Roles
Words are the sequential building blocks of sentences. Each word contributes syntactic and semantic properties to a sentence. For example, a word can be an adjective that carries positive semantics (delicate), and that adjective can in turn describe a noun (delicate boy). Such relationships can be applied recursively and without bound, which shows that words are related to each other.
But how can we determine the role of a word? For example, how do we decide which word is a noun and which is an adjective?
Part-of-Speech Tagging (POS Tagging) is a sequence labeling task. It is the process of marking each word in a text (corpus) with its part of speech (syntactic tag), based on both its definition and its context. POS Tagging supports many other NLP tasks, such as Word Sense Disambiguation and Dependency Parsing.
A sequence is a series of tokens that are not independent of each other. Series in mathematics and sentences in linguistics are both sequences because, in both of them, the next token depends on the previous ones, or vice versa.
Determining POS tags is much more complicated than simply mapping words to their tags. Consider the word back:
- earnings growth took a back/JJ seat.
- a small building in the back/NN.
- a clear majority of senators back/VBP the bill.
Each occurrence of back has a different role. The first is an adjective (JJ), the second is a noun (NN), and the third is a non-3rd-person singular present verb (VBP).
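To make the difference between independent and dependent labeling concrete, here is a small sketch built on the three sentences above. The lookup table and the neighbour rules are invented for this illustration only; a real tagger would learn such behaviour from an annotated corpus rather than from hand-written rules.

```python
# Toy illustration only: the tag dictionary and the rules below are made up
# for the three "back" sentences and do not generalize beyond them.

SENTENCES = [
    "earnings growth took a back seat".split(),
    "a small building in the back".split(),
    "a clear majority of senators back the bill".split(),
]
GOLD_TAGS = ["JJ", "NN", "VBP"]  # tag of "back" in each sentence


def lookup_tag(word):
    """Independent labeling: one fixed tag per word, context ignored."""
    return {"back": "NN"}.get(word, "UNK")


def context_tag(tokens, i):
    """Dependent labeling: decide the tag of tokens[i] from its neighbours."""
    prev_word = tokens[i - 1] if i > 0 else None
    next_word = tokens[i + 1] if i + 1 < len(tokens) else None
    if prev_word in {"a", "the"} and next_word is not None:
        return "JJ"   # determiner + back + another word: modifier
    if prev_word in {"a", "the"}:
        return "NN"   # determiner + back at the end: noun
    return "VBP"      # anything else: treat as a verb


lookup_results = [lookup_tag("back") for _ in SENTENCES]
context_results = [context_tag(s, s.index("back")) for s in SENTENCES]

print(lookup_results)
print(context_results)
```

The lookup gives the same tag in every sentence, so it can be right for at most one of the three roles. The context-aware function recovers all three, but only because its rules were written with these sentences in mind.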
Named Entity Recognition (NER)
The first step in information extraction is to detect the entities in the text. A named entity is, roughly speaking, anything that can be referred to with a proper name: a person, a location, an organization. NER helps us to identify and extract critical elements from the text.
NER can be used in customer service, medicine, and document categorization. For example, extracting aspects from customer reviews with NER helps identify their semantics, and extracting diseases and drugs from a medical text helps identify treatments.
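NER is commonly framed as sequence labeling by giving every token a tag that says whether it begins, continues, or lies outside an entity. The sketch below assumes the widely used BIO scheme (B-TYPE, I-TYPE, O), which is not introduced in this post. The sentence, the tags, and the function name extract_entities are hypothetical and chosen only for illustration.

```python
def extract_entities(tokens, tags):
    """Group BIO-tagged tokens into (entity text, entity type) pairs."""
    entities = []
    current_tokens, current_type = [], None

    def flush():
        # Store the entity collected so far, if there is one.
        if current_tokens:
            entities.append((" ".join(current_tokens), current_type))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            # "O" tag, or an I- tag that does not continue the open entity.
            flush()
            current_tokens, current_type = [], None
    flush()
    return entities


# Hypothetical example sentence with hand-assigned tags.
tokens = "Ada Lovelace works for Acme Corp in London".split()
tags = ["B-PER", "I-PER", "O", "O", "B-ORG", "I-ORG", "O", "B-LOC"]

entities = extract_entities(tokens, tags)
print(entities)
```

In practice the tags would come from a trained sequence labeling model. This function only covers the final step of turning per-token labels into entity spans.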
For more details about NER, you can visit our previous blog post.
Sequence Labeling Models
Sequence labeling can be done with various methods. Traditional models are based on corpus statistics (Hidden Markov Models, Maximum Entropy Markov Models, Conditional Random Fields, etc.), while more recent models are based on neural networks (Recurrent Neural Networks, Long Short-Term Memory networks, BERT, etc.). In the next post, we will discuss those models in detail.
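As a small preview of the statistical family, here is a minimal sketch of Viterbi decoding for a first-order Hidden Markov Model. All probabilities are invented toy values rather than corpus estimates, and the tag set is cut down to three tags so that the example stays readable.

```python
import math


def viterbi(tokens, tags, start_p, trans_p, emit_p):
    """Return the most probable tag sequence under a first-order HMM."""
    if not tokens:
        return []

    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # best[i][t]: best log-probability of any tag path ending in t at position i
    best = [{t: log(start_p[t]) + log(emit_p[t].get(tokens[0], 0.0)) for t in tags}]
    back = [{}]  # back[i][t]: previous tag on that best path
    for i in range(1, len(tokens)):
        best.append({})
        back.append({})
        for t in tags:
            prev = max(tags, key=lambda p: best[i - 1][p] + log(trans_p[p][t]))
            best[i][t] = (best[i - 1][prev] + log(trans_p[prev][t])
                          + log(emit_p[t].get(tokens[i], 0.0)))
            back[i][t] = prev

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(tokens) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]


# Toy model: every number below is an assumption made for this example.
TAGS = ["DT", "NN", "VBP"]
START_P = {"DT": 0.5, "NN": 0.4, "VBP": 0.1}
TRANS_P = {
    "DT": {"DT": 0.05, "NN": 0.9, "VBP": 0.05},
    "NN": {"DT": 0.1, "NN": 0.3, "VBP": 0.6},
    "VBP": {"DT": 0.6, "NN": 0.3, "VBP": 0.1},
}
EMIT_P = {
    "DT": {"the": 1.0},
    "NN": {"senators": 0.3, "back": 0.2, "bill": 0.5},
    "VBP": {"back": 1.0},
}

verb_reading = viterbi("senators back the bill".split(), TAGS, START_P, TRANS_P, EMIT_P)
noun_reading = viterbi("the back".split(), TAGS, START_P, TRANS_P, EMIT_P)
print(verb_reading)
print(noun_reading)
```

With these toy numbers, back is decoded as a verb after senators and as a noun after the, because the transition probabilities make the label of each word depend on the label before it.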
- Dan Jurafsky and James H. Martin, “Speech and Language Processing”, second edition.