Real-World Natural Language Processing
eBook - ePub

Practical applications with deep learning

  1. 336 pages
  2. English

About This Book

Real-world Natural Language Processing shows you how to build the practical NLP applications that are transforming the way humans and computers work together. In Real-world Natural Language Processing you will learn how to:
  • Design, develop, and deploy useful NLP applications
  • Create named entity taggers
  • Build machine translation systems
  • Construct language generation systems and chatbots
  • Use advanced NLP concepts such as attention and transfer learning

Real-world Natural Language Processing teaches you how to create practical NLP applications without getting bogged down in complex language theory and the mathematics of deep learning. In this engaging book, you'll explore the core tools and techniques required to build a huge range of powerful NLP apps, including chatbots, language detectors, and text classifiers. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

About the technology
Training computers to interpret and generate speech and text is a monumental challenge, and the payoff for reducing labor and improving human/computer interaction is huge! The field of Natural Language Processing (NLP) is advancing rapidly, with countless new tools and practices. This unique book offers an innovative collection of NLP techniques with applications in machine translation, voice assistants, text generation, and more.

About the book
Real-world Natural Language Processing shows you how to build the practical NLP applications that are transforming the way humans and computers work together. Guided by clear explanations of each core NLP topic, you'll create many interesting applications including a sentiment analyzer and a chatbot. Along the way, you'll use Python and open source libraries like AllenNLP and HuggingFace Transformers to speed up your development process.

What's inside
  • Design, develop, and deploy useful NLP applications
  • Create named entity taggers
  • Build machine translation systems
  • Construct language generation systems and chatbots

About the reader
For Python programmers. No prior machine learning knowledge assumed.

About the author
Masato Hagiwara received his computer science PhD from Nagoya University in 2009. He has interned at Google and Microsoft Research, and worked at Duolingo as a Senior Machine Learning Engineer. He now runs his own research and consulting company.

Table of Contents
PART 1 BASICS
1 Introduction to natural language processing
2 Your first NLP application
3 Word and document embeddings
4 Sentence classification
5 Sequential labeling and language modeling
PART 2 ADVANCED MODELS
6 Sequence-to-sequence models
7 Convolutional neural networks
8 Attention and Transformer
9 Transfer learning with pretrained language models
PART 3 PUTTING INTO PRODUCTION
10 Best practices in developing NLP applications
11 Deploying and serving NLP applications

Part 1 Basics

Welcome to the beautiful and exciting world of natural language processing (NLP)! NLP is a subfield of artificial intelligence (AI) that concerns computational approaches to processing, understanding, and generating human languages. NLP is used in many technologies you interact with in your daily life—spam filtering, conversational assistants, search engines, and machine translation. This first part of the book is intended to give you a gentle introduction to the field and bring you up to speed with how to build practical NLP applications.
In chapter 1, we’ll begin by introducing the “what” and “why” of NLP: what NLP is, what it is not, how NLP technologies are used, and how NLP relates to other fields of AI.
In chapter 2, you’ll build a complete, working NLP application—a sentiment analyzer—within an hour with the help of a powerful NLP framework, AllenNLP. You’ll also learn to use basic machine learning (ML) concepts, including word embeddings and recurrent neural networks (RNNs). Don’t worry if this sounds intimidating—we’ll introduce you to the concepts gradually and provide an intuitive explanation.
Chapter 3 provides a deep dive into one of the most important concepts for deep learning approaches to NLP: word and sentence embeddings. The chapter demonstrates how to use them and even how to train them on your own data.
Chapters 4 and 5 cover fundamental NLP tasks, sentence classification and sequence labeling. Though simple, these tasks have a wide range of applications, including sentiment analysis, part-of-speech tagging, and named entity recognition.
This part familiarizes you with some basic concepts of modern NLP, and you’ll build useful NLP applications along the way.

1 Introduction to natural language processing

This chapter covers
  • What natural language processing (NLP) is, what it is not, and why it’s such an interesting, yet challenging, field
  • How NLP relates to other fields, including artificial intelligence (AI) and machine learning (ML)
  • What typical NLP applications and tasks are
  • How a typical NLP application is developed and structured
This book is not an introduction to machine learning or deep learning. You won’t learn how to write neural networks in mathematical terms or how to compute gradients, for example. But don’t worry if you have no idea what those are. I’ll explain such concepts as needed, conceptually rather than mathematically. In fact, this book contains no mathematical formulae, not a single one. Thanks to modern deep learning libraries, you don’t really need to understand the math to build practical NLP applications. If you are interested in the theory and the math behind machine learning and deep learning, you can find a number of great resources out there.
You do, however, need to be comfortable writing Python and familiar with its ecosystem. You don’t need to be an expert in software engineering topics. In fact, this book’s purpose is to introduce software engineering best practices for developing NLP applications. You also don’t need to know NLP in advance. Again, this book is designed to be a gentle introduction to the field.
You need Python version 3.6.1 or higher and AllenNLP 2.5.0 or higher to run the code examples in this book. Note that we do not support Python 2, mainly because AllenNLP (https://allennlp.org/), the deep natural language processing framework I’m going to heavily use in this book, supports only Python 3. If you haven’t done so, I strongly recommend upgrading to Python 3 and familiarizing yourself with the latest language features such as type hints and new string-formatting syntax. This will be helpful, even if you are developing non-NLP applications.
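If you are not sure whether your interpreter meets the minimum Python version, a quick check like the following may help. This is only a minimal sketch: the helper names python_is_supported and describe are made up for illustration, and the script does not check the AllenNLP version. It also shows the two language features mentioned above, type hints and f-strings.

```python
import sys

# Minimum Python version stated in the text above.
MIN_PYTHON = (3, 6, 1)


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True if the given (major, minor, micro) version is new enough.

    The name and signature are hypothetical, chosen for this illustration.
    """
    return tuple(version_info[:3]) >= MIN_PYTHON


def describe(name: str, version: str) -> str:
    """Format a name and version using type hints and an f-string."""
    return f"{name} {version}"


current = ".".join(str(part) for part in sys.version_info[:3])
print(describe("Python", current), "supported:", python_is_supported())
```

Tuples compare element by element in Python, which is why a plain >= on the version tuple is enough here.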
Don’t worry if you don’t have a Python development environment ready. Most of the examples in this book can be run via the Google Colab platform (https://colab.research.google.com). You need only a web browser to build and experiment with NLP models!
This book will use PyTorch (https://pytorch.org/) as its main choice of deep learning framework. This was a difficult decision for me, because several deep learning frameworks are equally great choices for building NLP applications, namely, TensorFlow, Keras, and Chainer. A few factors make PyTorch stand out among those frameworks—it’s a flexible and dynamic framework that makes it easier to prototype and debug NLP models; it’s becoming increasingly popular within the research community, so it’s easy to find open source implementations of major models; and the deep NLP framework AllenNLP mentioned earlier is built on top of PyTorch.

1.1 What is natural language processing (NLP)?

NLP is a principled approach to processing human language. Formally, it is a subfield of artificial intelligence (AI) concerned with computational approaches to processing, understanding, and generating human language. It is considered part of AI because language processing is a huge part of human intelligence. The use of language is arguably the most salient skill that separates humans from other animals.

1.1.1 What is NLP?

NLP includes a range of algorithms, tasks, and problems that take human-produced text as an input and produce some useful information, such as labels, semantic representations, and so on, as an output. Other tasks, such as translation, summarization, and text generation, directly produce text as output. In any case, the focus is on producing some output that is useful per se (e.g., a translation) or as input to other downstream tasks (e.g., parsing). I’ll touch upon some popular NLP applications and tasks in section 1.3.
You might wonder why NLP explicitly has “natural” in its name. What does it mean for a language to be natural? Are there any unnatural languages? Is English natural? Which is more natural: Spanish or French?
The word “natural” here is used to contrast natural languages with formal languages. In this sense, all the languages humans speak are natural. Many experts believe that language emerged naturally tens of thousands of years ago and has evolved organically ever since. Formal languages, on the other hand, are types of languages that are invented by humans and have strictly and explicitly defined syntax (i.e., what is grammatical) and semantics (i.e., what it means).
Programming languages such as C and Python are good examples of formal languages. These languages are defined in such a strict way that it is always clear what is grammatical and ungrammatical. When you run a compiler or an interpreter on the code you write in those languages, you either get a syntax error or not. The compiler won’t say something like, “Hmm, this code is maybe 50% grammatical.” Also, the behavior of your program is always the same if it’s run on the same code, assuming external factors such as the random seed and the system states remain constant. Your interpreter won’t show one result 50% of the time and another the other 50% of the time.
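You can see both properties for yourself with a few lines of Python. The following is a small sketch using only the standard library; the function names is_grammatical and sample are invented for this illustration. The first function asks Python's own parser whether a piece of code is grammatical, and the answer is always a plain yes or no. The second shows that a program using random numbers behaves identically when the seed is held constant.

```python
import ast
import random


def is_grammatical(source: str) -> bool:
    """Return True if Python's parser accepts the source, False otherwise."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


def sample(seed: int) -> list:
    """Draw five pseudo-random digits from a generator with a fixed seed."""
    rng = random.Random(seed)
    return [rng.randint(0, 9) for _ in range(5)]


print(is_grammatical("total = 1 + 2"))  # accepted by the parser
print(is_grammatical("total = = 1"))    # rejected with a syntax error
print(sample(42) == sample(42))         # same seed, same result
```

There is no middle ground in the parser's verdict: a given string either parses or raises SyntaxError.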
This is not the case for human languages. You can write a sentence that is maybe grammatical. For example, do you consider the phrase “The person I spoke to” ungrammatical? There are some grammar topics where even experts disagree with each other. This is what makes human languages interesting but challenging, and why the entire field of NLP even exists. Human languages are ambiguous, meaning that their interpretation is often not unique. Both structures (how sentences are formed) and semantics (what sentences mean) can have ambiguities in human language. As an example, let’s take a close look at the next sentence:
He saw a girl with a telescope.
When you read this sentence, who do you think has a telescope? Is it the boy, who’s using a telescope to see a girl (from somewhere far), or the girl, who has a telescope and is seen by the boy? There seem to be at least two interpretations of this sentence as shown in figure 1.1.
Figure 1.1 Two interpretations of “He saw a girl with a telescope.”
You are confused by this sentence because you don’t know what the phrase “with a telescope” is about. More technically, you don’t know what this prepositional phrase (PP) modifies. This is called a PP-attachment problem and is a classic example of syntactic ambiguity. A syntactically ambiguous sentence has more than one possible structure, and you can interpret it in multiple ways depending on which structure you assume.
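To make the PP-attachment problem concrete, here is a minimal sketch of a toy parser that enumerates every structure a tiny grammar allows for the sentence. The grammar, the lexicon, and the function name parse are all assumptions made up for this illustration. They cover only this one sentence and are not how a real NLP parser is built.

```python
from functools import lru_cache

# A hypothetical toy grammar. Each rule rewrites one symbol as two symbols.
BINARY_RULES = {
    "S": [("NP", "VP")],
    "VP": [("V", "NP"), ("VP", "PP")],   # a PP can attach to the verb phrase
    "NP": [("Det", "N"), ("NP", "PP")],  # or to the noun phrase
    "PP": [("P", "NP")],
}

# A hypothetical lexicon mapping each word to its possible categories.
LEXICON = {
    "he": ["NP"],
    "saw": ["V"],
    "a": ["Det"],
    "girl": ["N"],
    "with": ["P"],
    "telescope": ["N"],
}


def parse(tokens):
    """Return every bracketed tree that derives the tokens from S."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def trees(symbol, start, end):
        results = []
        # A single word can be covered directly by a lexicon entry.
        if end - start == 1 and symbol in LEXICON.get(tokens[start], []):
            results.append(f"({symbol} {tokens[start]})")
        # Otherwise, try every rule and every split point inside the span.
        for left, right in BINARY_RULES.get(symbol, []):
            for mid in range(start + 1, end):
                for left_tree in trees(left, start, mid):
                    for right_tree in trees(right, mid, end):
                        results.append(f"({symbol} {left_tree} {right_tree})")
        return tuple(results)

    return list(trees("S", 0, len(tokens)))


for tree in parse("he saw a girl with a telescope".split()):
    print(tree)
```

Under this toy grammar the sentence has exactly two trees. In one, the PP sits inside the noun phrase “a girl” (the girl has the telescope). In the other, it attaches to the verb phrase (he uses the telescope to see her). The grammar is equally happy with both, which is exactly what syntactic ambiguity means.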
Another type of ambiguity that may arise in natural language is semantic ambiguity. This is when the meaning of a word or a sentence, not its structure, is ambiguous. For example, let’s look at the following sentence:
I saw a bat.
There is no question how this sentence is structured. The subject of the sentence is “I” and the object is “a bat,” connected by the verb “saw.” In other words, there is no syntactical ambiguity in it. But how about its meaning? “Saw” has at least two meanings. One is the past tense of the verb...
