Session Information

Industrial Strength - Natural Language Processing

Processing raw text intelligently is difficult: most words are rare, and words that look completely different often mean almost the same thing. The same words in a different order can mean something completely different. Even splitting text into useful word-like units can be difficult in many languages. While it's possible to solve some problems starting from only the raw characters, it's usually better to use linguistic knowledge to add useful information.

Written communication is growing substantially as new messaging platforms appear. The challenge is how to extract real meaning from this text so that it can drive informed decisions and power algorithms. Many open source solutions offer different options, but they come with a steep learning curve. In this talk I will walk through the basics of NLP (sentence boundaries, tokenization, part-of-speech tagging, stemming, dependency visualization, and entity extraction) and apply them to a real-world problem to show how NLP works in practice.
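
To give a feel for the first few steps named above, here is a minimal sketch of sentence splitting, tokenization, and stemming using only the Python standard library. It is not the tooling used in the talk: the function names (`split_sentences`, `tokenize`, `crude_stem`) and the rules inside them are illustrative assumptions, and a real NLP library would use far more robust methods.

```python
import re

def split_sentences(text):
    # Naive rule (assumption): a sentence ends at ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def tokenize(sentence):
    # Words (optionally with internal apostrophes) or single punctuation marks.
    return re.findall(r"\w+(?:'\w+)*|[^\w\s]", sentence)

def crude_stem(token):
    # Toy suffix stripping for illustration only; not a real stemmer.
    word = token.lower()
    if not word.isalpha():
        return word
    for suffix in ("ing", "ed", "s"):
        # Keep at least three characters so short words like "is" survive.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

text = "Dr. Smith is walking home. It's raining!"
for sentence in split_sentences(text):
    tokens = tokenize(sentence)
    print(tokens, [crude_stem(t) for t in tokens])
```

Note that the naive splitter treats "Dr." as a sentence of its own, which is exactly the kind of boundary problem that makes linguistic knowledge worth adding.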

100 - Beginner
Other Web
Room 027