Introduction


  • NLP is embedded in numerous daily-use products, such as search engines, spam filters, and voice assistants.
  • Key tasks include language modeling, text classification, information extraction, information retrieval, conversational agents, and topic modeling, each supporting various real-world applications.
  • NLP is a subfield of Artificial Intelligence (AI) that deals with approaches to process, understand, and generate natural language.
  • Deep learning has significantly advanced NLP, but the discrete and ambiguous nature of language remains challenging to process.
  • The ultimate goal of NLP is to enable machines to understand and process language as humans do, but challenges in measuring and interpreting linguistic information remain.

Episode 1: From text to vectors


  • Preprocessing involves a number of steps, such as lowercasing, tokenization, stop-word removal, and lemmatization, that you can apply to text to prepare it for further processing (see the preprocessing sketch after this list).
  • Preprocessing is important because it reduces noise and normalizes the text, which can improve your results.
  • You do not always need to apply every preprocessing step; which steps matter depends on the task at hand.
  • We can represent text as vectors of numbers, which makes it processable by machines.
  • The most efficient and useful representation is word embeddings: dense vectors that capture the meaning of words.
  • We can easily compute how similar words are to each other with cosine similarity (see the similarity sketch after this list).
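
As a small illustration of these preprocessing steps, here is a minimal sketch in plain Python; the tiny stop-word list and the example sentence are invented for the example, and real pipelines usually rely on libraries such as NLTK or spaCy:

    import re

    # A tiny, illustrative stop-word list (an assumption for this sketch);
    # real pipelines typically use a larger list from NLTK or spaCy.
    STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in", "on"}

    def preprocess(text):
        """Lowercase, tokenize, and remove stop words and punctuation."""
        text = text.lower()                    # normalize case
        tokens = re.findall(r"[a-z']+", text)  # crude word tokenization
        return [t for t in tokens if t not in STOPWORDS]

    print(preprocess("The cat is sitting on the mat, and the dog is in the garden."))
    # ['cat', 'sitting', 'mat', 'dog', 'garden']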
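
To illustrate word vectors and cosine similarity, the sketch below uses hand-made 3-dimensional vectors; the words and numbers are invented for the example, whereas real embeddings (e.g. word2vec or GloVe) are learned from large corpora and have many more dimensions:

    import numpy as np

    # Invented 3-dimensional "embeddings" for illustration only.
    vectors = {
        "cat": np.array([0.9, 0.8, 0.1]),
        "dog": np.array([0.8, 0.9, 0.2]),
        "car": np.array([0.1, 0.2, 0.9]),
    }

    def cosine_similarity(a, b):
        """cos(a, b) = (a . b) / (||a|| * ||b||)."""
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

    print(cosine_similarity(vectors["cat"], vectors["dog"]))  # ~0.99, very similar
    print(cosine_similarity(vectors["cat"], vectors["car"]))  # ~0.30, less similar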

Episode 2: BERT and Transformers


  • Transformers
  • BERT
  • BERT Architecture
  • BERT as a Language Model
  • BERT for Text Classification
  • Understanding BERT Architecture
  • BERT for Token Classification
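
As a small illustration of BERT as a language model, the sketch below uses the fill-mask pipeline from the Hugging Face transformers library with the pretrained bert-base-uncased checkpoint; it assumes the library is installed and the model can be downloaded:

    from transformers import pipeline

    # Load a masked-language-modeling pipeline backed by pretrained BERT.
    unmasker = pipeline("fill-mask", model="bert-base-uncased")

    # BERT predicts the most likely tokens for the masked position.
    for prediction in unmasker("Paris is the [MASK] of France."):
        print(prediction["token_str"], round(prediction["score"], 3))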