Start Date: | 2025-01-13 | Course Code: | CS 332 | L-T-P-C: | 3-0-0 |
---|---|---|---|---|---|
Course Name: | Natural Language Processing | Semester: | 6th Semester (Elective) | Course Faculty: | Partha Pakray |
Course Plan
Natural Language Processing (CS 332)
Professional Core Elective - I (6th Semester, CSE)
Course Details
Course Code: CS 332
Date of Starting: 13.01.2025
Course Faculty: Dr. Partha Pakray
Associate Professor, Department of Computer Science & Engineering
National Institute of Technology Silchar, Assam, INDIA
Textbooks
- Jurafsky D., Martin J. H., Speech and Language Processing, Prentice Hall.
- Manning C., Schütze H., Foundations of Statistical Natural Language Processing, MIT Press.
Course Outcomes (COs)
- Understand basic concepts in linguistics.
- Learn fundamental mathematical models and algorithms in NLP.
- Apply these models and algorithms in software design for NLP.
- Understand theoretical underpinnings of NLP in linguistics and formal language theory.
Unit | Topic | Hours | Content |
---|---|---|---|
Unit 1 | Introduction to NLP | 1 | Overview, applications, challenges |
Unit 1 | Regular Expressions & Text Normalization | 2 | Tokenization, case folding, stemming, lemmatization |
Unit 1 | Edit Distance | 1 | Levenshtein distance, applications in NLP |
Unit 1 | N-gram Language Models | 2 | Smoothing, perplexity, applications |
Unit 1 | Ambiguity, Naive Bayes, and Sentiment Classification | 1 | Ambiguity in language, Naive Bayes for text classification, sentiment analysis |
Unit 1 | Vector Semantics | 1 | Word embeddings, cosine similarity |
Unit 2 | Neural Networks and Neural Language Models | 2 | Feedforward networks, Word2Vec, GloVe |
Unit 2 | RNN, LSTM, GRU | 2 | Recurrent architectures, handling long-term dependencies |
Unit 2 | Part-of-Speech Tagging | 1 | Definition, applications, tagsets (Penn Treebank) |
Unit 2 | HMM and Maximum Entropy Models | 2 | Probabilistic sequence models, applications |
Unit 2 | CRF (Conditional Random Fields) | 1 | Overview, usage in sequence labeling |
Unit 2 | Sequence Processing with Recurrent Networks | 2 | Applications of RNNs and LSTMs in tagging and entity recognition |
Unit 3 | Formal Grammars of English | 1 | CFGs, derivations, basic structures |
Unit 3 | Treebanks as Grammars | 1 | Penn Treebank, constituency structures |
Unit 3 | Syntactic Parsing | 2 | Top-down, bottom-up parsing |
Unit 3 | Statistical Parsing and PCFG | 2 | Probabilistic CFGs, statistical approaches |
Unit 3 | Dependency Parsing | 2 | Dependency grammars, transition-based and graph-based parsing |
Unit 4 | The Representation of Sentence Meaning | 2 | Logical forms, semantic representation, challenges |
Unit 4 | Word Sense Disambiguation (WSD) | 1 | Supervised and unsupervised methods, Lesk algorithm |
Unit 4 | Information Extraction | 2 | Named entity recognition, relation extraction |
Unit 4 | Semantic Role Labeling | 1 | Predicate-argument structure, FrameNet |
Unit 4 | Lexicons for Sentiment and Discourse Coherence | 2 | Sentiment lexicons, discourse parsing |
Unit 5 | Machine Translation | 2 | Statistical, rule-based, neural machine translation |
Unit 5 | Question Answering | 1 | QA systems, applications in NLP |
Unit 5 | Dialog Systems and Chatbots | 1 | Architecture, types, conversational AI |
Unit 5 | Speech Recognition and Synthesis | 2 | ASR systems, TTS systems, deep learning techniques |
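Illustrative example (Unit 1, Edit Distance)
The Levenshtein distance listed under Unit 1 can be sketched with a short dynamic-programming routine. This is a minimal illustration, not course-issued code: the function name `levenshtein` is hypothetical, and it assumes a cost of 1 for each insertion, deletion, and substitution (some textbook presentations use a substitution cost of 2, which gives different values).

```python
def levenshtein(source: str, target: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn source into target.

    Assumption: every operation costs 1.
    """
    # previous[j] = distance between the processed prefix of source
    # (initially empty) and the first j characters of target.
    previous = list(range(len(target) + 1))

    for i, s_char in enumerate(source, start=1):
        # Turning the first i characters of source into "" takes i deletions.
        current = [i]
        for j, t_char in enumerate(target, start=1):
            substitution = previous[j - 1] + (s_char != t_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current

    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

Only two rows of the distance table are kept at a time, so memory grows with the length of `target` rather than with the product of both lengths. A typical use in NLP is ranking spelling-correction candidates by their distance from a misspelled word.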
Resources
Class PPTs and Notes
- Introduction to NLP
- Tokenization
- Lemmatization and Tokenization
- Porter Stemmer
- Regular Expression
- POS Tagging
- Language Model
- Probabilistic Language Model
- Recurrent Neural Network (RNN) Model
Attendance
Shared in a Google Sheet.
Course Evaluation
- End Semester: 50
- Mid Semester: 30
- Assignment + Tutorials: 10
- Minor Test: 10
Course Feedback
Feedback link to be shared later.
Note:
For any additional information, refer to the resources shared in class.
Natural Language Processing (CS 332): Lab Experiments
Guessing Game
- Guess Number
- Guess Word