1. What is Natural Language?
Answer:
A natural language is any language that humans use for communication, such as English, Hindi, French, or Spanish.
It includes both spoken and written languages used in daily life.
2. What are the features of natural languages?
Answer:
The main features are:
- They are governed by rules of syntax (sentence structure), a lexicon (vocabulary), and semantics (meaning).
- They are redundant – meaning the same idea can be expressed in different ways.
- They evolve over time and may have regional variations.
3. What is Natural Language Processing (NLP)?
Answer:
NLP is a branch of Artificial Intelligence that enables computers to understand, interpret, and respond to human language (text or speech) in a meaningful way.
It allows humans to interact with machines using natural language instead of code.
4. Why is NLP important?
Answer:
NLP bridges the gap between human language and computer understanding.
It allows machines to:
- Communicate naturally with humans.
- Extract useful information from text and speech.
- Perform language-related tasks like translation, summarization, and question answering.
5. Mention some real-life applications of NLP.
Answer:
- Voice Assistants – Alexa, Siri, Google Assistant.
- Auto-Captioning – YouTube and Google Meet subtitles.
- Machine Translation – Google Translate.
- Sentiment Analysis – Detecting tone in product reviews.
- Text Classification – Sorting spam or non-spam emails.
- Keyword Extraction – Finding key topics in articles.
6. What is Keyword Extraction?
Answer:
Keyword Extraction means automatically identifying important words or phrases from a text that represent its key idea or topic.
Example:
For the sentence:
“Artificial Intelligence helps computers learn from data.”
Extracted keywords may be: Artificial Intelligence, computers, data, learn.
It helps summarize long texts and improves search results.
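A minimal sketch of one simple approach, assuming keywords are just the most frequent words left after dropping stop words. The stop-word list, the sample text, and the function name `extract_keywords` are illustrative assumptions; practical systems use richer scoring such as TF-IDF.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; real lists are much longer.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "and", "from", "to", "in"}

def extract_keywords(text, top_n):
    """Return the top_n most frequent non-stop-words in text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return [word for word, _ in counts.most_common(top_n)]

# Sample text extended from the example sentence so that counts differ.
text = ("Artificial Intelligence helps computers learn from data. "
        "Computers use data to find patterns in data.")
print(extract_keywords(text, 2))  # ['data', 'computers']
```

Counting single words cannot recover multi-word keywords such as "Artificial Intelligence"; that would need phrase detection, which this sketch leaves out.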
7. What are the main stages of NLP? Explain with examples.
Answer:
The five main stages of NLP are:
| Stage | Description | Example |
|---|---|---|
| 1. Lexical Analysis | Splitting text into individual words or tokens. | “AI is fun” → {AI, is, fun} |
| 2. Syntactic Analysis | Checking grammar and sentence structure. | Detects “He are good” as incorrect. |
| 3. Semantic Analysis | Understanding the meaning of words and phrases. | The word “bark” means sound (dog) or part of a tree, depending on context. |
| 4. Discourse Integration | Connecting the meaning of one sentence to the next. | “Riya bought a pen. It is blue.” → “It” refers to “pen”. |
| 5. Pragmatic Analysis | Understanding the intent behind the sentence. | “Can you open the door?” → A polite request, not a question about ability. |
Together, these stages help computers interpret language in a way that approximates human understanding.
8. What is a Chatbot?
Answer:
A chatbot is an AI-based software program that simulates human conversation through text or voice.
It uses Natural Language Processing (NLP) to understand what the user says and respond appropriately.
Examples:
- Customer Service Bots: Answer queries on shopping or banking websites.
- Virtual Assistants: Alexa, Siri, and Google Assistant.
- Educational Bots: Used in online learning platforms to answer student queries.
Chatbots save time, provide quick responses, and improve user experience.
9. What are the types of Chatbots?
Answer:
- Scripted Chatbots:
  - Work on pre-defined rules and keywords.
  - Example: If you type “Hello,” it replies “Hi, how can I help you?”
  - Cannot understand complex sentences.
- Smart Chatbots:
  - Use AI and NLP to understand the intent and context of the message.
  - Examples: Siri, Alexa, and ChatGPT, which can hold meaningful conversations and adapt their replies to user input.
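A scripted chatbot can be sketched as a keyword lookup. The rules, the fallback message, and the function name `scripted_reply` below are hypothetical, chosen only to mirror the “Hello” example above.

```python
import re

# Hypothetical keyword -> reply rules (an assumption for illustration).
RULES = {
    "hello": "Hi, how can I help you?",
    "bye": "Goodbye!",
}
FALLBACK = "Sorry, I did not understand that."

def scripted_reply(message):
    """Reply with the first rule whose keyword appears in the message."""
    words = re.findall(r"[a-z]+", message.lower())
    for keyword, reply in RULES.items():
        if keyword in words:
            return reply
    # No rule matched: a scripted bot cannot interpret the sentence.
    return FALLBACK

print(scripted_reply("Hello"))                     # Hi, how can I help you?
print(scripted_reply("What is the weather like?")) # Sorry, I did not understand that.
```

The fallback branch shows the main limitation: any message outside the fixed rules gets the same generic reply.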
10. What is Text Processing in NLP?
Answer:
Text Processing means preparing text so that a computer can analyze it easily.
It removes unnecessary parts and converts text into a structured format.
This process includes several steps such as normalization, tokenization, removing stop words, stemming, and lemmatization.
Example:
For the sentence:
“The cats are running fast.”
After text processing:
{cat, run, fast}
11. What is Text Normalisation?
Answer:
Text Normalisation is the process of converting text into a consistent and simpler format before analysis.
It includes:
- Sentence Segmentation: Splitting text into sentences.
- Tokenization: Breaking sentences into words or tokens.
- Removing Stop Words: Removing words like a, an, the, is.
- Stemming and Lemmatization: Converting words to their root form.
Example:
Sentence: “AI technologies are transforming industries.”
After normalisation → {AI, technology, transform, industry}
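The first two steps, sentence segmentation and tokenization, can be sketched with regular expressions. This is a naive version under the assumption that sentences end with ".", "!" or "?" followed by a space; the function names are illustrative, and abbreviations such as "Dr." would be split incorrectly.

```python
import re

def split_sentences(text):
    """Naive segmentation: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]

def tokenize(sentence):
    """Break a sentence into word tokens, dropping punctuation."""
    return re.findall(r"[A-Za-z0-9]+", sentence)

text = "AI technologies are transforming industries. Riya bought a pen."
sentences = split_sentences(text)
print(sentences)
# ['AI technologies are transforming industries.', 'Riya bought a pen.']
print(tokenize(sentences[0]))
# ['AI', 'technologies', 'are', 'transforming', 'industries']
```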
12. What is the difference between Stemming and Lemmatization?
Answer:
| Stemming | Lemmatization |
|---|---|
| Removes suffixes to reach the root form. | Converts a word to its dictionary form (lemma). |
| Result may not be a valid word. | Gives a valid dictionary word. |
| Faster but less accurate. | Slower but more accurate. |
| Example: “studying” → “study”, “studies” → “studi” | Example: “studying” → “study”, “better” → “good” |
In short:
Stemming chops off word endings using fixed rules, while Lemmatization uses vocabulary and grammar to return the word's dictionary form.
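The contrast can be illustrated with two toy functions. Both are assumptions made for this sketch: `toy_stem` strips a few hard-coded suffixes, and `toy_lemmatize` looks words up in a tiny hand-written dictionary. Libraries such as NLTK provide full stemmers and lemmatizers whose outputs may differ from these.

```python
# Toy rules and dictionary; both are illustrative assumptions.
SUFFIXES = ("ing", "es", "s")
LEMMAS = {"studying": "study", "studies": "study", "better": "good"}

def toy_stem(word):
    """Strip the first matching suffix, keeping at least 3 letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def toy_lemmatize(word):
    """Look the word up in the dictionary; fall back to the word itself."""
    word = word.lower()
    return LEMMAS.get(word, word)

for w in ("studying", "studies", "better"):
    print(w, "->", toy_stem(w), "|", toy_lemmatize(w))
# studying -> study | study
# studies -> studi | study
# better -> better | good
```

The stemmer produces the non-word "studi" and cannot relate "better" to "good", because it only looks at spelling; the lemmatizer handles both because it relies on a dictionary.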
13. What are Stop Words?
Answer:
Stop Words are common words that occur frequently but add little meaning to a sentence, such as a, an, the, is, are, of, and.
They are removed during preprocessing to make text analysis faster and more efficient.
Example:
Sentence: “AI is the future of technology.”
After removing stop words → “AI future technology.”
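A minimal sketch of stop-word removal, using only the stop words listed above. The function name `remove_stop_words` is illustrative, and real stop-word lists contain many more entries.

```python
import re

# Stop words taken from the list in the answer above.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "and"}

def remove_stop_words(sentence):
    """Tokenize the sentence and drop tokens found in STOP_WORDS."""
    tokens = re.findall(r"[A-Za-z]+", sentence)
    return [t for t in tokens if t.lower() not in STOP_WORDS]

print(remove_stop_words("AI is the future of technology."))
# ['AI', 'future', 'technology']
```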
14. What is the Bag of Words (BoW) model?
Answer:
The Bag of Words model represents text as a collection (bag) of words and counts how many times each word appears, without caring about grammar or order.
It helps computers analyze word frequency for classification or comparison.
Example:
Sentences:
- “AI is fun.”
- “AI is powerful.”
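BoW representation (counts combined across both sentences):
{AI:2, is:2, fun:1, powerful:1}
Each sentence can also be written as its own count vector over the shared vocabulary [AI, is, fun, powerful]: [1, 1, 1, 0] for the first and [1, 1, 0, 1] for the second.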
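The counting can be sketched with `collections.Counter`. This version lowercases words and sorts the vocabulary alphabetically, both of which are choices made for the sketch. It builds one count vector per sentence as well as the combined counts.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and return its word tokens."""
    return re.findall(r"[a-z]+", text.lower())

docs = ["AI is fun.", "AI is powerful."]
tokenized = [tokenize(d) for d in docs]

# Shared vocabulary across all sentences, in alphabetical order.
vocabulary = sorted({w for tokens in tokenized for w in tokens})

# One count vector per sentence, aligned with the vocabulary.
vectors = [[Counter(tokens)[w] for w in vocabulary] for tokens in tokenized]

# Combined counts over the whole collection.
corpus_counts = Counter(w for tokens in tokenized for w in tokens)

print(vocabulary)           # ['ai', 'fun', 'is', 'powerful']
print(vectors)              # [[1, 1, 1, 0], [1, 0, 1, 1]]
print(dict(corpus_counts))  # {'ai': 2, 'is': 2, 'fun': 1, 'powerful': 1}
```

Word order is discarded: "AI is fun" and "fun is AI" would produce the same vector.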
15. What is TF-IDF?
Answer:
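TF-IDF (Term Frequency – Inverse Document Frequency) is a method for scoring how important a word is to one document within a collection of documents.
It gives a higher weight to words that appear often in a document but rarely across the collection, and a lower weight to words that are common everywhere.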
Example:
In a collection of news articles, the word “budget” may appear often in finance articles but rarely in sports articles.
Hence, its importance (TF-IDF score) will be higher in the finance category.
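A sketch of one common variant of the calculation, assuming TF = (count of the term in the document) / (document length) and IDF = log(N / number of documents containing the term). The two tiny "articles" are made-up data, and libraries often use smoothed versions of these formulas.

```python
import math

def tf(term, doc):
    """Term frequency: share of the document's tokens equal to term."""
    return doc.count(term) / len(doc)

def idf(term, docs):
    """Inverse document frequency: log(N / documents containing term)."""
    df = sum(1 for d in docs if term in d)
    return math.log(len(docs) / df) if df else 0.0

def tf_idf(term, doc, docs):
    return tf(term, doc) * idf(term, docs)

# Made-up, pre-tokenized articles (an assumption for illustration).
finance = ["the", "budget", "raises", "the", "tax"]
sports = ["the", "team", "won", "the", "match"]
docs = [finance, sports]

print(tf_idf("budget", finance, docs))  # 0.2 * log(2), roughly 0.139
print(tf_idf("budget", sports, docs))   # 0.0, the word is absent
print(tf_idf("the", finance, docs))     # 0.0, the word is in every document
```

"the" is the most frequent word in both articles but scores zero, because a word found in every document carries no information for telling them apart.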
16. What are the applications of TF-IDF?
Answer:
- Document Classification: Categorizing text into topics (sports, politics, etc.)
- Search Engines: Ranking web pages based on keyword importance.
- Topic Modeling: Finding main topics in large text sets.
- Stop Word Filtering: Removing unimportant words automatically.
Example:
Search engines have traditionally used TF-IDF-style weighting as one signal for showing the most relevant pages for a search query.
17. What is Sentiment Analysis?
Answer:
Sentiment Analysis is an NLP technique that identifies the emotion, tone, or opinion expressed in text and classifies it as positive, negative, or neutral.
Example:
- “I love this movie!” → Positive
- “The food was cold and tasteless.” → Negative
- “It was okay, not great.” → Neutral
It helps companies analyze customer reviews, tweets, and feedback to understand public sentiment.
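A minimal lexicon-based sketch of the idea, assuming two small hand-picked word lists and one crude rule: a sentiment word directly after "not" is ignored. The lists, the rule, and the function name `classify_sentiment` are illustrative assumptions; practical systems usually rely on trained models.

```python
import re

# Tiny hand-picked lexicons (assumptions for illustration).
POSITIVE = {"love", "great", "good", "excellent"}
NEGATIVE = {"cold", "tasteless", "bad", "terrible"}

def classify_sentiment(text):
    """Score +1 per positive word, -1 per negative word, then label."""
    tokens = re.findall(r"[a-z]+", text.lower())
    score = 0
    for i, tok in enumerate(tokens):
        # Crude negation handling: skip a word that follows "not".
        if i > 0 and tokens[i - 1] == "not":
            continue
        if tok in POSITIVE:
            score += 1
        elif tok in NEGATIVE:
            score -= 1
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"

print(classify_sentiment("I love this movie!"))                # Positive
print(classify_sentiment("The food was cold and tasteless."))  # Negative
print(classify_sentiment("It was okay, not great."))           # Neutral
```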
18. What are the applications of Sentiment Analysis?
Answer:
- Customer Feedback Analysis – Understanding satisfaction levels.
- Social Media Monitoring – Tracking brand reputation on platforms like Twitter.
- Product Reviews – Analyzing user opinions on e-commerce sites.
- Public Opinion Tracking – During elections or campaigns.