Python Simple Web Scraper for Titles Example
Build a small Python script that downloads a web page and extracts page titles. This example is meant to be practical, short, and easy to run step by step.
If you are new to packages, see how to install a Python package with pip.
Quick example
This is the fastest working example. It assumes requests and beautifulsoup4 are installed.
import requests
from bs4 import BeautifulSoup
url = "https://example.com"
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")
print(soup.title.string.strip())
What it does:
- Downloads the page at https://example.com
- Parses the HTML
- Finds the <title> tag
- Prints the title text
Expected output:
Example Domain
What this example does
This script does four simple things:
- Downloads HTML from a web page
- Parses the HTML with BeautifulSoup
- Finds the <title> tag
- Prints the title text
This is a good beginner scraping example because it focuses on one small task.
What you need before running it
Before you run the script, make sure you have:
- Python installed
- Basic understanding of running a Python script
- The requests package installed
- The beautifulsoup4 package installed
- An internet connection for live pages
You can check your Python version with:
python --version
Install the required packages
Install the required packages with pip:
pip install requests beautifulsoup4
If pip gives you errors, the problem is usually:
- pip is not installed
- You are using the wrong Python environment
- Python and pip point to different installations
If needed, read how to install a Python package with pip.
You can also check pip with:
pip --version
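If you suspect that Python and pip point to different installations, a short script can show which interpreter is running and whether it can import the two packages. This is only a rough sketch: missing_packages is a made-up helper name, not part of pip or of either library.

```python
import importlib.util
import sys


def missing_packages(module_names):
    """Return the module names that this interpreter cannot import."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The interpreter running this script. pip must install into the same one.
    print("Python executable:", sys.executable)

    # The beautifulsoup4 package is imported under the module name "bs4".
    missing = missing_packages(["requests", "bs4"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("requests and bs4 are both importable.")
```

If a module is reported as missing, running python -m pip install requests beautifulsoup4 installs the packages with the pip that belongs to that same interpreter.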
Minimal example: get one page title
Here is a simple version with the main steps shown clearly.
import requests
from bs4 import BeautifulSoup
url = "https://example.com"
response = requests.get(url)
html = response.text
soup = BeautifulSoup(html, "html.parser")
title_text = soup.title.string
print(title_text)
How this code works
- requests.get(url) fetches the page
- response.text gives you the HTML as a string
- BeautifulSoup(html, "html.parser") parses the HTML
- soup.title finds the <title> tag
- soup.title.string gets the text inside the tag
For https://example.com, the output is:
Example Domain
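To see what each step returns without using the network, you can run the same calls on an HTML string. This is a small sketch: sample_html is made up for illustration and stands in for response.text.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, standing in for response.text from a real request.
sample_html = "<html><head><title>Example Domain</title></head><body></body></html>"

soup = BeautifulSoup(sample_html, "html.parser")

title_tag = soup.title          # the whole <title> element (a Tag object)
title_text = soup.title.string  # only the text inside the element

print(title_tag)   # <title>Example Domain</title>
print(title_text)  # Example Domain
```

Printing both values side by side makes the difference between the tag and its text easier to see.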
Safer version with basic error handling
The first example is useful, but it can fail if:
- The page cannot be downloaded
- The website returns an error
- The page has no title tag
- The title exists but has no text
This version is safer for beginners:
import requests
from bs4 import BeautifulSoup
url = "https://example.com"
try:
    response = requests.get(url, timeout=10)
    response.raise_for_status()
except requests.RequestException:
    print("Could not fetch the page.")
else:
    soup = BeautifulSoup(response.text, "html.parser")
    if soup.title and soup.title.string:
        print(soup.title.string.strip())
    else:
        print("No title tag was found.")
Why this version is better
- timeout=10 prevents the request from hanging too long
- response.raise_for_status() catches HTTP errors like 404 and 500
- if soup.title and soup.title.string avoids crashes when the title is missing
If you are not familiar with HTTP requests yet, see how to make an API request in Python. The same requests library is used here.
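The title check from the safer version can also be moved into a small helper, which makes it easy to try on HTML strings. This is one possible sketch: extract_title is a hypothetical name, and the three input strings are made up to cover a normal title, an empty title, and a missing title.

```python
from bs4 import BeautifulSoup


def extract_title(html):
    """Return the stripped title text, or None if there is no usable title."""
    soup = BeautifulSoup(html, "html.parser")
    if soup.title and soup.title.string:
        text = soup.title.string.strip()
        return text or None  # treat a whitespace-only title as missing
    return None


# Hypothetical inputs, one per case.
print(extract_title("<title> Example Domain </title>"))  # Example Domain
print(extract_title("<title></title>"))                   # None (tag exists, no text)
print(extract_title("<p>No head section here</p>"))       # None (no title tag)
```

Returning None instead of printing lets the calling code decide what message to show.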
Example: scrape titles from multiple pages
You can also put several URLs in a list and loop through them.
import requests
from bs4 import BeautifulSoup
urls = [
    "https://example.com",
    "https://www.python.org",
    "https://www.wikipedia.org",
]
for url in urls:
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        soup = BeautifulSoup(response.text, "html.parser")
        if soup.title and soup.title.string:
            title = soup.title.string.strip()
        else:
            title = "No title found"
        print(f"{url} -> {title}")
    except requests.RequestException:
        print(f"{url} -> Could not fetch page")
Example output might look like this:
https://example.com -> Example Domain
https://www.python.org -> Welcome to Python.org
https://www.wikipedia.org -> Wikipedia
This example keeps the loop small and readable. That is a good way to start.
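Once the loop works, one possible next step is to collect the titles in a dictionary instead of printing them. The sketch below is only a suggestion: fetch_title, fetch_titles, and the delay_seconds parameter are made-up names, and the one-second pause between requests is an assumption about what counts as polite, not a rule from the requests library.

```python
import time

import requests
from bs4 import BeautifulSoup


def fetch_title(session, url):
    """Return the page title, or None if the fetch fails or there is no title."""
    try:
        response = session.get(url, timeout=10)
        response.raise_for_status()
    except requests.RequestException:
        return None
    soup = BeautifulSoup(response.text, "html.parser")
    if soup.title and soup.title.string:
        return soup.title.string.strip()
    return None


def fetch_titles(urls, delay_seconds=1.0):
    """Map each URL to its title (or None), pausing between requests."""
    titles = {}
    # One Session is reused so connections to the same site can be shared.
    with requests.Session() as session:
        for index, url in enumerate(urls):
            if index > 0:
                time.sleep(delay_seconds)  # short pause between requests
            titles[url] = fetch_title(session, url)
    return titles
```

Calling fetch_titles(urls) with the list from the example above returns a dictionary you can loop over or save. Note that this sketch uses None for both a failed request and a missing title, so it gives less detail than the printed messages in the loop above.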
How this example works
There are two main tools in this script:
- requests handles the HTTP request
- BeautifulSoup reads and searches the HTML
A page title is usually inside the <head> section, like this:
<title>Example Domain</title>
BeautifulSoup makes it easy to find that tag.
Keep in mind:
- Not every page has a title
- Some pages have messy HTML
- Some pages return different content than you expect
- Some sites load content later with JavaScript
If you want a broader BeautifulSoup example after this one, see Python web scraping example with BeautifulSoup.
Common problems beginners hit
Here are the most common causes when this example does not work:
- requests is not installed
- beautifulsoup4 is not installed
- The URL is wrong or missing https://
- The page request failed
- The HTML has no title tag
- The site uses JavaScript for content you expected to scrape
You may also run into these specific errors:
- ModuleNotFoundError: No module named ... if requests or bs4 is missing
- AttributeError: 'NoneType' object has no attribute ... if soup.title is None
Useful commands for debugging:
pip install requests beautifulsoup4
python --version
pip --version
python script.py
A few examples of common mistakes
If bs4 is not installed:
from bs4 import BeautifulSoup
You may get a ModuleNotFoundError.
If the page has no title and you do this:
print(soup.title.string)
You will get an AttributeError because soup.title is None, and None has no string attribute.
If the URL is incomplete:
url = "example.com"
The request fails before anything is sent: requests raises a MissingSchema error because the URL has no scheme. Use the full URL instead:
url = "https://example.com"
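If your URLs come from user input or a file, a small helper can add the missing scheme before the request is made. This is a minimal sketch: ensure_scheme is a hypothetical helper name, and defaulting to https is an assumption that will not suit every site.

```python
from urllib.parse import urlsplit


def ensure_scheme(url, default_scheme="https"):
    """Return url unchanged if it starts with http or https, else add a scheme."""
    if urlsplit(url).scheme in ("http", "https"):
        return url
    # Strip leading slashes so "//example.com" does not become "https:////example.com".
    return f"{default_scheme}://{url.lstrip('/')}"


print(ensure_scheme("example.com"))          # https://example.com
print(ensure_scheme("https://example.com"))  # https://example.com
```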
Important beginner notes
Keep these points in mind:
- This example only reads page HTML and extracts the title
- It is not a full web scraping guide
- Some sites do not allow scraping
- Start with simple public pages like example.com
Also remember that requests only downloads the initial HTML. If a site loads text later with JavaScript, your script may not see that content.
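The JavaScript limitation is easier to understand with a small offline example. The HTML below is made up for illustration: it mimics a page whose visible text is filled in by a script after loading. BeautifulSoup does not run scripts, so the element is still empty when you read it.

```python
from bs4 import BeautifulSoup

# Hypothetical page: the div is empty in the HTML and filled in later by JavaScript.
html = """<html><head><title>Demo</title></head><body>
<div id="content"></div>
<script>document.getElementById("content").textContent = "Loaded later";</script>
</body></html>"""

soup = BeautifulSoup(html, "html.parser")
content = soup.find(id="content")

print(repr(content.get_text()))  # '' because the script never ran
```

A browser would show "Loaded later" inside the div, but a script using requests and BeautifulSoup only sees the empty element.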
FAQ
Why does soup.title return None?
The page may not have a title tag, or the HTML was not fetched correctly.
Why is my scraper not finding text I can see in the browser?
Some websites load content with JavaScript. requests only gets the initial HTML.
Do I need BeautifulSoup just to get the title?
Not always, but it makes HTML parsing much easier and clearer for beginners.
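For comparison, here is a rough sketch that gets a title using only the standard library's html.parser module. TitleParser and title_from_html are made-up names, and this version only handles the simple case of plain text inside the first title tag.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True  # ignore any later <title> elements

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        text = "".join(self._parts).strip()
        return text or None


def title_from_html(html):
    """Return the title text from an HTML string, or None if there is none."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.title


print(title_from_html("<html><head><title>Example Domain</title></head></html>"))
```

This works, but it takes far more code than soup.title.string, which is why BeautifulSoup is the easier choice for beginners.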
What packages do I need for this example?
You usually need requests and beautifulsoup4.
Can I scrape many pages with a loop?
Yes. Start with a short list of URLs and print each page title.