Python Simple Web Scraper for Titles Example

Build a small Python script that downloads a web page and extracts page titles. This example is meant to be practical, short, and easy to run step by step.

If you are new to packages, see how to install a Python package with pip.

Quick example

This is the shortest working example. It assumes requests and beautifulsoup4 are installed.

import requests
from bs4 import BeautifulSoup

url = "https://example.com"
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

print(soup.title.string.strip())

What it does:

  • Downloads the page at https://example.com
  • Parses the HTML
  • Finds the <title> tag
  • Prints the title text

Expected output:

Example Domain

What this example does

In short, the script downloads a page's HTML with requests, parses it with BeautifulSoup, and prints the text of the <title> tag.

This is a good beginner scraping example because it focuses on one small task.

What you need before running it

Before you run the script, make sure you have:

  • Python installed
  • Basic understanding of running a Python script
  • The requests package installed
  • The beautifulsoup4 package installed
  • An internet connection for live pages

You can check your Python version with this command (on some systems, use python3 instead of python):

python --version

Install the required packages

Install the required packages with pip:

pip install requests beautifulsoup4

If pip gives you errors, the problem is usually:

  • pip is not installed
  • You are using the wrong Python environment
  • Python and pip point to different installations

If needed, read how to install a Python package with pip.

You can also check pip with:

pip --version
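
When Python and pip point to different installations, it can help to ask the interpreter itself what it can import. The following is a minimal sketch that uses only the standard library; the helper name is_installed is made up for this illustration.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if this interpreter can find the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


# The interpreter that is actually running this script
print("Python executable:", sys.executable)

# Import names can differ from pip names: beautifulsoup4 is imported as bs4
for name in ("requests", "bs4"):
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If a package shows as missing, running python -m pip install requests beautifulsoup4 installs it for the same interpreter that the python command starts.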

Minimal example: get one page title

Here is a simple version with the main steps shown clearly.

import requests
from bs4 import BeautifulSoup

url = "https://example.com"

response = requests.get(url)
html = response.text

soup = BeautifulSoup(html, "html.parser")

title_text = soup.title.string
print(title_text)

How this code works

  • requests.get(url) fetches the page
  • response.text gives you the HTML as a string
  • BeautifulSoup(html, "html.parser") parses the HTML
  • soup.title finds the <title> tag
  • soup.title.string gets the text inside the tag

For https://example.com, the output is:

Example Domain

Safer version with basic error handling

The first example is useful, but it can fail if:

  • The page cannot be downloaded
  • The website returns an error
  • The page has no title tag
  • The title exists but has no text

This version is safer for beginners:

import requests
from bs4 import BeautifulSoup

url = "https://example.com"

try:
    response = requests.get(url, timeout=10)
    response.raise_for_status()
except requests.RequestException:
    print("Could not fetch the page.")
else:
    soup = BeautifulSoup(response.text, "html.parser")

    if soup.title and soup.title.string:
        print(soup.title.string.strip())
    else:
        print("No title text was found.")

Why this version is better

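  • timeout=10 makes the request give up if the server does not respond within 10 seconds, instead of hanging indefinitely
  • response.raise_for_status() raises an exception for HTTP error responses such as 404 and 500, which the except block then handles
  • if soup.title and soup.title.string avoids crashes when the title tag is missing or has no text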
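
The check above treats a missing title tag and an empty one the same way. If you want to tell them apart, one possible approach is sketched below. The helper name describe_title is hypothetical, and the HTML is passed in as a string so the sketch runs without a network connection. It uses get_text() rather than .string, which is a substitution on my part: .string returns None when the tag contains more than one child.

```python
from bs4 import BeautifulSoup


def describe_title(html):
    """Return the title text, or a message explaining why there is none."""
    soup = BeautifulSoup(html, "html.parser")

    if soup.title is None:
        return "No title tag was found."

    # get_text() joins all text inside the tag; strip=True trims whitespace
    text = soup.title.get_text(strip=True)
    if not text:
        return "The title tag is empty."

    return text


print(describe_title("<html><head><title> Example Domain </title></head></html>"))
print(describe_title("<html><head><title></title></head></html>"))
print(describe_title("<html><head></head><body></body></html>"))
```

The three calls cover a normal title, an empty title tag, and a page with no title tag at all.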

If you are not familiar with HTTP requests yet, see how to make an API request in Python. The same requests library is used here.

Example: scrape titles from multiple pages

You can also put several URLs in a list and loop through them.

import requests
from bs4 import BeautifulSoup

urls = [
    "https://example.com",
    "https://www.python.org",
    "https://www.wikipedia.org",
]

for url in urls:
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        soup = BeautifulSoup(response.text, "html.parser")

        if soup.title and soup.title.string:
            title = soup.title.string.strip()
        else:
            title = "No title found"

        print(f"{url} -> {title}")

    except requests.RequestException:
        print(f"{url} -> Could not fetch page")

Example output might look like this:

https://example.com -> Example Domain
https://www.python.org -> Welcome to Python.org
https://www.wikipedia.org -> Wikipedia

This example keeps the loop small and readable. That is a good way to start.

How this example works

There are two main tools in this script:

  • requests handles the HTTP request
  • BeautifulSoup reads and searches the HTML

A page title is usually inside the <head> section, like this:

<title>Example Domain</title>

BeautifulSoup makes it easy to find that tag.

Keep in mind:

  • Not every page has a title
  • Some pages have messy HTML
  • Some pages return different content than you expect
  • Some sites load content later with JavaScript

If you want a broader BeautifulSoup example after this one, see Python web scraping example with BeautifulSoup.

Common problems beginners hit

Here are the most common causes when this example does not work:

  • requests is not installed
  • beautifulsoup4 is not installed
  • The URL is wrong or missing https://
  • The page request failed
  • The HTML has no title tag
  • The site uses JavaScript for content you expected to scrape

Useful commands for debugging:

pip install requests beautifulsoup4
python --version
pip --version
python script.py

A few examples of common mistakes

If bs4 is not installed:

from bs4 import BeautifulSoup

You get ModuleNotFoundError: No module named 'bs4'. Install the package with pip install beautifulsoup4.

If the page has no title and you do this:

print(soup.title.string)

You get AttributeError: 'NoneType' object has no attribute 'string', because soup.title is None.

If the URL is incomplete:

url = "example.com"

The request fails with a requests.exceptions.MissingSchema error because the URL has no scheme. Use the full URL instead:

url = "https://example.com"
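
If your URLs come from user input or a file, you could add the scheme before making the request. This is only a sketch: the helper name ensure_scheme is invented here, and defaulting to https is an assumption that will not suit every site.

```python
def ensure_scheme(url, default_scheme="https"):
    """Prefix the URL with a scheme if it does not already start with one."""
    url = url.strip()
    if url.startswith(("http://", "https://")):
        return url
    return f"{default_scheme}://{url}"


print(ensure_scheme("example.com"))
print(ensure_scheme("https://example.com"))
```

Both calls print https://example.com, so the result can be passed to requests.get() either way.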

Important beginner notes

Keep these points in mind:

  • This example only reads page HTML and extracts the title
  • It is not a full web scraping guide
  • Some sites do not allow scraping
  • Start with simple public pages like example.com

Also remember that requests only downloads the initial HTML. If a site loads text later with JavaScript, your script may not see that content.
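
The effect can be illustrated without a network connection. In the sketch below, the HTML string, the app id, and the script contents are all invented for illustration; the string stands in for what requests would download from such a page.

```python
from bs4 import BeautifulSoup

# Invented page: the visible text is only added when a browser runs the script
html = """
<html>
  <head><title>Loading...</title></head>
  <body>
    <div id="app"></div>
    <script>
      document.title = "Dashboard";
      document.getElementById("app").textContent = "Hello from JavaScript";
    </script>
  </body>
</html>
"""

soup = BeautifulSoup(html, "html.parser")

# BeautifulSoup parses the markup but never executes the script
static_title = soup.title.get_text(strip=True)
app_text = soup.find(id="app").get_text(strip=True)

print("Title in the HTML:", static_title)
print("Text inside #app:", repr(app_text))
```

The script sees the title Loading... and an empty div, even though a browser would show Dashboard and the greeting after running the JavaScript.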

FAQ

Why does soup.title return None?

The page may not have a title tag, or the HTML was not fetched correctly.

Why is my scraper not finding text I can see in the browser?

Some websites load content with JavaScript. requests only gets the initial HTML.

Do I need BeautifulSoup just to get the title?

Not always, but it makes HTML parsing much easier and clearer for beginners.
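
For comparison, here is a rough sketch of the same task using only the standard library's html.parser module. The class name TitleParser and the function get_title are made up for this illustration, and the HTML is passed in as a string rather than downloaded.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text found inside the first <title> tag."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


def get_title(html):
    """Return the stripped title text, or None if there is no title text."""
    parser = TitleParser()
    parser.feed(html)
    text = "".join(parser.parts).strip()
    return text or None


print(get_title("<html><head><title>Example Domain</title></head></html>"))
```

This works, but it takes far more code than soup.title, which is why BeautifulSoup is the easier starting point.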

What packages do I need for this example?

You need requests and beautifulsoup4. Install both with pip install requests beautifulsoup4.

Can I scrape many pages with a loop?

Yes. Start with a short list of URLs and print each page title.

See also