Python Remove Duplicates from Data Example
This beginner-friendly example shows simple ways to remove duplicate values from Python data.
The focus is on lists, because that is where this problem appears most often. You will see:
- how to remove repeated items quickly
- how to keep the original order when needed
- how to do the same job step by step with a loop
Quick answer
data = ["apple", "banana", "apple", "orange", "banana"]
unique_data = list(dict.fromkeys(data))
print(unique_data)
# ['apple', 'banana', 'orange']
Use dict.fromkeys() when you want to remove duplicates and keep the original order.
What this example solves
This example is useful when you need to:
- remove repeated items from a list
- understand the difference between removing duplicates with and without keeping order
- learn a practical pattern without reading a full theory lesson
If you want a task-focused guide, see how to remove duplicates from a list in Python.
Example 1: Remove duplicates with a set
The shortest way to remove duplicates is to turn the list into a set.
data = ["apple", "banana", "apple", "orange", "banana"]
unique_data = set(data)
print(unique_data)
Possible output:
{'banana', 'orange', 'apple'}
What this does
- set(data) removes duplicate values
- each value appears only once in the result
- the result is a set, not a list
Important note
A set does not keep the original order of the list. That means this method is best when order does not matter.
If you want to learn more about sets, read Python sets explained or Python set: creating a set.
If you need the result as a list, convert it back:
data = ["apple", "banana", "apple", "orange", "banana"]
unique_data = list(set(data))
print(unique_data)
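The order of list(set(data)) can differ from run to run. If a predictable result matters more than the original order, one option is to sort the unique values. This is a minimal sketch that assumes the items can be compared with each other (here they are all strings); the name unique_sorted is only for illustration.

```python
data = ["apple", "banana", "apple", "orange", "banana"]

# sorted() accepts any iterable, including a set, and always returns a list
unique_sorted = sorted(set(data))
print(unique_sorted)
# ['apple', 'banana', 'orange']
```

Sorting gives alphabetical order, not the order of the original list.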
Example 2: Remove duplicates and keep order
If you want to keep the first appearance of each item, use dict.fromkeys().
data = ["apple", "banana", "apple", "orange", "banana"]
unique_data = list(dict.fromkeys(data))
print(unique_data)
Output:
['apple', 'banana', 'orange']
Why this works
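dict.fromkeys(data) creates a dictionary using each list item as a key, with None as every value.
Dictionary keys must be unique, so duplicates are removed automatically. Since Python 3.7, dictionaries also keep their keys in insertion order, which is why the first occurrence of each item stays in place. Then list(...) turns those keys back into a list.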
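To make the intermediate step visible, the one-liner can be split in two. This sketch uses a helper name, keys_dict, that is not part of the original example.

```python
data = ["apple", "banana", "apple", "orange", "banana"]

# Each list item becomes a key; every value defaults to None
keys_dict = dict.fromkeys(data)
print(keys_dict)
# {'apple': None, 'banana': None, 'orange': None}

# Converting a dictionary to a list yields its keys in insertion order
unique_data = list(keys_dict)
print(unique_data)
# ['apple', 'banana', 'orange']
```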
Why beginners like this method
- it is short
- it keeps order
- it works well for common cleanup tasks
If dictionaries are new to you, see Python dictionaries explained.
Example 3: Build a unique list with a loop
A loop is longer, but it helps you understand the logic step by step.
data = ["apple", "banana", "apple", "orange", "banana"]
unique_data = []
for item in data:
    if item not in unique_data:
        unique_data.append(item)
print(unique_data)
Output:
['apple', 'banana', 'orange']
How it works
- start with an empty list called unique_data
- go through each item in data
- check whether the item is already in unique_data
- if not, add it
This approach is useful when you want custom rules. For example, you might want to ignore case or skip empty values.
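Here is a simple case-insensitive version. It records the lowercased values in a set called seen, which is faster to check than a list, and keeps the first spelling of each item: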
data = ["Apple", "banana", "apple", "Orange", "BANANA"]
unique_data = []
seen = set()
for item in data:
    normalized = item.lower()
    if normalized not in seen:
        seen.add(normalized)
        unique_data.append(item)
print(unique_data)
Output:
['Apple', 'banana', 'Orange']
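The other custom rule mentioned above, skipping empty values, can be sketched the same way. The sample data is made up for illustration, and "empty" is assumed here to mean an empty string.

```python
data = ["apple", "", "banana", "apple", "", "orange"]
unique_data = []
seen = set()

for item in data:
    # Custom rule (an assumption for this sketch): ignore empty strings
    if item == "":
        continue
    if item not in seen:
        seen.add(item)
        unique_data.append(item)

print(unique_data)
# ['apple', 'banana', 'orange']
```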
When each method is a good choice
Use set() when:
- you want the shortest solution
- order does not matter
- you only need unique values quickly
Use dict.fromkeys() when:
- you want to preserve the original order
- you want a short and readable solution
- you are cleaning list data in a beginner script
Use a loop when:
- you want to learn how duplicate removal works
- you need custom rules
- you want more control over the logic
Beginner notes
set() and dict.fromkeys() only work when every value in the list is hashable. Common hashable types include:
- strings
- numbers
- tuples (as long as every element inside the tuple is also hashable)
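As a quick illustration with something other than strings, here is the same dict.fromkeys() pattern applied to a list of tuples. The points list is sample data invented for this sketch.

```python
points = [(1, 2), (3, 4), (1, 2), (5, 6), (3, 4)]

# Tuples of numbers are hashable, so they can be used as dictionary keys
unique_points = list(dict.fromkeys(points))
print(unique_points)
# [(1, 2), (3, 4), (5, 6)]
```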
Be careful with these cases:
- a list of dictionaries cannot be directly turned into a set
- dict.fromkeys() also needs hashable values
- "Apple" and "apple" are different unless you normalize them first
Example of normalization (note that the result contains the lowercased values, not the original spellings):
data = ["Apple", "apple", "BANANA", "banana"]
normalized_unique = list(dict.fromkeys(item.lower() for item in data))
print(normalized_unique)
Output:
['apple', 'banana']
If you are doing broader cleanup work, this may also help: Python data cleaning script example.
Common mistakes
Here are some common problems beginners run into.
Using set() and expecting the original order to stay the same
This will remove duplicates, but the order may change:
data = ["apple", "banana", "apple", "orange"]
print(list(set(data)))
If order matters, use:
print(list(dict.fromkeys(data)))
Trying to remove duplicates from a list of dictionaries with set()
This raises TypeError: unhashable type: 'dict', because set items must be hashable and dictionaries are not.
data = [{"name": "Alice"}, {"name": "Alice"}]
# set(data)  # TypeError: unhashable type: 'dict'
If you need to work with more complex data, you usually have to compare a specific field yourself with a loop.
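One way to do that is sketched below. It assumes every dictionary has a "name" key and that two rows count as duplicates when their names match; the sample rows and the name seen_names are hypothetical.

```python
data = [
    {"name": "Alice", "city": "Paris"},
    {"name": "Bob", "city": "Rome"},
    {"name": "Alice", "city": "Berlin"},
]

unique_data = []
seen_names = set()

for row in data:
    # Compare only the chosen field, not the whole dictionary
    name = row["name"]
    if name not in seen_names:
        seen_names.add(name)
        unique_data.append(row)

print(unique_data)
# [{'name': 'Alice', 'city': 'Paris'}, {'name': 'Bob', 'city': 'Rome'}]
```

The first row for each name is kept, so the second "Alice" row is dropped even though its city is different.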
Not normalizing values like "Apple" and "apple"
Python treats them as different strings:
data = ["Apple", "apple"]
print(list(dict.fromkeys(data)))
Output:
['Apple', 'apple']
Normalize first if needed:
data = ["Apple", "apple"]
print(list(dict.fromkeys(item.lower() for item in data)))
Output:
['apple']
Forgetting to convert the result back to a list after using set()
This:
data = ["apple", "banana", "apple"]
unique_data = set(data)
print(unique_data)
gives you a set, not a list.
If you need a list, write:
unique_data = list(set(data))
FAQ
What is the easiest way to remove duplicates from a Python list?
Use list(set(your_list)) if order does not matter. Use list(dict.fromkeys(your_list)) if you want to keep the original order.
Does set() keep the same order as the original list?
No. A set removes duplicates, but it does not guarantee the same order as the original list.
How do I remove duplicates and keep order in Python?
Use list(dict.fromkeys(data)). It keeps the first occurrence of each value.
Can I use set() on a list of dictionaries?
No. Dictionaries are mutable and therefore unhashable, so set() raises a TypeError when it is given a list of them.
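If two dictionaries should count as duplicates only when they are completely equal, the loop from Example 3 is one option, because checking membership in a list uses == and needs no hashing. This is a small sketch with made-up data; it can be slow on large lists, since every check scans the whole result list.

```python
data = [{"name": "Alice"}, {"name": "Alice"}, {"name": "Bob"}]
unique_data = []

for row in data:
    # "not in" on a list compares with ==, so dictionaries are fine here
    if row not in unique_data:
        unique_data.append(row)

print(unique_data)
# [{'name': 'Alice'}, {'name': 'Bob'}]
```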