Pandas Merge Explained in Simple English
Python tutor here. I recently had a fierce onslaught of new student consultations about various Pandas assignments where they had to manipulate data in weird, wacky ways. I found myself repeating the same explanation for a couple of common functions across the different assignments, and the most common was probably the pd.merge function. Here it is for the benefit of more people (and so I hopefully don’t have to repeat the same old explanation again!).
The pd.merge Function in a Nutshell
Long story short, the pd.merge function allows us to combine 2 DataFrames into one by matching rows on a shared column (the key). In the simplest case, that column has the same name in both DataFrames. Let’s dive into this with an example.
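Before getting to the pet data, here is a minimal sketch of the idea. The DataFrames left and right and the column names id, name and score are made up for illustration and are not part of the assignments mentioned above.

```python
import pandas as pd

# Hypothetical data: two small tables that share an "id" column.
left = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
right = pd.DataFrame({"id": [2, 3, 4], "score": [20, 30, 40]})

# Match rows on the shared "id" column.
# With the default how="inner", only ids present in BOTH tables are kept.
merged = pd.merge(left, right, on="id")
print(merged)
```

Only ids 2 and 3 appear in both tables, so the result has 2 rows, with the columns id, name and score side by side.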
Understanding Our Sample Data (Pet Ownership)
Let’s create 2 simple DataFrames to put things in perspective:
import pandas as pd

# Each row is one [owner, dog] pair
dogs = pd.DataFrame([
    ["owner1", "dog1"],
    ["owner2", "dog2"],
    ["owner3", "dog3"],
], columns=["owner", "dog"])
The dogs DataFrame here contains information about dog ownership: each row pairs one owner with one dog.
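If you want to check what this DataFrame looks like before merging anything, a quick sketch like the one below does the job. It simply rebuilds the same data and inspects it, and the shape and column names in the comments follow directly from the 3 rows and 2 columns defined above.

```python
import pandas as pd

# Same data as above: one [owner, dog] pair per row
dogs = pd.DataFrame([
    ["owner1", "dog1"],
    ["owner2", "dog2"],
    ["owner3", "dog3"],
], columns=["owner", "dog"])

print(dogs.shape)          # (3, 2): 3 rows, 2 columns
print(list(dogs.columns))  # ['owner', 'dog']
print(dogs)                # the full table
```

The owner column is the one to keep an eye on, since it is the natural candidate for the key when combining this with another DataFrame.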