Pandas List to DataFrame
If you want to convert a Python list to a DataFrame, there are many options to choose from. You can convert 1-dimensional lists, multidimensional lists, and more to a Pandas DataFrame, either with built-in functions such as zip() and from_records(), or by being a bit creative.
Today’s article will teach you how to convert a list to a DataFrame in Python and will show you 5 practical ways to do so.
Regarding library imports, you only need Pandas, so stick this line at the top of your Python script or Notebook:
import pandas as pd
1. Simple Python List to Pandas DataFrame Conversion
Okay, this is an obvious one, and you should use it whenever you want to make a Pandas DataFrame from a list, as long as the list is 1-dimensional.
The best way to explain is through an example. The following code snippet declares a list of names and creates a Pandas DataFrame from it:
employees = ["Bob", "Mark", "Jane", "Patrick"]
data = pd.DataFrame(employees)
data
The resulting DataFrame has only one column, and since no column name was given, the column gets the default integer label 0:
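A minimal sketch to confirm that default label and the resulting shape, reusing the same list (the variable name column_labels is introduced here only for illustration):

```python
import pandas as pd

employees = ["Bob", "Mark", "Jane", "Patrick"]
data = pd.DataFrame(employees)

# With no `columns` argument, the single column is labeled with the integer 0
column_labels = data.columns.tolist()
print(column_labels)  # [0]

# Four rows, one column
print(data.shape)     # (4, 1)
```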
Now, DataFrames that have only one column aren’t particularly interesting, but maybe that’s all you need. If that’s the case, you’ll certainly want to add a column name, so here’s how to do that:
employees = ["Bob", "Mark", "Jane", "Patrick"]
data = pd.DataFrame(employees, columns=["First Name"])
data
The resulting Pandas DataFrame is a bit easier to understand now:
Up next, let’s see how to convert multiple Python lists into a DataFrame using Python’s zip() function.
2. Python’s zip() Function to Convert List to DataFrame
Python’s zip() function is used to iterate over two or more iterables at the same time. The function takes iterables as arguments and returns an iterator of tuples, each of which contains the corresponding elements from every iterable.
The iterables should have the same length. Python won’t raise an exception if they don’t, but zip() stops at the end of the shortest input, so any extra elements in the longer iterables are silently dropped.
For example, if list a has 10 elements and list b has only 5, then zip() would yield only 5 tuples.
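The truncation is easy to see in a short sketch. The lists a and b below are made-up stand-ins for the example above, and itertools.zip_longest() from the standard library is shown only as one possible alternative when padding is preferred over dropping elements:

```python
from itertools import zip_longest

a = list(range(10))  # 10 elements
b = list("abcde")    # 5 elements

# zip() stops as soon as the shortest input is exhausted
pairs = list(zip(a, b))
print(len(pairs))    # 5

# zip_longest() keeps going and pads the shorter input with a fill value
padded = list(zip_longest(a, b, fillvalue=None))
print(len(padded))   # 10
print(padded[-1])    # (9, None)
```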
On the practical side, here’s how to use zip() to build a DataFrame from several lists. The following code snippet declares two lists and uses zip() to pair up their corresponding elements into a list of tuples. This is the way to go if you want to make a Pandas DataFrame from two lists:
e_first_names = ["Bob", "Mark", "Jane", "Patrick"]
e_last_names = ["Doe", "Markson", "Swift", "Johnson"]
data = pd.DataFrame(
    data=list(zip(e_first_names, e_last_names)),
    columns=["First Name", "Last Name"]
)
data
When wrapped into a Pandas DataFrame, this means each list will be a dedicated column:
But what if you have more than two lists? Well, zip() can take as many iterables as you want, meaning you can simply stick in the third one:
e_first_names = ["Bob", "Mark", "Jane", "Patrick"]
e_last_names = ["Doe", "Markson", "Swift", "Johnson"]
e_emails = ["[email protected]", "[email protected]", "[email protected]", "[email protected]"]
data = pd.DataFrame(
    data=list(zip(e_first_names, e_last_names, e_emails)),
    columns=["First Name", "Last Name", "Email"]
)
data
The resulting DataFrame also contains email information now:
Using zip() is a good start, but somewhat tedious if you have many lists/features. Let’s find a more practical and scalable way to convert a list to a DataFrame.
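One way to reduce the typing, if the lists are already collected in a container, is to unpack them into zip() with the * operator. The sketch below assumes two hypothetical helper variables, column_names and column_values, that are not part of the examples above:

```python
import pandas as pd

e_first_names = ["Bob", "Mark", "Jane", "Patrick"]
e_last_names = ["Doe", "Markson", "Swift", "Johnson"]

# Hypothetical containers: one name and one list of values per column
column_names = ["First Name", "Last Name"]
column_values = [e_first_names, e_last_names]

# zip(*column_values) is equivalent to zip(e_first_names, e_last_names)
rows = list(zip(*column_values))
data = pd.DataFrame(rows, columns=column_names)
print(data)
```

Adding another column then means appending to the two containers rather than editing the zip() call.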
3. Pandas List to DataFrame with Multidimensional Lists
Think of multidimensional lists as lists of lists, or a list that has other lists as child elements. In Pandas, this means you can make a DataFrame from a list of lists, which is an input data format you’ll often encounter, as it’s easy to read and understand.
Let’s now see in action how you can make a DataFrame from a list of lists. Our outer list will have lists as child elements, and each child list will contain one employee’s information: first name, last name, and email address. Each child list becomes one row of the DataFrame.
This multidimensional list is then used to construct a Pandas DataFrame:
employees = [
    ["Bob", "Doe", "[email protected]"],
    ["Mark", "Markson", "[email protected]"],
    ["Jane", "Swift", "[email protected]"],
    ["Patrick", "Johnson", "[email protected]"]
]
data = pd.DataFrame(
    data=employees,
    columns=["First Name", "Last Name", "Email"]
)
data
The DataFrame looks identical to the one from the previous section:
Neat, but what if your lists are inside a dictionary? Let’s cover that use case next.
4. Convert Dictionary of Lists to a Pandas DataFrame
Python dictionaries are used everywhere, but especially when working with Pandas. You can declare a dictionary that has a string for a key and a list as a value. When a dictionary in this format is used to construct a Pandas DataFrame, the dictionary keys become the column names, and each list supplies the values of its column, one per row.
This approach is by far the most common you’ll encounter, as it’s both easy to read and write:
employees = {
    "First Name": ["Bob", "Mark", "Jane", "Patrick"],
    "Last Name": ["Doe", "Markson", "Swift", "Johnson"],
    "Email": ["[email protected]", "[email protected]", "[email protected]", "[email protected]"]
}
data = pd.DataFrame(employees)
data
Note how you don’t need to specify the columns argument inside pd.DataFrame() since column names are inferred from dictionary keys.
The resulting DataFrame is something you’re used to seeing by now:
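As a small sanity check, the sketch below confirms that the column names come from the dictionary keys. It also shows that lists of different lengths are rejected with a ValueError; the uneven dictionary is a hypothetical input added for illustration:

```python
import pandas as pd

employees = {
    "First Name": ["Bob", "Mark", "Jane", "Patrick"],
    "Last Name": ["Doe", "Markson", "Swift", "Johnson"],
}
data = pd.DataFrame(employees)

# Column names are taken from the dictionary keys, in insertion order
print(list(data.columns))  # ['First Name', 'Last Name']

# Hypothetical input: the lists do not have the same length
uneven = {"First Name": ["Bob", "Mark"], "Last Name": ["Doe"]}
try:
    pd.DataFrame(uneven)
    raised = False
except ValueError as err:
    raised = True
    print(f"ValueError: {err}")
```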
Let’s cover another way to convert a list to a DataFrame in Python, and that is by using a built-in method from Pandas.
5. Pandas List to DataFrame with the Pandas from_records() Method
Pandas has a built-in method that allows you to convert a multidimensional Python list to a DataFrame. It’s named from_records(), and it is a class method of DataFrame, so you call it as pd.DataFrame.from_records().
You don’t have to use it, since you can accomplish the same by passing a list of lists to the data argument when constructing a new Pandas DataFrame. But still, here’s an example in code:
employees = [
    ["Bob", "Doe", "[email protected]"],
    ["Mark", "Markson", "[email protected]"],
    ["Jane", "Swift", "[email protected]"],
    ["Patrick", "Johnson", "[email protected]"]
]
data = pd.DataFrame.from_records(employees)
data
The resulting DataFrame is correct, at least data-wise, but has no column names:
You need to explicitly supply the column names if you don’t want to use the default integer labels (0, 1, 2):
employees = [
    ["Bob", "Doe", "[email protected]"],
    ["Mark", "Markson", "[email protected]"],
    ["Jane", "Swift", "[email protected]"],
    ["Patrick", "Johnson", "[email protected]"]
]
data = pd.DataFrame.from_records(employees, columns=["First Name", "Last Name", "Email"])
data
You’ve now fixed the column names issue:
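To back up the earlier point that from_records() and the plain constructor produce the same result for a list of lists, here is a hedged sketch that compares the two. The shortened rows list and the helper names via_records, via_constructor, and same are assumptions made for this illustration:

```python
import pandas as pd

# Shortened, made-up sample with two columns only
rows = [
    ["Bob", "Doe"],
    ["Mark", "Markson"],
    ["Jane", "Swift"],
]
cols = ["First Name", "Last Name"]

via_records = pd.DataFrame.from_records(rows, columns=cols)
via_constructor = pd.DataFrame(rows, columns=cols)

# Compare labels and cell values row by row
same = (
    list(via_records.columns) == list(via_constructor.columns)
    and via_records.to_dict("records") == via_constructor.to_dict("records")
)
print(same)  # True
```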
And that’s five ways to convert a list to a DataFrame in Python. Let’s make a short recap next.
Summing up Pandas List to DataFrame
You’ll often have to convert various data structures to Pandas DataFrames when working as a data analyst. Good thing for you - Python lists are the most common ones.
This article showed you five ways to convert a Python list to a Pandas DataFrame, regardless of the list’s dimensionality. A one-dimensional list translates to a single-column Pandas DataFrame, and a two-dimensional list translates to a DataFrame with one row per child list and as many columns as each child list has elements, assuming all child lists have the same length. Easy!
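The shape rule can be checked with a quick sketch. The ragged list is a hypothetical edge case not covered in the article; in that case pandas sizes the frame to the longest child list and marks the gaps as missing:

```python
import pandas as pd

one_d = ["Bob", "Mark", "Jane", "Patrick"]
two_d = [["Bob", "Doe"], ["Mark", "Markson"], ["Jane", "Swift"]]

shape_1d = pd.DataFrame(one_d).shape
shape_2d = pd.DataFrame(two_d).shape
print(shape_1d)  # (4, 1): one column, one row per element
print(shape_2d)  # (3, 2): one row per child list

# Hypothetical edge case: child lists of different lengths
ragged = [["Bob", "Doe"], ["Mark"]]
ragged_df = pd.DataFrame(ragged)
print(ragged_df.shape)              # (2, 2)
print(ragged_df.isna().iloc[1, 1])  # True: the missing cell is flagged as missing
```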
Stay tuned to the Practical Pandas website because next, we’ll explore how to convert Python dictionaries to Pandas DataFrames.