How do you iterate data in a dataframe in python?

Pandas Iterate Over Rows – 5 Methods

Folks come to me and often say, “I have a Pandas DataFrame and I want to iterate over rows.” My first response is, are you sure? Ok, fine, let’s continue.

Depending on your situation, you have a menu of methods to choose from, each with its own performance and usability tradeoffs. Here are the methods in recommended order:

  • DataFrame.apply()
  • DataFrame.iterrows()
  • DataFrame.itertuples()
  • Convert DataFrame to Dictionary
  • DataFrame.iloc

Pseudo code: Go through each one of my DataFrame’s rows and do something with row data

Warning: Iterating through pandas objects is slow. In many cases, iterating manually over the rows is not needed.
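
To make the warning concrete, here is a minimal sketch of what "not needed" can look like, using a few rows of the restaurant data from the walkthrough further down. The 20% tip and the WithTip column name are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [("Foreign Cinema", "Restaurant", 289.0),
     ("Liho Liho", "Restaurant", 224.0),
     ("500 Club", "bar", 80.5)],
    columns=("name", "type", "AvgBill"),
)

# Column arithmetic runs on the whole column at once; no Python-level loop.
df["WithTip"] = df["AvgBill"] * 1.2  # hypothetical 20% tip

# A boolean mask replaces an "if" statement inside a loop.
bars = df[df["type"] == "bar"]

print(df)
print(bars["name"].tolist())
```

If the thing you want to do per row can be written as column arithmetic or a filter like this, you can usually skip iteration entirely.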

Pandas Iterate Over Rows – Priority Order

DataFrame.apply()

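DataFrame.apply() is our first choice for iterating through rows. apply() applies a function along a specific axis (rows/columns) of a DataFrame. It’s quick and efficient – .apply() takes advantage of internal optimizations and uses Cython iterators. Keep in mind that with axis=1 your function is still called once per row, so it is a convenient way to loop rather than a replacement for vectorized operations.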

DataFrame.iterrows()

iterrows() is a generator that iterates over the rows of your DataFrame and returns 1. the index of the row and 2. an object containing the row itself. Think of this function as going through each row, generating a series, and returning it back to you.

That’s a lot of compute on the backend you don’t see.
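
One visible side effect of building a Series for every row is that the row has a single dtype, so mixed numeric columns get upcast. A small sketch with a made-up two-column frame (the visits and rating column names are illustrative only):

```python
import pandas as pd

df_small = pd.DataFrame({"visits": [3, 5], "rating": [4.5, 3.0]})

# In the DataFrame, visits is an integer column and rating is a float column.
print(df_small.dtypes)

for index, row in df_small.iterrows():
    # Each row is one Series holding both values, so the integer is upcast to float.
    print(index, row["visits"], type(row["visits"]))
```

The row is also a copy, so assigning to it inside the loop should not be relied on to change the original DataFrame.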

DataFrame.itertuples()

DataFrame.itertuples() is a cousin of .iterrows() but instead of returning a Series, .itertuples() will return…you guessed it, a tuple. In this case, it’ll be a named tuple. A named tuple is a data type from Python’s collections module that acts like a tuple, but you can also look up its fields by name.

pandas builds the named tuple for you, so there is nothing extra to import. Still, many people prefer to work with the Series they get from .iterrows() or .apply().

Convert your DataFrame To A Dictionary

Not the most elegant, but you can convert your DataFrame to a dictionary. Then iterate over your new dictionary. This won’t give you any special pandas functionality, but it’ll get the job done.

This is the reverse direction of Pandas DataFrame From Dict

DataFrame.iloc[]

As a last resort, you could also simply run a for loop and call the row of your DataFrame one by one. This method is not recommended because it is slow.

You’re holding yourself back by using this method. Try to push yourself to learn one of the methods above.

This is the equivalent of having 20 items on your grocery list, going to the store, but limiting yourself to 1 item per store visit. Get your walking shoes on.

Pandas Iterate Over Rows

So you want to iterate over your pandas DataFrame rows? Well, you think you want to.

Let's run through 5 examples, in the same recommended order as above (in practice, itertuples() is usually faster than iterrows()):

  1. DataFrame.apply()
  2. DataFrame.iterrows()
  3. DataFrame.itertuples()
  4. Convert DataFrame to Dictionary
  5. Last resort - DataFrame.iloc

First, let's create a DataFrame

In [48]:

import pandas as pd

df = pd.DataFrame([('Foreign Cinema', 'Restaurant', 289.0),
                   ('Liho Liho', 'Restaurant', 224.0),
                   ('500 Club', 'bar', 80.5),
                   ('The Square', 'bar', 25.30),
                   ('The Lighthouse', 'bar', 15.30),
                   ("Al's Place", 'Restaurant', 456.53)],
                  columns=('name', 'type', 'AvgBill'))
df

Out[48]:

             name        type  AvgBill
0  Foreign Cinema  Restaurant   289.00
1       Liho Liho  Restaurant   224.00
2        500 Club         bar    80.50
3      The Square         bar    25.30
4  The Lighthouse         bar    15.30
5      Al's Place  Restaurant   456.53

1. DataFrame.apply()

We are first going to use pandas apply. This will run through each row and apply a function for us. I'll use a quick lambda function for this example. Make sure you set axis=1 to go through rows.

In [55]:

# Printing Name and AvgBill. In this case, "x" is a series with index of column names
df.apply(lambda x: print ("{} - {}".format(x['name'], x['AvgBill'])), axis=1)

Foreign Cinema - 289.0
Liho Liho - 224.0
500 Club - 80.5
The Square - 25.3
The Lighthouse - 15.3
Al's Place - 456.53

Out[55]:

0    None
1    None
2    None
3    None
4    None
5    None
dtype: object
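
The column of None values shows up because print() returns None. When the function returns something instead, apply() collects the results into a Series, which is handy for building a new column. A minimal sketch on a two-row version of the frame; the describe helper and the label column are made-up names for illustration:

```python
import pandas as pd

df = pd.DataFrame(
    [("Foreign Cinema", "Restaurant", 289.0),
     ("500 Club", "bar", 80.5)],
    columns=("name", "type", "AvgBill"),
)

def describe(row):
    # "row" is a Series indexed by column name, just like "x" in the lambda.
    return "{} ({}): ${:.2f}".format(row["name"], row["type"], row["AvgBill"])

# One return value per row becomes one entry in the resulting Series.
df["label"] = df.apply(describe, axis=1)
print(df["label"].tolist())
```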

2. DataFrame.iterrows()

Next we are going to head over to .iter-land. We are starting with iterrows().

In [50]:

for index, contents in df.iterrows():
    print ("Index: {}".format(index))
    print ("{} - {}".format(contents['name'], contents['AvgBill']))
    print ()

Index: 0
Foreign Cinema - 289.0

Index: 1
Liho Liho - 224.0

Index: 2
500 Club - 80.5

Index: 3
The Square - 25.3

Index: 4
The Lighthouse - 15.3

Index: 5
Al's Place - 456.53

3. DataFrame.itertuples()

Next, head over to itertuples. This will return a named tuple - a regular tuple, but you're able to reference data points by name.

In [51]:

for row in df.itertuples():
    print (row)
    print (row.name)
    print ()
    

Pandas(Index=0, name='Foreign Cinema', type='Restaurant', AvgBill=289.0)
Foreign Cinema

Pandas(Index=1, name='Liho Liho', type='Restaurant', AvgBill=224.0)
Liho Liho

Pandas(Index=2, name='500 Club', type='bar', AvgBill=80.5)
500 Club

Pandas(Index=3, name='The Square', type='bar', AvgBill=25.3)
The Square

Pandas(Index=4, name='The Lighthouse', type='bar', AvgBill=15.3)
The Lighthouse

Pandas(Index=5, name="Al's Place", type='Restaurant', AvgBill=456.53)
Al's Place
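
itertuples() also takes two optional parameters, index and name, that control what the tuples look like. A short sketch on a two-row version of the frame; the class name "Place" is an arbitrary choice for illustration:

```python
import pandas as pd

df = pd.DataFrame(
    [("Foreign Cinema", "Restaurant", 289.0),
     ("500 Club", "bar", 80.5)],
    columns=("name", "type", "AvgBill"),
)

# index=False drops the Index field; name= sets the named tuple's class name.
for place in df.itertuples(index=False, name="Place"):
    print(place.name, place.AvgBill)

first = next(df.itertuples(index=False, name="Place"))
print(first)
print(first[0])  # positional access still works, like a plain tuple
```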

4. Convert DataFrame to Dictionary

Now we are getting down into the desperate zone. If you really wanted to (without much reason), you can convert your DataFrame to a dictionary first and then iterate through.

In [52]:

df_dict = df.to_dict(orient='index')
df_dict

Out[52]:

{0: {'name': 'Foreign Cinema', 'type': 'Restaurant', 'AvgBill': 289.0},
 1: {'name': 'Liho Liho', 'type': 'Restaurant', 'AvgBill': 224.0},
 2: {'name': '500 Club', 'type': 'bar', 'AvgBill': 80.5},
 3: {'name': 'The Square', 'type': 'bar', 'AvgBill': 25.3},
 4: {'name': 'The Lighthouse', 'type': 'bar', 'AvgBill': 15.3},
 5: {'name': "Al's Place", 'type': 'Restaurant', 'AvgBill': 456.53}}

In [53]:

for key in df_dict:
    print (key)
    print (df_dict[key])
    print (df_dict[key]['name'])
    print ()

0
{'name': 'Foreign Cinema', 'type': 'Restaurant', 'AvgBill': 289.0}
Foreign Cinema

1
{'name': 'Liho Liho', 'type': 'Restaurant', 'AvgBill': 224.0}
Liho Liho

2
{'name': '500 Club', 'type': 'bar', 'AvgBill': 80.5}
500 Club

3
{'name': 'The Square', 'type': 'bar', 'AvgBill': 25.3}
The Square

4
{'name': 'The Lighthouse', 'type': 'bar', 'AvgBill': 15.3}
The Lighthouse

5
{'name': "Al's Place", 'type': 'Restaurant', 'AvgBill': 456.53}
Al's Place
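
If you don't need the index keys, orient='records' gives you a plain list with one dictionary per row, which is a little simpler to loop over. A sketch on a two-row version of the frame:

```python
import pandas as pd

df = pd.DataFrame(
    [("Foreign Cinema", "Restaurant", 289.0),
     ("500 Club", "bar", 80.5)],
    columns=("name", "type", "AvgBill"),
)

# One dict per row, keyed by column name; the index is not included.
records = df.to_dict(orient="records")

for record in records:
    print(record["name"], record["AvgBill"])
```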

5. Last resort - DataFrame.iloc

I didn't even want to put this one on here. I don't want to give you ideas. This method is crude and slow. I bet you $5 of AWS credit there is a faster way.

As a last resort, you can iterate through your DataFrame by looping over a range of row positions, and then calling each of your DataFrame rows individually with .iloc.

In [54]:

for i in range(len(df)):
    print (df.iloc[i])
    print ()
    print ("Name: {}".format(df.iloc[i]['name']))
    print ("\n")

name       Foreign Cinema
type           Restaurant
AvgBill               289
Name: 0, dtype: object

Name: Foreign Cinema


name        Liho Liho
type       Restaurant
AvgBill           224
Name: 1, dtype: object

Name: Liho Liho


name       500 Club
type            bar
AvgBill        80.5
Name: 2, dtype: object

Name: 500 Club


name       The Square
type              bar
AvgBill          25.3
Name: 3, dtype: object

Name: The Square


name       The Lighthouse
type                  bar
AvgBill              15.3
Name: 4, dtype: object

Name: The Lighthouse


name       Al's Place
type       Restaurant
AvgBill        456.53
Name: 5, dtype: object

Name: Al's Place
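
If you want to check the speed differences on your own machine, here is a rough timing sketch using the standard library's timeit. The frame is synthetic (made-up column names and size) and the helper function names are hypothetical. Timings depend on your machine and pandas version, so none are shown here.

```python
import timeit

import pandas as pd

# Synthetic frame, used only for timing.
df_big = pd.DataFrame({"a": range(1000), "b": range(1000)})

def with_iloc():
    return sum(df_big.iloc[i]["a"] + df_big.iloc[i]["b"] for i in range(len(df_big)))

def with_iterrows():
    return sum(row["a"] + row["b"] for _, row in df_big.iterrows())

def with_itertuples():
    return sum(row.a + row.b for row in df_big.itertuples(index=False))

def with_apply():
    return df_big.apply(lambda row: row["a"] + row["b"], axis=1).sum()

def vectorized():
    return (df_big["a"] + df_big["b"]).sum()

# Every function computes the same total; only the way of visiting rows differs.
for func in (with_iloc, with_iterrows, with_itertuples, with_apply, vectorized):
    seconds = timeit.timeit(func, number=3)
    print("{:<16} {:.4f} s".format(func.__name__, seconds))
```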


Can you iterate through a DataFrame in Python?

Yes. To iterate over the rows of a DataFrame, you can use either of the built-in methods iterrows() or itertuples().

How do I iterate a pandas DataFrame in Python?

Method 2: Using the loc[] indexer of the DataFrame.
Method 3: Using the iloc[] indexer of the DataFrame.
Method 4: Using the iterrows() method of the DataFrame.
Method 6: Using the apply() method of the DataFrame.

How do you iterate data in Python?

Python has no C-style loop syntax such as for (i = 0; i < n; i++); you use for ... in ... instead. A for loop can iterate over a list, string, tuple, set, array, or DataFrame. Given a list of elements, a for loop visits each item in turn and runs the loop body for it.
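
As a quick sketch of the two loop styles on a plain list (the names list is just sample data):

```python
names = ["Foreign Cinema", "Liho Liho", "500 Club"]

# Closest equivalent of a C-style counting loop.
for i in range(len(names)):
    print(i, names[i])

# Usually preferred: iterate over the items directly; enumerate adds the counter.
for i, name in enumerate(names):
    print(i, name)
```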

How do you iterate through a DataFrame column in Python?

One simple way to iterate over the columns of a pandas DataFrame is a for loop. Looping over the DataFrame yields its column labels, and you can pass each label to the get-item syntax ([]) to retrieve that column. The .values attribute extracts a column's elements as a NumPy array; call .tolist() if you want a plain list.
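
A minimal sketch of column iteration on a two-row version of the restaurant frame, showing both the plain for loop and the items() method:

```python
import pandas as pd

df = pd.DataFrame(
    [("Foreign Cinema", "Restaurant", 289.0),
     ("500 Club", "bar", 80.5)],
    columns=("name", "type", "AvgBill"),
)

# Looping over the DataFrame itself yields the column labels.
for label in df:
    print(label, df[label].tolist())

# items() yields (label, Series) pairs directly.
for label, column in df.items():
    print(label, column.dtype)
```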