The cumsum[] function can be used to calculate the cumulative sum of values in a column of a pandas DataFrame.
You can use the following syntax to calculate a reversed cumulative sum of values in a column:
df['cumsum_reverse'] = df.loc[::-1, 'my_column'].cumsum[][::-1]
This particular syntax adds a new column called cumsum_reverse to a pandas DataFrame that shows the reversed cumulative sum of values in the column titled my_column.
The following example shows how to use this syntax in practice.
Example: Calculate a Reversed Cumulative Sum in Pandas
Suppose we have the following pandas DataFrame that shows the total sales made by some store during 10 consecutive days:
import pandas as pd #create DataFrame df = pd.DataFrame[{'day': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 'sales': [3, 6, 0, 2, 4, 1, 0, 1, 4, 7]}] #view DataFrame df day sales 0 1 3 1 2 6 2 3 0 3 4 2 4 5 4 5 6 1 6 7 0 7 8 1 8 9 4 9 10 7
We can use the following syntax to calculate a reversed cumulative sum of the sales column:
#add new column that shows reverse cumulative sum of sales
df['cumsum_reverse_sales'] = df.loc[::-1, 'sales'].cumsum[][::-1]
#view updated DataFrame
df
day sales cumsum_reverse_sales
0 1 3 28
1 2 6 25
2 3 0 19
3 4 2 19
4 5 4 17
5 6 1 13
6 7 0 12
7 8 1 12
8 9 4 11
9 10 7 7
The new column titled cumsum_reverse_sales shows the cumulative sales starting from the last row.
Here’s how we would interpret the values in the cumsum_reverse_sales column:
- The cumulative sum of sales for day 10 is 7.
- The cumulative sum of sales for day 10 and day 9 is 11.
- The cumulative sum of sales for day 10, day 9, and day 8 is 12.
- The cumulative sum of sales for day 10, day 9, day 8, and day 7 is 12.
And so on.
Additional Resources
The following tutorials explain how to perform other common tasks in pandas:
How to Sum Specific Columns in Pandas
How to Perform a GroupBy Sum in Pandas
How to Sum Columns Based on a Condition in Pandas
Overview
The
cummsum[]
function of a DataFrame object is used to obtain the cumulative sum over its axis.
Note: Axis here simply represents the row and column of the DataFrame. An axis with a value of
0
indicates the axes running vertically downwards across a row, while a value of1
indicates the axes running horizontally across a column.
Syntax
DataFrame.cumsum[axis=None, skipna=True, *args, **kwargs]
Syntax for the cumsum[] function in pandas
Parameters
axis
: This represents the name for the row [ designated as0
or'index'
] or the column [designated as1
orcolumns
] axis.skipna
: This takes a boolean value indicating if null values are to be excluded or not. This is an optional parameter.args
,**kwargs
: These keywords have no effect but may be accepted for compatibility with NumPy. These are optional.
Return value
This function returns a Series or DataFrame object showing the cumulative maximum in the axis.
Example
# A code to illustrate the cumsum[] function in Pandas
# importing the pandas library
import pandas as pd
# creating a dataframe
df = pd.DataFrame[[[5,10,4,15,3],
[1,7,5,9,0.5],
[3,11,13,14,12]],
columns=list['ABCDE']]
# printing the dataframe
print[df]
# obtaining the cummulative sum vertically across rows
print[df.cumsum[axis="index"]]
# obtaining the cummulative sum horizontally over columns
print[df.cumsum[axis="columns"]]
Syntax for the cumusum[] function
Explanation
- Line 4: We import the
pandas
library. - Lines 7–10: We create a
DataFrame,
df
. - Line 12: We print the DataFrame,
df
. - Line 15: We use the
cumsum[]
function to obtain the cumulative maximum values running downwards across the rows [axis0
]. We print the result to the console. - Line 18: We use the
cumsum[]
function to obtain the cumulative maximum values running horizontally across columns [axis1
]. We print the result to the console.
CONTRIBUTOR