How to Use DeepDiff in Python

Get a Meaningful Assertion Error When Comparing Two Python Objects

Motivation

When comparing two Python objects, you might not want the test to focus on some trivial differences such as the order of the values in a list.

For example, you might want the test to consider [3, 2] to be the same as [2, 3].

However, a plain equality assertion fails when the order is different.

Is there a way that you can ignore the order when comparing two Python objects?

That is when DeepDiff comes in handy. In this article, you will learn how to use DeepDiff to leave certain parts of Python objects out of a comparison.

What is DeepDiff?

DeepDiff is a Python library that recursively looks for all the changes in dictionaries, iterables, strings, and other objects.

To install DeepDiff, type:

pip install deepdiff

Descriptive Error

When comparing two different Python objects using assert, you might get an error similar to the one below:

AssertionError: assert {'apple': 2, 'banana': [3, 2, 2], 'orange': 3} == {'apple': 2, 'banana': [3, 2], 'orange': 3}

This assertion error is not very informative since we don’t know the exact elements that make these two dictionaries different.

With DeepDiff, we can see a more descriptive error showing what the differences are and where the differences occur.

From the error above, we know three things:

  • One item in price1 is removed in price2.
  • The removed item is at price1['banana'][2].
  • The value of the removed item is 2.

Tree View

The default view of DeepDiff is "text". To switch to the tree view, use view="tree". In the tree view, you can traverse the tree and see which items were compared to each other.

Ignore Order

As shown at the beginning of the article, you can ignore the order using ignore_order=True:

The output {} shows that there is no difference between the two Python objects.

By default, ignore_order=True also ignores duplicates:

Ignore Small Difference Between Two Numbers

Ignore up to a Certain Digit

It can be annoying to see an assertion error when two numbers are very close to each other. If you want to ignore the difference between two numbers after a particular digit, use significant_digits, which sets how many digits after the decimal point are compared.

With significant_digits=2, for example, only the first two digits after the decimal point are compared.

Ignore the Difference Below an Epsilon

Sometimes, two numbers are very close, but they don’t share the same leading digits. To ignore small differences between two numbers, use math_epsilon.

For example, math_epsilon=0.001 tells DeepDiff to ignore any difference smaller than 0.001. Since the difference between 1 and 0.9999 is 0.0001, the difference is ignored.

Ignore String Case

If you want to ignore string case (i.e., treat “Red” and “red” as equal), use ignore_string_case=True.

Ignore NaN Inequality

If you have worked with NaNs, you might know that NaN values in Python do not compare equal to each other:
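
This can be checked with the standard library alone. The sketch below uses float("nan"); NaNs coming from other libraries are not shown here:

```python
import math

a = float("nan")
b = float("nan")

print(a == b)  # NaN is never equal to another NaN
print(a == a)  # NaN is not even equal to itself
print(math.isnan(a) and math.isnan(b))  # both are nevertheless NaN
```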

Thus, it can be confusing to compare objects that contain NaNs. (Isn’t [nan, 1, 2] equal to [nan, 1, 2]?)

To treat NaNs as equal to each other, use ignore_nan_inequality=True:

Exclude Types

Sometimes, you might not care whether values of certain types change. To exclude certain data types from the comparison, use exclude_types:

Ignore Numeric Type

DeepDiff reports a difference between 2 and 2.0 because 2 is an integer and 2.0 is a float, even though 2 == 2.0 is True in plain Python. You can ignore variation in numeric type using ignore_numeric_type_changes=True.

Truncate Datetime

When comparing two datetime objects, you might only want to check that they match up to a certain precision (for example, the same hour, regardless of the minutes).

You can specify how precisely DeepDiff should compare two datetime objects using truncate_datetime, which accepts 'second', 'minute', 'hour', or 'day'.

Ignore Path

If you want to exclude certain paths from the report, use exclude_paths:

Exclude Regex Path

If you want to ignore multiple paths with a certain pattern, use exclude_regex_paths.

For example, to avoid comparing aldi[0]['color'] with walmart[0]['color'] and aldi[1]['color'] with walmart[1]['color'], we simply ignore the paths specified by the regular expression root\[\d+\]\['color'\], where \d+ stands for one or more digits.

Check this cheatsheet if you are not familiar with regular expressions.

Use DeepDiff with pytest

To use DeepDiff with pytest, write assert not DeepDiff(...). An empty report is falsy, so this asserts that there is no difference between the two Python objects.

Conclusion

Congratulations! You have just learned how to ignore certain elements when comparing two Python objects using DeepDiff. I hope this tool will make it easier for you to write tests and debug your code.

Feel free to play with and fork the source code of this article here:
