Fixed Delta serialization when None type is present. #225

Merged: 3 commits, Jan 15, 2021
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -1,5 +1,6 @@
 DeepDiff Change log

+- v5-2-2: Fixed Delta serialization when None type is present.
 - v5-2-0: Removed Murmur3 as the preferred hashing method. Using SHA256 by default now. Added commandline for deepdiff. Added group_by. Added math_epsilon. Improved ignoring of NoneType.
 - v5-0-2: Bug Fix NoneType in ignore type groups https://github.com/seperman/deepdiff/issues/207
 - v5-0-1: Bug fix to not apply format to non numbers.
26 changes: 13 additions & 13 deletions README.md
@@ -1,4 +1,4 @@
-# DeepDiff v 5.2.1
+# DeepDiff v 5.2.2

 ![Downloads](https://img.shields.io/pypi/dm/deepdiff.svg?style=flat)
 ![Python Versions](https://img.shields.io/pypi/pyversions/deepdiff.svg?style=flat)
@@ -18,7 +18,7 @@ Tested on Python 3.6+ and PyPy3.

 **NOTE: The last version of DeepDiff to work on Python 3.5 was DeepDiff 5-0-2**

-- [Documentation](https://zepworks.com/deepdiff/5.2.1/)
+- [Documentation](https://zepworks.com/deepdiff/5.2.2/)


 ## Installation
@@ -54,13 +54,13 @@ Note: if you want to use DeepDiff via commandline, make sure to run `pip install

 DeepDiff gets the difference of 2 objects.

-> - Please take a look at the [DeepDiff docs](https://zepworks.com/deepdiff/5.2.1/diff.html)
-> - The full documentation of all modules can be found on <https://zepworks.com/deepdiff/5.2.1/>
+> - Please take a look at the [DeepDiff docs](https://zepworks.com/deepdiff/5.2.2/diff.html)
+> - The full documentation of all modules can be found on <https://zepworks.com/deepdiff/5.2.2/>
 > - Tutorials and posts about DeepDiff can be found on <https://zepworks.com/tags/deepdiff/>

 ## A few Examples

-> Note: This is just a brief overview of what DeepDiff can do. Please visit <https://zepworks.com/deepdiff/5.2.1/> for full documentation.
+> Note: This is just a brief overview of what DeepDiff can do. Please visit <https://zepworks.com/deepdiff/5.2.2/> for full documentation.

 ### List difference ignoring order or duplicates

@@ -264,8 +264,8 @@ Example:
 ```


-> - Please take a look at the [DeepDiff docs](https://zepworks.com/deepdiff/5.2.1/diff.html)
-> - The full documentation can be found on <https://zepworks.com/deepdiff/5.2.1/>
+> - Please take a look at the [DeepDiff docs](https://zepworks.com/deepdiff/5.2.2/diff.html)
+> - The full documentation can be found on <https://zepworks.com/deepdiff/5.2.2/>


 # Deep Search
@@ -297,17 +297,17 @@ And you can pass all the same kwargs as DeepSearch to grep too:
 {'matched_paths': {"root['somewhere']": 'around'}, 'matched_values': {"root['long']": 'somewhere'}}
 ```

-> - Please take a look at the [DeepSearch docs](https://zepworks.com/deepdiff/5.2.1/dsearch.html)
-> - The full documentation can be found on <https://zepworks.com/deepdiff/5.2.1/>
+> - Please take a look at the [DeepSearch docs](https://zepworks.com/deepdiff/5.2.2/dsearch.html)
+> - The full documentation can be found on <https://zepworks.com/deepdiff/5.2.2/>

 # Deep Hash
 (New in v4-0-0)

 DeepHash is designed to give you hash of ANY python object based on its contents even if the object is not considered hashable!
 DeepHash is supposed to be deterministic in order to make sure 2 objects that contain the same data, produce the same hash.

-> - Please take a look at the [DeepHash docs](https://zepworks.com/deepdiff/5.2.1/deephash.html)
-> - The full documentation can be found on <https://zepworks.com/deepdiff/5.2.1/>
+> - Please take a look at the [DeepHash docs](https://zepworks.com/deepdiff/5.2.2/deephash.html)
+> - The full documentation can be found on <https://zepworks.com/deepdiff/5.2.2/>

 Let's say you have a dictionary object.

@@ -355,8 +355,8 @@ Which you can write as:
 At first it might seem weird why DeepHash(obj)[obj] but remember that DeepHash(obj) is a dictionary of hashes of all other objects that obj contains too.


-> - Please take a look at the [DeepHash docs](https://zepworks.com/deepdiff/5.2.1/deephash.html)
-> - The full documentation can be found on <https://zepworks.com/deepdiff/5.2.1/>
+> - Please take a look at the [DeepHash docs](https://zepworks.com/deepdiff/5.2.2/deephash.html)
+> - The full documentation can be found on <https://zepworks.com/deepdiff/5.2.2/>


 # Using DeepDiff in unit tests
2 changes: 1 addition & 1 deletion deepdiff/__init__.py
@@ -1,6 +1,6 @@
 """This module offers the DeepDiff, DeepSearch, grep, Delta and DeepHash classes."""
 # flake8: noqa
-__version__ = '5.2.1'
+__version__ = '5.2.2'
 import logging

 if __name__ == '__main__':
51 changes: 40 additions & 11 deletions deepdiff/commands.py
@@ -48,6 +48,7 @@ def cli():
 @click.option('--significant-digits', required=False, default=None, type=int, show_default=True)
 @click.option('--truncate-datetime', required=False, type=click.Choice(['second', 'minute', 'hour', 'day'], case_sensitive=True), show_default=True, default=None)
 @click.option('--verbose-level', required=False, default=1, type=click.IntRange(0, 2), show_default=True)
+@click.option('--debug', is_flag=True, show_default=False)
 def diff(
     *args, **kwargs
 ):
@@ -59,6 +60,7 @@ def diff(

     T1 and T2 are the path to the files to be compared with each other.
     """
+    debug = kwargs.pop('debug')
     kwargs['ignore_private_variables'] = not kwargs.pop('include_private_variables')
     kwargs['progress_logger'] = logger.info if kwargs['progress_logger'] == 'info' else logger.error
     create_patch = kwargs.pop('create_patch')
@@ -71,7 +73,10 @@ def diff(
         try:
             kwargs[name] = load_path_content(t_path, file_type=t_extension)
         except Exception as e:  # pragma: no cover.
-            sys.exit(str(f"Error when loading {name}: {e}"))  # pragma: no cover.
+            if debug:  # pragma: no cover.
+                raise  # pragma: no cover.
+            else:  # pragma: no cover.
+                sys.exit(str(f"Error when loading {name}: {e}"))  # pragma: no cover.

     # if (t1_extension != t2_extension):
     if t1_extension in {'csv', 'tsv'}:
@@ -92,7 +97,10 @@ def diff(
         try:
             delta = Delta(diff)
         except Exception as e:  # pragma: no cover.
-            sys.exit(f"Error when loading the patch (aka delta): {e}")  # pragma: no cover.
+            if debug:  # pragma: no cover.
+                raise  # pragma: no cover.
+            else:  # pragma: no cover.
+                sys.exit(f"Error when loading the patch (aka delta): {e}")  # pragma: no cover.

         # printing into stdout
         sys.stdout.buffer.write(delta.dumps())
@@ -105,8 +113,9 @@ def diff(
 @click.argument('delta_path', type=click.Path(exists=True, resolve_path=True))
 @click.option('--backup', '-b', is_flag=True, show_default=True)
 @click.option('--raise-errors', is_flag=True, show_default=True)
+@click.option('--debug', is_flag=True, show_default=False)
 def patch(
-    path, delta_path, backup, raise_errors
+    path, delta_path, backup, raise_errors, debug
 ):
     """
     Deep Patch Commandline
@@ -123,7 +132,10 @@ def patch(
     try:
         delta = Delta(delta_path=delta_path, raise_errors=raise_errors)
     except Exception as e:  # pragma: no cover.
-        sys.exit(str(f"Error when loading the patch (aka delta) {delta_path}: {e}"))  # pragma: no cover.
+        if debug:  # pragma: no cover.
+            raise  # pragma: no cover.
+        else:  # pragma: no cover.
+            sys.exit(str(f"Error when loading the patch (aka delta) {delta_path}: {e}"))  # pragma: no cover.

     extension = path.split('.')[-1]

@@ -137,7 +149,10 @@ def patch(
     try:
         save_content_to_path(result, path, file_type=extension, keep_backup=backup)
     except Exception as e:  # pragma: no cover.
-        sys.exit(str(f"Error when saving {path}: {e}"))  # pragma: no cover.
+        if debug:  # pragma: no cover.
+            raise  # pragma: no cover.
+        else:  # pragma: no cover.
+            sys.exit(str(f"Error when saving {path}: {e}"))  # pragma: no cover.


 @cli.command()
@@ -148,7 +163,8 @@ def patch(
 @click.option('--exclude-paths', required=False, type=str, show_default=False, multiple=True)
 @click.option('--exclude-regex-paths', required=False, type=str, show_default=False, multiple=True)
 @click.option('--verbose-level', required=False, default=1, type=click.IntRange(0, 2), show_default=True)
-def grep(item, path, **kwargs):
+@click.option('--debug', is_flag=True, show_default=False)
+def grep(item, path, debug, **kwargs):
     """
     Deep Grep Commandline

@@ -162,19 +178,26 @@ def grep(item, path, **kwargs):
     try:
         content = load_path_content(path)
     except Exception as e:  # pragma: no cover.
-        sys.exit(str(f"Error when loading {path}: {e}"))  # pragma: no cover.
+        if debug:  # pragma: no cover.
+            raise  # pragma: no cover.
+        else:  # pragma: no cover.
+            sys.exit(str(f"Error when loading {path}: {e}"))  # pragma: no cover.

     try:
         result = DeepSearch(content, item, **kwargs)
     except Exception as e:  # pragma: no cover.
-        sys.exit(str(f"Error when running deep search on {path}: {e}"))  # pragma: no cover.
+        if debug:  # pragma: no cover.
+            raise  # pragma: no cover.
+        else:  # pragma: no cover.
+            sys.exit(str(f"Error when running deep search on {path}: {e}"))  # pragma: no cover.
     pprint(result, indent=2)


 @cli.command()
 @click.argument('path_inside', required=True, type=str)
 @click.argument('path', type=click.Path(exists=True, resolve_path=True))
-def extract(path_inside, path):
+@click.option('--debug', is_flag=True, show_default=False)
+def extract(path_inside, path, debug):
     """
     Deep Extract Commandline

@@ -185,10 +208,16 @@ def extract(path_inside, path):
     try:
         content = load_path_content(path)
     except Exception as e:  # pragma: no cover.
-        sys.exit(str(f"Error when loading {path}: {e}"))  # pragma: no cover.
+        if debug:  # pragma: no cover.
+            raise  # pragma: no cover.
+        else:  # pragma: no cover.
+            sys.exit(str(f"Error when loading {path}: {e}"))  # pragma: no cover.

     try:
         result = deep_extract(content, path_inside)
     except Exception as e:  # pragma: no cover.
-        sys.exit(str(f"Error when running deep search on {path}: {e}"))  # pragma: no cover.
+        if debug:  # pragma: no cover.
+            raise  # pragma: no cover.
+        else:  # pragma: no cover.
+            sys.exit(str(f"Error when running deep search on {path}: {e}"))  # pragma: no cover.
     pprint(result, indent=2)
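
The `--debug` handling repeated in each command of `commands.py` can be sketched in isolation. This is a minimal illustration under stated assumptions, not the code from the PR: `run_step`, `failing_loader` and `outcome` are hypothetical names, and plain functions stand in for the click commands. With `debug` off, the error is reduced to a one-line exit message; with `debug` on, the original exception propagates with its traceback.

```python
import sys


def run_step(action, description, debug=False):
    """Run action(); on failure either re-raise (debug) or exit with a short message."""
    try:
        return action()
    except Exception as e:
        if debug:
            # A bare raise keeps the original exception and its traceback.
            raise
        # sys.exit with a string raises SystemExit carrying that message.
        sys.exit(f"Error when {description}: {e}")


def failing_loader():
    # Hypothetical stand-in for a loader that fails on a bad input file.
    raise ValueError("unsupported file type")


def outcome(debug):
    """Report which exception escapes run_step for the given debug setting."""
    try:
        run_step(failing_loader, "loading t1", debug=debug)
    except SystemExit as exit_exc:
        return "SystemExit", str(exit_exc.code)
    except ValueError as err:
        return "ValueError", str(err)


print(outcome(debug=False))
print(outcome(debug=True))
```

Since `SystemExit` derives from `BaseException` rather than `Exception`, the `except Exception` clause inside `run_step` does not swallow the exit it triggers itself.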
22 changes: 18 additions & 4 deletions deepdiff/delta.py
@@ -69,24 +69,29 @@ def __init__(
         serializer=pickle_dump,
         verify_symmetry=False,
     ):
+        if 'safe_to_import' not in set(deserializer.__code__.co_varnames):
+            def _deserializer(obj, safe_to_import=None):
+                return deserializer(obj)
+        else:
+            _deserializer = deserializer

         if diff is not None:
             if isinstance(diff, DeepDiff):
                 self.diff = diff._to_delta_dict(directed=not verify_symmetry)
             elif isinstance(diff, Mapping):
                 self.diff = diff
             elif isinstance(diff, strings):
-                self.diff = deserializer(diff, safe_to_import=safe_to_import)
+                self.diff = _deserializer(diff, safe_to_import=safe_to_import)
         elif delta_path:
             with open(delta_path, 'rb') as the_file:
                 content = the_file.read()
-            self.diff = deserializer(content, safe_to_import=safe_to_import)
+            self.diff = _deserializer(content, safe_to_import=safe_to_import)
         elif delta_file:
             try:
                 content = delta_file.read()
             except UnicodeDecodeError as e:
                 raise ValueError(BINIARY_MODE_NEEDED_MSG.format(e)) from None
-            self.diff = deserializer(content, safe_to_import=safe_to_import)
+            self.diff = _deserializer(content, safe_to_import=safe_to_import)
         else:
             raise ValueError(DELTA_AT_LEAST_ONE_ARG_NEEDED)

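The deserializer adaptation in the hunk above can be tried out on its own. The sketch below mirrors the `__code__.co_varnames` check from the diff; `adapt_deserializer`, `plain_loader` and `aware_loader` are hypothetical names, and JSON stands in for the real payload format. Two caveats worth keeping in mind: `co_varnames` lists local variable names as well as parameters, and callables without a `__code__` attribute (built-ins, `functools.partial` objects) would raise `AttributeError` on this check.

```python
import json


def adapt_deserializer(deserializer):
    """Return a callable that always accepts a safe_to_import keyword."""
    if 'safe_to_import' not in set(deserializer.__code__.co_varnames):
        # The user-supplied function does not know about safe_to_import,
        # so wrap it and silently drop that argument.
        def _deserializer(obj, safe_to_import=None):
            return deserializer(obj)
        return _deserializer
    return deserializer


def plain_loader(content):
    # Hypothetical user deserializer with a single positional parameter.
    return json.loads(content)


def aware_loader(content, safe_to_import=None):
    # Hypothetical user deserializer that already accepts safe_to_import.
    return json.loads(content)


wrapped = adapt_deserializer(plain_loader)
result = wrapped('{"a": 1}', safe_to_import={'some.module'})
print(result)
print(adapt_deserializer(aware_loader) is aware_loader)
```

The call sites can then pass `safe_to_import` unconditionally, which is what the three changed `self.diff = _deserializer(...)` lines rely on.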
@@ -512,7 +517,16 @@ def dump(self, file):
         """
         Dump into file object
         """
-        file.write(self.dumps())
+        # Small optimization: Our internal pickle serializer can just take a file object
+        # and directly write to it. However if a user defined serializer is passed
+        # we want to make it compatible with the expectation that self.serializer(self.diff)
+        # will give the user the serialization and then it can be written to
+        # a file object when using the dump(file) function.
+        param_names_of_serializer = set(self.serializer.__code__.co_varnames)
+        if 'file_obj' in param_names_of_serializer:
+            self.serializer(self.diff, file_obj=file)
+        else:
+            file.write(self.dumps())

     def dumps(self):
         """
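The dispatch in `dump` can be illustrated with two toy serializers. This is a sketch under assumptions, not the library's code: `dump_to_file`, `plain_serializer` and `streaming_serializer` are hypothetical names, and JSON encoded to bytes stands in for pickle. A serializer that declares a `file_obj` parameter is handed the file directly; any other serializer is expected to return bytes, which are then written to the file.

```python
import io
import json


def plain_serializer(obj):
    # Hypothetical user serializer: returns the serialized bytes.
    return json.dumps(obj, sort_keys=True).encode('utf-8')


def streaming_serializer(obj, file_obj=None):
    # Hypothetical serializer that can write straight into a file object.
    data = json.dumps(obj, sort_keys=True).encode('utf-8')
    if file_obj is None:
        return data
    file_obj.write(data)


def dump_to_file(diff, serializer, file):
    """Write diff to file, streaming when the serializer supports file_obj."""
    param_names = set(serializer.__code__.co_varnames)
    if 'file_obj' in param_names:
        serializer(diff, file_obj=file)
    else:
        file.write(serializer(diff))


plain_buffer = io.BytesIO()
streaming_buffer = io.BytesIO()
dump_to_file({"a": None}, plain_serializer, plain_buffer)
dump_to_file({"a": None}, streaming_serializer, streaming_buffer)
print(plain_buffer.getvalue() == streaming_buffer.getvalue())
```

Both paths should leave the same bytes in the file; the streaming path only avoids building an intermediate bytes object first.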
39 changes: 35 additions & 4 deletions deepdiff/serialization.py
@@ -39,6 +39,8 @@ class UnsupportedFormatErr(TypeError):
     pass


+NONE_TYPE = type(None)
+
 CSV_HEADER_MAX_CHUNK_SIZE = 2048  # The chunk needs to be big enough that covers a couple of rows of data.


@@ -254,10 +256,40 @@ def find_class(self, module, name):
         # Forbid everything else.
         raise ForbiddenModule(FORBIDDEN_MODULE_MSG.format(module_dot_class)) from None
+
+    def persistent_load(self, persistent_id):
+        if persistent_id == "<<NoneType>>":

Review comment (Owner, Author): I'm not super happy with this solution but I didn't find a better solution to the problem.

+            return type(None)
+
+
+class _RestrictedPickler(pickle.Pickler):
+    def persistent_id(self, obj):
+        if obj is NONE_TYPE:  # NOQA
+            return "<<NoneType>>"
+        return None


-def pickle_dump(obj):
+def pickle_dump(obj, file_obj=None):
+    """
+    **pickle_dump**
+    Dumps the obj into pickled content.
+
+    **Parameters**
+
+    obj : Any python object
+
+    file_obj : (Optional) A file object to dump the contents into
+
+    **Returns**
+
+    If file_obj is passed the return value will be None. It will write the object's pickle contents into the file.
+    However if no file_obj is passed, then it will return the pickle serialization of the obj in the form of bytes.
+    """
+    file_obj_passed = bool(file_obj)
+    file_obj = file_obj or io.BytesIO()
     # We expect at least python 3.5 so protocol 4 is good.
-    return pickle.dumps(obj, protocol=4, fix_imports=False)
+    _RestrictedPickler(file_obj, protocol=4, fix_imports=False).dump(obj)
+    if not file_obj_passed:
+        return file_obj.getvalue()


 def pickle_load(content, safe_to_import=None):
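
The persistent-ID mechanism used in this hunk is part of the standard `pickle` module and can be exercised without DeepDiff. The following is a minimal sketch, assuming a restricted unpickler that forbids every global lookup; `SketchPickler`, `SketchUnpickler`, `dumps` and `loads` are hypothetical names rather than the library's classes. The pickler replaces `NoneType` with a string tag, and the unpickler maps that tag back to `type(None)` without going through `find_class`.

```python
import io
import pickle

NONE_TYPE = type(None)


class SketchPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Returning a non-None value stores this tag instead of pickling obj.
        if obj is NONE_TYPE:
            return "<<NoneType>>"
        return None


class SketchUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Assumption for this sketch: no global lookups are allowed at all.
        raise pickle.UnpicklingError(f"{module}.{name} is forbidden")

    def persistent_load(self, persistent_id):
        if persistent_id == "<<NoneType>>":
            return type(None)
        raise pickle.UnpicklingError(f"unsupported persistent id: {persistent_id!r}")


def dumps(obj):
    buffer = io.BytesIO()
    SketchPickler(buffer, protocol=4, fix_imports=False).dump(obj)
    return buffer.getvalue()


def loads(content):
    return SketchUnpickler(io.BytesIO(content)).load()


payload = {"old_type": NONE_TYPE, "new_value": 1}
restored = loads(dumps(payload))
print(restored["old_type"] is NONE_TYPE)
```

Because the type travels as a tag, the round trip succeeds even though `find_class` rejects everything in this sketch.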
@@ -406,8 +438,7 @@ def _save_content(content, path, file_type, keep_backup=True):
             content = toml.dump(content, the_file)
     elif file_type == 'pickle':
         with open(path, 'wb') as the_file:
-            content = pickle_dump(content)
-            the_file.write(content)
+            content = pickle_dump(content, file_obj=the_file)
     elif file_type in {'csv', 'tsv'}:
         if clevercsv is None:  # pragma: no cover.
             raise ImportError('CleverCSV needs to be installed.')  # pragma: no cover.
1 change: 1 addition & 0 deletions docs/changelog.rst
@@ -5,6 +5,7 @@ Changelog

 DeepDiff Changelog

+- v5-2-2: Fixed Delta serialization when None type is present.
 - v5-2-0: Removed Murmur3 as the preferred hashing method. Using SHA256 by default now. Added commandline for deepdiff. Added group_by. Added math_epsilon. Improved ignoring of NoneType.
 - v5-0-2: Bug Fix NoneType in ignore type groups https://github.com/seperman/deepdiff/issues/207
 - v5-0-1: Bug fix to not apply format to non numbers.
4 changes: 2 additions & 2 deletions docs/conf.py
@@ -60,9 +60,9 @@
 # built documents.
 #
 # The short X.Y version.
-version = '5.2.1'
+version = '5.2.2'
 # The full version, including alpha/beta/rc tags.
-release = '5.2.1'
+release = '5.2.2'

 load_dotenv(override=True)
 DOC_VERSION = os.environ.get('DOC_VERSION', version)