Skip to content

INTPYTHON-676 Add optional signing of cache data #336

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
22 changes: 18 additions & 4 deletions django_mongodb_backend/cache.py
Original file line number Diff line number Diff line change
@@ -1,30 +1,36 @@
import pickle
from datetime import datetime, timezone
from base64 import b64encode, b64decode

from django.core.cache.backends.base import DEFAULT_TIMEOUT, BaseCache
from django.core.cache.backends.db import Options
from django.core.signing import Signer, BadSignature
from django.db import connections, router
from django.utils.functional import cached_property
from pymongo import ASCENDING, DESCENDING, IndexModel, ReturnDocument
from pymongo.errors import DuplicateKeyError, OperationFailure


class MongoSerializer:
def __init__(self, protocol=None):
def __init__(self, protocol=None, signer=True, salt=None):
self.protocol = pickle.HIGHEST_PROTOCOL if protocol is None else protocol
self.signer = Signer(salt=salt) if signer else None

def dumps(self, obj):
# For better incr() and decr() atomicity, don't pickle integers.
# Using type() rather than isinstance() matches only integers and not
# subclasses like bool.
if type(obj) is int: # noqa: E721
return obj
return pickle.dumps(obj, self.protocol)
pickled_data = pickle.dumps(obj, protocol=self.protocol) # noqa: S301
return self.signer.sign(b64encode(pickled_data).decode()) if self.signer else pickled_data

def loads(self, data):
try:
return int(data)
except (ValueError, TypeError):
if self.signer is not None:
data = b64decode(self.signer.unsign(data))
return pickle.loads(data) # noqa: S301


Expand All @@ -39,6 +45,8 @@ class CacheEntry:
_meta = Options(collection_name)

self.cache_model_class = CacheEntry
self._sign_cache = params.get("ENABLE_SIGNING", True)
self._salt = params.get("SALT", None)

def create_indexes(self):
expires_index = IndexModel("expires_at", expireAfterSeconds=0)
Expand All @@ -47,7 +55,7 @@ def create_indexes(self):

@cached_property
def serializer(self):
return MongoSerializer(self.pickle_protocol)
return MongoSerializer(self.pickle_protocol, self._sign_cache, self._salt)

@property
def collection_for_read(self):
Expand Down Expand Up @@ -84,7 +92,13 @@ def get_many(self, keys, version=None):
with self.collection_for_read.find(
{"key": {"$in": tuple(keys_map)}, **self._filter_expired(expired=False)}
) as cursor:
return {keys_map[row["key"]]: self.serializer.loads(row["value"]) for row in cursor}
results = {}
for row in cursor:
try:
results[keys_map[row["key"]]] = self.serializer.loads(row["value"])
except (BadSignature, TypeError):
self.delete(row["key"])
Comment on lines +99 to +100
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having trouble understanding why we're catching BadSignature and TypeError issues. What would trigger this and why would it be okay to delete a row in these instances?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the data in the cache collection is tampered with or somehow the HMAC secret key or salt is changed, a BadSignature error is thrown. I added error handling here to delete the row otherwise the exception will DOS the page until the cache entry gets culled. TypeError can be thrown if you switch from ENABLE_SIGNING=True to ENABLE_SIGNING=False without clearing all cache entries. It's highly unlikely that will ever happen in prod, but I was able to get that error while in changing settings in debug mode.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A better behavior than silently deleting bad data (which could happen in what circumstances besides an attacker putting malicious data in the cache?) could be to raise (or at least log) SuspiciousOperation. There is some precedent in Django for this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I explained in a above comment the cases I was able to create where these two exceptions were thrown. If we do not delete the entry, won't we just create a DOS of the affected page until the cache entry is culled?

I do agree that this probably shouldn't be silent, but throwing an error here will stop the request from being handled, generate a 500, and require the request to be resent. I think that is probably ok if we delete the offending cache entry so only one request would be affected by the issue. What do you think?

return results

def set(self, key, value, timeout=DEFAULT_TIMEOUT, version=None):
key = self.make_and_validate_key(key, version=version)
Expand Down
5 changes: 5 additions & 0 deletions docs/source/topics/cache.rst
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,11 @@ In addition, the cache is culled based on ``CULL_FREQUENCY`` when ``add()``
or ``set()`` is called, if ``MAX_ENTRIES`` is exceeded. See
:ref:`django:cache_arguments` for an explanation of these two options.

Cache entries include a HMAC signature to ensure data integrity by default.
You can disable this by setting ``ENABLE_SIGNING`` to ``False``.
Signatures can also include an optional salt parameter by setting ``SALT``
to a string value.

Creating the cache collection
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Expand Down
17 changes: 17 additions & 0 deletions tests/cache_/tests.py
Original file line number Diff line number Diff line change
Expand Up @@ -97,6 +97,7 @@ def caches_setting_for_tests(base=None, exclude=None, **params):
BACKEND="django_mongodb_backend.cache.MongoDBCache",
# Spaces are used in the name to ensure quoting/escaping works.
LOCATION="test cache collection",
ENABLE_SIGNING=False,
),
)
@modify_settings(
Expand Down Expand Up @@ -955,6 +956,22 @@ def test_serializer_dumps(self):
self.assertIsInstance(cache.serializer.dumps("abc"), bytes)


@override_settings(
CACHES=caches_setting_for_tests(
BACKEND="django_mongodb_backend.cache.MongoDBCache",
# Spaces are used in the name to ensure quoting/escaping works.
LOCATION="test cache collection",
ENABLE_SIGNING=True,
SALT="test-salt",
),
)
class SignedCacheTests(CacheTests):
def test_serializer_dumps(self):
# The serializer should return a bytestring for signed caches.
self.assertEqual(cache.serializer.dumps(123), 123)
self.assertIsInstance(cache.serializer.dumps(True), str)
self.assertIsInstance(cache.serializer.dumps("abc"), str)

class DBCacheRouter:
"""A router that puts the cache table on the 'other' database."""

Expand Down