create_asset hash suffix is not unique across multiple runs #216

@Uil2liv

Consider the following code block:

def create_asset(self, content, extra_index,
                 test_index, file_extension, mode='w'):
    hash_key = ''.join([self.test_id, str(extra_index),
                        str(test_index)])
    hash_generator = hashlib.md5()
    hash_generator.update(hash_key.encode('utf-8'))
    hex_digest = hash_generator.hexdigest()
    # 255 is the common max filename length on various filesystems,
    # we subtract hash length, file extension length and 2 more
    # characters for the underscore and dot
    max_length = 255 - len(hex_digest) - len(file_extension) - 2
    asset_file_name = '{0}_{1}.{2}'.format(hash_key[:max_length],
                                           hex_digest,
                                           file_extension)

As discussed in #199:

But I think I got a better ideia for this issue. What do you think of the name of the assets being the unhashed name plus the hash? Something like "test01_fail_testname_[hash].png", it would solve your issue plus not having the issue of overwriting the assets every time the tests are run.

Originally posted by @RibeiroAna in #199 (comment)

It seems that the hash suffix was introduced into the file name to prevent multiple test runs from overwriting previous assets.
This case occurs when the test_id, the extra_index and the test_index are identical, typically when files are attached statically at the same places in the tests.

BUT, according to the code above, in such a situation the hash suffix is generated from the same key (the string concatenation of test_id, extra_index and test_index).
As a consequence, the hash digest used to create the file name will be identical, and asset overwriting will still occur.
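
To make the point concrete, here is a minimal sketch that rebuilds the key the same way as create_asset and hashes it twice, as two separate runs would. The helper name digest_for and the test ID are made up for illustration.

```python
import hashlib


def digest_for(test_id, extra_index, test_index):
    # Same key construction as in create_asset (hypothetical standalone helper)
    hash_key = ''.join([test_id, str(extra_index), str(test_index)])
    hash_generator = hashlib.md5()
    hash_generator.update(hash_key.encode('utf-8'))
    return hash_generator.hexdigest()


# Hypothetical test ID, used for both "runs"
first_run = digest_for('tests.test_example.TestExample.test_fail', 0, 1)
second_run = digest_for('tests.test_example.TestExample.test_fail', 0, 1)

# The digest depends only on the key, so both runs get the same suffix
print(first_run == second_run)
```

Since nothing run-specific goes into the hash, the resulting file name is the same on every run.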

One workaround would be to hash the asset data instead of the test reference.
On reflection, though, this can still lead to overwriting when the attached asset is identical (think of short logs, or text files which have exactly the same content across runs in a well-controlled test environment).
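
A small sketch of why hashing the content alone is not enough, assuming the asset is a short text log. The helper content_digest is hypothetical and not part of the existing code.

```python
import hashlib


def content_digest(content):
    # Hash the asset data itself (hypothetical helper); accept str or bytes
    data = content.encode('utf-8') if isinstance(content, str) else content
    return hashlib.md5(data).hexdigest()


# Two runs producing the exact same short log
run_1_log = "Test passed\n"
run_2_log = "Test passed\n"

# Identical content gives an identical digest, hence the same file name
print(content_digest(run_1_log) == content_digest(run_2_log))
```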

The workaround I would suggest is to hash time information in addition to the test reference, something like:

import hashlib
import time
[...]
            hash_key = ''.join([self.test_id, str(extra_index),
                                str(test_index)])
            hash_time = "%.9f" % time.time()
            hash_generator = hashlib.md5()
            hash_generator.update(hash_key.encode('utf-8'))
            hash_generator.update(hash_time.encode('utf-8'))
            hex_digest = hash_generator.hexdigest()
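
For reference, the same idea as a self-contained sketch that can be run outside the class. The function name build_asset_name and the optional timestamp parameter are assumptions added only to make the behaviour easy to check; they are not part of the existing code.

```python
import hashlib
import time


def build_asset_name(test_id, extra_index, test_index, file_extension,
                     timestamp=None):
    # Hypothetical standalone version of the naming logic in create_asset
    hash_key = ''.join([test_id, str(extra_index), str(test_index)])
    if timestamp is None:
        timestamp = time.time()
    hash_time = "%.9f" % timestamp
    hash_generator = hashlib.md5()
    hash_generator.update(hash_key.encode('utf-8'))
    # Mixing the time into the hash makes the suffix differ between runs
    hash_generator.update(hash_time.encode('utf-8'))
    hex_digest = hash_generator.hexdigest()
    # Same length budget as the original code: 255 minus hash, extension,
    # underscore and dot
    max_length = 255 - len(hex_digest) - len(file_extension) - 2
    return '{0}_{1}.{2}'.format(hash_key[:max_length], hex_digest,
                                file_extension)


# Same test reference, two different (made-up) timestamps
name_a = build_asset_name('test01_fail_testname', 0, 1, 'png', timestamp=1.0)
name_b = build_asset_name('test01_fail_testname', 0, 1, 'png', timestamp=2.0)
print(name_a != name_b)
```

The uniqueness here relies on the clock value differing between calls, so two attachments created within the same clock tick could still get the same name.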
