Dense Vector Field Support #356

Closed
YakPort opened this issue Jul 6, 2021 · 3 comments

YakPort commented Jul 6, 2021

I noticed the closed issue
https://github.com/elastic/elasticsearch-dsl-py/issues/1278

I created a dense_vector field as described there.
My model has a method called get_embedding that calculates the embedding.

I am trying to create a Field that accesses that attribute on my model and stores the calculated embedding, similar to:

class DenseVector(DEDField, Field):
    name = 'dense_vector'

    def __init__(self):
        dims = 1024
        super(DenseVector, self).__init__(dims=dims)

In documents.py:

@registry.register_document
class ItemDocument(Document):
    title_vector = DenseVector(attr='get_embedding')

    class Index:
        name = "products"
        settings = {
            "number_of_shards": 1,
            "number_of_replicas": 0,
            "analysis": {"analyzer": {"standard": {"type": "standard"}}},
        }

    class Django:
        model = Item

I am getting TypeError: __init__() got an unexpected keyword argument 'attr'.
Where is attr being initialised, or am I going down the wrong path?

Many thanks

saadmk11 (Contributor) commented Jul 7, 2021

You need to accept the attr kwarg in the __init__() method of the DenseVector field and pass it on, so that DEDField can process it:

def __init__(self, attr=None, **kwargs):
    dims = 1024
    super(DenseVector, self).__init__(attr=attr, dims=dims, **kwargs)
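To see why the original field failed and why forwarding attr helps, here is a minimal sketch that runs without Django or Elasticsearch. The Field and DEDField classes below are simplified stand-ins written for illustration, not the real library classes; they only mimic how keyword arguments travel up the __init__ chain. BrokenDenseVector is a hypothetical name for the field from the first post.

```python
# Simplified stand-ins (assumptions, NOT the real library code).
class Field:
    """Stand-in base field: keeps leftover kwargs as mapping parameters."""

    def __init__(self, **kwargs):
        self.params = kwargs


class DEDField(Field):
    """Stand-in for DEDField: consumes `attr`, passes the rest upward."""

    def __init__(self, attr=None, **kwargs):
        super().__init__(**kwargs)
        self.attr = attr


class BrokenDenseVector(DEDField, Field):
    name = "dense_vector"

    def __init__(self):  # accepts no arguments, so attr=... is rejected here
        super().__init__(dims=1024)


class DenseVector(DEDField, Field):
    name = "dense_vector"

    def __init__(self, attr=None, **kwargs):
        # attr is accepted here and handed on to DEDField
        super().__init__(attr=attr, dims=1024, **kwargs)


broken_error = None
try:
    BrokenDenseVector(attr="get_embedding")
except TypeError as exc:
    broken_error = str(exc)  # "... got an unexpected keyword argument 'attr'"

field = DenseVector(attr="get_embedding")
print(broken_error)
print(field.attr, field.params)
```

The keyword never reaches DEDField in the broken version because the subclass __init__ is the first signature Python checks.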

YakPort closed this as completed Jul 7, 2021

BoPeng commented Oct 18, 2024

It looks like elasticsearch_dsl now has support for DenseVector:

https://github.com/elastic/elasticsearch-dsl-py/blob/579f57205c395e17024d9ae827cbf6fd626969c4/elasticsearch_dsl/field.py#L392-L397

The DenseVector field is defined as a multi-valued Float field, with no dims. Maybe dense_vector does not require a fixed dimension?

Then, in django_elasticsearch_dsl, the proper way to define a field might be:

class DenseVectorField(DEDField, DenseVector):
    pass

although I need to explore how to use DenseVectorField for embedding search.
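As a starting point for the embedding search, here is a rough sketch of a request body, assuming an Elasticsearch 8.x cluster whose search API accepts a top-level knn option and a dense_vector field that is indexed for kNN. The helper name knn_search_body and the default values are made up for illustration; title_vector is the field name from the first post, and the three-element vector is a toy value (a real query vector has to match the field's dimensions).

```python
def knn_search_body(field, query_vector, k=10, num_candidates=100):
    """Build a search request body that uses the approximate kNN option.

    Hypothetical helper: it only assembles a plain dict and does not talk
    to Elasticsearch.
    """
    return {
        "knn": {
            "field": field,                      # name of the dense_vector field
            "query_vector": list(query_vector),  # embedding of the search text
            "k": k,                              # neighbours to return
            "num_candidates": num_candidates,    # candidates examined per shard
        }
    }


body = knn_search_body("title_vector", [0.1, 0.2, 0.3], k=5, num_candidates=50)
print(body)
```

A dict like this could then be sent with whatever client is in use; how it plugs into a django_elasticsearch_dsl Document search is not covered in this thread.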

@Tayyab-R

Custom DenseVector Field
I followed every step mentioned in the thread:

class DenseVector(DEDField, Field):
    """
    Custom field from me for DenseVector in django-elasticsearch-dsl
    """

    name = 'dense_vector'

    def __init__(self, attr=None, **kwargs):
        dims = 1024
        super(DenseVector, self).__init__(attr=attr, dims=dims, **kwargs)

@registry.register_document
class BlogDocument(Document):
    tags_embeddings = DenseVector(dims=384)

but got this error:
super(DenseVector, self).__init__(attr=attr, dims=dims, **kwargs)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: django_elasticsearch_dsl.fields.DEDField.__init__() got multiple values for keyword argument 'dims'


What am I doing wrong here?
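Judging from the traceback, the likely cause is that dims arrives twice: DenseVector(dims=384) puts dims into kwargs, and __init__ then also passes its hard-coded dims=1024 next to **kwargs. The sketch below reproduces that with simplified stand-in classes (assumptions for illustration, not the real library code) and shows one possible fix, making dims a named parameter with a default. CollidingDenseVector is a hypothetical name for the field as posted.

```python
# Simplified stand-ins (assumptions, NOT the real library code).
class Field:
    def __init__(self, **kwargs):
        self.params = kwargs


class DEDField(Field):
    def __init__(self, attr=None, **kwargs):
        super().__init__(**kwargs)
        self.attr = attr


class CollidingDenseVector(DEDField, Field):
    name = "dense_vector"

    def __init__(self, attr=None, **kwargs):
        dims = 1024
        # With DenseVector(dims=384), kwargs already contains "dims",
        # so this call supplies the keyword twice.
        super().__init__(attr=attr, dims=dims, **kwargs)


class DenseVector(DEDField, Field):
    name = "dense_vector"

    def __init__(self, attr=None, dims=1024, **kwargs):
        # dims is now a real parameter: callers may override the default,
        # and it is passed upward exactly once.
        super().__init__(attr=attr, dims=dims, **kwargs)


collision_error = None
try:
    CollidingDenseVector(dims=384)
except TypeError as exc:
    collision_error = str(exc)  # "... got multiple values for keyword argument 'dims'"

field = DenseVector(dims=384)
print(collision_error)
print(field.params)
```

With the second definition, DenseVector() keeps the 1024 default while DenseVector(dims=384) overrides it.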
