Refactor collection creation to prefer named vectors #1654

@dirkkul

Currently we have two ways of creating vector indices:

  • legacy:
        vector_index_config=wvc.config.Configure.VectorIndex.hnsw(),
        vectorizer_config=wvc.config.Configure.Vectorizer.none(),
  • named vectors:
        vectorizer_config=[
            wvc.config.Configure.NamedVectors.text2vec_contextionary(
                "title",
                source_properties=["title"],
                vectorize_collection_name=False,
                vector_index_config=wvc.config.Configure.VectorIndex.flat(
                    distance_metric=wvc.config.VectorDistances.HAMMING,
                    quantizer=wvc.config.Configure.VectorIndex.Quantizer.bq(rescore_limit=10),
                ),
            ),
            wvc.config.Configure.NamedVectors.none(
                "custom",
            ),
            wvc.config.Configure.NamedVectors.text2vec_contextionary(
                "default",
                vectorize_collection_name=False,  # needed because contextionary can't handle "_" in collection names
            ),
        ],

Instead we should:

  • deprecate both ways of setting vectorizers in Configure (Configure.Vectorizer and Configure.NamedVectors)
  • deprecate the vectorizer_config argument
  • replace it with NamedVectorsByDefault that does not contain named vectors
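
One possible shape for the deprecation path, as a rough sketch only. It does not use the real client: `create_collection` and the `vector_config` argument are hypothetical names chosen for illustration, and the function returns a plain dict instead of creating anything. The point is just that the old `vectorizer_config` keyword could keep working while emitting a `DeprecationWarning` and being routed into the new argument.

```python
import warnings
from typing import Any, Optional


def create_collection(
    name: str,
    vector_config: Optional[Any] = None,      # hypothetical replacement argument
    vectorizer_config: Optional[Any] = None,  # argument this issue proposes to deprecate
) -> dict:
    """Stand-in for the real create call: returns the settings it would use."""
    if vectorizer_config is not None:
        if vector_config is not None:
            raise ValueError("Pass either vector_config or vectorizer_config, not both.")
        # Warn at the caller's line, then treat the old argument as the new one.
        warnings.warn(
            "vectorizer_config is deprecated; use vector_config instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        vector_config = vectorizer_config

    # Normalise "nothing", "one config" and "list of configs" to a list.
    if vector_config is None:
        vectors = []
    elif isinstance(vector_config, list):
        vectors = list(vector_config)
    else:
        vectors = [vector_config]
    return {"name": name, "vectors": vectors}
```

With this shape, existing callers that pass `vectorizer_config=...` get the same result as before plus a warning, and the old keyword can be removed in a later release without a second migration step.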
