Python: Introducing Keyword Hybrid Search and Lambda Filters for Azure AI Search #11482

Open, wants to merge 14 commits into base: main

Conversation

Member

@eavanvalkenburg eavanvalkenburg commented Apr 10, 2025

Motivation and Context

Hybrid
  • Adds a new abstraction for keyword hybrid search
  • Implements this for Azure AI Search
  • Adds search tests for Azure AI Search

Filters
  • Adds support for a Callable (lambda function) as a filter in VectorSearchOptions
  • Implements an AST walker to go from a lambda to a string for the filter
  • Adds a set of test parameters that can be extended
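
To illustrate the AST-walker idea from the Filters list, here is a minimal sketch that turns the source text of a lambda into an OData-style filter string. It is not the implementation in this PR: the names `lambda_to_filter` and `_walk`, the operator tables, and the choice to emit bare attribute names are all assumptions made for illustration, and only a small subset of expressions is handled.

```python
import ast

# Hypothetical operator tables for an OData-style filter syntax.
_COMPARE_OPS = {
    ast.Eq: "eq",
    ast.NotEq: "ne",
    ast.Gt: "gt",
    ast.GtE: "ge",
    ast.Lt: "lt",
    ast.LtE: "le",
}
_BOOL_OPS = {ast.And: "and", ast.Or: "or"}


def lambda_to_filter(source: str) -> str:
    """Translate the source text of a lambda into a filter string."""
    tree = ast.parse(source.strip(), mode="eval")
    if not isinstance(tree.body, ast.Lambda):
        raise ValueError("No lambda expression found in the filter.")
    return _walk(tree.body.body)


def _walk(node: ast.AST) -> str:
    """Recursively render one expression node."""
    if isinstance(node, ast.BoolOp):
        joiner = f" {_BOOL_OPS[type(node.op)]} "
        return "(" + joiner.join(_walk(value) for value in node.values) + ")"
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
        return f"not ({_walk(node.operand)})"
    if isinstance(node, ast.Compare):
        if len(node.ops) != 1:
            raise ValueError("Chained comparisons are not supported in this sketch.")
        operator = _COMPARE_OPS.get(type(node.ops[0]))
        if operator is None:
            raise ValueError(f"Unsupported comparison: {ast.dump(node.ops[0])}")
        return f"{_walk(node.left)} {operator} {_walk(node.comparators[0])}"
    if isinstance(node, ast.Attribute):
        # `x.rating` becomes the field name `rating`.
        return node.attr
    if isinstance(node, ast.Constant):
        if isinstance(node.value, str):
            # Single quotes are escaped by doubling them.
            return "'" + node.value.replace("'", "''") + "'"
        if isinstance(node.value, bool):
            return str(node.value).lower()
        if node.value is None:
            return "null"
        return str(node.value)
    raise ValueError(f"Unsupported expression: {ast.dump(node)}")


print(lambda_to_filter("lambda x: x.rating >= 4 and x.category == 'hotel'"))
```

Parsing a source string keeps the sketch independent of how the lambda's source is recovered at runtime; a real connector would also need to handle that step and reject unsupported constructs with a connector-specific exception.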

This further does some work on the vector abstractions:

Relates to #10561

Description

Contribution Checklist

@eavanvalkenburg eavanvalkenburg requested a review from a team as a code owner April 10, 2025 13:34
@markwallace-microsoft markwallace-microsoft added the python (Pull requests for the Python Semantic Kernel) and memory labels Apr 10, 2025
Member

markwallace-microsoft commented Apr 10, 2025

Python Unit Test Overview

| Tests | Skipped | Failures | Errors | Time |
|---|---|---|---|---|
| 3462 | 5 💤 | 0 ❌ | 0 🔥 | 1m 40s ⏱️ |

@eavanvalkenburg eavanvalkenburg changed the title from "Python: Introducing Keyword Hybrid Search for Azure AI Search" to "Python: Introducing Keyword Hybrid Search and Lamba Filters for Azure AI Search" Apr 15, 2025
@eavanvalkenburg eavanvalkenburg changed the title from "Python: Introducing Keyword Hybrid Search and Lamba Filters for Azure AI Search" to "Python: Introducing Keyword Hybrid Search and Lambda Filters for Azure AI Search" Apr 15, 2025
Member

roji commented Apr 15, 2025

If I understand correctly, there aren't any integration tests for this, right? I'm seeing what looks like unit tests only for now?

Member Author

eavanvalkenburg commented Apr 15, 2025

@roji Indeed, integration tests are still to be done; I've made a note.

Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR introduces support for keyword hybrid search and adds lambda filter functionality across multiple memory connectors while updating filter property names. Key changes include:

  • Adding support for Callable filters (lambda filters) in methods such as _build_filter with corresponding parameter type updates.
  • Renaming data field properties from is_filterable/is_full_text_searchable to is_indexed/is_full_text_indexed in connectors and samples.
  • Removing obsolete Azure AI Search connector files to consolidate functionality.

Reviewed Changes

Copilot reviewed 52 out of 52 changed files in this pull request and generated no comments.

| File | Description |
|---|---|
| python/semantic_kernel/connectors/memory/pinecone/_pinecone.py | Updated filter handling with lambda support and renaming of filter properties. |
| python/semantic_kernel/connectors/memory/mongodb_atlas/* | Updated field property checks and filter dictionary creation. |
| python/semantic_kernel/connectors/memory/in_memory/in_memory_collection.py | Added support for Callable filters and updated error handling for get without keys. |
| python/semantic_kernel/connectors/memory/chroma/chroma.py | Introduced lambda filter support and updated type imports. |
| python/semantic_kernel/connectors/memory/azure_cosmos_db/* | Updated filter property usage and query building logic. |
| python/semantic_kernel/connectors/memory/azure_ai_search/* | Removed obsolete Azure AI Search connector files as part of consolidation. |
| python/samples/concepts/memory/* | Updated sample data model definitions and search/upsert methods to reflect rename changes. |
| python/samples/concepts/chat_history/, python/samples/concepts/caching/ | Updated field attribute names for filtering and indexing. |

Comments suppressed due to low confidence (2)

python/semantic_kernel/connectors/memory/pinecone/_pinecone.py:465

  • [nitpick] The variable name 'filter' shadows the built-in filter function. Consider renaming it (e.g. 'flt' or 'built_filter') to avoid potential confusion.
if options.filter and (filter := self._build_filter(options.filter)):

python/semantic_kernel/connectors/memory/azure_ai_search/utils.py:1

  • The entire Azure AI Search connector has been removed. Please ensure that any dependent modules or documentation are updated accordingly and that this change is intentional.
# File removed entirely

        else:
            raise VectorStoreOperationException("No lambda expression found in the filter.")

    def _lambda_parser(self, node: ast.AST) -> str:
Contributor

Will this be used in the future by other memory connectors? Should this live in a common module? Given the complexity of the logic, it may be beneficial to move it to its own module for test purposes.

    ) -> Sequence[Any] | None:
        if not keys:
            if options is not None:
                raise NotImplementedError("Get without keys is not yet implemented.")
Contributor

This is only raised when options is not None and keys is empty, but the error message doesn't give the full context.

Contributor

This happens in other connectors too.

Labels
memory python Pull requests for the Python Semantic Kernel
4 participants