API Endpoint Overview

The HGS API provides a /query endpoint that accepts a JSON payload. The payload specifies the search parameters and logic, enabling keyword searches, phrase matches, and vector-based k-nearest neighbor (KNN) searches. Query structures are flexible and can be adapted to use cases such as provider search, information search, and EMR search.

See: Clinia Search API

Example Query Payload

Here's a basic example of an HGS query payload:

{
  "page": 0,
  "perPage": 5,
  "filter": {},
  "highlighting": ["abstract.passages", "title"],
  "query": {
    "or": [
      {
        "match": {
          "title": {
            "value": "How can NSAIDs help in the inflammation response in CTE",
            "type": "phrase"
          }
        }
      },
      {
        "match": {
          "abstract.passages": {
            "value": "How can NSAIDs help in the inflammation response in CTE",
            "type": "phrase"
          }
        }
      },
      {
        "knn": {
          "vector": {
            "value": "How can NSAIDs help in the inflammation response in CTE"
          }
        }
      }
    ]
  }
}
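
As a rough sketch, a payload like the one above could be prepared for sending from Python using only the standard library. The base URL, the use of POST, the bearer-token header, and the build_query_request helper are all assumptions made for illustration; the Clinia Search API reference is the authority on the actual host and authentication scheme.

```python
import json
import urllib.request

# Hypothetical base URL; replace with the host of your HGS deployment.
BASE_URL = "https://example.invalid/hgs"


def build_query_request(payload: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) a request for the /query endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/query",
        data=body,
        method="POST",  # assumed HTTP method for a JSON payload
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; the real API may use a different header.
            "Authorization": f"Bearer {token}",
        },
    )


question = "How can NSAIDs help in the inflammation response in CTE"
payload = {
    "page": 0,
    "perPage": 5,
    "filter": {},
    "highlighting": ["abstract.passages", "title"],
    "query": {
        "or": [
            {"match": {"title": {"value": question, "type": "phrase"}}},
            {"knn": {"vector": {"value": question}}},
        ]
    },
}
request = build_query_request(payload, token="YOUR_TOKEN")
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the payload easy to inspect and log before any network call is made.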

Breakdown of the Payload

Let’s break down each part of the payload to understand how it works:

Pagination and Filters

Pagination Parameters:
- "page", "perPage", and "filter" behave the same as in the Standard Search.
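
To illustrate how "page" and "perPage" interact, the sketch below generates one payload per page. It assumes page numbers are zero-based (the example payload uses "page": 0) and that the total comes from the "totalResults" value in the response's pagination object; page_payloads is a hypothetical helper, not part of the API.

```python
import math
from typing import Iterator


def page_payloads(base_payload: dict, total_results: int) -> Iterator[dict]:
    """Yield one payload per page, assuming zero-based page numbers."""
    per_page = base_payload["perPage"]
    page_count = math.ceil(total_results / per_page)
    for page in range(page_count):
        # Shallow copy so the caller's payload is not mutated between requests.
        yield {**base_payload, "page": page}


base = {"page": 0, "perPage": 5, "filter": {}, "query": {}}
pages = [p["page"] for p in page_payloads(base, total_results=12)]
print(pages)  # [0, 1, 2]
```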

Query

Query Parameters:
- More information on how various operators and logic can be applied to query parameters can be found in Search Operators.
- "or": An array of conditions that are evaluated using a logical "OR" operation. This allows multiple search methods to be applied simultaneously. In the example above, it combines a phrase match and a KNN search.
Search Methods:
- Phrase Match:
  - "match": Operator that defines a text search (here, a phrase search) on a field.
  - "title" and "abstract.passages": Fields in the model where the search is performed.
    - "value": The search term to query against.
    - "type": Specifies the match type (e.g., "phrase", which looks for an exact match).
- KNN (K-Nearest Neighbor) Search:
  - "knn": Defines a vector-based search, retrieving results based on semantic similarity.
  - "value": The query text entered by the user, which is converted into a vector by the embedding model.
  - "propertyKey": The key indicating where the vector embeddings are stored (e.g., "vector").
  - "modelID": The embedding model used for vectorization (e.g., "embed-v1").
  - "k": The number of nearest neighbors (results) to retrieve (e.g., 10).
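
The clause shapes described above can be sketched as small Python builders. The helper names match_clause and knn_clause are hypothetical; the dictionaries they return simply mirror the structure of the example payload, and optional keys such as "k" are left out because the example does not show where they belong.

```python
def match_clause(field: str, value: str, match_type: str = "phrase") -> dict:
    """Return a "match" clause for one field, shaped like the example payload."""
    return {"match": {field: {"value": value, "type": match_type}}}


def knn_clause(vector_field: str, value: str) -> dict:
    """Return a "knn" clause keyed by the field that stores the embeddings."""
    return {"knn": {vector_field: {"value": value}}}


question = "How can NSAIDs help in the inflammation response in CTE"
query = {
    "or": [
        match_clause("title", question),
        match_clause("abstract.passages", question),
        knn_clause("vector", question),
    ]
}
```

Here query reproduces the "query" object of the example payload, so the builders can be checked against it directly.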

Highlighting

The highlighting parameter is used to emphasize key search terms in the results. This feature helps users understand why specific results were returned by visually distinguishing the matching portions.

For example, a response might include:

"highlighting": {  
    "address.city": [  
        {  
            "highlight": "<em>Montreal</em>",  
            "type": "textual"  
        }  
    ],  
    "object.addresses.city": [  
        {  
            "highlight": "<em>Quebec</em>",  
            "type": "textual"  
        },  
        {  
            "highlight": "<em>Laval</em>",  
            "type": "textual"  
        }  
    ],  
    "symbol": [  
        {  
            "highlight": "<em>A</em> <em>clinic</em> name",  
            "type": "textual"  
        }  
    ]  
}

The highlighting parameter takes an array of strings naming the fields in which highlights should be computed. The response returns the highlighted values with the matched terms wrapped in <em> tags, as in the example above, which makes them easy to render in UI elements.
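
As an illustration of consuming highlights, the sketch below pulls the emphasized terms out of a highlighting object shaped like the example above. It assumes matches are always wrapped in plain <em>...</em> tags, as in that example; highlighted_terms is a hypothetical helper.

```python
import re

# Non-greedy match so adjacent <em> spans are captured separately.
EM_TAG = re.compile(r"<em>(.*?)</em>")


def highlighted_terms(highlighting: dict) -> dict:
    """Map each field to the list of terms wrapped in <em> tags."""
    return {
        field: [
            term
            for entry in entries
            for term in EM_TAG.findall(entry["highlight"])
        ]
        for field, entries in highlighting.items()
    }


sample = {
    "address.city": [{"highlight": "<em>Montreal</em>", "type": "textual"}],
    "symbol": [
        {"highlight": "<em>A</em> <em>clinic</em> name", "type": "textual"}
    ],
}
terms = highlighted_terms(sample)
print(terms)  # {'address.city': ['Montreal'], 'symbol': ['A', 'clinic']}
```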

Customizing Your Queries

HGS allows you to customize your search procedures based on the type of search you want to perform:

Keyword or Phrase Search

Use the "match" operator and set "type" to choose how the value is matched:

- "type": "phrase": Searches for an exact match of the phrase within the specified fields.
- "type": "word": Searches for occurrences of the words in the value.

Example:

{
  "match": {
    "title": {
      "value": "CTE treatments",
      "type": "word"
    }
  }
}

Vector-Based (KNN) Search

Use the "knn" parameter to perform a K-Nearest Neighbor search based on the query’s vector embedding. The vector field (e.g., abstract.passages.vector) is populated by the ingestion pipeline, so you do not need to specify a model when performing the search.

Here’s an example of how to structure a KNN query:

{
  "knn": {
    "abstract.passages.vector": {
      "value": "neurological impact of CTE"
    }
  }  
}

In this example:

- The field abstract.passages.vector is used to search for documents based on their semantic similarity to the query "neurological impact of CTE". This vector field is generated by the ingestion pipeline using a preconfigured model.
- The "value" key holds the query text, which will be converted into a vector and compared to the document vectors stored in the field.

Response Structure:

hits: Contains the list of documents matching the query.
- Each hit includes a "resource" object with the document's "id", "type", "meta", and "data" (fields such as "title" and "abstract").
- The "highlighting" section shows the passages that matched, keyed by field.
pagination: Provides information about the page number, results per page, and total number of results.

Combining Multiple Search Methods

HGS supports combining different search methods using logical operators like "or" and "and".

Example combining phrase and KNN search:

{  
  "or": [  
    {  
      "match": {  
        "abstract.passages": {  
          "value": "neurological impact of CTE",  
          "type": "phrase"  
        }  
      }  
    },  
    {  
      "knn": {
        "abstract.passages.vector": {
          "value": "neurological impact of CTE"
        }
      }  
    }  
  ]  
}
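
A minimal sketch of combining clauses programmatically follows. The combine helper is hypothetical, and it assumes that "and" takes an array of clauses in the same way the "or" example does; the text above names both operators but only shows the shape of "or".

```python
def combine(operator: str, clauses: list) -> dict:
    """Wrap clauses in a logical operator, shaped like the "or" example."""
    if operator not in ("or", "and"):
        raise ValueError(f"unsupported operator: {operator!r}")
    if not clauses:
        raise ValueError("at least one clause is required")
    return {operator: list(clauses)}


text = "neurological impact of CTE"
phrase = {"match": {"abstract.passages": {"value": text, "type": "phrase"}}}
semantic = {"knn": {"abstract.passages.vector": {"value": text}}}

either = combine("or", [phrase, semantic])  # same shape as the example above
both = combine("and", [phrase, semantic])   # assumed to mirror the "or" shape
```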

API Response

The HGS endpoint returns a list of matched documents, including highlighted sections that show why a result was relevant.

{
  "hits": [
    {
      "highlighting": {
        "abstract.passages": [
          {
            "data": "NSAIDs help in reducing inflammation...",
            "path": "abstract.passages.0",
            "score": 0.2887,
            "type": "vector"
          }
        ]
      },
      "resource": {
        "data": {
          "title": "How NSAIDs Reduce Inflammation",
          "abstract": "NSAIDs are effective in reducing inflammation in CTE patients..."
        },
        "id": "1",
        "meta": {
          "published": "2023-10-10",
          "author": "Dr. Smith"
        },
        "type": "articles"
      }
    },
    {
      "highlighting": {
        "abstract.passages": [
          {
            "data": "CTE is a chronic condition...",
            "path": "abstract.passages.1",
            "score": 0.2887,
            "type": "vector"
          }
        ]
      },
      "resource": {
        "data": {
          "title": "The Role of NSAIDs in CTE Treatment",
          "abstract": "This study explores the use of NSAIDs in treating inflammation associated with CTE..."
        },
        "id": "2",
        "meta": {
          "published": "2024-01-15",
          "author": "Dr. Johnson"
        },
        "type": "articles"
      }
    }
  ],
  "pagination": {
    "page": 0,
    "perPage": 5,
    "totalResults": 2
  }
}

In the response, the highlighting field shows the matching portions of text, and the resource field contains the document itself: its data, id, type, and metadata such as publication information. This structure helps users understand the relevance of each result and display the appropriate data in their application.
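
As a sketch of how an application might consume this response, the function below flattens each hit into an id, a title, and its highlighted snippets. It assumes the response has exactly the shape shown above; summarize_hits is a hypothetical helper, and the sample data is trimmed from that example.

```python
def summarize_hits(response: dict) -> list:
    """Flatten each hit into its id, title, and highlighted snippets."""
    summaries = []
    for hit in response.get("hits", []):
        resource = hit["resource"]
        # Collect the "data" text of every highlight entry, across all fields.
        snippets = [
            entry["data"]
            for entries in hit.get("highlighting", {}).values()
            for entry in entries
        ]
        summaries.append(
            {
                "id": resource["id"],
                "title": resource["data"].get("title"),
                "snippets": snippets,
            }
        )
    return summaries


sample_response = {
    "hits": [
        {
            "highlighting": {
                "abstract.passages": [
                    {
                        "data": "NSAIDs help in reducing inflammation...",
                        "path": "abstract.passages.0",
                        "score": 0.2887,
                        "type": "vector",
                    }
                ]
            },
            "resource": {
                "data": {"title": "How NSAIDs Reduce Inflammation"},
                "id": "1",
                "type": "articles",
            },
        }
    ],
    "pagination": {"page": 0, "perPage": 5, "totalResults": 1},
}
summaries = summarize_hits(sample_response)
```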