97 changes: 97 additions & 0 deletions src/MongoDB.Driver/CreateAutoEmbeddingVectorSearchIndexModel.cs
@@ -0,0 +1,97 @@
/* Copyright 2010-present MongoDB Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using MongoDB.Bson;

namespace MongoDB.Driver;

/// <summary>
/// Defines a vector index model for an auto-embedding vector index using strongly-typed C# APIs.
/// </summary>
public sealed class CreateAutoEmbeddingVectorSearchIndexModel<TDocument> : CreateVectorSearchIndexModelBase<TDocument>
{
/// <summary>
/// The name of the embedding model to use, such as "voyage-4" or "voyage-4-large".
/// </summary>
public string AutoEmbeddingModelName { get; }

/// <summary>
/// Indicates the type of data that will be embedded for an auto-embedding index.
/// </summary>
public VectorEmbeddingModality Modality { get; init; } = VectorEmbeddingModality.Text;

/// <summary>
/// Initializes a new instance of the <see cref="CreateAutoEmbeddingVectorSearchIndexModel{TDocument}"/> class for a
/// vector index that will automatically create embeddings from a given field in the document. The embedding model
/// to use must be passed to this constructor.
/// </summary>
/// <param name="field">The field containing the data from which embeddings will be created.</param>
/// <param name="name">The index name.</param>
/// <param name="embeddingModelName">The name of the embedding model to use, such as "voyage-4" or "voyage-4-large".</param>
/// <param name="filterFields">Fields that may be used as filters in the vector query.</param>
public CreateAutoEmbeddingVectorSearchIndexModel(
FieldDefinition<TDocument> field,
string name,
string embeddingModelName,
params FieldDefinition<TDocument>[] filterFields)
: base(field, name, filterFields)
{
AutoEmbeddingModelName = embeddingModelName;
}

/// <summary>
/// Initializes a new instance of the <see cref="CreateAutoEmbeddingVectorSearchIndexModel{TDocument}"/> class for a
/// vector index that will automatically create embeddings from a given field in the document. The embedding model
/// to use must be passed to this constructor.
/// </summary>
/// <param name="field">An expression pointing to the field containing the data from which embeddings will be created.</param>
/// <param name="name">The index name.</param>
/// <param name="embeddingModelName">The name of the embedding model to use, such as "voyage-4" or "voyage-4-large".</param>
/// <param name="filterFields">Expressions pointing to fields that may be used as filters in the vector query.</param>
public CreateAutoEmbeddingVectorSearchIndexModel(
Expression<Func<TDocument, object>> field,
string name,
string embeddingModelName,
params Expression<Func<TDocument, object>>[] filterFields)
: this(
new ExpressionFieldDefinition<TDocument>(field),
name,
embeddingModelName,
filterFields?
.Select(f => (FieldDefinition<TDocument>)new ExpressionFieldDefinition<TDocument>(f))
.ToArray())
{
}

/// <inheritdoc/>
internal override BsonDocument Render(RenderArgs<TDocument> renderArgs)
Contributor

Suggestion for CreateSearchIndexModel:
Should we introduce
BsonDocument Render(RenderArgs<TDocument> renderArgs) => Definition
for more uniform handling in CreateCreateIndexesOperation?

Also, now that Render is internal, the exception in CreateSearchIndexModel.Definition is not accurate. Is it possible to override that property and throw there? I think making the property virtual is acceptable as a breaking change?

Contributor Author

Can you refer back to the comments that Robert had on the previous PR, and see if you are okay with the breaking changes that were rejected there?
#1769
#1795

Contributor Author

> Is it possible to override that property and throw there?

From: #1795 (review)

If making this property virtual is a breaking change, we could just let it return null in the base class.

Contributor Author

@ajcvickers ajcvickers Jan 13, 2026

Likewise, anything that breaks what the property currently does--like throwing and expecting people (almost certainly just us) to call Render instead for current cases that work with the old type.

My preference historically would be to take small binary breaks to things like this that are primarily called by the driver, in order to keep internal quality high, and to create a coherent design and experience. But I was under the impression that we didn't want to take binary breaks because of the impact of third-party libraries that have not been rebuilt against latest.

Contributor

> property virtual is a breaking change

I think this is something we can tolerate; I don't see how this can be breaking in practice.

Also, my other suggestion was to have a Render method in CreateSearchIndexModel,
BsonDocument Render(RenderArgs<TDocument> renderArgs) => Definition, for consistency.

Both suggestions for your consideration.

{
var vectorField = new BsonDocument
{
{ "type", "autoEmbed" },
{ "path", Field.Render(renderArgs).FieldName },
{ "modality", Modality.ToString().ToLowerInvariant() },
{ "model", AutoEmbeddingModelName },
};

var fieldDocuments = new List<BsonDocument> { vectorField };
RenderFilterFields(renderArgs, fieldDocuments);
return new BsonDocument { { "fields", new BsonArray(fieldDocuments) } };
}
}
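For orientation, here is a minimal usage sketch of the new model type. It assumes a driver build that contains this change; the `Movie` class, the index name, and the choice of fields are hypothetical and only serve the illustration. The constructor shape and the `Modality` property are taken from the diff above.

```csharp
using MongoDB.Driver;

// Hypothetical document type, used only for this illustration.
public class Movie
{
    public string Title { get; set; }
    public string Plot { get; set; }
    public int Year { get; set; }
}

public static class AutoEmbeddingIndexExample
{
    // Builds an index model that asks the server to embed the Plot field.
    public static CreateAutoEmbeddingVectorSearchIndexModel<Movie> BuildModel() =>
        new CreateAutoEmbeddingVectorSearchIndexModel<Movie>(
            m => m.Plot,        // field whose content is embedded
            "plot_auto_embed",  // index name (hypothetical)
            "voyage-4",         // embedding model name, as in the doc comment
            m => m.Year)        // optional filter field
        {
            // Text is already the default; it is set here only for clarity.
            Modality = VectorEmbeddingModality.Text
        };
}
```

Going by the Render method above, and assuming default field naming, the rendered definition should look roughly like `{ fields: [ { type: "autoEmbed", path: "Plot", modality: "text", model: "voyage-4" }, { type: "filter", path: "Year" } ] }`. The model would then be passed to the collection's search index manager (for example `collection.SearchIndexes.CreateOne(model)`) in the same way as the existing vector index model.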
77 changes: 23 additions & 54 deletions src/MongoDB.Driver/CreateVectorSearchIndexModel.cs
@@ -22,15 +22,10 @@
namespace MongoDB.Driver;

/// <summary>
/// Defines a vector index model using strongly-typed C# APIs.
/// Defines a vector index model for pre-embedded vector indexes using strongly-typed C# APIs.
/// </summary>
public sealed class CreateVectorSearchIndexModel<TDocument> : CreateSearchIndexModel
public sealed class CreateVectorSearchIndexModel<TDocument> : CreateVectorSearchIndexModelBase<TDocument>
{
/// <summary>
/// The field containing the vectors to index.
/// </summary>
public FieldDefinition<TDocument> Field { get; }

/// <summary>
/// The <see cref="VectorSimilarity"/> to use to search for top K-nearest neighbors.
/// </summary>
@@ -41,11 +36,6 @@ public sealed class CreateVectorSearchIndexModel<TDocument> : CreateSearchIndexM
/// </summary>
public int Dimensions { get; }

/// <summary>
/// Fields that may be used as filters in the vector query.
/// </summary>
public IReadOnlyList<FieldDefinition<TDocument>> FilterFields { get; }

/// <summary>
/// Type of automatic vector quantization for your vectors.
/// </summary>
@@ -57,13 +47,15 @@ public sealed class CreateVectorSearchIndexModel<TDocument> : CreateSearchIndexM
public int? HnswMaxEdges { get; init; }

/// <summary>
/// Analogous to numCandidates at query-time, this parameter controls the maximum number of nodes to evaluate to find the closest neighbors to connect to a new node.
/// Analogous to numCandidates at query-time, this parameter controls the maximum number of nodes to evaluate to
/// find the closest neighbors to connect to a new node.
/// </summary>
public int? HnswNumEdgeCandidates { get; init; }

/// <summary>
/// Initializes a new instance of the <see cref="CreateVectorSearchIndexModel{TDocument}"/> class, passing the
/// required options for <see cref="VectorSimilarity"/> and the number of vector dimensions to the constructor.
/// Initializes a new instance of the <see cref="CreateVectorSearchIndexModel{TDocument}"/> class for a vector
/// index where the vector embeddings are created manually. The required options for <see cref="VectorSimilarity"/>
/// and the number of vector dimensions are passed to the constructor.
/// </summary>
/// <param name="name">The index name.</param>
/// <param name="field">The field containing the vectors to index.</param>
@@ -76,17 +68,16 @@ public CreateVectorSearchIndexModel(
VectorSimilarity similarity,
int dimensions,
params FieldDefinition<TDocument>[] filterFields)
: base(name, SearchIndexType.VectorSearch)
: base(field, name, filterFields)
{
Field = field;
Similarity = similarity;
Dimensions = dimensions;
FilterFields = filterFields?.ToList() ?? [];
}

/// <summary>
/// Initializes a new instance of the <see cref="CreateVectorSearchIndexModel{TDocument}"/> class, passing the
/// required options for <see cref="VectorSimilarity"/> and the number of vector dimensions to the constructor.
/// Initializes a new instance of the <see cref="CreateVectorSearchIndexModel{TDocument}"/> class for a vector
/// index where the vector embeddings are created manually. The required options for <see cref="VectorSimilarity"/>
/// and the number of vector dimensions are passed to the constructor.
/// </summary>
/// <param name="name">The index name.</param>
/// <param name="field">An expression pointing to the field containing the vectors to index.</param>
@@ -110,56 +101,34 @@ public CreateVectorSearchIndexModel(
{
}

/// <summary>
/// Renders the index model to a <see cref="BsonDocument"/>.
/// </summary>
/// <param name="renderArgs">The render arguments.</param>
/// <returns>A <see cref="BsonDocument" />.</returns>
public BsonDocument Render(RenderArgs<TDocument> renderArgs)
/// <inheritdoc/>
internal override BsonDocument Render(RenderArgs<TDocument> renderArgs)
{
var similarityValue = Similarity == VectorSimilarity.DotProduct
? "dotProduct" // Because neither "DotProduct" nor "dotproduct" is allowed.
: Similarity.ToString().ToLowerInvariant();

var vectorField = new BsonDocument
{
{ "type", BsonString.Create("vector") },
{ "type", "vector" },
{ "path", Field.Render(renderArgs).FieldName },
{ "numDimensions", BsonInt32.Create(Dimensions) },
{ "similarity", BsonString.Create(similarityValue) },
{ "numDimensions", Dimensions },
{ "similarity", similarityValue },
};

if (Quantization.HasValue)
{
vectorField.Add("quantization", BsonString.Create(Quantization.ToString()?.ToLower()));
}
vectorField.Add("quantization", Quantization.ToString()?.ToLowerInvariant(), Quantization.HasValue);

if (HnswMaxEdges != null || HnswNumEdgeCandidates != null)
{
var hnswDocument = new BsonDocument
{
{ "maxEdges", BsonInt32.Create(HnswMaxEdges ?? 16) },
{ "numEdgeCandidates", BsonInt32.Create(HnswNumEdgeCandidates ?? 100) }
};
vectorField.Add("hnswOptions", hnswDocument);
}

var fieldDocuments = new List<BsonDocument> { vectorField };

if (FilterFields != null)
{
foreach (var filterPath in FilterFields)
{
var fieldDocument = new BsonDocument
vectorField.Add("hnswOptions",
new BsonDocument
{
{ "type", BsonString.Create("filter") },
{ "path", BsonString.Create(filterPath.Render(renderArgs).FieldName) }
};

fieldDocuments.Add(fieldDocument);
}
{ "maxEdges", HnswMaxEdges ?? 16 }, { "numEdgeCandidates", HnswNumEdgeCandidates ?? 100 }
});
}

var fieldDocuments = new List<BsonDocument> { vectorField };
RenderFilterFields(renderArgs, fieldDocuments);
return new BsonDocument { { "fields", new BsonArray(fieldDocuments) } };
}
}
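The two value-mapping rules in this Render method (the `dotProduct` special case and the HNSW defaults) can be sketched in isolation. The enum and helper names below are local stand-ins invented for this sketch, not driver types; only the rules themselves (camel-case `dotProduct`, defaults of 16 and 100) come from the diff above.

```csharp
using System;

// Local stand-in for the driver's VectorSimilarity enum so the sketch compiles on its own.
public enum SimilarityKind { Euclidean, Cosine, DotProduct }

public static class VectorIndexRenderingSketch
{
    // Lower-casing "DotProduct" would give "dotproduct", which the comment in the
    // diff says is not accepted, so that one value is mapped explicitly.
    public static string ToServerValue(SimilarityKind similarity) =>
        similarity == SimilarityKind.DotProduct
            ? "dotProduct"
            : similarity.ToString().ToLowerInvariant();

    // hnswOptions is only emitted when at least one option is set; the other
    // option then falls back to the default used in the diff (16 or 100).
    public static (int MaxEdges, int NumEdgeCandidates)? HnswOptions(int? maxEdges, int? numEdgeCandidates)
    {
        if (maxEdges == null && numEdgeCandidates == null)
        {
            return null; // no hnswOptions sub-document
        }

        return (maxEdges ?? 16, numEdgeCandidates ?? 100);
    }
}
```

One consequence of the second rule is that setting only `HnswMaxEdges` still sends an explicit `numEdgeCandidates` of 100 rather than leaving the choice to the server.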
82 changes: 82 additions & 0 deletions src/MongoDB.Driver/CreateVectorSearchIndexModelBase.cs
@@ -0,0 +1,82 @@
/* Copyright 2010-present MongoDB Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

using System.Collections.Generic;
using System.Linq;
using MongoDB.Bson;

namespace MongoDB.Driver;

/// <summary>
/// Defines common parts of a vector index model using strongly-typed C# APIs.
/// </summary>
public abstract class CreateVectorSearchIndexModelBase<TDocument> : CreateSearchIndexModel
{
/// <summary>
/// The field containing the vectors to index.
/// </summary>
public FieldDefinition<TDocument> Field { get; }

/// <summary>
/// Fields that may be used as filters in the vector query.
/// </summary>
public IReadOnlyList<FieldDefinition<TDocument>> FilterFields { get; }

/// <summary>
/// Initializes a new instance of the <see cref="CreateVectorSearchIndexModelBase{TDocument}"/> class with the
/// field, index name, and filter fields that are common to all vector index models.
/// </summary>
/// <param name="name">The index name.</param>
/// <param name="field">The field containing the vectors to index.</param>
/// <param name="filterFields">Fields that may be used as filters in the vector query.</param>
protected CreateVectorSearchIndexModelBase(
FieldDefinition<TDocument> field,
string name,
params FieldDefinition<TDocument>[] filterFields)
: base(name, SearchIndexType.VectorSearch)
{
Field = field;
FilterFields = filterFields?.ToList() ?? [];
}

/// <summary>
/// Renders the index model to a <see cref="BsonDocument"/>.
/// </summary>
/// <param name="renderArgs">The render arguments.</param>
/// <returns>A <see cref="BsonDocument" />.</returns>
internal abstract BsonDocument Render(RenderArgs<TDocument> renderArgs);

/// <summary>
/// Called by subclasses to render the filters for the index fields section.
/// </summary>
/// <param name="renderArgs">The render args.</param>
/// <param name="fields">The list into which fields should be added.</param>
private protected void RenderFilterFields(RenderArgs<TDocument> renderArgs, List<BsonDocument> fields)
{
if (FilterFields != null)
{
foreach (var filterPath in FilterFields)
{
var fieldDocument = new BsonDocument
{
{ "type", "filter" }, { "path", filterPath.Render(renderArgs).FieldName }
};

fields.Add(fieldDocument);
}
}
}
}
2 changes: 1 addition & 1 deletion src/MongoDB.Driver/MongoCollectionImpl.cs
@@ -1752,7 +1752,7 @@ private CreateSearchIndexesOperation CreateCreateIndexesOperation(
=> new CreateSearchIndexRequest(
model.Name,
model.Type,
model is CreateVectorSearchIndexModel<TDocument> createVectorSearchIndexModel
model is CreateVectorSearchIndexModelBase<TDocument> createVectorSearchIndexModel
? createVectorSearchIndexModel.Render(renderArgs)
: model.Definition)),
_collection._messageEncoderSettings);
12 changes: 11 additions & 1 deletion src/MongoDB.Driver/PipelineStageDefinitionBuilder.cs
@@ -2167,9 +2167,9 @@ public static PipelineStageDefinition<TInput, TInput> VectorSearch<TInput>(
args =>
{
ClientSideProjectionHelper.ThrowIfClientSideProjection(args.DocumentSerializer, operatorName);

var vectorSearchOperator = new BsonDocument
{
{ "queryVector", queryVector.Vector },
{ "path", field.Render(args).FieldName },
{ "limit", limit },
{ "numCandidates", options?.NumberOfCandidates ?? limit * 10, options?.Exact != true },
@@ -2178,6 +2178,16 @@ public static PipelineStageDefinition<TInput, TInput> VectorSearch<TInput>(
{ "exact", true, options?.Exact == true }
};

if (queryVector.Vector is BsonString bsonString)
{
vectorSearchOperator["query"] = new BsonDocument { { "text", bsonString } };
vectorSearchOperator.Add("model", options?.AutoEmbeddingModelName, options?.AutoEmbeddingModelName != null);
}
else
{
vectorSearchOperator["queryVector"] = queryVector.Vector;
}

var document = new BsonDocument(operatorName, vectorSearchOperator);
return new RenderedPipelineStageDefinition<TInput>(operatorName, document, args.DocumentSerializer);
});
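The branching added in this hunk can be sketched without the driver. The version below uses a plain dictionary as a stand-in for the BSON document; the class and method names are hypothetical, and the sketch ignores the `exact` option and the other fields elided from the hunk. It only illustrates that a text query produces `query: { text: ... }` (plus an optional `model`), while a pre-computed embedding produces `queryVector`.

```csharp
using System.Collections.Generic;

public static class VectorSearchStageSketch
{
    // query is either a string (text for an auto-embedding index) or a
    // pre-computed embedding such as double[].
    public static Dictionary<string, object> Build(
        object query, string path, int limit, string modelName = null)
    {
        var stage = new Dictionary<string, object>
        {
            ["path"] = path,
            ["limit"] = limit,
            // Default used in the diff when no explicit value is given
            // and the search is not exact.
            ["numCandidates"] = limit * 10,
        };

        if (query is string text)
        {
            // Text query: the server creates the embedding.
            stage["query"] = new Dictionary<string, object> { ["text"] = text };
            if (modelName != null)
            {
                stage["model"] = modelName;
            }
        }
        else
        {
            // Pre-computed embedding: sent as before.
            stage["queryVector"] = query;
        }

        return stage;
    }
}
```

Note that in this shape the model name is only meaningful for text queries; for a pre-computed vector it is not sent at all, which matches the placement of the `model` line inside the string branch of the diff.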
21 changes: 21 additions & 0 deletions src/MongoDB.Driver/QueryVector.cs
@@ -40,6 +40,18 @@ private QueryVector(BsonArray array)
Vector = array;
}

/// <summary>
/// Initializes a new instance of the <see cref="QueryVector"/> class with un-vectorized text for use with
/// an auto-embedding vector index, which will create the actual vector from this text.
/// </summary>
/// <param name="text">The text to search for.</param>
public QueryVector(string text)
{
Ensure.IsNotNull(text, nameof(text));

Vector = text;
}

/// <summary>
/// Initializes a new instance of the <see cref="QueryVector"/> class.
/// </summary>
@@ -79,6 +91,15 @@ public QueryVector(ReadOnlyMemory<int> readOnlyMemory) :
{
}

/// <summary>
/// Performs an implicit conversion from <see cref="string"/> to <see cref="QueryVector"/>.
/// </summary>
/// <param name="text">The query text, for use with an auto-embedding index.</param>
/// <returns>
/// The result of the conversion.
/// </returns>
public static implicit operator QueryVector(string text) => new(text);

/// <summary>
/// Performs an implicit conversion from <see cref="double"/>[] to <see cref="QueryVector"/>.
/// </summary>
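A short sketch of how the new string constructor and implicit conversion might be used next to the existing array conversion, assuming a driver build that contains this change. The class name and the sample values are hypothetical.

```csharp
using MongoDB.Driver;

public static class QueryVectorExamples
{
    // Text query for an auto-embedding index, via the new string constructor.
    public static QueryVector FromConstructor() =>
        new QueryVector("movies about time travel");

    // The same query, via the new implicit conversion from string.
    public static QueryVector FromImplicitConversion()
    {
        QueryVector queryVector = "movies about time travel";
        return queryVector;
    }

    // Pre-computed embeddings keep using the existing double[] conversion.
    public static QueryVector FromEmbedding() => new[] { 0.12, -0.34, 0.56 };
}
```

Because the constructor calls `Ensure.IsNotNull`, a null string is rejected at construction time, and the implicit conversion inherits that check since it forwards to the constructor.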