22
33Map in a Map vertex takes an input and returns 0, 1, or more outputs (also known as a flat-map operation). Map is an element-wise operator.
44
5- ## Build Your Own UDF
6-
7- You can build your own UDF in multiple languages.
8-
9- Check the links below to see the UDF examples for different languages.
10-
11- - [Python](https://github.com/numaproj/numaflow-python/tree/main/packages/pynumaflow/examples/map/)
12- - [Golang](https://github.com/numaproj/numaflow-go/tree/main/examples/mapper/)
13- - [Java](https://github.com/numaproj/numaflow-java/tree/main/examples/src/main/java/io/numaproj/numaflow/examples/map/)
14-
155After building a docker image for the written UDF, specify the image as below in the vertex spec.
166
177``` yaml
@@ -23,44 +13,45 @@ spec:
2313 image: my-python-udf-example:latest
2414```
2515
26- ### Streaming Mode
16+ Map supports three modes: [Unary](#unary-mode), [Streaming](#streaming-mode), and [Batch](#batch-mode).
2717
28- In cases the map function generates more than one output (e.g., flat map), the UDF can be
29- configured to run in a streaming mode instead of batching, which is the default mode.
30- In streaming mode, the messages will be pushed to the downstream vertices once generated
31- instead of in a batch at the end.
18+ ## Unary Mode
3219
33- Note that to maintain data orderliness, we restrict the read batch size to be `1`.
20+ Unary Map is the default mode where each input message is processed individually and returns 0, 1, or more outputs.
3421
35- ```yaml
36- spec:
37-   vertices:
38-     - name: my-vertex
39-       limits:
40-         # mapstreaming won't work if readBatchSize is != 1
41-         readBatchSize: 1
42- ```
22+ Check the links below to see the UDF examples for different languages.
23+
24+ - [Python](https://github.com/numaproj/numaflow-python/tree/main/packages/pynumaflow/examples/map/)
25+ - [Golang](https://github.com/numaproj/numaflow-go/tree/main/examples/mapper/)
26+ - [Java](https://github.com/numaproj/numaflow-java/tree/main/examples/src/main/java/io/numaproj/numaflow/examples/map/)
27+
28+ ## Streaming Mode
29+
30+ In cases where the map function generates more than one output (e.g., flat map), the UDF can be
31+ configured to run in streaming mode, where messages are pushed to the downstream vertices as
32+ soon as each output is generated, instead of collecting all the responses and sending them
33+ together when the function returns.
4334
4435Check the links below to see the UDF examples in streaming mode for different languages.
4536
4637- [Python](https://github.com/numaproj/numaflow-python/tree/main/packages/pynumaflow/examples/mapstream/flatmap_stream/)
4738- [Golang](https://github.com/numaproj/numaflow-go/tree/main/examples/mapstreamer/flatmap_stream/)
4839- [Java](https://github.com/numaproj/numaflow-java/tree/main/examples/src/main/java/io/numaproj/numaflow/examples/mapstream/flatmapstream/)
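
The difference between returning all outputs at once and streaming them can be sketched in plain Python. This is a minimal illustration only: `split_unary`, `split_streaming`, and `forward` are hypothetical names invented for this sketch and are not part of any Numaflow SDK; the linked examples show the real interfaces.

```python
from typing import Callable, Iterable, Iterator, List


# Hypothetical stand-ins for illustration; not the Numaflow SDK API.
def split_unary(payload: str) -> List[str]:
    """Unary style: build every output first, then return them together."""
    return [word for word in payload.split(",") if word]


def split_streaming(payload: str) -> Iterator[str]:
    """Streaming style: yield each output as soon as it is ready."""
    for word in payload.split(","):
        if word:
            yield word


def forward(outputs: Iterable[str], sink: Callable[[str], None]) -> None:
    """Push outputs downstream one at a time, in the order they arrive."""
    for out in outputs:
        sink(out)


if __name__ == "__main__":
    received: List[str] = []
    # With the generator, each word reaches the sink before the next is produced.
    forward(split_streaming("a,b,,c"), received.append)
    print(received)
```

Both functions produce the same outputs for the same input; only the moment at which each output becomes available to the caller differs.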
4940
50- ### Batch Map Mode
41+ ## Batch Mode
5142
5243BatchMap is an interface that allows developers to process multiple data items in a single UDF call,
5344rather than each item in separate calls.
5445
5546The BatchMap interface can be helpful in scenarios where performing operations on a group of data can be more efficient.
5647
57- #### Important Considerations
48+ ### Important Considerations
5849
5950When using BatchMap, there are a few important considerations to keep in mind:
6051
61- - Ensure that the BatchResponses object is tagged with the correct request ID.
52+ - Ensure that the BatchResponses object is tagged with the correct request ID.
6253Each Datum has a unique ID tag, which will be used by Numaflow to ensure correctness.
63- - Ensure that the length of the BatchResponses list is equal to the number of requests received. This means that for
54+ - Ensure that the length of the BatchResponses list is equal to the number of requests received. This means that for
6455every input data item, there should be a corresponding response in the BatchResponses list.
6556- The total batch size can be up to `readBatchSize` long.
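
The first two considerations above can be sketched as a pair of checks. This is a minimal illustration under stated assumptions: `Request`, `Response`, `batch_map`, and `check_batch` are hypothetical stand-ins written for this sketch, not the SDK's `Datum` or `BatchResponses` types, and the upper-casing step is an arbitrary placeholder for real processing.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical stand-ins for illustration; not the Numaflow SDK types.
@dataclass
class Request:
    id: str
    value: bytes


@dataclass
class Response:
    id: str
    outputs: List[bytes] = field(default_factory=list)


def batch_map(requests: List[Request]) -> List[Response]:
    """Process a whole batch in one call, emitting one response per request."""
    responses = []
    for req in requests:
        resp = Response(id=req.id)  # tag the response with the request ID
        resp.outputs.append(req.value.upper())  # placeholder processing
        responses.append(resp)
    return responses


def check_batch(requests: List[Request], responses: List[Response]) -> None:
    """Raise ValueError if the batch breaks either consideration."""
    if len(responses) != len(requests):
        raise ValueError("expected one response per request")
    if {r.id for r in requests} != {r.id for r in responses}:
        raise ValueError("response IDs do not match request IDs")
```

A check like this may be useful in unit tests for a batch UDF, since a missing or mistagged response is easy to introduce when filtering or grouping items inside the batch.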
6657
@@ -71,7 +62,7 @@ Check the links below to see the UDF examples in batch mode for different langua
7162- [Java](https://github.com/numaproj/numaflow-java/tree/main/examples/src/main/java/io/numaproj/numaflow/examples/batchmap/)
7263- [Rust](https://github.com/numaproj/numaflow-rs/tree/main/examples/batchmap-cat/)
7364
74- ### Available Environment Variables
65+ ## Available Environment Variables
7566
7667Some environment variables are available in the user-defined function container; they might be useful in your own UDF implementation.
7768
@@ -81,7 +72,9 @@ Some environment variables are available in the user-defined function container,
8172- `NUMAFLOW_PIPELINE_NAME` - Name of the pipeline.
8273- `NUMAFLOW_VERTEX_NAME` - Name of the vertex.
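
As a rough sketch of how a UDF might read these variables, the snippet below builds a label from the pipeline and vertex names. The `vertex_identity` helper and its fallback values are assumptions made for this illustration; only the two variable names come from the list above.

```python
import os
from typing import Mapping


def vertex_identity(env: Mapping[str, str] = os.environ) -> str:
    """Return "<pipeline>/<vertex>" from the container environment.

    The fallback strings are hypothetical defaults for running outside a pipeline.
    """
    pipeline = env.get("NUMAFLOW_PIPELINE_NAME", "unknown-pipeline")
    vertex = env.get("NUMAFLOW_VERTEX_NAME", "unknown-vertex")
    return f"{pipeline}/{vertex}"


if __name__ == "__main__":
    print(vertex_identity())
```

Passing the environment as a parameter keeps the helper easy to test without modifying the real process environment.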
8374
84- ### Configuration
75+ ## Configuration
76+
77+ To achieve ordering, please set `readBatchSize` to 1.
8578
8679Configuration data can be provided to the UDF container at runtime in multiple ways.
8780