Feat/second nic communication #3129
Conversation
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##             main    #3129      +/-   ##
==========================================
+ Coverage   79.73%   80.14%   +0.41%
==========================================
  Files         291      296        +5
  Lines       65143    67530     +2387
==========================================
+ Hits        51941    54122     +2181
- Misses      12649    12858      +209
+ Partials      553      550        -3
Thank you for putting this together. I have a couple of basic questions.
As a premise, we assume a use case of object detection using video inference. Therefore, video frames are used as input data. In this use case, losing some video frames is not a critical issue. As a result, retransmission of lost data is out of scope.
First, the user defines the connectivity between Vertices in spec.edges of the manifest file, as before. An external controller watches Pod deployments and checks whether the domain name of the MultiNetwork Service corresponding to each Vertex and the Pod's second-NIC IP are registered in CoreDNS; if not, it registers them. Application execution is covered in my reply further below. Additionally, standardization of the Service for MultiNetwork via the Gateway API is currently under consideration, but the specification is still undecided.
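To make the registration step concrete, here is a minimal sketch in Go, assuming the CoreDNS etcd plugin with its SkyDNS-style key layout. The etcd endpoint, domain name, Pod name, and IP address below are illustrative placeholders, not part of the proposal; a real controller would derive them from a Pod watch.

```go
// Hypothetical sketch: register a Pod's second-NIC IP under the Vertex's
// domain name using the CoreDNS etcd plugin's SkyDNS-style key layout.
// Endpoint, domain, Pod name, and IP are assumptions for illustration.
package main

import (
	"context"
	"fmt"
	"strings"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// skydnsKey converts a domain such as "in-b.pipeline.svc.cluster.local"
// plus a Pod name into the reversed path the CoreDNS etcd plugin expects,
// e.g. /skydns/local/cluster/svc/pipeline/in-b/in-b-0.
func skydnsKey(domain, pod string) string {
	labels := strings.Split(domain, ".")
	for i, j := 0, len(labels)-1; i < j; i, j = i+1, j-1 {
		labels[i], labels[j] = labels[j], labels[i]
	}
	return "/skydns/" + strings.Join(labels, "/") + "/" + pod
}

func registerSecondNIC(ctx context.Context, cli *clientv3.Client, domain, pod, ip string) error {
	// The etcd plugin resolves A records from JSON values of this shape.
	value := fmt.Sprintf(`{"host":%q,"ttl":60}`, ip)
	_, err := cli.Put(ctx, skydnsKey(domain, pod), value)
	return err
}

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"etcd.kube-system:2379"}, // assumed endpoint
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	// Hard-coded here; the controller would take these from the watched Pod.
	err = registerSecondNIC(context.Background(), cli,
		"in-b.pipeline.svc.cluster.local", "in-b-0", "192.168.100.10")
	if err != nil {
		panic(err)
	}
}
```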
vigith left a comment:
Could you please provide links to all the important technologies you are referencing? It would help us understand better.
proposal/README.md (Outdated)

Therefore, we consider introducing a high-speed communication method with low transfer overhead, such as GPUDirect RDMA, for inter-vertex communication. To achieve this, the following elements are required.

1. GPUDirect RDMA, which enables direct device-to-device communication between GPUs, requires RDMA-capable NICs. In other words, it is necessary to introduce a high-speed network by assigning a second NIC for RDMA to each Pod, separate from the default network.
GPUDirect RDMA

Could you please provide a link for this?
I’ve updated the README and embedded the links.
Here is a list of links that I have come up with so far.
I will explain the MultiNetwork-related links as a supplement to my response to the previous question. As a prerequisite, in order to use GPUDirect RDMA, the application running inside the container must know the IP addresses of the destination Pods.

The related components currently work as follows: EndpointSlice collects the IP addresses of the Pods behind a Service, but only from the default network. As a result, this mechanism does not work for MultiNetwork paths that use a second NIC. To address this issue, we are considering two possible approaches.

The first approach is to use the CoreDNS etcd plugin to register a domain name corresponding to each Vertex, along with the list of second-NIC IP addresses assigned to the Pods belonging to that Vertex. Given the current Numaflow specification, where each Vertex already has the name of the next destination Vertex as an environment variable, we believe it is also feasible to provide the domain name of the destination Vertex. By querying CoreDNS during application execution, the application can then obtain the candidate destination IP addresses (see the sketch below).

The second approach builds on ongoing efforts (GEP-3539) to evolve the Service API for MultiNetwork support using the Gateway API. At this point, we prefer to start with the first approach because it has a lower implementation cost. However, if it turns out to be infeasible, we plan to move forward with the second approach.
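As a minimal sketch of the application-side lookup under the first approach: the destination Vertex's domain name is read from an environment variable and resolved against CoreDNS. NUMAFLOW_DST_VERTEX_DOMAIN is a hypothetical name used only for illustration, not an existing Numaflow variable.

```go
// Hypothetical sketch: resolve the destination Vertex's second-NIC IPs by
// querying CoreDNS for the Vertex's domain name. The environment variable
// name is an assumption for illustration.
package main

import (
	"fmt"
	"net"
	"os"
)

func main() {
	// Assumed to be provided via the manifest or controller, e.g.
	// "out-c.pipeline.svc.cluster.local".
	domain := os.Getenv("NUMAFLOW_DST_VERTEX_DOMAIN")
	if domain == "" {
		fmt.Fprintln(os.Stderr, "destination vertex domain not set")
		os.Exit(1)
	}

	// CoreDNS answers with the second-NIC IPs registered via the etcd
	// plugin; each is a candidate peer for the RDMA connection.
	ips, err := net.LookupHost(domain)
	if err != nil {
		fmt.Fprintln(os.Stderr, "lookup failed:", err)
		os.Exit(1)
	}
	for _, ip := range ips {
		fmt.Println("candidate destination:", ip)
	}
}
```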
- Change doc structure
- Add a new chapter (Functionality)
- Swap the order of Workflow and Resource Specification
- Update Workflow and Resource Specification

Signed-off-by: Kazuki Yamamoto <[email protected]>
What this PR does / why we need it
This PR proposes a new method for enabling direct data communication between Numaflow vertices, and describes the motivation, resource specifications, and workflows.
Internal discussions are ongoing, so we will revise the doc as appropriate.
Related issues
#2990
This PR is an initial design derived from the above post.
We would like to discuss this with the community.
Testing
This PR includes only documentation, so no tests were performed.
Special notes for reviewers
Since the base branch was wrong in #3125, I recreated this PR.