Flow Matching Elites with NVIDIA PhysicalAI Autonomous Vehicles dataset
A few months ago NVIDIA open-sourced their Physical AI AV dataset (PAI). Since I was planning to experiment with Flow Matching Elites after my earlier post on Diffusion Elites, I decided to give it a try and use the NVIDIA PAI dataset instead of synthesizing trajectories.
NVIDIA PAI Dataset
The NVIDIA PAI dataset seems to be mostly data collected using their Hyperion platform. It is not clear to me which versions of the platform they used (probably a mix), but since I was going to use only the ego motion trajectories, I looked at some samples and they looked reasonable from a kinematic perspective (except for a few). It is a lot of data, with ~306k 20s scenes, so I decided to use only ~20k initially to make training more manageable, since I was using only a single RTX A6000 for the experiments.
Here are some trajectories from the dataset after transforming them into the ego coordinate frame and interpolating at 10 Hz:

And here is an animation sampling trajectories from it and unrolling them to facilitate visualization:
Note that I’m not using anything from the dataset besides the ego motion trajectories. The main goal is to get a decent flow matching model from which I can sample trajectories, and then use this model in the elites loop like I showed in the Diffusion Elites post.
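To make the preprocessing concrete, here is a minimal sketch of the ego-frame transform and the 10 Hz resampling. It assumes each scene provides timestamps, world-frame (x, y) positions and an initial heading, and it uses linear interpolation; the function names and the interpolation method are my assumptions for illustration, not necessarily what was used for the plots above.

```python
import numpy as np

def to_ego_frame(xy, yaw0):
    """Express world-frame positions (T, 2) relative to the first pose.

    After the transform the vehicle starts at the origin and its initial
    heading points along +x.
    """
    c, s = np.cos(yaw0), np.sin(yaw0)
    rot = np.array([[c, s], [-s, c]])  # rotation by -yaw0 (world -> ego)
    return (xy - xy[0]) @ rot.T

def resample(t, xy, hz=10.0):
    """Linearly interpolate a trajectory onto a uniform time grid."""
    t_new = np.arange(t[0], t[-1] + 1e-9, 1.0 / hz)
    x = np.interp(t_new, t, xy[:, 0])
    y = np.interp(t_new, t, xy[:, 1])
    return t_new, np.stack([x, y], axis=1)

# Toy example: a vehicle heading north (yaw = 90 degrees) and driving north.
t = np.array([0.0, 0.5, 1.0])
xy_world = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
ego = to_ego_frame(xy_world, np.pi / 2)
t_10hz, ego_10hz = resample(t, ego)
```

In the toy example the motion ends up along the +x axis of the ego frame, which is what makes trajectories from different scenes comparable.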
Flow Matching Elites
The diagram below shows an overview of the method:
First we train a flow matching model using the 10 Hz resampled trajectories from the NVIDIA PAI dataset. We predict 6 seconds into the future unconditionally: there is no conditioning in the model other than the time conditioning from flow matching, and we don’t use anything else from the dataset besides the ego trajectories.
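As a rough illustration of what the training target looks like, here is a sketch of how a flow matching batch can be built. I am assuming the common linear interpolation path between noise and data (the post does not depend on this particular choice) and a trajectory shape of 60 steps by 2 coordinates (6 s at 10 Hz, x and y); both are assumptions for the example.

```python
import numpy as np

def flow_matching_batch(x1, rng):
    """Build (x_t, t, target velocity) for a batch of data trajectories.

    x1: (B, H, 2) trajectories from the dataset (t = 1 end of the path).
    Assumes the linear path x_t = (1 - t) * x0 + t * x1, whose velocity
    is constant along the path: x1 - x0.
    """
    x0 = rng.standard_normal(x1.shape)          # Gaussian noise (t = 0)
    t = rng.uniform(size=(x1.shape[0], 1, 1))   # one time per trajectory
    xt = (1.0 - t) * x0 + t * x1
    v_target = x1 - x0
    return xt, t, v_target

def fm_loss(v_pred, v_target):
    """Mean squared error between predicted and target velocities."""
    return np.mean((v_pred - v_target) ** 2)

rng = np.random.default_rng(0)
x1 = rng.standard_normal((4, 60, 2))  # placeholder for real trajectories
xt, t, v_target = flow_matching_batch(x1, rng)
```

The network is then trained to predict `v_target` from `(xt, t)` alone, which is why time is the only conditioning signal it needs.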
The model I trained was a 1D CNN U-Net using FiLM modulation for the time, with no goal, obstacle or any other conditioning: we go from Gaussian noise to the trajectory unconditionally.
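For readers not familiar with FiLM, the idea is that the time embedding produces a per-channel scale and shift that are applied to the convolutional feature maps. The sketch below shows that operation in NumPy; the sinusoidal embedding, the dimensions and the single linear projection are assumptions for illustration, not the exact layers of the model.

```python
import numpy as np

def sinusoidal_embedding(t, dim=32):
    """Map times t of shape (B,) to embeddings of shape (B, dim)."""
    half = dim // 2
    freqs = np.exp(-np.log(10000.0) * np.arange(half) / half)
    args = t[:, None] * freqs[None, :]
    return np.concatenate([np.sin(args), np.cos(args)], axis=1)

def film(h, t_emb, w, b):
    """Feature-wise linear modulation of 1D feature maps.

    h: (B, C, T) features, t_emb: (B, D), w: (D, 2C), b: (2C,).
    A linear layer maps the time embedding to a scale (gamma) and a
    shift (beta) per channel, broadcast along the time axis.
    """
    gamma, beta = np.split(t_emb @ w + b, 2, axis=1)  # each (B, C)
    return gamma[:, :, None] * h + beta[:, :, None]

rng = np.random.default_rng(0)
B, C, T, D = 3, 8, 60, 32
h = rng.standard_normal((B, C, T))
t_emb = sinusoidal_embedding(rng.uniform(size=B), dim=D)
w = 0.1 * rng.standard_normal((D, 2 * C))
b = np.concatenate([np.ones(C), np.zeros(C)])  # start close to identity
out = film(h, t_emb, w, b)
```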
After that, we have a decent model from which we can easily sample trajectories. This is where Flow Matching Elites (FM-Elites) comes in. We basically run a few iterations of the following steps:
- Sample from the Gaussian over the latent noise (initially the unit Gaussian)
- Decode a bunch of trajectories using Euler steps to solve the flow matching ODE
- Score the trajectories using a simple reward (distance to goal, obstacle avoidance and cross-track error)
- Keep the top 10% as elites and refit the Gaussian of the latent noise
- Execute the best trajectory (first step or an action chunk)
- Repeat!
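The loop above can be sketched in a few lines. This is a minimal illustration and not the actual implementation: the velocity field and the reward are toy stand-ins (the real ones are the trained network and the reward described above), and the diagonal Gaussian refit, population size and step counts are assumptions.

```python
import numpy as np

def euler_decode(velocity_fn, z, n_steps=10):
    """Integrate dx/dt = v(x, t) from t = 0 (noise) to t = 1 with Euler."""
    x, dt = z.copy(), 1.0 / n_steps
    for k in range(n_steps):
        x = x + dt * velocity_fn(x, k * dt)
    return x

def fm_elites(velocity_fn, reward_fn, shape, n_iters=5, pop=256,
              elite_frac=0.1, seed=0):
    """Evolve a diagonal Gaussian over the latent noise with elite refits."""
    rng = np.random.default_rng(seed)
    mu, sigma = np.zeros(shape), np.ones(shape)  # start from unit Gaussian
    n_elite = max(2, int(pop * elite_frac))
    for _ in range(n_iters):
        z = mu + sigma * rng.standard_normal((pop, *shape))
        trajs = euler_decode(velocity_fn, z)       # (pop, H, 2)
        rewards = reward_fn(trajs)                 # (pop,), higher is better
        elite_idx = np.argsort(rewards)[-n_elite:]
        mu = z[elite_idx].mean(axis=0)             # refit on elite latents
        sigma = z[elite_idx].std(axis=0) + 1e-3
    return trajs[np.argmax(rewards)], trajs

# Toy stand-ins, NOT the trained model or the real reward.
H = 60
straight = np.stack([np.linspace(0.0, 30.0, H), np.zeros(H)], axis=1)
goal = np.array([30.0, 3.0])

def toy_velocity(x, t):
    # Pulls samples toward a straight 60-step path.
    return straight - x

def toy_reward(trajs):
    # Negative distance between the last point and the goal.
    return -np.linalg.norm(trajs[:, -1] - goal, axis=1)

best, candidates = fm_elites(toy_velocity, toy_reward, shape=(H, 2))
```

In a receding-horizon setup, the first step (or a short chunk) of `best` would be executed and the whole loop would run again from the new state.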
Some results with a random obstacle
The animation below shows the simulation when there is a static obstacle in front of the agent:
There is a lot going on in this animation. First, note that the model is an unconditional model: it doesn’t use a kinematic model and it doesn’t know about the existence of the obstacle either. We are evolving the flow matching latents in order to adapt to the reward that we defined. The blue lines are all the candidates at the end of the Flow Matching Elites inner loop and the orange one is the best candidate, which was selected to be executed. This repeats on every replanning, so Flow Matching Elites in this case is the inner loop of the simulation. Note as well that at the end, when the agent gets closer to the target, it has a lot of difficulty “touching” it, mainly because the dataset naturally doesn’t contain samples with the agent colliding, so it is sampling low-acceleration trajectories with stops, which is quite nice to see arising naturally.
Now, what if the obstacle is moving?
As we can see in the example above, it works just as well as in the static case, finding trajectories that stay clear of the obstacle’s safety margin.
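One way to cover both the static and the moving case with the same reward is to index the obstacle position by time step. The sketch below combines the three terms mentioned earlier (distance to goal, obstacle avoidance and cross-track error); the weights, the safety radius and the choice of a straight reference line from the origin to the goal are assumptions for illustration, not the values used in the simulations.

```python
import numpy as np

def reward(trajs, goal, obstacle_xy, safety_radius=2.0,
           w_goal=1.0, w_obs=10.0, w_cte=0.5):
    """Score candidate trajectories; higher is better.

    trajs: (N, H, 2) ego-frame candidates.
    obstacle_xy: (H, 2) obstacle position at each step (repeat the same
    row for a static obstacle, use a predicted path for a moving one).
    """
    # Distance between the final point and the goal.
    goal_dist = np.linalg.norm(trajs[:, -1] - goal, axis=1)
    # Penetration into the safety margin, summed over the horizon.
    d_obs = np.linalg.norm(trajs - obstacle_xy[None], axis=2)
    obs_pen = np.clip(safety_radius - d_obs, 0.0, None).sum(axis=1)
    # Cross-track error w.r.t. the straight line from the origin to the goal.
    direction = goal / np.linalg.norm(goal)
    lateral = trajs[..., 0] * direction[1] - trajs[..., 1] * direction[0]
    cte = np.abs(lateral).mean(axis=1)
    return -(w_goal * goal_dist + w_obs * obs_pen + w_cte * cte)

# Two hand-made candidates: straight through, and swerving to the side.
H = 60
s = np.linspace(0.0, 1.0, H)
straight = np.stack([30.0 * s, np.zeros(H)], axis=1)
swerve = np.stack([30.0 * s, 3.0 * np.sin(np.pi * s)], axis=1)
trajs = np.stack([straight, swerve])
goal = np.array([30.0, 0.0])

static_obstacle = np.tile([15.0, 0.0], (H, 1))    # parked on the path
far_obstacle = np.tile([100.0, 100.0], (H, 1))    # never in the way
r_blocked = reward(trajs, goal, static_obstacle)
r_free = reward(trajs, goal, far_obstacle)
```

With the obstacle on the path the swerving candidate scores higher, and with the obstacle out of the way the straight one wins because of the cross-track term.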
These are just simple examples of using Flow Matching Elites for continuous control, but it is not the only task it can solve. The main interesting point here is that it iterates on the flow matching latents (noise samples) using a population, so there are many knobs to tune and to trade off performance against compute. I’m planning to open-source a framework that has a more general loop for experimenting with the algorithm. I think loops like this are very similar to what is being used for scientific discovery, and there is a lot to explore in other fields and with more techniques from evolutionary computation (EC).
Hope you enjoyed!
– Christian S. Perone
