Thank you for your work! It helps a lot.
But there's a question I'm confused about.
UniVLA/prismatic/vla/datasets/rlds/oxe/transforms.py
Lines 64 to 90 in 2b780a5
```python
def bridge_orig_dataset_transform(trajectory: Dict[str, Any]) -> Dict[str, Any]:
    """
    Applies to original version of Bridge V2 from the official project website.

    Note =>> In original Bridge V2 dataset, the first timestep has an all-zero action, so we remove it!
    """
    for key in trajectory.keys():
        if key == "traj_metadata":
            continue
        elif key == "observation":
            for key2 in trajectory[key]:
                trajectory[key][key2] = trajectory[key][key2][1:]
        else:
            trajectory[key] = trajectory[key][1:]

    trajectory["action"] = tf.concat(
        [
            trajectory["action"][:, :6],
            binarize_gripper_actions(trajectory["action"][:, -1])[:, None],
        ],
        axis=1,
    )
    # print(trajectory.keys(), trajectory['observation'].keys())
    trajectory = relabel_bridge_actions(trajectory)
    trajectory["observation"]["EEF_state"] = trajectory["observation"]["state"][:, :6]
    trajectory["observation"]["gripper_state"] = trajectory["observation"]["state"][:, -1:]

    return trajectory
```
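For intuition, here is a minimal NumPy sketch (toy numbers, not real Bridge data) of the two things this transform does to the action tensor: dropping the all-zero first timestep and binarizing the gripper channel. The simple 0.5 threshold is my assumption for illustration only; the actual `binarize_gripper_actions` in the repo handles intermediate gripper values more carefully.

```python
import numpy as np

# Toy trajectory: 4 timesteps, 7-D actions (6 movement dims + 1 gripper dim).
# The first timestep's action is all zeros, mirroring the original Bridge V2 data.
action = np.array([
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.9],
    [0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2],
    [0.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.8],
])

# Step 1: drop the all-zero first timestep.
action = action[1:]

# Step 2: binarize the gripper channel. A plain 0.5 threshold is a stand-in
# here; the real binarize_gripper_actions is more involved.
gripper = (action[:, -1] > 0.5).astype(action.dtype)
action = np.concatenate([action[:, :6], gripper[:, None]], axis=1)

print(action[:, -1])  # -> [1. 0. 1.]
```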
UniVLA/prismatic/vla/datasets/rlds/utils/data_utils.py
Lines 165 to 172 in 2b780a5
```python
# === Bridge-V2 =>> Dataset-Specific Transform ===
def relabel_bridge_actions(traj: Dict[str, Any]) -> Dict[str, Any]:
    """Relabels actions to use reached proprioceptive state; discards last timestep (no-action)."""
    movement_actions = traj["observation"]["state"][1:, :6] - traj["observation"]["state"][:-1, :6]
    traj_truncated = tf.nest.map_structure(lambda x: x[:-1], traj)
    traj_truncated["action"] = tf.concat([movement_actions, traj["action"][:-1, -1:]], axis=1)
    return traj_truncated
```
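To make the relabeling concrete, here is a NumPy re-expression of what `relabel_bridge_actions` computes, on made-up toy numbers (the real code uses TF ops over full nested trajectories). The relabeled movement action at step t is the *achieved* state delta `state[t+1] - state[t]`, which in general differs from the commanded `action[t]`, e.g. when the low-level controller does not track the command perfectly.

```python
import numpy as np

state = np.array([            # toy proprioceptive states, 6-DoF pose per row
    [0.00, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.01, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.03, 0.0, 0.0, 0.0, 0.0, 0.0],
])
action = np.array([           # recorded (commanded) actions, last column = gripper
    [0.012, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
    [0.018, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
    [0.000, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
])

# Achieved deltas between consecutive states (what the transform relabels with).
movement = state[1:, :6] - state[:-1, :6]

# Keep the recorded gripper channel; drop the last (no-action) timestep.
relabeled = np.concatenate([movement, action[:-1, -1:]], axis=1)

print(relabeled[0, 0])  # achieved 0.01, vs. commanded 0.012 in action[0, 0]
```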
I'm using data from https://rail.eecs.berkeley.edu/datasets/bridge_release/data/tfds/bridge_dataset/1.0.0/.
I'd like to ask why the difference between consecutive proprioceptive states is used here to represent the action, instead of directly using the recorded action values.
I downloaded the data directly and retrieved states 0, 1, 2 as well as actions 0, 1. When I computed the differences between consecutive states, the values don't exactly match the recorded actions.
Is my understanding incorrect, or is there another consideration at play? I'd be very grateful if you could clarify this.