- move word embedding preprocessing to the dataset class
- make word embeddings an attribute of the dataset class
- decouple from datamodule logic
- should be identical to the sentence embeddings logic
- use real word embeddings, not sentence transformers (optional, maybe for later)
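
A minimal sketch of what this refactor could look like, assuming a map-style dataset. All names here (`TextDataset`, `TextDataModule`, `WordEmbedder`, `toy_word_embedder`) are hypothetical and not taken from the codebase, and the toy embedder is only a stand-in for whatever model ends up being used. The sentence embeddings logic is not shown in these notes, so the sketch only illustrates the intended ownership: the dataset computes and stores the embeddings, and the datamodule just builds datasets.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical type (assumption): maps a list of tokens to one vector per token.
WordEmbedder = Callable[[Sequence[str]], List[List[float]]]


def toy_word_embedder(tokens: Sequence[str]) -> List[List[float]]:
    """Stand-in embedder (assumption): a deterministic 2-d vector per token."""
    return [[float(len(tok)), float(sum(map(ord, tok)) % 7)] for tok in tokens]


class TextDataset:
    """Map-style dataset that owns its word embeddings.

    In a PyTorch project this would likely subclass torch.utils.data.Dataset;
    it is kept dependency-free here so the sketch runs on its own.
    """

    def __init__(self, sentences: Sequence[str], word_embedder: WordEmbedder):
        self.sentences = list(sentences)
        # Preprocessing runs once, inside the dataset, not in the datamodule.
        self.word_embeddings = self._preprocess_word_embeddings(word_embedder)

    def _preprocess_word_embeddings(
        self, word_embedder: WordEmbedder
    ) -> List[List[List[float]]]:
        # Whitespace tokenization is an assumption made for illustration only.
        return [word_embedder(sentence.split()) for sentence in self.sentences]

    def __len__(self) -> int:
        return len(self.sentences)

    def __getitem__(self, idx: int) -> Dict[str, object]:
        return {
            "sentence": self.sentences[idx],
            "word_embeddings": self.word_embeddings[idx],
        }


class TextDataModule:
    """Datamodule that only wires datasets together and never computes embeddings."""

    def __init__(self, train_sentences: Sequence[str], word_embedder: WordEmbedder):
        self.train_dataset = TextDataset(train_sentences, word_embedder)
```

Passing the embedder in as a callable means a later switch from sentence transformers to real word embeddings would only change the argument given to the dataset, not the datamodule.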