1. computing dense representation of sparse features
2. allowing interactions between sparse features and wide (dense) features
More specifically, the model architecture changes from
dense_score = dense_ftrs -> MLP
sparse_score = sparse_ftrs -> Linear
final_score = dense_score + sparse_score
to
sparse_emb_ftrs = sparse_ftrs -> Dense(sp_emb_size)
all_ftrs = (dense_ftrs, sparse_emb_ftrs) -> Concatenate
final_score = all_ftrs -> MLP
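
The two data flows above can be sketched in plain NumPy to make the difference concrete. This is only an illustration, not the project's actual implementation: the helper names (init_dense, mlp, old_score, new_score), the layer sizes, the ReLU activation, and the multi-hot encoding of sparse features are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
num_dense, num_sparse, sp_emb_size, hidden = 4, 1000, 8, 16


def init_dense(in_dim, out_dim):
    """Return (weights, bias) for one fully connected layer."""
    return rng.normal(scale=0.1, size=(in_dim, out_dim)), np.zeros(out_dim)


def mlp(x, layers):
    """Apply fully connected layers; ReLU on all but the last (assumed)."""
    for i, (w, b) in enumerate(layers):
        x = x @ w + b
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)
    return x


# Before: dense features go through an MLP, sparse features through a
# linear layer, and the two scores are simply added.
old_mlp = [init_dense(num_dense, hidden), init_dense(hidden, 1)]
old_linear = init_dense(num_sparse, 1)


def old_score(dense_ftrs, sparse_ftrs):
    dense_score = mlp(dense_ftrs, old_mlp)
    w, b = old_linear
    sparse_score = sparse_ftrs @ w + b
    return dense_score + sparse_score


# After: sparse features are projected to sp_emb_size dimensions,
# concatenated with the dense features, and scored by a single MLP.
sparse_emb = init_dense(num_sparse, sp_emb_size)
new_mlp = [init_dense(num_dense + sp_emb_size, hidden), init_dense(hidden, 1)]


def new_score(dense_ftrs, sparse_ftrs):
    w, b = sparse_emb
    sparse_emb_ftrs = sparse_ftrs @ w + b
    all_ftrs = np.concatenate([dense_ftrs, sparse_emb_ftrs], axis=-1)
    return mlp(all_ftrs, new_mlp)


# Example batch of two rows; sparse features are multi-hot vectors here.
dense_ftrs = rng.normal(size=(2, num_dense))
sparse_ftrs = np.zeros((2, num_sparse))
sparse_ftrs[0, [3, 17]] = 1.0
sparse_ftrs[1, 42] = 1.0

print(old_score(dense_ftrs, sparse_ftrs).shape)
print(new_score(dense_ftrs, sparse_ftrs).shape)
```

In the old flow the contribution of the sparse features to the score is the same no matter what the dense features are, because the two scores are only summed. In the new flow the MLP sees both feature groups at once, so its hidden layers can model interactions between them.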