We have an asymmetry between reading and writing writetime, for both DataFrames and RDDs (if I remember correctly).
If I have a table like this: (id PK, c), and then have data:
then when I read the data with writeTime on c, I’ll get some writetime for the 1st row and null for the 2nd. The job will then fail if I write the read data back with writetime, because the writeTime for the 2nd row is null. This often happens with partitions that have a static column and no rows.
This behaviour isn’t reproduced for TTLs, though.
Maybe we should ignore writetime if it’s null, instead of erroring? Or at least make it configurable?
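
To make the proposal concrete, here is a rough sketch of the behaviour I have in mind, written as plain Python that only builds the CQL text and bind values. It is not the connector's code: `build_insert`, the `on_null` switch and the sample values are all hypothetical names made up for illustration. The only real thing in it is the CQL `USING TIMESTAMP` clause.

```python
from typing import Any, Dict, List, Optional, Tuple


def build_insert(
    table: str,
    row: Dict[str, Any],
    writetime: Optional[int],
    on_null: str = "fail",
) -> Tuple[str, List[Any]]:
    """Build an INSERT statement and its bind values for one row.

    `on_null` is a hypothetical setting, not an existing connector option:
      "fail"   - raise when writetime is None (models the current behaviour)
      "ignore" - drop the USING TIMESTAMP clause so the server assigns
                 the timestamp (models the proposed behaviour)
    """
    if on_null not in ("fail", "ignore"):
        raise ValueError(f"unknown on_null policy: {on_null!r}")

    columns = list(row)
    placeholders = ", ".join("?" for _ in columns)
    cql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    params = [row[name] for name in columns]

    if writetime is not None:
        # Normal case: replay the writetime that was read from the source row.
        return cql + " USING TIMESTAMP ?", params + [writetime]

    if on_null == "ignore":
        # Proposed case: no writetime was read, so write without one.
        return cql, params

    raise ValueError("writetime is null and on_null is 'fail'")


# Illustrative rows only: the first has a writetime, the second does not.
print(build_insert("ks.tab", {"id": 1, "c": "a"}, 1700000000000000))
print(build_insert("ks.tab", {"id": 2, "c": None}, None, on_null="ignore"))
```

With the default of `"fail"` nothing changes for existing jobs, and `"ignore"` would let a read-then-write round trip go through for rows that come back without a writetime.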