I have to export records from a few database tables into CSV format. It will be part of an existing Maven application so I decided to go with Jackson mapper. The CSV export is working correctly, but the end-user who will receive these CSVs came back with very specific requirements.
Basically I have 4 datatypes to handle:
- String (varchar in Postgres): should be exported as a string, always quoted
- Integer (int or serial in Postgres): should be exported as a number, never quoted
- BigInteger (bigint or bigserial in Postgres): again should be exported as a number, never quoted
- Timestamp (timestamp without timezone in Postgres): should be exported as a date in format YYYYMMddHHmmss, never quoted
The problem is, at least in my understanding, that during export everything above, except Integer, is converted to String. So if I enable the ALWAYS_QUOTE_STRINGS feature, then everything is quoted, including timestamp and biginteger, if I don't enable it, then nothing is quoted.
This is an example from one of my DTO classes which will be exported:
@Getter
@Setter
@JsonPropertyOrder({
"id",
"type",
"direction",
"changedate"
})
public class MyDTO {
@JsonProperty("id")
private BigInteger id;
@JsonProperty("type")
private String type;
@JsonProperty("direction")
private int direction;
@JsonProperty("changedate")
@JsonFormat(shape = JsonFormat.Shape.STRING, pattern = "YYYYMMddHHmmss")
private Timestamp changeDate;
}
My CSV mapper:
CsvMapper csvMapper = CsvMapper.builder()
.enable(CsvGenerator.Feature.ALWAYS_QUOTE_STRINGS)
.enable(CsvGenerator.Feature.ALWAYS_QUOTE_EMPTY_STRINGS)
.build();
My CSV schema:
CsvSchema csvSchema = csvMapper
.schemaFor(MyDTO.class)
.withColumnSeparator('|')
.withHeader();
Exporting this way will result in the following CSV:
"id"|"type"|"direction"|"changedate"
"1"|"type1"|0|"20240219002530"
"2"|"type2"|1|"20240219004433"
"3"|""||"20240219004747"
This is not optimal. Main problem is that the header is quoted, but also the "id" and "changedate" should not be quoted as per the requirement.
I tried to change my DTO to the following:
@Getter
@Setter
@JsonPropertyOrder({
"id",
"type",
"direction",
"changedate"
})
public class MyDTO {
@JsonProperty("id")
@JsonFormat(shape = JsonFormat.Shape.NUMBER) // added this format
private BigInteger id;
@JsonProperty("type")
private String type;
@JsonProperty("direction")
private int direction;
@JsonProperty("changedate")
// changed from STRING to NUMBER
@JsonFormat(shape = JsonFormat.Shape.NUMBER, pattern = "YYYYMMddHHmmss")
private Timestamp changeDate;
}
The column headers will still quoted, the "id" column was still quoted, the date was not quoted but instead of the pattern, it was exported as Unix timestamp:
"id"|"type"|"direction"|"changedate"
"1"|"type1"|0|1708302330000
"2"|"type2"|1|1708303473000
"3"|""||1708303667000
In the final setup, I modified my CSV mapper:
CsvMapper csvMapper = CsvMapper.builder()
//.enable(CsvGenerator.Feature.ALWAYS_QUOTE_STRINGS)
//.enable(CsvGenerator.Feature.ALWAYS_QUOTE_EMPTY_STRINGS)
.enable(CsvGenerator.Feature.STRICT_CHECK_FOR_QUOTING)
.build();
This way is the best so far, but now nothing is quoted:
id|type|direction|changedate
1|type1|0|20240219002530
2|type2|1|20240219004433
3|||20240219004747
As I see it, all of these are valid CSVs, just not matching the requirements. I will try to resist and have them accept this last version, but just asking, is it even possible what they want? This is the wish:
id|type|direction|changedate
1|"type1"|0|20240219002530
2|"type2"|1|20240219004433
3|""||20240219004747
The "type" column is a string, it is always quoted, including empty values, but nothing else is quoted.