Using Java Jackson CSV mapper to quote specific columns only

38 views Asked by At

I have to export records from a few database tables into CSV format. It will be part of an existing Maven application so I decided to go with Jackson mapper. The CSV export is working correctly, but the end-user who will receive these CSVs came back with very specific requirements.

Basically I have 4 datatypes to handle:

  • String (varchar in Postgres): should be exported as a string, always quoted
  • Integer (int or serial in Postgres): should be exported as a number, never quoted
  • BigInteger (bigint or bigserial in Postgres): again should be exported as a number, never quoted
  • Timestamp (timestamp without timezone in Postgres): should be exported as a date in format YYYYMMddHHmmss, never quoted

The problem is, at least in my understanding, that during export everything above, except Integer, is converted to String. So if I enable the ALWAYS_QUOTE_STRINGS feature, then everything is quoted, including timestamp and biginteger, if I don't enable it, then nothing is quoted.

This is an example from one of my DTO classes which will be exported:

@Getter
@Setter
@JsonPropertyOrder({
    "id",
    "type",
    "direction",
    "changedate"
})
public class MyDTO {
    @JsonProperty("id")
    private BigInteger id;

    @JsonProperty("type")
    private String type;

    @JsonProperty("direction")
    private int direction;

    @JsonProperty("changedate")
    @JsonFormat(shape = JsonFormat.Shape.STRING, pattern = "YYYYMMddHHmmss")
    private Timestamp changeDate;
}

My CSV mapper:

CsvMapper csvMapper = CsvMapper.builder()
    .enable(CsvGenerator.Feature.ALWAYS_QUOTE_STRINGS)
    .enable(CsvGenerator.Feature.ALWAYS_QUOTE_EMPTY_STRINGS)
    .build();

My CSV schema:

CsvSchema csvSchema = csvMapper
    .schemaFor(MyDTO.class)
    .withColumnSeparator('|')
    .withHeader();

Exporting this way will result in the following CSV:

"id"|"type"|"direction"|"changedate"
"1"|"type1"|0|"20240219002530"
"2"|"type2"|1|"20240219004433"
"3"|""||"20240219004747"

This is not optimal. Main problem is that the header is quoted, but also the "id" and "changedate" should not be quoted as per the requirement.

I tried to change my DTO to the following:

@Getter
@Setter
@JsonPropertyOrder({
    "id",
    "type",
    "direction",
    "changedate"
})
public class MyDTO {
    @JsonProperty("id")
    @JsonFormat(shape = JsonFormat.Shape.NUMBER) // added this format
    private BigInteger id;

    @JsonProperty("type")
    private String type;

    @JsonProperty("direction")
    private int direction;

    @JsonProperty("changedate")
    // changed from STRING to NUMBER
    @JsonFormat(shape = JsonFormat.Shape.NUMBER, pattern = "YYYYMMddHHmmss") 
    private Timestamp changeDate;
}

The column headers will still quoted, the "id" column was still quoted, the date was not quoted but instead of the pattern, it was exported as Unix timestamp:

"id"|"type"|"direction"|"changedate"
"1"|"type1"|0|1708302330000
"2"|"type2"|1|1708303473000
"3"|""||1708303667000

In the final setup, I modified my CSV mapper:

CsvMapper csvMapper = CsvMapper.builder()
    //.enable(CsvGenerator.Feature.ALWAYS_QUOTE_STRINGS)
    //.enable(CsvGenerator.Feature.ALWAYS_QUOTE_EMPTY_STRINGS)
    .enable(CsvGenerator.Feature.STRICT_CHECK_FOR_QUOTING)
    .build();

This way is the best so far, but now nothing is quoted:

id|type|direction|changedate
1|type1|0|20240219002530
2|type2|1|20240219004433
3|||20240219004747

As I see it, all of these are valid CSVs, just not matching the requirements. I will try to resist and have them accept this last version, but just asking, is it even possible what they want? This is the wish:

id|type|direction|changedate
1|"type1"|0|20240219002530
2|"type2"|1|20240219004433
3|""||20240219004747

The "type" column is a string, it is always quoted, including empty values, but nothing else is quoted.

0

There are 0 answers