How to configure Apache's Xalan-Transformer to not create empty lines?


I use a Xalan transformer to format xml files with indentation. However, when I apply the formatter to an xml file that already contains indentation, it creates lots of empty lines.

E.g. the input

<foo><bar/></foo>

is transformed to

<?xml version="1.0" encoding="UTF-8"?><foo>
    <bar/>
</foo>

(which I'd consider correct), but applying the transformer to this yields

<?xml version="1.0" encoding="UTF-8"?><foo>
        
    <bar/>
    
</foo>

and successive transformations add ever more empty lines.

I do not remember this behaviour from Saxon transformers. How can I set up my Xalan transformer so that it does not create empty lines?

To be more concrete, here is a simple test encoding the example from above, which I would like to pass with a suitably configured transformer:

import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.api.Test;

class TransformerTest {
    @Test
    void formattingShouldBeIdempotent() throws Exception
    {
      TransformerFactory factory = TransformerFactory.newDefaultInstance();
      Transformer transformer = factory.newTransformer();
      transformer.setOutputProperty(OutputKeys.INDENT, "yes");

      // Format once, then format the already formatted output again.
      String transformed1 = transform("<foo><bar/></foo>", transformer);
      String transformed2 = transform(transformed1, transformer);

      // The second pass should not change anything.
      Assertions.assertEquals(transformed1, transformed2);
    }

    private String transform(String input, Transformer transformer) throws TransformerException
    {
      Source source = new StreamSource(new StringReader(input));
      StringWriter outWriter = new StringWriter();
      StreamResult result = new StreamResult(outWriter);
      transformer.transform(source, result);
      return outWriter.getBuffer().toString();
    }
}
There is 1 answer

Michael Kay

If you have to use Xalan rather than Saxon, I would suggest preceding the serialization with a simple transformation that essentially just does <xsl:strip-space elements="*"/> and then copies the input unchanged.
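A minimal sketch of that suggestion, assuming it is acceptable to fold the whitespace stripping and the indenting serialization into a single stylesheet-driven pass rather than two separate steps. The class name `StripSpaceFormatter` and the method `format` are hypothetical names chosen for illustration; the stylesheet is a plain XSLT 1.0 identity transform with `<xsl:strip-space elements="*"/>`. It uses `TransformerFactory.newInstance()`, so it picks up whichever implementation is configured (Xalan on the classpath, or the JDK's built-in one).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class StripSpaceFormatter {

    // Identity transform that drops whitespace-only text nodes from the input
    // before the serializer adds its own indentation.
    private static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:strip-space elements=\"*\"/>"
      + "<xsl:output method=\"xml\" indent=\"yes\"/>"
      + "<xsl:template match=\"@*|node()\">"
      + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Hypothetical helper: formats an XML string with indentation.
    public static String format(String xml) throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        // Build the transformer from the stylesheet instead of using the
        // default (stylesheet-less) identity transformer.
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");

        StringWriter out = new StringWriter();
        transformer.transform(
            new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String once = format("<foo><bar/></foo>");
        String twice = format(once);
        System.out.println(once);
        System.out.println("idempotent: " + once.equals(twice));
    }
}
```

Note that `elements="*"` removes whitespace-only text nodes everywhere, which is usually what you want for data-oriented XML but may alter documents with mixed content where such whitespace is significant.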