How to normalize overflowed state in range asymmetric numeral systems?

42 views Asked by At

The encoded message contains negative integers. Also the buffer of the BitWriter doesn't change.

ChatGPT helped me to fix errors regarding operator precedence.


import java.util.Arrays;

public class RansEncoder {
    private final int[] symbolFrequencies;
    private final int totalFrequency;
    private final int[] cumulativeDistributionFunction;
    private final int numberOfQuantizationBits;
    private final int numberOfNormalizationBits;
    private final int numberOfBitsToReadAndWrite;
    private int state;
    public RansEncoder(int numberOfQuantizationBits, int numberOfNormalizationBits, int numberOfBitsToReadAndWrite, int[] symbolFrequencies) {
        this.numberOfQuantizationBits = numberOfQuantizationBits;
        this.numberOfNormalizationBits = numberOfNormalizationBits;
        this.numberOfBitsToReadAndWrite = numberOfBitsToReadAndWrite;
        this.totalFrequency = 1 << numberOfQuantizationBits;
        this.symbolFrequencies = quantizeSymbolFrequencies(symbolFrequencies);
        this.cumulativeDistributionFunction = getCumulativeDistributionFunction();
        this.state = 1 << numberOfNormalizationBits;
    }
    private int[] quantizeSymbolFrequencies(int[] symbolFrequencies) {
        final int totalFrequencyBeforeQuantization = Arrays.stream(symbolFrequencies).sum();
        return Arrays.stream(symbolFrequencies)
                .mapToDouble(Double::valueOf)
                .map(symbolFrequency -> symbolFrequency / totalFrequencyBeforeQuantization)
                .map(symbolProbability -> symbolProbability * totalFrequency)
                .mapToLong(Math::round)
                .mapToInt(Math::toIntExact).toArray();
    }

    private int[] getCumulativeDistributionFunction() {
        int[] cumulativeDistributionFunction = new int[symbolFrequencies.length + 1];
        for (int i = 0; i < symbolFrequencies.length + 1; i++) {
            cumulativeDistributionFunction[i] = Arrays.stream(symbolFrequencies).limit(i).sum();
        }
        return cumulativeDistributionFunction;
    }
    public void write(int symbol, BitWriter writer) {
        normalizeOverflowedStateByWritingBits(symbol, writer);
        encode(symbol);
    }

    private void normalizeOverflowedStateByWritingBits(int symbol, BitWriter writer) {
        while (state >= symbolFrequencies[symbol] << numberOfNormalizationBits) {
            writer.writeBits(state & ((1 << numberOfBitsToReadAndWrite) - 1), numberOfBitsToReadAndWrite);
            state >>= numberOfBitsToReadAndWrite;
        }
    }

    private void encode(int symbol) {
        int nextState = getNextState(symbol);
        state = nextState;
    }

    private int getNextState(int symbol) {
        int nextState = (Math.floorDiv(state, symbolFrequencies[symbol]) << numberOfQuantizationBits) + cumulativeDistributionFunction[symbol] + state % symbolFrequencies[symbol];
        return nextState;
    }
}

The utility classes can be found here and here. I don't think they cause the issue because they are used in a big open source project.

A potential error could be the initialization of the state or the implementation of the normalizeOverflowedStateByWritingBits method.

The test uses 8, 16, 8 for the number of quantization, normalization and read/write bits. The symbolFrequencies are {5000, 2000, 2000, 1000} and the message to be encoded is an array consisting of 4000 random integers between 0 and 4 exclusive.

I used this implementation as a reference.

0

There are 0 answers