I'm not sure I understand what input format hmmlearn expects. Below is an example modified from the tutorial.
import numpy as np
from hmmlearn import hmm
states = ['up', 'down']
start_probs = np.array([0.6, 0.4])
vocabulary = [0, 1, 2, 3]
emission_probs = np.array(
    [
        [0.25, 0.1, 0.4, 0.25],
        [0.2, 0.5, 0.1, 0.2],
    ]
)
trans_mat = np.array(
    [
        [0.8, 0.2],
        [0.2, 0.8],
    ]
)
observations = np.random.choice(vocabulary, (10, 7))
lengths = [len(item) for item in observations]
# Set up model:
model = hmm.MultinomialHMM(
    n_components=len(states),
    n_trials=len(observations[0]),
    n_iter=50,
    init_params='',
)
model.n_features = len(vocabulary)
model.startprob_ = start_probs
model.transmat_ = trans_mat
model.emissionprob_ = emission_probs
model.fit(observations, lengths)
logprob, received = model.decode(observations)
At this point I'm getting this error:
in _AbstractHMM._check_and_set_n_features(self, X)
    525 if hasattr(self, "n_features"):
    526     if self.n_features != n_features:
--> 527         raise ValueError(
    528             f"Unexpected number of dimensions, got {n_features} but "
    529             f"expected {self.n_features}")
    530 else:
    531     self.n_features = n_features
ValueError: Unexpected number of dimensions, got 7 but expected 4
I'm not sure why it expects 4, which is just the size of the vocabulary. Isn't lengths supposed to hold the length of each sequence?
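To make the shape question concrete, here is a NumPy-only sketch of the two layouts that might be intended. It rests on my reading of the hmmlearn docs, which may be wrong: that the second axis of X is always the feature axis, that CategoricalHMM wants one symbol per row with lengths giving the number of rows per sequence, and that MultinomialHMM (in recent versions) wants one count vector per row, with n_features columns that sum to n_trials. The names X_symbols and X_counts are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
vocabulary = [0, 1, 2, 3]
# 10 sequences of 7 symbols each, as in the question.
observations = rng.choice(vocabulary, size=(10, 7))

# Layout A (assumed for CategoricalHMM): one symbol per row.
# All sequences are stacked into a single column; lengths says
# how many consecutive rows belong to each sequence.
X_symbols = observations.reshape(-1, 1)
lengths_symbols = [observations.shape[1]] * observations.shape[0]

# Layout B (assumed for MultinomialHMM): one count vector per row.
# Each row counts how often every vocabulary item occurred in 7 trials,
# so the number of columns equals the vocabulary size.
X_counts = np.stack(
    [np.bincount(row, minlength=len(vocabulary)) for row in observations]
)
# Here the 10 rows are treated as a single sequence of 10 samples.
lengths_counts = [X_counts.shape[0]]

print(X_symbols.shape, sum(lengths_symbols))
print(X_counts.shape, X_counts.sum(axis=1))
```

If that reading is right, the (10, 7) array in my code is being interpreted as 10 samples with 7 features each, which would explain the mismatch with n_features = 4.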