I'm using Facebook's Duckling to parse text. When I pass the text: 13h 47m it correctly classifies the entire text as DURATION (= 13 hours 47 minutes).
However, when I pass the text: 13h 47m 13s it cannot identify the 13s part of the String as being part of the DURATION. I was expecting it to parse it as 13 hours, 47 minutes and 13 seconds but it essentially ignores the 13s part as not being part of the DURATION.
Command: curl -XPOST http://127.0.0.1:0000/parse --data 'locale=en_US&text=13h 47m 13s'
JSON Array:
[
  {
    "latent": false,
    "start": 0,
    "dim": "duration",
    "end": 7,
    "body": "13h 47m",
    "value": {
      "unit": "minute",
      "normalized": {
        "unit": "second",
        "value": 49620
      },
      "type": "value",
      "value": 827,
      "minute": 827
    }
  },
  {
    "latent": false,
    "start": 8,
    "dim": "number",
    "end": 10,
    "body": "13",
    "value": {
      "type": "value",
      "value": 13
    }
  }
]
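As a sanity check on the numbers in that output, here is a small Python sketch (plain arithmetic, not Duckling code; the helper name to_seconds is made up for illustration). It confirms that 827 minutes and 49620 seconds both correspond to 13h 47m, and shows the normalized value I would have expected if the trailing 13s had been included.

```python
def to_seconds(hours=0, minutes=0, seconds=0):
    """Convert an hours/minutes/seconds duration to a total number of seconds."""
    return hours * 3600 + minutes * 60 + seconds

# What Duckling actually returned: only "13h 47m" was treated as a duration.
parsed = to_seconds(hours=13, minutes=47)

# What I was hoping for: the full "13h 47m 13s".
expected = to_seconds(hours=13, minutes=47, seconds=13)

print(parsed // 60)  # 827, matches "value" / "minute" in the output
print(parsed)        # 49620, matches "normalized.value" in the output
print(expected)      # 49633, the normalized value with the 13s included
```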
Is this a bug? How can I update Duckling so that it parses the text as described above?
The documentation seems pretty clear about this:
Taking a look in Duckling/Duration/Rules.hs, I see:
So next I peeked in Duckling/TimeGrain/EN/Rules.hs (because Duckling/TimeGrain/Rules.hs did not exist), and see:
Presumably this means 13h 47m 13sec would parse the way you want. To make 13h 47m 13s parse in the same way, I guess the first thing I would try would be to make the regex above a bit more permissive, maybe something like s(ec(ond)?s?)?, and see if that does the trick without breaking anything else you care about.
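To get a feel for what the more permissive pattern accepts before touching Duckling itself, it can be tried in isolation. The following is a rough Python sketch, not Duckling code: the STRICT pattern is only my assumption of what a seconds rule that accepts "sec" but not "s" might look like, and Python's re module stands in for the regex engine Duckling uses. It also checks the unit token on its own, whereas Duckling matches rules inside a larger piece of text.

```python
import re

# Assumed stand-in for a stricter seconds pattern (hypothetical, for comparison only).
STRICT = re.compile(r"sec(ond)?s?", re.IGNORECASE)

# The more permissive pattern suggested above.
PERMISSIVE = re.compile(r"s(ec(ond)?s?)?", re.IGNORECASE)


def matches_whole(pattern, token):
    """Return True if the pattern matches the entire token."""
    return pattern.fullmatch(token) is not None


# Print, for each candidate unit token, whether each pattern accepts it.
for token in ["s", "sec", "secs", "second", "seconds", "se", "m"]:
    print(token, matches_whole(STRICT, token), matches_whole(PERMISSIVE, token))
```

The permissive pattern accepts a bare "s" in addition to everything the stricter one accepts, which is the point of the change. It is also why it is worth re-running Duckling's existing tests afterwards: a single-letter unit is more likely to collide with other rules.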