Facebook's Duckling Cannot Identify Time Dimension Correctly

Question

Asked by Bradford Griggs on 23 March 2022 at 17:44

I'm using Facebook's Duckling to parse text. When I pass the text "13h 47m", it correctly classifies the entire string as a duration (13 hours 47 minutes).

However, when I pass the text "13h 47m 13s", it does not recognize the "13s" part as belonging to the duration. I expected it to parse as 13 hours, 47 minutes and 13 seconds, but it returns "13h 47m" as the duration and a separate number 13, leaving the trailing "s" unmatched.

Command: curl -XPOST http://127.0.0.1:0000/parse --data 'locale=en_US&text=13h 47m 13s'
Response (JSON array):
[
  {
    "latent": false,
    "start": 0,
    "dim": "duration",
    "end": 7,
    "body": "13h 47m",
    "value": {
      "unit": "minute",
      "normalized": {
        "unit": "second",
        "value": 49620
      },
      "type": "value",
      "value": 827,
      "minute": 827
    }
  },
  {
    "latent": false,
    "start": 8,
    "dim": "number",
    "end": 10,
    "body": "13",
    "value": {
      "type": "value",
      "value": 13
    }
  }
]
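
To make the gap concrete, here is a small Haskell sketch of the arithmetic. The helper name toSeconds is made up for illustration and is not part of Duckling. It shows that the normalized value in the response covers only the hours and minutes, and what value I would expect if the seconds were included.

```haskell
-- Hypothetical helper (not a Duckling function): total seconds in h, m, s.
toSeconds :: Int -> Int -> Int -> Int
toSeconds h m s = (h * 60 + m) * 60 + s

main :: IO ()
main = do
  -- Matches the "normalized" value Duckling returned for "13h 47m".
  print (toSeconds 13 47 0)   -- 49620
  -- The value expected if "13s" were part of the duration.
  print (toSeconds 13 47 13)  -- 49633
```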

Is this a bug? How can I update Duckling so that it parses the text as described above?

There is 1 answer

**Daniel Wagner** · Accepted Answer · 2022-03-23T19:43:31+00:00

The documentation seems pretty clear about this:

To extend Duckling's support for a dimension in a given language, typically 4 files need to be updated:

Duckling/<Dimension>/<Lang>/Rules.hs

Duckling/<Dimension>/<Lang>/Corpus.hs

Duckling/Dimensions/<Lang>.hs (if not already present in Duckling/Dimensions/Common.hs)

Duckling/Rules/<Lang>.hs

Taking a look in Duckling/Duration/Rules.hs, I see:

ruleIntegerUnitofduration = Rule
  { name = "<integer> <unit-of-duration>"
  , pattern =
    [ Predicate isNatural
    , dimension TimeGrain
    ]
  -- ...

So next I peeked in Duckling/TimeGrain/EN/Rules.hs (because Duckling/TimeGrain/Rules.hs did not exist), and saw:

grains :: [(Text, String, TG.Grain)]
grains = [ ("second (grain) ", "sec(ond)?s?",      TG.Second)
         -- ...

Presumably this means "13h 47m 13sec" would parse the way you want, since the pattern requires at least "sec" and a bare "s" does not match. To make "13h 47m 13s" parse the same way, the first thing I would try is making the regex above a bit more permissive, maybe something like s(ec(ond)?s?)?, and then checking that it does the trick without breaking anything else you care about.
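
To see what the looser pattern would admit, here is a minimal sketch in plain Haskell that simply enumerates the strings each regex accepts. It does not use Duckling or a regex library. The names original and proposed are mine, and how Duckling applies the pattern to the surrounding text (case sensitivity, word boundaries) is not modeled here.

```haskell
-- Strings accepted by the existing pattern: sec(ond)?s?
original :: [String]
original = [ "sec" ++ ond ++ s | ond <- ["", "ond"], s <- ["", "s"] ]

-- Strings accepted by the suggested pattern: s(ec(ond)?s?)?
-- Either a bare "s", or "s" followed by everything the old pattern had after it.
proposed :: [String]
proposed = "s" : original

main :: IO ()
main = do
  print original                                   -- ["sec","secs","second","seconds"]
  print proposed                                   -- ["s","sec","secs","second","seconds"]
  -- The only new spelling is the bare "s" from the question.
  print (filter (`notElem` original) proposed)     -- ["s"]
```

The single-letter alternative is the one to watch: it is the only addition, and also the most likely to match where you do not want it, so it is worth running the existing tests after the change.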
