Change the Language Output in textcat


Does anyone know how to change the language codes that textcat returns as output? My real-world output looks something like this:

[Image: screenshot of the output, not available]

Reproducible example:

library(textcat)
library(tidyverse)

df <- data.frame(
  text = c("Das ist deutsch","This is english", "C'est francais")
)

df <- df |>
  mutate(
    lang_textcat = textcat(text)
  )

## iso639-1 code 
# german == de
# english == en
# french == fr

df <- df |>
  mutate(
    lang_iso = c("de","en","fr")
  )

What I get from textcat is shown in the column lang_textcat, but what I want is output like that in the column lang_iso. Is there an option to change the output to ISO 639-1? I could recode it manually, but it would be great if there were a built-in option.
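For reference, the manual recoding I have in mind would look roughly like the sketch below. It is base R only and does not call textcat. The lookup table `iso_lookup` and the example values are my own assumptions: I am assuming textcat returns lowercase language names such as "german", and the table would need one entry per language that actually appears in the data.

```r
# Hypothetical lookup table (an assumption, not part of textcat):
# language names as keys, ISO 639-1 codes as values.
iso_lookup <- c(german = "de", english = "en", french = "fr")

# Example values standing in for the lang_textcat column;
# "scots" is included to show what happens to an unmapped language.
lang_textcat <- c("german", "english", "french", "scots")

# Index the named vector by name; names missing from the table become NA.
lang_iso <- unname(iso_lookup[lang_textcat])
```

Inside the pipeline this would be `mutate(lang_iso = unname(iso_lookup[lang_textcat]))`. Languages missing from the table come back as NA, so gaps in the mapping are at least easy to spot.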

textcat package: https://cran.r-project.org/web/packages/textcat/textcat.pdf

Thanks!
