I'm using nom to create a parser for a specific markdown flavor.
This flavor also includes the basic bold, italic and strikethrough options for text inside a paragraph. I managed to get this working for non-nested text like
this is a **bold** and __italic__ text
but not for
this is a **__bold and italic__** text
as I would need to call the inline parser recursively.
This is my code. Basically, I use the parse_inline_text function in the main parser at the end to detect paragraphs.
// Imports assumed for nom 7.x (they were not shown in the original post).
use nom::{
    branch::alt,
    bytes::complete::{tag, take, take_until, take_while},
    combinator::{map, not},
    multi::{many0, many1},
    sequence::{preceded, terminated, tuple},
    IResult,
};

#[derive(Clone, Debug, PartialEq)]
pub enum InlineElement {
    Bold(String),
    Italic(String),
    StrikeThrough(String),
    Text(String),
}

// Matches `start`, everything up to `end`, then `end`; returns the text in between.
fn enclosed<'a>(start: &'a str, end: &'a str) -> impl FnMut(&'a str) -> IResult<&'a str, &str> {
    map(tuple((tag(start), take_until(end), tag(end))), |x| (x.1))
}

fn parse_text_bold(i: &str) -> IResult<&str, &str> {
    enclosed("**", "**")(i)
}

fn parse_text_italics(i: &str) -> IResult<&str, &str> {
    enclosed("__", "__")(i)
}

fn parse_text_strike_through(i: &str) -> IResult<&str, &str> {
    enclosed("~", "~")(i)
}

// Consumes one character at a time until a markup character or newline is next.
fn parse_text_plain(i: &str) -> IResult<&str, String> {
    map(
        many1(preceded(
            not(alt((
                tag("*"),
                tag("_"),
                tag("~"),
                tag("\n"),
            ))),
            take(1u8),
        )),
        |vec| vec.join(""),
    )(i)
}

fn parse_inline(i: &str) -> IResult<&str, InlineElement> {
    alt((
        map(parse_text_bold, |s: &str| {
            InlineElement::Bold(s.to_string())
        }),
        map(parse_text_italics, |s: &str| {
            InlineElement::Italic(s.to_string())
        }),
        map(parse_text_strike_through, |s: &str| {
            InlineElement::StrikeThrough(s.to_string())
        }),
        map(parse_text_plain, |s| InlineElement::Text(s.to_string())),
    ))(i)
}

pub fn line_seperator(chr: char) -> bool {
    return chr == '\n';
}

// Parses one paragraph line: inline elements followed by at least one newline.
fn parse_inline_text<'a>(
    i: &str,
) -> Result<(&str, Vec<InlineElement>), nom::Err<nom::error::Error<&str>>> {
    terminated(
        many0(parse_inline),
        tuple((tag("\n"), take_while(line_seperator))),
    )(i)
}
At the moment, it returns a Vec<InlineElement> in which every InlineElement holds a String. To make it work recursively, each variant except Text should contain a Vec<InlineElement>. How can I solve this with nom?
This is what I tried so far. Unfortunately, it panics.
#[derive(Clone, Debug, PartialEq)]
pub enum InlineElement {
    Bold(Vec<InlineElement>),
    Italic(Vec<InlineElement>),
    Complex(Vec<InlineElement>),
    Formula(Vec<InlineElement>),
    StrikeThrough(Vec<InlineElement>),
    Text(String),
}

fn enclosed<'a>(start: &'a str, end: &'a str) -> impl FnMut(&'a str) -> IResult<&'a str, Vec<InlineElement>> {
    map(tuple((tag(start), take_until(end), tag(end))), |x| parse_inline_text(x.1).unwrap().1)
}

fn parse_text_bold(i: &str) -> IResult<&str, Vec<InlineElement>> {
    enclosed("**", "**")(i)
}

fn parse_text_italics(i: &str) -> IResult<&str, Vec<InlineElement>> {
    enclosed("__", "__")(i)
}

fn parse_text_strike_through(i: &str) -> IResult<&str, Vec<InlineElement>> {
    enclosed("~", "~")(i)
}

fn parse_inline(i: &str) -> IResult<&str, InlineElement> {
    alt((
        map(parse_text_bold, |e: Vec<InlineElement>| {
            InlineElement::Bold(e)
        }),
        map(parse_text_italics, |e: Vec<InlineElement>| {
            InlineElement::Italic(e)
        }),
        map(parse_text_strike_through, |e: Vec<InlineElement>| {
            InlineElement::StrikeThrough(e)
        }),
        map(parse_text_plain, |s| InlineElement::Text(s.to_string())),
    ))(i)
}
My guess is that parse_inline_text(x.1).unwrap().1 is wrong, but I don't know how to fix the enclosed function so that it returns the result of parse_inline_text.
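The problem is that parse_inline_text expects the text inside the delimiters (for example, the asterisks) to end with a newline. The nested text has no trailing newline, so parse_inline_text returns an error and the unwrap() panics. Once the trailing newline is no longer required there, it works. A full working example was linked on the playground.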
If it is important that parse_inline_text expects a newline at the end, it is incorrect to use this function inside enclosed, unless this should be treated as invalid: **text that does not end in newline**. In that case, you should create a new function similar to parse_inline_text that doesn't expect a newline, and use that in enclosed.