We want to replace the default heading tags that Markdown generates from # with a custom HTML tag. To convert Markdown to HTML we use the Python library Markdown (Python-Markdown).
We tried registering an extension that detects H1 elements with the regular expression (#) (.*):
import markdown
from markdown.extensions import Extension
from markdown.inlinepatterns import SimpleTagPattern

class CustomHeadings(Extension):
    def extendMarkdown(self, md, md_globals):
        H1_RE = r'(#) (.*)'
        h1_tag = SimpleTagPattern(H1_RE, 'span class="h1"')
        md.inlinePatterns['h1'] = h1_tag

md_extensions = [CustomHeadings()]
# [...]

def ds_custom_markdown_parse(value):
    return markdown.markdown(value, extensions=md_extensions)
We want the h{1-6} elements to be rendered as span class="h{1-6}". However, the Markdown parser still converts the string # This is a h1 to <h1>This is a h1</h1>. We expect the output to be <span class="h1">This is a h1</span>.
Headings are block-level elements and therefore are not parsed by inlinePatterns. Prior to running the inlinePatterns, Python-Markdown runs the BlockParser, which converts all of the block-level elements of the document into an ElementTree object. Each block-level element is then passed through the inlinePatterns one at a time and the span-level elements are parsed.
For example, given your heading # This is a h1, the BlockParser has already converted it to an H tag, <h1>This is a h1</h1>, and the inlinePatterns only see the text content of that tag, This is a h1.
You have a few options for addressing this:
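1. Override the BlockProcessors which parse headings so that they create the elements you desire from the get-go.
2. Use a Treeprocessor to step through the completed ElementTree object and change the heading elements after the fact.
Option 2 should be much simpler and is, in fact, the method used by a few existing extensions.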
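
One way to change the elements after the BlockParser has built them is a Treeprocessor, which receives the completed ElementTree and can rename tags in place. The following is a minimal sketch rather than a definitive implementation. It assumes Python-Markdown 3.x, where extendMarkdown takes only md and processors are added with register(). The names HeadingToSpanTreeprocessor, HeadingToSpanExtension and 'heading_to_span' are made up for illustration, and the priority value 5 is an arbitrary choice.

```python
import re

import markdown
from markdown.extensions import Extension
from markdown.treeprocessors import Treeprocessor

# Matches the tag names h1 through h6 and captures the level digit.
HEADING_TAG_RE = re.compile(r'^h([1-6])$')


class HeadingToSpanTreeprocessor(Treeprocessor):
    """Hypothetical processor: rename h1-h6 elements to <span class="hN">."""

    def run(self, root):
        for element in root.iter():
            # Skip any node whose tag is not a plain string.
            if not isinstance(element.tag, str):
                continue
            match = HEADING_TAG_RE.match(element.tag)
            if match:
                element.tag = 'span'
                element.set('class', 'h' + match.group(1))
        # Returning None tells Python-Markdown the tree was modified in place.


class HeadingToSpanExtension(Extension):
    """Hypothetical extension that registers the processor above."""

    def extendMarkdown(self, md):
        # Assumption: Python-Markdown 3.x registry API; priority 5 is arbitrary.
        md.treeprocessors.register(
            HeadingToSpanTreeprocessor(md), 'heading_to_span', 5
        )


def ds_custom_markdown_parse(value):
    return markdown.markdown(value, extensions=[HeadingToSpanExtension()])
```

With this in place, ds_custom_markdown_parse('# This is a h1') should return <span class="h1">This is a h1</span>. Because the processor works on the finished tree, it does not need its own heading regex and it also covers headings written in the underlined (Setext) style.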
Full disclosure: I am the lead developer of the Python-Markdown project.