Not planned
Description
Initial checklist
- I read the support docs
- I read the contributing guide
- I agree to follow the code of conduct
- I searched issues and discussions and couldn’t find anything (or linked relevant results below)
Affected package
hast-util-raw
Steps to reproduce
Here is a markdown input containing simple nested HTML elements on one line:
const md = `<a href="https://example.com"><figure><img src="image.png" alt=""></figure></a>`;
const unified = require('unified');
const remarkParse = require('remark-parse');
const remarkRehype = require('remark-rehype');
const rehypeRaw = require('rehype-raw');
const rehypeStringify = require('rehype-stringify');
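const html = unified()
  .use(remarkParse)
  .use(remarkRehype, { allowDangerousHtml: true }) // keep raw HTML nodes
  .use(rehypeRaw) // re-parse the raw HTML into the tree
  .use(rehypeStringify)
  .processSync(md);
// processSync returns a file object, so stringify it to print the HTML
console.log(String(html));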
Actual behavior
If the HTML elements are on one line, it produces weird anchor behavior and an empty paragraph at the end:
<p><a href="https://example.com"></a></p><figure><a href="https://example.com"><img src="image.png" alt=""></a></figure><p></p>
But if the input is like:
const md = `<a href="https://example.com">
<figure><img src="image.png" alt=""></figure>
</a>`;
it behaves normally and produces the expected result.
Expected behavior
I expect it to handle this kind of simple nested HTML input on one line, with the result being:
<p><a href="https://example.com"><figure><img src="image.png" alt=""></figure></a></p>
I mean hast-util-raw should handle nested HTML elements even if they are on one line.
Runtime
node
Package manager
npm
Operating system
macos
Build and bundle tools
No response
Activity
wooorm commented on Apr 18, 2025
Hi! This has to do with markdown. Not with HTML, or this package. Here’s your input pasted on the CommonMark dingus: https://spec.commonmark.org/dingus/?text=%3Ca%20href%3D%22https%3A%2F%2Fexample.com%22%3E%3Cfigure%3E%3Cimg%20src%3D%22image.png%22%20alt%3D%22%22%3E%3C%2Ffigure%3E%3C%2Fa%3E%0A%0A%3Ca%20href%3D%22https%3A%2F%2Fexample.com%22%3E%0A%3Cfigure%3E%3Cimg%20src%3D%22image.png%22%20alt%3D%22%22%3E%3C%2Ffigure%3E%0A%3C%2Fa%3E. You should be able to see the same here on GH too.
talatkuyuk commented on Apr 18, 2025
When I go to the dingus, I see the input pasted; then, when I click the HTML tab in the right panel, I see that the HTML result is in one line:
It is the expected output (not the weird one). Am I doing something wrong, or am I right about the issue?
wooorm commented on Apr 19, 2025
It’s about the <p>s being added. There is a difference between the two test cases.
wooorm commented on Apr 19, 2025
Perhaps I am unclear what you mean. Can you maybe make your input/actual/expected examples smaller?
talatkuyuk commented on Apr 19, 2025
It is not about whether <p> is added. It is about the behavior of the anchor <a> when it is the outer element. Here are two input/actual/expected examples in the simplest form. Consider a setup like the one below:
input markdown (in one line, outer <a>, inner <figure>):
actual output (weird: an empty anchor within <p> at the beginning, and an empty paragraph at the end):
expected output (I saw the expected output in the dingus HTML tab, and hast-util-raw should ensure that result!):
On the other hand, I changed the order of the nested elements to see the behavior:
input markdown (in one line, outer <figure>, inner <a>):
actual output (not weird; it is as expected, no problem!):
I stress that both inputs are in one line; only the order of nesting changed.
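One way to see why line layout and nesting order matter at the markdown level is to check whether the first line of each input starts a CommonMark HTML block. If it does, the raw HTML is passed through as a block; if it does not, the line becomes a paragraph and the tags are inline HTML inside a <p>. The sketch below is a rough approximation of two of the spec's start conditions (6 and 7), not the actual parser: the function name `htmlBlockStart` is hypothetical, `BLOCK_TAGS` is only a subset of the spec's list, leading indentation is ignored, and the figure-first input is an assumed example of the swapped nesting order.

```javascript
// Simplified sketch of CommonMark HTML block start conditions 6 and 7.
// Assumptions: BLOCK_TAGS is a partial list, indentation is ignored,
// and the tag regexes only approximate the spec's grammar.
const BLOCK_TAGS = [
  'address', 'article', 'aside', 'blockquote', 'details', 'div', 'dl',
  'fieldset', 'figcaption', 'figure', 'footer', 'form', 'h1', 'h2', 'h3',
  'h4', 'h5', 'h6', 'header', 'hr', 'li', 'main', 'nav', 'ol', 'p',
  'section', 'table', 'ul'
];
// Tags handled by condition 1, which condition 7 excludes.
const RAW_TEXT_TAGS = ['pre', 'script', 'style', 'textarea'];

// Condition 6: the line begins with "<" or "</" plus a block-level tag name.
const blockTagStart = new RegExp(
  `^</?(?:${BLOCK_TAGS.join('|')})(?:[ \\t]|/?>|$)`,
  'i'
);
// Condition 7: the line is one complete open or closing tag and nothing else.
const openTagAlone =
  /^<[A-Za-z][A-Za-z0-9-]*(?:\s+[A-Za-z_:][\w.:-]*(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s"'=<>`]+))?)*\s*\/?>[ \t]*$/;
const closeTagAlone = /^<\/[A-Za-z][A-Za-z0-9-]*\s*>[ \t]*$/;

// Returns 6 or 7 when the line starts an HTML block, otherwise null.
function htmlBlockStart(line) {
  if (blockTagStart.test(line)) return 6;
  const tag = /^<\/?([A-Za-z][A-Za-z0-9-]*)/.exec(line);
  const excluded = tag && RAW_TEXT_TAGS.includes(tag[1].toLowerCase());
  if (tag && !excluded && (openTagAlone.test(line) || closeTagAlone.test(line))) {
    return 7;
  }
  return null;
}

const anchorFirst =
  '<a href="https://example.com"><figure><img src="image.png" alt=""></figure></a>';
const figureFirst =
  '<figure><a href="https://example.com"><img src="image.png" alt=""></a></figure>';

console.log(htmlBlockStart(anchorFirst)); // null: becomes a paragraph
console.log(htmlBlockStart('<a href="https://example.com">')); // 7
console.log(htmlBlockStart(figureFirst)); // 6
```

Under these assumptions, only the single-line input that starts with <a> ends up wrapped in a <p>, which is the case that later trips the HTML parser.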
wooorm commented on Apr 21, 2025
Thanks for providing more info! I made your example smaller:
Yields:
Perhaps this smaller example will make it more visible: the “problem” is that there is a <figure> inside a <p>. That cannot be. When <figure> is seen, the a is first closed, and the p is closed. Then the figure is opened, closed, the stray </a> is ignored, and the stray </p> first causes it to be opened and then immediately closed.
You can see the same behavior in a browser by pasting this in an empty new tab: document.body.innerHTML = '<p><a><figure></figure></a></p>'. And then inspecting the DOM.
talatkuyuk commented on May 4, 2025
Thank you @wooorm; you are right.
<figure> inside <a> in <p>:
<p><a><figure><img></figure></a></p>
<figure> inside just <a>:
<a><figure><img></figure></a>
The issue is related to the default HTML parsing behavior. Thanks again. 🙏
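The sequence of steps wooorm describes above can be traced with a toy tree builder. This is only a sketch under strong assumptions: `toyParse` and `serializeChildren` are hypothetical helpers, text nodes are ignored, and just three rules are modeled (a block-level start tag closes an open <p>, a formatting element such as <a> that was closed implicitly is reopened before the next inline content, and stray end tags are ignored except </p>, which yields an empty paragraph). It is not hast-util-raw's or any real parser's implementation; a browser or an HTML parsing library is the authority on the actual behavior.

```javascript
// Toy model of three HTML tree-construction rules (not a real parser).
const VOID = new Set(['img', 'br', 'hr']);
const FORMATTING = new Set(['a', 'b', 'i', 'em', 'strong']);
const CLOSES_P = new Set(['figure', 'div', 'p', 'ul', 'section']);

function toyParse(html) {
  const root = { name: '#root', attrs: '', children: [] };
  const open = [root]; // stack of open elements
  let formatting = []; // formatting elements that may need reopening

  const inStack = (name) => open.some((node) => node.name === name);
  const insert = (name, attrs) => {
    const node = { name, attrs, children: [] };
    open[open.length - 1].children.push(node);
    open.push(node);
    return node;
  };
  // Pop elements until one with the given name has been popped.
  const popUntil = (name) => {
    while (open.pop().name !== name);
  };
  // Reopen formatting elements that are no longer on the stack.
  const reconstruct = () => {
    formatting = formatting.map((entry) =>
      open.includes(entry) ? entry : insert(entry.name, entry.attrs)
    );
  };

  const tags = html.matchAll(/<(\/?)([a-z][a-z0-9]*)([^>]*)>/g);
  for (const [, slash, name, attrs] of tags) {
    if (slash) {
      if (FORMATTING.has(name)) {
        formatting = formatting.filter((entry) => entry.name !== name);
      }
      if (inStack(name)) {
        popUntil(name);
      } else if (name === 'p') {
        insert('p', ''); // a stray </p> opens a paragraph...
        open.pop(); // ...and closes it immediately
      } // any other stray end tag is ignored
    } else {
      if (CLOSES_P.has(name)) {
        if (inStack('p')) popUntil('p'); // also closes the <a> inside it
      } else {
        reconstruct();
      }
      const node = insert(name, attrs);
      if (FORMATTING.has(name)) formatting.push(node);
      if (VOID.has(name)) open.pop();
    }
  }
  return root;
}

function serializeChildren(node) {
  return node.children
    .map((child) =>
      `<${child.name}${child.attrs}>` +
      (VOID.has(child.name) ? '' : `${serializeChildren(child)}</${child.name}>`)
    )
    .join('');
}

const inner =
  '<a href="https://example.com"><figure><img src="image.png" alt=""></figure></a>';
const inParagraph = serializeChildren(toyParse(`<p>${inner}</p>`));
const standalone = serializeChildren(toyParse(inner));
console.log(inParagraph);
console.log(standalone);
```

In this toy model, the input wrapped in <p> serializes to the same string reported under "Actual behavior", while the standalone input round-trips unchanged, which matches the two cases listed above.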