Regex or string replace to add <p> html tag

Question

bprosic 6 Light Poster

4 Years Ago

Hi,
how can I use a regex or string replace to add the missing "p" tags to lines of text that have no tags?
I tried matching* the "h" and "pre" tags and splitting the whole string on them first, but I don't know how to merge the pieces back together.

*let regexRule = /<pre>(.|\n|\r\n)[\s\S]*?<\/pre>/g;

Example - input

let someVariable = "Basket
<h1>Fruits</h1>
<pre>Apple
Juice</pre>
<pre>Kiwi</pre>
PVC thing
Trash
<h1>...</h1>";

How can I add a "p" tag around Basket, PVC thing and Trash so that the output would be:

someVariable = "<p>Basket</p>
<h1>Fruits</h1>
<pre>Apple
Juice</pre>
<pre>Kiwi</pre>
<p>PVC thing</p>
<p>Trash</p>
<h1>...</h1>";

Thank you

html-css javascript regex


All 5 Replies

Dani 4,675 The Queen of DaniWeb

4 Years Ago

I'm not too good with regex, but if you're using jQuery, there's a built-in utility: you pass it an HTML string, and it converts that string to valid DOM nodes that you can then do what you want with. I hope this helps with whatever it is you're trying to do.

bprosic commented: Thanks for the reply. Instead of regex, how can I achieve my goal using cheerio, node-html-parser, or jQuery? +2


bprosic 6 Light Poster · Answer 1 · 2020-08-18T09:40:49+00:00

I used cheerio in Node.js:

const cheerio = require('cheerio');

let someVariable = "Basket<h1>Fruits</h1><pre>AppleJuice</pre><pre>Kiwi</pre>PVC thing Trash<h1>...</h1>";
let addBodyTag = cheerio.load(someVariable).html(); // first load adds the html/body tags
let $ = cheerio.load(addBodyTag, {
    normalizeWhitespace: true,
    xmlMode: true
}); // load one more time, this time without adding html/body tags

$('body').contents().each(function(i, el) {
    // cheerio marks text nodes with type 'text' (in jQuery the condition is nodeType === 3)
    if (el.type === 'text') {
        // untagged text found, so wrap it in a <p> tag
        $(this).wrap("<p>");
    }
});
let returnResult = $.html();
// This one-liner does not work, because the second argument of substr is a length, not an end index:
// returnResult = returnResult.substr(returnResult.indexOf("<body>") + 6, returnResult.length - 14);

// drop everything up to and including <body>
returnResult = returnResult.substr(returnResult.indexOf("<body>") + 6, returnResult.length);
// drop the trailing </body></html> (14 characters)
returnResult = returnResult.substr(0, returnResult.length - 14);
console.log(returnResult); // got: <p>Basket</p><h1>Fruits</h1><pre>AppleJuice</pre><pre>Kiwi</pre><p>PVC thing Trash</p><h1>...</h1>
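
The commented-out one-liner above fails because substr takes a start index and a length, while the intent was a start index and an end index. A minimal sketch of the same trimming done in one step with slice follows. The `serialized` string is only a stand-in for what `$.html()` might return, and `bodyContents` is a hypothetical helper name, not part of cheerio.

```javascript
// Stand-in for the serialized document; the exact markup cheerio emits may differ.
const serialized = '<html><head/><body><p>Basket</p><h1>Fruits</h1></body></html>';

// Hypothetical helper: return whatever sits between <body> and </body>.
function bodyContents(html) {
  const open = html.indexOf('<body>');
  const close = html.lastIndexOf('</body>');
  if (open === -1 || close === -1) {
    return html; // no body wrapper found, return the input unchanged
  }
  // slice(start, end) takes two indexes, unlike substr(start, length)
  return html.slice(open + '<body>'.length, close);
}

console.log(bodyContents(serialized)); // <p>Basket</p><h1>Fruits</h1>
```

Searching for the closing tag instead of subtracting a fixed 14 characters also keeps working if the serializer adds trailing whitespace. Calling `.html()` on the body selection, as in `$('body').html()`, may avoid the string trimming altogether.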

Dani 4,675 The Queen of DaniWeb Administrator Featured Poster Premium Member · Answer 2 · 2020-08-18T17:40:06+00:00

So what I was suggesting was something like this:

// Random valid or invalid HTML string (in this case, missing a closing </p>)
var html = '<strong><p>Foo</strong>';

// Create a jQuery DOM element whose contents are the HTML string
// The DOM element will manipulate the invalid HTML and make it valid
var element = $(html);

// Inject the contents of the DOM element into a jQuery selector
// The element with ID 'selector' will now contain Foo in a bold paragraph with valid HTML
$('#selector').html(element);

score 2 · Answer 3 · 2020-08-18T19:57:36+00:00

I don't think a regex will do what you want. I don't see how you can define a pattern, based on your example, that could distinguish the things you want to <p></p>ify from the things you do not. How would you determine that you want to modify Basket, PVC thing, and Trash, but not let or someVariable? Also, removing the word let is something else entirely.

I had a problem I wanted to solve with a regex... now I have two problems.
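
For the narrow input shown in the question, a split-and-merge sketch may still be workable. It assumes that the only tags in the string are pre and h1 to h6, that they are never nested, and that each loose line should become its own paragraph. `wrapLooseLines` and `blockRule` are hypothetical names, and this is not a general HTML parser.

```javascript
// Hypothetical helper: wrap each loose line of text in <p>...</p> and
// leave <pre> and <h1>-<h6> blocks exactly as they are.
function wrapLooseLines(html) {
  // One capturing group around the whole pattern makes split() keep the
  // matched blocks: even indexes are loose text, odd indexes are blocks.
  const blockRule = /(<pre\b[^>]*>[\s\S]*?<\/pre>|<h[1-6]\b[^>]*>[\s\S]*?<\/h[1-6]>)/gi;
  const merged = [];
  html.split(blockRule).forEach((part, index) => {
    if (index % 2 === 1) {
      merged.push(part); // tagged block, keep untouched
      return;
    }
    // Loose text: every non-empty line becomes its own paragraph.
    for (const line of part.split(/\r?\n/)) {
      const text = line.trim();
      if (text !== '') merged.push(`<p>${text}</p>`);
    }
  });
  return merged.join('\n');
}

// The input from the question, written as an array of lines.
const input = [
  'Basket',
  '<h1>Fruits</h1>',
  '<pre>Apple',
  'Juice</pre>',
  '<pre>Kiwi</pre>',
  'PVC thing',
  'Trash',
  '<h1>...</h1>'
].join('\n');

console.log(wrapLooseLines(input));
```

Any other tag, such as an inline b or a sitting on a loose line, would end up inside a p along with the rest of that line, which is one reason a DOM-based approach is usually the safer choice.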

Diafol_2 0 Newbie Poster · Answer 4 · 2020-08-29T16:02:33+00:00

Is the HTML in a string or part of the DOM? If a string, I think the best method would be to change it to nodes via .innerHTML on a temporary parent tag (like a div). Then you extract the Nodes through .firstChild.

var tmpDiv = document.createElement('div');
tmpDiv.innerHTML= '<div><a href="#"></a><span></span></div>';
var HTML = tmpDiv.firstChild;

nodeType === 3 identifies a Text node (reference: Mozilla).

Sorry, just remembered about jQuery .contents() - I really think this is all you need.
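
The temporary-parent idea and the nodeType check could be combined roughly as follows. This is only a sketch: `wrapLooseText` is a hypothetical helper name, it handles direct children of the container only, and it wraps each whole text node, so two loose lines that sit in one text node end up in a single paragraph.

```javascript
// Hypothetical helper: wrap every non-empty text node that is a direct
// child of `container` in a <p> element created from `doc`.
function wrapLooseText(container, doc) {
  const TEXT_NODE = 3; // same value as Node.TEXT_NODE
  // Snapshot the child list first, because replaceChild changes it.
  for (const node of Array.from(container.childNodes)) {
    if (node.nodeType !== TEXT_NODE) continue;
    if (node.textContent.trim() === '') continue; // skip whitespace between tags
    const p = doc.createElement('p');
    container.replaceChild(p, node); // put the <p> where the text node was
    p.appendChild(node);             // then move the text node into it
  }
  return container;
}

// Browser usage (needs a real DOM, so it is left as a comment here):
//   var tmpDiv = document.createElement('div');
//   tmpDiv.innerHTML = 'Basket<h1>Fruits</h1>PVC thing';
//   wrapLooseText(tmpDiv, document);
//   console.log(tmpDiv.innerHTML);
```

Passing the document in as a parameter is a design choice made only so the function does not depend on a global and can be exercised outside a browser.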
