Manipulating a string in PHP

Question

Gurjit_2 0 Newbie Poster

6 Years Ago

Hello everyone. I have a string that looks like this

(Mango, fruits, and), (Maize, cereals, and), (Mango juice, beverages, and)

I would like to convert the above string using php to something similar to this:

(Mango[fruits]) AND (Maize[cereals]) AND (Mango juice[beverages])

How can I achieve this in PHP? I have looked at the PHP string replace and substr functions but cannot figure out how to use them to solve my problem. Thank you in advance.

php

Edited 6 Years Ago by Gurjit_2 because: correcting a typo

JamesCherrill 4,733 Most Valuable Poster

6 Years Ago

Are you missing a ] in the converted version, or is that deliberate?

JamesCherrill 4,733 Most Valuable Poster

6 Years Ago

You could use a strategy like
Remove all the blanks. Remove the first and last paren.
Split on ),( to give you each block of 3 words
For each block of 3 words: split on comma
Now you have just the words, grouped in threes, so it’s easy to concatenate them in the right order with the desired brackets etc
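
A minimal PHP sketch of the steps above. It departs from them in one respect: instead of removing every blank it trims each piece, because deleting all spaces would also turn "Mango juice" into "Mangojuice". The function name `rebuildBySplitting` is hypothetical, and dropping the operator of the final block is an assumption taken from the desired output in the question.

```php
<?php
// Hypothetical helper; the name does not come from the thread.
function rebuildBySplitting(string $input): string
{
    // Remove the first and last paren (and any surrounding whitespace).
    $inner = trim($input, " \t\n\r()");

    // Split into blocks of three words. Splitting on "),", then trimming
    // " (" from each block, tolerates a space after the comma.
    $blocks = explode('),', $inner);

    $output = '';
    $lastIndex = count($blocks) - 1;
    foreach ($blocks as $i => $block) {
        // Split the block on commas and trim each word individually.
        $words = array_map('trim', explode(',', trim($block, ' ()')));
        [$name, $category, $operator] = $words;

        $output .= '(' . $name . '[' . $category . '])';

        // Assumption: the operator of the final block is redundant and dropped.
        if ($i < $lastIndex) {
            $output .= ' ' . strtoupper($operator) . ' ';
        }
    }
    return $output;
}

echo rebuildBySplitting('(Mango, fruits, and), (Maize, cereals, and), (Mango juice, beverages, and)');
// (Mango[fruits]) AND (Maize[cereals]) AND (Mango juice[beverages])
```

String concatenation is used rather than `"($name[$category])"` because PHP would read `$name[$category]` inside a double-quoted string as an array access.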

Alternatively there’s bound to be a single regex that does it, if you like long incomprehensible undebuggable strings of bizarre character sequences.

Edited 6 Years Ago by JamesCherrill

alan.davies 185 What's this?

6 Years Ago

This may be difficult to predict. Your 'and' in the original string - could this be anything else e.g. 'or'? Will there always be a space after the comma outside the brackets? If strings always have the exact same pattern, then a simple function or series of simple functions could do it. However if there are variations, then regex (preg functions) will need to be used.

//Edit

On second thoughts, I don't think you need regex.

Edited 6 Years Ago by alan.davies because: Rethink

JamesCherrill 4,733 Most Valuable Poster

6 Years Ago

You need to loop through that array processing one line at a time. Each line can be exploded into the 3 words so you can put them all back together in the desired order with the desired punctuation.

Edited 6 Years Ago by JamesCherrill

pty 882 Posting Pro

6 Years Ago

I quite enjoy puzzles like this. Of course, I suspect that your problem will get much harder once you start adding stuff like (Banana, fruits, or) to the equation, as then you'll need to worry about brackets and precedence.

However, for this simple version here's how I tackled it in Ruby. You can pretty much translate this to PHP but it won't be as succinct or elegant.

puts "(Mango, fruits, and), (Maize, cereals, and), (Mango juice, beverages, and)"
  .scan(/\((.*?)\)/)                                                    # grab all the text that appears inside brackets
  .flatten                                                              # scan yields an array of arrays, flatten it
  .map{|chunk| chunk.split(",")}                                        # split each trio into a word array
  .map{|fruit, category, operation| "(#{fruit}[#{category.strip}])" }   # build the keywords into strings in the desired format
  .join(" AND ")                                                        # join the built strings with 'AND'

(Mango[fruits]) AND (Maize[cereals]) AND (Mango juice[beverages])
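
For comparison, one possible PHP translation of the Ruby chain above, offered as a sketch and not as the poster's own code. `preg_match_all` stands in for `scan` plus `flatten`, `array_map` for the two `map` calls, and `implode` for `join`. Like the Ruby version, it assumes every operator is "and" and ignores the third word of each trio.

```php
<?php
$input = '(Mango, fruits, and), (Maize, cereals, and), (Mango juice, beverages, and)';

// Grab all the text that appears inside brackets (Ruby's scan + flatten).
preg_match_all('/\((.*?)\)/', $input, $matches);

// Turn each trio into "(name[category])" (Ruby's two map calls).
$parts = array_map(function (string $chunk): string {
    [$fruit, $category] = array_map('trim', explode(',', $chunk));
    return '(' . $fruit . '[' . $category . '])';
}, $matches[1]);

// Join the built strings with 'AND'; the operator word is never read.
$result = implode(' AND ', $parts);

echo $result;
// (Mango[fruits]) AND (Maize[cereals]) AND (Mango juice[beverages])
```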

Edited 6 Years Ago by pty

Gurjit_2 0 Newbie Poster · Answer 1 · 2018-07-19T07:55:48+00:00

It was a typo, not deliberate. I have corrected it.

Edited 6 Years Ago by Gurjit_2

Gurjit_2 0 Newbie Poster · Answer 2 · 2018-07-19T09:37:52+00:00

@alan.davies Yes. My 'and' could be anything else, e.g. 'or'. Yes, there will always be a space after the bracket, and the strings will always have the exact same pattern.

I have tried to solve the problem as suggested by @JamesCherrill above and here is what I have come up with:

$var1 = str_replace(' ', '', $mystring);
$var2 = trim($var1, '()');
$var3 = explode('),(', $var2);

I get stuck here. I cannot figure out how to manipulate the data when it is in groups of 3. If I run print_r on $var3 I get an array that looks like so:

Array
(
    [0] => Mango, fruits, and
    [1] => Maize, cereals, and
    [2] => Mango juice, beverages, and
)

How do I loop over this array to get the output below?

 (Mango[fruits]) AND (Maize[cereals]) AND (Mango juice[beverages])

Sorry if the solution to this is obvious, but somehow I cannot get my head around it.

Gurjit_2 0 Newbie Poster · Answer 3 · 2018-07-19T11:32:25+00:00

Thank you all for taking the time to try and solve this problem. Based on suggestions by @JamesCherrill, this is how I solved the problem:

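$var1 = str_replace(' ', '', $mystring);   // remove all blanks
$var2 = trim($var1, '()');                 // remove the first and last paren
$var3 = explode('),(', $var2);             // split into blocks of 3 words
foreach ($var3 as $var4) {
    $var5 = explode(',', $var4);           // split each block on commas
    echo '(' . $var5[0] . '[' . $var5[1] . ']' . ')' . "&nbsp" . $var5[2] . "&nbsp";
}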

This returns a string that looks like so:

(Mango[fruits]) AND (Maize[cereals]) AND (Mango juice[beverages])

alan.davies 185 What's this? · Answer 4 · 2018-07-19T12:17:05+00:00

Well done. However, it doesn't solve your hangover, that is, the additional AND at the end of the string.

JamesCherrill 4,733 Most Valuable Poster Team Colleague Featured Poster · Answer 5 · 2018-07-19T12:35:08+00:00

the additional AND at the end of the string.

Yes, I was curious about that as well. The input string has n boolean ops but the output string has n-1. How does that work? (I mean "what's the logic?" not "how do you code it?".)

Gurjit_2 0 Newbie Poster · Answer 6 · 2018-07-19T13:16:05+00:00

True observation @alan.davies. I welcome any suggestions on how to go about getting rid of the last boolean.

JamesCherrill 4,733 Most Valuable Poster Team Colleague Featured Poster · Answer 7 · 2018-07-19T13:19:54+00:00

Just to be sure... is it the last boolean you want to drop or is it the first boolean?

Gurjit_2 0 Newbie Poster · Answer 8 · 2018-07-19T13:31:11+00:00

It is the last boolean

JamesCherrill 4,733 Most Valuable Poster Team Colleague Featured Poster · Answer 9 · 2018-07-19T13:53:34+00:00

To drop the last boolean you could use strrpos to find the position of the last space (ie the character before the last boolean) and substr to extract all the string up to that last space.
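
A short sketch of that suggestion. The sample string is an assumed intermediate result with plain spaces and no trailing whitespace; if the string ends with a space (or with "&nbsp", as in the loop posted earlier), it would need cleaning up first, for example with rtrim.

```php
<?php
// Assumed intermediate result with a trailing operator.
$withTrailing = '(Mango[fruits]) AND (Maize[cereals]) AND (Mango juice[beverages]) AND';

// Position of the last space, i.e. the character before the last boolean.
$lastSpace = strrpos($withTrailing, ' ');

// Keep everything up to (not including) that space; leave the string
// untouched if it contains no space at all.
$trimmed = $lastSpace === false
    ? $withTrailing
    : substr($withTrailing, 0, $lastSpace);

echo $trimmed;
// (Mango[fruits]) AND (Maize[cereals]) AND (Mango juice[beverages])
```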

pty 882 Posting Pro · Answer 10 · 2018-07-19T14:43:41+00:00

Like I said in my post, this is deceptively easy. We're assuming they're all and, and under that assumption you may as well just join.

Once you start adding or into the query, the whole thing becomes much more difficult.

JamesCherrill 4,733 Most Valuable Poster Team Colleague Featured Poster · Answer 11 · 2018-07-19T15:32:18+00:00

We're assuming they're all and

Not at all. The algorithm he has used copies whatever is in the boolean position in the input: and, or, xor(?) etc (with maybe a conversion to upper case)

alan.davies 185 What's this? · Answer 12 · 2018-07-19T15:32:56+00:00

I'd do this:

function convertString( $string )
{
    $dirty = preg_replace_callback('/\(([a-z A-Z0-9]+), (\w+), (\w+)\),*/', function ($m) {
        return $m[1] . '[' . $m[2] . '] ' . strtoupper($m[3]);
    }, $string);
    return substr($dirty, 0, strrpos($dirty, ' '));
}

$str = '(Mango, fruits, and), (Maize, cereals, and), (Mango juice, beverages, and)';
echo convertString($str);

//Mango[fruits] AND Maize[cereals] AND Mango juice[beverages]

After a bit of thinking, I thought it would be easier to go regex (which I hate btw - coz I don't really understand it well enough). The logic for getting rid of the last boolean is simply search for the last space from the end of the string and truncate the whole string to that position. Not very elegant but works.

However - preg_* functions are notoriously slow - so even a one-liner can be slower than 4 or 5 "regular" string functions. Do some tests if you think this may be an issue.

The callback function is preferable to /e in preg (the /e modifier is dangerous; it was deprecated in PHP 5.5 and removed in PHP 7.0). Notice I used an anonymous function in the callback - this is just a preference - you could create a separate function.

BTW - not sure if you noticed that "Mango juice" is 2 words - should this be allowed if you are using what seems to be an array? Anyhow, I allowed for this with the expanded regex of [a-z A-Z0-9]+ instead of just \w+.
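
To illustrate the difference between the two character classes, here is a small check on a single block. The variable names are made up for the example; the patterns are the name part of the regex above with either \w+ or [a-z A-Z0-9]+ in the first group.

```php
<?php
$block = '(Mango juice, beverages, and)';

// \w+ cannot cross the space inside "Mango juice", so nothing matches.
$wordOnly = preg_match('/\((\w+), (\w+), (\w+)\)/', $block);

// The expanded class also accepts spaces, so the two-word name matches.
$withSpaces = preg_match('/\(([a-z A-Z0-9]+), (\w+), (\w+)\)/', $block, $m);

var_dump($wordOnly);   // int(0)
var_dump($withSpaces); // int(1)
echo $m[1];            // Mango juice
```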

//EDIT: Heh heh just read James' post on strrpos! Yep, I agree - I think the most convenient way

pty 882 Posting Pro · Answer 13 · 2018-07-20T07:55:55+00:00

Not at all. The algorithm he has used copies whatever is in the boolean position in the input: and, or, xor(?) etc (with maybe a conversion to upper case

Yes I see that but once you start adding or statements to a query you have to take precedence and additional brackets into account. It can get very difficult very fast.

Source, I wrote a query builder a few years ago, thought it'd take a couple of days but ended up being more than a week. Should have just taught the users SQL!

alan.davies 185 What's this? · Answer 14 · 2018-07-20T08:11:07+00:00

Agree with pty on the complexity. An SQL parser, if that's what it is, can be ridiculously complicated. You only have to look at 'eloquent' packages in laravel. Even that gives up the ghost after a while and says, 'sod it, this is too complicated, just type in your raw sql'. Anyhow, it's difficult to see how these 3-term items work. As mentioned by somebody earlier, shouldn't the first boolean be dropped instead of the last one? Nesting or setting precedence is another rabbit hole.

JamesCherrill 4,733 Most Valuable Poster Team Colleague Featured Poster · Answer 15 · 2018-07-20T08:56:59+00:00

Aren't we having fun trying to guess what the real scope and spec of the OP's problem really is!
Anyway, my 2p's worth:
I don't worry about OR. I'm happy to guess that both the input and output formats follow the usual precedence rule (AND higher than OR), so simply copying is OK.
I would be very worried about the possibility of extra bracketing. There isn't any visible or implied in the OP's posts, but if it were possible then the current solution is a non-starter and he will need a proper parser.
I too have no idea why the input format seems to have a redundant boolean operator, and it worries me. Why would someone design a syntax like that? It's one of those loose ends that when pulled can unravel the whole thing.

Gurjit_2 0 Newbie Poster · Answer 16 · 2018-07-20T14:20:21+00:00

I guess the fundamental mistake is in the design of the input form. I have tried to get rid of the boolean AND at the end of the form with no success. Just in case you are wondering what the input form looks like, here it is:

<select name = "choice2[]" class="btn btn-outline-secondary dropdown-toggle" type="button" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false" >
                <div class="dropdown-menu">
                    <option value="and" name="and">AND</option>
                    <option value="or" name="or">OR</option>
                    <option value="not" name="not">NOT</option>
                </div>
            </select>
        </div>

        <div class="input-group-prepend">

            <select name="choice1[]" class="btn btn-outline-secondary dropdown-toggle" type="button" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false">
                <div class="dropdown-menu">
                    <option value = "fruits" name = "fruits">Fruits</option>
                    <option value = "cereals" name = "cereals">Cereals</option>
                    <option value = "beverages" name = "beverages">Beverages</option>   
                </div>
            </select>
            </div>
    <input type="text" name="item_name[]" class="form-control" aria-label="Text input with dropdown button">

I guess I have to think long and hard about how best to design this form to avoid the problem of the hanging boolean.
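
One way around the hanging boolean without redesigning the form, sketched under several assumptions: the form is submitted with method="post", the three arrays (item_name[], choice1[], choice2[]) arrive with equal length and in the same order, and the operator selected for the last row is simply ignored. The function name `buildQuery` is hypothetical, and real input would still need validating.

```php
<?php
// Hypothetical helper: builds the query directly from the parallel arrays
// the form would submit, without going through an intermediate string.
function buildQuery(array $names, array $categories, array $operators): string
{
    $output = '';
    $lastIndex = count($names) - 1;
    foreach (array_values($names) as $i => $name) {
        $output .= '(' . trim($name) . '[' . $categories[$i] . '])';

        // Only emit an operator between two terms, never after the last one.
        if ($i < $lastIndex) {
            $output .= ' ' . strtoupper($operators[$i]) . ' ';
        }
    }
    return $output;
}

// In a request handler this might be called as (assumption, not from the thread):
// buildQuery($_POST['item_name'], $_POST['choice1'], $_POST['choice2']);
echo buildQuery(
    ['Mango', 'Maize', 'Mango juice'],
    ['fruits', 'cereals', 'beverages'],
    ['and', 'or', 'and']
);
// (Mango[fruits]) AND (Maize[cereals]) OR (Mango juice[beverages])
```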

score 2 · Answer 17 · 2018-07-20T14:34:30+00:00

If you use the regular expression

\((.+?),\s*(.+?),\s*(and|or)\),\s*?\((.+?),\s*(.+?),\s*(and|or)\),\s*?\((.+?),\s*(.+?),.*

and a replacement string of

\($1[$2]\) $3 \($4[$5]\) $6 \($7[$8]\)

then you get what you want except that and & or will be in the original case rather than upper case.
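
In PHP this pattern could be applied with preg_replace, roughly as sketched below. The variable names are illustrative, and the parentheses in the replacement string are written without backslashes, since they should need no escaping in a PHP replacement string. As noted, the pattern is fixed at exactly three groups.

```php
<?php
$input = '(Mango, fruits, and), (Maize, cereals, and), (Mango juice, beverages, and)';

// The three-group pattern from the post, wrapped in PHP regex delimiters.
$pattern = '/\((.+?),\s*(.+?),\s*(and|or)\),\s*?\((.+?),\s*(.+?),\s*(and|or)\),\s*?\((.+?),\s*(.+?),.*/';

// $1..$8 refer to the captured groups; single quotes stop PHP from
// treating them as variables.
$replacement = '($1[$2]) $3 ($4[$5]) $6 ($7[$8])';

$result = preg_replace($pattern, $replacement, $input);

echo $result;
// (Mango[fruits]) and (Maize[cereals]) and (Mango juice[beverages])
```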

JamesCherrill 4,733 Most Valuable Poster Team Colleague Featured Poster · Answer 18 · 2018-07-20T16:25:41+00:00

Are you quite sure it's
\((.+?),\s*(.+?),\s*(and|or)\),\s*?\((.+?),\s*(.+?),\s*(and|or)\),\s*?\((.+?),\s*(.+?),.*
and not
\((.+?),\s*(.+?),\s*(and|or)\),\s*?\((.+?),\s*(.+?),\s*(and|or\)),\s*?\((.+?),\s*(.+?),.*
?

(just kidding, but it's a great example of why regex syntax is someone's attempt at a joke that backfired with terrible long-term consequences)

score 1 · Answer 19 · 2018-07-20T16:40:55+00:00

Actually it can be shortened (yeah, right) to

\((.+?),\s*(.+?),\s*(and|or)\),\s*\((.+?),\s*(.+?),\s*(and|or)\),\s*\((.+?),\s*(.+?),.*

which breaks down to

 \(         opening `(`
 (.+?),     shortest string up to `,` (group $1)
 \s*        0 or more spaces 
 (.+?),     shortest string up to `,` (group $2)
 \s*        0 or more spaces
 (and|or)   logical operator (group $3)
 \)         closing ')'
 ,\s*       `,` followed by 0 or more spaces
 \(         opening `(`
 (.+?),     shortest string up to `,` (group $4)
 \s*        0 or more spaces 
 (.+?),     shortest string up to `,` (group $5)
 \s*        0 or more spaces
 (and|or)   logical operator (group $6)
 \)         closing ')'
 ,\s*       `,` followed by 0 or more spaces
 \(         opening `(`
 (.+?),     shortest string up to `,` (group $7)
 \s*        0 or more spaces 
 (.+?),     shortest string up to `,` (group $8)
 .*         remainder of string

It looks hideous but it's mostly three simple patterns repeated. If you punch it into Regexper you will see it as a graphic. It's too wide to insert here.

Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems. - Jamie Zawinski

alan.davies 185 What's this? · Answer 20 · 2018-07-20T22:45:27+00:00

Heh heh. Well done RJ. From the form snippet included, it appears that the OP expects 3 sets of data. If only two sets are included, is there a regex for this, that is, for a variable number of sets of data? That's where I was going with my 'replace each occurrence'. Genuinely interested. Am a complete duffer when it comes to regex.
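
One possible answer, as a sketch: keep only the single-group part of the pattern and let preg_match_all apply it as many times as it matches. The function name `convertAnyNumber` is hypothetical; the i flag and the dropping of the last operator are assumptions, and "not" from the form is not covered because the pattern above only lists and|or.

```php
<?php
// Hypothetical helper: applies the single-group pattern repeatedly, so it
// works for any number of "(name, category, operator)" groups.
function convertAnyNumber(string $input): string
{
    // PREG_SET_ORDER yields one array per group: [full match, name, category, operator].
    preg_match_all('/\((.+?),\s*(.+?),\s*(and|or)\)/i', $input, $sets, PREG_SET_ORDER);

    $output = '';
    $lastIndex = count($sets) - 1;
    foreach ($sets as $i => $set) {
        $output .= '(' . $set[1] . '[' . $set[2] . '])';

        // Copy the operator in upper case, except after the last group.
        if ($i < $lastIndex) {
            $output .= ' ' . strtoupper($set[3]) . ' ';
        }
    }
    return $output;
}

echo convertAnyNumber('(Mango, fruits, and), (Maize, cereals, or)');
// (Mango[fruits]) AND (Maize[cereals])
```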

score 2 · Answer 21 · 2018-07-21T14:01:55+00:00

Am a complete duffer when it comes to regex.

So was I until about a week ago. By coincidence I had just finished working through Beginning Regular Expressions by Andrew Watt and this seemed like a good opportunity to show off before I forget it all ^_^

alan.davies 185 What's this? · Answer 22 · 2018-07-21T14:30:19+00:00

Cheers RJ, will give it a look
