0

I am not a computer professional; I just like to develop my own toys.

I am using Globiflow to extract the information I need from an XML file. Everything works except for one element that has no closing tag.

Regular node with closing tag:

<proper price>$4</price>

Below is the code used for all data extraction.

preg_match_all_gf("/<city>(.*?)<\/city>/ism", [(variable)token],4)

This would get the 4th occurrence.
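
For readers without Globiflow at hand, the behaviour described above can be approximated in plain PHP. This is only a sketch: `nth_match` is a hypothetical helper, it assumes that `preg_match_all_gf` returns the first capture group of the Nth match (counting from 1), and the sample XML is invented.

```php
<?php
// Hypothetical stand-in for Globiflow's preg_match_all_gf (assumed semantics:
// return capture group 1 of the Nth match, counting from 1, or null).
function nth_match(string $pattern, string $subject, int $n): ?string
{
    preg_match_all($pattern, $subject, $matches);
    // $matches[1] holds capture group 1 for every match, in document order.
    return $matches[1][$n - 1] ?? null;
}

// Invented sample data, not taken from the real feed.
$xml = '<city>Austin</city><city>Boise</city><city>Salem</city><city>Tampa</city>';

echo nth_match('/<city>(.*?)<\/city>/ism', $xml, 4), PHP_EOL; // Tampa
```

Asking for an occurrence that does not exist returns null in this sketch; how Globiflow itself reacts in that case is not covered here.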

And this is the element I currently need; it has no closing tag:

<comp score:"9.0">

I managed to get the first digit (9 in the example above) using a regex, as can be seen here.
The code used is:

preg_match_all_gf('/(?<=score=")[^"]*(?=.0")/', [(Variable) xml],2)

If you remove "*" it gets only the first occurrence:

preg_match_all_gf('/(?<=score=")[^"](?=.0")/', [(Variable) xml],2)
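
A quick check in plain PHP suggests what the "*" actually changes. This is a sketch on invented sample data: it calls `preg_match_all` directly rather than Globiflow's wrapper, and the patterns are written with a closing `/` delimiter.

```php
<?php
// Invented sample: one single-digit and one two-digit score.
$xml = '<comp score="9.0"><comp score="10.0">';

// With "*": any run of non-quote characters that is followed by
// one character, a 0 and a quote (the dot is a wildcard here).
preg_match_all('/(?<=score=")[^"]*(?=.0")/', $xml, $withStar);

// Without "*": exactly one character, so a two-digit score cannot match.
preg_match_all('/(?<=score=")[^"](?=.0")/', $xml, $withoutStar);

print_r($withStar[0]);    // contains 9 and 10
print_r($withoutStar[0]); // contains 9 only
```

So on this sample the version without "*" does not stop at the first occurrence; it skips every score that has more than one digit before the decimal point.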

So now the problem: the iteration can go up to 25 entries.

The regex finds ALL the correct scores I need, but it returns them either all together or just the first one.

Final goal:

city1, address1, state1, price1, etc.1, score1 (current problem).

city2, address2, state2, price2, etc.2, score2
And so forth.

I am not able to pull one score at a time in the right order, because it does not seem to accept the preg_match syntax, more specifically the offset.

Any idea? Thank you so much for any help!

0

Globiflow supports only a limited set of PHP functions. Here is the list.
I believe it has to be done with preg_match or preg_match_all.
I have heard that a regex is not the best way to parse an XML file, but that is what we have today.

2

Are you sure about <proper price>$4</price>? To me it seems to violate the basic XML rules. A well-formed tag would be <proper_price>$4</proper_price> or <price type="proper">$4</price>. Also, an element with no closing tag should be marked as such by writing <comp score="9.0"/> (a self-closing, or empty-element, tag) instead of <comp score:"9.0">. Note that I use = instead of :. (Unless I have missed a new version of XML in which attributes can be assigned with either = or :?)
How on earth could a parser know that the next tag isn't nested inside an unclosed one? It would have to keep a whole list of the tags that violate the XML rules by omitting the closing tag. (In other words, if you want to keep using this 'incorrect XML', you have to catch all those violations yourself. If you switch to correct XML, then most likely any XML parser could be used.)
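
To illustrate that last point: once the markup is well-formed, a standard parser can read the attribute directly. Below is a minimal sketch using PHP's SimpleXML extension. The `<listings>` and `<listing>` wrappers and the sample values are made up for illustration, and whether Globiflow exposes SimpleXML at all is not something this thread confirms.

```php
<?php
// Invented, well-formed sample; only <price> and <comp score="..."> come
// from the thread, the wrapper elements are assumptions.
$xml = <<<'XML'
<listings>
  <listing><price>$4</price><comp score="9.0"/></listing>
  <listing><price>$7</price><comp score="10.0"/></listing>
</listings>
XML;

$doc = simplexml_load_string($xml);

$rows = [];
foreach ($doc->listing as $listing) {
    // Attributes use array syntax; cast to string to get the plain value.
    $rows[] = [(string) $listing->price, (string) $listing->comp['score']];
}

foreach ($rows as [$price, $score]) {
    echo $price, ' ', $score, PHP_EOL;
}
```

Each price stays paired with its own score because both are read from the same `<listing>` element, so no occurrence counting is needed.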

0

I am so sorry, the full code for the nodes with a closing tag is:

preg_match_all_gf("/<amount currency=\"USD\">(.*?)<\/amount>/ism",[(Variable) token], 4)

This would get the 4th occurrence.

Here is a partial copy of the xml file.
zillowxml.jpg

0

Thank you, @pty, I had read that. Globiflow is what we have today. This is to push data onto Podio using Globiflow, which has its own set of PHP functions.
I have no idea how to do this with regular code (not the code per se, but how to operate Podio CRM from regular PHP).

1

Here is the solution that worked:

preg_match_all ('/score="([0-9]{1,2})./ism', [(Variable) token],5)
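
The same idea can be tried in plain PHP, as a sketch on invented data: the capture group returns every score in document order, so the Nth score lines up with the Nth city. The sketch calls `preg_match_all` directly (Globiflow's offset argument is replaced by array indexing) and escapes the dot, which the pattern above leaves as a wildcard.

```php
<?php
// Invented sample feed; <comp> has no closing tag, as in the question.
$xml = '<city>Austin</city><comp score="9.0">'
     . '<city>Boise</city><comp score="10.0">';

// Capture group 1 of each pattern ends up in $cities[1] and $scores[1].
preg_match_all('/<city>(.*?)<\/city>/is', $xml, $cities);
preg_match_all('/score="([0-9]{1,2})\./i', $xml, $scores);

// Pair the Nth city with the Nth score.
$rows = array_map(null, $cities[1], $scores[1]);

foreach ($rows as [$city, $score]) {
    echo $city, ', ', $score, PHP_EOL;
}
```

The pairing relies on every listing having exactly one city and one score; a listing with a missing score would shift all later rows.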

Thank you all.
