I have several strings that I want to split on different delimiters, such as " , { } ( ):

$ppl[0] = "Balko, Vlado		\"Panelák\" (2008) {Byt na tretom (#1.55)}";
$ppl[1] = "'Abd Al-Hamid, Ja'far	A Two Hour Delay (2001)";
$ppl[2] = "'t Hoen, Frans		De reünie (1963) (TV)";
$ppl[3] = "1, Todd			\"5 Deadly Videos\" (2004)";
$ppl[4] = "			\"School voor volwassenen\" (1960) {(#1.7)}";

So far I have been using this pattern for strings 1 and 2:

$pattern = '#[,\t()]+#';

although it leaves an undesired entry at the end (index 4):

Array ( [0] => 'Abd Al-Hamid [1] => Ja'far [2] => A Two Hour Delay [3] => 2001 [4] => )
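
The entry at index 4 looks like an empty string: the input ends with a delimiter, so preg_split emits one last empty piece, and the ")" after it is just print_r closing the array. A minimal sketch, assuming the same pattern and the second sample string, that uses the PREG_SPLIT_NO_EMPTY flag and trims each piece (the variable names are only for illustration):

```php
<?php
// Second sample string from the question; \t is a real tab here.
$line = "'Abd Al-Hamid, Ja'far\tA Two Hour Delay (2001)";

// -1 means "no limit"; PREG_SPLIT_NO_EMPTY drops empty pieces,
// including the one produced by the trailing ")".
$parts = preg_split('#[,\t()]+#', $line, -1, PREG_SPLIT_NO_EMPTY);

// The space after the comma and before "(" is not a delimiter,
// so trim it off each piece.
$parts = array_map('trim', $parts);

print_r($parts);
```

Because empty pieces are never emitted, the keys stay consecutive and no re-indexing is needed.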

But when I try to adapt it to capture text inside quotes, I get a lot of empty entries in the middle of the array, which messes up everything after them. I could of course remove the empty entries from the array and fix up the keys, but I'd rather have a correct regex.
The delimiters I need are:

" , \t () {}
I tried

["(.*?)"]

to get text between quotes but that leaves me with empty values in my array. Any ideas?
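
One thing that may explain the empty values: the square brackets in ["(.*?)"] make the whole thing a character class, so it matches any single one of the characters " ( . * ? ) rather than capturing the text between two quotes. A rough sketch of one way to split on the full delimiter set instead, assuming spaces inside a field should be kept; splitFields is a hypothetical helper name, not something from the thread:

```php
<?php
// Hypothetical helper: split one line on " , tab ( ) { } and
// return the non-empty, trimmed fields with consecutive keys.
function splitFields($line)
{
    // Runs of delimiters count as one split point; empty pieces are dropped.
    $pieces = preg_split('/[",\t(){}]+/', $line, -1, PREG_SPLIT_NO_EMPTY);

    // A lone space between two delimiters (e.g. between the closing
    // quote and "(") survives the split, so trim and filter again.
    $pieces = array_map('trim', $pieces);
    $pieces = array_filter($pieces, function ($s) {
        return $s !== '';
    });

    return array_values($pieces);
}

$fields = splitFields("1, Todd\t\t\t\"5 Deadly Videos\" (2004)");
print_r($fields);
```

The strict comparison in the filter keeps a field such as "0", which a plain array_filter call would throw away.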


Perhaps replacing first and then splitting?

$str = preg_split("/\s+/",trim(preg_replace("/(\s|\"|,|\(|\)|\{|\}| )/",' ',$str)));
print_r($str);

where $str is "'Abd Al-Hamid, Ja'far A Two Hour Delay (2001)"

EDIT: I just noticed you don't want to split on spaces. I'm going to bed, but this at least gives you an idea.


I like the idea of replacing the delimiters with a single character and then splitting on it. I'm just trying to figure out how to avoid empty entries when something like this:

Part1$Part2$$Part3

will produce

[0] => Part1 [1] => Part2 [2] => [3] => Part3

where $ is the new delimiter
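
For what it's worth, the empty slot only shows up when each $ is treated as its own split point. A small sketch, using the Part1$Part2$$Part3 example, of two ways to avoid it: matching runs of $ with +, or passing PREG_SPLIT_NO_EMPTY (variable names are illustrative):

```php
<?php
$joined = 'Part1$Part2$$Part3';

// explode() splits at every single "$", so "$$" leaves an empty entry.
$withEmpty = explode('$', $joined);

// "\$+" treats a run of "$" as one delimiter.
$byRuns = preg_split('/\$+/', $joined);

// Alternatively, split at each "$" and let preg_split drop empty pieces.
$noEmpty = preg_split('/\$/', $joined, -1, PREG_SPLIT_NO_EMPTY);

print_r($withEmpty);
print_r($byRuns);
print_r($noEmpty);
```

The + version still leaves an empty entry if the string starts or ends with $, so the flag is the safer of the two.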


OK, maybe this?

$str = "'Abd Al-Hamid, Ja'far	A    	{help}	#	Two Hour Delay (2001)";
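// Replace every delimiter with '$', then split on runs of '$'.
$str = preg_split("/[\$]+/",trim(preg_replace("/(\t|\"|,|\(|\)|\{|\})/",'$',$str)));
// Remove the empty entries that are left over.
$new_r = array_filter($str);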
print_r($new_r);
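
Two things to watch with array_filter when it is called without a callback: it keeps the original keys, and it removes anything PHP considers empty, which includes the string "0" but not a whitespace-only string. A short sketch with made-up pieces (not output from the code above) that shows the difference and a stricter alternative:

```php
<?php
// Made-up pieces, as a split might leave them.
$pieces = ['', 'Todd', '0', '  ', '2004'];

// Default filter: drops '' and '0', keeps '  ', and keeps keys 1, 3, 4.
$loose = array_filter($pieces);

// Stricter version: trim first, drop only truly empty strings,
// then re-index so the keys run 0, 1, 2.
$strict = array_values(array_filter(
    array_map('trim', $pieces),
    function ($s) {
        return $s !== '';
    }
));

print_r($loose);
print_r($strict);
```

If a title or name could ever be the single character 0, the stricter callback is probably the one to use.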

Thanks ardav, but your first solution was spot on. The thing is that the strings use a strange format. For example, sometimes there is a space between some of the elements instead of a tab, which breaks your replace. At first I thought about removing the whitespace everywhere except in the episode's name (the one between {}), but instead I just used what you gave me and then removed the empty entries from the array. Thanks again for the solution; it's way easier to do it your way than to try to find a super-duper regex that handles all the different delimiters.
