adam291086

Regular expressions

  #1  
Dec 1st, 2008
Hello,

I have a group of strings that I want to handle with regular expressions:

<Element Generation at d66238>
<Element Vitals at d662b0>
<Element Network at d66670>
<Element Hardware at d66eb8>
<Element Memory at d6ac88>
<Element Swap at d6e0a8>
<Element Swapdevices at d6e238>
<Element FileSystem at d6e5d0>
<Element Vitals at d662b0>

I need to get the word after "Element", i.e. Vitals, Network, etc.

How would I do it?
Gribouillis

Re: Regular expressions

  #2  
Dec 1st, 2008
Here is a way:

import re

data = """
<Element Generation at d66238>
<Element Vitals at d662b0>
<Element Network at d66670>
<Element Hardware at d66eb8>
<Element Memory at d6ac88>
<Element Swap at d6e0a8>
<Element Swapdevices at d6e238>
<Element FileSystem at d6e5d0>
<Element Vitals at d662b0>
"""
# Capture the run of non-whitespace characters that follows "<Element ".
expr = re.compile(r"<Element ([^\s]+)")

if __name__ == "__main__":
    # finditer yields one match object per occurrence in data.
    for match in expr.finditer(data):
        print(match.group(1))
adam291086

Re: Regular expressions

  #3  
Dec 1st, 2008
Can you explain that to me? I want to understand it.
Gribouillis

Re: Regular expressions

  #4  
Dec 1st, 2008
Well, in the regular expression language, the string "<Element ([^\s]+)" means the character < followed by the character E ... followed by t, followed by a single space, followed by a group (...) which will be referred to later as group(1). The \s means a whitespace character, [^\s] means any non-whitespace character, and the + means one or more such non-whitespace characters. Finally, the r"..." syntax (a raw string) tells Python not to interpret the backslashes in the string literal, so the \s reaches the regular expression engine unchanged.

Now the statement expr = re.compile(r"...") creates a regular expression object from my string, which has methods to search for occurrences of the expression in a string. One of these methods is expr.finditer, which iterates over all the matches found in the string data. For each such match, a match object match is created, which contains methods to access the occurrence found in the string. match.group(1) retrieves the part of the string which corresponds to the group ([^\s]+).
Last edited by Gribouillis : Dec 1st, 2008 at 11:03 am.
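
To make the explanation above concrete, here is a minimal sketch that takes the same pattern apart on a single line from the original post. The variable names whole, name and names are only illustrative, and the use of search and findall (rather than finditer) is an addition for demonstration, not part of the posted solution.

```python
import re

# A raw string and an escaped ordinary string spell the same pattern text.
assert r"\s" == "\\s"

expr = re.compile(r"<Element ([^\s]+)")

line = "<Element Vitals at d662b0>"  # sample line taken from the first post

match = expr.search(line)
whole = match.group(0)  # everything the pattern matched
name = match.group(1)   # only the text captured by the parentheses
print(whole)  # <Element Vitals
print(name)   # Vitals

# With exactly one group in the pattern, findall returns the captured
# text of each match as a list of strings.
data = "<Element Network at d66670>\n<Element Swap at d6e0a8>"
names = expr.findall(data)
print(names)  # ['Network', 'Swap']
```

The match stops at the first space after the name because [^\s]+ cannot consume whitespace, which is why " at d662b0>" is never part of group(0).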
jlm699

Re: Regular expressions

  #5  
Dec 1st, 2008
Here's some good reading on regular expressions.
adam291086

Re: Regular expressions

  #6  
Dec 2nd, 2008
Thanks guys, really helpful.

The method won't work on the following script, and I get this error:

Traceback (most recent call last):
  File "/Users/adamplowman/Desktop/getting_xml_info copy.py", line 34, in <module>
    for match in expr.finditer(txt):
TypeError: expected string or buffer


#!/usr/bin/env python

from xml.etree import ElementTree as ET
import os
import urllib
import re
info={}
test={}
def find_text(element):
    if element.text is None:
        for subelement in element:
            for txt in find_text(subelement):
                yield txt
                
    else:
        info[element] = element.text
        
        
data = " "
      

feed = urllib.urlopen("http://server-up.theatticnetwork.net/demo/")
try:
    tree = ET.parse(feed)
		
except Exception, inst:
    print "Unexpected error opening %s: %s" % (tree, inst)
    
root= tree.getroot()
text = root.getchildren()
for txt in text:
    expr = re.compile(r"<Element ([^\s]+)")
    if __name__ == "__main__":
        for match in expr.finditer(txt):
            print match.group(1)

I am not sure why, though.
Gribouillis

Re: Regular expressions

  #7  
Dec 2nd, 2008
It's because the items returned by root.getchildren() are not strings but Element objects. You could replace the end of your program with

root = tree.getroot()
text = root.getchildren()
expr = re.compile(r"<Element ([^\s]+)")
for element in text:
    # Convert the Element to its string form before matching.
    txt = str(element)
    for match in expr.finditer(txt):
        print(match.group(1))

However, this is not very useful because the data can readily be obtained as an attribute of the Element object, so you could simply write

root = tree.getroot()
text = root.getchildren()
for element in text:
    print(element.tag)
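
For readers on a current Python 3, here is a small self-contained sketch of the same point. It makes a few assumptions: the inline XML and its root tag are hypothetical stand-ins for the feed fetched in the script above, and the children are obtained by iterating over the root directly, because getchildren() was removed in Python 3.9. Note also that on Python 3 the string form of an Element puts quotes around the tag name, so the regular expression route would capture the quotes as well, which is one more reason to prefer element.tag.

```python
import re
from xml.etree import ElementTree as ET

# Hypothetical stand-in for the XML document served by the remote URL.
xml_text = "<root><Generation/><Vitals/><Network/></root>"
root = ET.fromstring(xml_text)

expr = re.compile(r"<Element ([^\s]+)")

# Handing an Element (not a string) to the regex raises TypeError,
# which is the failure reported in the traceback above.
first = root[0]
try:
    expr.search(first)
    raised = False
except TypeError:
    raised = True
print(raised)

# On Python 3 the string form quotes the tag: <Element 'Generation' at 0x...>
as_text = str(first)
print(as_text)

# The tag name is available directly; iterating over an element
# yields its direct children, so no regex is needed.
tags = [child.tag for child in root]
print(tags)
```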