A string filtering function based on patterns.

Gribouillis | Jan 13th, 2009, 9:22 am
This snippet defines a function patfilter(pattern, rule, sequence) which filters a sequence of strings against a pattern according to a rule. Depending on the rule, the output is either a subsequence of the strings or a sequence of match objects.
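To make the two kinds of output concrete, here is a minimal sketch that uses only the standard re module rather than patfilter itself. The sample list `lines` and the pattern are made up for illustration.

```python
import re

# Hypothetical sample data, for illustration only.
lines = ["alpha seq one", "beta", "seq at start"]
pat = re.compile("seq")

# A subsequence of strings: the lines which contain the pattern.
containing = [s for s in lines if pat.search(s)]

# A sequence of match objects: one per line which contains the pattern.
found = [mo for mo in (pat.search(s) for s in lines) if mo]

print(containing)                    # ['alpha seq one', 'seq at start']
print([mo.start() for mo in found])  # [6, 0]
```

A match object keeps a reference to the string it came from in its `string` attribute, so the original line can always be recovered from it.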
#!/usr/bin/env python
# patfilter.py
# Copyright (c) Gribouillis at www.daniweb.com
import re
from fnmatch import fnmatch, fnmatchcase, filter as fnfilter
try:  # python 2.5
    from itertools import ifilter as filter
except ImportError:  # python 3.0: the builtin filter is already lazy
    pass


def patfilter(pattern, rule, sequence):
    """patfilter(pattern, rule, sequence_of_strings) --> sequence
    patfilter.rules() -> the set of accepted rules
    ARGUMENTS:
      pattern <- a regular expression (re object or string)
      rule <- a string
      sequence <- an iterable sequence of strings
    OUTPUT:
      depending on the rule
        "m" -> the strings in the sequence which match the pattern
        "s" -> the strings which contain the pattern
        "!m" -> the strings which don't match the pattern
        "!s" -> the strings which don't contain the pattern
        "@m" -> the match objects for all the matches in the sequence
        "@s" -> the match objects for at most one search per string
        "@a" -> the match objects for all the searches in the sequence
                (subsequent match objects may concern the same string)
        "f" -> the strings which match the pattern in the sense of
               the fnmatch module (*)
        "!f" -> the strings which don't fnmatch the pattern (*)
        "F" -> the strings which fnmatch, case sensitive (*)
        "!F" -> the strings which don't fnmatch, case sensitive (*)

    (*) the pattern must be a string for fnmatch rules.
    """
    if rule not in _PatFilter._rules:
        raise ValueError("Unknown rule.")
    # Only the regex rules need a compiled pattern; fnmatch rules take a string.
    if rule[-1] not in "fF":
        pattern = re.compile(pattern)
    # Dispatch to the static method registered for this rule.
    return getattr(_PatFilter, _PatFilter._rules[rule])(pattern, sequence)


def rules():
    "rules() -> the set of rules accepted by patfilter."
    return set(_PatFilter._rules)


patfilter.rules = rules

__all__ = ["patfilter", "rules"]


class _PatFilter(object):
    # Maps each rule string to the name of the static method implementing it.
    _rules = {
        "m": "match",
        "s": "search",
        "!m": "nomatch",
        "!s": "nosearch",
        "@m": "matches",
        "@s": "searches",
        "@a": "allsearches",
        "f": "fnmatch",
        "!f": "nofnmatch",
        "F": "fnmatchcase",
        "!F": "nofnmatchcase",
    }

    @staticmethod
    def matches(pat, seq):
        return filter(None, (pat.match(x) for x in seq))

    @staticmethod
    def searches(pat, seq):
        return filter(None, (pat.search(x) for x in seq))

    @staticmethod
    def allsearches(pat, seq):
        return (mo for x in seq for mo in pat.finditer(x))

    @staticmethod
    def match(pat, seq):
        return filter(lambda x: pat.match(x), seq)

    @staticmethod
    def search(pat, seq):
        return filter(lambda x: pat.search(x), seq)

    @staticmethod
    def nomatch(pat, seq):
        return filter(lambda x: not pat.match(x), seq)

    @staticmethod
    def nosearch(pat, seq):
        return filter(lambda x: not pat.search(x), seq)

    # In the lambdas below, fnmatch and fnmatchcase refer to the functions
    # imported at module level, not to the static methods of this class.
    @staticmethod
    def fnmatch(pat, seq):
        return fnfilter(seq, pat)

    @staticmethod
    def nofnmatch(pat, seq):
        return filter(lambda x: not fnmatch(x, pat), seq)

    @staticmethod
    def fnmatchcase(pat, seq):
        return filter(lambda x: fnmatchcase(x, pat), seq)

    @staticmethod
    def nofnmatchcase(pat, seq):
        return filter(lambda x: not fnmatchcase(x, pat), seq)


if __name__ == "__main__":
    # Demo: apply every rule to the lines of this source file.
    L = list(open(__file__))
    for r in patfilter.rules():
        pat = "*seq*" if r[-1] in "fF" else "seq"
        print("======= RULE '%s' ==== PATTERN '%s' ====" % (r, pat))
        for item in patfilter(pat, r, L):
            if r[0] == "@":
                item = item.string  # recover the line from the match object
            print(item.rstrip())
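The regex rules differ only in which re call they rely on. The sketch below reproduces the "m", "s", "@s" and "@a" behaviours with the re module directly, so it runs without patfilter; the `strings` list is invented for the example.

```python
import re

pat = re.compile("seq")
# Made-up sample strings, for illustration only.
strings = ["seq, then seq again", "a seq in the middle", "nothing here"]

# "m"-style: the pattern must match at the start of the string.
matched = [s for s in strings if pat.match(s)]

# "s"-style: the pattern may occur anywhere in the string.
searched = [s for s in strings if pat.search(s)]

# "@s"-style: at most one match object per string (the first occurrence).
first_hits = [mo for mo in map(pat.search, strings) if mo]

# "@a"-style: one match object per non-overlapping occurrence.
all_hits = [mo for s in strings for mo in pat.finditer(s)]

print(len(matched), len(searched), len(first_hits), len(all_hits))  # 1 2 2 3
```

The first string contributes two match objects to the "@a"-style result, which is why that rule can yield several match objects for the same string.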
leegeorg07 | Jan 15th, 2009
How do you fill in the pattern, rule and sequence arguments?

patfilter(pattern, rule, sequence)???
 
Gribouillis | Jan 16th, 2009
For example

lines = iter(open(filename))  # this is a sequence of strings (here, an iterator)
for line in patfilter(r"<a\s*href", "s", lines):  # filter the lines which contain a href
    myfunction(line)

or

import os
filenames = os.listdir(os.getcwd())
for name in patfilter("*.pyc", "f", filenames):  # filter the compiled python files
    os.unlink(name)
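The fnmatch rules can also be tried out on a plain list of names without touching the file system. This is a small sketch that calls fnmatchcase from the standard fnmatch module directly, which is what the "F" and "!F" rules rely on; the `names` list is hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical file names, for illustration only.
names = ["mod.py", "mod.pyc", "README", "OLD.PYC"]

# "F"-style: names which match the shell pattern, case sensitive.
compiled = [n for n in names if fnmatchcase(n, "*.pyc")]

# "!F"-style: names which don't match it.
others = [n for n in names if not fnmatchcase(n, "*.pyc")]

print(compiled)  # ['mod.pyc']
print(others)    # ['mod.py', 'README', 'OLD.PYC']
```

With the "f" and "!f" rules the result for a name like "OLD.PYC" depends on the platform, because fnmatch.fnmatch first normalizes case with os.path.normcase.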
 
 
