DaniWeb IT Discussion Community


Gribouillis Jan 13th, 2009 9:22 am
A string filtering function based on patterns.
 
This snippet defines a function
patfilter(pattern, rule, sequence)
which filters a sequence of strings according to one of several rules. Depending on the rule, the output is either a subsequence of the input strings or a sequence of match objects.

#!/usr/bin/env python
# patfilter.py
# Copyright (c) Gribouillis at www.daniweb.com
import re
from fnmatch import fnmatch, fnmatchcase, filter as fnfilter
try:  # python 2.5: use the lazy ifilter in place of the builtin filter
    from itertools import ifilter as filter
except ImportError:  # python 3.0: the builtin filter is already lazy
    pass

def patfilter(pattern, rule, sequence):
    """patfilter(pattern, rule, sequence_of_strings) --> sequence
    patfilter.rules() -> the set of accepted rules
    ARGUMENTS:
        pattern <- a regular expression (re object or string)
        rule <- a string
        sequence <- an iterable sequence of strings
    OUTPUT:
        depending on the rule
        "m" -> the strings in the sequence which match the pattern
        "s" -> the strings which contain the pattern
        "!m" -> the strings which don't match the pattern
        "!s" -> the strings which don't contain the pattern
        "@m" -> the match objects for all the matches in the sequence
        "@s" -> the match objects for at most one search per string
        "@a" -> the match objects for all the searches in the sequence
                (subsequent match objects may concern the same string)
        "f" -> the strings which match the pattern in the sense of
               the fnmatch module (*)
        "!f" -> the strings which don't fnmatch the pattern (*)
        "F" -> the strings which fnmatch, case sensitive (*)
        "!F" -> the strings which don't fnmatch, case sensitive (*)

        (*) the pattern must be a string for fnmatch rules.
    """
    if rule not in _PatFilter._rules:
        raise ValueError("Unknown rule.")
    # Only the regex rules need a compiled pattern; fnmatch rules take a plain string.
    if rule[-1] not in "fF":
        pattern = re.compile(pattern)
    # Dispatch to the static method of _PatFilter registered for this rule.
    return getattr(_PatFilter, _PatFilter._rules[rule])(pattern, sequence)

def rules():
    "rules() -> the set of rules accepted by patfilter."
    return set(_PatFilter._rules)

patfilter.rules = rules

__all__ = ["patfilter", "rules"]

class _PatFilter(object):
    # Maps each rule string to the name of the static method implementing it.
    _rules = {
        "m": "match",
        "s": "search",
        "!m": "nomatch",
        "!s": "nosearch",
        "@m": "matches",
        "@s": "searches",
        "@a": "allsearches",
        "f": "fnmatch",
        "!f": "nofnmatch",
        "F": "fnmatchcase",
        "!F": "nofnmatchcase"
    }
    # Rules returning match objects.
    @staticmethod
    def matches(pat, seq):
        return filter(None, (pat.match(x) for x in seq))
    @staticmethod
    def searches(pat, seq):
        return filter(None, (pat.search(x) for x in seq))
    @staticmethod
    def allsearches(pat, seq):
        return (mo for x in seq for mo in pat.finditer(x))
    # Rules returning strings, based on regular expressions.
    @staticmethod
    def match(pat, seq):
        return filter(lambda x: pat.match(x), seq)
    @staticmethod
    def search(pat, seq):
        return filter(lambda x: pat.search(x), seq)
    @staticmethod
    def nomatch(pat, seq):
        return filter(lambda x: not pat.match(x), seq)
    @staticmethod
    def nosearch(pat, seq):
        return filter(lambda x: not pat.search(x), seq)
    # Rules returning strings, based on the fnmatch module.
    @staticmethod
    def fnmatch(pat, seq):
        return fnfilter(seq, pat)
    @staticmethod
    def nofnmatch(pat, seq):
        return filter(lambda x: not fnmatch(x, pat), seq)
    @staticmethod
    def fnmatchcase(pat, seq):
        return filter(lambda x: fnmatchcase(x, pat), seq)
    @staticmethod
    def nofnmatchcase(pat, seq):
        return filter(lambda x: not fnmatchcase(x, pat), seq)

if __name__ == "__main__":
    # Demo: apply every rule to the lines of this very file.
    L = list(open(__file__))
    for r in patfilter.rules():
        pat = "*seq*" if r[-1] in "fF" else "seq"
        print("======= RULE '%s' ==== PATTERN '%s' ====" % (r, pat))
        for item in patfilter(pat, r, L):
            if r[0] == "@":
                item = item.string
            print(item.rstrip())

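
To make the difference between the regex rules more concrete, here is a minimal sketch that calls the re module directly instead of going through patfilter. The list `words` and the variable names are made up for the illustration; only the `re` calls come from the snippet. It shows that "m" only accepts strings where the pattern matches at the beginning, "s" accepts an occurrence anywhere, and "@a" can yield several match objects for the same string.

```python
import re

# Hypothetical sample data, not taken from the snippet.
words = ["sequence", "subsequence", "pattern", "seq and seq"]
pat = re.compile("seq")

# Rule "m": strings where the pattern matches at the start of the string.
matched = [w for w in words if pat.match(w)]

# Rule "s": strings where the pattern occurs anywhere.
searched = [w for w in words if pat.search(w)]

# Rule "!s": strings where the pattern does not occur at all.
not_searched = [w for w in words if not pat.search(w)]

# Rule "@a": one match object per occurrence, so a string may show up twice.
all_spans = [(mo.string, mo.start()) for w in words for mo in pat.finditer(w)]

print(matched)       # ['sequence', 'seq and seq']
print(searched)      # ['sequence', 'subsequence', 'seq and seq']
print(not_searched)  # ['pattern']
print(all_spans)     # "seq and seq" appears twice, at offsets 0 and 8
```

With patfilter itself, the same results would be expected from the rules "m", "s", "!s" and "@a" applied to `words` with the pattern "seq".
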
leegeorg07 Jan 15th, 2009 9:28 am
How do you fill in the pattern, rule and sequence arguments?

patfilter(pattern, rule, sequence)???

Gribouillis Jan 16th, 2009 7:00 pm
For example:

import os

# filename and myfunction are placeholders for your own file name and function
lines = iter(open(filename))  # any iterable of strings works (here, a file iterator)
for line in patfilter(r"<a\s*href", "s", lines):  # keep the lines which contain an href
    myfunction(line)

or

filenames = os.listdir(os.getcwd())
for name in patfilter("*.pyc", "f", filenames):  # keep the compiled Python files
    os.unlink(name)
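
One point about the "f" and "F" rules is worth a small illustration. The sketch below uses the fnmatch module directly, with a made-up list of file names, and is only meant as an approximation of what the rules do. `fnmatchcase` is case sensitive everywhere, while `fnmatch.filter` first normalizes case with `os.path.normcase`, so its result for mixed-case names depends on the platform.

```python
from fnmatch import fnmatchcase, filter as fnfilter

# Hypothetical file names, made up for this example.
names = ["module.py", "module.pyc", "CACHE.PYC", "notes.txt"]

# Rule "F": case sensitive on every platform.
case_sensitive = [n for n in names if fnmatchcase(n, "*.pyc")]

# Rule "!F": the complement of rule "F".
not_case_sensitive = [n for n in names if not fnmatchcase(n, "*.pyc")]

# Rule "f": fnmatch.filter applies os.path.normcase first, so "CACHE.PYC"
# is kept on Windows but dropped on POSIX systems.
platform_dependent = fnfilter(names, "*.pyc")

print(case_sensitive)      # ['module.pyc']
print(not_case_sensitive)  # ['module.py', 'CACHE.PYC', 'notes.txt']
print(platform_dependent)  # always contains 'module.pyc'
```

So if the goal is to delete compiled files regardless of the case of their extension, the "f" rule alone may not be enough on a POSIX system.
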


All times are GMT -4.
