Building argv from command line

Gribouillis, 25 Days Ago, 5:47 pm
Some time ago, I was writing a small command line interpreter with the help of the standard module cmd, which offers minimal support for such tasks. The problem was that this module has no function to parse a command line into a list argv that can be passed to optparse, for example.
My first solution used the third-party module pygments, which contains parsers for different languages: I parsed the command line with pygments' bash parser.
However, I was not completely happy with this solution, so I started looking for the C function which builds argv for C programs. I finally found such a function in the GNU libiberty library, which is used by the gcc compiler.
This function was simple enough that I decided to write a pure Python implementation of it, using Python's regular expressions and closely following the syntax rules of libiberty's buildargv function (100% compatibility is not guaranteed).
This snippet is the result. The class Argv, which subclasses list, contains methods to transform a command line into a list of arguments, with a behaviour similar to that of a C compiler. It also contains methods to write the argument list to a response file and to read response files to extract command arguments.
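Before the full module, here is a rough sketch of the syntax rules described above, written as a plain character loop instead of regular expressions. It is only an illustration under my own assumptions: the function name split_like_buildargv is hypothetical, it is neither the module's code nor a port of libiberty's C source, and it drops empty arguments the way the module's build method does.

```python
import string


def split_like_buildargv(line):
    """Split a command line on unquoted whitespace (illustrative sketch)."""
    args = []
    current = []      # characters of the argument being built
    quote = None      # the active quote character, or None
    escaped = False   # True right after a backslash

    for ch in line:
        if escaped:
            # a backslash makes the next character literal, even in quotes
            current.append(ch)
            escaped = False
        elif ch == "\\":
            escaped = True
        elif quote is not None:
            # inside quotes: only the matching quote character is special
            if ch == quote:
                quote = None
            else:
                current.append(ch)
        elif ch in "'\"":
            quote = ch
        elif ch in string.whitespace:
            # unquoted whitespace ends the current argument
            if current:
                args.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        args.append("".join(current))
    return args


if __name__ == "__main__":
    print(split_like_buildargv("arg \"foo bar\" has embedded whitespace"))
```

An unterminated quote simply runs to the end of the line, and a trailing lone backslash is ignored.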

Here is the code, enjoy
Last edited by Gribouillis; 25 Days Ago at 6:03 pm.
Python Syntax
# argv.py
"""
This module implements a class Argv which parses a command line
to produce a list of arguments similar to the arguments passed
to a program called from a command shell.

Note that this class doesn't handle the argument expansion done by
a specific shell (for example bash replaces $HOME by the path
to your home directory before passing the command line to the
program).

An effort was made to follow as closely as possible the algorithm
used in GNU libiberty's buildargv function, which is used by the
GNU C compiler for example.

Other functions are provided as methods whose behaviour imitates
functions of libiberty.

* Written by Gribouillis for the python forum at Daniweb.com.
Use this code freely, copy it, modify it, redistribute it.
"""

__all__ = [
    "Argv", "buildargv", "writeargv",
    "expandargv", "dupargv", "freeargv"
]

import re
import string

# small helpers to assemble the regular expression piece by piece

def end_of_string(eos):
    return "(?P<{eos}>$)".format(eos=eos)

def one_or_more(pat):
    return r"(?:{pat})+".format(pat=pat)

def zero_or_more(pat):
    return r"(?:{pat})*".format(pat=pat)

def one_of(*pats):
    return "(?:{pats})".format(pats="|".join(pats))

def all_of(*pats):
    return "".join(pats)

# repr() turns the whitespace characters into escape sequences that
# the regex engine understands inside a character class
escaped_char = r"\\."
escaped_opt = r"\\.?"
non_space_or_quote = r"[^{0}\'\"]".format(repr(string.whitespace))
one_space_eoarg = r"(?P<eoarg>[{0}])".format(repr(string.whitespace))
s_quote = r"[\']"
d_quote = r'[\"]'
special_re = re.compile(r"[{0}\'\"\\]".format(repr(string.whitespace)))

def non_quote(quote):
    # turn the class "[q]" into its negation "[^q]"
    return r"[^" + quote[1:]

def quoted(quote, eos):
    # a quoted string, closed either by its quote or by the end of the line
    return all_of(
        quote,
        zero_or_more(one_of(escaped_char, non_quote(quote))),
        one_of(end_of_string(eos), quote)
    )

item_regex = one_of(
    one_or_more(
        one_of(escaped_char, non_space_or_quote),
    ),
    quoted(s_quote, "eoss"),
    quoted(d_quote, "eosd"),
    one_space_eoarg,
)

item_re = re.compile(item_regex)
escaped_re = re.compile(escaped_opt)

class Argv(list):
    def __init__(self, *args):
        list.__init__(self, *args)

    def _repl_func(self, mo):
        # Called for each item matched by item_re. While parsing,
        # self[-1] is the list of fragments of the current argument.
        if mo.group("eoarg") is not None:
            # unquoted whitespace: close the current argument
            s = ''.join(self[-1])
            self[-1] = s
            self.append([])
        else:
            s = mo.group(0)
            # strip the quotes; the closing one is missing if the
            # quoted string ran to the end of the line
            if s[0] == "'":
                self[-1].append(s[1: -1 if mo.group("eoss") is None else len(s)])
            elif s[0] == '"':
                self[-1].append(s[1: -1 if mo.group("eosd") is None else len(s)])
            else:
                self[-1].append(s)

    @classmethod
    def build(cls, command_line):
        """Argv.build(command_line) --> a new Argv object containing
        arguments extracted from the command line"""
        self = cls([[]])
        item_re.sub(self._repl_func, command_line)
        s = ''.join(self[-1])
        self[-1] = s
        # remove the escaping backslashes, then drop empty arguments
        items = (escaped_re.sub(lambda x: x.group(0)[1:], s) for s in self)
        self[:] = [s for s in items if s]
        return self

    def write(self, fileobj):
        """Write the argv to a file object, with one argument per line"""
        for arg in self:
            s = special_re.sub(lambda m: "\\" + m.group(0), arg)
            fileobj.write(s)
            fileobj.write("\n")

    def expand(self):
        """Expand argv: each argument starting with @ is supposed to
        be the path to a 'response' file containing other arguments.
        If possible, this file is opened and the arguments read
        in the file are inserted into self."""
        result = self.__class__()
        for arg in self:
            if arg.startswith("@"):
                try:
                    with open(arg[1:]) as f:
                        s = f.read()
                except (IOError, OSError):
                    # unreadable response file: the argument is skipped
                    continue
                result.extend(Argv.build(s).expand())
            else:
                result.append(arg)
        return result

    def free(self):
        """Empties the Argv object"""
        self[:] = []

    def dup(self):
        return self.__class__(self)

# a few wrapper functions to provide an interface similar to
# gnu libiberty's argv interface

def buildargv(line):
    return Argv.build(line)

def writeargv(argv, fileobj):
    argv.write(fileobj)

def expandargv(argv):
    return argv.expand()

def dupargv(argv):
    return argv.dup()

def freeargv(argv):
    argv.free()


# A simple test function
# we use the same test command lines as argv.c in gnu libiberty.

def tests():
    lines = [
        "a simple command line",
        "arg 'foo' is single quoted",
        "arg \"bar\" is double quoted",
        "arg \"foo bar\" has embedded whitespace",
        "arg 'Jack said \\'hi\\'' has single quotes",
        "arg 'Jack said \\\"hi\\\"' has double quotes",
        "a b c d e f g h i j k l m n o p q r s t u v w x y z 1 2 3 4 5 6 7 8 9",
        # This should be expanded into only one argument.
        "trailing-whitespace ",
        "",
    ]
    for line in lines:
        print(line)
        print(Argv.build(line))
        print()  # blank line between test cases

if __name__ == "__main__":
    tests()

""" test code output --->
a simple command line
['a', 'simple', 'command', 'line']

arg 'foo' is single quoted
['arg', 'foo', 'is', 'single', 'quoted']

arg "bar" is double quoted
['arg', 'bar', 'is', 'double', 'quoted']

arg "foo bar" has embedded whitespace
['arg', 'foo bar', 'has', 'embedded', 'whitespace']

arg 'Jack said \'hi\'' has single quotes
['arg', "Jack said 'hi'", 'has', 'single', 'quotes']

arg 'Jack said \"hi\"' has double quotes
['arg', 'Jack said "hi"', 'has', 'double', 'quotes']

a b c d e f g h i j k l m n o p q r s t u v w x y z 1 2 3 4 5 6 7 8 9
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', '1', '2', '3', '4', '5', '6', '7', '8', '9']

trailing-whitespace
['trailing-whitespace']


[]
"""
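
To come back to the original use case, here is a possible way to plug the module into a cmd interpreter and hand the resulting list to optparse. This is only a sketch: the GreetShell class, its greet command and the run_one helper are made up for the example, and it assumes the module above was saved as argv.py next to the script. If argv.py cannot be imported, the standard function shlex.split is used as a stand-in so that the example still runs; its quoting rules follow the POSIX shell and may differ in some details.

```python
import cmd
import io
import optparse
import shlex

try:
    # argv.py is the module listed above (assumed to be importable)
    from argv import Argv
    split = Argv.build
except ImportError:
    # stand-in so that the sketch runs without argv.py
    split = shlex.split


class GreetShell(cmd.Cmd):
    """Hypothetical interpreter with a single 'greet' command."""
    prompt = "> "

    def do_greet(self, line):
        """greet [-n NAME] [--shout]"""
        parser = optparse.OptionParser(prog="greet")
        parser.add_option("-n", "--name", default="world")
        parser.add_option("--shout", action="store_true", default=False)
        # cmd hands over the rest of the line as a single string,
        # so it is split into an argv list before optparse sees it
        options, args = parser.parse_args(list(split(line)))
        message = "hello, {0}".format(options.name)
        if options.shout:
            message = message.upper()
        self.stdout.write(message + "\n")

    def do_quit(self, line):
        """Leave the interpreter."""
        return True


def run_one(command):
    """Run a single command and return the text it wrote."""
    out = io.StringIO()
    GreetShell(stdout=out).onecmd(command)
    return out.getvalue()


if __name__ == "__main__":
    print(run_one("greet -n 'Jack Smith' --shout"))
```

For an interactive session, GreetShell().cmdloop() would be called instead of run_one. Note that optparse exits the program when it meets an unknown option, so a real interpreter would probably want to catch SystemExit in do_greet.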

Tags
argv, cmd, command, gnu, line
