I'm trying to design a lambda function that returns any line in a file that begins with any digit [0-9]. The point would be to return the whole line if it does, and skip any line if it doesn't. This is for log analysis where I want all the lines that begin with an IP address.

I'm beginning to think this is too complex for a lambda expression. Any suggestions?

Edited 2 Years Ago by chophouse

It shouldn't be too much for a lambda expression, so long as you recall a) that you can use a boolean expression, such as one built on the in operator, just as you would in a regular statement, and b) the string module includes a handy set of constant strings with the sets of letters, digits, and whitespace. All you should need to do is import string and apply the in operator to your string's first character and string.digits, and you should be able to use that as the lambda expression.

(OK, I'm sandbagging on this, as I've tested it and it works fine in Python 3.4, but I feel it is better to lead you to the code than simply give it to you. In this case, the explanation should be more than adequate for you to get the appropriate code worked out.)

Lambda expressions are not used as extensively in Python, and they are rather limited there compared to functional languages like Lisp/Scheme. I would use a simple, basic expression, maybe in list comprehension form:

[line for line in myfile if line[0].isdigit()]

Why lambda expression?
It can make code less readable.

All you should need to do is import string and apply in to your string's first character, and string.digits, and you should be able to use that as the lambda expression.

I agree, but there is no need to import string.
String methods are always available.
import string is only needed for constants such as the alphabet (string.ascii_letters).

Just a test, since the topic is lambda expressions.

>>> print((lambda x: x[0].isdigit())('1hello'))
True
>>> print((lambda x: x[0].isdigit())('hello'))
False

>>> print((lambda x: x.startswith(('1','2','3')))('2hello'))
True
>>> print((lambda x: x.startswith(('1','2','3')))('3hello'))
True
>>> print((lambda x: x.startswith(('1','2','3')))('4hello'))
False

>>> import re
>>> print((lambda x: re.match(r'\d+', x))('987hello'))
<_sre.SRE_Match object at 0x02B01D40>
>>> print((lambda x: re.match(r'\d+', x))('hello'))
None

chophouse, first write an ordinary function, because that is easier.
Then, if that does not do it for you, look into a lambda expression or functools.partial.
Or use the more Pythonic way, as PyTony suggests.

Just one more, here with a generator (yield).
This makes the code memory efficient (only one line is in memory at a time).

def foo():
    with open('start_digit.txt') as f:
        for line in f:
            if line[0].isdigit():
                yield line.strip()

for line in foo():
    print(line)

Edited 2 Years Ago by snippsat

Comments
+1 for function foo()

It is a good idea to strip() a record first, unless you are really really sure of the format.
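To make the strip() advice concrete, here is a minimal sketch of a predicate that strips the record before looking at its first character. The function name and the sample log lines are made up for illustration; they are not from the original poster's data.

```python
def starts_with_digit(line):
    """Return True if the first non-whitespace character is a digit."""
    stripped = line.strip()
    # An empty or whitespace-only record has no first character to test.
    return bool(stripped) and stripped[0].isdigit()


# Hypothetical sample records, including blank and indented ones.
sample = [
    "192.168.0.1 - GET /index.html",
    "  10.0.0.7 - POST /login",
    "",
    "\n",
    "ERROR no address",
]

kept = [line.strip() for line in sample if starts_with_digit(line)]
print(kept)
```

Without the strip(), the indented second record would be skipped, and the empty record would raise an IndexError on line[0].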

Lambda is a bad habit IMHO. Guido wanted to omit lambda (and map, reduce, & filter) from Python3 because list comprehension is faster and the code is easier to understand. So form the habit of using list comprehension, or partial when you want to pass arguments to a function.
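As a rough illustration of the partial and list comprehension route, the sketch below builds the predicate with functools.partial and then uses it both ways. The helper starts_with_any, the DIGITS tuple, and the sample lines are hypothetical names chosen for this example.

```python
from functools import partial


def starts_with_any(prefixes, line):
    """Return True if line starts with any of the given prefixes."""
    # str.startswith accepts a tuple of candidate prefixes.
    return line.startswith(prefixes)


DIGITS = tuple("0123456789")

# partial fixes the first argument, leaving a one-argument predicate.
starts_with_digit = partial(starts_with_any, DIGITS)

lines = ["10.1.1.1 ok", "warning: disk", "7 retries"]

by_comprehension = [line for line in lines if starts_with_digit(line)]
by_filter = list(filter(starts_with_digit, lines))
print(by_comprehension)
```

Both forms select the same lines; the partial object can be handed to anything that expects a one-argument function, with no lambda involved.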

Edited 2 Years Ago by woooee

Still working this thru. The reason for lambda is that it will be part of a pyspark filter statement, i.e.

RDD2 = RDD1.filter(lambda line: "take line only if it starts with digit")

where "take line only if it starts with digit" is the part I'm trying to figure out

You can use lambda line: bool(re.match(r"^\d", line)), but following woooee's idea of not using lambda, you can also write

import re

def starts_with_digit(line):
    # re.match already anchors at the start of the string, so "^" is optional
    return bool(re.match(r"^\d", line))

RDD2 = RDD1.filter(starts_with_digit)

Got this to work:

import string

lambda line: line if line[0] in string.digits else None

Don't forget the case where there is no line[0]:

lambda line: bool(line and (line[0] in string.digits))
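Here is a small sketch of why the guard matters. Python's built-in filter stands in for the pyspark call, on the assumption that RDD.filter keeps the records for which the function returns a truthy value; the names unsafe, safe, and records are only for this example.

```python
import string

# Without the guard, an empty record raises IndexError on line[0].
unsafe = lambda line: line[0] in string.digits

# With the guard, an empty record short-circuits to False.
safe = lambda line: bool(line and (line[0] in string.digits))

# Hypothetical records, one of them empty.
records = ["172.16.0.5 GET /", "", "INFO started"]

kept = list(filter(safe, records))
print(kept)

try:
    list(filter(unsafe, records))
    unsafe_failed = False
except IndexError:
    unsafe_failed = True
print(unsafe_failed)
```

Note that replacing line[0] with line[:1] would not be a safe shortcut, because the empty string counts as being "in" any string, so an empty record would pass the test.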

Edited 2 Years Ago by Gribouillis
