I have some cnf files, which are txt files that are in a specific format for conjunctive normal form formulas. I want to take this file, which looks something like this:

c The random seed used to shuffle this instance was seed=1755086696
p cnf 1200 4919
-35 491 -1180 0
479 1074 -949 0
609 1177 -67 0... (etc.)

and turn it into this:

formula=[[(35,0), (491,1),(1180,0)],[(479,1),(1074,1),(949,0)],[(609,1),(1177,1),(67,0)]... ]

Notice that the absolute value of the integer is used, but if it was negative, there is a 0 in the tuple, if it was positive, there is a 1 in the tuple.

Also notice that the 0 at the end of each line doesn't make a difference.

I need to do this as fast as possible because the formulas can get large. Linear time would be great.

Anyone know a good solution?

Edited 6 Years Ago by jmark13: n/a

I have some cnf files, which are txt files that are in a specific format for conjunctive normal form formulas. I want to take this file, which looks something like this:

c The random seed used to shuffle this instance was seed=1755086696
p cnf 1200 4919
-35 491 -1180 0
479 1074 -949 0
609 1177 -67 0... (etc.)

and turn it into this:

formula=[[(35,0), (491,1),(1180,0)],[(479,1),(1074,1),(949,0)],[(609,1),(1177,1),(67,0)]... ]

Notice that the absolute value of the integer is used, but if it was negative, there is a 0 in the tuple, if it was positive, there is a 1 in the tuple.

Also notice that the 0 at the end of each line doesn't make a difference.

I need to do this as fast as possible because the formulas can get large. Linear time would be great.

Anyone know a good solution?

I found out more info, the file is known as a DIMACS CNF file...

It's pretty straight forward, and since I'm sick and not doing much today...

test_data = [ "c The random seed used to shuffle this instance was seed=1755086696",
              "p cnf 1200 4919",
              "-35 491 -1180 0",
              "479 1074 -949 0",
              "609 1177 -67 0" ]

formula = []
for rec in test_data:
   rec=rec.strip()
   junk_list = []

   ## record has to start with a number or a minus sign
   if (rec[0].isdigit()) or (rec[0].startswith("-")):
      substrs = rec.split()
      for num in substrs:
         try:
            num_int = int(num)
            if num_int < 0:
               junk_list.append( (abs(num_int), 0) )
            elif num_int > 0:         ## eliminate zero
               junk_list.append( (num_int, 1) )
         except:
            print "error converting", rec

      formula.append(junk_list)
print formula
Comments
Fast and accurate code. Thanks!

This would be great, except it requires that I put a quote around every line, and the ones I am working with are thousands of lines. It would take me months of puting quotes around lines to do everything I need to do. Does anyone know how to work with the Dimacs CNF format as an input in python? It's a standard format, so I'm hoping someone out there can help. I mean, is it usually read in C++ ? maybe there is a C++ program that can convert it into a usable list for python, but I don't know anything about C++.

Great. The combination of the two suggestions should work. I will see if I can figure everything out and close this thread when I'm done or come back with more questions.

Thanks

Just read it, line by line, as a normal file, unless it is not a text file, in which case you would have to read and convert to text. See 2.4.5 here http://www.rexx.com/~dkuhlman/python_101/python_101.html#SECTION004450000000000000000

You, are an amazing person for taking time to do this. Thank you. Problem solved! I will be publishing a paper on the work that requires this input, if you would like to be mentioned with thanks for helping me with input formatting, I would be more than happy to. If not, Thanks again!

feel free to contact me through twitter and mention this thread. Twitter name: jmarkinman

Best to you!

-Mark

This question has already been answered. Start a new discussion instead.