Hi!

I am new to DaniWeb. I searched everywhere for an answer but couldn't find one. This is for my final project, and I need it to get my bachelor's degree.

I have a text file that contains lines like these:

2001.7.1.407 изутрината во тетовски
2003.5.3.20083 кзк ја штити таканаречен
2001.8.7.1830 винарските визби во македонија и оваа година

I need to extract the number after the second dot (e.g. 1, 3, 7) and write the text that follows the last number to a new file chosen by that number. The numbers are categories from 1 to 7, so each text has to go into its own text file, named after its category.

I hope you understand what I mean... I can't explain it any better, sorry :(

Here is some of the code I wrote (it may be completely wrong):

DB = open('DB.txt', 'r')
kat1 = open('kat1.txt', 'a')

poz = 0
while True:
    start = DB.find ('2', poz)
    if start == -1: break
    najdi = DB.find ('.', start)
    end = DB.find ('.', najdi)

    br = DB[start:najdi:end]
    poz = br [br.find('.') + 1]

THANK YOU IN ADVANCE!!!

All 8 Replies

Heh heh, this is essentially what I do every day at work, and I feel like you're making it much more complicated than it really is.

Here's what I would suggest: split your text file into a list of lines (maybe using file.readlines()).

Then write a function that takes a line and outputs what you want, and run this function for every line in your line list. So:

do('2001.7.1.407 изутрината во тетовски') might return a tuple:

(numberspart, letterspart)

The number after the second dot would then be numberspart.split('.')[2].

Open a file using that variable as the filename and write letterspart to it!
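
The suggestion above could be sketched roughly like this in Python 3. The function names parse_line and category_of are made up for illustration, and the sketch assumes every line is well formed (a dotted number group, a space, then the text). Splitting on dots gives year, month, category and id in that order, so the field after the second dot sits at index 2.

```python
def parse_line(line):
    """Split a record into (numberspart, letterspart) at the first space."""
    numberspart, _, letterspart = line.strip().partition(" ")
    return numberspart, letterspart


def category_of(numberspart):
    """Return the field after the second dot (assumed layout: year.month.category.id)."""
    return numberspart.split(".")[2]


numberspart, letterspart = parse_line("2001.7.1.407 изутрината во тетовски\n")
print(category_of(numberspart), letterspart)
```

Because the split is done on the dots rather than on fixed positions, a one-digit or two-digit second field makes no difference.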

I knew that it was simpler than I thought... but it seemed so difficult to me. Anyway,

thank you
J

For future reference, you do not want to assume that the characters you search for are found. This is better IMHO, although I too would prefer to use .split(".")

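if start == -1:
    break
najdi = DB.find('.', start)
if najdi > -1:
    # search past the first dot, otherwise find() returns najdi again
    end = DB.find('.', najdi + 1)
    if end > -1:
        # text from start up to (not including) the second dot
        br = DB[start:end]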
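
As a self-contained illustration of checking every find() result before using it, here is a small Python 3 sketch. It assumes the record is an ordinary string; the function name field_after_second_dot is hypothetical and not part of the code above.

```python
def field_after_second_dot(record):
    """Return the text between the 2nd and 3rd dots, or None if a dot is missing."""
    first = record.find(".")
    if first == -1:
        return None
    second = record.find(".", first + 1)  # start searching past the first dot
    if second == -1:
        return None
    third = record.find(".", second + 1)
    if third == -1:
        return None
    return record[second + 1:third]
```

Each search starts one character past the previous hit, so the same dot is never found twice.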

There is a problem. The lines look like the one I showed you, '2001.7.1.407 and text' (2001 is the year, 7 is the month, 1 is the category and 407 is the ID of the news item), but months can have either one or two digits, so I probably have to do it by counting the dots, or I don't know what!
Please help me!
J

zachabesh meant something like this, using 2 splits per record

test_data = [
"2001.7.1.407 изутрината во тетовски\n",
"2003.5.3.20083 кзк ја штити таканаречен\n",
"2001.6.1.407 test rec #1\n",
"2003.10.3.20083 test rec #2\n",
"2001.11.7.1830 винарските визби во македонија и оваа година\n"]

for rec in test_data:
    rec = rec.strip()
    # first split: year, month, category, and "id text"
    dots_split = rec.split(".")
    print("\ndots_split = %s" % dots_split)
    if len(dots_split) > 3:
        print("year=%s,  month=%s,  category=%s" %
              (dots_split[0], dots_split[1], dots_split[2]))
        # second split: separate the id from the text that follows it
        space_split = dots_split[3].split()
        if len(space_split) > 1:
            print("     id=%s,  name=%s" % (space_split[0], " ".join(space_split[1:])))
        else:
            print("space split error %s" % dots_split[3])
    else:
        print("data error %s" % rec)
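
To finish the original task, the two splits can be combined with one output file per category. This is only a minimal Python 3 sketch: the kat<N>.txt naming is borrowed from the kat1.txt in the question, UTF-8 encoding is an assumption, and the function name split_by_category is hypothetical.

```python
import os


def split_by_category(lines, out_dir="."):
    """Append each record's text to kat<category>.txt in out_dir; return skipped lines."""
    skipped = []
    for line in lines:
        # at most 3 splits, so dots inside the text itself are left alone
        fields = line.strip().split(".", 3)  # year, month, category, "id text"
        if len(fields) < 4:
            skipped.append(line)
            continue
        id_and_text = fields[3].split(None, 1)  # id, then the rest of the line
        if len(id_and_text) < 2:
            skipped.append(line)
            continue
        path = os.path.join(out_dir, "kat%s.txt" % fields[2])
        with open(path, "a", encoding="utf-8") as out:
            out.write(id_and_text[1] + "\n")
    return skipped
```

It could be called with an open file, for example `with open('DB.txt', encoding='utf-8') as db: split_by_category(db)`. Reopening the output file for every line is slow for large inputs but keeps the sketch short.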

Thank you very much...this helped a lot!!! huh

One more question: Is it possible to separate text from numbers???

thanks again

You can use string_var.isdigit() and string_var.isalpha(), or use a try/except.

try:
    float_var = float(string_var)
    print("%s is a number" % float_var)
except ValueError:
    # float() raises ValueError when the text is not numeric
    print("%s is a string" % string_var)
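
For the isdigit() and isalpha() route, here is a small Python 3 sketch that sorts the whitespace-separated tokens of a line into numeric and alphabetic ones. The function name split_tokens is hypothetical. In Python 3, str.isalpha() is also true for Cyrillic letters, and mixed tokens such as '2001.7.1.407' match neither test, so this sketch simply drops them.

```python
def split_tokens(text):
    """Return (numbers, words): all-digit tokens and all-letter tokens, in order."""
    numbers, words = [], []
    for token in text.split():
        if token.isdigit():
            numbers.append(token)
        elif token.isalpha():
            words.append(token)
    return numbers, words


print(split_tokens("407 изутрината во тетовски 2001"))
```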