Hi!

I am new to DaniWeb. I have searched everywhere for an answer but couldn't find one. This is for my final project, and I need it to get my bachelor's degree.

I have a text file with lines like these:

2001.7.1.407 изутрината во тетовски
2003.5.3.20083 кзк ја штити таканаречен
2001.8.7.1830 винарските визби во македонија и оваа година

I need to extract the number after the second dot (e.g. 1, 3, 7) and write the text that follows the last number to a new file chosen by that number. The numbers are categories from 1 to 7, so each text has to go into its own text file, named after its category.

I hope you understand what I mean... I can't explain it better, sorry :(

Here is some of the code I wrote (it may be completely wrong):

DB = open('DB.txt', 'r')
kat1 = open('kat1.txt', 'a')

poz = 0
while True:
    start = DB.find ('2', poz)
    if start == -1: break
    najdi = DB.find ('.', start)
    end = DB.find ('.', najdi)

    br = DB[start:najdi:end]
    poz = br [br.find('.') + 1]


All 8 Replies

Heh heh, this is essentially what I do every day at work, and I feel like you're making it much more complicated than it really is.

Here's what I would suggest: split your text file into a list of lines (maybe using file.readlines()).

Then write a function that takes a line and outputs what you want, and run this function for every line in your list. So:

do('2001.7.1.407 изутрината во тетовски') might return a tuple (numberspart, textpart):

('2001.7.1.407', 'изутрината во тетовски')

The category is the third number (the one after the second dot), so it would be numberspart.split('.')[2]

Open a file using that variable as the filename and write the textpart!
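The steps above can be sketched roughly as follows. This is only a minimal illustration in Python 3, not the poster's actual code: the function names `split_record` and `write_by_category`, the UTF-8 encoding, and the `kat<N>.txt` naming scheme (borrowed from the `kat1.txt` in the question) are assumptions. It takes the category to be the third dot-separated number, as in the question's examples.

```python
import os

def split_record(line):
    """Split one record into (numberspart, textpart) at the first space."""
    numberspart, _, textpart = line.strip().partition(" ")
    return numberspart, textpart

def write_by_category(lines, out_dir="."):
    """Append each record's text to kat<category>.txt inside out_dir."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        numberspart, textpart = split_record(line)
        # Layout is year.month.category.id, so the category is field index 2.
        category = numberspart.split(".")[2]
        path = os.path.join(out_dir, "kat%s.txt" % category)
        # encoding="utf-8" is an assumption about the data file.
        with open(path, "a", encoding="utf-8") as out:
            out.write(textpart + "\n")
```

With a file such as `DB.txt`, this might be called as `write_by_category(open('DB.txt', encoding='utf-8').readlines())`. A line with fewer than three dot-separated numbers would raise an `IndexError` here, so real data may need an extra length check.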

I knew it was simpler than I thought... but it seemed so difficult to me. Anyway,

thank you

For future reference, you do not want to assume that the characters you search for are found. This is better IMHO, although I too would prefer to use .split(".")

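# DB must be a string here (e.g. one line of the file), not the file object
if start > -1:
   najdi = DB.find ('.', start)
   if najdi > -1:
      # search from najdi + 1, otherwise the same dot is found again
      end = DB.find ('.', najdi + 1)
      if end > -1:
         br = DB[najdi + 1:end]   # text between the two dots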
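To illustrate the point about checking every `find` result, here is a hedged sketch (Python 3) that pulls the category out of a single line with `str.find` and returns `None` as soon as a dot is missing. The helper name `category_by_find` is made up for this example, and it assumes the `year.month.category.id text` layout from the question.

```python
def category_by_find(line):
    """Return the text between the 2nd and 3rd dot, or None if a dot is missing."""
    first = line.find(".")
    if first == -1:
        return None
    # Start each search one past the previous dot so it is not found again.
    second = line.find(".", first + 1)
    if second == -1:
        return None
    third = line.find(".", second + 1)
    if third == -1:
        return None
    return line[second + 1:third]
```

Because the function walks from dot to dot rather than using fixed positions, it does not depend on how many digits each number has.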

There is a problem. You see, the lines look like the one I showed you, '2001.7.1.407 and text' (2001 is the year, 7 is the month, 1 is the category and 407 is the id of the news item), but the month can have either 1 or 2 digits, so I probably have to do it by counting the dots, or I don't know what!
Please help me!

zachabesh meant something like this, using 2 splits per record

test_data = [
"2001.7.1.407 изутрината во тетовски\n",
"2003.5.3.20083 кзк ја штити таканаречен\n",
"2001.6.1.407 test rec #1\n",
"2003.10.3.20083 test rec #2\n",
"2001.11.7.1830 винарските визби во македонија и оваа година\n"]

for rec in test_data:
   dots_split = rec.split(".")
   print("\ndots_split = %s" % dots_split)
   if len(dots_split) > 3:
      print("year=%s,  month=%s,  category=%s" %
            (dots_split[0], dots_split[1], dots_split[2]))
      # the 4th piece holds "id text", so split it on whitespace
      space_split = dots_split[3].split()
      if len(space_split) > 1:
         print("     id=%s,  name=%s" % (space_split[0], " ".join(space_split[1:])))
      else:
         print("space split error %s" % dots_split[3])
   else:
      print("data error %s" % rec)
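One caveat with splitting on every dot: if the text itself contains a dot, `dots_split[3]` is cut short. A possible variation (a sketch in Python 3, with the hypothetical helper name `parse_record`) separates the numbers from the text first and only then splits the numbers on dots. The month can then have one or two digits, and the text is left untouched.

```python
def parse_record(rec):
    """Return (year, month, category, id, text), or None for a malformed record."""
    # Everything before the first space is the dotted number group.
    numbers, _, text = rec.strip().partition(" ")
    fields = numbers.split(".")
    if len(fields) != 4 or not text:
        return None
    year, month, category, rec_id = fields
    return year, month, category, rec_id, text
```

For example, `parse_record("2003.10.3.20083 test rec #2\n")` gives the category as the third element of the returned tuple, regardless of the month's length.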

Thank you very much...this helped a lot!!! huh

One more question: Is it possible to separate text from numbers???

thanks again

You can use string_var.isdigit() and string_var.isalpha(), or use a try/except.

try:
   print("%s is a number" % float(string_var))
except ValueError:
   print("%s is a string" % string_var)
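As a rough illustration of both suggestions (Python 3; the function name `separate` is hypothetical), the sketch below splits a line on whitespace and sorts the pieces into numbers and words. `isdigit()` only accepts unsigned whole numbers, so `float()` inside a try/except serves as a fallback for values such as `-3.5`.

```python
def separate(line):
    """Split a line on whitespace and return (numbers, words)."""
    numbers, words = [], []
    for token in line.split():
        try:
            # isdigit() is True only for unsigned whole numbers;
            # everything else gets a chance to parse as a float.
            value = int(token) if token.isdigit() else float(token)
        except ValueError:
            words.append(token)
        else:
            numbers.append(value)
    return numbers, words
```

Note that `float()` also accepts strings such as `inf` and `nan`, so those would land in the numbers list, while a dotted group like `2001.7.1.407` is not a valid float and is treated as a word.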