Hi, I'm new to Python and to this forum. I'm trying to write a program that splits an HTML text file into its components.

The HTML file looks something like this:

"
Hello World #title

Today is a Friday.
The weekend is coming.
Lets have fun. #summary

1923 #date

John Doe # Name
"

I'd like the output to look like this:

data = {'title' : 'hello world', 'summary': ['Today is a Friday.','The weekend is coming.','Lets have fun.'], 'date': 1923 , 'Name': 'John Doe'}

My current code is:

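from collections import defaultdict

def parse(filename):
    data = defaultdict(list)
    with open(filename, 'r') as f:
        for line in f:
            line = line.strip()  # lines read from a file keep their trailing '\n'
            if line != '':
                data['title'].append(line)
            else:
                pass  # empty line: this is where I'm stuck
    return data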

I'm having trouble with the empty lines. When the function meets the first empty line, I want it to stop appending to data['title'] and start appending to data['summary']. At the next empty line it should switch to data['date'], and at the one after that to data['name']. Is there any way, once an empty line has been read, to tell the function to move on to the next line and the next key?

One more thing: for the summary, is it possible to join all the lines together so that it looks like
'summary': 'Today is a Friday. The weekend is coming. Lets have fun.'
instead of
'summary': ['Today is a Friday.','The weekend is coming.','Lets have fun.'] ?

Any help will be greatly appreciated!

Edit: How do I show the indentation of my code? It doesn't show up in the post.

When you post, highlight your code and click on the code tab above, something like this:

def add(x, y):
    return x+y

print(add(2, 3))
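
As for the parsing question itself, here is a minimal sketch of one way to do it, not the only one. It assumes the blocks always come in the order title, summary, date, name, that they are separated by empty lines, and that anything after a '#' on a line is just a label rather than data. The names parse_text and keys are made up for this example. Instead of switching keys in the middle of the loop, it first collects the lines into blocks, starting a new block at each empty line, and then pairs the blocks with the keys. Joining a block with ' '.join() gives the single summary string you asked about.

```python
def parse_text(text, keys=('title', 'summary', 'date', 'name')):
    """Split text into blank-line-separated blocks and pair them with keys in order."""
    blocks = []    # finished blocks, each a list of stripped lines
    current = []   # lines of the block currently being read
    for raw in text.splitlines():
        # Assumption: everything after '#' is a label such as "#title", not data.
        line = raw.split('#', 1)[0].strip()
        if line:
            current.append(line)
        elif current:
            # An empty line closes the current block; the next line starts a new one.
            blocks.append(current)
            current = []
    if current:
        # The last block is not followed by an empty line, so close it here.
        blocks.append(current)
    # Join the lines of each block into one string and pair it with its key.
    return {key: ' '.join(block) for key, block in zip(keys, blocks)}


sample = """Hello World #title

Today is a Friday.
The weekend is coming.
Lets have fun. #summary

1923 #date

John Doe # Name
"""

data = parse_text(sample)
print(data)
```

To use it on a file, read the whole file into a string first (for example with open(filename) as f: text = f.read()) and pass that to parse_text. Every value comes back as a string, so convert the date yourself with int(data['date']) if you need a number. If you would rather keep the summary as a list, return the block itself instead of joining it.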