Find() Question

zublacko 0 Newbie Poster

16 Years Ago

Is there a way to find something in HTML code like <img and then find the src and alt of an image in the <img line? I can get the <img line out, but I don't understand how to use find to get the start index and end index. For example, I use blah.find('src="') to get the start index, but how do I get the end index when it could be different on every website?

python


All 3 Replies

jlm699 320 Veteran Poster

16 Years Ago

After you perform x = s.find("<img") you will receive an index. Feed this index into the find function as the start position when looking for the quotation marks around the URL, and it will give you the starting and ending indexes. You can then use the two indexes to slice out the image source URL. Here's an example (off the cuff):

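img_tag_idx = s.find("<img")              # start of the first <img tag
src_idx = s.find('src="', img_tag_idx)    # src attribute at or after that tag
start_idx = src_idx + len('src="')        # first character of the URL
end_idx = s.find('"', start_idx)          # closing quote of the src value
url = s[start_idx:end_idx]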

But really this is not a very good method. You should use a regex or an HTML parsing module, as the many variations in how people write HTML could easily throw this off.
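
One way to follow the HTML-parser suggestion is the html.parser module in Python 3's standard library. The sketch below is only an illustration, not code from this thread: the class name ImgCollector, the function collect_images, and any markup you feed it are made up for the example.

```python
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Hypothetical helper: collects (src, alt) pairs from every <img> tag."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag and attribute names
        # arrive lowercased, so <IMG SRC=...> is handled as well.
        if tag == "img":
            attr_map = dict(attrs)
            self.images.append((attr_map.get("src"), attr_map.get("alt")))


def collect_images(html):
    """Return a list of (src, alt) tuples; a missing attribute becomes None."""
    parser = ImgCollector()
    parser.feed(html)
    parser.close()
    return parser.images
```

Because the parser deals with quoting, attribute order, and letter case itself, there are no start or end indexes to work out by hand.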

adam1122 7 Junior Poster · Answer 1 · 2009-04-13T00:24:58+00:00

Search for the next " after the one in src=".

As a side note, you would be better off using regex or an HTML parser.
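
A rough sketch of what the regex route could look like, under some loud assumptions: attribute values are always quoted, a value never contains a > character, and only src and alt are of interest. The names IMG_TAG_RE, ATTR_RE, and img_attrs are invented for this example.

```python
import re

# Assumption: an <img ...> tag contains no ">" before its own closing bracket.
IMG_TAG_RE = re.compile(r"<img\b[^>]*>", re.IGNORECASE)

# Assumption: values are wrapped in matching " or ' quotes.
# The lookbehind keeps attributes such as data-src from being read as src.
ATTR_RE = re.compile(
    r"""(?<![\w-])(src|alt)\s*=\s*(["'])(.*?)\2""",
    re.IGNORECASE | re.DOTALL,
)


def img_attrs(html):
    """Return a list of (src, alt) tuples, one per <img> tag found."""
    results = []
    for tag in IMG_TAG_RE.findall(html):
        # Each match is (attribute name, quote character, value).
        attrs = {name.lower(): value for name, _quote, value in ATTR_RE.findall(tag)}
        results.append((attrs.get("src"), attrs.get("alt")))
    return results
```

This copes with attributes in any order, but HTML that breaks the assumptions above will still slip through, which is the usual argument for a real parser.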

zublacko 0 Newbie Poster · Answer 2 · 2009-04-13T01:42:51+00:00

I was reading through the find documentation, but I didn't understand how you can find the start and end index.
I wrote x = s.find("<img","/>") and it gave me an error saying TypeError: slice indices must be integers or None or have an __index__ method. How do I get the index of the end? I can get the start, just not the end.
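
For context on that error: str.find takes the substring first, followed by optional integer start and end positions, so the string "/>" in the second slot is what raises the TypeError. Below is a small sketch of the two-step search in Python 3. The helper name attr_value and the sample string are made up for illustration, and it assumes double-quoted attribute values.

```python
def attr_value(s, name, start=0):
    """Return the value of name="..." found at or after position start, or None."""
    marker = name + '="'
    marker_idx = s.find(marker, start)     # second argument is an integer start position
    if marker_idx == -1:
        return None                        # find returns -1 when nothing matches
    value_start = marker_idx + len(marker)
    value_end = s.find('"', value_start)   # next quote after the opening one
    if value_end == -1:
        return None
    return s[value_start:value_end]


# Made-up sample input
s = '<p>logo: <img src="http://example.com/logo.png" alt="Site logo" /></p>'

img_idx = s.find("<img")          # start index of the tag
tag_end = s.find(">", img_idx)    # end index: search begins at img_idx, not at 0
src = attr_value(s, "src", img_idx)
alt = attr_value(s, "alt", img_idx)
```

The end index never has to be known in advance: each find call simply starts from the position the previous call returned.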
