Niloofar24 15 Posting Whiz

Well, I still have a problem with this program. What should I type on line 11, for the update function?!

All the code that should be updated every second is in the build function of the CountdownApp class.

Niloofar24 15 Posting Whiz

That's ok @AleMonteiro.
I didn't find the answer in those 3 links, unfortunately. So I have to use a host instead of my localhost. Thank you anyway.

Niloofar24 15 Posting Whiz

@iJunkie22, can you explain your last post please? About str.istitle()? It would be good if you could give me a little example.
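For readers following along, a minimal sketch of what str.istitle() reports (standard Python behaviour, not anything specific to this thread):

```python
# str.istitle() is True when every word starts with one uppercase
# letter followed only by lowercase letters.
print('Amy Winehouse'.istitle())  # True
print('amy winehouse'.istitle())  # False: words start lowercase
print('AMY'.istitle())            # False: all-caps words do not count
```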

Niloofar24 15 Posting Whiz

Well, it seems my question was basically wrong.
Forget that question, sorry!

Niloofar24 15 Posting Whiz

I wanted to post this question in a new discussion, but as it is related to this one, I will ask it here. My code:

from bs4 import BeautifulSoup
import urllib2

mylist = []

url = 'http://www.niloofar3d.ir/try.html'
html = urllib2.urlopen(url).read()
soup = BeautifulSoup(html)
tag_li = soup.find_all('li')
for tag in tag_li:
    if tag.text.startswith('A'):
        mylist.append(tag.text)
if 'A' in mylist[0]:
    if 'A' in mylist[1]:
        if 'A' in mylist[2]:
            print mylist
else:
    'sorry!' 

The output should be the else message, but it prints this output:

[u'Apple', u'Age', u'Am']

What is the problem? I want the script to check whether the first 3 items (indexes) of mylist start with the letter 'A' and, if so, print the list; if not, print 'sorry!'. But as you can see here, it printed the whole list anyway!

And one more question: how can I remove those u letters that were printed in the output?
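For reference, the nested ifs above pair the else with the outermost if only, which is why the list still printed. A sketch of a flatter version using all() (the sample list below stands in for the scraped data); printing the items one by one also hides the u prefix, which in Python 2 merely marks unicode strings:

```python
mylist = [u'Apple', u'Age', u'Am']  # stand-in for the scraped <li> texts

# all() is True only if every one of the first three items passes the check
if all(item.startswith('A') for item in mylist[:3]):
    for item in mylist:
        print(item)  # printing items individually shows no u prefix
else:
    print('sorry!')
```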

Niloofar24 15 Posting Whiz

Thank you @snippsat. Your example was exactly what I was looking for.

And thank you @Slavi for your answer and explanation.

Niloofar24 15 Posting Whiz

I tried this for testing:

>>> import urllib2
>>> import re
>>> html = 'https://www.daniweb.com/software-development/python/threads/492669/how-to-print-only-the-content-of-all-tags-from-a-url-page'
>>> re.findall(r'<p>(.+),/p>', html)

But the output was:

[]

I tried other tags too, but every output was []. What's the problem?
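For what it's worth, the snippet above runs the regex over the URL string itself rather than the downloaded page, and the pattern has a comma where the < of the closing tag should be. A sketch with those two points changed, using an inline string as a stand-in for html = urllib2.urlopen(url).read():

```python
import re

# stand-in for the downloaded page source
html = '<html><body><p>first paragraph</p><p>second paragraph</p></body></html>'

# the closing tag is </p>, not ,/p>; .+? keeps each match non-greedy
paragraphs = re.findall(r'<p>(.+?)</p>', html)
print(paragraphs)  # ['first paragraph', 'second paragraph']
```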

Niloofar24 15 Posting Whiz

Hi.
How can I ask my crawler to print only the text of all <li></li> tags on a URL page?
I want to save the text of all <li></li> tags in a text file (without the <li></li> markup).
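One possible sketch with BeautifulSoup: tag.get_text() returns only the text inside a tag. The inline HTML string stands in for a downloaded page, and li-texts.txt is just an example filename:

```python
from bs4 import BeautifulSoup

# stand-in for html = urllib2.urlopen(url).read()
html = '<ul><li>Apple</li><li>Age</li></ul>'
soup = BeautifulSoup(html, 'html.parser')

# write each <li> text, without the surrounding markup, on its own line
with open('li-texts.txt', 'w') as f:
    for tag in soup.find_all('li'):
        f.write(tag.get_text() + '\n')
```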

Niloofar24 15 Posting Whiz

Thank you @DaniUserJS for introducing those websites. Of course, I can't use paid sites because of payment system problems, so I have to look for a free tutorial. I'm familiar with WordPress, as I'm using it for my personal 3D website. But I want to learn how to create my own CMS. I know HTML, CSS and JavaScript too.

Niloofar24 15 Posting Whiz

Hello.
I'm trying to create a simple CMS and I'm looking for a good tutorial. Any idea?!

Niloofar24 15 Posting Whiz

Well, ok, I will try this way, but I think there should be another, simpler way that we haven't found yet.

Niloofar24 15 Posting Whiz

Well, @David W, how can I check every word on that page to see whether the words that start with the letter "A" are the names of singers or just any other words that start with "A"??!!
How should my crawler recognize human names among all the other words starting with the letter "A"?!

What do you mean by looking it up in a dictionary of just names? Which dictionary do you mean?

Niloofar24 15 Posting Whiz

Hello.
I have homework. I have been asked to create a web crawler that is able to enter a music website and then, as a first step, collect the names of the singers whose names start with the letter "A".

Now I need a little help with this step. How should my crawler understand which words on that page are the singers' names?! The crawler should find their names in a special tag, correct?! But what kind of tag?! Their names could be in any tag, like <h4></h4> for example, or in a single <p></p> tag, or in a <b></b> or <ul></ul> or any other tag!
So I just need a hint to find the way. Any idea?!

Niloofar24 15 Posting Whiz

So why does it work with 'http://www.python.org' as the main URL I give to the program, but when I tried it with another URL, the result was what I posted in my previous post?!

Niloofar24 15 Posting Whiz

Hello my friends.
Look at this please:

>>> from bs4 import BeautifulSoup
>>> import urllib2
>>> url = urllib2.urlopen('https://duckduckgo.com/?q=3D&t=canonical&ia=meanings')
>>> soup = BeautifulSoup(url)
>>> links = soup('a')
>>> print links
[<a class="header__logo-wrap" href="/?t=canonical" tabindex="-1"><span class="header__logo">DuckDuckGo</span></a>, <a class="search__dropdown" href="javascript:;" id="search_dropdown" tabindex="4"></a>, <a href="https://duckduckgo.com/html/?q=3D">here</a>]
>>> 

I used https://duckduckgo.com/?q=3D&t=canonical&ia=meanings as the URL. I thought the code above would do this:

Find all the links on that page of the internet. But you can see the result! As there are many links to different websites on that page, why didn't it print the URL of each website in the output?!

Niloofar24 15 Posting Whiz

Thank you @Anders 2.

Thank you @Vegaseat.

Niloofar24 15 Posting Whiz

It was completely clear, thank you @Slyte!

Niloofar24 15 Posting Whiz

Hello @Slyte, thank you for your explanation and your other ideas!

Well, let me ask you some more questions.
And let me also ask you to make some parts clearer for me, because my English is not very good, so sometimes I need a clearer explanation; I will be happy if you help me understand the parts I didn't get well! Thank you in advance :)

The second paragraph (record text found....); can you explain it more please? What kind of words should I record when I visit a webpage, for example? What do you mean by a dictionary with individual words as keys and values? And what is its usage?

And about your other ideas:
Can you explain the first idea more clearly please? I didn't understand your purpose exactly, but it seems like an interesting idea to me.

And the second idea; I didn't understand it. What do you mean?!

Your explanation and ideas made me want to start on some other new ideas :)

Niloofar24 15 Posting Whiz

Thank you @Grebouillis.

Thank you @snippsat.

Niloofar24 15 Posting Whiz

Hello.
I'm trying to create a web crawler. I've read about a web crawler's duties, how it works, and what it does.
But I just need more information. Could you please tell me what a web crawler can do? What kind of duty can I define for my web crawler? What can I ask it to do?

Niloofar24 15 Posting Whiz

@Slyte, what is that dt in line 10 for?

Niloofar24 15 Posting Whiz

Hi everybody.
What is the usage of urljoin?
An example:

>>> from urlparse import urljoin
>>> url = urljoin('http://python.org/','about.html')
>>> url
'http://python.org/about.html'

I think the answer is that when we take a link from 'http://www.python.org/', for example, it looks like this: <a href="/about/">about</a>.
So I take the href part, which is /about/ here, and use urljoin to join this string (of course with .html) to the main URL, which is 'http://python.org/' here. Correct?!

Of course, I should delete those slashes from /about/ first.
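A couple of sketch cases; in Python 3 the same function lives in urllib.parse, and there is no need to strip the slashes by hand, since urljoin understands root-relative paths like /about/:

```python
from urllib.parse import urljoin  # Python 2: from urlparse import urljoin

# a plain relative name is resolved against the base URL
print(urljoin('http://python.org/', 'about.html'))    # http://python.org/about.html

# a leading slash means "from the site root", so /about/ works as-is
print(urljoin('http://python.org/docs/', '/about/'))  # http://python.org/about/
```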

Niloofar24 15 Posting Whiz

Hello, me again :)
With this code:

>>> from BeautifulSoup import BeautifulSoup
>>> import urllib2
>>> url = urllib2.urlopen('http://www.python.org').read()
>>> soup = BeautifulSoup(url)
>>> links = soup('a')
>>> print links

A list of links was printed in the terminal. I want to send the list to a text file, so I tried this:

>>> with open('python-links.txt.', 'w') as f:
...     f.write(links)

But there was an error:

  File "<stdin>", line 2, in <module>
TypeError: expected a character buffer object
What is the problem? How can I fix that?

And one more question; as the list looks like this (I will copy only a small part of it):

[<a href="#content" title="Skip to content">Skip to content</a>, <a id="close-python-network" class="jump-link" href="#python-network" aria-hidden="true">
<span aria-hidden="true" class="icon-arrow-down"><span>&#9660;</span></span> Close
                </a>, <a href="/" title="The Python Programming Language" class="current_item selectedcurrent_branch selected">Python</a>, <a href="/psf-landing/" title="The Python Software Foundation">PSF</a>,

So how can I put each link on a new line?
I tried this:

>>> text = '\n'.join(links)

But I got this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sequence item 0: expected string, Tag found

How can I do that?
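Both errors come from the same cause: soup('a') returns Tag objects, while f.write() and '\n'.join() expect strings. A sketch of the usual fix, converting each tag with str() first (plain strings below stand in for real Tag objects):

```python
# stand-ins for the Tag objects that soup('a') would return
links = ['<a href="#content">Skip to content</a>',
         '<a href="/">Python</a>']

# str() turns each Tag into its HTML text, so join() and write() accept it
text = '\n'.join(str(link) for link in links)  # one link per line
with open('python-links.txt', 'w') as f:
    f.write(text)
```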

Niloofar24 15 Posting Whiz

I've tested the code again. I gave it a URL, and then the output was:

**********************************************************************
Scanning depth 1 web
**********************************************************************
**********************************************************************
Scanning depth 2 web
**********************************************************************
**********************************************************************
Scanning depth 3 web
**********************************************************************
**********************************************************************
Scanning depth 4 web
**********************************************************************
**********************************************************************
Scanning depth 5 web
**********************************************************************
****************************************
RESULTS
****************************************
http:// the main url i gave to the programe in line 47 / was found 1 time.

But there were many links on that page of the website, so why weren't any of them printed in the terminal??!! Is this a web crawler?!

I thought a web crawler first enters the page whose URL we give it, then finds all the links on that page, prints those links, and then enters each link to do it all again, but here I got a different result.

Niloofar24 15 Posting Whiz

Oops! I forgot to send the code!

# -*- coding: utf-8 -*-
from HTMLParser import HTMLParser
from urllib2 import urlopen

class Spider(HTMLParser):
    def __init__(self, starting_url, depth, max_span):
        HTMLParser.__init__(self)
        self.url = starting_url
        self.db = {self.url: 1}
        self.node = [self.url]

        self.depth = depth # recursion depth max
        self.max_span = max_span # max links obtained per url
        self.links_found = 0

    def handle_starttag(self, tag, attrs):
        if self.links_found < self.max_span and tag == 'a' and attrs:
            link = attrs[0][1]
            if link[:4] != "http":
                link = '/'.join(self.url.split('/')[:3])+('/'+link).replace('//','/')

            if link not in self.db:
                print "new link ---> %s" % link
                self.links_found += 1
                self.node.append(link)
            self.db[link] = (self.db.get(link) or 0) + 1

    def crawl(self):
        for depth in xrange(self.depth):
            print "*"*70+("\nScanning depth %d web\n" % (depth+1))+"*"*70
            context_node = self.node[:]
            self.node = []
            for self.url in context_node:
                self.links_found = 0
                try:
                    req = urlopen(self.url)
                    res = req.read()
                    self.feed(res)
                except:
                    self.reset()
        print "*"*40 + "\nRESULTS\n" + "*"*40
        zorted = [(v,k) for (k,v) in self.db.items()]
        zorted.sort(reverse = True)
        return zorted

if __name__ == "__main__":
    spidey = Spider(starting_url = 'http://www.python.org', depth = 5, max_span = 10)
    result = spidey.crawl()
    for (n,link) in result:
        print "%s was found %d time%s." %(link,n, "s" if n is not 1 else "")

Niloofar24 15 Posting Whiz

Hello.
I was looking for a tutorial or any example of creating a web crawler, and I found this code somewhere, so I copied and pasted it to test it:

First, it is a web crawler, right? Because when I gave it the URL of a website, the output was some links printed in the terminal.

Second, if you test it yourself, you will see that the links are divided into parts with the title Scanning depth 1 web and so on (the number changes). What is that for? What does it mean? What does depth number web mean?

Third, I want to send exactly everything I see printed in the terminal into a text file, so where should I put this code:

with open('file.txt', 'w') as f:
    f.write()

And what should I type in the ( )?

And finally, I have a request.
Could you explain each line of the code for me please, if you are familiar with it? Even a few lines of explanation would be really helpful, because I don't understand it clearly and I want to learn it well. It's only a request, and I will be happy if you help me understand it.
Thank you in advance :)

Niloofar24 15 Posting Whiz

Thank you @Schol-R-LEA.

Niloofar24 15 Posting Whiz

Thank you @Slyte, but unfortunately, when I checked the link, I got a 403 error :(

Niloofar24 15 Posting Whiz

Thank you @vegaseat, that was helpful.

Niloofar24 15 Posting Whiz

Thank you @Andrae.

Thank you @snippsat.

Niloofar24 15 Posting Whiz

Hi friends!
I want to create a countdown program. Here is my code:

from kivy.app import App

from kivy.uix.boxlayout import BoxLayout
from kivy.uix.label import Label

import datetime

class CountdownApp(App):
    def build(self):

        delta = datetime.datetime(2015, 3, 21, 2, 15, 11) - datetime.datetime.now()
        days = delta.days
        days = str(days)
        self.label_days = Label(text=days + "  days")

        hour_string = str(delta).split(', ')[1]
        hours = hour_string.split(':')[0]
        self.label_hours = Label(text=hours + "  hours")

        minuts = hour_string.split(':')[1]
        self.label_minuts = Label(text=minuts + "  minuts")

        seconds = hour_string.split(':')[2]
        self.label_seconds = Label(text=seconds + "  seconds")



        b = BoxLayout(orientation="vertical")
        b.add_widget(self.label_days)
        b.add_widget(self.label_hours)
        b.add_widget(self.label_minuts)
        b.add_widget(self.label_seconds)
        return b

if __name__ == "__main__":
    CountdownApp().run()

I want the program to update itself every second, so the label which shows seconds should be updated every second....
How can I do that?
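In Kivy the usual tool for this is Clock.schedule_interval(callback, 1.0), which calls the callback once per second so it can assign fresh text to the labels. The datetime arithmetic that such a callback would run can be sketched without Kivy at all; divmod avoids the string splitting (the dates below are only examples):

```python
import datetime

def countdown_parts(target, now):
    """Return (days, hours, minutes, seconds) remaining until target."""
    delta = target - now
    hours, rest = divmod(delta.seconds, 3600)  # .seconds is the sub-day part
    minutes, seconds = divmod(rest, 60)
    return delta.days, hours, minutes, seconds

# inside the app this would run once per second via something like:
#   Clock.schedule_interval(self.update, 1.0)
# with update() assigning str(seconds) + ' seconds' to the label text
target = datetime.datetime(2015, 3, 21, 2, 15, 11)
now = datetime.datetime(2015, 3, 20, 1, 10, 5)
print(countdown_parts(target, now))  # (1, 1, 5, 6)
```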

Niloofar24 15 Posting Whiz

Yes it works, thank you @vegaseat.

Niloofar24 15 Posting Whiz

Hi again.
I want to create a robot, spider, or crawler with Python urllib. I still couldn't find any good tutorial. Any suggestions?!

Niloofar24 15 Posting Whiz

Hi friends!

    import urllib
    url = 'http://www.python.org'
    text = urllib.urlopen(url).read()

I typed the code above in the terminal, and on the next line, with print text, an HTML page was printed there.
I want to send it to a text file; how can I do that?
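A sketch of the write step; the same lines work whether text came from urllib.urlopen(url).read() or anywhere else. A short stand-in string replaces the live download here, and page.html is just an example filename:

```python
# stand-in for: text = urllib.urlopen(url).read()
text = '<html><head><title>demo</title></head></html>'

with open('page.html', 'w') as f:  # 'w' creates or overwrites the file
    f.write(text)

# reading it back confirms the content landed on disk
with open('page.html') as f:
    print(f.read() == text)  # True
```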

Niloofar24 15 Posting Whiz

@AleMonteiro, I didn't ask you to google it for me.
I know how to install sqlite3 for PHP, but the problem was that when I want to connect to it with PDO, I still get the error "can't find Driver".

When I have a problem, I always search for it on the net first; then, if I can't find the answer, I ask for help here on DANIWEB, since maybe other users know it. I just asked if you knew how I could install the driver on my OS, but I didn't ask you to google it for me.

Anyway, thank you for all the answers.

Niloofar24 15 Posting Whiz

Well, I'm not sure if I installed it correctly or not.
How can I install Python urllib on my Linux Ubuntu?

Niloofar24 15 Posting Whiz

Hello.
I want to learn Python urllib. I have installed it, and now I'm looking for a good tutorial. Any suggestions?

Niloofar24 15 Posting Whiz

Well, thank you @vegaseat, it works.
But I still don't know how to arrange my code, which has several different labels; with your code I can print only one label with return label, but I have some values that I need to put into different labels.

Niloofar24 15 Posting Whiz

@AleMonteiro, as we see in the Requirements part of the page you linked, the PDO Driver for SQLite 3.x is needed.
Well, can you tell me how I can install that driver please?! I don't know what to do. I'm using Linux Ubuntu.

Niloofar24 15 Posting Whiz

I checked and it was an int. I changed it into a str, changed a few parts, and cleared some extra parts of the code; it's clearer now:

from kivy.app import App

from kivy.uix.boxlayout import BoxLayout
from kivy.uix.label import Label

import datetime

def timer(self):
    delta = datetime.datetime(2015, 3, 21, 2, 15, 11) - datetime.datetime.now()
    days = delta.days

    new_days = str(days)
    self.l_days.text = new_days

class CountdownApp(App):
    def build(self):
        b = BoxLayout()
        self.l_days = Label(text = "days")
        b.add_widget(self.l_days)
        return b

if __name__ == "__main__":
    CountdownApp().run()

As you see here:

        new_days = str(days)
        self.l_days.text = new_days

I can't set the label text with a variable.

Niloofar24 15 Posting Whiz

I want to exchange the line 18 text with the line 23 label text, but line 18 doesn't work in my code. Any idea?!

from kivy.app import App

from kivy.uix.boxlayout import BoxLayout
from kivy.uix.label import Label

import datetime

def timer():
    delta = datetime.datetime(2015, 3, 21, 2, 15, 11) - datetime.datetime.now()
    days = delta.days
    hour_string = str(delta).split(', ')[1]
    hours = hour_string.split(':')[0]
    minuts = hour_string.split(':')[1]
    seconds = hour_string.split(':')[2]
    seconds_1 = hour_string.split(':')[2].split('.')[0]
    #print ("%s days" % days)
    #print ("%s hours" % hours)
    #print ("%s minuts" % minuts)
    #print ("%s seconds" % seconds)
    self.l_days.text = days

class CountdownApp(App):
    def build(self):
        b = BoxLayout()
        l_days = Label(text = "days")
        l_hours = Label(text = "hours")
        l_minuts = Label(text = "minuts")
        l_seconds = Label(text = "seconds")

        b.add_widget(l_days)
        b.add_widget(l_hours)
        b.add_widget(l_minuts)
        b.add_widget(l_seconds)
        return b


if __name__ == "__main__":
    CountdownApp().run()

Niloofar24 15 Posting Whiz

I wanted to create the db with sqlite3, which I always use. Can I connect to it with PDO?!

Niloofar24 15 Posting Whiz

Hi friends.
With datetime.datetime.now() or datetime.datetime.today() I can get the current date (Gregorian calendar) for my program, but what if I want to get the current date from the Persian calendar? What should I do then? As my PC's OS date is set to the Gregorian calendar, what can I do?

Niloofar24 15 Posting Whiz

Hello.

I want to create a database on my PC's localhost and then use PDO to connect to that database to create tables and so on...

What should i do?

Niloofar24 15 Posting Whiz

I understood, thank you @fonzali.

Niloofar24 15 Posting Whiz

Hello.
I have copied these two code files from a website.

main.py:

from kivy.app import App
from kivy.uix.label import Label
from kivy.uix.boxlayout import BoxLayout
from kivy.clock import Clock
from kivy.properties import StringProperty

import datetime

class Counter_Timer(BoxLayout):
    def update(self, dt):
            delta = datetime.datetime(2015, 9, 13, 3, 5) - datetime.datetime.now()
            self.days = delta.days
            hour_string = str(delta).split(',')[1]
            self.hours = hours_string.split(':')[0]
            self.minuts = hours_string.split(':')[1]
            self.seconds = hours_string.split(':')[2].split('.')[0]
            return days, hours, minuts, seconds

class Counter(App):
    def build(self):
        counter = Counter_Timer
        Clock.schedule_interval(counter.update, 1.0)
        days = StringProperty()
        hours = StringProperty()
        minutes = StringProperty()
        seconds = StringProperty()
        return

if __name__ == "__main__":
    Counter().run()

On that website it was written:
"Let's add them to the Counter_timer class:"

days = StringProperty()
hours = StringProperty()
minutes = StringProperty()
seconds = StringProperty()

So I added them into the def build. That's the correct place, right?
And I added this line myself, but I'm not sure whether it's correct or not:
from kivy.properties import StringProperty

And here is the next file counter.kv:

<Counter_Timer>:
    orientation: 'vertical'
    Label:
        text: 'Vacation starts in:'
        font_size: '46dp'
    Label:
        text: root.days + ' Days'
        font_size: '46dp'
    Label:
        text: root.hours + ' Hours'
        font_size: '38dp'
    Label:
        text: root.minuts + ' Minuts'
        font_size: '30dp'
    Label:
        text: root.seconds + ' Seconds'
        font_size: '22dp'

When I run it, I get this error:

[INFO   ] Kivy v1.8.0
Purge log fired. Analysing...
Purge 4 log files
Purge finished !
[INFO   ] [Logger      ] Record log in /home/niloofar/.kivy/logs/kivy_15-02-24_5.txt
[INFO   ] [Factory     ] 157 symbols loaded
[DEBUG  ] [Cache       ] register <kv.lang> with …
Niloofar24 15 Posting Whiz

And with this:

hours = hour_string.split(':')[0]
print hours

I get this: 18
But with this:

hours = hour_string.split(':')
print hours

I get this: ['18', '19', '45.495552']

So it means it divides everything in the string 200 days, 18:37:08.889568 into list items according to the : sign; in other words, it splits the string wherever it meets a :, correct?!

Niloofar24 15 Posting Whiz

@vegaseat, I have a question.
With this:

delta = datetime.datetime(2015, 9, 13, 3, 5) - datetime.datetime.now()
print delta

I will get this: 200 days, 18:37:08.889568

And then with this:

hour_string_2 = str(delta).split(', ')
print hour_string_2

I will get this: ['200 days', '18:37:08.889568']

I tried this too:

hour_string_3 = str(delta).split()
print hour_string_3

And I got this:

Why? What exactly does the , do in .split(', ')?
Why do I get 3 items in the list with .split() but 2 items with .split(', ')?
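The argument to split() is the exact separator string, so the two calls cut the text in different places. A small sketch on the same delta string:

```python
s = '200 days, 18:37:08.889568'

# no argument: split on any run of whitespace; the comma stays glued to 'days,'
print(s.split())      # ['200', 'days,', '18:37:08.889568']

# ', ' argument: split only where a comma plus a space occurs; two pieces
print(s.split(', '))  # ['200 days', '18:37:08.889568']
```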

Niloofar24 15 Posting Whiz

Hi @fonzali, and thank you for the explanation.
So .format(differ.seconds) counts only the 18:53:17.230488 part and pays no attention to the days, correct? But differ.total_seconds() counts the days too, right?
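That reading matches the documented behaviour of timedelta: .seconds holds only the leftover hours, minutes, and seconds, while .total_seconds() folds the days in as well. A small sketch:

```python
import datetime

differ = datetime.timedelta(days=200, hours=18, minutes=53, seconds=17)

print(differ.days)             # 200
print(differ.seconds)          # 67997, i.e. 18*3600 + 53*60 + 17; days ignored
print(differ.total_seconds())  # 17347997.0, i.e. 200*86400 + 67997; days included
```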

Niloofar24 15 Posting Whiz

Thank you @vegaseat. Of course, the code you typed calculates seconds, but it doesn't count down; still, it's ok, I can do it myself. The code was helpful.