Hey guys, how is it going?

I am wondering whether this idea of mine is possible to achieve. I am looking at this search engine:
https://datacvr.virk.dk/data/

It can take names, CVR numbers, addresses, etc. as input and return a fascinating output. I think they are using Elasticsearch to do all the background work; I've had some experience with it, and it's quite amazing. Anyway, if I have a list of, say, 100 or 1000 things that I want to search for, how should I proceed (automate it with a script)?


If you search for cola, you see this URL:
https://datacvr.virk.dk/data/visninger?soeg=cola&type=Alle
So you can generate your own search URLs:

>>> url = 'https://datacvr.virk.dk/data/visninger?soeg={}&type=Alle'
>>> search_lst = ['car', 'train']
>>> for item in search_lst:
...     print(url.format(item))
...     
https://datacvr.virk.dk/data/visninger?soeg=car&type=Alle
https://datacvr.virk.dk/data/visninger?soeg=train&type=Alle
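One thing to watch out for: terms with spaces or Danish characters need URL-encoding before being dropped into the query string. A minimal sketch using the standard library's `urllib.parse.quote_plus` (the example terms are just placeholders):

```python
from urllib.parse import quote_plus

url = 'https://datacvr.virk.dk/data/visninger?soeg={}&type=Alle'

# Terms with spaces or non-ASCII letters must be percent-encoded
search_lst = ['coca cola', 'København']
for item in search_lst:
    print(url.format(quote_plus(item)))
# coca cola  -> ...soeg=coca+cola&type=Alle
# København -> ...soeg=K%C3%B8benhavn&type=Alle
```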

As an example of extracting some data, here is how to pull out all the result titles when searching for cola:

import requests
from bs4 import BeautifulSoup

url = 'https://datacvr.virk.dk/data/visninger?soeg=cola&type=Alle'
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')  # specify a parser explicitly
# Each result title sits in an <h2 class="name"> element
title_tags = soup.find_all('h2', {'class': 'name'})
for name in title_tags:
    print(name.text.strip())

"""Output-->
COCA-COLA NORDIC SERVICES ApS
Cola klubben
Coca-Cola Service S.A.
The Coca-Cola Company
CARLSBERG DANMARK A/S
Abby Thai Kropsmassage v/Ornanong Johansen
Karim Rahima
Coca-Cola Service S.A.
CARLSBERG DANMARK A/S Coca -Cola Tapperierne
COCA-COLA NORDIC SERVICES ApS
"""

Hey snip, thank you for looking into my issue =]

That is actually the first thing I thought of and did as well =]

The issue I faced was that if I search for a number, the search query is different, and depending on the number (I guess the length is what identifies it) there is yet another query. At that point I thought I would ask here and see if anyone had an idea I could try out as well. In the end, the solution I came to was to validate each entry and then pass it into the correct URL. Sorry for not closing the thread; I only just got the time to get online again.
Thank you again, great answer!
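For anyone landing here later, the validate-then-dispatch idea can be sketched like this. Note the rules are assumptions, not verified against the site: I'm guessing an 8-digit string is a CVR number and everything else is a free-text name/address search, following the poster's hunch that length identifies the query type.

```python
import re

def classify_entry(entry):
    """Rough classifier for a search input (assumed rules, not site-verified):
    8 digits -> treat as a CVR number, anything else -> free-text search."""
    entry = str(entry).strip()
    if re.fullmatch(r'\d{8}', entry):
        return 'cvr'
    return 'text'

print(classify_entry('12345678'))  # cvr
print(classify_entry('cola'))      # text
```

Each entry in the list would then be routed to whichever URL pattern its class requires.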
