Hey guys,

I am looking into XBRL files and need to extract certain data from each of them. However, I can't find much information on the existing python-xbrl library. Perhaps someone here has experience with it?
Here's an example XBRL file:
Click Here
Any ideas or solutions on how to parse a certain field and get its value?

Or maybe I should implement my own parser using "re"?

I did this for now, just to test it out ...

import re
import requests

xmlContent = requests.get("http://regnskaber.virk.dk/14502803/eGJybHN0b3JlOi8vWC1GMDk4RkNDNi0yMDE0MTIzMV8wOTE2MjFfMDk1L3hicmw.xml").text

print("Date: " + re.findall(r">(.+)<", re.findall(r"gsd:ReportingPeriodStartDate.+", xmlContent)[0])[0])

It works, though I am not sure how efficient it is, because I need to parse thousands of documents.

Thanks in advance =]

Edited by Slavi


Please put the shovel down before the hole gets too big and you can't climb out :) Regular expressions are not the way to go for something that is XML-based. The simplest way is to grab the libraries and play with them at a Python interactive prompt. Besides python-xbrl there is also http://arelle.org/documentation/api/ which seems popular. Give them a go, and if you run into problems please get back to us with more detailed questions.
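To give a rough idea of what "treat it as XML" means, here is a minimal sketch that uses only the standard library's xml.etree.ElementTree rather than python-xbrl or Arelle. The sample document, its namespace URI, and the helper names local_name and find_values are made up for illustration. Matching on the local tag name is an assumption that saves you from needing to know the real "gsd" namespace URI up front.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in document; the namespace URI is invented for this example.
SAMPLE = """<xbrl xmlns:gsd="http://example.com/gsd">
  <gsd:ReportingPeriodStartDate contextRef="c1">2014-01-01</gsd:ReportingPeriodStartDate>
  <gsd:ReportingPeriodEndDate contextRef="c1">2014-12-31</gsd:ReportingPeriodEndDate>
</xbrl>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def find_values(xml_text, name):
    """Return the text of every element whose local name equals `name`."""
    root = ET.fromstring(xml_text)
    return [(el.text or "").strip() for el in root.iter() if local_name(el.tag) == name]


print("Date: " + find_values(SAMPLE, "ReportingPeriodStartDate")[0])
```

For a real file you would pass the downloaded bytes (for example the .content of a requests response) to find_values instead of SAMPLE. A dedicated XBRL library is still the better choice once you need contexts, units, and taxonomies rather than a single field.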

Votes + Comments
good link

That's great, thanks for the link!
However, I can't seem to read the docs: they won't load or open. Is that the case for you too?

Or do you by any chance have an example of how you would parse an XBRL document and extract a field such as "startDate"?

Edited by Slavi
