Python data validation. Idioms and ideas.
I'm interested in learning about data validation, especially in Python.
Python already has several common idioms for data validation.
Several built-in functions and statements evaluate data. For instance, isinstance(object, classinfo) checks whether the given object is an instance of the class or type in classinfo.
One idiom is that it is easier to ask forgiveness than permission (EAFP), which basically means it is better to try an operation and catch the errors than to check beforehand that the data is valid. The main language feature that supports this idiom is the try statement and its associated except, else, and finally clauses.
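As a rough illustration of the difference, here is a minimal sketch that converts text to an integer both ways. The function names to_int_lbyl and to_int_eafp are made up for this example; only int(), isinstance(), and str.isdecimal() come from Python itself.

```python
def to_int_lbyl(text):
    """Ask permission: check the data first, then convert."""
    # isdecimal() only accepts unsigned digit strings, so "-7" is rejected here.
    if isinstance(text, str) and text.isdecimal():
        return int(text)
    return None


def to_int_eafp(text):
    """Ask forgiveness: attempt the conversion and handle the failure."""
    try:
        return int(text)
    except (TypeError, ValueError):
        # TypeError for things like None, ValueError for text like "abc".
        return None
```

Note that the check-first version has to anticipate every valid form of input itself, while the try version simply defers to int().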
A try statement can be bulky to write, though. I wonder whether it would be possible to write a function wrapper to simplify such a check, and whether such a wrapper would be worthwhile or appropriate.
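One possible shape for such a wrapper is a decorator. The sketch below is only an assumption about what the wrapper could look like; fallback_on and parse_quantity are hypothetical names, not part of any library.

```python
import functools


def fallback_on(*exceptions, default=None):
    """Build a decorator that returns `default` if one of `exceptions` is raised."""
    def decorator(func):
        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except exceptions:
                return default
        return wrapper
    return decorator


@fallback_on(ValueError, TypeError, default=0)
def parse_quantity(text):
    """Hypothetical parser: falls back to 0 on bad input."""
    return int(text)
```

Whether this is worth it is debatable: it shortens the call sites, but silently replacing an error with a default can hide real bugs, so it probably only suits cases where a default is genuinely acceptable.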
In addition, the built-in exceptions only cover general errors. Sometimes technically valid input is still invalid according to the design specifications. I've read that a user can define their own errors and exceptions. I don't know whether this is true, but if it is, that would seem the most natural thing to do.
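For what it's worth, this is true: a user-defined exception is just a class that derives from Exception or one of its subclasses. A minimal sketch, where OutOfRangeError and parse_percentage are names invented for the example and the 0 to 100 limit stands in for a design specification:

```python
class OutOfRangeError(ValueError):
    """Raised when a value parses correctly but violates the design limits."""


def parse_percentage(text):
    value = float(text)  # raises the built-in ValueError for non-numeric text
    if not 0 <= value <= 100:
        raise OutOfRangeError(f"{value} is not between 0 and 100")
    return value
```

Deriving from ValueError means callers that already catch ValueError keep working, while callers that care can catch OutOfRangeError specifically.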
However, another option might be to create custom data validation functions. I'm not talking about the poorly coded "if x != 1 and x != 2 and x != 3", but powerful yet general functions to ensure that data is valid.
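One way such general functions might look is a set of small checks that can be combined. This is only a sketch; in_range, all_of, and is_small_int are hypothetical helpers, not an existing API.

```python
def in_range(low, high):
    """Return a check that accepts values with low <= value <= high."""
    def check(value):
        return low <= value <= high
    return check


def all_of(*checks):
    """Return a check that passes only if every given check passes."""
    def check(value):
        # all() stops at the first failing check, so later checks
        # never see a value that an earlier check already rejected.
        return all(c(value) for c in checks)
    return check


# Replaces "x != 1 and x != 2 and x != 3" style tests with a reusable rule.
is_small_int = all_of(lambda v: isinstance(v, int), in_range(1, 3))
```

For example, is_small_int(2) passes, while is_small_int(4) and is_small_int("2") both fail.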
Here's the Wikipedia entry for data validation.
http://en.wikipedia.org/wiki/Data_validation
What are your thoughts?
Last edited by lrh9; Oct 23rd, 2009 at 8:17 pm.
#2 Oct 24th, 2009
The Pmw megawidgets toolkit for Tkinter implements a mechanism for data validation. For example, the background color of an entry field turns pink if you enter bad input. Although this module is old, it could be a good starting point for experimenting with data validation. You can create your own validators, and you can also browse the Pmw source code to see how validation is implemented, which could give you ideas for developing your own system.
Last edited by Gribouillis; Oct 24th, 2009 at 1:56 am.
#3 Oct 24th, 2009
Why yes, that is poorly coded. Good thing this is Python!
```python
>>> x = 4  # any value other than 1, 2, or 3
>>> x != 1 and x != 2 and x != 3
True
>>> x not in [1, 2, 3]
True
```
Data validation is extremely important for code that is distributable.
If you're writing scripts for you and only you that will never be used by anybody else, then you can skip this; however, when you have other users you should consider that they did not sit with you while you designed your program. They might completely miss the concept of what you're trying to achieve. If they provide improper input, it is highly likely that your program will crash.
There are also going to be malicious users. These 1337 h4x0rs could intentionally try to either crash the system distributing your code or piggyback through your code into the guts of your system, causing all kinds of havoc.
Just my $0.02
#4 Oct 24th, 2009
If data is truly coming from an untrusted source, then validate it to death as soon as the data appears in your program, but don't sprinkle checks throughout your source. Assume fellow programmers know what they are doing: don't add 'defensive code' to check arguments, for example, and assume that the arguments to functions are fair. This allows for the later use of duck typing, and it is best to treat fellow programmers as competent.
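A small sketch of that split, with made-up names (read_age, years_until) and an assumed 0 to 150 limit: the untrusted text is checked once where it enters, and the internal helper does no argument checking at all.

```python
def read_age(raw):
    """Boundary check: runs once, where untrusted text enters the program."""
    age = int(raw)  # raises ValueError on non-numeric input
    if not 0 <= age <= 150:
        raise ValueError(f"age out of range: {age}")
    return age


def years_until(age, target):
    """Internal helper: trusts its caller and does no argument checking."""
    return max(target - age, 0)
```

Because years_until does not insist on int, it also works with floats or any other number-like object, which is the duck typing benefit mentioned above.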
- Paddy.
#5 Oct 24th, 2009
When validating, you want to include, not exclude. So you want a list of data that is acceptable, and the input has to be in the list. That way, when another unit or type of data is introduced it is automatically an error. If your program excludes, then the new unit would be automatically included, since you did not explicitly exclude it, which is not a good idea.
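To make the contrast concrete, here is a minimal sketch. The unit names and the helpers check_unit and passes_exclusion are assumptions chosen for illustration.

```python
ALLOWED_UNITS = frozenset({"mm", "cm", "m"})  # hypothetical accepted values
DENIED_UNITS = frozenset({"in"})              # hypothetical rejected values


def check_unit(unit):
    """Include-style check: anything not listed is an error."""
    if unit not in ALLOWED_UNITS:
        raise ValueError(f"unsupported unit: {unit!r}")
    return unit


def passes_exclusion(unit):
    """Exclude-style check: anything not listed slips through."""
    return unit not in DENIED_UNITS
```

A newly introduced unit such as "furlong" is rejected by check_unit but accepted by passes_exclusion, which is exactly the problem with excluding.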
Does anyone have a good synonym for exclude or include? This post could certainly use one.
Last edited by woooee; Oct 24th, 2009 at 5:11 pm.
#6 Oct 24th, 2009
Does anyone have a good synonym for exclude or include. This post could certainly use one.
Permit and deny might be good synonyms for include and exclude in this instance.
Duck typing. If it looks like a duck and quacks like a duck, it's a duck. When validating data using duck typing, you perform a set of actions to test whether the data acts a certain way. Unlike checking whether the data is an instance of a certain class, this allows developer- and user-created objects to work so long as they behave like the expected object.
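A minimal sketch of that idea, assuming the only behaviour needed is a read() method. read_all and FakeFile are invented for the example; io.StringIO is the standard library's in-memory text file.

```python
import io


def read_all(source):
    """Accept anything that provides read(), whatever its class."""
    try:
        read = source.read  # look the method up without calling it yet
    except AttributeError:
        raise TypeError("source must provide a read() method") from None
    return read()


class FakeFile:
    """User-created class that is unrelated to the built-in file types."""
    def read(self):
        return "quack"
```

Both read_all(io.StringIO("hello")) and read_all(FakeFile()) work, while read_all(42) raises TypeError. An isinstance check against a file class would have rejected FakeFile.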
#7 Oct 24th, 2009
...When validating data using duck typing, you perform a set of actions to test whether the data acts a certain way.
- Paddy.