Part 7: Regression testing
(A series of blog posts about the tech behind lagen.nu. Earlier parts are here: first, second, third, fourth, fifth and sixth)
Like most developers who have been Test infected, I try to create regression tests whenever I can. A project like lagen.nu, which has no GUI, no event handling, no databases, just operations on text files, is really well suited for automated regression testing. However, when I started out, I didn’t do test-first programming since I didn’t really have any idea of what I was doing. As things solidified, I encountered a particular section of the code that lent itself very nicely to regression testing.
Now, when I say regression testing, I don’t necessarily mean unit testing. I’m not so concerned with testing classes at the method level, as my “API” is really document oriented: a particular text document sent into the program should result in the return of a particular XML document. Basically, there are only two methods that I’m testing:
- The lawparser.parse() method: Given a section of law text, returns the same section with all references marked up, as described in part 5.
- Law._txt_to_xml(), which, given an entire law as plaintext, returns the XML version, as described in part 4.
Since both these tests operate in the fashion “Send in really big string, compare result with the expected other even bigger string”, I found that pyunit didn’t work that well for me, as it’s more centered around testing lots of methods in lots of classes, where the test data is so small that it’s comfortable to keep it inside the test code.
Instead, I created my tests in the form of a bunch of text files. For lawparser.parse, each file is just two paragraphs, the first being the indata, and the second being the expected outdata:
Vid ändring av en bolagsordning eller av en beviljad koncession
gäller 3 § eller 4 a § i tillämpliga delar.

Vid ändring av en bolagsordning eller av en beviljad koncession
gäller <link section="3">3 §</link> eller <link section="4a">4 a §</link> i tillämpliga delar.
The test runner then becomes trivial:
def runtest(filename,verbose=False,quiet=False):
    (test,answer) = open(filename).read().split("\n\n", 1)
    p = LawParser(test,verbose)
    res = p.parse()
    if res.strip() == answer.strip():
        print "Pass: %s" % filename
        return True
    else:
        print "FAIL: %s" % filename
        if not quiet:
            print "----------------------------------------"
            print "EXPECTED:"
            print answer
            print "GOT:"
            print res
            print "----------------------------------------"
        return False
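For readers on Python 3, here is a minimal sketch of the same idea that runs a whole directory of such files. It is not the lagen.nu code: `load_case` and `run_all` are hypothetical names, the parser is passed in as a plain callable because `LawParser` is not shown here, and the `*.txt` naming and UTF-8 encoding are assumptions.

```python
from pathlib import Path


def load_case(path):
    """Split a test file into (indata, expected) at the first blank line."""
    # Assumption: the files are UTF-8; adjust if they are stored as latin-1.
    text = Path(path).read_text(encoding="utf-8")
    indata, expected = text.split("\n\n", 1)
    return indata.strip(), expected.strip()


def run_all(directory, parse):
    """Run parse() on every *.txt case in directory; return failing file names."""
    failures = []
    for path in sorted(Path(directory).glob("*.txt")):
        indata, expected = load_case(path)
        # Same comparison as runtest(): ignore leading/trailing whitespace.
        if parse(indata).strip() != expected:
            failures.append(path.name)
    return failures
```

An empty list from `run_all` means every case passed, which makes it easy to hook into a script that exits non-zero on failure.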
Similarly, the code to test Law._txt_to_xml() is also pretty trivial. There are two differences: Since the indata is larger and already split up in paragraphs, the indata and expected result for a particular test are stored in separate files. This also lets me edit the expected results file using nXML mode in Emacs.
Comparing two XML documents is also a little trickier, in that they can be equivalent, but still not match byte-for-byte (since there can be semantically insignificant whitespace and similar stuff). To avoid getting false alarms, I put both the expected result file and the actual result through tidy. This ensures that their whitespace will be equivalent, as well as easy to read. It’s also a good example of piping things to and from a command in Python:
import os

def tidy_xml_string(xmlstring):
    """Neatifies an XML string and returns it"""
    (stdin,stdout) = os.popen2("tidy -q -n -xml --indent auto --char-encoding latin1")
    stdin.write(xmlstring)
    stdin.close()
    return stdout.read()
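`os.popen2` only exists in Python 2. A rough Python 3 equivalent of the same piping pattern, using the standard `subprocess` module, could look like the sketch below. `pipe_through` is a hypothetical helper name, and the latin-1 default is only an assumption carried over from the tidy flags above.

```python
import subprocess
import sys


def pipe_through(command, text, encoding="latin-1"):
    """Send text to command's stdin and return what it writes to stdout."""
    result = subprocess.run(
        command,                      # argument list, e.g. ["tidy", "-q", ...]
        input=text.encode(encoding),  # bytes written to the child's stdin
        stdout=subprocess.PIPE,       # capture the child's stdout
        check=False,                  # leave exit-status handling to the caller
    )
    return result.stdout.decode(encoding)


if __name__ == "__main__":
    # Stand-in filter so the sketch runs without tidy installed:
    # a tiny Python child process that upper-cases its input.
    upper = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    print(pipe_through(upper, "<a>b</a>"))
```

With tidy installed, the command list would be the same flags as in the post: `["tidy", "-q", "-n", "-xml", "--indent", "auto", "--char-encoding", "latin1"]`.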
If the two documents still don’t match, it can be difficult to pinpoint the exact place where they differ. I could dump the results to file and run command-line diff on them, but since there exists a perfectly good diff implementation in the Python standard library I used that one instead:
from difflib import Differ
differ = Differ()
diff = list(differ.compare(res.splitlines(), answer.splitlines()))
print "\n".join(diff)+"\n"
The result is even easier to read than standard diff output, since it points out the position on the line as well (maybe there’s a command line flag for diff that does this?):
[...]
suscipit non, venenatis ac, dictum ut, nulla. Praesent
mattis.</p>
</section>
- <section id="1" element="2">
?                 ^^^
+ <section id="1" moment="2">
?                 ^^
<p>Sed semper, ante non vehicula lobortis, leo urna sodales
justo, sit amet mattis felis augue sit amet felis. Ut quis
[...]
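To try this out in isolation, here is a small self-contained Python 3 sketch that feeds `Differ` two short strings modelled on the excerpt above. `explain_difference` is a hypothetical wrapper name, and the sample strings are made up for illustration.

```python
from difflib import Differ


def explain_difference(actual, expected):
    """Return Differ's line-by-line comparison as a list of strings."""
    differ = Differ()
    lines = differ.compare(actual.splitlines(), expected.splitlines())
    # Differ ends its "? " hint lines with a newline even when the input
    # lines have none, so strip it to keep the joined output evenly spaced.
    return [line.rstrip("\n") for line in lines]


actual = '<section id="1" element="2">\n<p>Sed semper</p>'
expected = '<section id="1" moment="2">\n<p>Sed semper</p>'
for line in explain_difference(actual, expected):
    print(line)
```

Lines starting with "- " come from the first argument, lines starting with "+ " from the second, and the "? " lines carry the carets that point at the differing characters.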
So, that’s basically my entire test setup for now. I need to build more infrastructure for testing the XSLT transform and the HTML parsing code, but these two areas are the trickiest.
Since I can run these test methods without having an expected return value, they are very useful as the main way of developing new functionality: I specify the indata, and let the test function just print the outdata. I can then work on new functionality without having to manually specify exactly how I want the outdata to look (this is actually somewhat difficult for large documents). I just hack away until it sort of looks like I want, and then cut’n paste the outdata to the “expected result” file.
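That workflow can be sketched in a few lines of Python 3. This is only an illustration under assumptions, not the actual lagen.nu test code: `check_or_show` is a hypothetical name, the function under test is passed in as `transform`, and the files are assumed to be UTF-8.

```python
from pathlib import Path


def check_or_show(transform, indata_path, expected_path):
    """Compare against the expected file if it exists, else just print the output.

    Returns True or False when an expected file exists, and None otherwise.
    """
    indata = Path(indata_path).read_text(encoding="utf-8")
    result = transform(indata)
    expected_file = Path(expected_path)
    if not expected_file.exists():
        # Development mode: nothing to compare against yet, so show the
        # output for inspection (and for pasting into the expected file).
        print(result)
        return None
    expected = expected_file.read_text(encoding="utf-8")
    return result.strip() == expected.strip()
```

Once the printed output looks right, saving it under `expected_path` turns the same call into a regression test.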
Tags: lagen.nu, mjukvarutestning, programmering, python