What is unit testing?
Unit testing is a programming practice that consists of writing one or more simple tests for each function. Each unit test compares the result of a function, for a given set of parameters, to the expected result.
Ideally, you want to have tests for each of your functions. The percentage of functions covered by tests is referred to as code coverage.
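As a rough illustration of this function-level definition, the sketch below computes the share of functions that have at least one test. The function names are made up for the example, and real coverage tools usually count executed lines or branches rather than whole functions.

```python
def function_coverage(all_functions, tested_functions):
    """Return the percentage of functions that have at least one test."""
    all_functions = set(all_functions)
    if not all_functions:
        return 0.0
    covered = all_functions & set(tested_functions)
    return 100.0 * len(covered) / len(all_functions)


# Hypothetical project with four functions, three of which are tested.
coverage = function_coverage(
    ["square", "cube", "plot_curve", "mean"],
    ["square", "cube", "mean"],
)
print(f"{coverage:.0f}% of functions are covered")
```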
Why unit tests? My code is working!
Unit testing makes your code more robust.
Robust code:
- will not break easily upon changes (e.g., new R version, package updates, bug fixes, new features, etc.)
- can be refactored simply
- can be extended without breaking the rest
- can be tested
With a growing codebase, you will need unit tests, or you are going to spend a lot of time with the debugger afterward. In the long run, unit tests will save you a lot of time. You can even do test-driven development (TDD), which consists of writing your tests first and then stopping writing (improving) your function once it passes them.
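A minimal sketch of that test-first workflow in Python, using a hypothetical square function (the names square and test_square are chosen only for illustration): the test is written first, it fails while the function is missing, and the work stops once the test passes.

```python
# Step 1: write the test before the function exists.
def test_square():
    assert square(2) == 4
    assert square(-2) == 4
    assert square(0) == 0


# Step 2: running the test now fails, because square is not defined yet.
try:
    test_square()
    failed_before = False
except NameError:
    failed_before = True

# Step 3: write the simplest implementation that makes the test pass.
def square(x):
    return x * x


# Step 4: the test passes, so we can stop writing (or start refactoring).
test_square()
```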
Note that 100% code coverage is not necessarily a goal to reach, because some functions don’t need to be covered by unit tests or are hard to cover, such as functions with side effects. For example, it’s difficult to write unit tests that check graphical output.
Writing Unit Tests
You don’t want to have to write unit tests for your unit tests, so it’s best to trust well-written unit testing libraries like testthat in R or doctest in Python. You can find a unit testing library for almost every language.
Unit test with R
library(testthat)
# Function under test: returns the square of x
square <- function(x){
  x * x
}
test_that("single number", {
  expect_equal(square(2), 4)
})
Most of the time you are going to write different tests for your function to check different properties.
library(testthat)
square <- function(x){
  x * x
}
# Scalar inputs, including a negative number
test_that("single number", {
  expect_equal(square(2), 4)
  expect_equal(square(3), 9)
  expect_equal(square(-2), 4)
})
# The function is vectorised, so it should work element-wise
test_that("vectors", {
  expect_equal(square(c(2,4)), c(4,16))
})
# Missing values should propagate
test_that("test NA", {
  expect_true(is.na(square(NA)))
})
Of course, in real life you don’t want to run your tests every time you run your code, so you are going to write your tests in separate R files:
A square_code.R file:
square <- function(x){
  x * x
}
A tests/test_square_code.R file:
# Load the code under test (the path is relative to the tests folder)
source("../square_code.R", chdir = TRUE)
library(testthat)
test_that("single number", {
  expect_equal(square(2), 4)
  expect_equal(square(3), 9)
  expect_equal(square(-2), 4)
})
test_that("vectors", {
  expect_equal(square(c(2,4)), c(4,16))
})
test_that("test NA", {
  expect_true(is.na(square(NA)))
})
You can put all of your test files (with names starting with test_) in a tests folder and call testthat::test_dir('tests') to run them all.
Unit test with Python
In Python, the doctest library allows you to write your tests directly in the documentation of a function. The advantage is two-fold: you write your unit tests and, at the same time, usage examples for the function’s documentation.
def square(x):
    """
    function returning x^2
    :param x: number
    :return: number

    >>> square(2)
    4
    """
    return x * x


if __name__ == "__main__":
    import doctest
    # Run every example found in the docstrings of this module
    doctest.testmod()
We can also run several tests for each function:
def square(x):
    """
    function returning x^2
    :param x: number
    :return: number

    >>> square(2)
    4
    >>> square(3)
    9
    >>> square(-2)
    4
    >>> import numpy as np
    >>> square(
    ...     np.array([2,4]))
    array([ 4, 16])
    >>> bool(np.isnan(square(float("nan"))))
    True
    """
    return x * x


if __name__ == "__main__":
    import doctest
    doctest.testmod()
Of course, in real life you don’t want to run your tests every time you run your code, and you may need the __main__ block to do something other than testing. So, as in R, you call your tests from a separate test file:
import doctest
import square_code
doctest.testmod(square_code)
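doctest.testmod returns a TestResults(failed, attempted) named tuple, so a test script can inspect it, for example to set an exit status. The sketch below is self-contained: instead of importing a real square_code.py file, it builds a stand-in module in memory, an assumption made only so that the example runs on its own.

```python
import doctest
import types

# Stand-in for the contents of a square_code.py file (illustration only).
SOURCE = '''
def square(x):
    """
    Return x squared.

    >>> square(2)
    4
    >>> square(3)
    9
    """
    return x * x
'''

# Build a module object from the source, standing in for `import square_code`.
square_code = types.ModuleType("square_code")
exec(SOURCE, square_code.__dict__)

# testmod runs every docstring example found in the module.
results = doctest.testmod(square_code)
print(f"{results.attempted} examples run, {results.failed} failed")

# In a real test script this value could be passed to sys.exit().
exit_code = 1 if results.failed else 0
```

With a real file on disk, the first half of the block reduces to the import square_code line shown above.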
Property based testing
Unit tests are great, but you may want to cover more cases than a few examples can give you. Instead of checking examples, you may want to check properties. When you do property-based testing, you generate a range of examples and test the property on each of them. If the property fails for a given case, the property-based testing library returns that case as a counterexample.
The computational load of property-based testing can be significantly higher than that of unit testing.
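To make the idea concrete, here is a minimal hand-rolled sketch of a property check in plain Python. It is not how any particular library is implemented, and the helper name find_counterexample is invented for this illustration. It draws random integers, evaluates the property on each one, and returns the first input that breaks it.

```python
import random


def find_counterexample(prop, n_examples=100, seed=0):
    """Return the first random integer for which prop is false, else None."""
    rng = random.Random(seed)  # fixed seed so that runs are reproducible
    for _ in range(n_examples):
        x = rng.randint(-1000, 1000)
        if not prop(x):
            return x
    return None


# A property that holds for every integer: a square is never negative.
print(find_counterexample(lambda x: x * x >= 0))

# A property that is false for negative numbers: abs(x) == x.
print(find_counterexample(lambda x: abs(x) == x))
```

Each property is evaluated up to n_examples times, which is where the extra computational cost comes from. Dedicated libraries add richer generators and usually shrink the counterexample to a simpler one before reporting it.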
Property based testing with R
In R, you can use the hedgehog library to perform property-based testing. hedgehog works nicely with the functions of the testthat package.
rev_two_time <- function(x){
  return(rev(rev(x)))
}
library(hedgehog)
test_that("Reverse of reverse is identity",
  forall(
    gen.c(gen.element(1:100)),
    function(xs){expect_equal(rev_two_time(xs), xs)}
  ))
Here we generate 100 vectors and test that reversing a vector twice gives us back the original vector.
See the hedgehog documentation for the list of available generators.
Generate uniformly distributed numbers:
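dummy_comp <- function(x){
  return(log(x)/x)
}
library(hedgehog)
# log(x) is NaN for x < 0, so this property fails whenever a negative
# x is generated, and hedgehog then reports it as a counterexample.
test_that("dummy_comp never returns NaN",
  forall(
    gen.unif(from = -1, to = 100),
    function(xs){expect_false(is.nan(dummy_comp(xs)))}
  ))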
Generate data frames:
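# Generator of data frames with n rows and two columns, as and bs
gen.df.of <- function(n)
  generate(for (x in
    list( as = gen.c(of = n, gen.element(1:10))
        , bs = gen.c(of = n, gen.element(10:20))
        )
  ) as.data.frame(x)
  )
# Generator of data frames with between 1 and 100 rows
gen.df <-
  generate(for (e in gen.element(1:100)) {
    gen.df.of(e)
  })
# This property does not hold in general, since gen.df can produce more
# than one row; when the test fails, hedgehog reports a counterexample.
test_that("All data frames are of length 1",
  forall(gen.df, function(x){expect_equal(nrow(x), 1)})
)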
Property based testing with Python
In Python, you can use the hypothesis library to perform property-based testing. Hypothesis uses decorators to specify the generators (called strategies) that produce the examples to run.
from hypothesis import given
import hypothesis.strategies as st


@given(st.integers(), st.integers())
def test_ints_are_commutative(x, y):
    assert x + y == y + x


@given(x=st.integers(), y=st.integers())
def test_ints_cancel(x, y):
    assert (x + y) - y == x


@given(st.lists(st.integers()))
def test_reversing_twice_gives_same_list(xs):
    # This will generate lists of arbitrary length (usually between 0 and
    # 100 elements) whose elements are integers.
    ys = list(xs)
    ys.reverse()
    ys.reverse()
    assert xs == ys


if __name__ == "__main__":
    # Calling a decorated test runs it on many generated examples
    test_ints_are_commutative()
    test_ints_cancel()
    test_reversing_twice_gives_same_list()