Measuring scalability – Part 1

 

In Units of Time
As developers, we are continuously challenged by complexity. Our job description always has that word in one form or another. We tame complexity as firefighters tame fire. But the journey to solve a business problem does not end with writing code and taming complexity. Next, we need to ship our work to as many people as possible. The larger the audience for our work, the bigger the reward. To make it large, we end up taming another beast that honestly few amongst us have conquered: the beast of scalability. It is often buried under labels like "non-functional" and overlooked until it bites. Thus, it is no shame to admit that we prefer to neglect the beast's existence by ignoring it until it becomes unavoidable. Often, by then, it has become a big problem.
We do not want to glorify ourselves further, but in reality, we must take pride in what we do. Isn't that what characterises craftsmen? Enough of words; let us turn our attention to the how. How do we tame that beast of scalability to meet the ever-growing demand of a larger audience? Oh, we sense unrest: it is not the developer who can address that, it is rather the sales and marketing executives. True enough. And yet the beast we talked about, efficiency, sits at the heart of that question: the efficiency to deliver value to a larger audience without having to make significant changes to the work. Thus, the question of "how" that we floated earlier is really about how to serve a larger audience without incurring significant costs as the numbers grow. You cannot help but agree that it can only be solved if we know how our work, i.e., the system, will respond as the audience grows. What could be better than, say, a mathematical equation that tells you how long it might take to get a response if you plug in the number of users connected to the system? Something like this:
3.5E-06 * x^0.97 (sec)

Needless to say, x signifies the number of users in the above equation. So, if at a given time there are 20 concurrent users connected to the system, the equation tells us that a response will take roughly 6.4E-05 seconds on average across those users. That corresponds crudely to the 50th percentile of the response time, which is a fancy way of saying the average. We can be reasonably confident that it may even hold at the 75th percentile, i.e., 75% of users could potentially be served within that time. How does this help, you ask? Well, now you know roughly how long it will take. Based on whether that is acceptable to your business, you can decide to make changes. In other words, if the business is happy, then you are working efficiently: the system did not have to change while growing from 2-3 users (during development and testing) to 20 users. If the business is not happy, then you have a problem that can be addressed in one of two ways. One is to tune the system, either through configuration or by optimising the code execution path. The other is to increase the power of the runtime by getting a bigger machine to host the system. Wouldn't such a life be blissful?
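If you want to sanity-check that arithmetic, the equation can be evaluated directly; the snippet below is a minimal sketch that simply plugs a few user counts into the published curve.

# Evaluate the fitted curve t(x) = 3.5E-06 * x^0.97 for a few user counts
def predicted_response_time(users):
    return 3.5e-06 * users ** 0.97

for users in (20, 50, 100):
    print(users, "users ->", predicted_response_time(users), "sec")
# 20 users -> roughly 6.4e-05 sec, the figure quoted above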
If you are still around, drawn by the promise of what you can get from that equation, we are happy to share how we got there. We used a Python library called big_O. Let us get the obvious steps out of the way first, namely installation:
pip install big_O

Post installation, in the place where you would typically import the component, write this:
from big_o import big_o

fitted_curve = big_o(find_connected_paths, generate_parameters, min_n=5, max_n=20,
                     n_measures=5, n_repeats=20, verbose=True)
The variable fitted_curve contains the equation we published above.
Now, to the explanation. The first parameter, find_connected_paths, is the target function whose scalability we want to test. It is assumed that you have implemented that function somewhere and imported it. The next parameter, generate_parameters, is responsible for generating the data that is passed as an argument to the test subject, i.e., find_connected_paths. Then come the named parameters min_n and max_n, which bound the size of the input that generate_parameters must produce to induce complexity inside find_connected_paths. The harder the workload, the better the test, isn't it? The n_measures parameter indicates the number of times the performance will be measured across that range. The greater the number, the better the scalability curve; in any empirical analysis, the more observations, the better the fit. There is a risk of overfitting, but for the case at hand that is a blessing in disguise. Moving on, the critical parameter indicating the number of users is n_repeats. It indicates how many times the function is called while the time to perform the operation is measured.
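To make the call above self-contained, here is a minimal sketch of what the two assumed pieces might look like. Both find_connected_paths and generate_parameters are just the names used in the example; the bodies below are illustrative stand-ins, not the implementation that produced the equation above.

import random

# Hypothetical stand-in for the function under test: walks a list of
# "edges" and collects the nodes reachable from the first node.
def find_connected_paths(edges):
    reachable = {edges[0][0]} if edges else set()
    for source, target in edges:
        if source in reachable:
            reachable.add(target)
    return reachable

# Data generator: big_o calls this with an integer n and passes the
# returned value straight to find_connected_paths.
def generate_parameters(n):
    return [(random.randrange(n), random.randrange(n)) for _ in range(n)]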
The result fitted_curve will be a tuple. The first element is the best-fitted complexity class; printing it gives a string naming the type of curve, i.e., linear, quadratic, polynomial, logarithmic, etc. The best fit is determined by minimising the residuals. As you can see, there are no complex algorithms involved in this library; it is a simplistic, residual-based computation. You can get the best-fitted curve like this:
fitted_curve[0]
Based on the example above, you will get Polynomial along with the equation we mentioned earlier.
The next element in the tuple contains the other possible fittings, e.g., Exponential, Logarithmic, etc., together with their residuals. They are stored in case you want to investigate further and can be obtained by accessing index 1 of the fitted_curve variable.
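As a small illustration of working with that tuple (a sketch assuming the layout just described, where the second element maps each candidate curve to its residual), you could unpack and inspect it like this:

best, others = fitted_curve           # equivalent to fitted_curve[0], fitted_curve[1]
print(best)                           # prints the best fit, e.g. the polynomial curve above
for complexity_class, residual in others.items():
    print(complexity_class, "residual:", residual)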
If you are wondering whether this is a simplistic measurement that cannot be used on a system that, let us say, handles web requests, makes API calls, queries the database, etc., then think again. Everything can be represented as a function. Instead of the simple function we used as an example, i.e., find_connected_paths, you can invoke the function that handles the web request with appropriate parameters. You can do the same with an API by invoking it as a client, letting that function call all of its dependencies, collect the results, and compose them for us. That mimics a user sitting behind a browser or any other app trying to get something done. The resulting best-fit equation could also be called a scalability equation: it is a composite big O representation of the activities involved.
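As an illustration of that idea, the sketch below wraps a hypothetical HTTP endpoint behind a plain function and feeds it to big_o. The URL, the page_size query parameter, and the use of the requests package are all assumptions made for the example; substitute your own client code.

import requests
from big_o import big_o

# Hypothetical endpoint wrapped as the function under test.
def fetch_orders(page_size):
    response = requests.get("https://example.com/api/orders",
                            params={"page_size": page_size}, timeout=30)
    response.raise_for_status()
    return response.json()

# big_o passes the size n straight through as the request parameter.
def generate_request_size(n):
    return n

best, others = big_o(fetch_orders, generate_request_size,
                     min_n=5, max_n=50, n_measures=5, n_repeats=10, verbose=True)
print(best)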
In case you are in an early stage of analysis and design, then instead of working with the tuple and the dictionary, you could use the library's reporting helper to print a report of all the curves:
from big_o import big_o, reports

best, rest = big_o(find_connected_paths, generate_parameters, min_n=5, max_n=20,
                   n_measures=5, n_repeats=20, verbose=True)
print(reports.big_o_report(best, rest))

From this you will get a report resembling this:
Best : Polynomial: time = …
Constant: time = … (sec) (res: 0.09)
Linear: time = … (res: 0.001)

Remember, the report is a string; if you want to work with objects, you should not call the report but work with the tuple returned by big_o instead. The three dots above stand in for the fitted equations, and the full report lists the other types of curves as well.
You can then use the best curve to determine the response time at, say, a load of 100 users, or plot it in a graph to show how the response time behaves as the load grows.
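For instance, a quick plot of the published curve over a range of user counts might look like the sketch below; it assumes matplotlib is available and simply reuses the fitted coefficients from the equation above.

import matplotlib.pyplot as plt

users = range(1, 201)
response_times = [3.5e-06 * u ** 0.97 for u in users]   # the fitted curve from above

plt.plot(list(users), response_times)
plt.xlabel("Concurrent users")
plt.ylabel("Predicted response time (sec)")
plt.title("Scalability curve")
plt.show()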