7 Strategies For Optimizing Your Code
When people start writing their first programs, their main concern is whether the code works as intended. As their skills grow, that is no longer enough: they start improving their code for their own readability, for other programmers' understanding, and for performance.
In this article we'll review 7 strategies for optimizing Python code for performance, so that a program runs faster, uses less memory, or is more efficient in other respects.
1. Use PyPy
PyPy is a Python interpreter that serves as an alternative to the standard CPython. Its just-in-time compiler can make long-running, pure-Python code considerably faster. In a previous article about translating code from Python 2 to Python 3, we mentioned that many projects still use the second version of the language. Luckily for them, at the time of writing PyPy supports both the second version (2.7.13) and the third (3.5.3).
PyPy developers themselves say that there are two situations in which the effectiveness of this interpreter won’t be noticeable:
- short-running processes (less than a few seconds);
- programs that spend a large part of their run time in run-time libraries rather than in the direct execution of Python code.
Thus, PyPy is a good choice when a program runs for a long time and a significant part of that time is spent executing Python code itself.
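To check whether a workload fits this profile, one option is to time a CPU-bound, pure-Python function under both interpreters. The sketch below is only an illustration: the function name count_primes and the limit of 20000 are made up for this example and do not come from the PyPy documentation.

```python
import platform
import time


def count_primes(limit):
    """Count primes below `limit` using plain Python loops (no C extensions)."""
    count = 0
    for n in range(2, limit):
        is_prime = True
        for d in range(2, int(n ** 0.5) + 1):
            if n % d == 0:
                is_prime = False
                break
        if is_prime:
            count += 1
    return count


if __name__ == "__main__":
    start = time.perf_counter()
    total = count_primes(20000)
    elapsed = time.perf_counter() - start
    # platform.python_implementation() reports e.g. "CPython" or "PyPy"
    print(platform.python_implementation(), total, "%.3fs" % elapsed)
```

Running the same file with each interpreter gives a rough first impression. A very short run like this one may show little benefit, for the reasons listed above.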
2. Use Numba
The Numba library provides a JIT (just-in-time) compiler that translates functions written in Python into machine code whose performance is comparable to C or Fortran code (which in turn can be 100 to 10,000 times faster than Python, depending on the test). A distinctive feature of Numba is that it can compile and run Python code not only on the CPU but also on the GPU, while the program itself is still written in Python style.
Below you can see an example of a program written using Numba.
    from numba import jit
    from numpy import arange

    # The jit decorator tells Numba to compile this function.
    # The argument types will be inferred by Numba when the function is called.
    @jit
    def sum2d(arr):
        M, N = arr.shape
        result = 0.0
        for i in range(M):
            for j in range(N):
                result += arr[i,j]
        return result

    a = arange(9).reshape(3,3)
    print(sum2d(a))
3. Use generators
Generators help reduce the memory required to run a program, since they let you write functions that return one item at a time rather than the entire set of items at once, as happens, for example, with a list comprehension. A good example is a situation where you create a large list of numbers and add them together or raise each number in the list to a power.
In practice, it’ll look like this:
    # Initialize the list
    my_list = [1, 3, 6, 10]
    a = (x**2 for x in my_list)
Instead of using a list comprehension (a = [x**2 for x in my_list]) and creating a list of all the elements at once, we create a generator object that produces individual elements on demand.
    next(a) # output: 1
    next(a) # output: 9
    next(a) # output: 36
    next(a) # output: 100
    next(a) # raises StopIteration
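The memory difference can be observed with sys.getsizeof. This is a minimal sketch, assuming a range of 100000 numbers chosen only for illustration. Note that sys.getsizeof reports the size of the container object itself, not of the items it refers to, so the real gap is even larger.

```python
import sys

n = 100000

squares_list = [x**2 for x in range(n)]   # all items are built up front
squares_gen = (x**2 for x in range(n))    # items are produced on demand

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(gen_size < list_size)   # True

# sum() pulls items from the generator one at a time.
total = sum(squares_gen)
print(total == sum(squares_list))   # True

# A generator can be consumed only once: it is now exhausted.
print(next(squares_gen, None))   # None
```

The trade-off is that a generator cannot be indexed or iterated a second time. If the values are needed more than once, a list may still be the better choice.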
4. Use sorting with keys
When you sort items in a list, try to use the key argument of the standard sort() method wherever possible. This recommendation works equally well for numbers and strings.
Below you can see an example of using keys to sort the list of tuples:
    import operator

    somelist = [(1, 5, 8), (6, 2, 4), (9, 7, 5)]

    somelist.sort(key=operator.itemgetter(0))
    somelist #output = [(1, 5, 8), (6, 2, 4), (9, 7, 5)]

    somelist.sort(key=operator.itemgetter(1))
    somelist #output = [(6, 2, 4), (1, 5, 8), (9, 7, 5)]

    somelist.sort(key=operator.itemgetter(2))
    somelist #output = [(6, 2, 4), (9, 7, 5), (1, 5, 8)]
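The same key mechanism applies to strings. As a small sketch with made-up data, str.lower gives a case-insensitive sort and len sorts by length. Here sorted() is used, which returns a new list, while sort() works in place.

```python
words = ["banana", "Cherry", "apple", "fig"]

# Case-insensitive sort: the key function is called once per item.
by_name = sorted(words, key=str.lower)

# Sort by length; Python's sort is stable, so ties keep their original order.
by_length = sorted(words, key=len)

print(by_name)    # ['apple', 'banana', 'Cherry', 'fig']
print(by_length)  # ['fig', 'apple', 'banana', 'Cherry']
```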
5. Avoid unnecessary loops
As you know, an excessive number of loops in any programming language is a bad thing: it puts unnecessary load on the machine where the program is running. Small tricks, like storing a list's length in a separate variable instead of recomputing it on each iteration of the loop, can be the first step on the long road of optimizing code performance and, in the end, making it much more efficient.
Also try to use set intersections and unions where possible.
For example, instead of using the following loop to find the common elements in a pair of collections:
    for x in a:
        for y in b:
            if x == y:
                yield (x,y)
It’s much more efficient to use this code:
    return set(a) & set(b)
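Here is a runnable version of this comparison, with the two approaches wrapped in helper functions. The names common_nested and common_sets are chosen for this sketch only. It assumes the elements are hashable, which set operations require.

```python
def common_nested(a, b):
    """Compare every pair: about len(a) * len(b) comparisons."""
    result = []
    for x in a:
        for y in b:
            if x == y:
                result.append(x)
    return result


def common_sets(a, b):
    """Set intersection: on average about len(a) + len(b) hash operations."""
    return set(a) & set(b)


a = [1, 2, 3, 4, 5]
b = [4, 5, 6, 7]

print(sorted(common_nested(a, b)))  # [4, 5]
print(sorted(common_sets(a, b)))    # [4, 5]
```

Keep in mind that a set drops duplicates and does not preserve order, so the two versions are interchangeable only when neither matters.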
6. Don’t join strings directly
Perhaps this advice needs some explanation. We’ll consider it in the examples after a small theoretical introduction.
Strings in Python are an immutable data type. This fact often confuses novice programmers who are still learning the language. Immutability has advantages and disadvantages. Among the advantages: strings can be used as keys in dictionaries, and a single copy of a string can be shared between different variables.
The downside is that you can’t say something like "replace all 'a' with 'b' in all input strings" and expect the substitutions to happen directly in the strings passed as input. Instead, you need to create a new string with the specified properties. This repeated creation of new strings can make Python programs very inefficient.
Try to avoid this code:
    s = ''
    for substring in substrings:
        s += substring
Use this instead:
    s = ''.join(substrings)
The first version is a very common and extremely undesirable mistake in situations where a large string has to be built.
Here are a few more examples of how you should and shouldn’t work with strings.
Avoid:
    s = ''
    for x in somelist:
        s += some_function(x)
Use instead:
    slist = [some_function(elt) for elt in somelist]
    s = ''.join(slist)
Avoid:
    out = '' + head + prologue + query + tail + ''
It’ll be better to use:
    out = '%s%s%s%s' % (head, prologue, query, tail)
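To see the difference on your own machine, the two ways of building a string can be timed with the standard timeit module. This is a rough sketch: the helper names, the 10000 substrings, and the repeat count are arbitrary choices for illustration. Timings vary between interpreters and versions, and CPython can optimize some += cases, so no expected numbers are given.

```python
import timeit

parts = ["chunk%d" % i for i in range(10000)]


def build_with_plus(items):
    s = ""
    for item in items:
        s += item  # may create a new string object on each iteration
    return s


def build_with_join(items):
    return "".join(items)  # builds the result in a single call


# Both approaches must produce the same string.
assert build_with_plus(parts) == build_with_join(parts)

print("+=  :", timeit.timeit(lambda: build_with_plus(parts), number=100))
print("join:", timeit.timeit(lambda: build_with_join(parts), number=100))
```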
7. Maximize the use of functions
If you have a choice between putting a piece of logic inside a function or writing it at the top level of a program, it’s much better to put it inside a function. This approach has several advantages. First, the code is easier to reuse: if you wrote a function that does something useful and the same functionality is needed in another project, transferring the code won’t be a problem. Second, this approach works better with TDD (test-driven development). Finally, it simply lets your programs run faster. This is due to the way CPython works: accessing local variables is faster than accessing global ones. If you want to know the details of why it works like this, and not the other way round, we recommend that you read this discussion.
As a result, the code:
    def main():
        for i in range(10**8):
            pass

    main()
will run faster than similar code outside a function:
    for i in range(10**8):
        pass
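The difference between these two versions can be measured with timeit. This is a hedged sketch: the loop count is reduced to 10**6 so that it finishes quickly, and the names global_code, run_global and run_local are invented for this example. The module-level version is run through exec in a fresh namespace, so that the loop variable really is a global. Actual timings depend on the interpreter and its version.

```python
import timeit

N = 10**6

# Module-level loop: `i` is written to the namespace dictionary on every iteration.
global_code = compile("for i in range(N):\n    pass\n", "<global>", "exec")


def run_global():
    exec(global_code, {"N": N})


def run_local():
    # Inside a function `i` is a local variable, which CPython stores in a fast slot.
    for i in range(N):
        pass


print("global:", timeit.timeit(run_global, number=10))
print("local :", timeit.timeit(run_local, number=10))
```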
To make the most of this advice, break your program into parts that are as small as possible, and use the following pattern:
    def solution(args):
        # write the code
        pass

    def main():
        # write the input logic to take the input from STDIN
        input_args = ""
        solution(input_args)

    if __name__ == "__main__":
        main()
As you can see, there are various ways to make your Python code run faster. Although Python is considered slow in comparison with other languages (for example, C, C++, Java), there is still plenty of room to improve its performance.
Tell us, what program optimization methods do you use in your projects?
Welcome to CheckiO - games for coders where you can improve your coding skills.
The main idea behind these games is to give you the opportunity to learn by exchanging experience with the rest of the community. Every day we are trying to find interesting solutions for you to help you become a better coder.