Python in science
When Python and science are mentioned in the same sentence, the usual associations are machine learning, neural networks and artificial intelligence. However, Python is also used very effectively in other scientific fields, such as physics, mathematics and astronomy.
In this article we'll talk about the sciences where using Python is a pretty great idea.
It's no secret that the technologies used to observe stars and distant space objects keep developing, like everything else. The first telescopes were small enough to be carried by one person, but they gradually grew into stationary facilities weighing many tons.
Nevertheless, measured against the centuries-long history of the telescope, purely optical instruments began to give way to electronic ones only relatively recently. As a result, the data they collect can be processed electronically, which raised the question of which programming language to choose for the job.
One of Python's undoubted advantages is its readability (many of its constructs and expressions read like ordinary English), along with a low barrier to entry. It is therefore much easier for scientists who process telescope data to learn Python to the level their work requires than to learn a language with more complex syntax and structure.
For example, here is part of a program written to analyze data from the Kepler telescope:
    def get_data(datadir='', mission='kepler', quarters=None):
        '''
        Obtain the FGS data for Kepler or K2 directly from MAST

        mission = kepler,k2
        '''
        if isinstance(quarters, int):
            quarters = [quarters]
        #Find the data for the right mission
        if mission == 'kepler':
            prefix = 'kplr'
            pref = 'q'
        if mission == 'k2':
            prefix = 'ktwo'
            pref = 'c'
        if mission != 'kepler' and mission != 'k2':
            print("Choose 'kepler' or 'k2' for mission.")
            return
As you can see, this fragment is quite simple and was most likely written by a scientist for their own research. Had it been written in C/C++ or Fortran, it would probably be harder to read, and the scientist might have to ask a developer to make changes instead of making them personally.
By the way, the Kepler telescope mentioned above is not on Earth: it is a space telescope, floating in outer space.
The project also has an open repository on GitHub, so you can help improve the code and thereby take a small part in the exploration of outer space.
If you want to learn about other astronomical and astrophysical studies that use Python, look up the talks by Jake VanderPlas.
Mathematics, the queen of the sciences, has also come a long way over the course of history, and it now operates with abstractions so complex that the human brain cannot even picture some of them (10-dimensional space, for instance).
Before we move on to Python, let's look at the problems researchers might run into with a different language, Java for example, when they need to work with multidimensional arrays. These are not part of the Java language or its standard library, but they can be implemented on top of them with reasonable effort. In fact, there are several Java implementations to choose from, and therein lies the problem. Suppose you want to use a fast Fourier transform (FFT) library built on array implementation "A" together with a linear algebra library built on array implementation "B". Tough luck: the "A" and "B" arrays have different types, so you can't pass the output of the FFT to a linear equation solver. It doesn't matter that both are based on the same abstractions, or even that the implementations have much in common. For the Java compiler the types don't match, end of story.
Python isn't completely free of this problem. It's quite possible to write Python code, or a C extension module, that expects one exact data type as input and raises an exception otherwise. But in Python code this is considered bad style, and the same goes for C modules, except where performance or compatibility with other C code requires it. Where possible, Python programmers are expected to use standard interfaces for working with data. For example, iteration and indexing work the same way for arrays and built-in lists. For operations not covered by the standard interfaces, Python programmers are expected to use methods, which are also subject to "duck typing". In practice, independently implemented Python types are much more interoperable than independently implemented Java types. In the specific case of n-dimensional arrays, Python was fortunate that a single implementation won overwhelming acceptance, for reasons that are social and historical rather than technical.
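The point about standard interfaces can be shown with a minimal sketch. The helper names `mean` and `first_and_last` are hypothetical and exist only for this illustration. Both functions rely solely on iteration, `len()` and indexing, so they accept a built-in list and a typed `array` from the standard library alike.

```python
from array import array


def mean(values):
    """Average any sized, iterable collection of numbers."""
    total = 0.0
    for v in values:              # relies only on iteration
        total += v
    return total / len(values)    # ... and on len()


def first_and_last(values):
    """Return the first and last items using plain indexing."""
    return values[0], values[-1]


# The same numbers stored in two independently implemented types.
as_list = [1.0, 2.0, 3.0, 4.0]
as_array = array("d", as_list)    # typed array of C doubles

print(mean(as_list), mean(as_array))
print(first_and_last(as_list), first_and_last(as_array))
```

Neither function checks the type of its argument. Any object that supports the same operations would work, which is what "duck typing" means in practice.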
Finally, even though Python is a pretty good choice for system integration in scientific computing, it has limitations too: combining Python code with code in other managed-memory languages, say R or Julia, takes a lot of work, and even then the result remains fragile, because it relies on tricks based on undocumented implementation details. I suspect that the only solution may be language-neutral data objects that allow unmanaged access at the byte level, as in C.
In any case, the NumPy library provides excellent tools for working with multidimensional arrays and, as mentioned earlier, a professional mathematician with little or no programming experience will find it much easier to master than comparable tools in other languages.
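To revisit the FFT and linear algebra scenario from the Java discussion, here is a minimal sketch, assuming NumPy is installed. The signal and the matrix are made up purely for illustration. It shows that the output of `numpy.fft.fft` is an ordinary `numpy.ndarray` that `numpy.linalg.solve` accepts without any conversion.

```python
import numpy as np

# Synthetic data (an assumption for this example): 8 samples of a sine wave.
t = np.arange(8)
signal = np.sin(2 * np.pi * t / 8)

# The FFT returns a plain numpy.ndarray of complex numbers ...
spectrum = np.fft.fft(signal)

# ... which goes straight into a linear solver. We solve A x = spectrum
# for an arbitrary, easily invertible diagonal matrix A.
A = np.diag(np.arange(1, 9)).astype(complex)
x = np.linalg.solve(A, spectrum)

# Both results share the same array type, and the solution checks out.
print(type(spectrum) is type(x))
print(np.allclose(A @ x, spectrum))
```

Because both routines work with the same array type, no glue code is needed between the two steps.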
If the analysis also calls for visualization (for example, plotting function graphs with local minima and maxima marked), Python offers the Matplotlib library, which is well up to the task.
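As a rough sketch of such a plot, the code below finds the local extrema of a sampled function with NumPy and marks them with Matplotlib. The function f(x) = x^3 - 3x, the helper names `local_extrema` and `plot_with_extrema`, and the output file name are all assumptions chosen for this example. The extrema are detected by a simple change in slope sign, which does not handle flat plateaus.

```python
import numpy as np


def local_extrema(y):
    """Return index arrays (minima, maxima) of interior local extrema.

    Uses sign changes of the discrete slope; plateaus are not handled.
    """
    slope_sign = np.sign(np.diff(y))
    turning = np.diff(slope_sign)
    maxima = np.where(turning < 0)[0] + 1   # rising, then falling
    minima = np.where(turning > 0)[0] + 1   # falling, then rising
    return minima, maxima


def plot_with_extrema(x, y, path="extrema.png"):
    """Plot y(x), mark its local extrema, and save the figure to `path`."""
    # Imported here so the numeric helper above works without Matplotlib.
    import matplotlib
    matplotlib.use("Agg")                   # render to a file, no display needed
    import matplotlib.pyplot as plt

    minima, maxima = local_extrema(y)
    fig, ax = plt.subplots()
    ax.plot(x, y, label="f(x) = x^3 - 3x")
    ax.scatter(x[maxima], y[maxima], color="red", label="local maxima")
    ax.scatter(x[minima], y[minima], color="green", label="local minima")
    ax.set_xlabel("x")
    ax.set_ylabel("f(x)")
    ax.legend()
    fig.savefig(path)
    plt.close(fig)


# Sample the example function on a regular grid.
x = np.linspace(-2.5, 2.5, 501)
y = x**3 - 3 * x

minima, maxima = local_extrema(y)
print("local minima at x =", x[minima])
print("local maxima at x =", x[maxima])
```

Calling `plot_with_extrema(x, y)` would then write the annotated graph to `extrema.png`.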
Biology is a broad field that includes medical research as well as projects in genetic engineering and other cutting-edge areas. As you can imagine, many experiments cannot be carried out in practice, for financial, ethical or purely technical reasons. In such cases, tools like TensorFlow and SciPy can come to the rescue.
They can help researchers analyze disease incidence figures in a given region in order to spot an emerging epidemic early and contain it before it has serious consequences.
High-quality, well-designed simulations can also replace some experiments on animals or some tests of new drugs on volunteers. In addition, some recent studies report that specially trained neural networks can identify the early stages of cancer more accurately than doctors with many years of experience. This could save a great many lives, because the biggest danger in cancer treatment is late detection (which, unfortunately, happens very often).
Prosthetics deserves a separate mention. There are two areas in which Python can greatly ease researchers' work. The first is the analysis of prosthesis strength, which is critically important: many prostheses help people with injuries lead normal lives, and a breakdown, especially far from civilization, can be fatal. Here, simulating how objects interact can help, replacing real-world crash tests and saving the material needed to make test samples.
The second, equally important area is the analysis of biofilm behavior (a biofilm is a bacterial colony on the surface of a prosthesis). Statistics on how these microorganisms react to various environmental influences can be gathered, then analyzed and extrapolated using, for example, the pandas library, which saves a great deal of time and money.
It's also worth mentioning Biopython, a tool intended for scientists working in molecular biology, and the Python for Biologists website, which helps modern biologists learn the basics of programming and make their research more fruitful.
Humankind has always been interested in predicting the future, but at the moment the only thing that can be predicted more or less accurately (less, if you ask me) is the weather. Even these imperfect forecasts require enormous effort: manufacturing and launching weather balloons, building meteorological stations, and so on. It could be more efficient to collect less information but process it with more advanced tools, such as the Prophet library (although we mention it here, its scope is much wider and includes finance, economics, political processes and linguistics).
Python can also be used by scientists who study large-scale issues such as global warming, the melting of ice at the North and South Poles, sea-level rise, and changes in ocean currents and winds.
It's safe to say that Python's popularity is growing and, given how wide its range of applications is, it will keep growing. Our list of sciences that use Python is far from complete, but you can already see that the language is firmly embedded in the work of scientists.
Have you developed programs that are used in scientific research, or do you know of examples where using Python for data processing and analysis significantly increased the effectiveness of a research project?