Introduction to Profiling
- Profiling is a relatively quick way to analyse where time is being spent, and where the bottlenecks are, during a program’s execution.
- Code should be profiled when ready for deployment if it will be running for more than a few minutes during its lifetime.
- There are several types of profiler, each with a slightly different purpose:
  - function-level: `cProfile` (visualised with `snakeviz`)
  - line-level: `line_profiler`
  - timeline: `viztracer`
  - hardware-metric
- A representative test-case should be profiled: one that is large enough to amplify any bottlenecks, yet small enough to execute to completion quickly.
Function Level Profiling
- A Python program can be function-level profiled with `cProfile` via `python -m cProfile -o <output file> <script name> <arguments>`.
- The output file from `cProfile` can be visualised with `snakeviz` via `python -m snakeviz <output file>`.
- Function-level profiling output displays the nested call hierarchy, listing for each function both the cumulative time and the total time excluding sub-functions.
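As a complement to the command-line usage above, the following is a minimal sketch of collecting the same function-level data from inside a script with the standard library's `cProfile` and `pstats` modules. The functions `inner`, `outer` and `profile_outer` are hypothetical names invented for this illustration. In the printed report, `cumtime` is the cumulative time and `tottime` is the time excluding sub-functions.

```python
import cProfile
import io
import pstats


def inner(n):
    # Hypothetical workload: sum of squares below n
    return sum(i * i for i in range(n))


def outer():
    # Calls inner() repeatedly, so inner() appears beneath outer()
    # in the call hierarchy
    return [inner(10_000) for _ in range(50)]


def profile_outer():
    # Collect function-level timings for a single call to outer()
    profiler = cProfile.Profile()
    profiler.enable()
    result = outer()
    profiler.disable()

    # Format the ten most expensive entries, sorted by cumulative time
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


if __name__ == "__main__":
    _, report = profile_outer()
    print(report)
```

Here `outer` should show a large `cumtime` but a small `tottime`, because almost all of its time is spent inside `inner`.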
Break
Line Level Profiling
- Specific methods can be line-level profiled if decorated with `@profile`, which is imported from `line_profiler`.
- `kernprof` executes `line_profiler` via `python -m kernprof -lvr <script name> <arguments>`.
- Code in global scope must be wrapped in a method if it is to be profiled with `line_profiler`.
- The output from `line_profiler` lists the absolute and relative time spent per line for each targeted function.
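The points above can be sketched as a small script prepared for line-level profiling. The function names and workload are hypothetical. The `try`/`except` fallback is an assumption added for this sketch so that the script still runs where `line_profiler` is not installed (or is too old to export `profile`); it is not required by `line_profiler` itself.

```python
# Fall back to a do-nothing decorator if line_profiler is unavailable
# (an assumption made for this sketch, so the script always runs).
try:
    from line_profiler import profile
except ImportError:
    def profile(func):
        return func


@profile
def count_multiples(limit, divisor):
    # Hypothetical workload: each line below gets its own row in the report
    count = 0
    for value in range(limit):
        if value % divisor == 0:
            count += 1
    return count


@profile
def main():
    # Code that would otherwise sit in global scope is wrapped in a
    # function so that it can be decorated and profiled too.
    result = count_multiples(100_000, 7)
    print(result)
    return result


if __name__ == "__main__":
    main()
```

Running this file through the `kernprof` command given above should produce a per-line table for both decorated functions.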
Profiling Conclusion
What profiling is:
- The collection and analysis of metrics relating to the performance of a program during execution.
Why programmers can benefit from profiling:
- It narrows down the costly areas of code, allowing optimisation to be prioritised or judged unnecessary.
When to Profile:
- Profiling should be performed on functional code, either when concerned about performance or prior to release/deployment.
What to Profile:
- Collecting profiling metrics will often slow the execution of code, so the test-case should be narrow whilst remaining representative of a realistic run.
How to function-level profile:
- Execute `cProfile` via `python -m cProfile -o <output file> <script name> <arguments>`
- Execute `snakeviz` via `python -m snakeviz <output file>`
How to line-level profile:
- Import `profile` from `line_profiler`
- Decorate targeted methods with `@profile`
- Execute `line_profiler` via `python -m kernprof -lvr <script name> <arguments>`
Introduction to Optimisation
- The knowledge necessary to perform high-level optimisations of code is largely transferable between programming languages.
- When considering optimisation it is important to focus on the potential impact on both the performance and the maintainability of the code.
- Many high-level optimisations should be considered good-practice.
Data Structures & Algorithms
- List comprehension should be preferred when constructing lists.
- Where appropriate, tuples should be preferred over Python lists.
- Dictionaries and sets are appropriate for storing a collection of unique data with no intrinsic order for random access.
- When used appropriately, dictionaries and sets are significantly faster than lists.
- If searching a list or array is required, it should be sorted and searched using `bisect_left()` (binary search).
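A minimal sketch of three of the points above: building a list with a comprehension, testing membership with a set, and searching a sorted list with `bisect_left()`. The helper `contains_sorted` and the data are hypothetical, chosen only for illustration.

```python
from bisect import bisect_left


def contains_sorted(sorted_items, target):
    """Binary-search membership test; sorted_items must already be sorted."""
    index = bisect_left(sorted_items, target)
    return index < len(sorted_items) and sorted_items[index] == target


# A list comprehension constructs the list in a single expression
evens = [i for i in range(0, 200_000, 2)]  # already in ascending order

# A set stores unique values and supports fast membership tests
evens_set = set(evens)

print(contains_sorted(evens, 123_456))  # binary search on the sorted list
print(123_456 in evens_set)             # hash lookup in the set
print(123_456 in evens)                 # linear scan of the list, for comparison
```

All three checks give the same answer; they differ in how much work is needed to reach it, which is where profiling would show the gap on large inputs.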
Break
Understanding Python (NumPy/Pandas)
- Python is an interpreted language; this adds an additional overhead at runtime to the execution of Python code. Many core Python and NumPy functions are implemented in faster C/C++, free from this overhead.
- NumPy can take advantage of vectorisation to process arrays, which can greatly improve performance.
- Pandas’ data tables store columns as arrays, so operations applied to columns can take advantage of NumPy’s vectorisation.
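To illustrate the difference, here is a minimal sketch comparing an explicit Python loop with the equivalent whole-array NumPy expression. It assumes NumPy is installed; the function names and data are hypothetical.

```python
import numpy as np


def sum_of_squares_loop(values):
    # Every iteration is executed by the Python interpreter
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_of_squares_vectorised(values):
    # The multiplication and the sum both run inside NumPy's compiled code
    return float(np.sum(values * values))


data = np.arange(100_000, dtype=np.float64)

print(sum_of_squares_loop(data))
print(sum_of_squares_vectorised(data))
```

Both functions compute the same value. Timing or profiling them on your own machine is the way to confirm how large the difference is for your data.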
Keep Python & Packages up to Date
- Where feasible, the latest version of Python and packages should be used as they can include significant free improvements to the performance of your code.
- There is a risk that updating Python or packages will not be possible due to version incompatibilities, or will require breaking changes to your code.
- Changes to packages may affect the results output by your code; ensure you have a method of validation ready before attempting upgrades.
Understanding Memory
- Sequential accesses to memory (RAM or disk) will be faster than random or scattered accesses.
  - This is not always natively possible in Python without the use of packages such as NumPy and Pandas.
- One large file is preferable to many small files.
- Memory allocation is not free; avoiding destroying and recreating objects can improve performance.
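As one possible illustration of the last point, the sketch below contrasts a loop that allocates new arrays on every iteration with one that reuses arrays allocated once. It assumes NumPy is installed; the function names are hypothetical, and whether the reuse is measurably faster depends on array size and should be confirmed by profiling.

```python
import numpy as np


def accumulate_fresh(steps, size):
    total = np.zeros(size)
    for step in range(steps):
        increment = np.full(size, float(step))  # new array every iteration
        total = total + increment               # and another new array here
    return total


def accumulate_reuse(steps, size):
    total = np.zeros(size)
    increment = np.empty(size)                  # allocated once, outside the loop
    for step in range(steps):
        increment.fill(float(step))             # overwrite the existing array
        np.add(total, increment, out=total)     # write the result in place
    return total


print(accumulate_fresh(10, 5))
print(accumulate_reuse(10, 5))
```

The two functions return equal arrays; the second simply avoids destroying and recreating temporary arrays inside the loop.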
Optimisation Conclusion
- Data Structures & Algorithms
  - List comprehension should be preferred when constructing lists.
  - Where appropriate, tuples and generator functions should be preferred over Python lists.
  - Dictionaries and sets are appropriate for storing a collection of unique data with no intrinsic order for random access.
  - When used appropriately, dictionaries and sets are significantly faster than lists.
  - If searching a list or array is required, it should be sorted and searched using `bisect_left()` (binary search).
- Minimise Python Written
  - Python is an interpreted language; this adds an additional overhead at runtime to the execution of Python code. Many core Python and NumPy functions are implemented in faster C/C++, free from this overhead.
  - NumPy can take advantage of vectorisation to process arrays, which can greatly improve performance.
  - Pandas’ data tables store columns as arrays, so operations applied to columns can take advantage of NumPy’s vectorisation.
- Newer is Often Faster
  - Where feasible, the latest version of Python and packages should be used as they can include significant free improvements to the performance of your code.
  - There is a risk that updating Python or packages will not be possible due to version incompatibilities, or will require breaking changes to your code.
  - Changes to packages may affect the results output by your code; ensure you have a method of validation ready before attempting upgrades.
- How the Computer Hardware Affects Performance
  - Sequential accesses to memory (RAM or disk) will be faster than random or scattered accesses.
    - This is not always natively possible in Python without the use of packages such as NumPy and Pandas.
  - One large file is preferable to many small files.
  - Memory allocation is not free; avoiding destroying and recreating objects can improve performance.