Working with colleagues and moving from one job to another, I have seen a lot of code. Here are 5 undervalued Python skills.
Mastering these capabilities will — dare I say it — make you an even sexier Developer/Programmer/Coder.
5. Setting up a virtual environment
A virtual environment sets up an isolated workspace for your Python project. Whether you’re working solo or with collaborators, having a virtual environment is helpful for the following reasons:
- Avoiding package conflicts
- Providing a clear line of sight on where packages are being installed
- Ensuring consistency in the package versions used by the project
The use of a virtual environment allows you (and your teammates) to have different dependencies for different projects. Within the virtual environment, you can test install packages without polluting the system install.
Here are two commonly recommended tools, virtualenv and pipenv, and how to activate an environment with each.
# Package install
pip install virtualenv
# Create a virtual environment (python3 -m venv env also works, with no install)
virtualenv env
# Activate (macOS/Linux)
source env/bin/activate
# Package Install
pip install pipenv
# Create a new environment
pipenv install
# Activate
pipenv shell
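To confirm from inside Python that an environment is actually active, a small check like the one below can help. This is a minimal sketch, assuming an environment created with venv or a recent virtualenv; the function name in_virtual_environment is my own and not part of any library.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the interpreter is running inside a venv-style environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Inside a virtual environment:", in_virtual_environment())
print("Packages install under:", sys.prefix)
```

Running this before and after activation should show the install location switching from the system Python to the environment directory.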
4. Commenting according to PEP8
Good comments make your code easier to trust and easier to collaborate on. In Python, that means following the PEP 8 style guide.
Comments should be declarative, like:
# Fix issue with utf-8 parsing
NOT: # fixes issue
Here’s an example with a docstring, a special type of comment that is used to explain the purpose of a function:
def persuasion():
    """Attempt to get point across."""
    print('Following this advice about writing proper Python comments will make you popular at parties')
Docstrings are particularly useful because your IDE will recognize this string literal as the documentation associated with the function, class, or module it belongs to.
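The same docstring that an IDE displays is also available at runtime. Here is a minimal sketch that pairs a multi-line docstring with an inline comment; parse_utf8 is a hypothetical helper invented for illustration, not something from a real codebase.

```python
import inspect


def parse_utf8(raw: bytes) -> str:
    """Decode raw bytes as UTF-8, replacing invalid sequences.

    Hypothetical helper used only to illustrate docstring layout.
    """
    # Replace undecodable bytes instead of raising UnicodeDecodeError.
    return raw.decode("utf-8", errors="replace")


# inspect.getdoc returns the docstring with its indentation cleaned up.
print(inspect.getdoc(parse_utf8))
```

Calling help(parse_utf8) in an interactive session displays the same text.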
3. Adding visualizations
Visualizations aren’t just for business intelligence dashboards. Throwing in some helpful charts and graphs will reduce your time to insight as you investigate a new dataset.
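Even without a plotting library, a quick look at a distribution is only a few lines away. The sketch below is a dependency-free text histogram; the function name text_histogram and the sample values are made up for illustration, and in practice a library such as matplotlib would usually do this job.

```python
from collections import Counter


def text_histogram(values, bin_width=10):
    """Return one line per bin, with one '#' per value that falls in the bin."""
    # Map each value to the lower edge of its bin, then count per bin.
    counts = Counter((v // bin_width) * bin_width for v in values)
    lines = []
    for start in sorted(counts):
        label = f"{start}-{start + bin_width - 1}"
        lines.append(f"{label:<8}{'#' * counts[start]}")
    return lines


# Made-up sample data, purely for demonstration.
sample = [12, 15, 21, 22, 27, 29, 35, 38, 41]
print("\n".join(text_histogram(sample)))
```

A glance at the bars already shows where the values cluster, which is the kind of early signal a chart gives you on a new dataset.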
2. Measuring & optimizing runtime
The performance of a program should be assessed in terms of time, space, and disk use — keys to scalable performance.
Python offers profiling utilities that show where your code is spending time. To measure the runtime of a small piece of code, Python offers the timeit module.
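import timeit

import_module = "import random"
# Time the call itself; wrapping it in a def would only time the function definition
testcode = "random.randint(10, 100)"
# repeat() returns a list of timings, one per repetition
print(timeit.repeat(stmt=testcode, setup=import_module))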
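Where timeit answers "how long does this take", a profiler answers "where is the time going". The following is a minimal sketch using the standard library's cProfile and pstats modules; draw_numbers is a hypothetical workload I made up to have something to profile.

```python
import cProfile
import io
import pstats
import random


def draw_numbers(count=10_000):
    """Hypothetical workload: draw random integers between 10 and 100."""
    return [random.randint(10, 100) for _ in range(count)]


# Collect profiling data only while the workload runs.
profiler = cProfile.Profile()
profiler.enable()
draw_numbers()
profiler.disable()

# Write the five most expensive entries, by cumulative time, to a string.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

Sorting by cumulative time puts the functions that dominate the total runtime at the top, which is usually the right place to start optimizing.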
1. Understanding the main function
Using if __name__ == '__main__' provides the flexibility to write code that can be executed from the command line or imported as a package into an interactive environment. This conditional statement controls how the program will execute given the context.
You should expect that a user running your code as an executable has different goals than a user importing your code as a package. The if __name__ == '__main__' statement provides control flow based on the environment in which your code is being executed.
- __name__ is a special variable in the module’s global namespace
- Its value is a string that is set by Python
- The value of __name__ depends on the execution context
- From the command line, __name__ evaluates to '__main__', so any code in the if block will run
- Imported as a package, __name__ evaluates to the name of the module, so code in the if block will not run
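Put together, the pattern looks roughly like this. It is a minimal sketch; greet and main are hypothetical names chosen for illustration, and json stands in for any imported module.

```python
import json


def greet(name: str) -> str:
    """Build a greeting; safe to import because it has no side effects."""
    return f"Hello, {name}!"


def main() -> None:
    """Entry point, intended to run only when the file is executed as a script."""
    print(greet("world"))


# An imported module's __name__ is its module name, not '__main__'.
print(json.__name__)

if __name__ == "__main__":
    main()
```

Someone importing this file can call greet directly without triggering main, while someone running it from the command line gets the script behavior.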
0. Knowing when not to use Python
As a full-time Python programmer, sometimes I wonder if I’m overly dependent on this tool for scientific computing. Python is a delightful language. It’s straightforward and low maintenance, and its dynamic structure is well suited to the exploratory nature of data science pursuits.
Still, Python is definitely not the best tool to approach every aspect of the broadly defined machine learning workflow. For example:
- SQL is essential for ETL processes that move data into a data warehouse where it’s queryable by data analysts and data scientists