
SE Can't Code

A Tokyo based Software Engineer. Not System Engineer :(

Python Best Practice.

In my career as a software engineer, Python is the language I have the most experience with. The same goes for my private projects: I've written a search engine on an XML-RPC server, a web application using Django, a blog system using Flask, and so on. I've built up some knowledge of Python along the way, so I'm going to introduce some Python best practices.

Use exceptions instead of returning None.

Python programmers often return None to signal a special case, so None ends up carrying a special meaning.

def divide(a, b):
  try:
    return a / b
  except ZeroDivisionError:
    return None

result = divide(x, y)
if result is None:
  print("Invalid inputs")

But None is dangerous because it is easy to misjudge in a condition. In Python, None evaluates to False in a boolean context, just like other falsy values (e.g. zero and the empty string). So if the caller checks "if not result" instead of "if result is None", a valid result such as 0 is treated as an error. To avoid this kind of bug, raise an exception instead.

def divide(a, b):
  try:
    return a / b
  except ZeroDivisionError as e:
    raise ValueError("Invalid inputs") from e

x, y = 5, 2
try:
  result = divide(x, y)
except ValueError:
  print("Invalid inputs")
else:
  print("Result is %.1f" % result)

>>>
Result is 2.5

Instead of returning None, raise an exception to signal the special condition and let the caller handle it.
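
To make the pitfall concrete, here is a minimal sketch. The names divide_or_none and divide_or_raise are made up for this illustration and are not part of the code above. It shows a valid result of zero being mistaken for an error when the caller tests truthiness instead of comparing with None.

```python
def divide_or_none(a, b):
    """Return a / b, or None when b is zero (the fragile style)."""
    try:
        return a / b
    except ZeroDivisionError:
        return None


def divide_or_raise(a, b):
    """Return a / b, or raise ValueError when b is zero."""
    try:
        return a / b
    except ZeroDivisionError as e:
        raise ValueError("Invalid inputs") from e


# 0 / 5 is a valid division whose result is 0.0, which is falsy like None.
result = divide_or_none(0, 5)
wrongly_rejected = not result        # True: 0.0 is treated like an error
correctly_checked = result is None   # False: the result is a real value

# With exceptions there is no special value for the caller to misread.
try:
    divide_or_raise(1, 0)
    raised = False
except ValueError:
    raised = True

print(wrongly_rejected, correctly_checked, raised)
```

The exception version leaves no room for this mistake, because every returned value is a real result.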


The difference between bytes, str and unicode in Python.

In Python 3, there are two types for character sequences: bytes and str. An instance of bytes contains raw 8-bit values, and an instance of str contains Unicode characters. In Python 2, the two types are str and unicode. In contrast to Python 3, an instance of str contains raw 8-bit values and an instance of unicode contains Unicode characters. It is important to know that Python 3's str and Python 2's unicode have no binary encoding associated with them. So we have to encode to turn Unicode characters into binary data, and decode to turn binary data into Unicode characters. Helper functions that check the type of the input and convert it are useful here.
In Python 3, we need one function that takes a str or bytes and always returns a str, and another that always returns bytes.

def to_str(bytes_or_str):
  if isinstance(bytes_or_str, bytes):
    value = bytes_or_str.decode("utf-8")
  else:
    value = bytes_or_str
  return value

def to_bytes(bytes_or_str):
  if isinstance(bytes_or_str, str):
    value = bytes_or_str.encode("utf-8")
  else:
    value = bytes_or_str
  return value

In Python 2, we need one function that takes a str or unicode and always returns a unicode, and another that always returns a str.

def to_unicode(unicode_or_str):
  if isinstance(unicode_or_str, str):
    value = unicode_or_str.decode("utf-8")
  else:
    value = unicode_or_str
  return value

def to_str(unicode_or_str):
  if isinstance(unicode_or_str, unicode):
    value = unicode_or_str.encode("utf-8")
  else:
    value = unicode_or_str
  return value
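
As a quick check of why such helpers matter, here is a small Python 3 sketch. It repeats compact versions of the two Python 3 helpers so that it runs on its own, and the sample strings are only illustrative. It shows that str and bytes cannot be mixed directly, and that the same text has a different length as characters and as UTF-8 bytes.

```python
def to_str(bytes_or_str):
    """Always return str, decoding bytes as UTF-8."""
    if isinstance(bytes_or_str, bytes):
        return bytes_or_str.decode("utf-8")
    return bytes_or_str


def to_bytes(bytes_or_str):
    """Always return bytes, encoding str as UTF-8."""
    if isinstance(bytes_or_str, str):
        return bytes_or_str.encode("utf-8")
    return bytes_or_str


# Mixing the two types directly fails in Python 3.
try:
    "caf\u00e9" + b"!"
    mixed_ok = True
except TypeError:
    mixed_ok = False

text = to_str(b"caf\xc3\xa9")   # UTF-8 bytes -> str
raw = to_bytes("caf\u00e9")     # str -> UTF-8 bytes

print(mixed_ok)    # False
print(len(text))   # 4 characters
print(len(raw))    # 5 bytes, because the accented "e" takes two bytes in UTF-8
```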


How to correct PYTHONPATH.

When you install a new Python library or a new Python version (e.g. moving from version 2 to 3), you may want to change PYTHONPATH permanently. You can check the module search path, which includes the PYTHONPATH entries, as below:

>>> import sys
>>> print(sys.path)
['', '/var/lib/:/usr/local/lib/python2.7/dist-packages/', '/home/sotoshigoto/anaconda3/bin', '/usr/local/heroku/bin', '/usr/local/sbin', '/usr/local/bin', '/usr/sbin', '/usr/bin', '/sbin', '/bin', '/usr/lib/jvm/java-8-oracle/bin', '/usr/lib/jvm/java-8-oracle/db/bin', '/usr/lib/jvm/java-8-oracle/jre/bin', '/home/sotoshigoto/anaconda3/lib/python3.5', '/home/sotoshigoto/anaconda3/lib/python3.5/site-packages']

So, you can remove /var/lib/:/usr/local/lib/python2.7/dist-packages/ from the path as below:

>>> sys.path.remove('/var/lib/:/usr/local/lib/python2.7/dist-packages/')
>>> print(sys.path)
['', '/home/sotoshigoto/anaconda3/bin', '/usr/local/heroku/bin', '/usr/local/sbin', '/usr/local/bin', '/usr/sbin', '/usr/bin', '/sbin', '/bin', '/usr/lib/jvm/java-8-oracle/bin', '/usr/lib/jvm/java-8-oracle/db/bin', '/usr/lib/jvm/java-8-oracle/jre/bin', '/home/sotoshigoto/anaconda3/lib/python3.5', '/home/sotoshigoto/anaconda3/lib/python3.5/site-packages']

But this is not permanent. Once you exit the Python interpreter and start it again, the path you removed is back. To remove it permanently, follow the process below.
First, print the current value of PYTHONPATH from the terminal.

$ env | grep PYTHONPATH

Then, export your path and manually remove anything you no longer need:

$ export PYTHONPATH=[this is where you paste the corrected paths, no square brackets needed]

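An export typed in the terminal only lasts for the current shell session: if you haven't modified anything in .bashrc, closing and reopening your session brings back the old value. To make the change permanent, put the export line in .bashrc.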

If you use Anaconda, a distribution that bundles Python with many libraries, PYTHONPATH trouble can show up in another way. When you install a library with a pip that does not belong to Anaconda, the import fails, because the library was installed for a different Python installation than the Anaconda one you are running. So you have to use Anaconda's pip by running the command below:

$ python -m pip install (name of library)

This runs the pip that belongs to the python on your path, which solves this installation problem.
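
When pip and python seem to disagree, it can help to ask the interpreter itself where it lives and where it looks for modules. The following is a small diagnostic sketch using only the standard library. The variable name site_dirs and the filter on "site-packages" and "dist-packages" are my own choices for illustration.

```python
import os
import sys

# The interpreter that is actually running this script.
print("interpreter:", sys.executable)

# The raw PYTHONPATH environment variable (may be unset).
print("PYTHONPATH :", os.environ.get("PYTHONPATH", "<not set>"))

# Entries on sys.path that look like package install locations.
site_dirs = [p for p in sys.path
             if p.endswith(("site-packages", "dist-packages"))]
for p in site_dirs:
    print("packages   :", p)
```

If the interpreter path is not under your Anaconda directory, pip and python are probably pointing at different installations.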


For big list comprehensions, consider generator expressions.

The problem with list comprehensions is that they can consume so much memory that the program crashes when the input data is large. For example, say you open a file and return the number of characters in each line. If the file is very large, using a list comprehension is a serious problem, because the lengths of all the lines have to be held in memory at once.

value = [len(x) for x in open('my_file.txt')]
print(value)

>>>
[100, 57, 15, 13, 67, 33, 82, 98, 89, 111, 342, 211...]

Python provides generator expressions to solve this problem. A generator expression doesn't build the whole output sequence; it evaluates to an iterator that yields the values one at a time.

it = (len(x) for x in open('my_file.txt'))
print(it)

>>>
<generator object <genexpr> at 0x101b81480>

So we use next() to advance one step at a time. This keeps a large input from exhausting memory.

print(next(it))
print(next(it))

>>>
100
57

Another merit of generator expressions is that they can be combined. The output of one generator can be given as the input to another, and chained generators like this execute very quickly.

roots = ((x, x**0.5) for x in it)
print(next(roots))

>>>
(15, 3.872983346207417)
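
The same idea can be tried without a real file. The sketch below uses io.StringIO as a stand-in for my_file.txt (an assumption made only so the example is self-contained) and compares the size of a list with the size of a generator object using sys.getsizeof.

```python
import io
import sys

# Stand-in for a file; io.StringIO iterates line by line like open() does.
fake_file = io.StringIO("alpha\nbe\ngamma ray\n")

lengths = (len(line.rstrip("\n")) for line in fake_file)  # no line read yet
roots = ((n, n ** 0.5) for n in lengths)                  # chained, still lazy

first = next(roots)   # reads only the first line
rest = list(roots)    # consumes the remaining lines

print(first)
print(rest)

# A list stores a reference to every element; a generator object stays small.
as_list = [n for n in range(100000)]
as_gen = (n for n in range(100000))
print(sys.getsizeof(as_list) > sys.getsizeof(as_gen))  # True
```

Keep in mind that a generator can only be consumed once. After the values have been read, iterating again yields nothing.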


Star arguments make complex code simple and clear.

Some calls need extra arguments and some don't. So the code below, which logs debug information, ends up being called with a useless empty list:

def log(message, values):
  if not values:
    print(message)
  else:
    values_str = ", ".join(str(x) for x in values)
    print("%s: %s" % (message, values_str))

log("My numbers are", [1, 2])
log("Hi there", [])

>>>
My numbers are: 1, 2
Hi there

It isn't cool to pass an empty list as an argument. In Python, a star argument (*values) makes the extra arguments optional, so we can drop the wasted argument.

def log(message, *values):
  if not values:
    print(message)
  else:
    values_str = ", ".join(str(x) for x in values)
    print("%s: %s" % (message, values_str))

log("My numbers are", 1, 2)
log("Hi there")

>>>
My numbers are: 1, 2
Hi there

It's so cool!
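
If the values already live in a list, the star can also be used at the call site to unpack them. The sketch below is a variation of log that returns the line instead of printing it, purely so the result is easy to inspect. The list name favorites is made up for the example.

```python
def log(message, *values):
    """Build the log line; extra positional arguments are optional."""
    if not values:
        return message
    values_str = ", ".join(str(x) for x in values)
    return "%s: %s" % (message, values_str)


favorites = [7, 33, 99]

unpacked = log("Favorite numbers", *favorites)  # same as log(..., 7, 33, 99)
as_single = log("Favorite numbers", favorites)  # the whole list is one value

print(unpacked)    # Favorite numbers: 7, 33, 99
print(as_single)   # Favorite numbers: [7, 33, 99]
```

Forgetting the star at the call site is an easy mistake to make, since the call still works and only the output differs.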


Use zip to process iterators in parallel.

In Python, the built-in zip function combines two or more iterators. In Python 3, zip is a lazy generator that takes the next value from each iterator and yields them as a tuple. Compared with indexing into several lists using subscripts, the code looks much cleaner.

names = ["Cecilia", "Lisa", "Marie"]
letters = [len(n) for n in names]

longest_name = None
max_letters = 0

for name, count in zip(names, letters):
  if count > max_letters:
    longest_name = name
    max_letters = count

print(longest_name)

>>>
Cecilia

There are two problems. First, zip is not a generator in Python 2. It consumes the given iterators completely and returns a list of all the tuples, so large inputs can use a lot of memory and crash the program. In Python 2, if you want to zip big iterators, you should use izip from the itertools library. Second, you must pay attention to the lengths of the input iterators. If the lengths differ, zip stops as soon as the shortest iterator is exhausted, so the remaining values of the longer iterator are silently ignored. If that is not what you want, use zip_longest from the itertools library (izip_longest in Python 2).
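
The length problem is easy to see in a small Python 3 sketch. The extra name "Rosalind" and the fill value 0 are assumptions made for this example only.

```python
from itertools import zip_longest

names = ["Cecilia", "Lisa", "Marie", "Rosalind"]
letters = [7, 4, 5]   # one entry short on purpose

# zip stops at the shorter input, so "Rosalind" is silently dropped.
short = list(zip(names, letters))

# zip_longest keeps going and fills the gap with fillvalue (None by default).
full = list(zip_longest(names, letters, fillvalue=0))

print(short)   # [('Cecilia', 7), ('Lisa', 4), ('Marie', 5)]
print(full)    # [('Cecilia', 7), ('Lisa', 4), ('Marie', 5), ('Rosalind', 0)]
```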
