Familiarize Yourself With Python Modules and Libraries


Now, suppose you need to calculate the square root of a number in one of your notebooks. Python has no built-in square root function. You could of course write it yourself, but plenty of people have run into the same problem before you. And guess what? One of them has already written the function and saved it in a module!

A Module in Python

A module is a Python file containing a set of predefined, ready-to-use functions, classes, and variables, which you can use as you wish in your code!

For example, if you are working on a problem involving geometry, you might need:

  • classes:

    • Square—defined by the length of its side

    • Triangle—defined by the length of its three sides

    • Circle—defined by its radius

    • Etc.

  • variables:

    • Pi: constant necessary for calculating the area of a circle, equal to 3.1415...

    • Phi: constant that represents the golden ratio, equal to 1.6180...

  • functions:

    • Area: takes as parameter a geometrical object (square, triangle, etc.) and calculates its area

    • Angles: takes a triangle as a parameter, and calculates its internal angles

    • Etc.

You can of course define all these things in your notebook, but that would only make it more cumbersome. The best approach is to store all of this in an external Python file, which you then import into your notebook: it's a module!

Your Geometry Module

Here is a simplified example of a geometry module:

'''
Module geometry.py
'''
# variables
pi = 3.14159265359
phi = 1.6180

# function that calculates the area (only squares are handled here)
def area(obj):
    if isinstance(obj, square):
        return obj.a**2

# definitions of some classes
class square(object):
    def __init__(self, a):
        self.a = a

class triangle(object):
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

To import a module, you will need the import keyword. Here is an example with our geometry module:

import geometry

After doing this, you can use the different items defined in your module:

squa = geometry.square(4)
tri = geometry.triangle(3, 6, 5)
print(geometry.pi) # -> 3.14159265359
geometry.area(squa) # -> 16
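The simplified area() function above only handles squares. As a rough sketch of how it might be extended, the version below adds triangles (using Heron's formula) and circles. The circle class and the TypeError for unknown shapes are assumptions made for illustration; they are not part of the geometry module shown above. Everything is defined in one block so that it runs on its own.

```python
import math

# Same classes as in the simplified module, repeated so the sketch is self-contained.
class square(object):
    def __init__(self, a):
        self.a = a

class triangle(object):
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

# Hypothetical class: mentioned in the list of shapes, but not in geometry.py above.
class circle(object):
    def __init__(self, r):
        self.r = r

def area(obj):
    if isinstance(obj, square):
        return obj.a ** 2
    if isinstance(obj, triangle):
        # Heron's formula: s is half of the perimeter.
        s = (obj.a + obj.b + obj.c) / 2
        return math.sqrt(s * (s - obj.a) * (s - obj.b) * (s - obj.c))
    if isinstance(obj, circle):
        return math.pi * obj.r ** 2
    # Assumption: fail loudly instead of silently returning None.
    raise TypeError("unsupported shape: " + type(obj).__name__)

print(area(square(4)))          # -> 16
print(area(triangle(3, 4, 5)))  # -> 6.0
```

Raising an error for unknown objects is one possible design choice; the original function simply returns None in that case.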

All items included in the geometry module can be used via the moduleName. notation, i.e., moduleName.function() or moduleName.variable. So, in the above example, we can use geometry.area() or geometry.pi. If you don't want to retype geometry every time, you have two other options:

  • Either give an alias to the name of your module, so you only have to write the alias:

import geometry as geo # we can now access geo.area() or geo.pi

  • Or, import specific items that you can then use like native Python functions/variables (without the . notation):

from geometry import pi
print(pi) # -> 3.14159265359

A particular case of this last method is to import all the objects contained in a module in one line via the * notation. However, this is not recommended: if several modules define functions with the same name, the names imported last silently replace the earlier ones.

from geometry import *
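To make the naming-conflict risk concrete, here is a small sketch using two standard library modules, math and cmath, which both define a function called sqrt. Explicit from ... import statements are used instead of * so the effect is easy to follow, but the same overwriting happens with star imports. The variable names are chosen for illustration only.

```python
import cmath
import math

from math import sqrt
real_root = sqrt(16)
print(real_root)      # -> 4.0 (math.sqrt returns a float)

from cmath import sqrt   # silently rebinds the name "sqrt"
complex_root = sqrt(16)
print(complex_root)   # -> (4+0j) (cmath.sqrt returns a complex number)

# Keeping the module prefix avoids any ambiguity about which sqrt is called.
print(math.sqrt(16))   # -> 4.0
print(cmath.sqrt(-1))  # -> 1j
```

With the module prefix (or an alias), both functions stay available side by side, which is why that style is usually preferred over importing everything.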

When a Module is Not Enough: Packages

A package (sometimes called a library) is a collection of Python modules. As you have seen above, a module is a Python file. A package is simply a folder containing several Python files (.py) and an additional file named __init__.py. This file is what differentiates a package from an ordinary folder that merely contains Python code.

For example, you could have stored your geometry module in three different files instead of just one:

  • One for classes: classes.py

  • One for variables: variables.py

  • One for functions: functions.py

In this case, we would have the following file structure:

Organization of geometry package

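You will need to use the . operator to access a module after importing the package. Note that this assumes the package's __init__.py imports its modules; a bare import geometry with an empty __init__.py does not load them on its own:

import geometry # import the whole geometry package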
print(geometry.variables.pi) # -> 3.1415...
squa = geometry.classes.square(4)
geometry.functions.area(squa) # -> 16

You can also import only one module from the package:

import geometry.variables as var # import only what is defined in variables.py
print(var.pi) # -> 3.1415...
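The package layout described above can be tried out without creating files by hand. The sketch below writes a throwaway package to a temporary folder and imports it. The name geometry_demo, the contents of __init__.py, and the relative import in functions.py are assumptions made for this illustration; the original text does not show what those files contain.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk. "geometry_demo" is a hypothetical name,
# chosen so it cannot clash with an existing geometry module.
root = Path(tempfile.mkdtemp())
pkg = root / "geometry_demo"
pkg.mkdir()

(pkg / "variables.py").write_text("pi = 3.14159265359\n")
(pkg / "classes.py").write_text(
    "class square(object):\n"
    "    def __init__(self, a):\n"
    "        self.a = a\n"
)
(pkg / "functions.py").write_text(
    "from .classes import square  # area() needs the class defined in classes.py\n"
    "\n"
    "def area(obj):\n"
    "    if isinstance(obj, square):\n"
    "        return obj.a ** 2\n"
)
# Assumption: __init__.py loads the three modules, so that a plain
# "import geometry_demo" makes geometry_demo.classes etc. available.
(pkg / "__init__.py").write_text("from . import classes, functions, variables\n")

# Make the temporary folder importable, then import the package.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
import geometry_demo

squa = geometry_demo.classes.square(4)
result = geometry_demo.functions.area(squa)
print(result)                      # -> 16
print(geometry_demo.variables.pi)  # -> 3.14159265359
```

If __init__.py were left empty, each module would have to be imported explicitly (for example import geometry_demo.classes) before it could be reached through the package name.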

Packages in Data Analysis

Packages are ubiquitous in data analysis with Python. Indeed, many packages have been created specifically to address the issues that this subject involves. As you progress, you will be required to:

  • manipulate your data to facilitate analysis.

  • make various relevant graphs representing the behavior of your data.

  • use statistical methods.

  • run machine learning algorithms of varying complexity.

  • etc.

And to achieve all this, you will need to master the various objects and functions from the corresponding packages.

To come back to your initial problem (finding a square root function), the numpy package, for example, offers the function you need, along with many other things!

import numpy as np
np.sqrt(16) # -> 4.0
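If numpy is not installed, the math module from Python's standard library offers a square root function as well. This is a short sketch of that alternative, not a replacement for the numpy example above; the variable name root is chosen for illustration.

```python
import math

root = math.sqrt(16)
print(root)       # -> 4.0

# The power operator gives the same value without importing anything.
print(16 ** 0.5)  # -> 4.0
```

Unlike np.sqrt, math.sqrt works on single numbers only, so numpy remains the usual choice when you need the square root of a whole array of values.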

To solidify these concepts, we will look at a concrete example in the next chapter: the random module.

Let's Recap

In this chapter, together we have seen the basics of using modules and packages:

  • A module is a file containing Python code (.py extension) that can define functions, classes, and/or variables.

  • You can import any Python module via the import keyword.

  • To use a function, a class, or a variable from a module, you must use the . operator.

  • A package is a set of several Python modules.

  • There are many packages specifically created for data analysis.

Now that you know what a module is in Python, follow me to the next chapter to discover how the random module works.
