At the end of this appendix, students should understand -
* Python as a programming language
* The data types and data structures in Python
* Arrays in Numpy, including indexing, slicing, and array operations
* The for loop, the map() function, and user-defined functions in Python
According to www.python.org “Python is an interpreted, object-oriented, high-level programming language with dynamic semantics. Its high-level built in data structures, combined with dynamic typing and dynamic binding, make it very attractive for Rapid Application Development, as well as for use as a scripting or glue language to connect existing components together.” It further explains - “Python’s simple, easy to learn syntax emphasizes readability and therefore reduces the cost of program maintenance. Python supports modules and packages, which encourages program modularity and code reuse.”
Data has different types. When dealing with data, we need to know its type because different data types support different operations. Six basic data types in python are int, float, complex, bool, str, and bytes. We use the type() function to find the type of a value. The most commonly used data types are int, float, str, and bool.
x = "hello world"
type(x)
str
x = 25
type(x)
int
x = 25.34
type(x)
float
x = True
type(x)
bool
x = 7j
type(x)
complex
x = b"Hello World"
type(x)
bytes
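To see why the type of a value matters, the short sketch below applies the same + operator to values of different types. The variable names a, b, c, and d are illustrative and do not come from the examples above.

```python
# The same operator behaves differently depending on the data type.
a = 25 + 5        # int + int: arithmetic addition
b = "25" + "5"    # str + str: concatenation
c = 25 + 5.0      # int + float: the result is a float

print(a, type(a))
print(b, type(b))
print(c, type(c))

# Conversion functions such as int() change a value from one type to another.
d = int("25") + 5
print(d, type(d))
```

Checking the type first with type() helps avoid surprises such as joining two strings when addition was intended.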
A data structure is a collection of data organized so that it can be processed efficiently. Data structures give quick and easy access to data and allow efficient modification. They let us store collections of data, relate them to one another, and perform operations on them. Data structures in python can broadly be classified into two groups - built-in data structures and user-defined data structures. Figure A.1 shows the data structures in python. Because built-in data structures are widely used, we will elaborate on them.
A list is used to store a collection of ordered¹ data items. Lists are created using square brackets ([]). We can also create a list using the list() function. Lists can hold different types of data, including integers (int), floats (float), strings (str), and even other lists. We can use the len() function to find the number of elements in a list. Moreover, lists are mutable, meaning that their contents can be changed after the list has been created.
colors = ['red', 'blue', 'green']
print(colors)
['red', 'blue', 'green']
len(colors)
3
a = [1, 'apple', 3.14, [5, 6]]
print(a)
[1, 'apple', 3.14, [5, 6]]
b = list((1, 'apple', 3.14, [5, 6]))
print(b)
[1, 'apple', 3.14, [5, 6]]
A list with repeated elements can be created using the multiplication operator.
x = [2] * 5
y = [0] * 7

print(x)
print(y)
[2, 2, 2, 2, 2]
[0, 0, 0, 0, 0, 0, 0]
Indexing can be used to access the elements in a list. Python indexes start at 0. Therefore, a[0] will access the first element in the list a. Negative indexes count from the end of the list, so a[-1] accesses the last element. Figure A.2 shows the indexes of the list colors.
colors[0]
'red'
colors[-1]
'green'
We can add elements to a list using three methods - append(), insert(), and extend().
# Initialize an empty list
m = []

# Adding 50 to end of list
m.append(50)
print("After append(50):", m)
# Inserting 40 at index 0
m.insert(0, 40)
print("After insert(0, 40):", m)
# Adding multiple elements [60,70,80] at the end
m.extend([60, 70, 80])
print("After extend([60,70,80]):", m)
After append(50): [50]
After insert(0, 40): [40, 50]
After extend([60,70,80]): [40, 50, 60, 70, 80]
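A common source of confusion is what happens when a list is passed to append() instead of extend(). The sketch below, which uses two illustrative lists named m and n, shows the difference.

```python
m = [40, 50]
n = [40, 50]

m.append([60, 70])   # adds the list itself as one nested element
n.extend([60, 70])   # adds each element of the list separately

print(m)
print(n)
print(len(m), len(n))
```

Here append() increases the length by one regardless of what is added, while extend() increases it by the number of items supplied.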
We can change the value of an element by accessing it using its index.
p = [10, 20, 30, 40, 50]
# Change the second element
p[1] = 25

print(p)
[10, 25, 30, 40, 50]
We can remove elements from a list in three ways - the remove() method, the pop() method, and the del statement.
a = [10, 20, 30, 40, 50]

# Removes the first occurrence of 30
a.remove(30)
print("After remove(30):", a)
# Removes the element at index 1 (20)
popped_val = a.pop(1)
print("Popped element:", popped_val)
print("After pop(1):", a)
# Deletes the first element (10)
del a[0]
print("After del a[0]:", a)
After remove(30): [10, 20, 40, 50]
Popped element: 20
After pop(1): [10, 40, 50]
After del a[0]: [40, 50]
The dictionary data structure in python is used to store data in key:value format. Unlike a list, which uses square brackets ([]), a dictionary uses curly brackets ({}). Like lists, dictionaries are mutable. Dictionary items can be accessed by using the key name. We can use the len() function to find the total number of items in a dictionary and type() to check its type.
my_car = {
    "brand": "Ford",
    "model": "Escape",
    "year": 2017
}
print(my_car)
{'brand': 'Ford', 'model': 'Escape', 'year': 2017}
print(my_car['model'])
Escape
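Because dictionaries are mutable, values can be changed, added, or removed by key. The following is a minimal sketch that reuses the my_car dictionary; the new key "color" and the updated year are made up for illustration.

```python
my_car = {"brand": "Ford", "model": "Escape", "year": 2017}

my_car["year"] = 2020        # change the value of an existing key
my_car["color"] = "blue"     # add a new key:value pair
del my_car["model"]          # remove a key together with its value

print(my_car)
print(len(my_car))           # number of key:value pairs
print(type(my_car))
```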
The values in dictionary items can be of any data type.
car_features = {
    "brand": "Ford", # string
    "electric": False, # boolean
    "year": 1964, # integer
    "colors": ["red", "white", "blue"] # list of string
}
The function dict() can also be used to construct a dictionary.
my_friends = dict(
    name = ["John", "Smith", "Mark"],
    age = [36, 45, 49],
    country = ["Norway", "Sweden", "Finland"]
)
print(my_friends)
{'name': ['John', 'Smith', 'Mark'], 'age': [36, 45, 49], 'country': ['Norway', 'Sweden', 'Finland']}
Some built-in dictionary methods² are -
* dict.clear() - removes all the elements from the dictionary
employee = {
    'name': ["John", "Jessica", "Zack"],
    'age': [18, 19, 20]
}
print(employee)
{'name': ['John', 'Jessica', 'Zack'], 'age': [18, 19, 20]}
employee.clear()
print(employee)
{}
* dict.copy() - returns a copy of the dictionary
* dict.get(key, default=None) - returns the value of the specified key, or default if the key is not in the dictionary
* dict.items() - returns a view of the dictionary's (key, value) pairs
* dict.keys() - returns a view of the dictionary's keys
* dict.values() - returns a view of all the values in the dictionary
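The methods listed above can be tried on a small dictionary. The sketch below uses a simplified employee dictionary and a hypothetical "salary" key that is deliberately missing, to show how get() handles an absent key.

```python
employee = {"name": "John", "age": 18}

backup = employee.copy()             # independent (shallow) copy

print(employee.get("name"))          # value stored under "name"
print(employee.get("salary"))        # missing key: returns None, no error
print(employee.get("salary", 0))     # missing key: returns the supplied default

# keys(), values(), and items() return views; list() makes them easy to print.
print(list(employee.keys()))
print(list(employee.values()))
print(list(employee.items()))

employee.clear()
print(employee, backup)              # the copy is unaffected by clear()
```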
In python, a tuple is very similar to a list, with one difference: a list is mutable, but a tuple is not. Once a tuple is created, its elements cannot be changed; unlike with lists, we cannot add, remove, or change elements in a tuple. A tuple is created by using parentheses (()). The tuple() function can also be used to create a tuple. We can access the elements of a tuple by indexing, as we did for lists.
my_tuple = ('10', '20', '30', 'hello', 'world')
my_tuple
('10', '20', '30', 'hello', 'world')
my_tuple[3]
'hello'
There are different operations that can be performed on a tuple. Some of them include -
* Concatenation - the plus operator (+) joins two tuples.
* Nesting - a nested tuple is a tuple placed inside another tuple.
* Repetition - the multiplication operator (*) repeats a tuple several times.
second_tuple = ('10', '20', 'SIU', "SOA", "Carbondale")
second_tuple*3
('10',
 '20',
 'SIU',
 'SOA',
 'Carbondale',
 '10',
 '20',
 'SIU',
 'SOA',
 'Carbondale',
 '10',
 '20',
 'SIU',
 'SOA',
 'Carbondale')
second_tuple[1:]
second_tuple[2:4]
second_tuple[::-1]
('Carbondale', 'SOA', 'SIU', '20', '10')
* Finding the length - using the len() function, we can find the total number of elements in a tuple.
* Different data types in tuples - a tuple can include heterogeneous data.
* Lists to tuples - using the tuple() function, we can convert a list into a tuple.
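The tuple operations described above (concatenation, nesting, length, mixed data types, and conversion from a list) can be sketched as follows. All variable names and values are illustrative.

```python
first = (10, 20, 30)
second = ("hello", "world")

combined = first + second           # concatenation
nested = (first, second)            # nesting: a tuple that contains tuples
mixed = (1, "apple", 3.14, True)    # heterogeneous data
from_list = tuple([1, 2, 3])        # a list converted to a tuple

print(combined)
print(nested)
print(len(combined), len(nested))
print(mixed)
print(from_list)

# Tuples are immutable, so item assignment raises a TypeError.
try:
    first[0] = 99
except TypeError as err:
    print("TypeError:", err)
```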
A set in python is a collection of unordered and unindexed items. Individual set items cannot be changed in place, but new items can be added to the set and existing items can be removed from it. Another important characteristic of a set is that it has no duplicate elements. Curly brackets ({}) containing items are used to create a set; empty curly brackets create an empty dictionary, so an empty set is created with set(). The difference() method or the minus operator (-) is used to calculate the difference between two sets.
new_set = {'Hello', 'World', "World"}
new_set
{'Hello', 'World'}
type(new_set)
set
new_set[0] = "Hi" # TypeError: 'set' object does not support item assignment
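The set behaviour described above (difference, adding items, and removing items) can be illustrated with a short sketch. The sets set_a and set_b and their contents are made up for this example.

```python
set_a = {1, 2, 3, 4}
set_b = {3, 4, 5}

diff_method = set_a.difference(set_b)   # items in set_a that are not in set_b
diff_operator = set_a - set_b           # same result with the minus operator
print(diff_method)
print(diff_operator)

set_a.add(10)       # add a new item
set_a.add(10)       # adding a duplicate has no effect
set_a.remove(1)     # delete an existing item
print(set_a)
```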
What is Numpy?
Numpy is a library in python and it is one of the most essential libraries for data science because almost all of the libraries in the python PyData ecosystem use numpy. Therefore, understanding numpy is important. Moreover, numpy arrays are very fast because their core operations are implemented in C. In this section, we will learn some useful numpy methods.
Before we start using the numpy module (library), we need to install it. We can run the following code to install numpy.
pip install numpy # OR
conda install numpy
Once numpy is installed, we need to load (import) the library by running the following code -
import numpy as np
numpy Functions for Array
An array is a multidimensional data structure that holds a collection of “items” of the same type (homogeneous). Arrays are powerful for performing different mathematical and scientific computations. There are many functions in numpy to create arrays. Some of those functions are described below.
* np.array() is used to create an array in numpy. The array object is also called ndarray. You can pass a list or tuple to the np.array() function. You can create zero, one, two, three, or higher dimensional arrays in numpy. Using the ndim attribute, you can check the number of dimensions of an array.
years = [2025, 2024, 2023, 2022, 2021, 2020, 2019] # named years so the built-in list() is not overwritten
np.array(years)
array = np.array(years)
array.ndim # 1 dimensional array

np.array((20)).ndim # zero dimensional array.

array2 = np.array([[1,3,5,7], [2,4,6,8]])

array2
array2.ndim # 2 dimensional array
2
* np.arange(start, stop, step)³ is used to create an array with values starting from start up to, but not including, stop, increasing by step.
np.arange(10,21,1)
np.arange(21)
np.arange(10,21)
np.arange(-1,1)
np.arange(-1,1,0.001)
np.arange(10,30,3)
array([10, 13, 16, 19, 22, 25, 28])
* np.linspace(start, stop, n) creates an array of n evenly spaced numbers between start and stop, with both endpoints included.
np.linspace(10,21,10)
np.linspace(1,100, 10)
array([ 1., 12., 23., 34., 45., 56., 67., 78., 89., 100.])
* np.zeros() creates an array in which every element is 0.
np.zeros(5)
np.zeros((5,5))
np.zeros([5,5])
array([[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.]])
* np.ones() creates an array in which every element is 1.
np.ones(5)
np.ones((5,5))
np.ones([5,5])
array([[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.]])
* np.eye() is used to create an identity matrix. The same can be done by using the np.identity() function.
np.eye(5)
array([[1., 0., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 0., 0., 0., 1.]])
* np.random.rand() generates an array of random numbers from a uniform distribution over the interval from 0 (included) to 1 (excluded).
np.random.rand(10) # one dimensional
np.random.rand(3,2) # two dimensional
array([[0.02645752, 0.94033618],
[0.83656222, 0.59655895],
[0.32579328, 0.80937829]])
* np.random.randn() generates an array of random numbers from a standard normal distribution (mean 0 and standard deviation 1). Unlike np.random.rand(), the values are not limited to the range between 0 and 1.
np.random.randn(10) # one dimensional
np.random.randn(3,4) # two dimensional
array([[ 0.30210099, 0.51770939, 1.82834208, 0.68299169],
[-0.20429609, 0.48092033, 1.25141883, 1.63325492],
[ 0.3242677 , 0.78609986, -0.59844598, -0.78931083]])
* np.random.randint() generates random integers from a given interval of integers; low is included and high is excluded.
np.random.randint(low=0, high=10, size = 5)
array([7, 5, 7, 3, 5], dtype=int32)
* np.reshape() changes the shape (rows and columns) of an array without changing its data. The new shape must hold the same number of elements as the original.
array = np.random.randn(3,4) # two dimensional
array
array.reshape(6,2)
array([[ 0.84150961, -0.99492664],
[-2.06698101, -1.22421433],
[-1.60569132, -1.75631627],
[ 0.35674724, -0.47488292],
[ 0.17706789, -1.85904999],
[ 2.01366597, -0.65283419]])
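Two of the functions above, np.arange() and np.linspace(), are easy to mix up because both produce evenly spaced values. The sketch below contrasts them on the interval from 0 to 1; the interval and the names a and b are chosen only for illustration.

```python
import numpy as np

a = np.arange(0, 1, 0.25)   # fixed step of 0.25; the stop value 1 is excluded
b = np.linspace(0, 1, 5)    # fixed count of 5 points; the stop value 1 is included

print(a)
print(b)
print(len(a), len(b))
```

With np.arange() we choose the step and the number of elements follows; with np.linspace() we choose the number of elements and the step follows.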
Some other useful numpy functions and array attributes include - np.shape(), np.transpose(), and the dtype attribute of an array. Some useful functions related to linear algebra include - np.linalg.inv() to compute the inverse of a matrix, np.linalg.det() to compute the determinant of a matrix, np.linalg.eig() to compute the eigenvalues and eigenvectors of a matrix, and np.linalg.solve() to solve a system of linear equations.
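As a minimal sketch of the functions just listed, the example below applies them to a small 2 by 2 matrix. The matrix A and the vector b are made-up values chosen so that the results are easy to check by hand.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

print(A.shape, A.dtype)          # shape and data type of the array
print(np.transpose(A))           # transpose of A

det = np.linalg.det(A)           # determinant: 2*3 - 1*1 = 5
A_inv = np.linalg.inv(A)         # inverse of A
eigenvalues, eigenvectors = np.linalg.eig(A)
x = np.linalg.solve(A, b)        # solves the linear system A @ x = b

print(det)
print(A_inv)
print(eigenvalues)
print(x)
```

The columns of eigenvectors correspond, in order, to the entries of eigenvalues.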
Both indexing and slicing of arrays are important skills to learn. Indexing refers to obtaining individual elements from an array, while slicing refers to obtaining a sequence of elements from the array. We use array[index] to index an array and array[start:end] to slice it; the element at start is included and the element at end is not.
# One Dimensional Array
array = np.random.randn(10)

array
array[2]
array[1:3]
array[:5]
array[5:]
array[-3:]
array[:-3]
array[array>0.50] # conditional indexing
array([1.44293469, 1.16446499])
# Two Dimensional Array
array2D = np.random.randn(8,5)

array2D
array2D[1]
array2D[1][2] # double brackets
array2D[1,2] # single bracket (preferred method)
array2D[2:,]
array2D[2:,3:]
array2D[2:3]
array2D[2:4] # Only rows
array2D[:,3:]
array([[-2.2122436 , 0.06178527],
[ 1.54939543, -0.25975965],
[-0.74136396, -1.60482342],
[-1.33514363, -0.7696434 ],
[-2.34376853, 0.1489806 ],
[ 0.62060885, 0.08980056],
[ 0.63882864, 1.51053534],
[-0.24009317, -1.36122267]])
Array operations apply a mathematical operation to the array as a whole. Numpy carries out the operation on every element for us, so we do not need to write code that handles the elements one at a time.
array1 = np.arange(85,96)
array2 = np.arange(35,46)
array1 + array2
array1 - array2
array1 * array2
array1 / array2
array([2.42857143, 2.38888889, 2.35135135, 2.31578947, 2.28205128,
2.25 , 2.2195122 , 2.19047619, 2.1627907 , 2.13636364,
2.11111111])
array1.mean()
array2.std()
np.mean(array1)
np.min(array2)
np.max(array2)

np.sqrt(array1)
np.sum(array1)
np.log(np.sum(array1))
np.float64(6.897704943128636)
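To make the idea of a whole-array operation concrete, the sketch below computes the same sum twice: once with a single array expression and once with an explicit loop over the elements. The names vectorized and looped are illustrative.

```python
import numpy as np

array1 = np.arange(85, 96)
array2 = np.arange(35, 46)

# Whole-array operation: one expression, no explicit loop.
vectorized = array1 + array2

# The same result computed element by element with a for loop.
looped = []
for a, b in zip(array1, array2):
    looped.append(a + b)

print(vectorized)
print(np.array_equal(vectorized, np.array(looped)))

# A single number is applied to every element of the array.
doubled = array1 * 2
print(doubled)
```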
In python, there are some functions that we use very frequently. In this section, we will discuss some of those functions.
for Loop Function
The for loop in python allows us to iterate over iterable sequences such as a list, tuple, string, or range and execute code for each element in the sequence. The for loop helps to handle repetitive tasks more efficiently and effectively. The syntax of a for loop is -
for element in sequence:
    # Expected code to execute on each element of the sequence
A basic example of a for loop is -
analytics_students = ['Ashley', 'Elijah', 'John', 'Jack', 'Adams']
for student in analytics_students:
    print(student)
Ashley
Elijah
John
Jack
Adams
Other examples of a for loop are -
even_numbers = [2,4,6,8,10]
for number in even_numbers:
    square = number**2
    print(square)
4
16
36
64
100
even_numbers = [2,4,6,8,10]
squares = []
for number in even_numbers:
    square = number**2
    squares.append(square)

print(squares)
[4, 16, 36, 64, 100]
map() Function
The map() function, like the for loop, allows us to apply a function to each item in an iterable (list, tuple, or string). The syntax for map() is - map(function, iterable). Below is an example of the map() function -
var = [16, 17, 18, 19, 20]
var_log = map(lambda x: np.log(x), var)

for x in var_log:
    print(x)
2.772588722239781
2.833213344056216
2.8903717578961645
2.9444389791664403
2.995732273553991
The map() function is useful for simple calculations; however, for complex transformations, a for loop is usually easier to write and to read.
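The relationship between map() and the for loop can be sketched by computing the same logarithms in three ways. This sketch swaps in math.log from the standard library in place of np.log so that it runs without numpy; the list comprehension in the third version is an addition not discussed in the text above.

```python
import math

var = [16, 17, 18, 19, 20]

# 1. map() returns a lazy iterator; list() collects its results.
logs_map = list(map(math.log, var))

# 2. The same computation with a for loop.
logs_loop = []
for x in var:
    logs_loop.append(math.log(x))

# 3. The same computation with a list comprehension.
logs_comp = [math.log(x) for x in var]

print(logs_map)
print(logs_map == logs_loop == logs_comp)
```

Note that a map object can be iterated only once; wrapping it in list() keeps the results for later use.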
In addition to predefined functions from different python modules, users can define their own functions, which are sometimes called named functions. The syntax for defining a user-defined function in python is -
def deduct_num(a, b):
    """
    The function will deduct two numbers
    """
    result = a - b
    return result
In the above example, a user-defined function is created. The function name is deduct_num and it is created using the def keyword. So, when we need to create a user-defined function, we start with the def keyword followed by the name of the function. The a and b are the function’s parameters, which are often also called arguments.
The triple quote (""" """) is used to create a docstring, which explains the nature of the function or what it will do. The return statement in the function returns a value.
deduct_num(15, 100)
-85
Another example of a user-defined function -
def welcome(name):
    """
    The function greets the person
    """
    print(f"Welcome {name}! How are you doing?")
welcome("John")
Welcome John! How are you doing?
An anonymous function is a function without a name. It is also called a lambda function in python. The syntax for a lambda function is - lambda arguments: expression. Below is an example of a lambda function -
sqr = lambda x: x**2
sqr(5)
25
A lambda function can take many arguments (parameters), but it can contain only one expression.
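A short sketch of a lambda function with more than one argument follows, together with a lambda passed as the key to the built-in sorted() function. The names add and by_length are illustrative.

```python
# A lambda with two arguments and a single expression.
add = lambda a, b: a + b
print(add(15, 100))

# Lambdas are often passed directly to other functions, here as a sort key.
students = ['Ashley', 'Elijah', 'John', 'Jack', 'Adams']
by_length = sorted(students, key=lambda name: len(name))
print(by_length)
```

Names of equal length keep their original relative order because sorted() is a stable sort.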
Create an array of integers from 10 to 50.
Create an array of all even integers from 10 to 50.
Create an array of 10 threes (use either np.full() or np.ones() or np.repeat()).
Create a 3 by 3 matrix with values ranging from 10 to 18.
Create a 5 by 5 identity matrix.
Use numpy to generate a random number between 0 and 1.
Use numpy to generate an array of 25 random numbers sampled from a standard normal distribution.
Create a matrix like below -
array([[0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1 ],
[0.11, 0.12, 0.13, 0.14, 0.15, 0.16, 0.17, 0.18, 0.19, 0.2 ],
[0.21, 0.22, 0.23, 0.24, 0.25, 0.26, 0.27, 0.28, 0.29, 0.3 ],
[0.31, 0.32, 0.33, 0.34, 0.35, 0.36, 0.37, 0.38, 0.39, 0.4 ],
[0.41, 0.42, 0.43, 0.44, 0.45, 0.46, 0.47, 0.48, 0.49, 0.5 ],
[0.51, 0.52, 0.53, 0.54, 0.55, 0.56, 0.57, 0.58, 0.59, 0.6 ],
[0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.7 ],
[0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.8 ],
[0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.9 ],
[0.91, 0.92, 0.93, 0.94, 0.95, 0.96, 0.97, 0.98, 0.99, 1. ]])
Create an array of 50 linearly spaced points between 0 and 1.
Create an array of 20 linearly spaced points between 0 and 1.
mat = np.arange(1,26).reshape(5,5)
mat
array([[ 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20],
[21, 22, 23, 24, 25]])
Produce the following matrix from mat.
array([[12, 13, 14, 15],
[17, 18, 19, 20],
[22, 23, 24, 25]])
Produce the following value from the mat matrix - np.int64(20)
Produce the following matrix from mat.
array([[ 2],
[ 7],
[12]])
Produce the following matrix from mat - array([21, 22, 23, 24, 25])
Produce the following matrix from mat
array([[16, 17, 18, 19, 20],
[21, 22, 23, 24, 25]])
Get the sum of all values in the mat matrix
Get the standard deviation of the values in mat matrix
Get the sum of all columns in the mat matrix
Get the sum of all rows in the mat matrix
Get the determinant, eigenvalues, and eigenvectors of the matrix mat.
¹ When we say that lists are ordered, it means that the items have a defined order, and that order will not change. If you add new items to a list with append(), the new items will be placed at the end of the list.
² In python, functions that belong to an object, such as a list or a dictionary, are called methods.
³ The range(start, stop, step) function creates a sequence of numbers starting from start and stopping before stop. If step is omitted, it defaults to 1.