Daidalos
June 18, 2020

Examples of how to calculate the mean over a dataframe column with pandas in python: [TOC] ### 1 -- Create a dataframe Let's consider the following dataframe: import pandas as pd data = {'Name':['Ben','Anna','Zoe','Tom','John','Steve'], 'Age':[20,27,43,30,12,21]} df = pd.DataFrame(data) returns Name Age 0 Ben 20 1 Anna 27 2 Zoe 43 3 Tom 30 4 John 12 5 Steve 21 ### 2 -- Calculate ...

Daidalos
June 17, 2020

Examples of how to replace NaN values in a pandas dataframe [TOC] ### 1 -- Create a dataframe Let's consider the following dataframe: import pandas as pd import numpy as np data = {'Name':['Ben','Anna','Zoe','Tom','John','Steve'], 'Age':[20,27,43,30,np.nan,np.nan], 'Gender':['M',np.nan,'F','M','M','M']} df = pd.DataFrame(data) returns Name Age Gender 0 Ben 20.0 M 1 Anna 27.0 NaN 2 Zoe ...

Daidalos
June 17, 2020

Examples of how to convert a dataframe column of dates of birth (DOB) to a column of ages with pandas in python: [TOC] ### 1 -- Create a dataframe Let's consider the following dataframe for example import pandas as pd data = {'Name':['Ben','Anna','Zoe','Tom','John','Steve'], 'dob':['1982-07-08 00:00:00', '1987-03-01 00:00:00', '2016-02-12 00:00:00', '2002-08-14 00:00:00', '2011-01-19 00: ...

Daidalos
June 17, 2020

Examples of how to get the age from a date of birth (DOB) in python: [TOC] ### 1 -- Calculate the age from a DOB (example 1) Let's consider the following date of birth: July 8, 1982: import datetime dob = datetime.date(1982,7,8) to get the age of that person at the present day (June 16, 2020), a solution is to define a function: def from_dob_to_age(born): today = datetime.date.today() return today.year - born.year - ((today.month, today.day) < ( ...
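The birthday comparison used in this approach can be sketched as follows. This is a minimal illustration, not the article's exact function: the name `age_on` and its explicit `today` argument are assumptions, introduced so that the result does not depend on the day the code is run.

```python
import datetime

def age_on(born, today):
    """Return the age in whole years on the date `today` (hypothetical helper)."""
    # The tuple comparison is True when the birthday has not yet
    # occurred in the current year, in which case one year is removed.
    before_birthday = (today.month, today.day) < (born.month, born.day)
    return today.year - born.year - int(before_birthday)

dob = datetime.date(1982, 7, 8)
print(age_on(dob, datetime.date(2020, 6, 16)))  # 37
```

Passing `datetime.date.today()` as the second argument gives the age at the present day.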

Daidalos
June 16, 2020

Examples of how to remove duplicate rows in a pandas dataframe in python: [TOC] ### 1 -- Create a dataframe Let's first create, for example, the following dataframe import pandas as pd data = {'Name':['Ben','Anna','Anna','Anna','Zoe','Zoe','Tom','John','Steve'], 'Age':[20,27,27,27,43,43,30,12,21], 'Sex':[1,0,0,0,0,0,1,1,1]} df = pd.DataFrame(data) print(df) returns here Name Age Sex 0 Ben 20 1 1 A ...

Daidalos
June 16, 2020

Examples of how to drop dataframe rows where a condition is true with pandas in python [TOC] ### 1 -- Create a dataframe Let's consider for example the following dataframe: >>> import pandas as pd >>> data = {'Name':['Ben','Anna','Zoe','Tom','John','Steve'], 'Age':[20,27,43,30,12,21], 'Sex':[1,0,0,1,1,1]} >>> df = pd.DataFrame(data) returns here: >>> df Age Name Sex 0 20 Ben 1 1 27 Anna 0 2 43 Zoe 0 3 30 Tom 1 4 12 John ...

Daidalos
June 01, 2020

Examples of how to remove (filter) the duplicates in a python list: [TOC] ### 1 -- Create a list Let's consider the following list: >>> l = ['a','a','b','c','d','d','d'] the goal here is to remove the duplicates in the list. ### 2 -- Remove the duplicates using a for loop A first solution is to use a for loop, example: >>> lwd = [] >>> for i in l: ... if i not in lwd: lwd.append(i) ... >>> lwd ['a', 'b', 'c', 'd'] ### 3 -- Remove the duplicates using a d ...
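As a compact alternative to the for loop (offered as a sketch, and not necessarily the method the article goes on to describe), `dict.fromkeys()` keeps the first occurrence of each element. This relies on dictionaries preserving insertion order, which is guaranteed from Python 3.7.

```python
l = ['a', 'a', 'b', 'c', 'd', 'd', 'd']

# Dictionary keys are unique and keep insertion order (Python 3.7+),
# so converting back to a list yields the elements without duplicates.
lwd = list(dict.fromkeys(l))

print(lwd)  # ['a', 'b', 'c', 'd']
```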

Daidalos
June 01, 2020

Examples of how to check if an element is in a list or not in python: [TOC] ### 1 -- Create a list Let's create for instance the following list: >>> l = ['a', 'b', 'c', 'd', 'e', 'f'] ### 2 -- Check if an element is in the list To check if the element 'c' is in the list called here l, a solution is to use the following logical expression: >>> 'c' in l True check now if 'g' is in the list >>> 'g' in l returns here: False Can then be used with if, illustra ...

Daidalos
May 27, 2020

Example of how to get a list of names (variables) of data stored in an HDF5 file using pandas in python [TOC] ### 1 -- Read the file First, to read an HDF5 file using pandas, we can do: import pandas as pd store = pd.HDFStore('data.hdf5') or with pd.HDFStore('data.hdf5') as store: ... ... ### 2 -- Get a list of names (variables) of data stored in an HDF5 file using pandas To get the name of all data stored in the hdf5 file, a solution is to use keys() : sto ...

Daidalos
May 26, 2020

Example of how to find in a list the elements starting with *** in python: [TOC] ### 1 -- Create a simple list in python Let's create a list of words: l = ['name', 'address_01', 'address_02', 'address_03', 'job', 'income'] ### 2 -- Select words that start with *** For example, let's select only the words that start with 'address' using a list comprehension and the python method [startswith()](https://www.tutorialspoint.com/python/string_startswith.htm): sub_l = [i for ...
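A self-contained sketch of this selection, assuming the prefix is stored in a variable named `prefix` (a name introduced here for illustration):

```python
l = ['name', 'address_01', 'address_02', 'address_03', 'job', 'income']

prefix = 'address'  # hypothetical variable holding the prefix to match

# Keep only the words for which str.startswith() returns True.
sub_l = [word for word in l if word.startswith(prefix)]

print(sub_l)  # ['address_01', 'address_02', 'address_03']
```

`str.startswith()` also accepts a tuple of prefixes, which is convenient when several prefixes should match.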

Daidalos
May 26, 2020

Examples of how to find the smallest positive value in a list in python ? [TOC] ### 1 -- Find the minimum value Let's consider the following list l = [ 7, 3, 6, 9, 2, -1, -5, -4, -3] to find the smallest value, a solution is to use the function min(): min(l) which returns here: -5 ### 2 -- Find the smallest positive value Now to find the smallest positive value a solution is to use a list comprehension and then min(): min([i for i in l if i > 0]) ...
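One caveat with this approach is that `min()` raises a `ValueError` when no element is positive. A small sketch that handles this case with the `default` argument of `min()` (returning `None` is a choice made here, not something the article specifies):

```python
l = [7, 3, 6, 9, 2, -1, -5, -4, -3]

# Smallest value overall, for comparison.
smallest = min(l)

# Smallest strictly positive value; default=None avoids a ValueError
# when the list contains no positive number.
smallest_positive = min((i for i in l if i > 0), default=None)

print(smallest)           # -5
print(smallest_positive)  # 2
```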

Daidalos
May 26, 2020

Examples of how to find the difference in minutes between two dates in python: [TOC] ### 1 -- Create two dates Let's create two date objects in python using the module datetime: import datetime year = 2008 month = 7 day = 8 hour = 12 minute = 47 second = 0 time1 = datetime.datetime(year,month,day,hour,minute,second) hour = 14 minute = 20 time2 = datetime.datetime(year,month,day,hour,minute,second) ### 2 -- Example 1 u ...
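One possible way to get the difference in minutes (a sketch, not necessarily the method used in the article's examples) is to subtract the two `datetime` objects and convert the resulting `timedelta` with `total_seconds()`:

```python
import datetime

time1 = datetime.datetime(2008, 7, 8, 12, 47, 0)
time2 = datetime.datetime(2008, 7, 8, 14, 20, 0)

# Subtracting two datetime objects gives a timedelta.
delta = time2 - time1

# total_seconds() includes the days component, unlike delta.seconds.
minutes = delta.total_seconds() / 60

print(minutes)  # 93.0
```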

Daidalos
May 24, 2020

Example of how to select randomly (sample) the rows of a dataframe using pandas in python: [TOC] ### 1 -- Create a simple dataframe Let's create a simple dataframe with 5 columns and 20 rows: >>> import pandas as pd >>> import numpy as np >>> data = np.arange(1,101) >>> data = data.reshape(20,5) >>> df = pd.DataFrame(data=data,columns=['a','b','c','d','e']) >>> df a b c d e 0 1 2 3 4 5 1 6 7 8 9 10 2 11 12 13 14 15 3 16 ...

Daidalos
May 24, 2020

Examples of how to edit a pandas dataframe column values where a condition is verified in python: [TOC] ### 1 -- Create a simple dataframe with pandas Let's start by creating a simple dataframe with 5 columns and 20 rows: >>> import pandas as pd >>> import numpy as np >>> data = np.arange(1,101) >>> data = data.reshape(20,5) >>> df = pd.DataFrame(data=data,columns=['a','b','c','d','e']) returns >>> df a b c d e 0 1 2 3 4 5 1 6 7 8 ...

Daidalos
May 24, 2020

Examples of how to select dataframe rows using a condition with pandas in python: [TOC] ### 1 -- Create a simple dataframe Let's start by creating a simple dataframe with pandas: >>> import pandas as pd >>> data = {'Name':['Ben','Anna','Zow','Tom','John','Steve'], 'Age':[20,27,43,30,12,21], 'Sex':[1,0,0,1,1,1]} >>> df = pd.DataFrame(data) returns >>> df Age Name Sex 0 20 Ben 1 1 27 Anna 0 2 43 Zow 0 3 30 Tom 1 4 12 Joh ...

Daidalos
May 24, 2020

Examples of how to select dataframe columns that start with *** using pandas in python: [TOC] ### 1 -- Create a dataframe Let's start by creating a simple dataframe with 8 columns: import pandas as pd import numpy as np data = np.arange(1,33) data = data.reshape(4,8) df = pd.DataFrame(data=data,columns=['name','add_01','add_02','add_03', 'counrty','streed','zip code','county']) print(df) returns name ...

Daidalos
May 10, 2020

Examples of how to calculate and plot a cumulative distribution function in python [TOC] ### 1 -- Generate random numbers Let's for example generate random numbers from a normal distribution: import numpy as np import matplotlib.pyplot as plt N = 100000 data = np.random.randn(N) ### 2 -- Create a histogram with matplotlib hx, hy, _ = plt.hist(data, bins=50, density=True,color="lightblue") plt.ylim(0.0,max(hx)+0.05) plt.title('Generate random numbers \n from a ...

Daidalos
May 10, 2020

Example of how to calculate a log-likelihood using a normal distribution in python: [TOC] See the note: [How to estimate the mean with a truncated dataset using python ?](https://www.science-emergence.com/Articles/How-to-estimate-the-mean-with-a-truncated-dataset-using-python-/) to understand why calculating a log-likelihood using a normal distribution in python is useful. ### 1 -- Generate random numbers from a normal distribution Let's for example create a sample of 100000 rando ...

Daidalos
May 09, 2020

Example of how to add text (units, %, etc) in a heatmap cell annotations using seaborn in python: [TOC] ### 1 -- Create a simple heatmap with seaborn Let's create a heatmap with seaborn: import seaborn as sns import numpy as np import pandas as pd import matplotlib.pyplot as plt data = np.array([[25.55535942, 1.99598017, 9.78107706], [ 4.95758736, 39.68268716, 16.78109873], [ 0.45401194, 0.10003128, 0.6921669 ]]) df = pd.DataFrame(data=data) fig = plt.fig ...

Daidalos
May 09, 2020

Examples of how to estimate the mean with a truncated dataset using python for data generated from a normal distribution: [TOC] ### 1 -- Create a dataset of random numbers from a normal distribution Create a set of random numbers distributed according to a normal distribution: import scipy.stats import numpy as np import matplotlib.pyplot as plt mu_0 = 2.0 srd_0 = 4.0 data = np.random.randn(100000) data = data * srd_0 + mu_0 data = data.reshape(-1, 1) ### 2 -- ...

Daidalos
May 09, 2020

Examples of how to use a Gaussian mixture model (GMM) with sklearn in python: [TOC] from sklearn import mixture import numpy as np import matplotlib.pyplot as plt ### 1 -- Example with one Gaussian Let's generate random numbers from a normal distribution with a mean $\mu_0 = 5$ and standard deviation $\sigma_0 = 2$ mu_0 = 5.0 srd_0 = 2.0 data = np.random.randn(100000) data = data * srd_0 + mu_0 data = data.reshape(-1, 1) Plot the data ...

Daidalos
May 09, 2020

Example of how to generate random numbers from a log-normal distribution in python ? [TOC] Log-normal distribution: \begin{equation} \frac{1}{x \sigma \sqrt{2\pi}} \exp\left(-\frac{(\ln x-\mu)^2}{2\sigma^2}\right) \end{equation} ### 2 -- Using scipy lognorm Example of how to generate random numbers from a log-normal distribution with $\mu=0$ and $\sigma=0.5$ using the scipy function lognorm: from scipy.stats import lognorm import numpy as np import matplotlib.pyplot as plt st ...
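For a quick check without scipy, the standard library also provides `random.lognormvariate(mu, sigma)`. The sketch below is an alternative to the scipy approach in the article, with an arbitrary seed and sample size chosen here for illustration. It draws samples with $\mu=0$ and $\sigma=0.5$ and looks at the sample median, which should be close to $e^{\mu} = 1$.

```python
import random
import statistics

random.seed(0)  # arbitrary seed, only to make the run reproducible

mu, sigma = 0.0, 0.5

# Each sample is exp(N(mu, sigma)), hence strictly positive.
samples = [random.lognormvariate(mu, sigma) for _ in range(10000)]

# The median of a log-normal distribution is exp(mu).
sample_median = statistics.median(samples)

print(min(samples) > 0)
print(sample_median)
```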

Daidalos
May 09, 2020

Example of how to add text on an image using pillow in python: [TOC] ### 1 -- Create an image with pillow and add text on it Example 1: let's for example create a simple image with a red background: from PIL import Image img = Image.new('RGB', (600, 400), color = 'red') img.save('pil_red.png') [image:pilred02 size:50 caption:How to add text on an image using pillow in python ?] To add text, you must first download a 'font' file locally to your machine, for example f ...

Daidalos
May 09, 2020

Example of how to increase the size of the cells text (annotations) of a seaborn heatmap in python: [TOC] ### 1 -- Create a simple heatmap using seaborn Let's first create a simple heatmap using seaborn: import seaborn as sns import numpy as np import pandas as pd import matplotlib.pyplot as plt data = np.array([[25.55535942, 1.99598017, 9.78107706], [ 4.95758736, 39.68268716, 16.78109873], [ 0.45401194, 0.10003128, 0.6921669 ]]) df = pd.DataFrame(data=data ...

Daidalos
May 09, 2020

Examples of how to increase the size of axes labels on a seaborn heatmap in python: [TOC] ### 1 -- Create a simple heatmap using seaborn Let's create a first simple heatmap using seaborn: import seaborn as sns import numpy as np import pandas as pd import matplotlib.pyplot as plt data = np.array([[25.55535942, 1.99598017, 9.78107706], [ 4.95758736, 39.68268716, 16.78109873], [ 0.45401194, 0.10003128, 0.6921669 ]]) df = pd.DataFrame(data=data,columns=['C1',' ...

Daidalos
May 08, 2020

Example of how to remove lines from a to b using vi editor ? [TOC] ### 1 -- Remove lines from a to b (with b>a) To delete lines from a to b, a solution is to use the command: :a,bd Note: to remove from the current line to line n: :,nd ### 2 -- Remove all lines Note: to remove all content in the file: :1,$d ### 3 -- References - [Delete from the current cursor position to a given line number in vi editor](https://stackoverflow.com/questions/6384561/d ...

Daidalos
April 30, 2020

Let's consider the normal (Gaussian) distribution with mean equal to 8 and standard deviation equal to 2: [image:probability-normal-distribution size:75 caption:How to calculate a Gaussian density probability function at a given point in python ?] To calculate the Gaussian probability density function at a given point (here x = 6) in python, a solution is to do: scipy.stats.norm.pdf(6,8,2) returns: 0.12 Source code to create the plot: import matplotlib.pyplot as plt import sc ...
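The value can be cross-checked by writing out the density formula with the standard `math` module. This is a sketch for verification only, and the function name `normal_pdf` is introduced here for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Gaussian probability density at x (hypothetical helper)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

value = normal_pdf(6, 8, 2)
print(round(value, 4))  # 0.121
```

The point x = 6 lies one standard deviation below the mean, so the result is the standard normal density at -1 divided by sigma.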

Daidalos
April 29, 2020

Example of how to create a global variable in python: [TOC] ### 1 -- Using global To declare for example a as a global variable: global a ### 2 -- Illustration with a function Using global variables in a function: >>> global a,b >>> a = 0.5 >>> b = 2 >>> def function( x ): ... return a * x + b ... >>> function(1) 2.5 >>> b = 4 >>> function(1) 4.5 ### 3 -- References - [Python function global variab ...

Daidalos
April 15, 2020

Example of how to multiply two complex numbers in python [TOC] ### Create two complex numbers in python Let's import the python module cmath, which is used to work with complex numbers >>> import cmath Create a first complex number z1: >>> z1 = 1.0 + 2.0j >>> z1 (1+2j) with real part >>> z1.real 1.0 and imaginary part >>> z1.imag 2.0 Let's also create another complex number z2: >>> z2 = 3.0 + 5.0j >>> z2 (3+5j) ...
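With the two numbers defined above, the product can be obtained with the `*` operator. The sketch below also writes out the product by hand, $(a+bi)(c+di) = (ac-bd) + (ad+bc)i$, as a check. The variable names `product` and `manual` are introduced here for illustration.

```python
z1 = 1.0 + 2.0j
z2 = 3.0 + 5.0j

# Built-in complex multiplication.
product = z1 * z2

# Same result from the definition: (ac - bd) + (ad + bc)i.
manual = complex(z1.real * z2.real - z1.imag * z2.imag,
                 z1.real * z2.imag + z1.imag * z2.real)

print(product)  # (-7+11j)
```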

Daidalos
April 15, 2020

Example of how to create an empty data frame with pandas and add new entries row by row in python: [TOC] ### Create an empty data frame Let's create an empty data frame with pandas: >>> import pandas as pd >>> df1 = pd.DataFrame(columns=['a','b','c','d']) >>> df1 Empty DataFrame Columns: [a, b, c, d] Index: [] Check the number of rows: >>> len(df1) 0 Check the number of columns: >>> len(df1.columns) 4 ### Add new row with ...

Daidalos
April 15, 2020

Examples of how to count the number of occurrences of elements in a pandas data frame column in python [TOC] ### Create a simple data frame with pandas Let's create a simple data frame called df: >>> import pandas as pd >>> import numpy as np >>> df = pd.DataFrame(columns=['target','a','b']) >>> df = df.append({"target": 0, "a": "no", "b": "M"}, ignore_index=True) >>> df = df.append({"target": 1, "a": "yes", "b": "F"}, ignore_index=True) >>> df = df.append({"target ...

Daidalos
April 15, 2020

Example of how to add a frame to a seaborn heatmap figure in python [TOC] ### Plot a figure with seaborn heatmap Example of how to plot a figure with seaborn heatmap import seaborn as sns import numpy as np import pandas as pd import matplotlib.pyplot as plt data = np.array([[25.55535942, 1.99598017, 9.78107706], [ 4.95758736, 39.68268716, 16.78109873], [ 0.45401194, 0.10003128, 0.6921669 ]]) df = pd.DataFrame(data=data) fig = plt.figure(num=None, figsi ...

Daidalos
April 15, 2020

Example of how to change the colorbar size of a seaborn heatmap figure in python: [TOC] ### Create a seaborn heatmap figure import seaborn as sns import numpy as np import pandas as pd import matplotlib.pyplot as plt data = np.array([[25.55535942, 1.99598017, 9.78107706], [ 4.95758736, 39.68268716, 16.78109873], [ 0.45401194, 0.10003128, 0.6921669 ]]) df = pd.DataFrame(data=data) fig = plt.figure(num=None, figsize=(10, 10), dpi=80, facecolor='w', edgecol ...

Daidalos
April 15, 2020

Example of how to check if a directory exists in python: [TOC] ### Using the os function isdir() To check if a directory (called for example "images") exists, a solution in python is to use the function [isdir](https://docs.python.org/3/library/os.path.html#os.path.isdir): >>> import os >>> os.path.isdir('images') that returns a boolean (True or False) indicating whether the directory 'images' exists or not. To test if the directory is available with the path /users/john/images: ...
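To see both possible return values without depending on the directories present on a given machine, here is a small sketch that works inside a temporary directory. The use of `tempfile` and the variable names are choices made for this illustration.

```python
import os
import tempfile

# Work inside a throwaway directory that is removed automatically.
with tempfile.TemporaryDirectory() as tmp:
    images_dir = os.path.join(tmp, 'images')

    exists_before = os.path.isdir(images_dir)  # not created yet
    os.mkdir(images_dir)
    exists_after = os.path.isdir(images_dir)   # now it exists

print(exists_before, exists_after)  # False True
```

Note that `os.path.isdir()` also returns False when the path exists but points to a regular file.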

Daidalos
April 15, 2020

Example of how to check if a file exists in python: [TOC] ### Using the os function isfile() To check if a file (called for example "photo_001.png") exists, a solution in python is to use the function [isfile](https://docs.python.org/3/library/os.path.html#os.path.isfile): >>> import os >>> os.path.isfile('photo_001.png') that returns a boolean (True or False) indicating whether the file 'photo_001.png' exists or not. To test if the file is available with the path /users/john/photo ...

Daidalos
April 14, 2020

Example of how to create a table of contents in a jupyter notebook [TOC] ### Create a table of contents To start, let's create two markdown cells (see image below) [image:jupyter-notebook-toc-01 size:75 caption:How to create a table of contents in a jupyter notebook ?] then, to create a table of contents, a solution is to create a markdown link to an anchor: ### Table of Contents * [Chapter 1](#chapter1) * [Section 1.1](#section_1_1) * [Section 1.2](#section_1_2) ...

Daidalos
April 14, 2020

Example of how to center a matplotlib figure in a Jupyter notebook: [TOC] ### Plot a matplotlib figure in a Jupyter notebook Let's create a simple matplotlib figure: import matplotlib.pyplot as plt plt.scatter([1,2,3,4,5,6,7,8],[4,1,3,6,1,3,5,2]) plt.title('Scatter plot with Matplotlib') plt.xlabel('x') plt.ylabel('y') To show the figure in the Jupyter notebook just add: plt.show() By default, the figure is aligned on the left: [image:jupyter-not ...

Daidalos
April 14, 2020

Examples of how to generate a random number between 0 and 1 in python [TOC] ### Using function random.uniform() To generate a random number between 0 and 1, there are several solutions for example using the random module with uniform(): >>> import random >>> x = random.uniform(0,1) >>> x 0.24773029475050623 Generate a list of random numbers between 0 and 1: >>> list_rx = [random.uniform(0,1) for i in range(10000)] and plot: >>> import matplotlib ...

Daidalos
April 14, 2020

Example of how to divide by a number the elements of a pandas data frame column in python ? [TOC] ### Create a simple Data frame Let's create a data frame with pandas called df: >>> import pandas as pd >>> import numpy as np >>> data = np.arange(1,13) >>> data = data.reshape(3,4) >>> df = pd.DataFrame(data=data,columns=['a','b','c','d']) >>> df a b c d 0 1 2 3 4 1 5 6 7 8 2 9 10 11 12 ### Divide by a number the elements of a given colu ...

Daidalos
April 14, 2020

Example of how to subtract a number from the elements of a dataframe column with pandas in python: [TOC] ### Create a simple Data frame Let's create a data frame with pandas called df: >>> import pandas as pd >>> import numpy as np >>> data = np.arange(1,13) >>> data = data.reshape(3,4) >>> df = pd.DataFrame(data=data,columns=['a','b','c','d']) >>> df a b c d 0 1 2 3 4 1 5 6 7 8 2 9 10 11 12 ### Subtract by a number the elements of a give ...

Daidalos
April 14, 2020

Example of how to multiply by a number the elements of a DataFrame column with pandas in python [TOC] ### Create a simple Data frame Let's create a data frame with pandas called df: >>> import pandas as pd >>> import numpy as np >>> data = np.arange(1,13) >>> data = data.reshape(3,4) >>> df = pd.DataFrame(data=data,columns=['a','b','c','d']) >>> df a b c d 0 1 2 3 4 1 5 6 7 8 2 9 10 11 12 ### Multiply by a number the elements of a give ...

Daidalos
April 14, 2020

Example of how to add a constant number to a DataFrame column with pandas in python [TOC] ### Create a simple Data frame Let's create a data frame with pandas called df: >>> import pandas as pd >>> import numpy as np >>> data = np.arange(1,13) >>> data = data.reshape(3,4) >>> df = pd.DataFrame(data=data,columns=['a','b','c','d']) >>> df a b c d 0 1 2 3 4 1 5 6 7 8 2 9 10 11 12 ### Add a constant number to each column elements Let's ...

Daidalos
April 14, 2020

Basic examples of how to merge / concatenate two DataFrames with pandas in python: [TOC] ### Create two data frames Let's create a first data frame called df1 with pandas >>> import pandas as pd >>> import numpy as np >>> data = np.arange(1,13) >>> data = data.reshape(3,4) >>> df1 = pd.DataFrame(data=data,columns=['a','b','c','d']) >>> df1 a b c d 0 1 2 3 4 1 5 6 7 8 2 9 10 11 12 Create a second data frame df2 >>> df2 = pd.DataFram ...

Daidalos
April 11, 2020

Examples of how to solve a quadratic equation in python using numpy: [TOC] ### Example 1 With python we can find the roots of a polynomial equation of degree 2 ($ ax ^ 2 + bx + c $) using the numpy function [roots](http://docs.scipy.org/doc/numpy/reference/generated/numpy.roots.html#numpy-roots). Consider for example the following polynomial equation of degree 2 $ x ^ 2 + 3x - 4 $ with the coefficients $ a = 1 $, $ b = 3 $ and $ c = -4 $, we then find: >>> import nump ...
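As a cross-check that does not require numpy, the roots of $x^2 + 3x - 4$ can also be computed directly from the quadratic formula. This is a sketch using the standard `cmath` module, so that a negative discriminant would give complex roots instead of an error. It is an alternative to the numpy `roots` function, not a replacement for the article's method.

```python
import cmath

a, b, c = 1, 3, -4

# Discriminant of a*x**2 + b*x + c.
disc = b**2 - 4*a*c

# cmath.sqrt also handles negative discriminants (complex roots).
sqrt_disc = cmath.sqrt(disc)

x1 = (-b + sqrt_disc) / (2*a)
x2 = (-b - sqrt_disc) / (2*a)

print(x1.real, x2.real)  # 1.0 -4.0
```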

Daidalos
April 11, 2020

Example of how to get the number of columns in a pandas DataFrame in python: [TOC] ### Create a simple dataframe with pandas Let's start by creating a simple dataframe df: >>> import pandas as pd >>> import numpy as np >>> data = np.arange(1,13) >>> data = data.reshape(3,4) >>> df = pd.DataFrame(data=data,columns=['a','b','c','d']) >>> df a b c d 0 1 2 3 4 1 5 6 7 8 2 9 10 11 12 ### Get the number of columns To obtain the number of ...

Daidalos
April 11, 2020

Example of how to get the number of rows in a pandas DataFrame in python: [TOC] ### Create a simple dataframe with pandas Let's start by creating a simple dataframe df: >>> import pandas as pd >>> import numpy as np >>> data = np.arange(1,13) >>> data = data.reshape(3,4) >>> df = pd.DataFrame(data=data,columns=['a','b','c','d']) >>> df a b c d 0 1 2 3 4 1 5 6 7 8 2 9 10 11 12 ### Get the number of rows To obtain the number of rows o ...

Daidalos
April 11, 2020

Example of how to add a column to a pandas DataFrame in python: [TOC] ### Create a simple dataframe with pandas Let's start by creating a simple dataframe df: >>> import pandas as pd >>> import numpy as np >>> data = np.arange(1,13) >>> data = data.reshape(3,4) >>> df = pd.DataFrame(data=data,columns=['a','b','c','d']) >>> df a b c d 0 1 2 3 4 1 5 6 7 8 2 9 10 11 12 ### Add a column to a dataframe To add a column we can ...

Daidalos
April 11, 2020

Example of how to add a new row at the end of a pandas DataFrame in python [TOC] ### Create a simple dataframe with pandas Let's start by creating a simple dataframe df: >>> import pandas as pd >>> import numpy as np >>> data = np.arange(1,13) >>> data = data.reshape(3,4) >>> df = pd.DataFrame(data=data,columns=['a','b','c','d']) >>> df a b c d 0 1 2 3 4 1 5 6 7 8 2 9 10 11 12 ### Add a row to a dataframe To add a line to the df dat ...

Daidalos
April 11, 2020

Example of how to apply a function to a DataFrame row with pandas in python: [TOC] ### Create a simple dataframe with pandas Let's start by creating a simple dataframe df: >>> import pandas as pd >>> import numpy as np >>> data = np.arange(1,13) >>> data = data.reshape(3,4) >>> df = pd.DataFrame(data=data,columns=['a','b','c','d']) >>> df a b c d 0 1 2 3 4 1 5 6 7 8 2 9 10 11 12 ### Apply a function on a given row So let's try to mo ...

Daidalos
April 11, 2020

Example of how to apply a function to a DataFrame column with pandas in python: [TOC] ### Create a simple dataframe with pandas Let's start by creating a simple dataframe df: >>> import pandas as pd >>> import numpy as np >>> data = np.arange(1,13) >>> data = data.reshape(3,4) >>> df = pd.DataFrame(data=data,columns=['a','b','c','d']) >>> df a b c d 0 1 2 3 4 1 5 6 7 8 2 9 10 11 12 ### Apply a function on a given column So let's try ...