How to get a list of names (variables) of data stored in an HDF5 file using pandas in Python?

Example of how to get a list of names (variables) of data stored in an HDF5 file using pandas in Python: [TOC] ### 1 -- Read the file First, to read an HDF5 file using pandas, we can do: store = pd.HDFStore('data.hdf5') or with pd.HDFStore('data.hdf5') as store: ... ... ### 2 -- Get a list of names (variables) of data stored in an HDF5 file using pandas To get the names of all data stored in the HDF5 file, a solution is to use keys(): sto

How to find the elements of a list that start with *** in Python?

Example of how to find the elements of a list that start with *** in Python: [TOC] ### 1 -- Create a simple list in Python Let's create a list of words: l = ['name', 'address_01', 'address_02', 'address_03', 'job', 'income'] ### 2 -- Select the words that start with *** For example, let's select only the words that start with 'address' using a list comprehension and the Python string method [startswith()](https://www.tutorialspoint.com/python/string_startswith.htm): sub_l = [i for
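
The list-comprehension approach described in the entry above can be sketched as follows. The helper name `starting_with` is an assumption made for illustration; only the list `l` and the prefix 'address' come from the excerpt.

```python
# List taken from the excerpt above.
l = ['name', 'address_01', 'address_02', 'address_03', 'job', 'income']


def starting_with(items, prefix):
    """Return the elements of `items` that start with `prefix`, in order.

    `starting_with` is a hypothetical helper name, not part of the original note.
    """
    return [item for item in items if item.startswith(prefix)]


sub_l = starting_with(l, 'address')
print(sub_l)
```

Note that `str.startswith()` also accepts a tuple of prefixes, which is handy when several prefixes should match.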

How to find the smallest positive value in a list in Python?

Examples of how to find the smallest positive value in a list in Python: [TOC] ### 1 -- Find the minimum value Let's consider the following list: l = [ 7, 3, 6, 9, 2, -1, -5, -4, -3] To find the smallest value, a solution is to use the function min(): min(l) which returns here: -5 ### 2 -- Find the smallest positive value Now, to find the smallest positive value, a solution is to use a list comprehension and then min(): min([i for i in l if i > 0])
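
As a quick check of the two steps in the entry above, here is a minimal sketch using the same list. The `default=None` argument is an addition not found in the original note; it avoids a `ValueError` when the list contains no positive value.

```python
l = [7, 3, 6, 9, 2, -1, -5, -4, -3]

# Smallest value overall.
smallest = min(l)

# Smallest strictly positive value; a generator expression avoids building
# an intermediate list, and default=None handles the "no positive value" case.
smallest_positive = min((i for i in l if i > 0), default=None)

print(smallest, smallest_positive)  # -5 2
```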

How to find the difference in minutes between two dates in Python?

Examples of how to find the difference in minutes between two dates in Python: [TOC] ### 1 -- Create two dates Let's create two date objects in Python using the datetime module: import datetime year = 2008 month = 7 day = 8 hour = 12 minute = 47 second = 0 time1 = datetime.datetime(year,month,day,hour,minute,second) hour = 14 minute = 20 time2 = datetime.datetime(year,month,day,hour,minute,second) ### 2 -- Example 1 u
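
The excerpt above is cut off before its first example, so the following is only one possible approach, not necessarily the one used in the original note: subtract the two `datetime` objects to get a `timedelta`, then convert its total number of seconds to minutes.

```python
import datetime

# Same two dates as in the excerpt above.
time1 = datetime.datetime(2008, 7, 8, 12, 47, 0)
time2 = datetime.datetime(2008, 7, 8, 14, 20, 0)

# Subtracting two datetimes yields a timedelta.
delta = time2 - time1

# total_seconds() accounts for days as well as seconds, unlike delta.seconds.
minutes = delta.total_seconds() / 60

print(minutes)  # 93.0
```

Using `total_seconds()` rather than the `seconds` attribute matters when the two dates are more than a day apart, because `seconds` only holds the sub-day remainder.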

How to randomly select (sample) the rows of a dataframe using pandas in Python?

Example of how to randomly select (sample) the rows of a dataframe using pandas in Python: [TOC] ### 1 -- Create a simple dataframe Let's create a simple dataframe with 5 columns and 20 rows: >>> import pandas as pd >>> import numpy as np >>> data = np.arange(1,101) >>> data = data.reshape(20,5) >>> df = pd.DataFrame(data=data,columns=['a','b','c','d','e']) >>> df a b c d e 0 1 2 3 4 5 1 6 7 8 9 10 2 11 12 13 14 15 3 16
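
The sampling step itself is not visible in the truncated excerpt above. A minimal sketch, assuming the pandas method `DataFrame.sample()` is what is wanted, could look like this; the sample sizes and the `random_state` value are arbitrary choices made for illustration.

```python
import numpy as np
import pandas as pd

# Same 20 x 5 dataframe as in the excerpt above.
data = np.arange(1, 101).reshape(20, 5)
df = pd.DataFrame(data=data, columns=['a', 'b', 'c', 'd', 'e'])

# Draw 5 rows at random, without replacement (the default).
# random_state is fixed only to make the example reproducible.
sample_n = df.sample(n=5, random_state=42)

# Draw a fraction of the rows instead: 25% of 20 rows gives 5 rows.
sample_frac = df.sample(frac=0.25, random_state=42)

print(sample_n)
```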

How to edit the values of a pandas dataframe column where a condition is met in Python?

Examples of how to edit the values of a pandas dataframe column where a condition is met in Python: [TOC] ### 1 -- Create a simple dataframe with pandas Let's start by creating a simple dataframe with 5 columns and 20 rows: >>> import pandas as pd >>> import numpy as np >>> data = np.arange(1,101) >>> data = data.reshape(20,5) >>> df = pd.DataFrame(data=data,columns=['a','b','c','d','e']) returns >>> df a b c d e 0 1 2 3 4 5 1 6 7 8
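
The editing step is cut off in the excerpt above. One common way to do it, shown here as a sketch, is to assign through `DataFrame.loc` with a boolean mask. The condition (`a > 50`) and the replacement value (`0`) are assumptions chosen for illustration and do not come from the original note.

```python
import numpy as np
import pandas as pd

# Same 20 x 5 dataframe as in the excerpt above.
data = np.arange(1, 101).reshape(20, 5)
df = pd.DataFrame(data=data, columns=['a', 'b', 'c', 'd', 'e'])

# Boolean mask: True for the rows where the (hypothetical) condition holds.
mask = df['a'] > 50

# Set column 'b' to 0 only on the rows selected by the mask.
df.loc[mask, 'b'] = 0

print(df.tail())
```

Assigning through `.loc` in a single step modifies the dataframe in place and avoids the chained-assignment pitfall of `df[mask]['b'] = 0`.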

How to select dataframe rows using a condition with pandas in Python?

Examples of how to select dataframe rows using a condition with pandas in Python: [TOC] ### 1 -- Create a simple dataframe Let's start by creating a simple dataframe with pandas: >>> import pandas as pd >>> data = {'Name':['Ben','Anna','Zow','Tom','John','Steve'], 'Age':[20,27,43,30,12,21], 'Sex':[1,0,0,1,1,1]} >>> df = pd.DataFrame(data) returns >>> df Age Name Sex 0 20 Ben 1 1 27 Anna 0 2 43 Zow 0 3 30 Tom 1 4 12 Joh
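
Since the selection step is truncated in the excerpt above, here is a hedged sketch of boolean indexing on the same data. The thresholds used (`Age > 20`, `Sex == 1`) are illustrative assumptions.

```python
import pandas as pd

# Same data as in the excerpt above.
data = {'Name': ['Ben', 'Anna', 'Zow', 'Tom', 'John', 'Steve'],
        'Age': [20, 27, 43, 30, 12, 21],
        'Sex': [1, 0, 0, 1, 1, 1]}
df = pd.DataFrame(data)

# Single condition: keep the rows where Age is strictly greater than 20.
older_than_20 = df[df['Age'] > 20]

# Two conditions combined with & (each wrapped in parentheses).
older_than_20_sex_1 = df[(df['Age'] > 20) & (df['Sex'] == 1)]

print(older_than_20)
print(older_than_20_sex_1)
```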

How to select dataframe columns that start with *** using pandas in Python?

Examples of how to select dataframe columns that start with *** using pandas in Python: [TOC] ### 1 -- Create a dataframe Let's start by creating a simple dataframe with 8 columns: import pandas as pd import numpy as np data = np.arange(1,33) data = data.reshape(4,8) df = pd.DataFrame(data=data,columns=['name','add_01','add_02','add_03', 'counrty','streed','zip code','county']) print(df) returns name
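
The column-selection step is not shown in the truncated excerpt above. A minimal sketch, assuming the prefix of interest is 'add', is to build the list of matching column names and index the dataframe with it. The column names are copied verbatim from the excerpt.

```python
import numpy as np
import pandas as pd

# Same 4 x 8 dataframe as in the excerpt above (column names kept as written there).
data = np.arange(1, 33).reshape(4, 8)
df = pd.DataFrame(data=data,
                  columns=['name', 'add_01', 'add_02', 'add_03',
                           'counrty', 'streed', 'zip code', 'county'])

# Option 1: list comprehension over the column names.
add_columns = [col for col in df.columns if col.startswith('add')]
df_add = df[add_columns]

# Option 2: vectorized string test on the column index.
df_add_alt = df.loc[:, df.columns.str.startswith('add')]

print(df_add)
```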

How to get a list of the names of data stored in an HDF5 file with pandas in Python?

Example of how to get a list of the names of data stored in an HDF5 file with pandas in Python: [TOC] ### 1 -- Read the file To read the HDF file (for example 'data.hdf5') with pandas, we can do: store = pd.HDFStore('data.hdf5') or with pd.HDFStore('data.hdf5') as store: ... ... ### 2 -- Get a list of the names of data stored in an HDF5 file To then get a list of the names of the data st

How to find the elements of a list that start with *** in Python?

Example of how to find the elements of a list that start with *** in Python: [TOC] ### 1 -- Create a simple list in Python Let's create a simple list of words in Python: l = ['name', 'address_01', 'address_02', 'address_03', 'job', 'income'] ### 2 -- Select the elements starting with *** To select, for example, the elements starting with 'address', we can use list comprehensions and the method [startswith()](https://www.tutorialspoint.com

How to find the smallest positive value in a list in Python?

Examples of how to find the smallest positive value in a list in Python: [TOC] ### 1 -- Find the minimum value Consider the following list: l = [ 7, 3, 6, 9, 2, -1, -5, -4, -3] To find the minimum value, we can use the Python function min(): min(l) which gives here -5 ### 2 -- Find the smallest positive value To find the smallest positive value, we can use a list comprehension and then min(): min([i f

How to get the difference in minutes between two dates in Python?

Examples of how to get the difference in minutes between two dates in Python: [TOC] ### 1 -- Create two different dates Let's first create two dates, time1 and time2, using the Python datetime module: import datetime year = 2008 month = 7 day = 8 hour = 12 minute = 47 second = 0 time1 = datetime.datetime(year,month,day,hour,minute,second) hour = 14 minute = 20 time2 = datetime.datetime(year,month,da

How to select (filter) the rows of a dataframe using a condition with pandas in Python?

Examples of how to select (filter) the rows of a dataframe using a condition with pandas in Python: [TOC] ### 1 -- Create a dataframe Let's start by creating a simple dataframe with pandas: >>> import pandas as pd >>> data = {'Name':['Ben','Anna','Zow','Tom','John','Steve'], 'Age':[20,27,43,30,12,21], 'Sex':[1,0,0,1,1,1]} >>> df = pd.DataFrame(data) which gives: >>> df Age Name Sex 0 20 Ben 1 1 27 Anna 0 2 43 Zow 0

How to edit the elements of a column where a condition is met with pandas in Python?

Example of how to edit the elements of a column where a condition is met with pandas in Python: [TOC] ### 1 -- Create a simple dataframe with pandas To create a simple dataframe with 5 columns and 20 rows, we can do: >>> import pandas as pd >>> import numpy as np >>> data = np.arange(1,101) >>> data = data.reshape(20,5) >>> df = pd.DataFrame(data=data,columns=['a','b','c','d','e']) which gives >>> df a b c d e 0 1 2

How to randomly select (sample) the rows of a dataframe with pandas in Python?

Example of how to randomly select (sample) the rows of a dataframe with pandas in Python: [TOC] ### 1 -- Create a simple dataframe Let's create a simple dataframe with 5 columns and 20 rows: >>> import pandas as pd >>> import numpy as np >>> data = np.arange(1,101) >>> data = data.reshape(20,5) >>> df = pd.DataFrame(data=data,columns=['a','b','c','d','e']) >>> df a b c d e 0 1 2 3 4 5 1 6 7 8 9 10 2 11 12

How to select the columns of a dataframe that start with *** with pandas in Python?

Example of how to select the columns of a dataframe that start with *** with pandas in Python: [TOC] ### 1 -- Create a simple dataframe Let's create a simple dataframe with 8 columns: import pandas as pd import numpy as np data = np.arange(1,33) data = data.reshape(4,8) df = pd.DataFrame(data=data,columns=['name','add_01','add_02','add_03', 'counrty','streed','zip code','county']) print(df) returns

How to calculate and plot a cumulative distribution function with matplotlib in Python?

Examples of how to calculate and plot a cumulative distribution function in Python: [TOC] ### 1 -- Generate random numbers For example, let's generate random numbers from a normal distribution: import numpy as np import matplotlib.pyplot as plt N = 100000 data = np.random.randn(N) ### 2 -- Create a histogram with matplotlib hx, hy, _ = plt.hist(data, bins=50, density=True,color="lightblue") plt.ylim(0.0,max(hx)+0.05) plt.title('Generate random numbers \n from a
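
The cumulative-distribution step is cut off in the excerpt above. One way to compute it, sketched here under the assumption that an empirical CDF built from the normalized histogram is acceptable, is to multiply each bin density by its width and take the cumulative sum. The sketch uses `np.histogram` instead of `plt.hist` so that it runs without a display; the seeded generator is only there for reproducibility.

```python
import numpy as np

# Random numbers from a standard normal distribution, as in the excerpt.
rng = np.random.default_rng(0)  # seed chosen arbitrarily
N = 100000
data = rng.standard_normal(N)

# Normalized histogram: `density` integrates to 1 over the bins.
density, bin_edges = np.histogram(data, bins=50, density=True)

# Probability mass of each bin = density * bin width; the running sum is the CDF
# evaluated at the right edge of each bin.
bin_widths = np.diff(bin_edges)
cdf = np.cumsum(density * bin_widths)

print(cdf[:3], cdf[-1])
```

To plot the result with matplotlib, `plt.plot(bin_edges[1:], cdf)` pairs each CDF value with the right edge of its bin.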

How to calculate a log-likelihood in Python (example with a normal distribution)?

Example of how to calculate a log-likelihood using a normal distribution in Python: [TOC] See the note [How to estimate the mean with a truncated dataset using python ?](https://www.science-emergence.com/Articles/How-to-estimate-the-mean-with-a-truncated-dataset-using-python-/) to understand why calculating a log-likelihood with a normal distribution is useful. ### 1 -- Generate random numbers from a normal distribution For example, let's create a sample of 100000 rando
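
For reference, the log-likelihood of $n$ independent observations under a normal distribution is $-\frac{n}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_i (x_i-\mu)^2$. The sketch below implements that formula directly; the function name `normal_log_likelihood` and the sample parameters (mean 3.0, standard deviation 2.0) are assumptions made for illustration, since the excerpt is truncated before it gives its own values.

```python
import numpy as np


def normal_log_likelihood(x, mu, sigma):
    """Log-likelihood of the sample `x` under a normal N(mu, sigma^2).

    Hypothetical helper written for this sketch, not taken from the original note.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    return (-0.5 * n * np.log(2.0 * np.pi * sigma ** 2)
            - np.sum((x - mu) ** 2) / (2.0 * sigma ** 2))


# Illustrative sample: 100000 draws with assumed mean 3.0 and std 2.0.
rng = np.random.default_rng(0)
sample = rng.standard_normal(100000) * 2.0 + 3.0

ll_true = normal_log_likelihood(sample, mu=3.0, sigma=2.0)
ll_shifted = normal_log_likelihood(sample, mu=4.0, sigma=2.0)

# The log-likelihood is higher near the parameters that generated the data.
print(ll_true > ll_shifted)
```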

How to add text (units, %, etc.) to heatmap cell annotations using seaborn in Python?

Example of how to add text (units, %, etc.) to heatmap cell annotations using seaborn in Python: [TOC] ### 1 -- Create a simple heatmap with seaborn Let's create a heatmap with seaborn: import seaborn as sns import numpy as np import pandas as pd import matplotlib.pyplot as plt data = np.array([[25.55535942, 1.99598017, 9.78107706], [ 4.95758736, 39.68268716, 16.78109873], [ 0.45401194, 0.10003128, 0.6921669 ]]) df = pd.DataFrame(data=data) fig = plt.fig
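
The step that appends the unit is not visible in the truncated excerpt above. One possible approach, given here as a sketch and not necessarily the one used in the original note, is to draw the heatmap with `annot=True` and then edit the annotation `Text` objects that seaborn adds to the axes. The " %" suffix, the `.1f` format, and the non-interactive "Agg" backend are assumptions made for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Same data as in the excerpt above.
data = np.array([[25.55535942, 1.99598017, 9.78107706],
                 [4.95758736, 39.68268716, 16.78109873],
                 [0.45401194, 0.10003128, 0.6921669]])
df = pd.DataFrame(data=data)

fig, ax = plt.subplots()
sns.heatmap(df, annot=True, fmt=".1f", ax=ax)

# Each cell annotation is a matplotlib Text object stored in ax.texts;
# append the unit to its current string.
for text in ax.texts:
    text.set_text(text.get_text() + " %")

labels = [text.get_text() for text in ax.texts]
print(labels)
```

An alternative is to pass an array of pre-formatted strings as `annot` together with `fmt=""`.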

How to estimate the mean with a truncated dataset using Python?

Examples of how to estimate the mean with a truncated dataset using python for data generated from a normal distribution: [TOC] ### 1 -- Create a dataset of random numbers from a normal distribution Create a set of random numbers distributed according to a normal distribution: import scipy.stats import numpy as np import matplotlib.pyplot as plt mu_0 = 2.0 srd_0 = 4.0 data = np.random.randn(100000) data = data * srd_0 + mu_0 data = data.reshape(-1, 1) ### 2 --

How to use a Gaussian mixture model (GMM) with sklearn in Python?

Examples of how to use a Gaussian mixture model (GMM) with sklearn in Python: [TOC] from sklearn import mixture import numpy as np import matplotlib.pyplot as plt ### 1 -- Example with one Gaussian Let's generate random numbers from a normal distribution with a mean $\mu_0 = 5$ and a standard deviation $\sigma_0 = 2$: mu_0 = 5.0 srd_0 = 2.0 data = np.random.randn(100000) data = data * srd_0 + mu_0 data = data.reshape(-1, 1) Plot the data
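
The fitting step is cut off in the excerpt above. A minimal sketch of the one-Gaussian case, assuming `sklearn.mixture.GaussianMixture` with a single component is the intended model, is given below. The seed and `random_state` values are arbitrary and only make the example reproducible.

```python
import numpy as np
from sklearn import mixture

# Same setup as in the excerpt: mean 5, standard deviation 2.
rng = np.random.default_rng(0)
mu_0 = 5.0
srd_0 = 2.0
data = rng.standard_normal(100000) * srd_0 + mu_0
data = data.reshape(-1, 1)  # sklearn expects a 2D array (n_samples, n_features)

# Fit a mixture with a single Gaussian component.
gmm = mixture.GaussianMixture(n_components=1, covariance_type='full', random_state=0)
gmm.fit(data)

# With covariance_type='full', covariances_ has shape (n_components, n_features, n_features).
mu_hat = float(gmm.means_[0, 0])
sigma_hat = float(np.sqrt(gmm.covariances_[0, 0, 0]))

print(mu_hat, sigma_hat)
```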

How to generate random numbers from a log-normal distribution in Python?

Example of how to generate random numbers from a log-normal distribution in Python: [TOC] Log-normal distribution: \begin{equation} \frac{1}{x \sigma \sqrt{2\pi}} \exp\left(-\frac{(\ln x-\mu)^2}{2\sigma^2}\right) \end{equation} ### 2 -- Using scipy lognorm Example of how to generate random numbers from a log-normal distribution with $\mu=0$ and $\sigma=0.5$ using the scipy function lognorm: from scipy.stats import lognorm import numpy as np import matplotlib.pyplot as plt st
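
The scipy call itself is truncated in the excerpt above, so the following is a sketch of how `scipy.stats.lognorm` is usually parameterized: the shape parameter `s` is $\sigma$ and `scale` is $e^{\mu}$. The sample size and `random_state` are assumptions made for this example.

```python
import numpy as np
from scipy.stats import lognorm

mu = 0.0
sigma = 0.5

# In scipy's parameterization: s = sigma and scale = exp(mu).
samples = lognorm.rvs(s=sigma, scale=np.exp(mu), size=100000, random_state=0)

# Sanity check: the logarithm of the samples should be normal with mean mu and std sigma.
log_samples = np.log(samples)
print(log_samples.mean(), log_samples.std())
```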

How to add text on an image using pillow in Python?

Example of how to add text on an image using pillow in Python: [TOC] ### 1 -- Create an image with pillow and add text on it Example 1: let's create a simple image with a red background: from PIL import Image img = Image.new('RGB', (600, 400), color = 'red') img.save('pil_red.png') [image:pilred02 size:50 caption:How to add text on an image using pillow in python ?] To add text, you must first download a 'font' file locally to your machine, for example f
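
The excerpt above goes on to use a downloaded font file, whose name is cut off. As a sketch that runs without any external file, the example below uses Pillow's built-in default font via `ImageFont.load_default()` instead; this is a substitution, not the method of the original note. The text, position, and colour are arbitrary choices.

```python
from PIL import Image, ImageDraw, ImageFont

# Red 600 x 400 image, as in the excerpt above.
img = Image.new('RGB', (600, 400), color='red')

# ImageDraw provides the drawing operations, including text().
draw = ImageDraw.Draw(img)

# Built-in fallback font; swap in ImageFont.truetype(<path to a .ttf file>, <size>)
# to use a downloaded font as the original note suggests.
font = ImageFont.load_default()

# Draw white text with its top-left corner near (20, 20).
draw.text((20, 20), "Hello World", fill=(255, 255, 255), font=font)

# img.save('pil_text.png') would write the result to disk (file name is hypothetical).
```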

How to increase the size of the cell text (annotations) of a seaborn heatmap in Python?

Example of how to increase the size of the cell text (annotations) of a seaborn heatmap in Python: [TOC] ### 1 -- Create a simple heatmap using seaborn Let's first create a simple heatmap using seaborn: import seaborn as sns import numpy as np import pandas as pd import matplotlib.pyplot as plt data = np.array([[25.55535942, 1.99598017, 9.78107706], [ 4.95758736, 39.68268716, 16.78109873], [ 0.45401194, 0.10003128, 0.6921669 ]]) df = pd.DataFrame(data=data
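
The resizing step is not shown in the truncated excerpt above. A common way to do it, sketched here, is the `annot_kws` argument of `sns.heatmap`, which forwards keyword arguments to the annotation text. The size of 16 points and the "Agg" backend are assumptions made for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Same data as in the excerpt above.
data = np.array([[25.55535942, 1.99598017, 9.78107706],
                 [4.95758736, 39.68268716, 16.78109873],
                 [0.45401194, 0.10003128, 0.6921669]])
df = pd.DataFrame(data=data)

fig, ax = plt.subplots()

# annot_kws is passed on to the matplotlib text calls that draw the annotations.
sns.heatmap(df, annot=True, fmt=".1f", annot_kws={"size": 16}, ax=ax)

annotation_sizes = [text.get_fontsize() for text in ax.texts]
print(annotation_sizes)
```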

How to increase the size of axes labels on a seaborn heatmap in Python?

Examples of how to increase the size of axes labels on a seaborn heatmap in python: [TOC] ### 1 -- Create a simple heatmap using seaborn Let's create a first simple heatmap using seaborn: import seaborn as sns import numpy as np import pandas as pd import matplotlib.pyplot as plt data = np.array([[25.55535942, 1.99598017, 9.78107706], [ 4.95758736, 39.68268716, 16.78109873], [ 0.45401194, 0.10003128, 0.6921669 ]]) df = pd.DataFrame(data=data,columns=['C1','
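
The excerpt above is cut off before the label-resizing step. One possible approach, given as a sketch, is to use the matplotlib `Axes` returned by `sns.heatmap`: `tick_params(labelsize=...)` for the tick labels and the `fontsize` argument of `set_xlabel` / `set_ylabel` for the axis titles. The label texts, the sizes, and the default integer column names are assumptions, since the column list in the excerpt is truncated.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Same data as in the excerpt above; default column names are used here
# because the column list in the excerpt is cut off.
data = np.array([[25.55535942, 1.99598017, 9.78107706],
                 [4.95758736, 39.68268716, 16.78109873],
                 [0.45401194, 0.10003128, 0.6921669]])
df = pd.DataFrame(data=data)

fig, ax = plt.subplots()
sns.heatmap(df, annot=True, fmt=".1f", ax=ax)

# Tick labels (the row / column names printed along each axis).
ax.tick_params(axis='both', labelsize=14)

# Axis titles (hypothetical label texts).
ax.set_xlabel("columns", fontsize=18)
ax.set_ylabel("rows", fontsize=18)
```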