Read dataset and reset index in place:
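A minimal sketch, assuming the data arrives as CSV. The inline text and the column names `name` and `score` are made up so the example runs without a file; in practice a path would be passed to `pd.read_csv`.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a real file.
csv_text = "name,score\nann,1.5\nbob,2.0\ncat,3.5\n"
df = pd.read_csv(io.StringIO(csv_text))

# Filtering leaves gaps in the index (here the labels 1 and 2 remain).
df = df[df["score"] > 1.8].copy()

# drop=True discards the old labels instead of keeping them as a column.
df.reset_index(drop=True, inplace=True)
```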
Save dataset to a CSV file:
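One way this might look, writing to a temporary directory so nothing is left behind. The file name `dataset.csv` is an assumption.

```python
import os
import tempfile
import pandas as pd

df = pd.DataFrame({"name": ["ann", "bob"], "score": [1.5, 2.0]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "dataset.csv")  # hypothetical file name
    # index=False keeps the row index out of the file.
    df.to_csv(path, index=False)
    restored = pd.read_csv(path)
```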
Visualize DataFrame:
Convert object (string) columns to Categorical columns:
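A sketch on a toy frame with made-up columns `color` and `size`. The `color` column is created with `dtype=object` explicitly, because the default string dtype may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({
    "color": pd.Series(["red", "blue", "red"], dtype=object),
    "size": [1, 2, 3],
})

# Find the object-dtype columns and convert each one.
object_cols = df.select_dtypes(include="object").columns
for col in object_cols:
    df[col] = df[col].astype("category")
```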
Convert float columns to int columns (as long as there are no missing values):
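A possible version with an explicit check for missing values first. The column names `count` and `ratio` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"count": [1.0, 2.0, 3.0], "ratio": [0.5, 1.0, 1.5]})

# NaN cannot be stored in an int64 column, so check before converting.
# Note that astype truncates any fractional part.
if df["count"].notna().all():
    df = df.astype({"count": "int64"})
```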
Convert Categorical columns into dummy columns:
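A short sketch using `pd.get_dummies` on a hypothetical categorical column `color`:

```python
import pandas as pd

df = pd.DataFrame({
    "color": pd.Categorical(["red", "blue", "red"]),
    "size": [1, 2, 3],
})

# Each category becomes its own indicator column, prefixed with the
# original column name; the original column is removed.
dummies = pd.get_dummies(df, columns=["color"])
```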
Count null values of each column:
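For instance, on a toy frame whose columns `a`, `b`, `c` are made up:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.nan, np.nan, 1.0],
    "c": [1, 2, 3],
})

# isnull() gives a boolean frame; summing counts the True values per column.
null_counts = df.isnull().sum()
```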
Select rows with one or more null values:
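A sketch with a boolean row mask, again on invented data:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [1.0, 2.0, np.nan],
    "c": [7, 8, 9],
})

# any(axis=1) is True for a row if at least one of its cells is null.
rows_with_nulls = df[df.isnull().any(axis=1)]
```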
Fill missing values:
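One common pattern is a per-column fill value. The choices below (column mean for `age`, the placeholder `"unknown"` for `city`) are assumptions for illustration, not a recommendation.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "age": [20.0, np.nan, 40.0],
    "city": ["Rome", None, "Oslo"],
})

# A dict maps each column to its own fill value.
filled = df.fillna({"age": df["age"].mean(), "city": "unknown"})
```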
Remove missing values:
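A sketch of both directions with `dropna`, on made-up data:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [1.0, 2.0, np.nan],
    "c": [1, 2, 3],
})

# Drop every row that contains at least one missing value.
complete_rows = df.dropna()

# Drop every column that contains at least one missing value.
complete_cols = df.dropna(axis=1)
```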
Remove columns/features:
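For example, dropping a hypothetical `id` column:

```python
import pandas as pd

df = pd.DataFrame({"id": [101, 102], "x": [1.0, 2.0], "y": [3.0, 4.0]})

# drop returns a new frame; the original is left unchanged.
trimmed = df.drop(columns=["id"])
```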
Remove duplicate rows of DataFrame:
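A minimal sketch on invented data:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 2], "b": [5, 5, 6]})

# By default the first occurrence of each duplicated row is kept.
deduped = df.drop_duplicates()
```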
Add interaction term:
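Assuming the interaction term is the element-wise product of two features (here the hypothetical `x1` and `x2`), it could be added like this:

```python
import pandas as pd

df = pd.DataFrame({"x1": [1, 2, 3], "x2": [4, 5, 6]})

# The interaction term is the row-by-row product of the two features.
df["x1_x2"] = df["x1"] * df["x2"]
```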
Reset the index to the default integer index:
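A sketch contrasting the two behaviours of `reset_index` on a frame with made-up string labels:

```python
import pandas as pd

df = pd.DataFrame({"value": [10, 20, 30]}, index=["x", "y", "z"])

# drop=False (the default) keeps the old labels as a new "index" column.
kept = df.reset_index()

# drop=True throws the old labels away.
dropped = df.reset_index(drop=True)
```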
Convert pandas column to DateTime:
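A short example, assuming the dates are stored as ISO-formatted strings in a hypothetical `date` column:

```python
import pandas as pd

df = pd.DataFrame({"date": ["2021-01-15", "2021-02-20"]})

# An explicit format avoids ambiguous day/month parsing.
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d")
```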
Select rows between two dates:
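One way to do this is a boolean mask with both bounds inclusive. Column names and dates are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-15", "2021-02-20", "2021-03-25"]),
    "value": [1, 2, 3],
})

start, end = "2021-02-01", "2021-03-31"

# Date strings are compared against the datetime column directly.
mask = (df["date"] >= start) & (df["date"] <= end)
selected = df.loc[mask]
```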
Select rows if value in column is in a list of values:
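A sketch using `isin`, with a made-up `city` column and list of wanted values:

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["Rome", "Oslo", "Lima", "Rome"],
    "value": [1, 2, 3, 4],
})

wanted = ["Rome", "Lima"]

# isin returns True for each row whose value appears in the list.
selected = df[df["city"].isin(wanted)]
```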
Sort by the values along either axis:
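For example, sorting rows by a hypothetical `score` column in descending order:

```python
import pandas as pd

df = pd.DataFrame({"name": ["ann", "bob", "cat"], "score": [2, 3, 1]})

# Rows are reordered by the values in "score", largest first.
by_score = df.sort_values(by="score", ascending=False)
```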
Sort object by labels (along an axis):
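A sketch of `sort_index` along each axis, on invented data:

```python
import pandas as pd

df = pd.DataFrame({"b": [1, 2, 3], "a": [4, 5, 6]}, index=[2, 0, 1])

# axis=0 (default) sorts by row labels, axis=1 by column labels.
by_row_label = df.sort_index()
by_column_label = df.sort_index(axis=1)
```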
Replace values:
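A possible form uses a dict that maps old values to new ones. The abbreviations below are made up.

```python
import pandas as pd

df = pd.DataFrame({"city": ["NYC", "LA", "NYC"]})

# Values not listed in the mapping are left unchanged.
df["city"] = df["city"].replace({"NYC": "New York", "LA": "Los Angeles"})
```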
Normalize features:
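"Normalize" can mean several things. The sketch below assumes min-max scaling to the range [0, 1] and uses plain pandas arithmetic; a scaler from a machine-learning library would be an equally valid choice.

```python
import pandas as pd

df = pd.DataFrame({"a": [0.0, 5.0, 10.0], "b": [1.0, 2.0, 3.0]})

# Min-max scaling, applied column by column through broadcasting.
# A constant column would divide by zero and needs separate handling.
normalized = (df - df.min()) / (df.max() - df.min())
```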
Standard database join operations between DataFrame or named Series objects:
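A sketch of an inner and an outer join with `pd.merge`, assuming both frames share a hypothetical `key` column:

```python
import pandas as pd

left = pd.DataFrame({"key": [1, 2, 3], "name": ["ann", "bob", "cat"]})
right = pd.DataFrame({"key": [2, 3, 4], "score": [0.5, 0.7, 0.9]})

# Inner join keeps only keys present in both frames.
inner = pd.merge(left, right, on="key", how="inner")

# Outer join keeps all keys and fills the gaps with NaN.
outer = pd.merge(left, right, on="key", how="outer")
```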
Concatenate DataFrame objects:
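For illustration, stacking two small frames vertically and side by side:

```python
import pandas as pd

first = pd.DataFrame({"a": [1, 2]})
second = pd.DataFrame({"a": [3, 4]})

# Stack rows; ignore_index=True renumbers the result from 0.
stacked = pd.concat([first, second], ignore_index=True)

# axis=1 places the frames next to each other, aligned on the index.
side_by_side = pd.concat([first, second.rename(columns={"a": "b"})], axis=1)
```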
Split features X and labels y:
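A sketch assuming the target lives in a column named `label` (a made-up name):

```python
import pandas as pd

df = pd.DataFrame({"f1": [1, 2, 3], "f2": [4, 5, 6], "label": [0, 1, 0]})

# X holds every column except the label; y is the label column alone.
X = df.drop(columns=["label"])
y = df["label"]
```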
Split training set and test set:
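A pandas-only sketch of an 80/20 random split. This swaps in `DataFrame.sample` in place of a dedicated helper such as scikit-learn's `train_test_split`; the split ratio and seed are assumptions.

```python
import pandas as pd

df = pd.DataFrame({"x": range(10), "y": [0, 1] * 5})

# Draw 80% of the rows at random for training (seeded for repeatability).
train = df.sample(frac=0.8, random_state=0)

# Everything that was not sampled becomes the test set.
test = df.drop(train.index)
```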
Add a column of constant 1’s:
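One way to do this with `DataFrame.insert`, placing the column first. The name `const` is an assumption.

```python
import pandas as pd

df = pd.DataFrame({"x": [1.0, 2.0, 3.0]})

# insert modifies the frame in place; position 0 puts the column first.
df.insert(0, "const", 1)
```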
Swap/reorder columns of DataFrame objects:
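A sketch of both an explicit reorder and a swap of the first two columns:

```python
import pandas as pd

df = pd.DataFrame({"a": [1], "b": [2], "c": [3]})

# Reorder by selecting the columns in the desired order.
reordered = df[["c", "a", "b"]]

# Swap the first two columns by swapping their names in a list.
cols = list(df.columns)
cols[0], cols[1] = cols[1], cols[0]
swapped = df[cols]
```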
Visualize the number of distinct values that each feature can take:
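For example, with `nunique` on a toy frame:

```python
import pandas as pd

df = pd.DataFrame({"color": ["red", "blue", "red"], "size": [1, 2, 3]})

# One count of distinct values per column.
distinct_counts = df.nunique()
```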
Visualize the number of distinct values that each feature can take and the corresponding data type:
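A possible layout puts both pieces of information in one summary frame. The column labels `n_unique` and `dtype` are made up.

```python
import pandas as pd

df = pd.DataFrame({"flag": [True, False, True], "size": [1, 2, 3]})

# Both Series are indexed by column name, so they align row by row.
summary = pd.DataFrame({"n_unique": df.nunique(), "dtype": df.dtypes})
```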
Visualize counts of unique values in descending order of frequency:
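A minimal sketch with `value_counts` on invented data:

```python
import pandas as pd

s = pd.Series(["a", "b", "a", "c", "a", "b"])

# Counts are sorted from most to least frequent by default.
counts = s.value_counts()
```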
For a certain column, parse string representations to lists of elements:
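Assuming the strings are valid Python list literals, `ast.literal_eval` can parse them safely. The column name `tags` is hypothetical.

```python
import ast
import pandas as pd

df = pd.DataFrame({"tags": ["['a', 'b']", "['c']"]})

# literal_eval only evaluates literals, so it is safer than eval.
df["tags"] = df["tags"].apply(ast.literal_eval)
```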
For a certain column, convert lists of strings to dummies:
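One approach, sketched under the assumption that no element contains the `|` character, joins each list into a delimited string and lets `str.get_dummies` split it again:

```python
import pandas as pd

df = pd.DataFrame({"tags": [["a", "b"], ["b"], []]})

# Join each list with "|" (the default separator of str.get_dummies),
# then expand into one indicator column per distinct element.
dummies = df["tags"].str.join("|").str.get_dummies()
```

An empty list becomes a row of zeros.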
Get a subset of the DataFrame’s columns based on the column dtypes:
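A sketch with `select_dtypes`, on a frame with made-up columns:

```python
import pandas as pd

df = pd.DataFrame({"n": [1, 2], "x": [0.5, 1.5], "flag": [True, False]})

# "number" matches integer and float columns, but not booleans.
numeric = df.select_dtypes(include="number")
non_numeric = df.select_dtypes(exclude="number")
```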
Make plots of Series or DataFrame (matplotlib is used by default):
Make histograms:
Generate descriptive statistics (count, mean, std, min, 25%, 50%, 75%, max for each feature) of a GroupBy object:
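For instance, grouping on a hypothetical `group` column and describing `value`:

```python
import pandas as pd

df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "value": [1.0, 3.0, 10.0, 20.0],
})

# One row of summary statistics per group.
stats = df.groupby("group")["value"].describe()
```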
MultiIndex indexing:
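A sketch of three common lookups on a two-level index. The level names `letter` and `number` are made up.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1)], names=["letter", "number"]
)
df = pd.DataFrame({"value": [10, 20, 30]}, index=index)

# All rows under the outer label "a".
outer = df.loc["a"]

# A single cell, addressed by the full index tuple and a column.
single = df.loc[("a", 2), "value"]

# Cross-section: all rows whose inner level equals 1.
cross = df.xs(1, level="number")
```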
Revert from MultiIndex to single index dataframe:
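A minimal sketch, with hypothetical level names:

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("a", 1), ("b", 2)], names=["letter", "number"]
)
df = pd.DataFrame({"value": [10, 20]}, index=index)

# Each index level becomes an ordinary column again.
flat = df.reset_index()
```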
Transform multiple columns to MultiIndex:
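For example, promoting two made-up columns to index levels with `set_index`:

```python
import pandas as pd

df = pd.DataFrame({
    "letter": ["a", "a", "b"],
    "number": [1, 2, 1],
    "value": [10, 20, 30],
})

# Passing a list of columns builds one index level per column.
indexed = df.set_index(["letter", "number"])
```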
Slice a MultiIndex DataFrame with a condition based on the index:
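One way to filter on an index level is `get_level_values`. The levels `letter` and `year` below are illustrative.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["a", "b"], [2019, 2020, 2021]], names=["letter", "year"]
)
df = pd.DataFrame({"value": range(6)}, index=index)

# Build a boolean mask from one index level and use it to filter rows.
recent = df[df.index.get_level_values("year") >= 2020]
```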
Exchange index level of a MultiIndex DataFrame:
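A sketch with `swaplevel`, using made-up level names:

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("a", 1), ("b", 2)], names=["letter", "number"]
)
df = pd.DataFrame({"value": [10, 20]}, index=index)

# The two named levels trade places; the data is unchanged.
swapped = df.swaplevel("letter", "number")
```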
Rename a Series:
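A minimal sketch (the names `old` and `new` are placeholders):

```python
import pandas as pd

s = pd.Series([1, 2, 3], name="old")

# Passing a scalar to rename changes the Series name, not its index.
renamed = s.rename("new")
```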
Convert Series to DataFrame:
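For example, with `to_frame`:

```python
import pandas as pd

s = pd.Series([1, 2, 3], name="value")

# The Series name becomes the column name.
frame = s.to_frame()

# A different column name can be given explicitly.
renamed_frame = s.to_frame(name="amount")
```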
Build a pd.Timestamp (datetime) object:
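Two equivalent constructions, sketched with an arbitrary date:

```python
import pandas as pd

# From a string.
ts_from_string = pd.Timestamp("2021-03-01 12:30:00")

# From individual components.
ts_from_parts = pd.Timestamp(year=2021, month=3, day=1, hour=12, minute=30)
```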
Add date offset to a pd.Timestamp object:
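A sketch with `pd.DateOffset`, using an arbitrary start date:

```python
import pandas as pd

ts = pd.Timestamp("2021-01-31")

# Calendar-aware: adding one month to Jan 31 lands on the last day of Feb.
next_month = ts + pd.DateOffset(months=1)

# Fixed-length offsets work the same way.
next_week = ts + pd.DateOffset(days=7)
```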
Slice a dataframe based on datetime index (if datetime column has been set as the index):
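A sketch of partial-string indexing on a made-up daily index:

```python
import pandas as pd

idx = pd.date_range("2021-01-01", periods=90, freq="D")
df = pd.DataFrame({"value": range(90)}, index=idx)

# A partial date string selects the whole period (all of January here).
january = df.loc["2021-01"]

# With a slice, both endpoints are included.
window = df.loc["2021-02-10":"2021-02-12"]
```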
Conversion of pd.Timedelta Series, pd.TimedeltaIndex, and pd.Timedelta scalars:
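Assuming the goal is to turn durations into plain numbers, dividing by a unit `pd.Timedelta` or calling `total_seconds` are two options:

```python
import pandas as pd

s = pd.Series(pd.to_timedelta(["1 days", "12 hours"]))

# Dividing by a unit Timedelta gives floats in that unit.
days = s / pd.Timedelta(days=1)

# total_seconds gives the duration in seconds as floats.
seconds = s.dt.total_seconds()

# The same division works for scalars.
hours = pd.Timedelta("90 minutes") / pd.Timedelta(hours=1)
```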
Rolling window calculations:
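A small sketch of a rolling mean and a rolling sum. The window size of 3 is an arbitrary choice.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5])

# The first two results are NaN because the window is not yet full.
rolling_mean = s.rolling(window=3).mean()

# min_periods=1 produces a result as soon as one value is available.
rolling_sum = s.rolling(window=3, min_periods=1).sum()
```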
Count the number of True/False in each row/column:
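Since `True` counts as 1 when summed, a sketch on a boolean frame could be:

```python
import pandas as pd

df = pd.DataFrame({"a": [True, False, True], "b": [True, True, False]})

# Number of True values per column and per row.
true_per_column = df.sum()
true_per_row = df.sum(axis=1)

# Invert the frame to count False values instead.
false_per_column = (~df).sum()
```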
Print all rows and all columns of a DataFrame:
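One way to lift the display limits temporarily is `pd.option_context`. The 100-row frame is only there to exceed the default row limit.

```python
import pandas as pd

df = pd.DataFrame({"x": range(100)})

# The options revert to their previous values when the block exits.
with pd.option_context("display.max_rows", None, "display.max_columns", None):
    full_text = repr(df)
    print(full_text)
```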