This post collects useful notes on pandas, the popular Python package for data analysis. It covers more advanced topics, such as grouping DataFrames, grouping data with custom Python functions, and joining DataFrames.
1. Groupby dataframe:
# A good way to check whether a column has any null values:
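The snippet that followed this comment is not preserved here, so the following is only a minimal sketch of one common way to run the check. The DataFrame and its column names (`country`, `population`) are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({
    "country": ["France", "Japan", "Kenya"],
    "population": [67.0, np.nan, 54.0],
})

# True if the column contains at least one null (NaN/None) value.
has_nulls = df["population"].isnull().any()

# Number of nulls in every column. Worth checking before grouping,
# because groupby drops rows whose key is null by default.
null_counts = df.isnull().sum()

print(has_nulls)
print(null_counts)
```

`isnull()` returns a boolean mask, so `.any()` answers "is there at least one?" and `.sum()` answers "how many?".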
2. Group data practical usage: agg function
# Group by a condition ('Continent' in this case) to get a GroupBy object, then use a key (a continent name) to access each group:
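As a rough sketch of the steps described in that comment, the example below groups a small, made-up DataFrame by `Continent`, pulls out one group by its key, and then summarizes each group with `agg`. The data and the `Country` and `Population` columns are assumptions for illustration, not the original dataset.

```python
import pandas as pd

# Hypothetical sample data (population in millions, rounded).
df = pd.DataFrame({
    "Continent": ["Europe", "Europe", "Asia", "Asia", "Africa"],
    "Country": ["France", "Spain", "Japan", "India", "Kenya"],
    "Population": [67, 47, 125, 1400, 54],
})

# groupby returns a lazy GroupBy object; nothing is computed yet.
grouped = df.groupby("Continent")

# Iterating yields (key, sub-DataFrame) pairs.
for continent, group in grouped:
    print(continent, len(group))

# Access a single group directly by its key.
europe = grouped.get_group("Europe")

# agg computes one or more summary statistics per group.
summary = grouped["Population"].agg(["sum", "mean", "max"])
print(summary)
```

Passing a list of function names to `agg` produces one column per statistic, indexed by the group key.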
3. Group data advanced usage: apply function
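The original example for this section did not survive, so here is a minimal sketch of how `apply` can run a custom Python function on each group. The data, the column names, and the helper `top_country` are all hypothetical.

```python
import pandas as pd

# Hypothetical sample data (population in millions, rounded).
df = pd.DataFrame({
    "Continent": ["Europe", "Europe", "Asia", "Asia", "Africa"],
    "Country": ["France", "Spain", "Japan", "India", "Kenya"],
    "Population": [67, 47, 125, 1400, 54],
})


def top_country(group: pd.DataFrame) -> pd.Series:
    """Return the most populous country within one group's sub-DataFrame."""
    row = group.loc[group["Population"].idxmax()]
    return pd.Series({"Country": row["Country"], "Population": row["Population"]})


# Select the columns the function needs, then apply it to each group.
# Each returned Series becomes one row of the result, indexed by Continent.
result = df.groupby("Continent")[["Country", "Population"]].apply(top_country)
print(result)
```

Unlike `agg`, which expects each function to reduce a column to a single value, `apply` hands the whole sub-DataFrame to the function, so it can use several columns at once.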
4. Reduce data size with the ‘category’ datatype:
NOTE: This only suits columns with a limited set of distinct string values, for instance a type column in a movies dataframe or a gender column in a students dataframe.
# Using 'category' can massively reduce the size of a dataframe:
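To make the saving concrete, here is a small sketch that measures a column's memory before and after the conversion. The `movies` DataFrame and its values are invented for illustration; the actual reduction depends on how many distinct values the column holds.

```python
import pandas as pd

# Hypothetical movies table: 10,000 rows but only four distinct 'type' values.
movies = pd.DataFrame({
    "title": [f"movie_{i}" for i in range(10_000)],
    "type": ["comedy", "drama", "action", "horror"] * 2_500,
})

# deep=True counts the bytes of the string values themselves.
before = movies["type"].memory_usage(deep=True)

# Store each distinct string once and keep small integer codes per row.
movies["type"] = movies["type"].astype("category")
after = movies["type"].memory_usage(deep=True)

print(f"string column: {before} bytes, category column: {after} bytes")
```

The conversion pays off because each row now holds a small integer code instead of a full string. For a column where nearly every value is unique, such as `title` here, it would save little or nothing.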