mobts.utils package
Submodules
mobts.utils.formatting module
Initial formatting of the dataset
This module contains:
- standardizing input names
- removing rows with missing counter names
- removing counters which do not have sufficient present observations
- transforming the timestamp column to datetime type
- adding further temporal columns
- mobts.utils.formatting._add_temporal_columns(df: DataFrame, freq: str, cols: ColumnsConfig = ColumnsConfig(counter='counter', timestamp='timestamp', count='count', weekday='weekday', week_num='week_num', how='how', hour='hour', date='date')) → DataFrame
Adding further temporal columns
df: Dataset
cols: Column config
For daily frequency: DataFrame with additional date, weekday, and week number columns
For hourly frequency: all of the above, plus hour and hour-of-week (how) columns
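The derived columns can be illustrated with plain pandas. This is a minimal sketch, not the mobts implementation: the function name is hypothetical, the `"h"` label for hourly frequency is an assumption (the accepted `freq` values are not listed here), and so are the Monday=0 weekday encoding and the `weekday * 24 + hour` definition of hour of week.

```python
import pandas as pd


def add_temporal_columns_sketch(df: pd.DataFrame, freq: str) -> pd.DataFrame:
    """Hypothetical stand-in for _add_temporal_columns (illustration only)."""
    out = df.copy()
    ts = out["timestamp"]  # must already be a datetime column

    # Columns added for every frequency
    out["date"] = ts.dt.date
    out["weekday"] = ts.dt.weekday  # assumption: Monday=0 ... Sunday=6
    out["week_num"] = ts.dt.isocalendar().week.astype(int)  # ISO week number

    # Extra columns for hourly data; "h" is an assumed frequency label
    if freq == "h":
        out["hour"] = ts.dt.hour
        # assumption: hour of week counts hours since Monday 00:00
        out["how"] = out["weekday"] * 24 + out["hour"]
    return out
```

With daily data the sketch leaves out `hour` and `how`, which mirrors the two return descriptions above.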
- mobts.utils.formatting._drop_nan_counters(df: DataFrame, cols: ColumnsConfig = ColumnsConfig(counter='counter', timestamp='timestamp', count='count', weekday='weekday', week_num='week_num', how='how', hour='hour', date='date')) → DataFrame
Remove rows with missing counter names
df: Dataset
cols: Column config
DataFrame with no missing counter names
Number of rows with missing counter names
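A minimal sketch of this step, assuming the canonical `counter` column name. The function name is hypothetical, and returning the count as a second tuple element is an illustration choice: how the real function reports the number of dropped rows (return value, log message, or otherwise) is not stated here.

```python
import pandas as pd


def drop_nan_counters_sketch(df: pd.DataFrame, counter_col: str = "counter"):
    """Hypothetical stand-in for _drop_nan_counters (illustration only)."""
    # Count the rows whose counter name is missing before dropping them
    n_missing = int(df[counter_col].isna().sum())
    cleaned = df.dropna(subset=[counter_col]).reset_index(drop=True)
    return cleaned, n_missing
```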
- mobts.utils.formatting._drop_sparse_counters(df: DataFrame, cols: ColumnsConfig = ColumnsConfig(counter='counter', timestamp='timestamp', count='count', weekday='weekday', week_num='week_num', how='how', hour='hour', date='date'), cfg_spr: SparsityConfig = SparsityConfig(drop_sparse_counters=True, sparse_threshold=0.5)) → DataFrame
Remove counters with sparse observations
df: Dataset
cols: Column config
cfg_spr: Sparsity config
DataFrame with no sparse counters
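One way to picture the filter is sketched below. It rests on assumptions: sparsity is measured as the share of non-missing `count` values per counter, and a counter is kept when that share is at least `sparse_threshold`. The exact sparsity measure and comparison used by mobts are not documented here, and the function name is hypothetical.

```python
import pandas as pd


def drop_sparse_counters_sketch(
    df: pd.DataFrame, sparse_threshold: float = 0.5
) -> pd.DataFrame:
    """Hypothetical stand-in for _drop_sparse_counters (illustration only)."""
    # Share of present (non-NaN) observations for each counter
    present_share = df["count"].notna().groupby(df["counter"]).mean()

    # assumption: keep counters whose present share reaches the threshold
    keep = present_share[present_share >= sparse_threshold].index
    return df[df["counter"].isin(keep)].reset_index(drop=True)
```

With the default threshold of 0.5, a counter with three present values out of four is kept and a counter with one out of four is removed.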
- mobts.utils.formatting._format_datetime(df: DataFrame, cols: ColumnsConfig = ColumnsConfig(counter='counter', timestamp='timestamp', count='count', weekday='weekday', week_num='week_num', how='how', hour='hour', date='date')) → DataFrame
Transform the timestamp column to datetime type
df: Dataset
cols: Column config
DataFrame with the timestamp column converted to datetime dtype
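In pandas terms the conversion can be sketched with `pd.to_datetime`. The function name is hypothetical, and whether the real function passes a format string or handles time zones is not stated here, so the sketch relies on pandas' default parsing.

```python
import pandas as pd


def format_datetime_sketch(
    df: pd.DataFrame, timestamp_col: str = "timestamp"
) -> pd.DataFrame:
    """Hypothetical stand-in for _format_datetime (illustration only)."""
    out = df.copy()  # leave the caller's DataFrame untouched
    out[timestamp_col] = pd.to_datetime(out[timestamp_col])
    return out
```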
- mobts.utils.formatting._standardize_input(df: DataFrame, counter_col: str, timestamp_col: str, count_col: str, out_cols: ColumnsConfig = ColumnsConfig(counter='counter', timestamp='timestamp', count='count', weekday='weekday', week_num='week_num', how='how', hour='hour', date='date')) → DataFrame
Standardize raw input to canonical column names
df: Dataset
counter_col: Column for counter
timestamp_col: Column for timestamp
count_col: Column for counts
out_cols: Column config
DataFrame with standardized canonical column names
This function takes its inputs from outside the package at run time: the user must supply the three column names.
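The renaming can be sketched as a column mapping onto the canonical names `counter`, `timestamp`, and `count` from the default `ColumnsConfig`. The function name and the raw column names in the usage lines (`station`, `time`, `n`) are hypothetical, and whether the real function also drops other columns is not stated here.

```python
import pandas as pd


def standardize_input_sketch(
    df: pd.DataFrame, counter_col: str, timestamp_col: str, count_col: str
) -> pd.DataFrame:
    """Hypothetical stand-in for _standardize_input (illustration only)."""
    # Map the user-supplied names onto the canonical names
    mapping = {
        counter_col: "counter",
        timestamp_col: "timestamp",
        count_col: "count",
    }
    return df.rename(columns=mapping)


# Usage with made-up raw column names
raw = pd.DataFrame({"station": ["A"], "time": ["2024-01-01"], "n": [7]})
std = standardize_input_sketch(
    raw, counter_col="station", timestamp_col="time", count_col="n"
)
```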