|
| Events () |
|
void | insert_row (pChar p_e, pChar p_d, double w) |
| Process a row from a transaction file.
|
|
bool | define_event (pChar p_e, pChar p_d, double w, uint64_t code) |
| Define events explicitly.
|
|
String | optimize_events (Clips &clips, TargetMap &targets, int num_steps=10, int codes_per_step=5, double threshold=0.0001, pCodeSet p_force_include=nullptr, pCodeSet p_force_exclude=nullptr, Transform x_form=tr_linear, Aggregate agg=ag_longest, double p=0.5, int depth=1000, bool as_states=true, double exp_decay=0.00693, double lower_bound_p=0.95, bool log_lift=true) |
| Events optimizer.
|
|
bool | score_model (double &score, double &targ_prop, CodeInTreeStatMap &codes_stat, bool calc_tree_stats, Clips &clips, TargetMap &targets, EventCodeMap code_dict, Transform x_form, Aggregate agg, double p, int depth, bool as_states) |
| Internal: Do one step of the optimize_events() method.
|
|
CodeScores | get_top_codes (CodeInTreeStatMap &codes_stat, double targ_prop, double exp_decay, double lower_bound_p, bool log_lift) |
| Internal: Extract the top top_n codes by lift from a CodeInTreeStatMap map.
|
|
double | linear_correlation (OptimizeEval &ev) |
| Compute the Pearson linear correlation between predicted and observed values in an OptimizeEval.
|
|
bool | load (pBinaryImage &p_bi) |
| Load the state of an object from a base64 mercury-dynamics serialization using image_get().
|
|
bool | load (pBinaryImage &p_bi, int &c_block, int &c_ofs) |
| Load the state of an object from a base64 mercury-dynamics serialization using image_get().
|
|
bool | save (pBinaryImage &p_bi) |
| Save the state of an object into a base64 mercury-dynamics serialization using image_put().
|
|
void | set_max_num_events (int max_events) |
| Sets the public property max_num_events to simplify the Python interface.
|
|
void | set_store_strings (bool store) |
| Sets the public property store_strings to simplify the Python interface.
|
|
ElementHash | add_str (pChar p_str) |
| Define a new string and push it into the StringUsageMap.
|
|
void | erase_str (ElementHash hash) |
| Remove a string from the StringUsageMap by decreasing its use count and destroying it when it is no longer used.
|
|
String | get_str (ElementHash hash) |
| Get the content of a string from its hash value.
|
|
uint64_t | event_code (BinEventPt &ept) |
| Return the code associated with a BinEventPt if found in the object.
|
|
int | num_events () |
| Return the number of events stored in the object.
|
|
EventMap::iterator | events_begin () |
| Return the EventMap::iterator to the first element in the private variable .events.
|
|
EventMap::iterator | events_end () |
| Return the EventMap::iterator to past-the-end in the private variable .events.
|
|
EventMap::iterator | events_next_after_find (BinEventPt &ept) |
| Return the EventMap::iterator to the next BinEventPt after the one matching ept, or nullptr if it is not found or is the last one.
|
|
A container class to hold events.
This class has two modes. In both, it is constructed empty, and the public properties store_strings, time_format and max_num_events can be set after construction.
1. The object identifies recurring events from a sequence of transactions passed to it using insert_row().
2. The object is given the events as a series of define_event() calls.
To simplify the Python interface, the object also provides set_max_num_events() and set_store_strings() as methods.
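The two population modes can be illustrated with a small stand-in class. This is a sketch only: ToyEvents is hypothetical and is not the reels::Events implementation. It assumes that p_e, p_d and w are an emitter, a description and a weight (a reading based on the parameter names), that discovered events receive consecutive codes, and that define_event() rejects duplicates. None of these behaviours is confirmed by this page.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in, NOT the real reels::Events class. It only mirrors the
// two population modes so that the call pattern can be compiled and tried.
class ToyEvents {
public:
    int max_num_events = 1000;  // assumed default; mirrors the public property name

    // Mode 1: feed raw transaction rows. Each distinct (p_e, p_d) pair gets the
    // next free code (an assumed discovery rule, for illustration only).
    void insert_row(const char *p_e, const char *p_d, double w) {
        assert(p_e != nullptr && p_d != nullptr);
        (void)w;  // the weight is accepted but unused in this toy
        auto key = std::make_pair(std::string(p_e), std::string(p_d));
        if (codes_.count(key) == 0 && num_events() < max_num_events)
            codes_[key] = next_code_++;
    }

    // Mode 2: the caller supplies the code explicitly. Returns false when the
    // event already exists or the cap is reached (assumed behaviour).
    bool define_event(const char *p_e, const char *p_d, double w, std::uint64_t code) {
        assert(p_e != nullptr && p_d != nullptr);
        (void)w;
        auto key = std::make_pair(std::string(p_e), std::string(p_d));
        if (codes_.count(key) != 0 || num_events() >= max_num_events)
            return false;
        codes_[key] = code;
        return true;
    }

    // Setter mirroring the method that exists to simplify the Python interface.
    void set_max_num_events(int max_events) { max_num_events = max_events; }

    int num_events() const { return static_cast<int>(codes_.size()); }

private:
    std::map<std::pair<std::string, std::string>, std::uint64_t> codes_;
    std::uint64_t next_code_ = 1;
};
```

In this toy, an object is used in one mode or the other: either it is fed rows with insert_row() and discovers its own codes, or every event is declared up front with define_event().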
String reels::Events::optimize_events (
    Clips & clips,
    TargetMap & targets,
    int num_steps = 10,
    int codes_per_step = 5,
    double threshold = 0.0001,
    pCodeSet p_force_include = nullptr,
    pCodeSet p_force_exclude = nullptr,
    Transform x_form = tr_linear,
    Aggregate agg = ag_longest,
    double p = 0.5,
    int depth = 1000,
    bool as_states = true,
    double exp_decay = 0.00693,
    double lower_bound_p = 0.95,
    bool log_lift = true
)
Events optimizer.
Optimizes the events to maximize the prediction signal (F1 score over the same number of positives). It maps code values many-to-one, trying to group event codes into categories that represent similar events.
Before starting, a non-optimized Events object must be populated with the initial set of codes that we want to reduce by assigning new many-to-one codes to them.
The algorithm first removes, completely, all codes not found in the clips object.
At the beginning of each step, the algorithm builds a list of the most promising codes (those not already used) by full tree search. Working down that list, each code is tried as {noise, new_code, last_code}, for up to codes_per_step codes, and is assigned a new code when the score improves by more than threshold. The assigned codes become part of the internal EventCodeMap and replace their old values in the next step.
When the algorithm finishes, the internal EventCodeMap is used to rename the object codes and the whole process is reported.
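One possible reading of a single step can be sketched as below. This is not the library's code: greedy_step, CodeMap, ScoreFn and kNoise are hypothetical names, the scorer is a caller-supplied callable standing in for fitting the internal Targets model, and the candidate list is assumed to be already sorted by priority.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <vector>

using CodeMap = std::map<std::uint64_t, std::uint64_t>;  // old code -> new code
using ScoreFn = std::function<double(const CodeMap &)>;  // hypothetical scorer

constexpr std::uint64_t kNoise = 0;  // assumed value of the "noise" code

// One greedy step: walk down the candidate list and, for each unused code, try
// {noise, new_code, last_code}. Keep the best option only if it beats the
// current score by more than threshold. Returns the score after the step.
double greedy_step(const std::vector<std::uint64_t> &candidates, int codes_per_step,
                   double threshold, const ScoreFn &score, CodeMap &mapping,
                   std::uint64_t &last_code) {
    assert(codes_per_step >= 0 && threshold >= 0.0);
    double best = score(mapping);
    int tried = 0;
    for (std::uint64_t code : candidates) {
        if (mapping.count(code) != 0) continue;  // already used
        if (tried >= codes_per_step) break;
        ++tried;

        const std::uint64_t options[3] = {kNoise, last_code + 1, last_code};
        double top = best;
        int pick = -1;
        for (int i = 0; i < 3; ++i) {
            mapping[code] = options[i];
            const double s = score(mapping);
            if (s > top) { top = s; pick = i; }
        }
        if (pick < 0 || top - best <= threshold) {
            mapping.erase(code);  // no option improved enough: leave unassigned
            continue;
        }
        mapping[code] = options[pick];
        if (pick == 1) ++last_code;  // a brand-new code was opened
        best = top;
    }
    return best;
}
```

Calling such a function up to num_steps times, rebuilding the candidate list before each call and stopping early when the list is empty, would give the outer loop described above.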
- Parameters
-
clips | A clips object with the same codes and clips for a set of clients whose prediction we optimize. |
targets | The target events in a TargetMap object in the same format expected by a Targets object. (Internally a Targets object will be used to make the predictions we want to optimize.) |
num_steps | The number of steps to iterate. The method will stop early if no codes are found at a step. |
codes_per_step | The number of codes to be tried from the top of the priority list at each step. |
threshold | A minimum threshold, below which a score change is not considered improvement. |
p_force_include | An optional pointer to a set of codes that must be included before starting. |
p_force_exclude | An optional pointer to a set of codes that will be excluded and set to the base code. |
x_form | The x_form argument to fit the internal Targets object prediction model. |
agg | The agg argument to fit the internal Targets object prediction model. |
p | The p argument to fit the internal Targets object prediction model. |
depth | The depth argument to fit the internal Targets object prediction model. |
as_states | The as_states argument to fit the internal Targets object prediction model. |
exp_decay | Exponential Decay Factor applied to the internal score in terms of depth. That score selects what codes enter the model. The decay is applied to the average tree depth. 0 is no decay, default value = 0.00693 decays to 0.5 in 100 steps. |
lower_bound_p | Another p for lower bound, but applied to the scoring process rather than the model. |
log_lift | A boolean to set if lift (= LB(included)/LB(after inclusion)) is log() transformed or not. |
- Returns
- A newline-separated (\n) report string that contains either "ERROR" or "success" as the first line.
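The default value of exp_decay can be checked numerically. The sketch below assumes the decay has the form exp(-exp_decay * depth), which matches the statement that 0.00693 (approximately ln(2)/100) decays to 0.5 in 100 steps. The function name decay_weight is hypothetical, and how the library combines this factor with the score is not shown on this page.

```cpp
#include <cassert>
#include <cmath>

// Assumed form of the decay applied to the average tree depth:
//   weight = exp(-exp_decay * avg_depth)
// exp_decay = 0 gives a weight of 1 (no decay).
double decay_weight(double exp_decay, double avg_depth) {
    assert(exp_decay >= 0.0 && avg_depth >= 0.0);
    return std::exp(-exp_decay * avg_depth);
}
```

Under that assumption, decay_weight(0.00693, 100.0) is close to 0.5 and decay_weight(0.00693, 200.0) is close to 0.25, so the weight halves every 100 steps of average depth.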