Reels

A container class to hold events.
#include <reels.h>
Public Member Functions  
void  insert_row (pChar p_e, pChar p_d, double w) 
Process a row from a transaction file.  
bool  define_event (pChar p_e, pChar p_d, double w, uint64_t code) 
Define events explicitly.  
String  optimize_events (Clips &clips, TargetMap &targets, int num_steps=10, int codes_per_step=5, double threshold=0.0001, pCodeSet p_force_include=nullptr, pCodeSet p_force_exclude=nullptr, Transform x_form=tr_linear, Aggregate agg=ag_longest, double p=0.5, int depth=1000, bool as_states=true, double exponential_decay=0.00693, double lower_bound_p=0.95, bool log_lift=true) 
Events optimizer.  
bool  score_model (double &score, double &targ_prop, CodeInTreeStatMap &codes_stat, bool calc_tree_stats, Clips &clips, TargetMap &targets, EventCodeMap code_dict, Transform x_form, Aggregate agg, double p, int depth, bool as_states) 
Internal: Do one step of the optimize_events() method.  
CodeScores  get_top_codes (CodeInTreeStatMap &codes_stat, double targ_prop, double exponential_decay, double lower_bound_p, bool log_lift) 
Internal: Extract the top top_n codes by lift from a CodeInTreeStatMap map.  
double  linear_correlation (OptimizeEval &ev) 
Compute Pearson linear correlation between predicted and observed in an OptimizeEval.  
bool  load (pBinaryImage &p_bi) 
Load the state of an object from a base64 mercurydynamics serialization using image_get()  
bool  load (pBinaryImage &p_bi, int &c_block, int &c_ofs) 
Load the state of an object from a base64 mercurydynamics serialization using image_get()  
bool  save (pBinaryImage &p_bi) 
Save the state of an object into a base64 mercurydynamics serialization using image_put()  
void  set_max_num_events (int max_events) 
Sets the public property max_num_events to simplify the Python interface.  
void  set_store_strings (bool store) 
Sets the public property store_strings to simplify the Python interface.  
ElementHash  add_str (pChar p_str) 
Define a new string and push it into the StringUsageMap.  
void  erase_str (ElementHash hash) 
Remove a string from the StringUsageMap by decreasing its use count and destroying it if not used anymore.  
String  get_str (ElementHash hash) 
Get a string content from its hash value.  
uint64_t  event_code (BinEventPt &ept) 
Return the code associated with a BinEventPt if found in the object.  
int  num_events () 
Return the number of events stored in the object.  
EventMap::iterator  events_begin () 
Return the EventMap::iterator to the first element in the private variable .events.  
EventMap::iterator  events_end () 
Return the EventMap::iterator to past-the-end in the private variable .events.  
EventMap::iterator  events_next_after_find (BinEventPt &ept) 
Return the EventMap::iterator to the next BinEventPt after the one matching ept, or nullptr if ept is not found or is the last one.  
Public Attributes  
bool  store_strings = true 
If true, the object stores the string values.  
int  max_num_events = DEFAULT_NUM_EVENTS 
The maximum number of recurrent events stored via insert_row()  
A container class to hold events.
This class has two different modes. In both cases, it is constructed empty, and the public properties store_strings, time_format and max_num_events can be set after construction.
1. The object identifies recurring events from a sequence of transactions passed to the object using insert_row().
2. The object is given the events as a series of define_event() calls.
To simplify the Python interface, the object has set_max_num_events() and set_store_strings() as methods.
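The two modes can be pictured with a small stand-in class. This is only a sketch under stated assumptions, not the real reels::Events: `ToyEvents`, its `Key` tuple, the sequential code assignment, the way it refuses to mix modes and the value 1000 for `max_num_events` are all hypothetical. In particular, the real insert_row() identifies recurring events, whereas this toy simply keeps every distinct row until the cap is reached.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>

// Hypothetical stand-in, NOT the real reels::Events. It only mimics the two
// ways of populating the container: discovery and explicit definition.
class ToyEvents {
public:
    // Assumption: an event is identified by (emitter, description, weight).
    using Key = std::tuple<std::string, std::string, double>;

    // Discovery mode: each distinct row gets the next free code, up to the cap.
    void insert_row(const char *p_e, const char *p_d, double w) {
        assert(p_e != nullptr && p_d != nullptr);
        if (mode_ == Mode::defined) return;  // toy policy: refuse to mix modes
        mode_ = Mode::discovered;
        Key key(p_e, p_d, w);
        if (events_.count(key) == 0 && num_events() < max_num_events)
            events_.emplace(key, next_code_++);
    }

    // Explicit mode: the caller supplies the code for each event.
    bool define_event(const char *p_e, const char *p_d, double w, std::uint64_t code) {
        assert(p_e != nullptr && p_d != nullptr);
        if (mode_ == Mode::discovered) return false;  // toy policy: refuse to mix modes
        mode_ = Mode::defined;
        return events_.emplace(Key(p_e, p_d, w), code).second;
    }

    int num_events() const { return static_cast<int>(events_.size()); }

    int max_num_events = 1000;  // assumed value; the real default is DEFAULT_NUM_EVENTS

private:
    enum class Mode { empty, discovered, defined };
    Mode mode_ = Mode::empty;
    std::uint64_t next_code_ = 1;
    std::map<Key, std::uint64_t> events_;
};
```

With this toy, an object populated through insert_row() rejects a later define_event() call, and the other way around, which is one possible way to enforce that the two modes are not mixed.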

ElementHash reels::Events::add_str  (  pChar  p_str  )  inline
Define a new string and push it into the StringUsageMap.
p_str  The string to be added. 
bool reels::Events::define_event  (  pChar  p_e,  pChar  p_d,  double  w,  uint64_t  code  ) 
Define events explicitly.
p_e  The "emitter". A C/Python string representing "owner of event". 
p_d  The "description". A C/Python string representing "the event". 
w  The "weight". A double representing a weight of the event. 
code  A unique code number identifying the event. 
Caveat: insert_row() and define_event() should not be mixed. The former is for event discovery and the latter for explicit definition. A set of events is built either one way or the other.

void reels::Events::erase_str  (  ElementHash  hash  )  inline
Remove a string from the StringUsageMap by decreasing its use count and destroying it if not used anymore.
hash  The hash of the string, hash(key), as returned by add_str(). 
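The use-count behaviour described for add_str(), erase_str() and get_str() can be sketched as a reference-counted string table. This is a minimal illustration, not the library's code: `ToyStringTable`, `ToyStringUsage`, `ToyHash` and the choice of `std::hash` are assumptions, and hash collisions are ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for ElementHash and the StringUsageMap entries.
using ToyHash = std::uint64_t;

struct ToyStringUsage {
    std::string str;  // the stored string
    int seen = 0;     // how many times it is currently in use
};

class ToyStringTable {
public:
    // Store the string (or bump its use count) and return its hash.
    ToyHash add_str(const char *p_str) {
        assert(p_str != nullptr);
        ToyHash hash = std::hash<std::string>{}(p_str);  // collisions ignored in this sketch
        ToyStringUsage &entry = table_[hash];
        if (entry.seen == 0) entry.str = p_str;
        ++entry.seen;
        return hash;
    }

    // Decrease the use count and destroy the entry when it reaches zero.
    void erase_str(ToyHash hash) {
        auto it = table_.find(hash);
        if (it == table_.end()) return;
        if (--it->second.seen == 0) table_.erase(it);
    }

    // Return the string for a hash, or an empty string if it is unknown.
    std::string get_str(ToyHash hash) const {
        auto it = table_.find(hash);
        return it == table_.end() ? std::string() : it->second.str;
    }

private:
    std::unordered_map<ToyHash, ToyStringUsage> table_;
};
```

Adding the same string twice yields the same hash, and the string only disappears after it has been erased as many times as it was added.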

uint64_t reels::Events::event_code  (  BinEventPt &  ept  )  inline
Return the code associated with a BinEventPt if found in the object.
ept  The BinEventPt searched. 

EventMap::iterator reels::Events::events_begin  (  )  inline
Return the EventMap::iterator to the first element in the private variable .events.

EventMap::iterator reels::Events::events_end  (  )  inline
Return the EventMap::iterator to past-the-end in the private variable .events.

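EventMap::iterator reels::Events::events_next_after_find  (  BinEventPt &  ept  )  inline
Return the EventMap::iterator to the next BinEventPt after the one matching ept, or nullptr if ept is not found or is the last one.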
ept  The BinEventPt searched. 
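The find-then-advance pattern behind this method can be sketched on a plain `std::map`. The names `ToyEventMap` and `next_after_find` are hypothetical, string keys stand in for BinEventPt, and this sketch assumes that the "not found or last" case maps to the past-the-end iterator, which is the usual convention for standard containers.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for EventMap: string keys instead of BinEventPt.
using ToyEventMap = std::map<std::string, int>;

// Return the iterator following `key`, or end() if `key` is missing or last.
inline ToyEventMap::iterator next_after_find(ToyEventMap &events, const std::string &key) {
    ToyEventMap::iterator it = events.find(key);
    if (it == events.end()) return events.end();  // not found
    return ++it;                                  // equals end() when key was last
}
```

A caller would compare the result against `events.end()` before dereferencing it.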

String reels::Events::get_str  (  ElementHash  hash  )  inline
Get a string content from its hash value.
hash  The hash of the string, hash(key), as returned by add_str(). 
CodeScores reels::Events::get_top_codes  (  CodeInTreeStatMap &  codes_stat, 
double  targ_prop,  
double  exponential_decay,  
double  lower_bound_p,  
bool  log_lift  
) 
Internal: Extract the top top_n codes by lift from a CodeInTreeStatMap map.
codes_stat  A complete CodeInTreeStatMap computed by score_model(). 
targ_prop  The targets/seen proportion at the tree root. 
exponential_decay  Exponential Decay Factor applied to the internal score in terms of depth. That score selects what codes enter the model. The decay is applied to the average tree depth. 0 is no decay, default value = 0.00693 decays to 0.5 in 100 steps. 
lower_bound_p  Another p for lower bound, but applied to the scoring process rather than the model. 
log_lift  A boolean to set if lift (= LB(included)/LB(after inclusion)) is log() transformed or not. 
void reels::Events::insert_row  (  pChar  p_e,  pChar  p_d,  double  w  ) 
Process a row from a transaction file.
p_e  The "emitter". A C/Python string representing "owner of event". 
p_d  The "description". A C/Python string representing "the event". 
w  The "weight". A double representing a weight of the event. 
Caveat: insert_row() and define_event() should not be mixed. The former is for event discovery and the latter for explicit definition. A set of events is built either one way or the other.

double reels::Events::linear_correlation  (  OptimizeEval &  ev  )  inline
Compute Pearson linear correlation between predicted and observed in an OptimizeEval.
ev  The vector of OptimizeEvalItem containing t_obs (observed) and t_hat (predicted) values. 
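For reference, the Pearson coefficient between observed and predicted values can be computed as below. This is a textbook sketch, not the library's implementation: `ToyEvalItem`, `ToyEval` and `pearson` are hypothetical names, the fields are assumed to be doubles, and returning 0 for degenerate input (fewer than two items or zero variance) is a choice made here.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-ins for OptimizeEvalItem / OptimizeEval.
struct ToyEvalItem {
    double t_obs;  // observed value
    double t_hat;  // predicted value
};
using ToyEval = std::vector<ToyEvalItem>;

// Pearson r = cov(obs, hat) / sqrt(var(obs) * var(hat)), from running sums.
inline double pearson(const ToyEval &ev) {
    if (ev.size() < 2) return 0.0;
    const double n = static_cast<double>(ev.size());
    double s_x = 0, s_y = 0, s_xx = 0, s_yy = 0, s_xy = 0;
    for (const ToyEvalItem &item : ev) {
        s_x  += item.t_obs;
        s_y  += item.t_hat;
        s_xx += item.t_obs * item.t_obs;
        s_yy += item.t_hat * item.t_hat;
        s_xy += item.t_obs * item.t_hat;
    }
    const double cov   = s_xy - s_x * s_y / n;
    const double var_x = s_xx - s_x * s_x / n;
    const double var_y = s_yy - s_y * s_y / n;
    if (var_x <= 0 || var_y <= 0) return 0.0;  // a constant series has no defined correlation
    return cov / std::sqrt(var_x * var_y);
}
```

Predictions that are an increasing linear function of the observations give a value of 1, and a decreasing linear relation gives -1.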
bool reels::Events::load  (  pBinaryImage &  p_bi  ) 
Load the state of an object from a base64 mercurydynamics serialization using image_get()
p_bi  The address of a BinaryImage stream containing a previously save()ed image at the cursor position. 
bool reels::Events::load  (  pBinaryImage &  p_bi, 
int &  c_block,  
int &  c_ofs  
) 
Load the state of an object from a base64 mercurydynamics serialization using image_get()
p_bi  The address of a BinaryImage stream containing a previously save()ed image at the cursor position. 
c_block  The current reading cursor (block number) required only for nested use of load(). 
c_ofs  The current reading cursor (offset in block) required only for nested use of load(). 

int reels::Events::num_events  (  )  inline
Return the number of events stored in the object.
String reels::Events::optimize_events  (  Clips &  clips, 
TargetMap &  targets,  
int  num_steps = 10 , 

int  codes_per_step = 5 , 

double  threshold = 0.0001 , 

pCodeSet  p_force_include = nullptr , 

pCodeSet  p_force_exclude = nullptr , 

Transform  x_form = tr_linear , 

Aggregate  agg = ag_longest , 

double  p = 0.5 , 

int  depth = 1000 , 

bool  as_states = true , 

double  exponential_decay = 0.00693 , 

double  lower_bound_p = 0.95 , 

bool  log_lift = true 

) 
Events optimizer.
Optimizes the events to maximize the prediction signal (F1 score over the same number of positives). It converts code values many-to-one, trying to group event codes into categories that represent similar events.
Before starting, a non-optimized Events object must be populated with an initial set of codes we want to reduce by assigning new many-to-one codes to them.
The algorithm initially removes all codes not found in the clips object. This completely removes them.
The algorithm builds a list of the most promising (not already used) codes at the beginning of each step by a full tree search. From that list, each code is tried, from the top downwards, as {noise, new_code, last_code} for a score improvement above threshold, for up to codes_per_step codes, and is assigned a new code accordingly. The codes assigned become part of the internal EventCodeMap and in the next step they will replace their old values.
When the algorithm finishes, the internal EventCodeMap is used to rename the object codes and the whole process is reported.
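The many-to-one renaming that an EventCodeMap expresses can be sketched as a dictionary lookup applied to a sequence of codes. Everything named here is hypothetical: `ToyCodeMap`, `ToyClip` (a plain vector of codes, which is probably simpler than the real clip type) and `recode`, as is the choice to leave unmapped codes unchanged.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-ins: original code -> new (grouped) code, and a clip as a code sequence.
using ToyCodeMap = std::map<std::uint64_t, std::uint64_t>;
using ToyClip    = std::vector<std::uint64_t>;

// Apply a many-to-one code dictionary to a copy of the clip.
inline ToyClip recode(const ToyClip &clip, const ToyCodeMap &code_dict) {
    ToyClip out;
    out.reserve(clip.size());
    for (std::uint64_t code : clip) {
        auto it = code_dict.find(code);
        // Codes without an entry are kept as they are in this sketch.
        out.push_back(it == code_dict.end() ? code : it->second);
    }
    return out;
}
```

Mapping several original codes to the same new code is what groups similar events into one category before the model is fitted again.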
clips  A clips object with the same codes and clips for a set of clients whose prediction we optimize. 
targets  The target events in a TargetMap object in the same format expected by a Targets object. (Internally a Targets object will be used to make the predictions we want to optimize.) 
num_steps  The number of steps to iterate. The method will stop early if no codes are found at a step. 
codes_per_step  The number of codes to be tried from the top of the priority list at each step. 
threshold  A minimum threshold, below which a score change is not considered improvement. 
p_force_include  An optional pointer to a set of codes that must be included before starting. 
p_force_exclude  An optional pointer to a set of codes that will be excluded and set to the base code. 
x_form  The x_form argument to fit the internal Targets object prediction model. 
agg  The agg argument to fit the internal Targets object prediction model. 
p  The p argument to fit the internal Targets object prediction model. 
depth  The depth argument to fit the internal Targets object prediction model. 
as_states  The as_states argument to fit the internal Targets object prediction model. 
exponential_decay  Exponential Decay Factor applied to the internal score in terms of depth. That score selects what codes enter the model. The decay is applied to the average tree depth. 0 is no decay, default value = 0.00693 decays to 0.5 in 100 steps. 
lower_bound_p  Another p for lower bound, but applied to the scoring process rather than the model. 
log_lift  A boolean to set if lift (= LB(included)/LB(after inclusion)) is log() transformed or not. 
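The statement that the default exponential_decay of 0.00693 "decays to 0.5 in 100 steps" is consistent with a multiplier of the form exp(-exponential_decay * depth), since 0.00693 is approximately ln(2)/100. The function below is only a sketch under that assumption. `decay_multiplier` is a hypothetical name, and how the library actually combines the factor with the score is not stated here.

```cpp
#include <cassert>
#include <cmath>

// Assumed form of the decay: multiplier = exp(-exponential_decay * depth).
// With exponential_decay = 0 the multiplier is always 1 (no decay).
inline double decay_multiplier(double exponential_decay, double depth) {
    assert(exponential_decay >= 0.0 && depth >= 0.0);
    return std::exp(-exponential_decay * depth);
}
```

Under this assumption, `decay_multiplier(0.00693, 100)` is roughly 0.5 and `decay_multiplier(0.00693, 200)` is roughly 0.25.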
bool reels::Events::save  (  pBinaryImage &  p_bi  ) 
Save the state of an object into a base64 mercurydynamics serialization using image_put()
p_bi  The address of a BinaryImage stream that is either empty or has been used only for writing. 
bool reels::Events::score_model  (  double &  score, 
double &  targ_prop,  
CodeInTreeStatMap &  codes_stat,  
bool  calc_tree_stats,  
Clips &  clips,  
TargetMap &  targets,  
EventCodeMap  code_dict,  
Transform  x_form,  
Aggregate  agg,  
double  p,  
int  depth,  
bool  as_states  
) 
Internal: Do one step of the optimize_events() method.
score  Returns score by reference. 
targ_prop  Returns the targets/seen proportion at the tree root by reference (used by get_top_codes). 
codes_stat  Returns a complete CodeInTreeStatMap if calc_tree_stats is true. 
calc_tree_stats  Complete a tree search (if true) or just evaluate the score if not. 
clips  A clips object with the same codes and clips for a set of clients whose prediction we optimize. 
targets  The target events in a TargetMap object in the same format expected by a Targets object. (Internally a Targets object will be used to make the predictions we want to optimize.) 
code_dict  A dictionary of code transformations to be applied to a copy of the clips before fitting. 
x_form  The x_form argument to fit the internal Targets object prediction model. 
agg  The agg argument to fit the internal Targets object prediction model. 
p  The p argument to fit the internal Targets object prediction model. 
depth  The depth argument to fit the internal Targets object prediction model. 
as_states  The as_states argument to fit the internal Targets object prediction model. 

void reels::Events::set_max_num_events  (  int  max_events  )  inline
Sets the public property max_num_events to simplify the Python interface.
max_events  The value to apply to max_num_events. 

void reels::Events::set_store_strings  (  bool  store  )  inline
Sets the public property store_strings to simplify the Python interface.
store  True for storing the string contents. 