Reels
A container class to hold target events and do predictions based on clips.
#include <reels.h>
Public Member Functions
Targets (pClipMap p_clips, TargetMap target)
Construct a Targets object from a Clips object and a TargetMap.
bool | insert_target (pChar p_c, pChar p_t)
Utility to fill the internal TargetMap target.
bool | fit (Transform x_form, Aggregate agg, double p, int depth, bool as_states)
Fit the prediction model.
TimesToTarget | predict ()
Predict time to target for all the clients in the Clips object used to fit the model.
TimesToTarget | predict (Clients clients)
Predict time to target for all the clients in a given Clients object whose clips have been used to fit the model.
TimesToTarget | predict (pClipMap p_clips)
Predict time to target for a set of clients whose clips are given in a ClipMap.
void | verbose_predict_clip (const ElementHash &client, Clip &clip, TimePoint &obs_time, bool &target_yn, int &longest_seq, uint64_t &n_visits, uint64_t &n_targets, double &targ_mean_t)
Predict time for a single Clip, returning all kinds of prediction-related information.
bool | load (pBinaryImage &p_bi)
Load the state of an object from a base64 mercury-dynamics serialization using image_get().
bool | save (pBinaryImage &p_bi)
Save the state of an object into a base64 mercury-dynamics serialization using image_put().
int | update_node (int idx_parent, uint64_t code, bool target, ExtFloat time_d)
Update (fit) the CodeTree inserting new nodes as necessary.
double | normal_pdf (double x)
Density (pdf) for the normal distribution with mean 0 and standard deviation 1.
double | normal_cdf (double x)
Cumulative distribution (cdf) for the normal distribution with mean 0 and standard deviation 1.
double | agresti_coull_upper_bound (uint64_t n_hits, uint64_t n_total)
Upper bound of the Agresti-Coull confidence interval for a binomial proportion.
double | agresti_coull_lower_bound (uint64_t n_hits, uint64_t n_total)
Lower bound of the Agresti-Coull confidence interval for a binomial proportion.
double | predict_time (CodeTreeNode &node)
Predict the time to target for a sub-clip that starts at a node.
double | predict_clip (Clip clip)
Predict the time to target for a clip.
bool | recurse_tree_stats (int depth, int idx, int parent_idx, uint64_t code, CodeInTreeStatMap &codes_stat)
Recursive tree exploration updating a CodeInTreeStatMap map.
int | num_targets ()
Return the size of the internal TargetMap.
int | tree_size ()
Return the size of the internal CodeTree.
pClipMap | clip_map ()
The address of the internal ClipMap.
pCodeTree | p_tree ()
The address of the internal CodeTree.
pTargetMap | p_target ()
The address of the internal TargetMap.
Public Member Functions inherited from reels::TimeUtil
TimeUtil ()
TimePoint | get_time (pChar p_t)
Convert time as a string to a TimePoint (using the object's time_format).
void | set_time_format (pChar fmt)
Sets the public property time_format to simplify the python interface.
Additional Inherited Members
Public Attributes inherited from reels::TimeUtil
char | time_format [128] = "%Y-%m-%d %H:%M:%S"
Date and time format for insert_row() and define_event().
A container class to hold target events and do predictions based on clips.
double reels::Targets::agresti_coull_lower_bound (uint64_t n_hits, uint64_t n_total) [inline]
Lower bound of the Agresti-Coull confidence interval for a binomial proportion.
The confidence level is passed as an argument to the fit() method, which computes the "binomial_z.*" variables used in this function.
n_hits | The number of successes. |
n_total | The number of independent trials. |
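The link between a confidence level and a critical value z can be sketched as follows. This is not the library's code: `z_for_coverage` and `std_normal_cdf` are hypothetical names, and the sketch assumes that the confidence level is the coverage of a symmetric two-sided interval, so that z solves cdf(z) = (1 + p) / 2.

```cpp
#include <cassert>
#include <cmath>

// Standard normal cdf expressed through the complementary error function.
double std_normal_cdf(double x) {
    return 0.5 * std::erfc(-x / std::sqrt(2.0));
}

// Hypothetical helper: the z for which the interval [-z, z] has coverage p
// under a standard normal. Solved by bisection on the monotone cdf.
double z_for_coverage(double p) {
    assert(p > 0.0 && p < 1.0);
    const double target = 0.5 * (1.0 + p);
    double lo = 0.0, hi = 10.0;   // cdf(10) is 1 to double precision
    for (int i = 0; i < 100; ++i) {
        const double mid = 0.5 * (lo + hi);
        if (std_normal_cdf(mid) < target) lo = mid; else hi = mid;
    }
    return 0.5 * (lo + hi);
}
```

Under this assumption, a coverage of 0.95 yields the familiar z of about 1.96, and a coverage of 0.5 yields about 0.674.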
double reels::Targets::agresti_coull_upper_bound (uint64_t n_hits, uint64_t n_total) [inline]
Upper bound of the Agresti-Coull confidence interval for a binomial proportion.
The confidence level is passed as an argument to the fit() method, which computes the "binomial_z.*" variables used in this function.
n_hits | The number of successes. |
n_total | The number of independent trials. |
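For reference, the textbook Agresti-Coull interval can be sketched as below. The names `Interval` and `agresti_coull` are hypothetical, and the critical value z is passed explicitly here, whereas the class is documented to take it from variables computed by fit(); the library's internals may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

struct Interval { double lower; double upper; };

// Textbook Agresti-Coull interval for n_hits successes in n_total trials.
// z is the standard normal critical value for the chosen confidence level.
Interval agresti_coull(std::uint64_t n_hits, std::uint64_t n_total, double z) {
    assert(n_total > 0 && n_hits <= n_total && z >= 0.0);
    const double n_adj = static_cast<double>(n_total) + z * z;                 // adjusted trials
    const double p_adj = (static_cast<double>(n_hits) + 0.5 * z * z) / n_adj;  // adjusted proportion
    const double half  = z * std::sqrt(p_adj * (1.0 - p_adj) / n_adj);         // half-width
    return { std::max(0.0, p_adj - half), std::min(1.0, p_adj + half) };       // clamp to [0, 1]
}
```

With z = 0 both bounds collapse to the raw proportion n_hits / n_total; larger z widens the interval around the adjusted proportion.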
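pClipMap reels::Targets::clip_map () [inline]
The address of the internal ClipMap.
bool reels::Targets::fit (Transform x_form, Aggregate agg, double p, int depth, bool as_states)
Fit the prediction model.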
x_form | A possible transformation of the times. (Currently "log" or "linear".) |
agg | The mechanism used for the aggregation. (Currently "minimax", "mean" or "longest".) |
p | The width of the confidence interval for the binomial proportion used to calculate the lower bound. (E.g., p = 0.5 will estimate a lower bound of a symmetric CI with coverage of 0.5.) |
depth | The maximum depth of the tree (maximum sequence length learned). |
as_states | Treat events as states by removing repeated ones from the ClipMap keeping the time of the first instance only. When used, the ClipMap passed to the constructor by reference will be converted to states as a side effect. |
fit() can only be called once in the lifetime of a Targets object, and predict() cannot be called before fit().
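The effect of as_states can be illustrated with a small sketch. `Event`, `ClipSketch` and `collapse_to_states` are hypothetical stand-ins, not the library's types, and the sketch assumes that "repeated" means consecutive repetitions of the same event code within a time-ordered clip.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-ins: an event is (code, time); a clip is time-ordered.
using Event = std::pair<std::uint64_t, double>;
using ClipSketch = std::vector<Event>;

// Collapse runs of the same code into one entry, keeping the first time.
ClipSketch collapse_to_states(const ClipSketch &clip) {
    ClipSketch out;
    for (const Event &e : clip) {
        assert(out.empty() || out.back().second <= e.second);  // input must be time-ordered
        if (out.empty() || out.back().first != e.first) out.push_back(e);
    }
    return out;
}
```

The sketch returns a copy; the documented method instead converts the ClipMap passed to the constructor in place.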
bool reels::Targets::insert_target (pChar p_c, pChar p_t)
Utility to fill the internal TargetMap target.
The TargetMap can be initialized and given to the constructor, or an empty TargetMap can be given to the constructor and initialized by this method.
p_c | The "client". A C/Python string representing "the actor". |
p_t | The "time". A timestamp of the event as a C/Python string. (The format is given via set_time_format().) |
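As an illustration of how a timestamp string relates to a format such as the default "%Y-%m-%d %H:%M:%S", the sketch below parses one with the standard library. `parse_time` is a hypothetical helper; how the library itself parses the string is not stated here.

```cpp
#include <cassert>
#include <ctime>
#include <iomanip>
#include <sstream>
#include <string>

// Parse `text` according to `fmt`. Returns false (leaving `out` untouched)
// when the text does not match the format.
bool parse_time(const std::string &text, const char *fmt, std::tm &out) {
    assert(fmt != nullptr);
    std::tm parsed = {};
    std::istringstream in(text);
    in >> std::get_time(&parsed, fmt);
    if (in.fail()) return false;
    out = parsed;
    return true;
}
```

A string that does not match the format is rejected rather than silently producing a wrong time, which is the behavior a caller of insert_target() would want to check for.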
bool reels::Targets::load (pBinaryImage &p_bi)
Load the state of an object from a base64 mercury-dynamics serialization using image_get().
p_bi | The address of a BinaryImage stream containing a previously save()-ed image at the cursor position. |
double reels::Targets::normal_cdf (double x) [inline]
Cumulative distribution (cdf) for the normal distribution with mean 0 and standard deviation 1.
x | The quantile. |
double reels::Targets::normal_pdf (double x) [inline]
Density (pdf) for the normal distribution with mean 0 and standard deviation 1.
x | The quantile. |
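Both standard normal functions have closed forms that are easy to check numerically. The sketch below uses hypothetical names (`std_normal_pdf`, `std_normal_cdf`) and the standard formulas; it is not taken from the library source.

```cpp
#include <cmath>

// Density of N(0, 1): exp(-x^2 / 2) / sqrt(2 * pi).
double std_normal_pdf(double x) {
    const double inv_sqrt_2pi = 0.3989422804014327;  // 1 / sqrt(2 * pi)
    return inv_sqrt_2pi * std::exp(-0.5 * x * x);
}

// Cumulative distribution of N(0, 1) via the complementary error function,
// which stays accurate in the lower tail.
double std_normal_cdf(double x) {
    return 0.5 * std::erfc(-x / std::sqrt(2.0));
}
```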
int reels::Targets::num_targets () [inline]
Return the size of the internal TargetMap.
pTargetMap reels::Targets::p_target () [inline]
The address of the internal TargetMap.
pCodeTree reels::Targets::p_tree () [inline]
The address of the internal CodeTree.
TimesToTarget reels::Targets::predict ()
Predict time to target for all the clients in the Clips object used to fit the model.
TimesToTarget reels::Targets::predict (Clients clients)
Predict time to target for all the clients in a given Clients object whose clips have been used to fit the model.
If a client is not found, the result is the prediction for the zero-length clip (accumulated in the root node).
clients | A Clients object with a subset of the clients used to fit the model. |
predict() cannot be called before fit() and can be called any number of times in all overloaded forms after that.
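The fallback for unknown clients can be pictured with the sketch below. Everything in it is hypothetical (`CodeSeq`, `predict_clients`, and the `predict_clip` callable standing in for the fitted model); it only shows the documented rule that a client without clips is scored as a zero-length clip.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

using CodeSeq = std::vector<std::uint64_t>;  // hypothetical: a clip as a list of event codes

// Score each requested client; clients with no stored clip get the
// prediction of the empty (zero-length) clip.
std::vector<double> predict_clients(
        const std::unordered_map<std::string, CodeSeq> &fitted_clips,
        const std::vector<std::string> &clients,
        const std::function<double(const CodeSeq &)> &predict_clip) {
    assert(predict_clip);  // a model must be supplied
    const CodeSeq empty_clip;
    std::vector<double> out;
    out.reserve(clients.size());
    for (const std::string &c : clients) {
        auto it = fitted_clips.find(c);
        out.push_back(predict_clip(it != fitted_clips.end() ? it->second : empty_clip));
    }
    return out;
}
```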
TimesToTarget reels::Targets::predict (pClipMap p_clips)
Predict time to target for a set of clients whose clips are given in a ClipMap.
p_clips | A ClipMap of clients and clips to be used in prediction. |
predict() cannot be called before fit() and can be called any number of times in all overloaded forms after that.
double reels::Targets::predict_clip (Clip clip) [inline]
Predict the time to target for a clip.
clip | A clip containing a sequence of event codes. |
double reels::Targets::predict_time (CodeTreeNode &node) [inline]
Predict the time to target for a sub-clip that starts at a node.
This method assumes that events are uniformly distributed over time (as in a Poisson process), but weighted by evidence.
Basically, we assume that the observed maximum-likelihood time (mu_hat) is the mean (and therefore the midpoint) of a uniform distribution U(0, 2*mu_hat), and that the whole "event space", if estimated without bias, would be U(0, 2*mu_hat*n_seen/n_target).
Caveat 1. We don't want the estimate to be unbiased but biased by evidence: many estimates will be made, and "lucky paths" are a real thing, especially with a low number of visits. That is why we use agresti_coull_lower_bound() and not n_seen/n_target. Note that, as you reduce the confidence (the p argument of fit()), the result approaches n_seen/n_target.
Caveat 2. We intentionally bias towards underestimating the urgency. With much evidence, the estimate will be close to unbiased, but with little evidence, it will underrepresent the urgency (the proximity of the event happening). This is why minimax selects the most urgent (min) of the biased-up values (max).
Caveat 3. mu_hat is numerically more stable if the transformation is log (although the aggregation is computed in 128-bit floating point arithmetic, which should be stable either way), but in that case mu_hat will not be the mean time but its geometric mean (since adding logs is multiplying).
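The difference between the two transformations in Caveat 3 can be checked on a few numbers. The sketch uses hypothetical helpers and plain double arithmetic instead of the 128-bit accumulation mentioned above.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// "linear": the arithmetic mean of the raw times.
double mean_linear(const std::vector<double> &t) {
    assert(!t.empty());
    double sum = 0.0;
    for (double x : t) sum += x;
    return sum / static_cast<double>(t.size());
}

// "log": average the logs, then map back. Adding logs multiplies the
// values, so the result is the geometric mean.
double mean_log(const std::vector<double> &t) {
    assert(!t.empty());
    double sum = 0.0;
    for (double x : t) {
        assert(x > 0.0);  // the log is only defined for positive times
        sum += std::log(x);
    }
    return std::exp(sum / static_cast<double>(t.size()));
}
```

For the times 1, 10 and 100 the arithmetic mean is 37 while the log-space mean is 10, which shows how the log transformation damps the influence of long waits.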
node | The node in the tree defining the sub-clip. |
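One possible reading of the description above, offered as an assumption rather than as the library's formula: replace the raw proportion n_target/n_seen by its Agresti-Coull lower bound and take the mean of the resulting uniform distribution U(0, 2*mu_hat/lower_bound). The names `ac_lower` and `time_to_target_sketch` are hypothetical, and z is passed explicitly.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Textbook Agresti-Coull lower bound for n_hits successes in n_total trials.
double ac_lower(std::uint64_t n_hits, std::uint64_t n_total, double z) {
    assert(n_total > 0 && n_hits <= n_total && z >= 0.0);
    const double n_adj = static_cast<double>(n_total) + z * z;
    const double p_adj = (static_cast<double>(n_hits) + 0.5 * z * z) / n_adj;
    return std::max(0.0, p_adj - z * std::sqrt(p_adj * (1.0 - p_adj) / n_adj));
}

// Assumed estimate: the mean of U(0, 2 * mu_hat / lower_bound), i.e.
// mu_hat / lower_bound. A smaller bound (less evidence) means a longer,
// less urgent predicted time.
double time_to_target_sketch(double mu_hat, std::uint64_t n_target,
                             std::uint64_t n_seen, double z) {
    const double lb = ac_lower(n_target, n_seen, z);
    if (lb <= 0.0) return std::numeric_limits<double>::infinity();
    return mu_hat / lb;
}
```

With z = 0 the estimate reduces to mu_hat*n_seen/n_target; for a fixed hit ratio, more visits bring a positive-z estimate closer to that value.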
bool reels::Targets::recurse_tree_stats (int depth, int idx, int parent_idx, uint64_t code, CodeInTreeStatMap &codes_stat)
Recursive tree exploration updating a CodeInTreeStatMap map.
depth | The recursion depth |
idx | The index of the current node |
parent_idx | The index of the parent node |
code | The current code |
codes_stat | The CodeInTreeStatMap being updated |
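The general shape of such a traversal is sketched below. The node layout and the statistics collected (`NodeSketch`, `CodeStatSketch`, `walk_tree`) are hypothetical, since the fields of CodeInTreeStatMap are not described here, and the parent index is omitted for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

struct NodeSketch {
    std::map<std::uint64_t, int> child;  // code -> index of the child node
};

struct CodeStatSketch {
    int n_nodes = 0;    // number of tree nodes carrying this code
    int sum_depth = 0;  // sum of the depths at which the code appears
};

using CodeStats = std::map<std::uint64_t, CodeStatSketch>;

// Depth-first walk over a tree stored as a vector of nodes (index 0 = root).
// Returns false if an index falls outside the tree.
bool walk_tree(const std::vector<NodeSketch> &tree, int depth, int idx,
               std::uint64_t code, CodeStats &stats) {
    assert(depth >= 0);
    if (idx < 0 || idx >= static_cast<int>(tree.size())) return false;
    if (depth > 0) {  // the root carries no code
        CodeStatSketch &s = stats[code];
        s.n_nodes += 1;
        s.sum_depth += depth;
    }
    for (const auto &kv : tree[idx].child) {
        if (!walk_tree(tree, depth + 1, kv.second, kv.first, stats)) return false;
    }
    return true;
}
```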
bool reels::Targets::save (pBinaryImage &p_bi)
Save the state of an object into a base64 mercury-dynamics serialization using image_put().
p_bi | The address of a BinaryImage stream that is either empty or has been used only for writing. |
int reels::Targets::tree_size () [inline]
Return the size of the internal CodeTree.
int reels::Targets::update_node (int idx_parent, uint64_t code, bool target, ExtFloat time_d) [inline]
Update (fit) the CodeTree inserting new nodes as necessary.
idx_parent | The index of the parent node. For the first insertion, use the root (0). For subsequent insertions, use the value returned by a previous call. |
code | The code of the node to update; the node is the child of the parent node that has this code. |
target | Whether the target was matched in the clip. |
time_d | The time difference from the code to the target when there is a target; must be 0 otherwise. |
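The calling pattern described by the parameters (start at the root, then feed each returned index back as the next parent) can be sketched with a small index-based tree. `TreeNodeSketch` and `CodeTreeSketch` are hypothetical, the per-node counters are assumptions, and double replaces the ExtFloat type of the real signature.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

struct TreeNodeSketch {
    std::uint64_t n_seen = 0;            // visits to this node
    std::uint64_t n_target = 0;          // visits that ended in a target
    double sum_time_d = 0.0;             // accumulated time to target
    std::map<std::uint64_t, int> child;  // code -> index of the child node
};

struct CodeTreeSketch {
    std::vector<TreeNodeSketch> nodes = std::vector<TreeNodeSketch>(1);  // index 0 is the root

    // Find or create the child of idx_parent with this code, update its
    // counters, and return its index for use as the next parent.
    int update_node(int idx_parent, std::uint64_t code, bool target, double time_d) {
        assert(idx_parent >= 0 && idx_parent < static_cast<int>(nodes.size()));
        assert(target || time_d == 0.0);  // time_d must be 0 without a target
        int idx;
        auto it = nodes[idx_parent].child.find(code);
        if (it == nodes[idx_parent].child.end()) {
            idx = static_cast<int>(nodes.size());
            nodes.emplace_back();                 // may reallocate, so index again below
            nodes[idx_parent].child[code] = idx;
        } else {
            idx = it->second;
        }
        TreeNodeSketch &n = nodes[idx];
        n.n_seen += 1;
        if (target) {
            n.n_target += 1;
            n.sum_time_d += time_d;
        }
        return idx;
    }
};
```

Storing nodes in a vector and linking them by index keeps the tree trivially serializable, which may be why the documented interface works with indices rather than pointers.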
void reels::Targets::verbose_predict_clip (const ElementHash &client, Clip &clip, TimePoint &obs_time, bool &target_yn, int &longest_seq, uint64_t &n_visits, uint64_t &n_targets, double &targ_mean_t)
Predict time for a single Clip, returning all kinds of prediction-related information.
client | The client hash (needed to check whether the client fits the target). |
clip | The clip we want to predict. |
obs_time | A variable to store the time between the last event and the target when the target is hit. |
target_yn | A variable to store if the target was hit (true) or not (false). |
longest_seq | A variable to store the longest aligned matching sequence stored in the tree. |
n_visits | A variable to store the number of visits for the longest sequence. |
n_targets | A variable to store the number of target hits for the longest sequence. |
targ_mean_t | A variable to store the average observed time for hits in the longest sequence. |