When building a machine learning model, we split the given dataset into training and test data. We can either split randomly or use time-based splitting.
Time-based splitting requires a timestamp as one of the attributes (features).
For example, an e-commerce website collects reviews for various products, and each review can carry a timestamp. In such scenarios it is better to use a time-based strategy.
To do this, we first sort the reviews by timestamp and then split them.
- Sort the data by time, oldest first.
- Split: the oldest 80% for training and the most recent 20% for testing.
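This approach can give a more realistic measure of accuracy. The test data is more recent than the training data, so the evaluation mirrors how the model is actually used: trained on past reviews and asked to predict on newer ones. A random split lets the model train on reviews written after some of the test reviews, which can make the reported accuracy look better than it will be in practice.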
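
The sort-then-split steps can be sketched in Python with pandas. This is a minimal illustration, not a definitive implementation: the helper `time_based_split`, the column names `review` and `timestamp`, and the ten sample rows are all assumptions made up for the example.

```python
import pandas as pd


def time_based_split(df, time_col="timestamp", train_frac=0.8):
    """Sort rows by time, then return (oldest train_frac of rows, remaining rows)."""
    # Step 1: sort by time, oldest first (stable sort keeps ties in input order).
    ordered = df.sort_values(time_col, kind="stable").reset_index(drop=True)
    # Step 2: cut at the 80% mark; everything after the cut is more recent.
    cut = int(len(ordered) * train_frac)
    return ordered.iloc[:cut], ordered.iloc[cut:]


# Hypothetical sample data: ten reviews, one per day, stored out of order.
days = pd.date_range("2023-01-01", periods=10, freq="D")
order = [3, 7, 0, 9, 1, 5, 8, 2, 6, 4]
reviews = pd.DataFrame({
    "review": [f"review {i}" for i in order],
    "timestamp": days[order],
})

train, test = time_based_split(reviews)
print(len(train), "training rows,", len(test), "test rows")
print("latest training timestamp:", train["timestamp"].max())
print("earliest test timestamp:", test["timestamp"].min())
```

Every training row is at least as old as every test row, so the model is never evaluated on data older than what it was trained on. If the dataset is large or needs several evaluation folds, scikit-learn's `TimeSeriesSplit` offers a related, fold-based variant of the same idea.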