The high computational capacity that we have thanks to the new technologies allows us to communicate two great worlds such as optimization methods and machine learning. The concept behind the hybridization of both worlds is called Learnheuristics which allows to improve optimization methods through machine learning techniques where the input data for learning is the data produced by the optimization methods during the search process. Among the most outstanding machine learning techniques is Q-Learning whose learning process is based on rewarding or punishing the agents according to the consequences of their actions and this reward or punishment is carried out by means of a reward function. This work seeks to compare different Learnheuristics instances composed by Sine Cosine Algorithm and Q-Learning whose different lies in the reward function applied. Preliminary results indicate that there is an influence on the quality of the solutions based on the reward function applied.