We provide a causal inference framework to model the effects of machine learning algorithms on user preferences. We then use this mathematical model to prove that the overall system can be tuned to alter those preferences in a desired manner. A user can be an online shopper or a social media user who is exposed to digital interventions produced by machine learning algorithms. A user preference can be anything from an inclination towards a product to a political party affiliation. Our framework uses a state-space model to represent user preferences as latent system parameters, which can only be observed indirectly via online user actions such as purchases or social media status updates, shares, blogs, or tweets. Based on these observations, machine learning algorithms produce digital interventions such as targeted advertisements or tweets. We model the effects of these interventions through a causal feedback loop, which alters the corresponding preferences of the user. We then introduce algorithms to estimate and then tune the user preferences to a particular desired form. We demonstrate the effectiveness of our algorithms through experiments in different scenarios.

Recent innovations in communication technologies, coupled with the increased use of the Internet and smartphones, have greatly enhanced institutions’ ability to gather and process an enormous amount of information on individual users of social networks and consumers on different platforms [

Furthermore, unlike applications where the machine learning algorithms are used as mere tools for processing and inferring using the available data such as predicting the best movie for a particular user [

Online users are exposed to persuasive technologies and are continually immersed in digital content and interventions in various forms such as advertisements, news feeds, and recommendations [

The Digital Feedback Loop.

To this end, in this paper, we are particularly interested in the causal effects of machine learning algorithms on users [

This problem framework readily models a wide range of real life applications and scenarios [

In one application, the preferences of a consumer form the state, and the advertisements (their content, medium, frequency, etc.) are the actions, that is, the output of the machine learning algorithm. In a different context, the opinions of Facebook users on a particular event or a new product can be represented as the state. Our model is comprehensive: since the advertiser collects data on the consumer, such as spending patterns, demographics, age, gender, and polls, the relevant information on the user, such as his/her age, gender, demographics, and residency, is collectively represented by a side information vector.

A summary of our work in this paper is as follows, with the last bullet being our key contribution:

We model the effects of machine learning algorithms such as recommendation engines on users through a causal feedback loop. We introduce a complete state-space formulation modeling:

We introduce algorithms to estimate the unknown system parameters with and without feedback. In both cases, all the parameters are estimated jointly. We emphasize that we provide a complete set of equations covering all the possible scenarios.

To tune the preferences of users towards a desired sequence, we also introduce a linear regression algorithm and an optimization framework based on stochastic gradient descent. Unlike previous works, which only use the observations to predict certain desired quantities, we specifically design outputs to “update” the internal state of the system in a desired manner, for the first time in the literature.

The rest of the paper is organized as follows. In the next section, we present a comprehensive state-space model that includes the evolution of the latent state vector, underlying observation model and side information. In the same section, we also introduce the causal feedback loop and possible variations to model different real life applications. We then introduce the Extended Kalman Filtering framework to estimate the unknown system parameters. We investigate different real life scenarios including the system with and without the feedback. We present all update and estimation equations. In the following section, we introduce an online learning algorithm to tune the underlying state vector, that is, preferences vector, towards a desired vector sequence through a linear regression and causal feedback loop. We then demonstrate the validity of our introduced algorithms under different scenarios via simulations. We include our simulation results to show that we are able to converge on unknown parameters in designing a system which can steer user preferences. The final section includes conclusions and scope of future work.

In this paper, all vectors are column vectors and denoted by lower case letters. Matrices are represented by uppercase letters. For a vector

We represent preferences of a user as a state vector

The relevant information on the user such as his/her age, gender, demographics, and residency is collectively represented by a side information vector

The machine learning system collects data on the user, say

The preferences of the user change based on prior preferences, different user effects, and trends. We represent this change as

A state-space model representing the evolution of the user preferences without feedback effects.
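The model without feedback can be illustrated with a short simulation. The sketch below is only a minimal illustration under stated assumptions: it assumes a linear-Gaussian form, x[t+1] = F x[t] + w[t] for the latent preferences and y[t] = h · x[t] + v[t] for the observed user activity. The names `F`, `h`, `q_std`, and `r_std` are hypothetical and are not taken from the paper's notation, and side information is omitted for brevity.

```python
import random


def simulate_no_feedback(F, h, x0, q_std, r_std, steps, seed=0):
    """Simulate an assumed linear-Gaussian preference model without feedback.

    x[t+1] = F x[t] + w[t]    (latent preferences, not directly observable)
    y[t]   = h . x[t] + v[t]  (observed user activity)
    """
    rng = random.Random(seed)
    n = len(x0)
    x = list(x0)
    states, observations = [], []
    for _ in range(steps):
        # Observation: noisy linear readout of the latent preferences.
        y = sum(h[i] * x[i] for i in range(n)) + rng.gauss(0.0, r_std)
        states.append(list(x))
        observations.append(y)
        # State evolution: linear dynamics plus additive process noise.
        x = [
            sum(F[i][j] * x[j] for j in range(n)) + rng.gauss(0.0, q_std)
            for i in range(n)
        ]
    return states, observations


if __name__ == "__main__":
    F = [[0.9, 0.1], [0.0, 0.8]]  # hypothetical second-order dynamics
    states, obs = simulate_no_feedback(F, [1.0, 0.5], [1.0, -1.0], 0.03, 0.1, 5)
    for x, y in zip(states, obs):
        print(x, y)
```

In a simulation the latent `states` are available for comparison with their estimates, whereas in practice only `observations` would be accessible.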

To include local trends and seasonality effects, one can use

In the following, we model the effect of the actions of the machine learning algorithm in the “observation” (

Based on the collected data

If we have a finite set of actions, that is,

Based on the actions of the machine learning algorithm (and prior preferences), we assume that the preferences of the user change in a linear state-space form with an additive model for the causal effect [

A complete state-space model of the system with action generation and feedback effects.
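The feedback loop can be sketched in the same spirit. The code below is a simplified scalar illustration, not the paper's formulation: it assumes that the action is a linear function of the current observation, u[t] = k · y[t], and that the action enters the state update additively, x[t+1] = a · x[t] + b · u[t] + w[t]. The symbols `a`, `b`, `h`, and `k` are hypothetical names introduced for this example, and side information is again left out.

```python
import random


def simulate_feedback(a, b, h, k, x0, q_std, r_std, steps, seed=0):
    """Simulate an assumed scalar preference model with a causal feedback loop.

    y[t]   = h * x[t] + v[t]             (observed user activity)
    u[t]   = k * y[t]                    (hypothetical linear action rule)
    x[t+1] = a * x[t] + b * u[t] + w[t]  (additive causal effect of the action)
    """
    rng = random.Random(seed)
    x = x0
    states, observations, actions = [], [], []
    for _ in range(steps):
        # The algorithm only sees the noisy observation, never x itself.
        y = h * x + rng.gauss(0.0, r_std)
        # The action (e.g. an advertisement intensity) is produced from y.
        u = k * y
        states.append(x)
        observations.append(y)
        actions.append(u)
        # The action feeds back into the user's next preference.
        x = a * x + b * u + rng.gauss(0.0, q_std)
    return states, observations, actions
```

Setting `b = 0.0` removes the causal effect of the action, so the same function also covers the case without feedback.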

We can also use a jump state model to represent the causal effects for the case where

Our estimation derivations in the following sections can also be extended to cover this case using a jump state model [

For certain causal inference problems, the action sequence

In the following, we introduce algorithms that optimize

We consider the problem of designing a sequence of actions

The overall system parameters,

Without the feedback loop, the system is described by

For estimating the parameters of the feedback loop, that is,

Using (

Hence, the complete state-space description with causal loop is given by

In (

Since we can control

In the state update equation (

After several steps, we derive the EKF equations to estimate the augmented states for this case as

To obtain an expression for

These updates provide the complete EKF formulation with feedback. In the sequel, we introduce the complete estimation framework where we estimate all the parameters jointly.
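To make the augmented-state idea concrete, the following sketch shows one EKF predict/update step for a deliberately small case. It is an illustration under assumptions rather than the paper's equations: the model is scalar, x[t+1] = a · x[t] + b · u[t] + w[t] with observation y[t] = x[t] + v[t], the unknown coefficient `a` is appended to the state as z = [x, a] with a[t+1] = a[t], and the feedback gain `b` is treated as known. All names (`ekf_joint_step`, `q_x`, `q_a`, `r`) are hypothetical.

```python
def ekf_joint_step(z, P, u, y_next, b, q_x, q_a, r):
    """One EKF predict/update step for the augmented state z = [x, a].

    Assumed model: x[t+1] = a * x[t] + b * u[t] + w[t],  a[t+1] = a[t],
                   y[t]   = x[t] + v[t].
    P is the 2x2 covariance of z; q_x, q_a, r are noise variances.
    """
    x, a = z
    # Predict: the product a * x makes the augmented dynamics nonlinear.
    x_pred = a * x + b * u
    a_pred = a
    # Jacobian of the dynamics with respect to [x, a] is [[a, x], [0, 1]].
    j00, j01 = a, x
    # P_pred = J P J^T + Q, written out for the 2x2 case.
    m00 = j00 * P[0][0] + j01 * P[1][0]
    m01 = j00 * P[0][1] + j01 * P[1][1]
    pp00 = m00 * j00 + m01 * j01 + q_x
    pp01 = m01
    pp10 = P[1][0] * j00 + P[1][1] * j01
    pp11 = P[1][1] + q_a
    # Update with the observation row H = [1, 0].
    s = pp00 + r                      # innovation variance
    k0, k1 = pp00 / s, pp10 / s       # Kalman gain
    innovation = y_next - x_pred
    z_new = [x_pred + k0 * innovation, a_pred + k1 * innovation]
    # P_new = (I - K H) P_pred.
    P_new = [
        [(1.0 - k0) * pp00, (1.0 - k0) * pp01],
        [pp10 - k1 * pp00, pp11 - k1 * pp01],
    ]
    return z_new, P_new
```

Calling this function once per time step, with the action `u` that was applied and the next observation `y_next`, yields running estimates of both the preference and the unknown coefficient. Convergence of such a joint estimate is not guaranteed in general and depends on the initialization and the noise variances chosen.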

We can define a superset of parameters

After some algebra, we get the complete EKF equations as

To obtain an expression for

After straightforward algebra, we get

After the parameters are estimated through methods described in the previous sections, the complete system framework is given by

Our goal in this section is to design

In order to tune the user preferences, we design

To minimize the difference between these two sequences, we introduce a stochastic gradient approach where

If these two conditions are met, then the estimated parameters

In (

To get

Using (

From (

This completes the derivation of the stochastic gradient update for online learning of the tuning regression vector.
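A stochastic gradient update of this kind can be sketched as follows. This is a simplified, hypothetical version and not the derivation above: it assumes a scalar system x[t+1] = a · x[t] + b · u[t] with already estimated `a` and `b`, an action given by a linear regression u[t] = w · s[t] on a feature vector `s[t]`, and a one-step gradient of the squared error that ignores how earlier values of `w` influenced the current state.

```python
def sgd_tuning_step(w, features, x, desired_next, a, b, lr):
    """One stochastic gradient step on the regression vector w.

    Assumed model: u = w . features,  x_next = a * x + b * u.
    Loss: (desired_next - x_next) ** 2, differentiated with respect to w
    while treating the current state x as fixed (one-step approximation).
    """
    # Action produced by the linear regression.
    u = sum(wi * si for wi, si in zip(w, features))
    # Predicted next preference under the assumed model.
    x_next = a * x + b * u
    err = desired_next - x_next
    # d(err**2)/dw_i = -2 * err * b * s_i, so descend along +2 * err * b * s_i.
    w_new = [wi + 2.0 * lr * err * b * si for wi, si in zip(w, features)]
    return w_new, x_next, err


if __name__ == "__main__":
    w, x = [0.0, 0.0], 0.0
    for _ in range(200):
        desired = 1.0                     # hypothetical constant target
        w, x, err = sgd_tuning_step(w, [x, desired], x, desired, 0.5, 1.0, 0.05)
    print(w, x, err)
```

The learning rate `lr` needs to be small relative to the squared norm of the feature vector, otherwise the update can overshoot and the error can grow.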

In this section, we share simulation results showing that the estimated parameters of the system converge to their true values. This supports the claim that a system can be designed with the right parameters so that a sequence of actions or interventions tunes the preferences of a user in a desired manner. Since our main goal is to establish a pathway toward designing a system that can steer user preferences, we consider this basic simulation set sufficient, given the mathematical proof we provided in the form of the EKF formulations. Because we run our experiments as simulations, the true parameters of the system are known to us. In particular, the preferences of the user, which are not directly observable in real life, are known in the simulations. We run simulations of the EKF formulations derived in the previous sections to show that our preference estimates converge to the true preference values, and we illustrate the convergence of our algorithms under different scenarios.

In the first scenario, we have the case where the corresponding system has no feedback. As the true system, we choose a second-order linear state-space model, where ^{−3}^{−3}^{−3} and 10^{−4}, to demonstrate the effect of this design parameter on the system. We emphasize that neither

In Figure

Estimation of the underlying preferences vector when there is no feedback. The results are averaged over 100 independent trials. Here, we have no feedback and parameters of both the state equation and the observation equation are unknown. The results are shown for two different noise variances for the EKF formulation.

In the second set of experiments, we have feedback present; that is,

Estimation of the underlying vector of preferences and the feedback parameters when there is feedback. The results are averaged over 100 independent trials. Two different configurations are simulated for the feedback as well as for the linear control parameters, for example, the fixed and random initial cases. For both scenarios, our estimation process converges to the true underlying processes.

In this paper, we model the effects of the machine learning algorithms such as recommendation engines on users through a causal feedback loop. To this end, we introduce a complete state-space formulation modeling:

We consider our work a significant theoretical first step toward designing a system with the right parameters, one in which a sequence of actions or interventions tunes the preferences of a user in a desired manner. We emphasize that the main goal of our study is to establish a pathway to designing such a system. We achieve this first by providing a mathematical proof and then through a basic set of simulations.

A next step in future studies can be to make the system more stable and also to make the design process easy and practical for system designers. Further analysis on the convergence of the system and more simulations, experiments, and numerical analyses are needed to take our results to the next level. A direct comparison to previous studies is not possible for this first step of our study since, to the best of our knowledge, this is the first time a task of this nature is being undertaken. Our main success criterion is the fact that estimated parameters converge to the real parameter values. However, as our framework evolves, we will be able to track its relative performance.

Another area of focus for future studies is the optimal selection of action sequences. This can be particularly challenging since user preferences can change over time due to the abundance of new products and services. Algorithms to optimally select actions may require online learning and decision making in real time to accommodate these changes.

The authors declare that there are no conflicts of interest regarding the publication of this article.

The authors would like to thank Koc University Graduate School of Social Sciences and Humanities for their support. This work was also supported by the BAGEP Award of the Science Academy.