siRNA Efficiency: Structure or Sequence—That Is the Question

The triumphant success of RNA interference (RNAi) in life sciences is based on its high potency to silence genes in a sequence-specific manner. Nevertheless, the first task for successful RNAi approaches is the identification of highly active small interfering RNAs (siRNAs). It was found early on that the potency of siRNAs can vary drastically. Great progress was made when thermodynamic properties that influence siRNA activity were discovered. Design algorithms based on these parameters enhance the chance to generate potent siRNAs. Still, many siRNAs designed accordingly fail to silence their targeted gene, whereas others are highly efficient although they do not fulfil the recommended criteria. Therefore, the accessibility of the siRNA-binding site on the target RNA has been investigated as an additional parameter that is important for RNAi-mediated silencing. These and other factors that are crucial for successful RNAi approaches will be discussed in the present review.


INTRODUCTION
RNA interference (RNAi) is a naturally occurring phenomenon of RNA-mediated gene silencing that is highly conserved among multicellular organisms (for recent reviews, see, eg, [1][2][3][4]). It is a post-transcriptional process initiated by double-stranded RNA molecules that induce degradation of a complementary target RNA. In the first step of the pathway, long double-stranded RNA molecules are cleaved into shorter duplexes with 2-nucleotide overhangs at both 3′ ends by an endonuclease dubbed Dicer, the structure of which has been solved only recently [5]. The resulting 21-mer effector RNAs, named small or short interfering RNAs (siRNAs), are incorporated into a multimeric protein complex, the RNA-induced silencing complex (RISC). One of the two siRNA strands guides RISC to a complementary RNA. After hybridization, the endonucleolytic "slicer" activity of RISC cleaves the target RNA, thus preventing its translation.
While long double-stranded RNA molecules can be employed to induce RNAi in lower eukaryotes, siRNAs of 21 nucleotides in length have to be used for gene silencing in mammalian cells in order to prevent the activation of an unspecific interferon response [6]. Due to the higher efficiency of siRNAs compared to traditional antisense oligonucleotides and ribozymes [7][8][9] and the relative ease of RNAi-mediated knockdown of target gene expression compared to knockout by homologous recombination, RNAi has rapidly become a standard technology in life sciences. Furthermore, siRNAs are not only powerful new research tools, but are also considered to be a promising new class of therapeutics [10][11][12][13].
In addition to siRNAs, endogenously expressed short double-stranded RNA molecules, referred to as microRNAs (miRNAs), have entered the focus of current research (for a review, see [14]). These molecules are now believed to be important cellular gene regulators that play a major role in developmental processes and various diseases. At the beginning of the miRNA pathway, RNA polymerase II generates long primary RNAs that contain the miRNA sequences. These transcripts, designated pri-miRNAs, are cleaved in the nucleus by an RNase III family enzyme, Drosha, to give pre-miRNAs of approximately 70-90 nucleotides with a 2-nucleotide 3′ overhang. After being exported to the cytoplasm, the pre-miRNA is recognized by Dicer and processed to generate the mature miRNA, which is incorporated into RISC. In contrast to siRNAs, however, miRNAs are capable of inhibiting translation of the targeted mRNA without degrading it (at least in mammalian cells). Still, the siRNA and miRNA pathways share many similarities. Elucidation of the mechanisms of miRNA activity therefore helps to understand the mode of action of siRNAs and vice versa.
Despite the great success of RNAi-mediated approaches, the design of highly efficient siRNAs still remains a hurdle that has to be overcome. Initial expectations expressed at an antisense meeting in 2001 that there is no need to select for optimal siRNA target sequences [15] soon proved to be too optimistic, since a drastic variation of silencing efficiency was observed for different siRNAs directed against the same target RNA [16]. It thus became clear that either factors intrinsic to the siRNA or properties of the targeted mRNA are crucial for the success of an RNAi approach. In the present review, our current knowledge about factors that influence the potency of siRNAs will be summarized and advice will be given that helps with the generation of efficient molecules. It will, however, become obvious that we do not yet know all relevant features, so that even the sophisticated design algorithms available to date do not guarantee satisfactory activity of the proposed siRNAs.

THERMODYNAMIC PROPERTIES OF EFFICIENT siRNAs
Early on, recommendations were given for the selection of siRNA target sites [17]: the selected region should preferably be located in the coding region, at least 50 nucleotides downstream of the start codon; the GC-content should be approximately 50%; and a sequence motif AA(N19)TT was suggested to be advantageous. A BLAST search is necessary to ensure that the siRNA has no significant homologies with genes other than the intended target. Even though these selection criteria have been employed with great success in numerous RNAi experiments, a further increased hit rate for highly potent siRNAs was desirable for the generation of large libraries. Significant progress towards the design of active siRNAs was achieved when an unexpected asymmetry concerning the incorporation of the two strands of siRNAs and miRNAs was found in two independent studies [18,19]. Analysis of the known miRNA sequences in the context of miRNA precursor hairpins revealed a low stability of the 5′ end of the antisense strand compared to the 5′ end of the sense strand [18]. Subsequently, the same feature was observed for siRNAs. Functional duplexes displayed a lower relative thermodynamic stability at the 5′ end of the antisense strand than nonfunctional duplexes. The finding that the relative stabilities of the base pairs at the termini of the two siRNA strands determine the degree to which each strand is fed into the RNAi pathway led to the hypothesis that strand incorporation into RISC is determined by an RNA helicase that initiates dissociation of the miRNA or siRNA duplex at the end with the lower thermodynamic stability [19].
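The early selection rules can be illustrated with a short sketch. The following Python code is not taken from [17]; it is a minimal example that scans a coding sequence for AA(N19)TT windows and keeps those with a GC-content near 50%. The function names, the 40-60% GC window used to stand in for "approximately 50%", and the convention of counting the 50-nucleotide offset from the first nucleotide of the supplied sequence are all assumptions made for illustration. A BLAST search against the transcriptome would still have to be carried out separately.

```python
import re

def gc_content(seq: str) -> float:
    """Fraction of G and C residues in seq."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def find_candidate_sites(cds: str, min_offset: int = 50,
                         gc_min: float = 0.4, gc_max: float = 0.6):
    """Return (start, site) tuples for AA(N19)TT windows in a coding sequence.

    Assumptions (not from the source): positions are 0-based and counted
    from the first nucleotide of cds, and "approximately 50% GC" is
    interpreted as gc_min <= GC <= gc_max for the 19-nt core.
    """
    cds = cds.upper().replace("U", "T")
    hits = []
    # A lookahead is used so that overlapping windows are all reported.
    for match in re.finditer(r"(?=(AA[ACGT]{19}TT))", cds):
        start = match.start()
        if start < min_offset:
            continue  # too close to the start codon
        site = match.group(1)
        core = site[2:21]  # the N19 part between AA and TT
        if gc_min <= gc_content(core) <= gc_max:
            hits.append((start, site))
    return hits
```

Each reported site is 23 nucleotides long; the 19-nucleotide core plus the flanking dinucleotides define the two strands of the siRNA duplex.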
These findings were further refined in a systematic analysis of 180 siRNAs targeting the mRNAs of two genes [20]. In addition to the relative stability of both ends of the siRNA, base preferences at certain positions of the duplex were identified in functional siRNAs. A set of eight criteria was used in an algorithm intended to improve the selection of potent siRNAs (Table 1 and Figure 1). A total of 6 or more points according to this scoring system was proposed to significantly increase the probability for efficient gene silencing.
Independent studies analysing the activities of siRNAs against different mRNAs confirmed the basic outcome of these studies (eg, [21,22]). Although some base preferences at certain positions of the siRNAs were either questioned or added to the list, the relative thermodynamic stability of the siRNA termini was verified to be a major determinant of the functionality of siRNAs. Somewhat different results were obtained when a database was compiled consisting of 398 siRNAs against 92 genes from 30 different studies, in order to overcome a major shortcoming of earlier studies, the low number of genes being targeted [23]. Bioinformatic analysis of the data set led to a set of rules (termed "Stockholm rules") that differs from the scoring systems described above.
Various academic groups and commercial vendors developed software for designing siRNAs based on the identified features of active siRNAs. A list of freely available web tools is given in Table 2. Some additional prediction servers were introduced in a special web server issue of Nucleic Acids Research of July 2004.
In a more recent study, a set of approximately 2200 randomly selected siRNAs targeting 24 mRNA species was used to train a neural network to predict the activity of siRNAs [25]. Statistical analysis of the large data set confirmed some of the criteria discovered previously, but also identified new motifs that are overrepresented in potent siRNAs. The approach of training an artificial neural network goes beyond earlier efforts like the above-mentioned scoring system, which uses a linear summation of parameters, in that it can handle complex sequence motifs and synergistic relations between two or more parameters. The neural network-based algorithm was finally employed to design a library of approximately 50,000 siRNAs that cover the human genome with a redundancy of two siRNAs per gene.
Taken together, the analysis of the sequences of active and nonfunctional siRNAs clearly revealed that the two strands of an siRNA duplex are not equally eligible for assembly into RISC. Rather, the relative stability of both ends of the siRNA is widely considered to determine which of the strands will preferentially participate in the RNAi pathway. It is therefore advisable to take into account the proposed criteria for active siRNAs when designing siRNAs against a new target. It has to be mentioned, however, that following these algorithms does not guarantee the success of an RNAi approach. On the contrary, numerous highly efficient siRNAs have been published that do not obey the rules. Before addressing further determinants of siRNA activity in more detail, a short summary of structural studies will be given that may account for the asymmetric incorporation of the two siRNA strands into RISC.
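The asymmetry rule can be made concrete with a simple sketch. The Python code below compares the two duplex ends using the number of Watson-Crick hydrogen bonds in the terminal base pairs as a crude stand-in for thermodynamic stability. This proxy, the window of four terminal base pairs, and the function names are assumptions made for illustration; the studies cited above rely on calculated free energies rather than on hydrogen bond counts.

```python
# Hydrogen bonds per Watson-Crick pair, indexed by the sense-strand base.
# Used here only as a rough proxy for local duplex stability (assumption).
HBONDS = {"A": 2, "U": 2, "G": 3, "C": 3}

def end_scores(sense: str, k: int = 4):
    """Return (sense_5p_end, antisense_5p_end) stability proxies.

    sense is the base-paired region of the sense strand, written 5'->3'.
    The 5' end of the antisense strand pairs with the 3' end of the sense
    strand, so it is scored on the last k bases of the sense sequence.
    """
    s = sense.upper().replace("T", "U")
    sense_5p_end = sum(HBONDS[base] for base in s[:k])
    antisense_5p_end = sum(HBONDS[base] for base in s[-k:])
    return sense_5p_end, antisense_5p_end

def favored_strand(sense: str, k: int = 4) -> str:
    """Name the strand whose 5' end lies at the less stable duplex end."""
    sense_5p_end, antisense_5p_end = end_scores(sense, k)
    if antisense_5p_end < sense_5p_end:
        return "antisense"
    if sense_5p_end < antisense_5p_end:
        return "sense"
    return "neither"
```

A duplex for which favored_strand returns "antisense" is the arrangement associated with functional siRNAs in the studies discussed above; a result of "sense" suggests that the unintended strand may preferentially enter RISC.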

STRUCTURAL BASIS FOR STRAND ASYMMETRY
In recent years, significant progress has been made in elucidating the molecular basis of RNAi and in understanding the asymmetric strand incorporation (for a review, see [26]). The catalytic activity of RISC, termed slicer, which leads to the cleavage of the target RNA, has been located in the Argonaute2 (Ago2) protein [27]. Ago2 contains two major domains referred to as PIWI and PAZ (acronym for PIWI/Argonaute/Zwille). Crystallographic analysis revealed that the PIWI domain at the C-terminus of the protein closely resembles the structure of RNase H [28]. This enzyme cleaves the RNA component of an RNA/DNA hybrid. The PIWI domain of Ago2 can thus be regarded as a variant of the RNase H structural motif specialized in cleavage of one strand of double-stranded RNAs. Recombinant human Ago2 and an siRNA were found to form a minimal RISC that accurately cleaves substrate RNAs [29]. Interestingly, only single-stranded siRNA could be specifically incorporated into recombinant Ago2, whereas photoreactive double-stranded siRNA did not crosslink with Ago2. This finding indicates the importance of the RISC loading complex (RLC) for efficient incorporation of the siRNA into the Ago2 protein. In Drosophila melanogaster, a heterodimer consisting of Dicer-2 and the double-stranded RNA binding protein R2D2, which contains the siRNA, was found to be important for RISC assembly [30]. R2D2 binds the thermodynamically more stable end of the siRNA, that is, the 3′ end of the guide strand, and can thus determine which one of the strands will be associated with Ago2. It has therefore been described as the "protein sensor for siRNA thermodynamic asymmetry." In human cells, the HIV-1 trans-activating response RNA-binding protein (TRBP) has been found to recruit the Dicer complex to Ago2 [34]. Based on these findings, a model has been proposed for RISC assembly and function [31] that is depicted in Figure 2.
In the cytoplasm, RISC containing Dicer, TRBP, and Ago2 recognizes hairpin RNAs like pre-miRNAs. The RNase III Dicer generates ∼22 nt long duplexes which remain associated with RISC as a ribonucleoprotein complex. In analogy to R2D2 from Drosophila, TRBP and Dicer are likely to sense the thermodynamic asymmetry between the two ends of the duplex. Two recent reports suggest that the passenger strand is cleaved before being removed from the Ago2 protein [32,33]. The guide strand remains bound to the active RISC and recognizes target RNAs by complementary base pairing. The PIWI domain of Ago2 cleaves the target RNA. After release of the cleavage products, RISC can undergo further rounds of target RNA destruction. Interestingly, none of these steps requires energy from ATP hydrolysis. Although RISC can utilize 21-mer siRNA duplexes, pre-miRNA-type Dicer substrates result in a 10-fold higher activity [31].

TARGET SITE ACCESSIBILITY
Although there is no doubt that the design criteria described above increase the success rate in generating active siRNAs, a survey of published RNAi experiments readily reveals that many siRNAs are highly potent although they do not fulfil the recommendations. Even more intriguing is the fact that siRNAs may be unsuitable to silence their target although they comply with these rules. It is thus obvious that additional features have to be considered to optimize the efficiency of RNAi. Some earlier studies had already suggested that the structure of the target RNA may influence siRNA activity [35][36][37]. When it became clear that the design algorithms based solely on thermodynamic parameters of the siRNA are helpful tools, but do not guarantee success of RNAi approaches, target-site accessibility came back into focus.
Figure 2: Model for assembly and function of RISC according to [31] under consideration of [32,33].
Luo and Chang [38] described the local mRNA structure at the target site as the main cause for the positional effect of different siRNAs. As a reliable parameter for target site accessibility, they introduced the "hydrogen bond index" representing the average number of hydrogen bonds formed between nucleotides in the target region and the rest of the mRNA. This index, which has to be determined by bioinformatic secondary structure prediction, was found to correlate inversely with the gene-silencing effect. Further experiments revealed that the tight stem-loop structure of the HIV-1 transactivation response element (TAR) is detrimental to silencing by RNAi [39]. In contrast, the location of the siRNA-binding site within a translated or noncoding region of the mRNA had only marginal effects.
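The idea behind such an index can be sketched as follows. The Python code below is not the published procedure of [38]; it is a minimal illustration that takes a secondary structure in dot-bracket notation (as produced by common structure prediction programs), counts the hydrogen bonds of base pairs that connect the target region to the rest of the RNA, and divides by the length of the region. The choice of two hydrogen bonds for G-U wobble pairs, the averaging over a single structure, and all function names are assumptions.

```python
# Hydrogen bonds per base pair; the value for G-U wobble pairs is an assumption.
PAIR_HBONDS = {frozenset("GC"): 3, frozenset("AU"): 2, frozenset("GU"): 2}

def parse_dot_bracket(structure: str) -> dict:
    """Map each paired position to its partner (0-based indices)."""
    stack, partner = [], {}
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            j = stack.pop()
            partner[i], partner[j] = j, i
    if stack:
        raise ValueError("unbalanced structure")
    return partner

def hydrogen_bond_index(seq: str, structure: str, start: int, length: int) -> float:
    """Average hydrogen bonds per target nucleotide formed with the rest of the RNA.

    Only pairs between a nucleotide inside the window [start, start + length)
    and a nucleotide outside of it are counted (an interpretive assumption).
    """
    if len(seq) != len(structure):
        raise ValueError("sequence and structure differ in length")
    seq = seq.upper().replace("T", "U")
    partner = parse_dot_bracket(structure)
    region = range(start, start + length)
    total = 0
    for i in region:
        j = partner.get(i)
        if j is None or j in region:
            continue  # unpaired, or paired within the target region itself
        total += PAIR_HBONDS.get(frozenset((seq[i], seq[j])), 0)
    return total / length
```

Under these assumptions, a higher value indicates a target region that is more tightly bound to other parts of the RNA and would be expected to be less accessible.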
A systematic global analysis was performed with a set of siRNAs directed against two target RNAs, for which the accessibility of the siRNA target sites was determined by an iterative computational approach and by experimental RNase H mapping [40]. IC50 values as well as the maximal extent of target suppression were significantly improved for siRNAs against accessible local target sites compared to those siRNAs which targeted inaccessible regions of the mRNAs. In contrast, the relative thermodynamic stability of both ends of the siRNA was not found to be a suitable marker for siRNA activity. This finding was further strengthened by a kinetic analysis of isolated human RISC [41]. An siRNA directed against the highly structured RNA of the HIV-1 TAR was found to be incapable of target RNA cleavage. When the tight structure was disrupted by the addition of an oligonucleotide consisting of 2′-O-methyl RNA, target-site accessibility increased, leading to enhanced cleavage of the TAR RNA.
In a recent study, we aimed at deciphering the contributions of both factors, that is, the thermodynamic properties of the siRNA and the target RNA structure, to the efficiency of an RNAi approach by constructing a set of intentionally designed target sites [42]. A highly active siRNA, which is capable of silencing its full-length target RNA in the subnanomolar range, maintained its potency when directed against the isolated target site fused to the green fluorescent protein (GFP). Interestingly, a fusion construct with the siRNA-binding site in reverse orientation was found to be silenced to a much lower extent, confirming the existence of a strand bias. However, incorporation of the original target site into a tight hairpin structure was detrimental to silencing as well. Further experimental and bioinformatic analysis of a set of target RNAs with varying degrees of target-site accessibility revealed a linear correlation between the local free energy in the siRNA-binding region and the extent of gene knockdown. These findings demonstrate that the thermodynamic properties of the siRNA itself as well as the structure of the target RNA both influence the efficiency of an siRNA. We therefore proposed a model, according to which the outcome of an RNAi approach is determined at two points of the multistep process (Figure 3). Firstly, asymmetric strand incorporation into RISC is controlled by thermodynamic properties of the siRNA; secondly, accessibility of the target site may further modulate the efficiency of silencing. Even siRNAs with favorable thermodynamic properties may thus be incapable of inhibiting gene expression in cases in which the binding region is inaccessible due to tight secondary structures.
Design of siRNAs according to the criteria recommended by Reynolds et al [20] frequently results in satisfactory inhibition of gene expression. Some targets, however, are refractory to RNAi-mediated silencing, most likely due to the existence of stable secondary structures. For example, we and others failed to identify efficient siRNAs against the highly structured 5′ untranslated region of plus-stranded RNA viruses and were more successful when targeting less tightly arranged parts of the coding region [43][44][45][46][47]. In such cases, it might be advisable to take the target RNA structure into account as well. Several freely available design algorithms, for example, the Sfold web server (http://sfold.wadsworth.org [48]) and the siRNA design tool offered by MWG-biotech (http://www.mwg-biotech.com [49]), allow the design of siRNAs based on thermodynamic properties of the duplexes with consideration of the predicted secondary structure of the binding region of a potential siRNA.
Figure 3: Efficiency of an siRNA is determined at two points of the RNAi pathway. (1) A strand bias exists that is defined by the intrinsic thermodynamic properties of the siRNA duplex, that is, by the relative stability of both ends. (2) A highly ordered structure may have a detrimental influence on the hybridisation of the siRNA/RISC to its target site and may therefore reduce the efficiency of the silencing process, even in cases in which the intended antisense strand is favored for incorporation into RISC. (Reprinted with slight modifications from the Journal of Molecular Biology; see [42], with kind permission from Elsevier.)

STRATEGIES TO IMPROVE siRNA EFFICIENCY
Detailed bioinformatic analysis of the large set of sequence-activity relationships reported by Huesken et al [25] confirmed that the score according to Reynolds et al [20] as well as the target-site accessibility correlate with the extent of siRNA-mediated gene silencing. However, this investigation clearly revealed that both parameters are insufficient to fully explain or predict the potency of siRNAs (G. Schramm, personal communication). Thus, further factors can be expected to influence the functionality of siRNA molecules. Recently, Patzel et al [50] suggested that the structure of the guide strand could be another feature that is crucial for efficiency. In a series of siRNAs with different structures, guide strands that do not form defined structures or that possess freely available terminal nucleotides, mainly at the 3′ end of the guide strand, were found to increase the efficiency of siRNAs (Figure 4). In contrast, structures with base-paired ends were virtually inactive. Interestingly, in this study neither the thermodynamic duplex profiles nor the target mRNA structure were found to be of major importance for the silencing process.
Figure 4: Influence of guide RNA structure on siRNA efficiency [50]. siRNA guide strands with base-paired termini were found to be inactive, whereas guide RNAs with freely accessible ends (mainly 3′ ends) were highly efficient.
A strategy to circumvent the need to identify suitable individual siRNAs is to use mixtures of siRNAs. To this end, long double-stranded RNA molecules have been processed in vitro by Escherichia coli RNase III [51]. The resulting pool of siRNAs, dubbed endoribonuclease-prepared siRNAs (esiRNA), can subsequently be transfected into cells to silence the corresponding gene. This efficient and cost-effective method allowed the rapid generation of a large library consisting of more than 5000 esiRNAs [51]. It is still under debate whether this approach will elicit severe off-target effects due to the large number of sequences contained in the pool. It has, however, also been argued that pooling of siRNAs might decrease unspecific effects, since this strategy dilutes out the off-target effects of each individual siRNA, while retaining the total target-specific silencing capacity.
Two independent studies described additional approaches to enhance the efficiency of a single siRNA. Conventional siRNAs consist of a 19-mer double-stranded region and two-nucleotide overhangs at the 3′ ends of each strand. Accordingly, short hairpin RNAs used for vector expression are designed with a 19-mer duplex, a loop connecting both strands, and two to four uridines at the 3′ end of the antisense strand. The two more recent publications now report that longer siRNA duplexes are up to 100-fold more potent than the corresponding conventional 21-mer siRNAs [52,53]. In one of these studies, a set of chemically synthesised siRNAs of varying length was used [52]. Silencing efficiency was optimal for siRNAs 27 nucleotides in length. These 27-mers were even suitable for targeting sites that are refractory to silencing by 21-mer siRNAs. Importantly, the 27-mer duplexes did not activate the interferon response or protein kinase R. The authors of the second publication found 29-mer short hairpin RNAs to be particularly potent inducers of RNAi [53]. The higher efficiency of longer double-stranded RNA duplexes might be due to the fact that these siRNAs and shRNAs, respectively, are initially processed by Dicer to give 21-mers. As described above, mechanistic models based on copurification experiments [31] indicate that Dicer is involved in the loading process of siRNAs into RISC, thus explaining the improved potency of Dicer substrates compared to traditional 21-mer siRNAs. In a follow-up study, 27-mer duplexes with 2-base 3′-overhangs were found to be superior to blunt-end duplexes [54]. Interestingly, asymmetric strand utilization was found, with the strand carrying the overhang being preferred for silencing. The authors conclude that Dicer processing confers functional polarity within the RNAi pathway for longer double-stranded RNAs.
Recently developed strategies to generate siRNAs from a miRNA environment followed the same line of employing Dicer substrates for silencing. Stegmeier et al [55] generated an siRNA by replacing a naturally occurring miRNA with a target-specific siRNA sequence flanked by ∼125 bases of 5′ and 3′ sequence derived from the primary miRNA transcript. This construct can be expressed from both Pol III and Pol II promoters, thus opening the way to the use of tissue-specific promoters. The microRNA-type expression of shRNAs has been found to be superior to conventionally expressed isolated shRNAs and has been used to generate large libraries covering a substantial fraction of the predicted genes in the human and mouse genomes [56].

SUMMARY
Various factors have been identified that contribute to the efficacy of small interfering RNAs. Thermodynamic properties of a given siRNA itself influence its asymmetric incorporation into the RNA-induced silencing complex. Furthermore, the local structure of the targeted RNA might render the siRNA-binding region inaccessible, thus preventing efficient silencing. Additional factors like the availability of free ends of the siRNA antisense strand have been described to be relevant to the induction of RNAi. It is, however, clear that all of these features still do not provide an exhaustive description of the determinants of siRNA potency. We can therefore expect additional factors to be identified that contribute to the activity of siRNAs. Additional research is needed to further increase the success rate when designing siRNAs against a new target RNA.