Plan Execution: Theseus  
  Theseus is an execution platform for information agents. Its goals are to allow complex information management plans to be easily specified and to provide an infrastructure that optimizes the execution of such plans. Theseus is based on a streaming dataflow architecture that permits a high degree of horizontal (inter-operational) and vertical (intra-operational) parallelism and pipelining of data.  
     
  As shown in Figure 1, in conventional von Neumann style execution, instructions are executed serially. As a result, (a + b), (c + d), and the multiplication are executed one after another. A dataflow model, in contrast, executes any operator as soon as its inputs are available. Thus, (a + b) and (c + d) can be executed at the same time, and the multiplication fires as soon as both sums are ready.
     
 
Figure 1. An Example Application
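
The difference between the two execution styles can be sketched in a few lines of Python. This is only an illustration, not Theseus code: the `add` function, its artificial 0.2 second delay, and the use of a thread pool are all assumptions chosen to make the two independent additions visibly overlap.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def add(x, y):
    """Hypothetical slow operator; the sleep stands in for real work or latency."""
    time.sleep(0.2)
    return x + y


def serial(a, b, c, d):
    """Von Neumann style: each step waits for the previous one."""
    left = add(a, b)
    right = add(c, d)
    return left * right


def dataflow(a, b, c, d):
    """Dataflow style: both additions start at once because their inputs are ready."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        left = pool.submit(add, a, b)
        right = pool.submit(add, c, d)
        # The multiplication fires only when both of its inputs have arrived.
        return left.result() * right.result()
```

Both functions return the same value; the dataflow version simply spends roughly one delay waiting instead of two.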
 
  Theseus executes plans that are specified as dataflow graphs, that is, networks of operators that pass data to one another. Operators can be thought of as finite state machines that, when enabled, perform a specific type of information management action. Operator inputs and outputs in Theseus are similar to planning pre/post-conditions, except that they can also carry data, thus allowing information to be easily routed through the plan. Declaring plans with operators that produce and consume named data allows execution to be specified succinctly, in terms of only those control and data dependencies necessary to ensure correct execution.
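
To make the idea of operators that produce and consume named data concrete, here is a minimal sketch of a data-driven scheduler. It is not the Theseus plan language: the `Operator` class, the `run_plan` function, and the variable names are hypothetical, and the scheduler runs each "wave" of enabled operators in turn rather than truly in parallel. The point is only that the plan lists dependencies, not an execution order.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Operator:
    """Hypothetical operator: consumes named inputs, produces one named output."""
    name: str
    inputs: Tuple[str, ...]
    output: str
    fn: Callable


def run_plan(operators, initial):
    """Fire every operator whose inputs are available; repeat until none remain."""
    data = dict(initial)
    pending = list(operators)
    waves = []  # names of the operators enabled together at each step
    while pending:
        ready = [op for op in pending if all(v in data for v in op.inputs)]
        if not ready:
            raise ValueError("plan is stuck: some inputs are never produced")
        # Operators in the same wave are independent and could run concurrently.
        results = {op.output: op.fn(*(data[v] for v in op.inputs)) for op in ready}
        data.update(results)
        fired = {op.name for op in ready}
        waves.append([op.name for op in ready])
        pending = [op for op in pending if op.name not in fired]
    return data, waves


# The plan is declared in arbitrary order; only the named data links matter.
plan = [
    Operator("mul", ("s1", "s2"), "result", lambda x, y: x * y),
    Operator("add1", ("a", "b"), "s1", lambda x, y: x + y),
    Operator("add2", ("c", "d"), "s2", lambda x, y: x + y),
]
data, waves = run_plan(plan, {"a": 1, "b": 2, "c": 3, "d": 4})
```

Here `waves` records that both additions are enabled in the first step and the multiplication in the second, even though `mul` is listed first in the plan.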
     
  Theseus operators include those useful for data processing, remote information retrieval, local storage (e.g., in a local relational database), and flexible communication of plan results (e.g., via e-mail). Plans combine these operators with built-in support for loops and conditionals, so that powerful, practical information management plans can be specified.
     
  Through its language and execution system, Theseus enables agents to perform useful information management tasks, such as periodic execution, query result accumulation, and flexible result communication. Most importantly, through properties of its architecture, Theseus reduces the overall effect of network latencies on data integration, providing increased parallelism and asynchrony during execution so that the overall end-to-end agent execution process is substantially faster.  
       
  Speculative Execution  
       
  To reduce the execution time caused by web latencies, Theseus uses a technique called "speculative execution." For example, if we want to build a CarInfo agent, we might use the following plan, which extracts data from several different websites.
       
  As shown in Figure 2, we need the result from Edmunds before we can get the safety rating from NHTSA and the reviews from ConsumerGuide; the input to a later stage of the plan depends on the output of the previous stage. As a result, the total execution time is determined by the longest path through the plan, along which each remote call must wait for the one before it.
     
 
Figure 2. The Original CarInfo Plan
 
       
  Speculative execution uses a combination of caching, machine learning, and transduction to predict an operator's result from its input. Figure 2 shows the original CarInfo plan and Figure 3 shows how speculative execution works. At the beginning of the plan, Theseus uses the input to the first wrapper as a hint to predict the result of the first wrapper. This prediction is used as the input to the next operator (select) while the first wrapper is still retrieving the actual result. Before the plan returns any data, the speculative result is verified against the actual result from the first wrapper, and only correct results are allowed as the output of the plan.
     
 
Figure 3. The CarInfo Plan with Speculative Execution
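
The predict, run ahead, then verify cycle described above can be sketched as follows. This is a simplified illustration under several assumptions, not the Theseus implementation: the predictor is a plain dictionary cache (the machine learning and transduction components are omitted), and `speculative_call`, `first_wrapper`, and `next_operator` are hypothetical names with made-up data.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def speculative_call(wrapper, downstream, predictor, key):
    """Run `downstream` on a predicted wrapper result while the real call is in flight.

    Returns (result, speculation_was_used). Only verified results are returned.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        actual_future = pool.submit(wrapper, key)
        guess = predictor.get(key)  # cache-only predictor (an assumption)
        speculative_future = None
        if guess is not None:
            speculative_future = pool.submit(downstream, guess)
        actual = actual_future.result()
        predictor[key] = actual  # remember the real answer for next time
        # Verification: the speculative output is released only if the guess was right.
        if speculative_future is not None and guess == actual:
            return speculative_future.result(), True
        # Wrong or missing prediction: fall back to the normal, sequential path.
        return downstream(actual), False


def first_wrapper(query):
    """Hypothetical slow remote lookup with made-up data."""
    time.sleep(0.1)
    return {"midsize sedan": "model-A"}.get(query, "unknown")


def next_operator(model):
    """Hypothetical downstream operator that depends on the wrapper's output."""
    time.sleep(0.1)
    return f"details for {model}"


cache = {}
first = speculative_call(first_wrapper, next_operator, cache, "midsize sedan")
second = speculative_call(first_wrapper, next_operator, cache, "midsize sedan")
```

On the first call nothing can be predicted, so the two steps run one after the other. On the second call the cached prediction lets `next_operator` overlap with `first_wrapper`, and the result is returned only after the prediction has been confirmed.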
 
       
  A speculation operator can be attached to every wrapper. With cascading speculation, as shown in Figure 4, the speculated result of the first wrapper can be used as a hint to speculate the result of the next wrapper, and so on. In theory, cascading speculation can achieve arbitrary speedup. More details about Theseus and speculative execution can be found in the publication section.
     
 
Figure 4. Cascading Speculation
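
A back-of-the-envelope model shows why the speedup from cascading speculation can grow without bound in theory. The functions and latencies below are hypothetical, and the model assumes the best case: every prediction is correct and prediction and verification cost nothing, so all wrappers in a chain start at the same time.

```python
def sequential_time(latencies):
    """Without speculation, each wrapper in the chain waits for the previous one."""
    return sum(latencies)


def cascaded_time(latencies):
    """Best case with cascading speculation: every wrapper starts at time zero,
    so the plan finishes when the slowest wrapper returns and is verified."""
    return max(latencies)


def speedup(latencies):
    return sequential_time(latencies) / cascaded_time(latencies)


# Hypothetical chains of wrappers, each taking 2 seconds.
four_stage = speedup([2.0] * 4)   # 8 s sequential vs. 2 s speculative
eight_stage = speedup([2.0] * 8)  # 16 s sequential vs. 2 s speculative
```

Under these idealized assumptions the speedup equals the number of equally slow stages in the chain, so longer chains give larger gains. In practice, wrong predictions and the cost of prediction reduce the benefit.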