MonetDB 优化器

MonetDB 汇编语言优化器

设计MonetDB中间语言的首要理由是能对数据库查询有一个高层次的描述,且这种语言容易被front-end编译器产生,容易编码,优化和解析。

一个有效的优化器需要几种机制保证。它可以执行对代码碎片的标志性评估和收集结果来帮助进一步的抉择。原型情况就是一个优化器估计选择操作结果的大小。另外主要的问题是可以产生和发掘替代评估计划的空间。这种发掘可以发生在前端,也可以在运行的时候对查询碎片。

优化器的基础

1.生命周期的分析

优化器为了做一个抉择可能对代码块的特性可能有兴趣。在代码块中变量都有生命周期,用属性beginLifespan,endLifespan来表示。beginLifespan表示指令在哪里得到其第一个值,endLifespan表示最后指令在哪里被使用作为操作数或目标。如果然而,最后的使用在BARRIER块里,我们不能确定它的生命状态的结束,因为代码块redo可能模糊地使用它。对于这些情况我们关联endLifespan到跳出代码块。

在许多的情况,我们需要决定是否生命周期干扰了一个预先准备好的优化决定。在优化器的开始序列中,生命周期被计算一次。它会被维持去反映最准确的情况当优化基础代码时。特别是,这意味着任何的 move/remove/addition MAL指令调用不是为了重现计算就是为了进一步的传播。不清楚什么会是最好的策略。暂时我们只是重新计算。

在点pc指令中提到里所有的参数都会在指令qc见到且不会同时更新。把变量可能在代码块内声明都考虑进来。这可能使用BARRIER/CATCH 和 EXIT对计算。对于每一个MAL函数安全的属性相对容易决定。这种调用是为了访问MAL函数块和监视签名的参数。

任何的指令可能阻碍公共子表达式的识别。他充分地阻碍一个其参数列表与目标指令有非空的交集的不安全的函数。为了说明,考虑序列:

1
2
3
4
5
6
7
 L1 := f(A, B, C);
  ...
 G1 := g(D, E, F);
  ...
 l2 := f(A, B, C);
  ...
 L2 := h()

指令G1:=g(D, E, F) 阻塞如过G1 是{A,B,C}的别名。换言之,函数g()可能不安全和{D,E,F}和{A,B,C}有非空的交集。一个别名在以后的使用只能是只读。

MonetDB的执行优化前,设定生命周期的代码。

设置变量的生命周期setLifespan
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
Lifespan setLifespan(MalBlkPtr mb)
{
  int pc, k, depth=0, prop;
  InstrPtr p;
  int *blk;
  Lifespan span= newLifespan(mb);

  memset((char*) span,0, sizeof(LifespanRecord)* mb->vtop);
  prop = PropertyIndex("transparent");

  blk= (int *) GDKzalloc(sizeof(int)*mb->vtop);

  for (pc = 0; pc < mb->stop; pc++) {
      p = getInstrPtr(mb, pc);
      if( p->token == NOOPsymbol)
          continue;

      if( blockStart(p) && varGetProp(mb, getArg(p,0), prop) == NULL)
          depth++;

      for (k = 0; k < p->argc; k++) {
          int v = getArg(p,k);

          if (span[v].beginLifespan == 0 ){
              span[v].beginLifespan = pc;
              blk[v]= depth;
          }
          if (k < p->retc )
              span[v].lastUpdate= pc;
          if ( blk[v] == depth )
              span[v].endLifespan = pc;

          if ( k >= p->retc && blk[v] < depth )
              span[v].endLifespan = -1;   /* declared in outer scope*/
      }
      /*
      * @-
      * At a block exit we can finalize all variables defined within that block.
      * This does not hold for dataflow blocks. They merely direct the execution
      * thread, not the syntactic scope.
      */
      if( blockExit(p) ){
          for (k = 0; k < mb->vtop; k++)
          if ( span[k].endLifespan == -1)
              span[k].endLifespan = pc;
          else
          if ( span[k].endLifespan == 0 && blk[k]==depth)
              span[k].endLifespan = pc;
          if (varGetProp(mb, getArg(p,0), prop) == NULL )
              depth--;
      }
  }
  for (k = 0; k < mb->vtop; k++)
  if ( span[k].endLifespan == 0 )
      span[k].endLifespan = pc-2;/* generate them before the end */
  GDKfree(blk);
  return span;
}

2.流分析

在许多优化规则里,语句间的数据流依赖尤其重要。MAL语言编码一个多源,多节点的数据流网络。优化器特别提取部分的工作流和使用语言的属性去枚举语义相等的解决方案,在给定的代价模型这出现导致更好的性能。流图在许多优化步骤里面扮演着重要的角色。什么是原始的和什么存储结构是最足够的是不清楚的。暂时我们引入必要的操作和对程序直接的评估。

对于每个变量我们应该确定他稳定的范围。在流图中的终点描述为不会产生永久数据的dead-code。当你知道这没有影响时,可以把它去掉。Side-effect自由评估应该在前端被知的属性。暂时,我们假设对于系统所有操作都是已知的。属性“不安全”是保留去识别这靠不住的情况。特别的是,一个bun-insert操作是不安全的,因为它改变了其中一个参数。

MonetDB优化器详解:

1.累加器评估:

批量的算术运算相当昂贵,因为对于每个表达式new BATs被创建。这内存饥饿被减少通过探测累加处理的机会,如,一个(临时)的变量被重写。如

考虑下列程序片段:
1
2
3
 t3:=batcalc.*(64,t2);
 t4:=batcalc,+(t1,t3);
 optimizer.accumulators();

如果变量t2是临时变量且不会在以后的程序块用到,我们可以重用它的存储空间和在剩余的代码中传播其别名。

1
2
 batcalc.*(t2,64,t2);
 t4:=batcalc.+(t2,t1,t2);

这实现是直接的。它刚刚只是处理了在BATCALC可用的算术操作。这集合会慢慢地扩展的。关键的决定是去决定是否我们可以重写其中一个参数。在编译的时候,这是很难去检测的,如参数可能是绑定操作或代表一个通过永久BAT表示的视图的结果。因此,编译器注入调用ALGEBRA.REUSE()通过拷贝避免重写永久的BATs。

MonetDB 累加器优化器代码:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
int OPTaccumulatorsImplementation(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci)
{
  int i, limit,slimit;
  InstrPtr p,q;
  Module scope = cntxt->nspace;
  int actions = 0;
  InstrPtr *old;
  Lifespan span;

  (void) pci;
  (void) stk;     /* to fool compilers */
  span = setLifespan(mb);
  if( span == NULL)
      return 0;
  old= mb->stmt;
  limit= mb->stop;
  slimit= mb->ssize;
  if ( newMalBlkStmt(mb,mb->stop) < 0)
      return 0;
  for (i = 0; i < limit; i++) {
      p = old[i];

      if( getModuleId(p) != batcalcRef ) {
          pushInstruction(mb,p);
          continue;
      }
      OPTDEBUGaccumulators
          printInstruction(cntxt->fdout, mb, 0, p, LIST_MAL_ALL);
      if (p->retc==1 && p->argc == 2) {
          /* unary operation, avoid clash with binary */
          pushInstruction(mb,p);
          continue;
      }
      if( getLastUpdate(span,getArg(p,0)) != i ) {
          /* only consider the last update to this variable */
          pushInstruction(mb,p);
          continue;
      }

      if (p->retc==1  && p->argc == 3 && isaBatType(getArgType(mb,p,0))) {
          int b1 =getEndLifespan(span,getArg(p,1))<=i && getArgType(mb,p,1) == getArgType(mb,p,0);
          int b2 =getEndLifespan(span,getArg(p,2))<=i && getArgType(mb,p,2) == getArgType(mb,p,0) ;
          if ( b1 == 0 && b2 == 0){
              pushInstruction(mb,p);
              continue;
          }
          /* binary/unary operation, check arguments for being candidates */
          q= copyInstruction(p);
          p= pushBit(mb,p, b1);
          p= pushBit(mb,p, b2);

          typeChecker(cntxt->fdout, scope, mb, p, TRUE);
          if (mb->errors || p->typechk == TYPE_UNKNOWN) {
              OPTDEBUGaccumulators{
                  mnstr_printf(cntxt->fdout,"# Failed typecheck");
                  printInstruction(cntxt->fdout, mb, 0, p, LIST_MAL_ALL);
              }
              /* reset instruction error buffer */
              cntxt->errbuf[0]=0;
              mb->errors = 0;
              freeInstruction(p);
              p=q; /* restore */
          } else  {
              OPTDEBUGaccumulators{
                  mnstr_printf(cntxt->fdout, "#Found accumulation candidate ");
                  mnstr_printf(cntxt->fdout, "%d: %d(%d)\n", i, getArg(p,0),getArg(p,2));
                  printInstruction(cntxt->fdout, mb, 0, p, LIST_MAL_ALL);
              }
              freeInstruction(q);
              actions++;
          }
          OPTDEBUGaccumulators
              printInstruction(cntxt->fdout, mb, 0, p, LIST_MAL_ALL);
      }
      pushInstruction(mb,p);
  }
  for (i = limit; i<slimit; i++)
      if(old[i])
          freeInstruction(old[i]);
  GDKfree(old);
  GDKfree(span);
  DEBUGoptimizers
      mnstr_printf(cntxt->fdout,"#opt_accumulators:%d accumulations\n",actions);
  return actions;
}

2.别名的去除:

任务 OPTIMIZER.ALIASREMOVAL()浏览程序寻找简单赋值语句,如,V:=W,它用W替代了所有的接下来的V,条件是V只是被赋值一次和W在剩下的代码中不会改变。特殊的例子在迭代的代码块中,如下所示:

1
2
3
4
5
6
7
8
9
10
11
12
         i:=0;
         b:="done";
barrier  go:=true;
         c:= i+1;
         d:="step";
         v:=d;
         io.print(v);
         i:=c;
  redo go:=i lower than 2;
  exit go;
         io.print(b);
         optimizer.aliasRemoval();

字符常量被传入到PRINT()任务,当初始的赋值i:= 0 需要保留。代码块变成:

1
2
3
4
5
6
7
8
9
         i:=0;
barrier  go:=true;
         c:= i+1;
         io.print(step");
         i:=c;
  redo go:=i lower than 2;
  exit go;
         io.print("done");
         optimizer.aliasRemoval();

下面是MonetDB别名去除优化的代码:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
int OPTaliasesImplementation(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr p)
{
  int i,k=1, limit, actions=0;
  int *alias;
  Lifespan span;

  (void) stk;
  (void) cntxt;
  span= setLifespan(mb);
  if( span == NULL)
      return 0;

  alias= (int*) GDKmalloc(sizeof(int)* mb->vtop);
  if (alias == NULL)
      return 0;
  for(i=0; i<mb->vtop; i++) alias[i]=i;

  limit = mb->stop;
  for (i = 1; i < limit; i++){
      p= getInstrPtr(mb,i);
      mb->stmt[k++] = p;
      if (OPTisAlias(p)){
          if( getLastUpdate(span,getArg(p,0)) == i  &&
              getBeginLifespan(span,getArg(p,0)) == i  &&
              getLastUpdate(span,getArg(p,1)) <= i ){
              alias[getArg(p,0)]= alias[getArg(p,1)];
              freeInstruction(p);
              actions++;
              k--;
          } else
              OPTaliasRemap(p,alias);
      } else
          OPTaliasRemap(p,alias);
  }
  for(i=k; i<limit; i++)
      mb->stmt[i]= NULL;
  mb->stop= k;
  /*
  * @-
  * The second phase is constant alias replacement should be implemented.
  */
  GDKfree(span);
  GDKfree(alias);
  DEBUGoptimizers
      mnstr_printf(cntxt->fdout,"#opt_aliases: %d removed\n",actions);
  return actions;

3.代码工厂化 :

在大部分真正的情况,查询在它们的参数有稍微的差别被重复地调用。这种情况通过保持近期查询计划的缓存可以被查询优化器捕捉。在MonetDB上下文这样的查询被表示为参数化的MAL程序。进一步优化缓存函数将查询计划分成两块可能你有帮助。一个区域有不依赖于给定参数的操作和另一区域包含查询的核心使用所有的信息。这样的程序可以被MAL工厂表示,是一个可重入的查询计划。一个工厂化改变代码的例子如下:

1
2
3
4
5
6
7
8
function test(s:str):lng;
  b:= bat.new(:int,:str);
  bat.insert(b,1,"hello");
  z:= algebra.select(b,s,s);
  i:= aggr.count(z);
  return i;
end test;
  optimizer.factorize("user","test");

转变为下面的代码

1
2
3
4
5
6
7
8
9
10
factory user.test(s:str):lng;
  b := bat.new(:int,:str);
  bat.insert(b,1,"hello");
barrier always := true;
  z := algebra.select(b,s,s);
  i := aggr.count(z);
  yield i;
  redo always;
exit always;
end test;

包含的工厂生成器是MAL工厂化的原型实现。被采用的方法是将程序分成两块和把其包装为MAL工厂。优化假设数据库表在工厂的生命时间中只访问一次不会改变。这样的变化应该被外面检测和接着是重启的工厂。一个重定义用户可以识别‘冻僵’的参数的模式会留给将来。因为查询会映射到人和可用的工厂去处理请求。暂时我们简单地重组计划的所有参数。工厂化的操作干扰OPTIMIZER.EXPRESSIONACCUMULATION() 因为这可能会重写参数。暂时,这在本地任务会被捕捉。

MonetDB工厂化优化代码
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
int OPTfactorizeImplementation(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci)
{
  int i, k,  v, noop = 0, se;
  InstrPtr *mbnew;
  InstrPtr p,sig;
  int fk = 0, sk = 0, blk = 0, blkstart = 0;
  int *varused, returnseen = 0, retvar=0;
  InstrPtr *first, *second;
  Lifespan span;

  (void) cntxt;
  (void) pci;
  (void) stk;     /* to fool compilers */

  span = setLifespan(mb);
  if( span == NULL)
      return 0;

  varused = GDKmalloc(mb->vtop * sizeof(int));
  if ( varused == NULL) {
      GDKfree(span);
      return 0;
  }
  for (i = 0; i < mb->vtop; i++)
      varused[i] = 0;

  /* add parameters to use list */
  sig = getInstrPtr(mb, 0);
  for (i = 0; i < sig->argc; i++)
      varused[i] = 1;

  first = (InstrPtr *) GDKzalloc(mb->ssize * sizeof(InstrPtr));
  if ( first == NULL){
      GDKfree(span);
      GDKfree(varused);
      return 0;
  }
  second = (InstrPtr *) GDKzalloc(mb->ssize * sizeof(InstrPtr));
  if ( second == NULL){
      GDKfree(span);
      GDKfree(varused);
      GDKfree(first);
      return 0;
  }

  first[fk++] = getInstrPtr(mb, 0);  /* to become a factory */
  for (i = 1; i < mb->stop - 1; i++) {
      p = getInstrPtr(mb, i);
      se = 0;
      for (k = 0; k < p->argc; k++)
          if (varused[p->argv[k]])
              se++;

      /* detect blocks they are moved to the second part */
      /* a more clever scheme can be designed though */
      if (p->barrier) {
          if (p->barrier == BARRIERsymbol || p->barrier == CATCHsymbol) {
              if (blkstart == 0)
                  blkstart = i;
              blk++;
          } else if (p->barrier == EXITsymbol) {
              blk--;
              if (blk == 0)
                  blkstart = 0;
          }
      }

      /* beware, none of the target variables may live
        before the cut point.  */
      for (k = 0; k < p->retc; k++)
          if (getBeginLifespan(span, p->argv[k])< i || !OPTallowed(p))
              se = 0;
      if (p->barrier == RETURNsymbol) {
          se = 1;
          p->barrier = YIELDsymbol;
          returnseen = 1;
          retvar= getArg(p,0);
      }

      if (se == 0 && blk == 0)
          first[fk++] = p;
      else {
          if (blkstart) {
              /* copy old block stuff */
              for (k = blkstart; k < i; k++)
                  second[sk++] = first[k];
              fk = blkstart;
              blkstart = 0;
          }
          second[sk++] = p;
          for (k = 0; k < p->retc; k++)
              varused[p->argv[k]] = 1;
      }
  }
  second[sk++] = getInstrPtr(mb, i);
  /* detect need for factorization, assume so */
  if (noop || sk == 0) {
      GDKfree(varused);
      GDKfree(first);
      GDKfree(second);
      GDKfree(span);
      /* remove the FToptimizer request */
      return 1;
  }

  first[0]->token = FACTORYsymbol;

  mbnew = (InstrPtr *) GDKmalloc((mb->stop + 4) * sizeof(InstrPtr));
  if ( mbnew == NULL) {
      GDKfree(span);
      GDKfree(varused);
      GDKfree(first);
      GDKfree(second);
      return 0;
  }
  GDKfree(mb->stmt);
  mb->stmt = mbnew;

  mb->stop = mb->stop + 4;

  k = 0;
  for (i = 0; i < fk; i++)
      mb->stmt[k++] = first[i];

  /* added control block */
  v = newVariable(mb, GDKstrdup("always"), TYPE_bit);
  p = newInstruction(NULL,ASSIGNsymbol);
  p->barrier = BARRIERsymbol;
  getArg(p,0) = v;
  p= pushBit(mb,p,TRUE);
  mb->stmt[k++] = p;

  for (i = 0; i < sk - 1; i++)
      mb->stmt[k++] = second[i];

  /* finalize the factory */
  if (returnseen == 0) {
      p= newInstruction(NULL,ASSIGNsymbol);
      p->barrier = YIELDsymbol;
      getArg(p,0)= getArg(sig,0);
      mb->stmt[k++] = p;
  }
  p = newInstruction(NULL,REDOsymbol);
  p= pushReturn(mb, p, v);
  mb->stmt[k++] = p;

  p = newInstruction(NULL,EXITsymbol);
  p= pushReturn(mb, p, v);
  mb->stmt[k++] = p;

  /* return a nil value */
  if ( getVarType(mb,retvar) != TYPE_void){
      p = newInstruction(NULL,RETURNsymbol);
      getArg(p,0) = getArg(sig,0);
      pushArgument(mb,p, retvar);
      mb->stmt[k++] = p;
  }
  /* add END statement */
  mb->stmt[k++] = second[i];

  mb->stop = k;

  GDKfree(varused);
  GDKfree(first);
  GDKfree(second);
  GDKfree(span);
  return 1;
}

4.删除强制转换:

一个简单的优化器去除不需要的强制转换。它们可能来源于草率的代码生成器或函数调用决议决定。如 v:= calc.int(32); 成为一个简单地赋值,不需要函数调用。最主要的角色是一个编码一个优化算法的说明。

MonetDB删除强制转换的代码
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
static int coercionOptimizerStep(MalBlkPtr mb, int i, InstrPtr p)
{
  int t, k, a, b;

  a = getArg(p, 0);
  b = getArg(p, 1);
  t = getVarType(mb, b);
  if (getVarType(mb, a) != t)
      return 0;
  if (strcmp(getFunctionId(p), ATOMname(t)) == 0) {
      removeInstruction(mb, p); /* dead code */
      for (; i < mb->stop; i++) {
          p = getInstrPtr(mb, i);
          for (k = p->retc; k < p->argc; k++)
              if (p->argv[k] == a)
                  p->argv[k] = b;
      }
      return 1;
  }
  return 0;
}
int  OPTcoercionImplementation(Client cntxt,MalBlkPtr mb, MalStkPtr stk, InstrPtr pci)
{
  int i, k;
  InstrPtr p;
  int actions = 0;
  str calcRef= putName("calc",4);

  (void) cntxt;
  (void) pci;
  (void) stk;     /* to fool compilers */

  for (i = 1; i < mb->stop; i++) {
      p = getInstrPtr(mb, i);
      if (getModuleId(p) == NULL)
          continue;
      if (getModuleId(p)==calcRef && p->argc == 2) {
          k= coercionOptimizerStep(mb, i, p);
          actions += k;
          if( k) i--;
      }
  }
  /*
  * This optimizer affects the flow, but not the type and declaration
  * structure. A cheaper optimizer is sufficient.
  */
  DEBUGoptimizers
      mnstr_printf(cntxt->fdout,"#opt_coercion: %d coersions applied\n",actions);
  return actions;
}

5.消除公共子表达式:

消除公共子表达只涉及对程序块的扫描去检测重复出现的语句。最在说明最关键的问题是保证在重复指令里涉及到的参数都是不变的。对OPTIMIZER.COMMONTERMS()分析是相当简陋的。带有可能有side-effects的参数的所有函数应该被标志为‘不安全’。在MAL块中它们的使用跳出涉及所有对象的数据流图(BATs,所有的东西保存在盒子里)。消除子表达优化器位于相同指令的后面。只要发现相同的它就会停止。在我们用前面的变量代替表达式之前,我们假设我们还没有通过一个非空的代码块

1
2
3
4
5
6
7
8
9
10
    b:= bat.new(:int,:int);
  c:= bat.new(:int,:int);
  d:= algebra.select(b,0,100);
  e:= algebra.select(b,0,100);
  k1:= 24;
  k2:= 27;
  l:= k1+k2;
  l2:= k1+k2;
  l3:= l2+k1;
  optimizer.commonTerms();

会被转换成代码块,开始的两个指令不是相同的,因为它们有side 影响

1
2
3
4
5
6
    b := bat.new(:int,:int);
  c := bat.new(:int,:int);
  d := algebra.select(b,0,100
  e := d;
  l := calc.+(24,27);
  l3 := calc.+(l,24);
MonetDB 消除公共子表达式的代码段
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
int  OPTcommonTermsImplementation(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci)
{
  int i, j, k, prop, candidate, barrier= 0, cnt;
  InstrPtr p, q;
  int actions = 0;
  int limit, slimit;
  int *alias;
  InstrPtr *old;
  int *list; 
  /* link all final constant expressions in a list */
  /* it will help to find duplicate sql.bind calls */
  int cstlist=0;
  int *vars;

  (void) cntxt;
  (void) stk;
  (void) pci;
  alias = (int*) GDKzalloc(sizeof(int) * mb->vtop);
  list = (int*) GDKzalloc(sizeof(int) * mb->stop);
  vars = (int*) GDKzalloc(sizeof(int) * mb->vtop);
  if ( alias == NULL || list == NULL || vars == NULL){
      if(alias) GDKfree(alias);
      if(list) GDKfree(list);
      if(vars) GDKfree(vars);
      return 0;
  }

  old = mb->stmt;
  limit = mb->stop;
  slimit = mb->ssize;
  if ( newMalBlkStmt(mb, mb->ssize) < 0){
      GDKfree(alias);
      GDKfree(list);
      GDKfree(vars);
      return 0;
  }

  for ( i = 0; i < limit; i++) {
      p = old[i];

      for ( k = 0; k < p->argc; k++)
      if ( alias[getArg(p,k)] )
          getArg(p,k) = alias[getArg(p,k)];

      /* Link the statement to the previous use, based on the last argument.*/
      if ( p->retc < p->argc ) {
          candidate = vars[getArg(p,p->argc-1)];
          if ( isVarConstant(mb, getArg(p,p->argc-1)) ){
              /* all instructions with constant tail are linked */
              list[i] = cstlist;
              cstlist = i;
          } else
              list[i]= vars[ getArg(p,p->argc-1) ];
          vars[getArg(p,p->argc-1)] = i;
      } else candidate = 0;

      pushInstruction(mb,p);
      if (p->token == ENDsymbol){
          /* wrap up the remainder */
          for(i++; i<limit; i++)
              if( old[i])
                  pushInstruction(mb,old[i]);
          break;
      }
      /*
      * @-
      * Any non-empty barrier block signals the end of this optimizer,
      * the impact of the block can affect the common code.
      */
      barrier |= (p->barrier== BARRIERsymbol || p->barrier== CATCHsymbol) && old[i+1]->barrier!=EXITsymbol;
      /*
      * @-
      * Also block further optimization when you have seen an assert().
      * This works particularly for SQL, because it is not easy to track
      * the BAT identifier aliases to look for updates. The sql.assert
      * at least tells us that an update is planned.
      * Like all optimizer decisions, it is safe to stop.
      */
      barrier |= getFunctionId(p) == assertRef;
      if (p->token == NOOPsymbol || p->token == ASSIGNsymbol || barrier /* || p->retc == p->argc */) {
#ifdef DEBUG_OPT_COMMONTERMS_MORE
              mnstr_printf(cntxt->fdout, "COMMON SKIPPED[%d] %d %d\n",i, barrier, p->retc == p->argc);
#endif
          continue;
      }

      /* from here we have a candidate to look for a match */
#ifdef DEBUG_OPT_COMMONTERMS_MORE
      mnstr_printf(cntxt->fdout,"#CANDIDATE[%d] ",i);
      printInstruction(cntxt->fdout, mb, 0, p, LIST_MAL_ALL);
#endif
      prop = hasSideEffects(p,TRUE) || isUpdateInstruction(p);
      j =   isVarConstant(mb, getArg(p,p->argc-1))? cstlist: candidate;
              
      cnt = mb->stop / 128 < 32? 32 : mb->stop/128;    /* limit search depth */
      if ( !prop)
      for (; cnt > 0 && j ; cnt--, j = list[j])
          if ( getFunctionId(q=getInstrPtr(mb,j)) == getFunctionId(p) && getModuleId(q) == getModuleId(p)  ){
#ifdef DEBUG_OPT_COMMONTERMS_MORE
          mnstr_printf(cntxt->fdout,"#CANDIDATE %d, %d  %d %d ", i, j,
              hasSameSignature(mb, p, q, p->retc),
              hasSameArguments(mb, p, q));
              printInstruction(cntxt->fdout, mb, 0, q, LIST_MAL_ALL);
              mnstr_printf(cntxt->fdout," :%d %d %d=%d %d %d %d %d %d\n",
                  q->token != ASSIGNsymbol ,
                  list[getArg(q,q->argc-1)],i,
                  !hasCommonResults(p, q),
                  !hasSideEffects(q, TRUE),
                  !isUpdateInstruction(q),
                  isLinearFlow(q),
                  isLinearFlow(p));
#endif
              /*
              * @-
              * Simple assignments are not replaced either. They should be
              * handled by the alias removal part. All arguments should
              * be assigned their value before instruction p.
              */
              if ( hasSameArguments(mb, p, q) &&
                  hasSameSignature(mb, p, q, p->retc) &&
                  !hasCommonResults(p, q) &&
                  !isUnsafeFunction(q) &&
                  isLinearFlow(q)
                 ) {
                      if (safetyBarrier(p, q) ){
#ifdef DEBUG_OPT_COMMONTERMS_MORE
                      mnstr_printf(cntxt->fdout,"#safetybarrier reached\n");
#endif
                      break;
                  }
#ifdef DEBUG_OPT_COMMONTERMS_MORE
                      mnstr_printf(cntxt->fdout, "Found a common expression " "%d <-> %d\n", j, i);
                      printInstruction(cntxt->fdout, mb, 0, q, LIST_MAL_ALL);
#endif
                  clrFunction(p);
                  p->argc = p->retc;
                  for (k = 0; k < q->retc; k++){
                      alias[getArg(p,k)] = getArg(q,k);
                      p= pushArgument(mb,p, getArg(q,k));
                  }
#ifdef DEBUG_OPT_COMMONTERMS_MORE
                  mnstr_printf(cntxt->fdout, "COMMON MODIFIED EXPRESSION %d -> %d\n",i,j);
                  printInstruction(cntxt->fdout, mb, 0, p, LIST_MAL_ALL);
#endif
                  actions++;
                  break; /* end of search */
              }
          }
#ifdef DEBUG_OPT_COMMONTERMS_MORE
          else if ( hasSideEffects(q, TRUE) || isUpdateInstruction(p)){
              mnstr_printf(cntxt->fdout, "COMMON SKIPPED %d %d\n", hasSideEffects(q, TRUE) , isUpdateInstruction(p));
              printInstruction(cntxt->fdout, mb, 0, q, LIST_MAL_ALL);
          }
#endif
  }
  for(; i<slimit; i++)
      if( old[i])
          freeInstruction(old[i]);
  GDKfree(list);
  GDKfree(vars);
  GDKfree(old);
  GDKfree(alias);
  DEBUGoptimizers
      mnstr_printf(cntxt->fdout,"#opt_commonTerms: %d statements catched\n",actions);
#ifdef DEBUG_OPT_COMMONTERMS_MORE
      mnstr_printf(cntxt->fdout,"#opt_commonTerms: %d statements catched\n",actions);
#endif
  return actions;
}

6.常量表达式评估:

由编译器产生只涉及常量的参数的表达式可以被评估一次。它特别是与经常被调用的函数相关。一次的查询不会从这额外的步骤得益。考虑下列的包含重复使用的常量参数的代码片段。

1
2
3
4
5
6
7
    a:= 1+1;   io.print(a);
  b:= 2;        io.print(b);
  c:= 3*b;    io.print(c);
  d:= calc.flt(c);   io.print(d);
  e:= mmath.sin(d);  io.print(e);
  optimizer.aliasRemoval();
  optimizer.evaluate();

会被转换成代码块

1
2
3
4
5
6
    io.print(2);
  io.print(2);
  io.print(6);
  io.print(6);
  io.print(-0.279415488);
/*同样的我们尝试基于常量捕捉barrier块*/
MonetDB常量表达式优化
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
int  OPTconstantsImplementation(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr p)
{
  int i,k=1, n=0, fnd=0, actions=0;
  int *alias, *index;
  VarPtr x,y, *cst;

  OPTDEBUGconstants mnstr_printf(cntxt->fdout,"#OPT_CONSTANTS: MATCHING CONSTANTS ELEMENTS\n");

  alias= (int*) GDKzalloc(sizeof(int) * mb->vtop);
  cst= (VarPtr*) GDKzalloc(sizeof(VarPtr) * mb->vtop);
  index= (int*) GDKzalloc(sizeof(int) * mb->vtop);

  if ( alias == NULL || cst == NULL || index == NULL){
      if( alias) GDKfree(alias);
      if( cst) GDKfree(cst);
      if( index) GDKfree(index);
      return 0;
  }

  (void) stk;
  (void) cntxt;

  for (i=0; i< mb->vtop; i++)
      alias[ i]= i;
  for (i=0; i< mb->vtop; i++)
      if ( isVarConstant(mb,i)  && isVarFixed(mb,i) ){
          x= getVar(mb,i);
          fnd = 0;
          if ( x->type && x->value.vtype)
          for( k= n-1; k>=0; k--){
              y= cst[k];
              if ( x->type == y->type &&
                   x->value.vtype == y->value.vtype &&
                  ATOMcmp(x->value.vtype, VALptr(&x->value), VALptr(&y->value)) == 0){
                  OPTDEBUGconstants {
                      mnstr_printf(cntxt->fdout,"#opt_constants: matching elements %s %d %d ", getVarName(mb,i), i,k);
                      ATOMprint(x->value.vtype,VALptr(&x->value),cntxt->fdout);
                      mnstr_printf(cntxt->fdout,"\n");
                  }
                  /* re-use a constant */
                  alias[i]= index[k];
                  fnd=1;
                  actions++;
                  break;
              }
          }
          if ( fnd == 0){
              OPTDEBUGconstants mnstr_printf(cntxt->fdout,"swith elements %d %d\n", i,n);
              cst[n]= x;
              index[n]= i;
              n++;
          }
      }

  for (i = 0; i < mb->stop; i++){
      p= getInstrPtr(mb,i);
      for (k=0; k < p->argc; k++)
          getArg(p,k) = alias[getArg(p,k)];
  }
  DEBUGoptimizers
      mnstr_printf(cntxt->fdout,"#opt_constants: %d constant duplicates removed\n", actions);
  GDKfree(alias);
  GDKfree(cst);
  GDKfree(index);
  return actions;

7.代价模型方法:

代价模型是许多优化决定的基础。代价参数经常是(中间的)结果的大小和反应的时间。换言之,它们是运行聚集,如,从模拟运行中得到的最大的内存使用和总共的运行时间。当前的实现包含一个框架和对自身以代价为基础的例子。OPTIMIZER.COSTMODEL()以自己的方式在MAL程序中运行寻找关系运算和估计它们结果的大小。估计的大小被置后作为属性ROWS。

1
2
3
4
5
6
    r{rows=100} := bat.new(:oid,:int);
  s{rows=1000}:= bat.new(:oid,:int);
  rs:= algebra.select(s,1,1);
  rr:= bat.reverse(r);
  j:= algebra.join(rs,rr);
  optimizer.costModel();

该表指令的属性如下:

1
2
3
4
5
    r{rows=100} := bat.new(:oid,:int);
  s{rows=1000} := bat.new(:oid,:int);
  rs{rows=501} := algebra.select(s,1,1);
  rr{rows=100} := bat.reverse(r);
  j{rows=100} := algebra.join(rs,rr);

代价估计在真正的数据分布上未用任何的统计。它依赖于由front-end或其它优化器提供的ROWS属性。它只是使用了一些启发式代价估计器。然而,它保证空的结果会被ROWS=0标记,如果估计是精确的,否则它假设至少一个结果行。这个属性使安全传递代价估计的结果到减少代码的EMPTYSET优化器成为可能。

MonetDB代价估计实现代码
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
 /*The cost will be used in many places to make decisions.
 * Access should be fast.
 * The SQL front-end also makes the BAT index available as the
 * property bid. This can be used to access the BAT and involve
 * more properties into the decision procedure.
 * Also make sure you don't re-use variables, because then the
 * row count becomes non-deterministic.*/
int OPTcostModelImplementation(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci)
{
  int i, actions = 1;
  wrd k, c1, c2;
  InstrPtr p;
  str sortrevRef= putName("sortReverse",11);
  str sortrevTailRef= putName("sortReverseTail",15);
  str projectRef= putName("project",7);

  (void) cntxt;
  (void) stk;
  (void) pci;

  if (varGetProp(mb, getArg(mb->stmt[0], 0), inlineProp) != NULL)
      return 0;

  for (i = 0; i < mb->stop; i++) {
      p = getInstrPtr(mb, i);
      if (getModuleId(p)==algebraRef) {
          if (getFunctionId(p) == markTRef  ||
              getFunctionId(p) == markHRef  ||
              getFunctionId(p) == selectNotNilRef  ||
              getFunctionId(p) == sortRef  ||
              getFunctionId(p) == sortTailRef  ||
              getFunctionId(p) == sortrevRef  ||
              getFunctionId(p) == sortrevTailRef  ||
              getFunctionId(p) == projectRef  ){
              newRows(1,1,c1,0);
          } else if(getFunctionId(p) == unionRef ||
              getFunctionId(p) == kunionRef) {
              newRows(1,2,(c1+c2),0);
          } else if (getFunctionId(p)== kdifferenceRef) {
              newRows(1,2,(c1==0?0:c2==0?c1: c1 - c2 < 0 ? 1 : c1 - c2+1),0);
          } else if (getFunctionId(p) == joinRef ||
              getFunctionId(p) == leftjoinRef ||
              getFunctionId(p) == leftjoinPathRef ) {
              /* assume 1-1 joins */
              newRows(1,2,(c1 < c2 ? c1 : c2),0);
          } else if (getFunctionId(p) == semijoinRef ) {
              /* assume 1-1 semijoins */
              newRows(1,2,(c1 < c2? c1 : c2),0);
          } else if (getFunctionId(p) == selectRef ||
              getFunctionId(p) == uselectRef ||
              getFunctionId(p) == thetaselectRef ||
              getFunctionId(p) == thetauselectRef) {
              newRows(1,1, (c1 > 100 ? c1 / 2 +1: c1),0);
          } else if (getFunctionId(p) == crossRef) {
              newRows(1,2,((log((double) c1) + log((double) c2) > log(INT_MAX) ? INT_MAX : c1 * c2 +1)),0);
          } else if (getFunctionId(p) == tuniqueRef ) {
              newRows(1,1,( c1 < 50 ? c1 : c1 / 10+1),0);
          }
      } else if (getModuleId(p) == batcalcRef) {
          if( getFunctionId(p) == ifthenelseRef) {
              if( isaBatType(getArgType(mb,p,2) ) ) {
                  newRows(2,2, c1,0);
              } else {
                  newRows(3,3, c1,0);
              }
          } else if( isaBatType(getArgType(mb,p,1)) ){
                  newRows(1,1, c1,0);
              } else {
                  newRows(2, 2, c2,0);
              }
      } else if (getModuleId(p) == batstrRef) {
              newRows(1,1, c1,0);
      } else if (getModuleId(p) == batRef) {
          if (getFunctionId(p) == reverseRef ||
              getFunctionId(p) == setWriteModeRef  ||
              getFunctionId(p) == hashRef  ||
              getFunctionId(p) == mirrorRef) {
              newRows(1,1, c1,0);
          } else if (getFunctionId(p) == appendRef ||
                 getFunctionId(p) == insertRef ){
              /*
              * Updates are a little more complicated, because you have to
              * propagate changes in the expected size up the expression tree.
              * For example, the SQL snippet:
              *     _49:bat[:oid,:oid]{rows=0,bid=622}  := sql.bind_dbat("sys","example",3);
              *     _54 := bat.setWriteMode(_49);
              *     bat.append(_54,_47,true);
              * shows what is produced when it encounters a deletion. If a non-empty
              * append is not properly passed back to _49, the emptySet
              * optimizer might remove the complete deletion code.
              * The same holds for replacement operations, which add information to
              * an initially empty insertion BAT.
              */
              if( isaBatType(getArgType(mb,p,2)) ){
                  /* insert BAT */
                  newRows(1,2, (c1 + c2+1),1);
                  OPTbackpropagate(mb,i,getArg(p,1));
              } else {
                  /* insert scalars */
                  newRows(1,1, (c1 +1),1);
                  OPTbackpropagate(mb,i,getArg(p,1));
              }
          } else if (getFunctionId(p) == deleteRef){
              if( isaBatType(getArgType(mb,p,2)) ){
                  /* delete BAT */
                  newRows(1,2, (c1 - c2 ==0? 1: c1-c2),1);
                  OPTbackpropagate(mb,i,getArg(p,1));
              } else {
                  /* insert scalars */
                  newRows(1,1, (c1==1?1: c1-1),1);
                  OPTbackpropagate(mb,i,getArg(p,1));
              }
              OPTbackpropagate(mb,i,getArg(p,1));
          } else if (getFunctionId(p) == insertRef){
              newRows(1,1,( c1 + 1),0); /* faked */
              OPTbackpropagate(mb,i,getArg(p,1));
          }
      } else if (getModuleId(p)==groupRef) {
          if (getFunctionId(p) ==newRef) {
              newRows(1,1,( c1 / 10+1),0);
          } else {
              newRows(1,1, c1,0);
          }
      } else if (getModuleId(p)== aggrRef) {
          if (getFunctionId(p) == sumRef ||
              getFunctionId(p) == minRef ||
              getFunctionId(p) == maxRef ||
              getFunctionId(p) == avgRef) {
              newRows(1,1, ( c1?c1:c1+1),0);
          } else    if (getFunctionId(p) == countRef){
              newRows(1,1, 1,0);
          }
      } else if( p->token == ASSIGNsymbol && p->argc== 2){
          /* copy the rows property */
          c1 = getVarRows(mb, getArg(p,1));
          if (c1 != -1) {
              ValRecord v;
              
              varSetProp(mb, getArg(p,0), rowsProp, op_eq, VALset(&v, TYPE_wrd, &c1));
          }
      }
  }
  DEBUGoptimizers
      mnstr_printf(cntxt->fdout,"#opt_costModel: processed\n");
  return 1;

8.数据流优化器 :

MAL程序很大程度上是执行计划的逻辑描述。至少它关注没有副影响的操作。对于这些子计划执行的顺序需要的不是一个固定的优先级而可能是数据流驱动的评估。甚至使用多核互不影响地工作在数据流图中。数据流优化器分析代码和为了数据流驱动执行用保护块包装健壮的代码。当然,这只是必要的如果你可以前端决定可能有多线程的运行。

对于运行,解析器根据可用处理器核的数量来初始化多线程。接下来,合格的指令被排序和被解析器线程解析。数据流块可能不是成堆的。因此,为内联代码产生的任何数据流块首先被去除。

MonetDB数据流优化器实现代码
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
int OPTdataflowImplementation(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr p)
{
  int i,j,k, cnt, start=1,entries=0, actions=0;
  int flowblock= 0, dumbcopy=0;
  InstrPtr *old, q;
  int limit, slimit;
  Lifespan span;
  char *init;

  /* don't use dataflow on single processor systems */
  if (GDKnr_threads <= 1)
      return 0;

  (void) cntxt;
  (void) stk;
  /* inlined functions will get their dataflow control later */
  if ( varGetProp(mb, getArg(getInstrPtr(mb,0),0),inlineProp)!= NULL)
      return 0;
  span= setLifespan(mb);
  if( span == NULL)
      return 0;
  init= (char*) GDKzalloc(mb->vtop);
  if ( init == NULL){
      GDKfree(span);
      return 0;
  }
  limit= mb->stop;
  slimit= mb->ssize;
  old = mb->stmt;
  if ( newMalBlkStmt(mb, mb->ssize+20) <0 ){
      GDKfree(span);
      GDKfree(init);
      return 0;
  }
  pushInstruction(mb,old[0]);

  removeDataflow(old,limit);

  /* inject new dataflow barriers */
  for (i = 1; i<limit; i++) {
      p = old[i];

      if (p == NULL)
          continue;

      if (p->token == ENDsymbol)
          break;
      if (hasSideEffects(p,FALSE) || isUnsafeFunction(p) || blockCntrl(p) || (!dumbcopy && blockExit(p)) || (getModuleId(p) == sqlRef && isUpdateInstruction(p)) || dflowAssignTest(span,p,i) ){
          /* close old flow block */
          if (flowblock){
              int sf = simpleFlow(old,start,i);
              if (!sf && entries > 1){
                  for( j=start ; j<i; j++)
                  if (old[j])
                      for( k=0; k<old[j]->retc; k++)
                      if( getBeginLifespan(span,getArg(old[j],k)) > start && getEndLifespan(span,getArg(old[j],k)) >= i && init[getArg(old[j],k)]==0){
                          InstrPtr r= newAssignment(mb);
                          getArg(r,0)= getArg(old[j],k);
                          pushNil(mb,r,getArgType(mb,old[j],k));
                          init[getArg(old[j],k)]=1;
                      }
                  q= newFcnCall(mb,languageRef,dataflowRef);
                  q->barrier= BARRIERsymbol;
                  getArg(q,0)= flowblock;
                  /* dataflow blocks are transparent, because they are always
                    executed, either sequentially or in parallell */
                  varSetProperty(mb, getArg(q,0), "transparent",0,0);
              }
              for( j=start ; j<i; j++)
                  if (old[j])
                      pushInstruction(mb,old[j]);
              if (!sf && entries>1){
                  q= newAssignment(mb);
                  q->barrier= EXITsymbol;
                  getArg(q,0) = flowblock;
              }
              entries = 0;
              flowblock = 0;
              actions++;
          }
          pushInstruction(mb,p);
          continue;
      }

      if (blockStart(p)){
          dumbcopy++;
          if (dumbcopy == 1)
              /* close old flow block */
              if (flowblock){
                  int sf = simpleFlow(old,start,i);
                  if (!sf && entries > 1){
                      for( j=start ; j<i; j++)
                      if (old[j])
                          for( k=0; k<old[j]->retc; k++)
                          if( getBeginLifespan(span,getArg(old[j],k)) > start && getEndLifespan(span,getArg(old[j],k)) >= i && init[getArg(old[j],k)]==0){
                              InstrPtr r= newAssignment(mb);
                              getArg(r,0)= getArg(old[j],k);
                              pushNil(mb,r,getArgType(mb,old[j],k));
                              init[getArg(old[j],k)]=1;
                          }
                      q= newFcnCall(mb,languageRef,dataflowRef);
                      q->barrier= BARRIERsymbol;
                      getArg(q,0)= flowblock;
                      /* dataflow blocks are transparent, because they are always
                        executed, either sequentially or in parallell */
                      varSetProperty(mb, getArg(q,0), "transparent",0,0);
                  }
                  for( j=start ; j<i; j++)
                      if (old[j])
                          pushInstruction(mb,old[j]);
                  if (!sf && entries>1){
                      q= newAssignment(mb);
                      q->barrier= EXITsymbol;
                      getArg(q,0) = flowblock;
                  }
                  entries = 0;
                  flowblock = 0;
                  actions++;
              }
      }
      if (blockExit(p)) {
          assert(flowblock == 0);
          dumbcopy--;
          pushInstruction(mb,p);
          continue;
      }
      if (dumbcopy) {
          assert(flowblock == 0);
          pushInstruction(mb,p);
          continue;
      }
      if (flowblock == 0){
          flowblock = newTmpVariable(mb,TYPE_bit);
          entries = 0;
          start = i;
      }
      /* check if the instruction can start a flow */
      /* this should be a function call with multiple arguments */
      cnt = 0;
      if (getFunctionId(p))
          for(j=p->retc; j<p->argc; j++)
              if ( isVarConstant(mb, getArg(p,j)) || getLastUpdate(span, getArg(p,j)) <= start)
                  cnt++;
      if (cnt && dflowAssignTest(span,p,i))
          cnt = 0;

      if (cnt && cnt == p->argc-p->retc)
          entries++;
  }
  /* close old flow block */
  if (flowblock){
      int sf = simpleFlow(old,start,i);
      if (!sf && entries > 1){
          for( j=start ; j<i; j++)
          if (old[j])
              for( k=0; k<old[j]->retc; k++)
              if( getBeginLifespan(span,getArg(old[j],k)) > start && getEndLifespan(span,getArg(old[j],k)) >= i && init[getArg(old[j],k)]==0){
                  InstrPtr r= newAssignment(mb);
                  getArg(r,0)= getArg(old[j],k);
                  pushNil(mb,r,getArgType(mb,old[j],k));
                  init[getArg(old[j],k)]=1;
              }
          q= newFcnCall(mb,languageRef,dataflowRef);
          q->barrier= BARRIERsymbol;
          getArg(q,0)= flowblock;
          /* dataflow blocks are transparent, because they are always
            executed, either sequentially or in parallell */
          varSetProperty(mb, getArg(q,0), "transparent",0,0);
      }
      for( j=start ; j<i; j++)
          if (old[j])
              pushInstruction(mb,old[j]);
      if (!sf && entries>1){
          q= newAssignment(mb);
          q->barrier= EXITsymbol;
          getArg(q,0) = flowblock;
      }
      entries = 0;
      flowblock = 0;
      actions++;
  }
  /* take the remainder as is */
  for (; i<limit; i++)
      if (old[i])
          pushInstruction(mb,old[i]);
  for (; i<slimit; i++)
      if (old[i])
          freeInstruction(old[i]);
  GDKfree(old);
  GDKfree(span);
  GDKfree(init);
  DEBUGoptimizers
      mnstr_printf(cntxt->fdout,"#opt_dataflow: %d flow blocks created\n",actions);
  return actions;
}

9.去除无用代码

无用代码碎片通过赋值到不再使用的变量被识别。它可以被探测通过标记作为使用的参数作为相关的所有的变量。同时,我们建立应该在最后结果出现的一系列指令。新建的代码块在一次扫描中建立,去除无用的指令。指令对环境产生副影响,如,输出和更新BAT应该被考虑进来。这样(可能递归)函数应该被标记有一个(不安全)的属性。现在我们识别了几个重要的,否则,指令被标记为控制流指令应该被保留。一个说明性例子的MAL片段如下:

1
2
3
4
5
6
7
8
9
    V7 := bat.new(:oid,:int);
  V10 := bat.new(:int,:oid);
  V16 := algebra.markH(V7);
  V17 := algebra.join(V16,V7);
  V19 := bat.new(:oid,:int);
  V22 := bat.new(:oid,:int);
  V23 := algebra.join(V16,V22);
  io.print("done");
  optimizer.deadCodeRemoval();

去除无用的代码使程序缩小到一下的短小的代码块:

1
    io.print("done");

对无用代码的提炼来源与使用停止存在的参数因为行为被优化器做了。如,在下面的代码片段PUSHRANGES优化器可能得出变量V31变为空的和简单地通过去掉赋值语句注入一个‘无用’变量。这同时使其他代码无用。

1
2
3
4
    V30 := algebra.select( V7, 10,100);
  V31 := algebra.select(V30,-1,5);
  V32 := aggr.sum(V31);
  io.print(V32);
MonetDB去除无用的代码实现
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
int OPTdeadcodeImplementation(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci)
{
  int i, k, se,limit, slimit;
  InstrPtr p=0, *old= mb->stmt;
  int actions = 0;

  (void) pci;
  (void) stk;     /* to fool compilers */

  if (varGetProp(mb, getArg(mb->stmt[0], 0), inlineProp) != NULL)
      return 0;

  clrDeclarations(mb);
  chkDeclarations(cntxt->fdout, mb);
  limit= mb->stop;
  slimit = mb->ssize;
  if ( newMalBlkStmt(mb, mb->ssize) < 0)
      return 0;

  pushInstruction(mb, old[0]);
  for (i = 1; i < limit; i++) {
      p= old[i];

      se = p->token == ENDsymbol;
      if( se){
          pushInstruction(mb,p);
          for(i++; i<limit; i++)
              if(old[i])
                  pushInstruction(mb,old[i]);
          break;
      }
      if( p->token != NOOPsymbol)
      for (k = 0; k < p->retc; k++)
          if( isVarUsed(mb,getArg(p,k)) ){
              se++;
              break;
          }

      if ( p->token == NOOPsymbol){
          freeInstruction(p);
          actions++;
      } else
      if( getModuleId(p)== sqlRef && getFunctionId(p)== assertRef &&
          isVarConstant(mb,getArg(p,1)) && getVarConstant(mb,getArg(p,1)).val.ival==0){
          freeInstruction(p);
          actions++;
      } else
      if (se || hasSideEffects(p, FALSE) || isUpdateInstruction(p) || !isLinearFlow(p) ||
              isProcedure(mb,p)  ||
              (p->retc == 1 && varGetProp( mb, getArg(p,0), unsafeProp ) != NULL) ||
              p->barrier /* ==side-effect */)
          pushInstruction(mb,p);
      else {
          freeInstruction(p);
          actions++;
      }
  }
  for(; i<slimit; i++)
      if( old[i])
          freeInstruction(old[i]);
  DEBUGoptimizers
      mnstr_printf(cntxt->fdout,"#opt_deadcode: %d statements removed\n", actions);
  GDKfree(old);
  /* we may have uncovered new use-less operations */
  if (actions)
      actions += OPTdeadcodeImplementation(cntxt,mb, stk, pci);
  return actions;
}

10.去除空集:

在MAL优化期间其中最关键决定是估计产生和消耗的BAT的大小。两种情况对标志处理有兴趣。也就是,当一个BAT被知道没有包含元组和元组只有一个元素。这样的信息可能来自应用领域只是或者作为从标志评估另外的影响。这关联到作为属性被探测的程序。空集属性被呈现的消减算法使用。任何空集在程序中被传播到达一个更小和因此更快的评估。如,考虑接下来的MAL测试:

1
2
3
4
5
6
7
8
9
10
    V1 := bat.new(:oid,:int);
  V7 := bat.new(:oid,:int);
  V10{rows=0} := bat.new(:int,:oid);
  V11 := bat.reverse(V10);
  V12 := algebra.kdifference(V7,V11);
  V16 := algebra.markH(V12);
  V17 := algebra.join(V16,V7);
  bat.append(V1,V17);
  optimizer.costModel();
  optimizer.emptySet();

调用优化器用接下来的程序片段取代上面的程序

1
2
3
4
5
6
7
8
    V1 := bat.new(:oid,:int);
  V7 := bat.new(:oid,:int);
  V10{rows=0} := bat.new(:int,:oid);
  V11{rows=0} := bat.new(:oid,:int);
  V12 := V7;
  V16 := algebra.markH(V12);
  V17 := algebra.join(V16,V7);
  bat.append(V1,V17);

这代码块可以使用别名传播和去除无用代码继续优化。最后的代码块如下:

1
2
3
4
5
    V1 := bat.new(:oid,:int);
  V7 := bat.new(:oid,:int);
  V16 := algebra.markH(V7);
  V17 := algebra.join(V16,V7);
  bat.append(V1,V17);

在空集传播时,新的候选者可能出现。如,与空集进行交运算还是得到空集。它成为中间优化的目标。当前的实现是保守的。一个有限的指令集合被考虑。任何添加到MonetDB指令集调用评估它们的影响。

MonetDB去除空集的代码实现
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
static int  ESevaluate(Client cntxt, MalBlkPtr mb, char *empty)
{
  int i, j, actions = 0;
  InstrPtr p;
  str existRef = putName("exist", 5);
  str uniqueRef = putName("unique", 6);
  str suniqueRef = putName("sunique", 7);
  str intersectRef = putName("intersect", 9);
  str sintersectRef = putName("sintersect", 10);
  str kintersectRef = putName("kintersect", 10);
  str fragmentRef = putName("fragment", 8);
  int *alias;
  int runonce= FALSE;

  int limit = mb->stop, slimit= mb->ssize, ctop=0;
  InstrPtr *old = mb->stmt, *constraints;

  (void) cntxt;
  /* get query property */
  runonce = (varGetProp(mb, getArg(old[0], 0), runonceProp) != NULL);
  if (varGetProp(mb, getArg(old[0], 0), inlineProp) != NULL)
      return 0;

  if ( newMalBlkStmt(mb, mb->ssize) < 0)
      return 0;
  constraints= (InstrPtr *) GDKmalloc(sizeof(InstrPtr)*slimit);
  if ( constraints == NULL) {
      GDKfree(mb->stmt);
      mb->stmt = old;
      mb->stop = limit;
      mb->ssize = slimit;
      return 0;
  }
  alias = (int*) GDKmalloc(sizeof(int)*mb->vtop);
  if( alias == NULL){
      GDKfree(mb->stmt);
      mb->stmt = old;
      mb->stop = limit;
      mb->ssize = slimit;
      GDKfree(constraints);
      return 0;
  }
  for(i=0;i<mb->vtop; i++)
      alias[i]=i;

  /* Symbolic evaluation of the empty BAT variables */
  /* by looking at empty BAT arguments */
  for (i = 0; i < limit; i++) {
      char *f;
      p = old[i];

      pushInstruction(mb,p);
      if (p->token == ENDsymbol){
          for(i++; i<limit; i++)
              if (old[i])
                  pushInstruction(mb,old[i]);
          break;
      }
      for(j=0; j<p->argc; j++)
          p->argv[j] = alias[getArg(p,j)];
      /*
          * The bulk of the intelligence lies in inspecting calling
          * sequences to filter and replace calls with empty arguments.
          */
      f = getFunctionId(p);
      if (getModuleId(p) == sqlRef &&
          empty[getArg(p,0)] &&
         (f == bindRef || f == bindidxRef || f == binddbatRef)){
          InstrPtr q;
          /*
          * @-
          * The emptyset assertion is only needed once for relational insertions.
          * We assume here that string constants have been matched already.
          */
          if( f == bindRef && runonce == FALSE) {
              for( j=ctop-1; j>=0; j--){
                  q= constraints[j];
                  if( strcmp(getVarConstant(mb,getArg(q,2)).val.sval, getVarConstant(mb,getArg(p,2)).val.sval)==0 &&
                      strcmp(getVarConstant(mb,getArg(q,3)).val.sval, getVarConstant(mb,getArg(p,3)).val.sval)==0 &&
                      getVarConstant(mb,getArg(p,5)).val.ival<=2 && /* no updates etc */
                      getVarConstant(mb,getArg(q,5)).val.ival == getVarConstant(mb,getArg(p,5)).val.ival
                  )
                      /* don't generate the assertion */
                      goto ignoreConstraint;
              }

              q = newStmt1(mb, constraintsRef, "emptySet");
              (void) pushArgument(mb, q, getArg(p,0) );
              constraints[ctop++]= p;
          }
      ignoreConstraint:
          continue;
      }

      for (j = p->retc; j < p->argc; j++) {

          if (empty[getArg(p, j)]) {
              /* decode operations */
              if (getModuleId(p)== algebraRef) {
                  if (f == existRef) {
                      /* always false */
                      setModuleId(p, NULL);
                      setFunctionId(p, NULL);
                      p->argc = 1;
                      p->token = ASSIGNsymbol;
                      (void) pushBit(mb, p, FALSE);
                      actions++;
                      break;
                  }
                  if ( f == selectRef ||
                       f == tuniqueRef ||
                       f == likeRef  ||
                       f == sortRef  ||
                       f == sortTailRef  ||
                       f == sortHTRef  ||
                       f == sortTHRef  ||
                       f == uniqueRef  ||
                       f == suniqueRef  ||
                       f == kuniqueRef  ||
                       f == intersectRef  ||
                       f == semijoinRef ||
                       f == sintersectRef  ||
                       f == kintersectRef  ||
                       f == fragmentRef ){

                      /* result is empty */
                      propagate(1);
                      break;
                  }
                  if ( f == differenceRef ||
                       f == kdifferenceRef ) {
                      propagate(1);
                      break;
                  }
                  if ( f == sunionRef ||
                       f == kunionRef ||
                       f == unionRef) {
                      /* copy non-empty argument */
                      if( j == 1) {
                          propagate(2);
                      } else {
                          propagate(1);
                      }
                      break;
                  }
              }
              if (getModuleId(p)== batRef) {
                  if ( f == reverseRef || f == mirrorRef ){
                      empty[getArg(p, 0)]= 1;
                  }
              }
              /*
              * @-
              * If the target variable is empty and the function does not
              * have a side-effect, we can replace it with a construction
              * of the empty set. The dead-code optimizer will take care
              * of removal of superflous constructions.
              */
              if( p->retc==1 && p->token == ASSIGNsymbol &&
                  !isLinearFlow(p) &&
                  isaBatType(getArgType(mb,p,0))){
                  int tpe=getArgType(mb,p,0);
                  clrFunction(p);
                  setModuleId(p, batRef);
                  setFunctionId(p, newRef);
                  p= pushArgument(mb, p,
                      newTypeVariable(mb, getHeadType(tpe)));
                  (void) pushArgument(mb, p,
                      newTypeVariable(mb, getTailType(tpe)));
                  actions++;
                  break;
              }
          }
      }
  }
  for(; i<slimit; i++)
      if (old[i])
          freeInstruction(old[i]);
  GDKfree(old);
  if (actions) {
      DEBUGoptimizers
      mnstr_printf(cntxt->fdout,"#opt_emptyset: %d empty sets statements removed\n",actions);
      clrAllTypes(mb);     /* force a complete resolve */
  }
  GDKfree(constraints);
  GDKfree(alias);
  return actions;
}
 /*We first have to find all candidates for empty set removal.
  They are recognized by an estimated zero row count and they
  are not the target of an update.*/

int OPTemptySetImplementation(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr p)
{
  char *empty;
  int i;

  empty = (char *) GDKzalloc(mb->vsize * sizeof(char));
  if ( empty == NULL)
      return 0;
  (void) stk;
  (void) p;
  for (i = 0; i < mb->vtop; i++) {
      if (getVarRows(mb, i) == 0) {
          OPTDEBUGemptySet
              mnstr_printf(cntxt->fdout, "#START emptyset optimizer %d", i);
          empty[i] = 1;
      }
  }
  OPTDEBUGemptySet mnstr_printf(cntxt->fdout, "\n");
  i= ESevaluate(cntxt, mb, empty);
  GDKfree(empty);
  return i;
}

11.垃圾回收:

对临时的变量的垃圾回收,如字符串和BATs,在返回函数调用的时候发生。特别对于BATs这可能保留可观的资源锁定长于严格必要的时间。尽管程序员可以影响它们的生命周期通过给它们赋值NIL,从而触发垃圾会后,依靠优化器去注入这样的语句更加恰当。因为它使程序更加短小和有一个更好的代码优化的目标。OPTIMIZER.GARBAGECOLLECTOR()操作去除所有结束生命周期的BAT为新的提供空间。这特别被调用为优化的最后一步。垃圾回收影响的一小段代码:

1
2
3
4
5
6
7
8
    t1 := bat.new(:oid,:int);
  t2 := array.grid(132000,8,1,0);
  t3 := array.grid(1,100,10560,0);
  t4 := array.grid(1,100,10560,0,8);
  t5 := batcalc.+(t2,t4);
  t6 := batcalc.oid(t5);
  t7 := algebra.join(t6,t1);
  optimizer.garbageCollector();

被转换为以下的代码块

1
2
3
4
5
6
7
8
9
10
11
12
    t1 := bat.new(:oid,:int);
  t2 := array.grid(132000,8,1,0);
  t3 := array.grid(1,100,10560,0);
  t4 := array.grid(1,100,10560,0,8);
  t5 := batcalc.+(t2,t4);
  bat.setGarbage(t2);
  bat.setGarbage(t4);
  t6 := batcalc.oid(t5);
  bat.setGarbage(t5);
  t7 := algebra.join(t6,t1);
  bat.setGarbage(t6);
  bat.setGarbage(t1);

当前的算法是直接的。在每一条指令后,我们检查在未来其BAT参数是否需要。如果不需要,我们注入垃圾回收语句去释放她们呢,如果没有其它理由去保留它。这应该小心地去做,因为指令可能是循环的一部分。如果变量在循环里定义,那么我们可以安全地去掉它。

MonetDB垃圾回收代码实现
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
 /*Keeping variables around beyond their end-of-life-span
 can be marked with the proper 'keep'.*/
int OPTgarbageCollectorImplementation(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr pci)
{
  int i, j, k, n = 0, limit, vlimit, depth=0, slimit;
  InstrPtr p, q, *old;
  int actions = 0;
  Lifespan span;

  (void) pci;
  (void) cntxt;
  (void) stk;
  if (varGetProp(mb, getArg(mb->stmt[0], 0), inlineProp) != NULL)
      return 0;

  span = setLifespan(mb);
  if ( span == NULL)
      return 0;

  old= mb->stmt;
  limit = mb->stop;
  slimit = mb->ssize;
  vlimit = mb->vtop;
  if ( newMalBlkStmt(mb,mb->ssize) < 0) {
      GDKfree(span);
      return 0;
  }

  p = NULL;
  for (i = 0; i < limit; i++) {
      p = old[i];
      p->gc &=  ~GARBAGECONTROL;

      if ( p->barrier == RETURNsymbol){
          pushInstruction(mb, p);
          continue;
      }
      if (blockStart(p) )
          depth++;
      if ( p->token == ENDsymbol)
          break;
      
      pushInstruction(mb, p);
      n = mb->stop-1;
      for (j = 0; j < p->argc; j++) {
          if (getEndLifespan(span,getArg(p,j)) == i && isaBatType(getArgType(mb, p, j)) ){
              mb->var[getArg(p,j)]->eolife = n;
              p->gc |= GARBAGECONTROL;
          }
      }
      if (blockExit(p) ){
          /* force garbage collection of all within upper block */
          depth--;
          for (k = 0; k < vlimit; k++) {
              if (getEndLifespan(span,k) == i &&
                  isaBatType(getVarType(mb,k)) &&
                  varGetProp(mb, k, keepProp) == NULL){
                      q= newAssignment(mb);
                      getArg(q,0) = k;
                      setVarUDFtype(mb,k);
                      setVarFixed(mb,k);
                      q= pushNil(mb,q, getVarType(mb,k));
                      q->gc |= GARBAGECONTROL;
                      mb->var[k]->eolife = mb->stop-1;
                      actions++;
              }
          }
      }
  }
  assert(p);
  assert( p->token == ENDsymbol);
  pushInstruction(mb, p);
  for (i++; i < limit; i++)
      pushInstruction(mb, old[i]);
  for (; i < slimit; i++)
      if (old[i])
          freeInstruction(old[i]);
  getInstrPtr(mb,0)->gc |= GARBAGECONTROL;
  GDKfree(old);
  OPTDEBUGgarbageCollector{
      int k;
      mnstr_printf(cntxt->fdout, "#Garbage collected BAT variables \n");
      for ( k =0; k < vlimit; k++)
      mnstr_printf(cntxt->fdout,"%10s eolife %3d  begin %3d lastupd %3d end %3d\n",
          getVarName(mb,k), mb->var[k]->eolife,
          getBeginLifespan(span,k), getLastUpdate(span,k), getEndLifespan(span,k));
      mnstr_printf(cntxt->fdout, "End of GCoptimizer\n");
  }
  GDKfree(span);

  DEBUGoptimizers
      mnstr_printf(cntxt->fdout,"#opt_garbagecollector: %d variables reset\n",actions);
  return actions+1;
}

13.连接路径:

任务OPTIMIZER.JOINPATH()浏览代码寻找连接操作和级联它们到多连接路径。为了说明,考虑:

1
2
3
4
5
6
7
    a:= bat.new(:oid,:oid);
  b:= bat.new(:oid,:oid);
  c:= bat.new(:oid,:str);
  j1:= algebra.join(a,b);
  j2:= algebra.join(j1,c);
  j3:= algebra.join(b,b);
  j4:= algebra.join(b,j3);

优化器首先会通过它们连接的顺序替代所有的参数。下面的指令留给去除无用代码优化器优化

1
2
3
4
5
    a:= bat.new(:oid,:oid);
  j1:= algebra.join(a,b);
  j2:= algebra.joinPath(a,b,c);
  j3:= algebra.join(b,b);
  j4:= algebra.joinPath(b,b,b);

在原则上,连接路径可能包含改善性能的公共的子路径。SQL front-end 经常产生以下代码片段:

1
2
3
4
5
    t1:= algebra.join(b,c);
  z1:= algebra.join(a,t1);
  ...
  t2:= algebra.join(b,d);
  z2:= algebra.join(a,t2);

连接路径合并成:

1
2
3
    z1:= algebra.joinPath(a,b,c);
  ...
  z2:= algebra.joinPath(a,b,d);

由启发式寻找最先的两个参数控制和重用实质的连接

1
2
3
4
    _13:= algebra.join(a,b);
  z1:= algebra.join(_13,c);
  ...
  z2:= algebra.join(_13,d);

一个替代是公共重新使用的路径重识别到连接路径主体继承的部分

1
2
3
4
5
6
    x3:= algebra.join(a,b);
  r3:= bat.reverse(x3);
  j1:= join(c,r3);
  rb:= bat.reverse(b);
  ra:= bat.reverse(a);
  j1:= algebra.joinpath(c,rb,ra);
MonetDB连接路径优化代码的实现
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
int OPTjoinPathImplementation(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr p)
{
  int i,j,k, actions=0;
  int *pc;
  str joinPathRef = putName("joinPath",8);
  str leftjoinPathRef = putName("leftjoinPath",12);
  str semijoinPathRef = putName("semijoinPath",12);
  InstrPtr q,r;
  InstrPtr *old;
  int *varcnt;       /* use count */
  int limit,slimit;

  (void) cntxt;
  (void) stk;
  if (varGetProp(mb, getArg(mb->stmt[0], 0), inlineProp) != NULL)
      return 0;

  old= mb->stmt;
  limit= mb->stop;
  slimit= mb->ssize;
  if ( newMalBlkStmt(mb,mb->ssize) < 0)
      return 0;

  /* beware, new variables and instructions are introduced */
  pc= (int*) GDKzalloc(sizeof(int)* mb->vtop * 2); /* to find last assignment */
  varcnt= (int*) GDKzalloc(sizeof(int)* mb->vtop * 2);
  if (pc == NULL || varcnt == NULL){
      if (pc ) GDKfree(pc);
      if (varcnt ) GDKfree(varcnt);
      return 0;
  }
  /*
  * @-
  * Count the variable use as arguments first.
  */
  for (i = 0; i<limit; i++){
      p= old[i];
      for(j=p->retc; j<p->argc; j++)
          varcnt[getArg(p,j)]++;
  }

  for (i = 0; i<limit; i++){
      p= old[i];
      if( getModuleId(p)== algebraRef && (getFunctionId(p)== joinRef || getFunctionId(p) == leftjoinRef || getFunctionId(p) == semijoinRef)){
          /*
          * @-
          * Try to expand its argument list with what we have found so far.
          * This creates a series of join paths, many of which will be removed during deadcode elimination.
          */
          q= copyInstruction(p);
          q->argc=1;
          for(j=p->retc; j<p->argc; j++){
              r= getInstrPtr(mb,pc[getArg(p,j)]);
              /*
              * @-
              * Don't inject a pattern when it is used more than once.
              */
              if (r && varcnt[getArg(p,j)] > 1){
                  OPTDEBUGjoinPath {
                      mnstr_printf(cntxt->fdout,"#double use %d %d\n", getArg(p,j), varcnt[getArg(p,j)]);
                      printInstruction(cntxt->fdout,mb, 0, p, LIST_MAL_ALL);
                  }
                  r = 0;
              }
              OPTDEBUGjoinPath {
                  mnstr_printf(cntxt->fdout,"#expand list \n");
                  printInstruction(cntxt->fdout,mb, 0, p, LIST_MAL_ALL);
                  printInstruction(cntxt->fdout,mb, 0, q, LIST_MAL_ALL);
              }
              if ( getFunctionId(p) == joinRef){
                  if( r &&  getModuleId(r)== algebraRef && ( getFunctionId(r)== joinRef  || getFunctionId(r)== joinPathRef) ){
                      for(k= r->retc; k<r->argc; k++)
                          q = pushArgument(mb,q,getArg(r,k));
                  } else
                      q = pushArgument(mb,q,getArg(p,j));
              } else if ( getFunctionId(p) == leftjoinRef){
                  if( r &&  getModuleId(r)== algebraRef && ( getFunctionId(r)== leftjoinRef  || getFunctionId(r)== leftjoinPathRef) ){
                      for(k= r->retc; k<r->argc; k++)
                          q = pushArgument(mb,q,getArg(r,k));
                  } else
                      q = pushArgument(mb,q,getArg(p,j));
              } else if ( getFunctionId(p) == semijoinRef){
                  if( r &&  getModuleId(r)== algebraRef && ( getFunctionId(r)== semijoinRef  || getFunctionId(r)== semijoinPathRef) ){
                      for(k= r->retc; k<r->argc; k++)
                          q = pushArgument(mb,q,getArg(r,k));
                  } else
                      q = pushArgument(mb,q,getArg(p,j));
              }
          }
          OPTDEBUGjoinPath {
              chkTypes(cntxt->fdout, cntxt->nspace,mb,TRUE);
              mnstr_printf(cntxt->fdout,"#new [left]joinPath instruction\n");
              printInstruction(cntxt->fdout,mb, 0, q, LIST_MAL_ALL);
          }
          if(q->argc<= p->argc){
              /* no change */
              freeInstruction(q);
              goto wrapup;
          }
          /*
          * @-
          * Final type check and hardwire the result type, because that  can not be inferred directly from the signature
          */
          for(j=1; j<q->argc-1; j++)
              if( getTailType(getArgType(mb,q,j)) != getHeadType(getArgType(mb,q,j+1)) &&
              !( getTailType(getArgType(mb,q,j))== TYPE_oid  &&
              getHeadType(getArgType(mb,q,j))== TYPE_void) &&
              !( getTailType(getArgType(mb,q,j))== TYPE_void &&
              getHeadType(getArgType(mb,q,j))== TYPE_oid)){
              /* don't use it */
                  freeInstruction(q);
                  goto wrapup;
              }

          /* fix the type */
          setVarUDFtype(mb, getArg(q,0));
          setVarType(mb, getArg(q,0), newBatType( getHeadType(getArgType(mb,q,q->retc)), getTailType(getArgType(mb,q,q->argc-1))));
          if ( q->argc > 3  &&  getFunctionId(q) == joinRef)
              setFunctionId(q,joinPathRef);
          else if ( q->argc > 3  &&  getFunctionId(q) == leftjoinRef)
              setFunctionId(q,leftjoinPathRef);
          else if ( q->argc > 2  &&  getFunctionId(q) == semijoinRef)
              setFunctionId(q,semijoinPathRef);
          freeInstruction(p);
          p= q;
          actions++;
      }
  wrapup:
      pushInstruction(mb,p);
      for(j=0; j< p->retc; j++)
          pc[getArg(p,j)]= mb->stop-1;
  }
  for(; i<slimit; i++)
      if(old[i])
          freeInstruction(old[i]);
  /* perform the second phase, try out */
  if (actions )
      actions += OPTjoinSubPath(cntxt, mb);
  GDKfree(old);
  GDKfree(pc);
  if (varcnt ) GDKfree(varcnt);
  DEBUGoptimizers
      mnstr_printf(cntxt->fdout,"#opt_joinpath: %d statements glued\n",actions);
  return actions;
}

Comments