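Linghu Chong walked slowly toward him. The man was shaking all over; his knees buckled and he fell kneeling in the snow. Linghu Chong said angrily, "You insulted my junior sister, so I cannot spare you." With his sword pointed at the man's throat, a thought struck him. He stepped closer and asked in a low voice, "What were the words written on the snowman?"
The man answered in a trembling voice, "They... they were... 'Though the seas... the seas run dry... and the rocks crumble, our... love... love will not... not change.'" Ever since the words "Though the seas run dry and the rocks crumble, our love will not change" came into the world, this was surely the first time they had been spoken in such abject terror.
Linghu Chong was stunned for a moment. "Mm. Though the seas run dry and the rocks crumble, our love will not change." His heart aching, he thrust the sword forward into the man's throat.
From 《笑傲江湖》 (The Smiling, Proud Wanderer)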
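The fundamental reason semantic analysis is hard is that grammar is recursive: deep recursion makes decomposing the problem look quite complicated. If the recursive problem can be converted into an iterative one, however, the model becomes much simpler. The key to the conversion is to identify every characteristic of the innermost recursive structure and handle that structure iteratively; the rest then falls into place.
People usually do the same thing when they face a complex recursive problem: following the grammar rules, they find the innermost structure, resolve it, and repeat step by step until the problem is solved. Human thinking is good at turning recursion into iteration, and learning can be seen as coming to understand and master all kinds of grammar rules.
Recursion over unary and binary operators is easy to turn into iteration; operators with more operands are slightly more involved.
All operators and their precedence levels are shown in the following figure:
Operators such as typeof, address-of, and pointer dereference are not implemented here. What is implemented: arithmetic expressions, logical expressions, function calls, and parentheses. That is enough for understanding how the semantic analysis works.
For simple expressions that contain no parentheses or function calls, the analysis proceeds as follows: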
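As a rough illustration of this kind of reduction, the sketch below folds `operand op operand` triples into single nodes, one precedence level at a time, using only two binary levels. It is a minimal sketch and not the article's code: the names `PRECEDENCE_LEVELS` and `reduce_binary` are assumptions introduced here, and unary, ternary and comma handling are left out.

```python
# Minimal sketch (hypothetical names, not the article's implementation):
# iteratively reduce binary operators, highest precedence level first.
PRECEDENCE_LEVELS = [("*", "/", "%"), ("+", "-")]

def reduce_binary(tokens):
    """Fold `operand op operand` into one "@expr" node, in place."""
    for level in PRECEDENCE_LEVELS:
        i = 0
        while i < len(tokens):
            tok = tokens[i]
            if (tok[0] in level and 0 < i < len(tokens) - 1
                    and tokens[i - 1][0].startswith("@")
                    and tokens[i + 1][0].startswith("@")):
                node = ["@expr", tok[0], tokens[i - 1], tokens[i + 1]]
                tokens[i - 1:i + 2] = [node]   # three items become one node
                i = 0                          # rescan this level from the start
            else:
                i += 1
    return tokens

# 31 + 8 * 9
tokens = [["@num", "31"], ["+", None], ["@num", "8"], ["*", None], ["@num", "9"]]
reduce_binary(tokens)
print(tokens)
```

The `*` triple is folded first, so the remaining list is `31 + <node>`, which the second level then folds into a single tree.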
Our data structures:
'''
____________________________ Syntax Tree
Parenthesis:
    ["(",None]
    [")",None]
Operators(grouped by precedence):
    Unary :
        1   + - ! ~             ["+",None] ["-",None] ["!",None] ["~",None]
    Binary :
        2   * / %               ["*",None] ["/",None] ["%",None]
        3   + -                 ["+",None] ["-",None]
        4   << >>               ["<<",None] [">>",None]
        5   > >= < <=           [">",None] [">=",None] ["<",None] ["<=",None]
        6   == !=               ["==",None] ["!=",None]
        7   &                   ["&",None]
        8   ^                   ["^",None]
        9   |                   ["|",None]
        10  &&                  ["&&",None]
        11  ||                  ["||",None]
    Ternary :
        12  expr ? expr : expr  ["?",None] [":",None] ["@expr","?:",listPtr0,listPtr1,listPtr2]
        13  expr , expr , expr...
Var,Num,Expr,Function:
    ["@var","varName"]
    ["@num","num_string"]
    ["@expr","Operator",listPtr,...]
    ["@func","funcName",listPtr1,...]
    ["@expr_list",["@var"|"@num"|"@expr"|"@func",...],...]
'''
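To make the node formats concrete, here is a small sketch that renders a tree built from these nodes back into a fully parenthesized string. It is not part of the article's code; the function name `to_infix` is an assumption introduced for illustration.

```python
def to_infix(node):
    """Render a syntax-tree node (list format described above) as a string."""
    kind = node[0]
    if kind in ("@var", "@num"):
        return node[1]
    if kind == "@expr":
        op = node[1]
        args = [to_infix(child) for child in node[2:]]
        if op == "?:":
            return "(%s ? %s : %s)" % tuple(args)
        if len(args) == 1:                       # unary operator
            return "(%s%s)" % (op, args[0])
        return "(%s %s %s)" % (args[0], op, args[1])
    if kind == "@func":
        return "%s(%s)" % (node[1], ", ".join(to_infix(c) for c in node[2:]))
    if kind == "@expr_list":
        return ", ".join(to_infix(c) for c in node[1:])
    raise ValueError("unknown node kind: %r" % (kind,))

# 31 + 8 * 9 as a tree
tree = ["@expr", "+", ["@num", "31"], ["@expr", "*", ["@num", "8"], ["@num", "9"]]]
print(to_infix(tree))
```

Printing the tree this way makes the grouping chosen by the precedence rules visible at a glance.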
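This is the final module diagram of our code:
For functions named module_x_y, x is the precedence level of the operator and y is a horizontal index starting from zero. The code comments are fairly detailed, so please read the source: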
######################################## global list
OperatorList=['+','-','!','~',
              '*','/','%',
              '+','-',
              '<<','>>',
              '>','>=','<','<=',
              '==','!=',
              '&',
              '^',
              '|',
              '&&',
              '||',
              '?',':',   # note the comma: adjacent string literals would otherwise concatenate
              ',']
''' 31 + 8 * 9 '''
listToParse=[ ['@num','31'] , ['+',None] , ['@num','8'] , ['*',None] , ['@num','9'] ]

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# + =: ^+A... | ...Op+A...
def module_1_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ^+A...
    if i==0 and len(lis)>=2:
        if lis[right][0][0]=='@':
            rightPtr=lis[right]
            del lis[0:2]
            lis.insert(0,["@expr","+",rightPtr])
            return 0
    # process: ...Op+A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[left][0] in OperatorList:
            if lis[right][0][0]=='@':
                rightPtr=lis[right]
                del lis[i:i+2]
                lis.insert(i,["@expr","+",rightPtr])
                return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# - =: ^-A... | ...Op-A...
def module_1_1(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ^-A...
    if i==0 and len(lis)>=2:
        if lis[right][0][0]=='@':
            rightPtr=lis[right]
            del lis[0:2]
            lis.insert(0,["@expr","-",rightPtr])
            return 0
    # process: ...Op-A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[left][0] in OperatorList:
            if lis[right][0][0]=='@':
                rightPtr=lis[right]
                del lis[i:i+2]
                lis.insert(i,["@expr","-",rightPtr])
                return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# ! =: ...!A...
def module_1_2(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...!A...
    if len(lis)>=2 and right<len(lis):
        if lis[right][0][0]=='@':
            rightPtr=lis[right]
            del lis[i:i+2]
            lis.insert(i,["@expr","!",rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# ~ =: ...~A...
def module_1_3(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...~A...
    if len(lis)>=2 and right<len(lis):
        if lis[right][0][0]=='@':
            rightPtr=lis[right]
            del lis[i:i+2]
            lis.insert(i,["@expr","~",rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# * =: ...A*A...
def module_2_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A*A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","*",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# / =: ...A/A...
def module_2_1(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A/A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","/",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# % =: ...A%A...
def module_2_2(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A%A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","%",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# + =: ...A+A...
def module_3_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A+A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","+",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# - =: ...A-A...
def module_3_1(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A-A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","-",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# << =: ...A<<A...
def module_4_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A<<A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","<<",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# >> =: ...A>>A...
def module_4_1(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A>>A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr",">>",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# > =: ...A>A...
def module_5_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A>A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr",">",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# >= =: ...A>=A...
def module_5_1(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A>=A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr",">=",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# < =: ...A<A...
def module_5_2(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A<A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","<",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# <= =: ...A<=A...
def module_5_3(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A<=A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","<=",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# == =: ...A==A...
def module_6_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A==A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","==",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# != =: ...A!=A...
def module_6_1(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A!=A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","!=",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# & =: ...A&A...
def module_7_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A&A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","&",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# ^ =: ...A^A...
def module_8_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A^A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","^",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# | =: ...A|A...
def module_9_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A|A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","|",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# && =: ...A&&A...
def module_10_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A&&A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","&&",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# || =: ...A||A...
def module_11_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A||A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","||",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# ?: =: ...A?A:A...
#################             ^
def module_12_0(lis,i):
    # first, leftOp, left and right are all indexes :)
    first=i-3
    leftOp=i-2
    left=i-1
    right=i+1
    # process: ...A?A:A...
    #                ^
    if i>=3 and len(lis)>=5 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' and\
           lis[leftOp][0]=='?' and lis[first][0][0]=='@':
            firstPtr=lis[first]
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[first:first+5]
            lis.insert(first,["@expr","?:",firstPtr,leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# , =: A,A,...A,A
def module_13_0(lis,i):
    # process: A,A,...A,A
    if len(lis)==1 and lis[0][0][0]!='@':
        return 1
    if len(lis)==1 and lis[0][0][0]=='@':
        return 0
    if (len(lis)%2)==1 :
        # operands sit at even indexes, commas at odd indexes
        i=1
        if lis[0][0][0]!='@':
            return 1
        while i<len(lis):
            if lis[i+1][0][0]=='@' and lis[i][0]==',':
                i=i+2
            else:
                return 1
        # collect the operands into a single @expr_list node
        ls=[['@expr_list']]
        i=0
        while i<len(lis):
            ls[0].append(lis[i])
            i=i+2
        del lis[:]
        lis[:]=ls[:]
        return 0
    return 1
The code above is long, but it is actually the simplest part. Its size could be reduced considerably with a few techniques, but time was limited.
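One possible way to shrink the binary modules is sketched below, under the assumption that they differ only in the operator string. The names `make_binary_module`, `BINARY_LEVELS` and `binary_modules` are hypothetical and do not appear in the author's code; the return convention (0 for "reduced something", 1 for "did nothing") follows the listings above.

```python
def make_binary_module(op):
    """Build a function with the same contract as the binary module_x_y functions."""
    def module(lis, i):
        left, right = i - 1, i + 1
        # need an operand on each side of position i
        if i >= 1 and right < len(lis):
            if lis[left][0][0] == '@' and lis[right][0][0] == '@':
                node = ["@expr", op, lis[left], lis[right]]
                lis[left:left + 3] = [node]   # replace "A op A" with one node
                return 0                      # parsed some expression
        return 1                              # done nothing, no error
    return module

# precedence level -> binary operators on that level (levels 2..11 from the table above)
BINARY_LEVELS = {
    2: ('*', '/', '%'), 3: ('+', '-'), 4: ('<<', '>>'),
    5: ('>', '>=', '<', '<='), 6: ('==', '!='),
    7: ('&',), 8: ('^',), 9: ('|',), 10: ('&&',), 11: ('||',),
}
binary_modules = {pri: {op: make_binary_module(op) for op in ops}
                  for pri, ops in BINARY_LEVELS.items()}

lis = [['@num', '8'], ['*', None], ['@num', '9']]
status = binary_modules[2]['*'](lis, 1)
print(status, lis)
```

The unary, ternary and comma handlers have different shapes, so in this sketch they would still be written by hand.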
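Below is the semantic analysis for unary, binary, and ternary operators and for the comma separator. This is one of the core pieces of code in this article: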
######################################## global list
# construct a module dictionary
# module_dic_tuple[priority]['Operator'](lis,i)
module_dic_tuple=({}, { '+':module_1_0,'-':module_1_1,'!':module_1_2,'~':module_1_3 },
                  { '*':module_2_0,'/':module_2_1,'%':module_2_2 },
                  { '+':module_3_0,'-':module_3_1 },
                  { '<<':module_4_0,'>>':module_4_1 },
                  { '>':module_5_0,'>=':module_5_1,'<':module_5_2,'<=':module_5_3 },
                  { '==':module_6_0,'!=':module_6_1 },
                  { '&':module_7_0 },
                  { '^':module_8_0 },
                  { '|':module_9_0 },
                  { '&&':module_10_0 },
                  { '||':module_11_0 },
                  { '?:':module_12_0 },
                  { ',':module_13_0 } )

# one-element groups need a trailing comma to be tuples rather than plain strings
operator_priority_tuple=( () , ('+', '-', '!', '~') , ('*','/','%'),
                          ('+','-'),('<<','>>'),
                          ('>','>=','<','<='),('==','!='),
                          ('&',),('^',),('|',),('&&',),('||',),('?',':'),(',',) )

############################# parse:unary,binary,ternary,comma expr
########### return value :
#############  0  parsed successfully
#############  1  syntax error
def parse_simple_expr(lis):
    if len(lis)==0:
        return 1
    #if lis[len(lis)-1][0][0]!='@':
    #    return 1
    #if lis[0][0][0]!='@' and lis[0][0] not in ('+','-','!','~'):
    #    return 1
    for pri in range(1,12): # pri 1,2,3,4,5,6,7,8,9,10,11
        i=0
        while 1:
            if len(lis)==1 and lis[0][0][0]=='@':
                return 0
            if i>=len(lis):
                break
            if lis[i][0] in operator_priority_tuple[pri]:
                if module_dic_tuple[pri][lis[i][0]](lis,i)==0:
                    # something was reduced: rescan this level from the start
                    i=0
                    continue
                else:
                    i=i+1
                    continue
            else:
                i=i+1
    for pri in range(12,13): # pri 12 # parse ...A?A:A...
        i=0
        while 1:
            if len(lis)==1 and lis[0][0][0]=='@':
                return 0
            if i>=len(lis):
                break
            if lis[i][0]==':':
                if module_dic_tuple[pri]['?:'](lis,i)==0:
                    i=0
                    continue
                else:
                    i=i+1
                    continue
            else:
                i=i+1
    # pri 13: whatever is left must be a comma-separated list
    return module_dic_tuple[13][','](lis,0)
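The code above uses a tuple of dictionaries of function references, indexed by precedence level, to keep this part short.
This part is not demonstrated separately here; the procedure is similar to the one described in the earlier article 《一个简单的语义分析算法:单步算法——Python实现》 (A Simple Semantic Analysis Algorithm: The Single-Step Algorithm, Python Implementation).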
Once parse_simple_expr is in place, the remaining analysis of functions and parentheses becomes simpler. The procedure is as follows:
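The key step in this procedure is locating an innermost pair of parentheses: the first ")" in the list together with the nearest "(" before it. The following is a minimal single-pass sketch of that step. It is an alternative formulation for illustration, not the author's function; the name `find_innermost_parens` and the `None` return convention are assumptions.

```python
def find_innermost_parens(lis):
    """Return (open_index, close_index) of the first innermost pair, or None."""
    open_index = None
    for idx, tok in enumerate(lis):
        if tok[0] == "(":
            open_index = idx              # remember the most recent "("
        elif tok[0] == ")":
            if open_index is None:
                return None               # ")" with no "(" before it
            return (open_index, idx)      # nothing but operands/operators between
    return None                           # no ")" in the list

# f ( 1 , 2 ) - 3
tokens = [['@var', 'f'], ['(', None], ['@num', '1'], [',', None], ['@num', '2'],
          [')', None], ['-', None], ['@num', '3']]
print(find_innermost_parens(tokens))
```

Because the span between the two indexes contains no parentheses, it can be handed directly to a routine for simple expressions, and the result substituted back into the list.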
Implementation:
########### return value :[intStatusCode,indexOf'(',indexOf')']
############# intStatusCode
#############  0  successfully
#############  1  no parenthesis matched
#############  2  list is null :(
def module_parenthesis_place(lis):
    length=len(lis)
    err=0
    x=0
    y=0
    if length==0:
        return [2,None,None]
    # x: index of the first ")"
    try:
        x=lis.index([")",None])
    except ValueError:
        err=1
    # y: index of the nearest "(" before it, found by searching the reversed list
    lis.reverse()
    try:
        y=lis.index(["(",None],length-x-1)
    except ValueError:
        err=1
    lis.reverse()
    y=length-y-1
    if err==1:
        return [1,None,None]
    else:
        return [0,y,x]


############################# parse:unary binary ternary parenthesis function expr
########### return value :
#############  0  parsed successfully
#############  1  syntax error
############################# find first ')'
def parse_comp_expr(lis):
    while 1:
        if len(lis)==0:
            return 1
        if len(lis)==1:
            if lis[0][0][0]=='@':
                return 0
            else:
                return 1
        place=module_parenthesis_place(lis)
        if place[0]==0:
            # parse the contents of the innermost parentheses on a copy
            mirror=lis[(place[1]+1):place[2]]
            if parse_simple_expr(mirror)==0:
                if place[1]>=1 and lis[place[1]-1][0]=='@var':
                    '''func'''
                    funcName=lis[place[1]-1][1]
                    del lis[place[1]-1:(place[2]+1)]
                    lis.insert(place[1]-1,["@func",funcName,mirror[0]])
                else:
                    del lis[place[1]:(place[2]+1)]
                    lis.insert(place[1],mirror[0])
            else:
                return 1
        else:
            return parse_simple_expr(lis)
That is the end of the code.
The experimental result is shown below:
>>> ls=[['(',None],['@var','f'],['(',None],['@num','1'],[',',None],['@num','2'],[',',None],['@num','3'],[',',None],['!',None],['-',None],['@var','x'],['?',None],['@var','y'],[':',None],['~',None],['@var','z'],[')',None],['-',None],['@num','3'],[')',None],['/',None],['@num','4']]
>>> ls
[['(', None], ['@var', 'f'], ['(', None], ['@num', '1'], [',', None], ['@num', '2'], [',', None], ['@num', '3'], [',', None], ['!', None], ['-', None], ['@var', 'x'], ['?', None], ['@var', 'y'], [':', None], ['~', None], ['@var', 'z'], [')', None], ['-', None], ['@num', '3'], [')', None], ['/', None], ['@num', '4']]
>>> len(ls)
23
>>> parse_comp_expr(ls);ls
0
[['@expr', '/', ['@expr', '-', ['@func', 'f', ['@expr_list', ['@num', '1'], ['@num', '2'], ['@num', '3'], ['@expr', '?:', ['@expr', '!', ['@expr', '-', ['@var', 'x']]], ['@var', 'y'], ['@expr', '~', ['@var', 'z']]]]], ['@num', '3']], ['@num', '4']]]
>>> len(ls)
1
>>>
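The token list in this experiment was typed by hand; it corresponds to the expression `(f(1,2,3,!-x?y:~z)-3)/4`. As an aid for trying other inputs, here is a rough sketch of a tokenizer that produces the same list format. It is not part of the article: `TOKEN_RE` and `tokenize` are hypothetical names, and the sketch assumes that numbers are plain digit strings and that identifiers consist of letters, digits and underscores.

```python
import re

# number | identifier | operator or parenthesis (two-character operators first)
TOKEN_RE = re.compile(
    r"\s*(?:(\d+)|([A-Za-z_]\w*)|(<<|>>|>=|<=|==|!=|&&|\|\||[-+!~*/%<>&^|?:,()]))")

def tokenize(text):
    """Turn an expression string into the [tag, value] token list used above."""
    tokens = []
    text = text.rstrip()
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError("unexpected character near position %d" % pos)
        num, name, op = m.groups()
        if num is not None:
            tokens.append(['@num', num])
        elif name is not None:
            tokens.append(['@var', name])
        else:
            tokens.append([op, None])
        pos = m.end()
    return tokens

print(tokenize("(f(1,2,3,!-x?y:~z)-3)/4"))
```

A function name such as `f` is emitted as `@var` here, which matches how the experiment's input marks it before the parser turns it into an `@func` node.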
Appendix:
The complete source code for this article:
'''
____________________________Syntax & Syntax Tree
Parenthesis:
    ["(",None]
    [")",None]
Operators(grouped by precedence):
    Unary :
        1   + - ! ~             ["+",None] ["-",None] ["!",None] ["~",None]
    Binary :
        2   * / %               ["*",None] ["/",None] ["%",None]
        3   + -                 ["+",None] ["-",None]
        4   << >>               ["<<",None] [">>",None]
        5   > >= < <=           [">",None] [">=",None] ["<",None] ["<=",None]
        6   == !=               ["==",None] ["!=",None]
        7   &                   ["&",None]
        8   ^                   ["^",None]
        9   |                   ["|",None]
        10  &&                  ["&&",None]
        11  ||                  ["||",None]
    Ternary :
        12  expr ? expr : expr  ["?",None] [":",None] ["@expr","?:",listPtr0,listPtr1,listPtr2]
        13  expr , expr , expr...
Var,Num,Expr,Function:
    ["@var","varName"]
    ["@num","num_string"]
    ["@expr","Operator",listPtr,...]
    ["@func","funcName",listPtr1,...]
    ["@expr_list",["@var"|"@num"|"@expr"|"@func",...],...]
'''

######################################## global list
OperatorList=['+','-','!','~',
              '*','/','%',
              '+','-',
              '<<','>>',
              '>','>=','<','<=',
              '==','!=',
              '&',
              '^',
              '|',
              '&&',
              '||',
              '?',':',   # note the comma: adjacent string literals would otherwise concatenate
              ',']
''' 31 + 8 * 9 '''
listToParse=[ ['@num','31'] , ['+',None] , ['@num','8'] , ['*',None] , ['@num','9'] ]

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# + =: ^+A... | ...Op+A...
def module_1_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ^+A...
    if i==0 and len(lis)>=2:
        if lis[right][0][0]=='@':
            rightPtr=lis[right]
            del lis[0:2]
            lis.insert(0,["@expr","+",rightPtr])
            return 0
    # process: ...Op+A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[left][0] in OperatorList:
            if lis[right][0][0]=='@':
                rightPtr=lis[right]
                del lis[i:i+2]
                lis.insert(i,["@expr","+",rightPtr])
                return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# - =: ^-A... | ...Op-A...
def module_1_1(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ^-A...
    if i==0 and len(lis)>=2:
        if lis[right][0][0]=='@':
            rightPtr=lis[right]
            del lis[0:2]
            lis.insert(0,["@expr","-",rightPtr])
            return 0
    # process: ...Op-A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[left][0] in OperatorList:
            if lis[right][0][0]=='@':
                rightPtr=lis[right]
                del lis[i:i+2]
                lis.insert(i,["@expr","-",rightPtr])
                return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# ! =: ...!A...
def module_1_2(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...!A...
    if len(lis)>=2 and right<len(lis):
        if lis[right][0][0]=='@':
            rightPtr=lis[right]
            del lis[i:i+2]
            lis.insert(i,["@expr","!",rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# ~ =: ...~A...
def module_1_3(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...~A...
    if len(lis)>=2 and right<len(lis):
        if lis[right][0][0]=='@':
            rightPtr=lis[right]
            del lis[i:i+2]
            lis.insert(i,["@expr","~",rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# * =: ...A*A...
def module_2_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A*A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","*",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# / =: ...A/A...
def module_2_1(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A/A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","/",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# % =: ...A%A...
def module_2_2(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A%A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","%",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# + =: ...A+A...
def module_3_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A+A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","+",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# - =: ...A-A...
def module_3_1(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A-A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","-",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# << =: ...A<<A...
def module_4_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A<<A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","<<",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# >> =: ...A>>A...
def module_4_1(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A>>A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr",">>",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# > =: ...A>A...
def module_5_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A>A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr",">",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# >= =: ...A>=A...
def module_5_1(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A>=A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr",">=",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# < =: ...A<A...
def module_5_2(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A<A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","<",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# <= =: ...A<=A...
def module_5_3(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A<=A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","<=",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# == =: ...A==A...
def module_6_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A==A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","==",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# != =: ...A!=A...
def module_6_1(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A!=A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","!=",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# & =: ...A&A...
def module_7_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A&A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","&",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# ^ =: ...A^A...
def module_8_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A^A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","^",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
#############  0  parsed some expressions
#############  1  done nothing but no errors happened
################# | =: ...A|A...
def module_9_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A|A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","|",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
############# 0 parsed some expressions
############# 1 did nothing, but no error occurred
################# && =: ...A&&A...
def module_10_0(lis,i):

    # left, i and right are all indexes into lis
    left=i-1
    right=i+1

    # process: ...A&&A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","&&",leftPtr,rightPtr])
            return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 did nothing, but no error occurred
################# || =: ...A||A...
def module_11_0(lis,i):

    # left, i and right are all indexes into lis
    left=i-1
    right=i+1

    # process: ...A||A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","||",leftPtr,rightPtr])
            return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 did nothing, but no error occurred
################# ?: =: ...A?A:A...
#################             ^
def module_12_0(lis,i):

    # first, leftOp, left, i and right are all indexes into lis; i points at ':'
    first=i-3
    leftOp=i-2
    left=i-1
    right=i+1

    # process: ...A?A:A...
    #                ^
    if i>=3 and len(lis)>=5 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' and\
           lis[leftOp][0]=='?' and lis[first][0][0]=='@':
            firstPtr=lis[first]
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[first:first+5]
            lis.insert(first,["@expr","?:",firstPtr,leftPtr,rightPtr])
            return 0

    return 1

########### return value :
############# 0 parsed some expressions
############# 1 did nothing, but no error occurred
################# , =: A,A,...A,A
def module_13_0(lis,i):

    # process: A,A,...A,A
    if len(lis)==1 and lis[0][0][0]!='@':
        return 1
    if len(lis)==1 and lis[0][0][0]=='@':
        return 0
    # a valid comma list has an odd length: operands at even indexes, ',' at odd ones
    if (len(lis)%2)==1 :
        i=1
        if lis[0][0][0]!='@':
            return 1
        while i<len(lis):
            if lis[i+1][0][0]=='@' and lis[i][0]==',':
                i=i+2
            else:
                return 1
        # collect the operands and replace the whole list with one @expr_list node
        ls=[['@expr_list']]
        i=0
        while i<len(lis):
            ls[0].append(lis[i])
            i=i+2
        del lis[:]
        lis[:]=ls[:]
        return 0
    return 1

######################################## global list
# construct a module dictionary
# module_dic_tuple[priority]['Operator'](lis,i)
module_dic_tuple=({}, { '+':module_1_0,'-':module_1_1,'!':module_1_2,'~':module_1_3 },\
                  { '*':module_2_0,'/':module_2_1,'%':module_2_2 }, \
                  { '+':module_3_0,'-':module_3_1 },\
                  { '<<':module_4_0,'>>':module_4_1 },\
                  { '>':module_5_0,'>=':module_5_1,'<':module_5_2,'<=':module_5_3 },\
                  { '==':module_6_0,'!=':module_6_1 },\
                  { '&':module_7_0 },\
                  { '^':module_8_0 },\
                  { '|':module_9_0 },\
                  { '&&':module_10_0 },\
                  { '||':module_11_0 },\
                  { '?:':module_12_0 },\
                  { ',':module_13_0 } )

# one-element tuples need a trailing comma: ('&&') is just the string '&&',
# so "'&' in ('&&')" would be a substring test and wrongly succeed
operator_priority_tuple=( () , ('+', '-', '!', '~') , ('*','/','%'),\
                          ('+','-'),('<<','>>'),\
                          ('>','>=','<','<='),('==','!='),\
                          ('&',),('^',),('|',),('&&',),('||',),('?',':'),(',',) )

############################# parse:unary,binary,ternary,comma expr
########### return value :
############# 0 parsed successfully
############# 1 syntax error
def parse_simple_expr(lis):
    if len(lis)==0:
        return 1
    #if lis[len(lis)-1][0][0]!='@':
    #    return 1
    #if lis[0][0][0]!='@' and lis[0][0] not in ('+','-','!','~'):
    #    return 1
    for pri in range(1,12): # pri 1,2,3,4,5,6,7,8,9,10,11
        i=0
        while 1:
            if len(lis)==1 and lis[0][0][0]=='@':
                return 0
            if i>=len(lis):
                break
            if lis[i][0] in operator_priority_tuple[pri]:
                if module_dic_tuple[pri][lis[i][0]](lis,i)==0:
                    # the list changed, so rescan it from the start
                    i=0
                    continue
                else:
                    i=i+1
                    continue
            else:
                i=i+1
    for pri in range(12,13): # pri 12 # parse ...A?A:A...
        i=0
        while 1:
            if len(lis)==1 and lis[0][0][0]=='@':
                return 0
            if i>=len(lis):
                break
            if lis[i][0]==':':
                if module_dic_tuple[pri]['?:'](lis,i)==0:
                    i=0
                    continue
                else:
                    i=i+1
                    continue
            else:
                i=i+1
    # pri 13: whatever is left must be a comma-separated list
    return module_dic_tuple[13][','](lis,0)

########### return value :[intStatusCode,indexOf'(',indexOf')']
############# intStatusCode
############# 0 successful
############# 1 no matching parentheses found
############# 2 list is empty :(
def module_parenthesis_place(lis):
    length=len(lis)
    err=0
    x=0
    y=0
    if length==0:
        return [2,None,None]
    # x: index of the first ')'
    try:
        x=lis.index([")",None])
    except ValueError:
        err=1
    # search the reversed list to find the nearest '(' to the left of that ')'
    lis.reverse()
    try:
        y=lis.index(["(",None],length-x-1)
    except ValueError:
        err=1
    lis.reverse()
    y=length-y-1
    if err==1:
        return [1,None,None]
    else:
        return [0,y,x]


############################# parse:unary binary ternary parenthesis function expr
########### return value :
############# 0 parsed successfully
############# 1 syntax error
############################# find first ')'
def parse_comp_expr(lis):
    while 1:
        if len(lis)==0:
            return 1
        if len(lis)==1:
            if lis[0][0][0]=='@':
                return 0
            else:
                return 1
        place=module_parenthesis_place(lis)
        if place[0]==0:
            # parse a copy of the tokens between the innermost '(' and ')'
            mirror=lis[(place[1]+1):place[2]]
            if parse_simple_expr(mirror)==0:
                if place[1]>=1 and lis[place[1]-1][0]=='@var':
                    # function call: a variable name directly before '('
                    funcName=lis[place[1]-1][1]
                    del lis[place[1]-1:(place[2]+1)]
                    lis.insert(place[1]-1,["@func",funcName,mirror[0]])
                else:
                    del lis[place[1]:(place[2]+1)]
                    lis.insert(place[1],mirror[0])
            else:
                return 1
        else:
            return parse_simple_expr(lis)
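The binary modules in the listing (module_6_0 through module_11_0) all follow the same pattern: check that both neighbours of the operator are already-parsed `@` nodes, then fold the three items into one `@expr` node. As a minimal sketch of that shared step, the code below uses a hypothetical helper `reduce_binary` and a hypothetical driver `parse_binary` (neither name is in the original listing), and for brevity it covers only two binary precedence levels.

```python
# Hypothetical helper, not part of the original listing: one function
# standing in for the copy-pasted binary module_x_y functions.
def reduce_binary(lis, i, op):
    """Fold lis[i-1], lis[i], lis[i+1] into ["@expr", op, left, right].

    Returns 0 if a reduction happened and 1 if nothing was done,
    mirroring the return codes used by the module_x_y functions.
    """
    left, right = i - 1, i + 1
    if i >= 1 and right < len(lis):
        if lis[left][0][0] == '@' and lis[right][0][0] == '@':
            # The right-hand side is built first, then replaces the 3 items.
            lis[left:right + 1] = [["@expr", op, lis[left], lis[right]]]
            return 0
    return 1


# Assumed subset for the demo: two binary levels, highest precedence first.
LEVELS = (('*', '/', '%'), ('+', '-'))


def parse_binary(lis):
    """Scan the token list once per precedence level, restarting after each fold."""
    for ops in LEVELS:
        i = 0
        while i < len(lis):
            if lis[i][0] in ops and reduce_binary(lis, i, lis[i][0]) == 0:
                i = 0      # the list shrank, so rescan from the start
            else:
                i += 1
    return 0 if len(lis) == 1 and lis[0][0][0] == '@' else 1


# 31 + 8 * 9
tokens = [['@num', '31'], ['+', None], ['@num', '8'], ['*', None], ['@num', '9']]
status = parse_binary(tokens)
```

After the call, `tokens` holds a single `@expr` node for `+` whose right child is the `@expr` node for `8 * 9`, which is how precedence shows up in the tree. Keeping separate module functions, as the listing does, matches the module diagram; the shared helper is only meant to make the common step easier to see.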
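module_parenthesis_place locates the innermost pair of parentheses by finding the first `)` and then the nearest `(` to its left, which it does by reversing the list twice. The sketch below shows the same search without mutating the list; `innermost_parens` is a hypothetical name and it returns a tuple or `None` rather than the status-code list used in the listing.

```python
# Hypothetical alternative to module_parenthesis_place: same search,
# but the token list is never reversed or otherwise modified.
def innermost_parens(lis):
    """Return (open_index, close_index) of the first innermost pair, or None."""
    try:
        close = lis.index([")", None])          # first ')' in the list
    except ValueError:
        return None
    for open_ in range(close - 1, -1, -1):      # walk left to the nearest '('
        if lis[open_] == ["(", None]:
            return (open_, close)
    return None


# ( a + ( b ) )
tokens = [["(", None], ["@var", "a"], ["+", None],
          ["(", None], ["@var", "b"], [")", None], [")", None]]
pair = innermost_parens(tokens)
```

Because the first `)` can only close the nearest `(` before it, no nested parentheses can sit between the two returned indexes, so the tokens between them can be handed to parse_simple_expr directly.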
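Because analyzing a tree by hand becomes very time-consuming once its structure gets even slightly complex, we will next develop a simple tool that displays the tree structures in the code graphically.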