6 GRAMMARS

Figure 5showsthegrammaticalconstructionofaspecificsentence.Observethatthereare: (1) variousvariables,e.g.,(sentence),(nounphrase),···;

(2) variousterminalwords,e.g.,“The”,“boy,”···;

(3) abeginningvariable(sentence);

(4) varioussubstitutionsorproductions,e.g.

)sentence* → )nounphrase*)verbphrase*

)objectphrase* → )article*)noun*

)noun* → apple

Thefinalsentenceonlycontainsterminals,althoughbothvariablesandterminalsappearinitsconstructionbythe productions.Thisintuitivedescriptionisgiveninordertomotivatethefollowing definition ofagrammarand the languageitgenerates.

Fig. 5

Definition 4: Aphrasestructuregrammaror,simply,agrammarGconsistsoffourparts: (1) Afiniteset(vocabulary)V.

(2) AsubsetTofVwhoseelementsarecalledterminals;theelementsofN =V\T arecallednon-terminals

orvariables.

(3) AnonterminalsymbolScalledthestartsymbol.

(4) Afiniteset Pof productions.(Aproductionisan orderedpair(α,β),usuallywrittenα→β,whereαand

βarewordsinV, andtheproductionmustcontainatleastonenonterminalonitsleftsideα.)

SuchagrammarGisdenotedbyG=G(V,T,S,P)whenwewanttoindicateitsfourparts.

Thefollowingnotation,unlessotherwisestatedorimplied,willbeusedforourgrammars.Terminalswill bedenotedbyitaliclowercaseLatinletters,a,b,c,···,andnonterminalswillbedenotedbyitaliccapitalLatin letters,A,B,C,···,withSasthestartsymbol.Also,Greekletters,α,β,···,willdenotewordsinV,thatis,

wordsinterminalsandnonterminals.Furthermore,wewillwrite

α→(β1, β2,···,βk) insteadofα→β1,α→β2,···, α→βk

Remark:Frequently,wewilldefineagrammarGbyonlygivingitsproductions,assumingimplicitlythatS

isthestartsymbolandthattheterminalsandnonterminalsofGareonlythoseappearingintheproductions.

EXAMPLE 9 ThefollowingdefinesagrammarGwithSasthestartsymbol:

12345

V={A,B,S,a,b},T={a,b},P={S→AB,A→Aa,B→Bb,A→a,B→b}

Theproductionsmaybeabbreviatedasfollows:S→AB, A→(Aa,a),B→(Bb,b)

LanguageL(G)ofaGrammarG

Supposewandw arewordsoverthevocabularysetVofagrammarG.Wewrite

w⇒w

ifw canbeobtainedfromwbyusingoneoftheproductions;thatis,ifthereexistswordsuandvsuchthat

w=uαvandw =uβvandthereisaproductionα→β.Furthermore,wewrite

w⇒⇒worw ∗ w

ifw canbeobtainedfromwusingafinitenumberofproductions.

NowletGbeagrammarwithterminalsetT.ThelanguageofG,denotedbyL(G), consistsofallwordsin

TthatcanbeobtainedfromthestartsymbolSbytheaboveprocess;thatis,

L(G) ={w∈T∗|S⇒⇒w}

EXAMPLE 10 ConsiderthegrammarGinExample 9.Observethatw=a2b4 canbeobtainedfrom the startsymbolSasfollows:

S⇒AB ⇒AaB⇒aaB⇒aaBb⇒aaBbb⇒aaBbbb⇒aabbbb=a2b4

Hereweusedtheproductions1,2,4,3,3,3,5,respectively.ThuswecanwriteS⇒⇒a2b4.Hencew=a2b4

belongstoL(G). Moregenerally,theproductionsequence:

1,2(rtimes),4,3(stimes),5

willproducetheword w=arabsbwhererandsarenonnegativeintegers.Ontheother hand,nosequenceof productions canproduceanaafterab.Accordingly,

L(G) ={ambn|mandnarepositiveintegers}

That is,thelanguageL(G) ofthegrammarGconsistsofallwordswhichbegin with oneormorea’sfollowed by oneormoreb’s.

EXAMPLE 11 FindthelanguageL(G) over{a,b,c}generatedbythegrammarG:

S→aSb,aS→Aa,Aab→c

Firstwemustapplythefirstproductiononeormoretimestoobtainthewordw=anSbn wheren0. To eliminateS,wemustapplythesecondproductiontoobtainthewordw =amAabbm wherem=n−1≥0. Nowwecanonlyapplythethirdproductiontofinallyobtainthewordw =amcbm wherem≥0.Accordingly,

L(G) ={amcbm|mnonnegative}

Thatis,L(G) consistsofallwordswiththesamenonnegativenumberofa’sandb’sseparatedbya c.

TypesofGrammars

Grammarsareclassifiedaccordingtothekindsofproduction whichareallowed.Thefollowinggrammar classification isduetoNoamChomsky.

AType0grammarhasnorestrictionsonitsproductions.Types1,2,and3aredefinedasfollows:

(1) AgrammarGissaidtobeofType1ifeveryproduction isoftheformα→βwhere|α|≤|β|orofthe form α→l.

(2) AgrammarGissaidtobeofType2ifeveryproduction isoftheformA→βwheretheleftsideAisa nonterminal.

(3) AgrammarGissaidtobeofType3ifevery productionisoftheformA→aorA→aB,thatis,where theleftsideAisasinglenonterminalandtherightsideisasingleterminaloraterminalfollowedbya

nonterminal,oroftheformS→l.

Observethatthegrammarsformahierarchy;thatis,everyType3grammarisaType2grammar,everyType

2grammarisaType1grammar,andeveryType1grammarisaType0grammar.

Grammarsarealsoclassifiedintermsofcontext-sensitive,context-free,andregularasfollows. (a) AgrammarGissaidtobecontext-sensitiveiftheproductionsareoftheform

αAα →αβα

Thename“context-sensitive”comesfromthefactthatwecanreplacethevariableAbyβinawordonly when Aliesbetweenαandα.

(b) AgrammarGissaidtobecontext-freeiftheproductionsareoftheform

A→β

Thename“context-free”comesfromthefactthatwecannowreplacethevariableAbyβregardlessof where Aappears.

(c) AgrammarGissaidtoberegulariftheproductionsareoftheform

A→a,A→aB,S→l

Observethatacontext-freegrammaristhesameasaType2grammar, andaregular grammaristhesameasa

Type3grammar.

Afundamentalrelationshipbetweenregulargrammarsandfiniteautomatafollows.

Theorem 4: Alanguage LcanbegeneratedbyaType3(regular) grammar G,ifandonlyifthereexistsa finite automatonMwhichacceptsL.

ThusalanguageLisregulariffL=L(r)whererisaregularexpressioniffL=L(M)whereMisafinite automatoniffL=L(G)whereGisaregulargrammar.(Recallthat“iff”isanabbreviationfor“ifandonlyif.”)

EXAMPLE 12 ConsiderthelanguageL={anbn|n0}. (a) Findacontext-freegrammarGwhichgeneratesL.

ClearlythegrammarGwiththefollowingproductionswillgenerateL:

S→ab,S→aSb

NotethatGiscontext-free

(b) FindaregulargrammarGwhichgeneratesL.

ByExample 8,Lisnotaregularlanguage.ThusLcannotbegeneratedbyaregulargrammar.

DerivationTreesofContext-FreeGrammars

Consideracontext-free (Type2)grammarG. AnyderivationofawordwinL(G) canberepresented graphicallybymeansofanordered,rootedtreeT,calledaderivationtreeorparsetree. Weillustratesucha derivation treebelow.

LetGbethecontext-freegrammarwiththefollowingproductions:

S→aAB,A→Bba,B→bB,B→c

Thewordw=acbabccanbederivedfromSasfollows:

S⇒aAB⇒a(Bba)B⇒acbaB⇒acba(bB)⇒acbabc

OnecandrawaderivationtreeTofthewordwasindicatedbyFig. 6.Specifically,webeginwithSasthe rootandthenaddbranchestothetreeaccording totheproductionusedinthederivation ofw. Thisyieldsthe completedtreeTwhichisshowninFig. 6(e).Thesequence ofleavesfromlefttorightinTisthederived wordw.Also,anynon-leafinTisavariable,sayA,andtheimmediatesuccessors(children)ofAformaword

αwhereA→αistheproductionofGusedinthederivationofw.

Fig. 6

Backus-NaurForm

Thereisanothernotation,calledtheBackus-Naurform,whichissometimesusedfordescribingtheproduc- tions ofacontext-free(Type2)grammar.Specifically:

(i) “::=”isusedinsteadof“→.”

(ii) Everynonterminalisenclosedinbrackets)*.

(iii) All productionswiththesamenonterminalleft-handside arecombinedinto one statementwithallthe right-hand sideslistedontherightof::=separatedbyverticalbars.

Forinstance,theproductionsA→aB, A→b,A→BCarecombinedintotheonestatement:

)A* ::=a)B*|b|)B*)C*

MachinesandGrammars

Theorem 4tellsusthattheregularlanguagescorrespondtothefinitestateautomata(FSA).Therearealso machines, morepowerfulthantheFSA,whichcorrespondtotheothergrammars.

(a) PushdownAutomata:ApushdownautomatonPissimilartoaFSAexceptthatPhasanauxiliary stack whichprovidesanunlimitedamountofmemoryforP.AlanguageLisrecognizedbyapushdownautomaton PifandonlyifLiscontext-free.

(b) LinearBoundedAutomata:AlinearboundedautomatonBismorepowerfulthanapushdownautomaton.

Suchan automatonBusesatapewhichislinearlyboundedby the lengthof the inputwordw.Alanguage

LisrecognizedbyalinearboundedautomatonBifandonlyifLiscontext-sensitive.

(c) TuringMachine:ATuringmachineM,namedaftertheBritishmathematicianAlanTuring,usesaninfinite tape;itisabletorecognizeeverylanguageLthatcanbegeneratedbyanyphase-structuregrammarG. Infact,a TuringmachineMisoneofanumberofequivalentways todefinethenotionofa“computable” function.

Thediscussionofthepushdownautomataandthelinearboundedautomataliesbeyondthescopeofthistext. We willdiscussTuringmachinesinChapter13.