6 GRAMMARS
Figure 5showsthegrammaticalconstructionofaspecificsentence.Observethatthereare: (1) variousvariables,e.g.,(sentence),(nounphrase),···;
(2) variousterminalwords,e.g.,“The”,“boy,”···;
(3) abeginningvariable(sentence);
(4) varioussubstitutionsorproductions,e.g.
)sentence* → )nounphrase*)verbphrase*
)objectphrase* → )article*)noun*
)noun* → apple
Thefinalsentenceonlycontainsterminals,althoughbothvariablesandterminalsappearinitsconstructionbythe productions.Thisintuitivedescriptionisgiveninordertomotivatethefollowing definition ofagrammarand the languageitgenerates.
Fig. 5
Definition 4: Aphrasestructuregrammaror,simply,agrammarGconsistsoffourparts: (1) Afiniteset(vocabulary)V.
(2) AsubsetTofVwhoseelementsarecalledterminals;theelementsofN =V\T arecallednon-terminals
orvariables.
(3) AnonterminalsymbolScalledthestartsymbol.
(4) Afiniteset Pof productions.(Aproductionisan orderedpair(α,β),usuallywrittenα→β,whereαand
βarewordsinV, andtheproductionmustcontainatleastonenonterminalonitsleftsideα.)
SuchagrammarGisdenotedbyG=G(V,T,S,P)whenwewanttoindicateitsfourparts.
Thefollowingnotation,unlessotherwisestatedorimplied,willbeusedforourgrammars.Terminalswill bedenotedbyitaliclowercaseLatinletters,a,b,c,···,andnonterminalswillbedenotedbyitaliccapitalLatin letters,A,B,C,···,withSasthestartsymbol.Also,Greekletters,α,β,···,willdenotewordsinV,thatis,
wordsinterminalsandnonterminals.Furthermore,wewillwrite
α→(β1, β2,···,βk) insteadofα→β1,α→β2,···, α→βk
Remark:Frequently,wewilldefineagrammarGbyonlygivingitsproductions,assumingimplicitlythatS
isthestartsymbolandthattheterminalsandnonterminalsofGareonlythoseappearingintheproductions.
EXAMPLE 9 ThefollowingdefinesagrammarGwithSasthestartsymbol:
12345
V={A,B,S,a,b},T={a,b},P={S→AB,A→Aa,B→Bb,A→a,B→b}
Theproductionsmaybeabbreviatedasfollows:S→AB, A→(Aa,a),B→(Bb,b)
LanguageL(G)ofaGrammarG
Supposewandw arewordsoverthevocabularysetVofagrammarG.Wewrite
w⇒w
ifw canbeobtainedfromwbyusingoneoftheproductions;thatis,ifthereexistswordsuandvsuchthat
w=uαvandw =uβvandthereisaproductionα→β.Furthermore,wewrite
w⇒⇒worw ∗ w
ifw canbeobtainedfromwusingafinitenumberofproductions.
NowletGbeagrammarwithterminalsetT.ThelanguageofG,denotedbyL(G), consistsofallwordsin
TthatcanbeobtainedfromthestartsymbolSbytheaboveprocess;thatis,
L(G) ={w∈T∗|S⇒⇒w}
EXAMPLE 10 ConsiderthegrammarGinExample 9.Observethatw=a2b4 canbeobtainedfrom the startsymbolSasfollows:
S⇒AB ⇒AaB⇒aaB⇒aaBb⇒aaBbb⇒aaBbbb⇒aabbbb=a2b4
Hereweusedtheproductions1,2,4,3,3,3,5,respectively.ThuswecanwriteS⇒⇒a2b4.Hencew=a2b4
belongstoL(G). Moregenerally,theproductionsequence:
1,2(rtimes),4,3(stimes),5
willproducetheword w=arabsbwhererandsarenonnegativeintegers.Ontheother hand,nosequenceof productions canproduceanaafterab.Accordingly,
L(G) ={ambn|mandnarepositiveintegers}
That is,thelanguageL(G) ofthegrammarGconsistsofallwordswhichbegin with oneormorea’sfollowed by oneormoreb’s.
EXAMPLE 11 FindthelanguageL(G) over{a,b,c}generatedbythegrammarG:
S→aSb,aS→Aa,Aab→c
Firstwemustapplythefirstproductiononeormoretimestoobtainthewordw=anSbn wheren0. To eliminateS,wemustapplythesecondproductiontoobtainthewordw =amAabbm wherem=n−1≥0. Nowwecanonlyapplythethirdproductiontofinallyobtainthewordw =amcbm wherem≥0.Accordingly,
L(G) ={amcbm|mnonnegative}
Thatis,L(G) consistsofallwordswiththesamenonnegativenumberofa’sandb’sseparatedbya c.
TypesofGrammars
Grammarsareclassifiedaccordingtothekindsofproduction whichareallowed.Thefollowinggrammar classification isduetoNoamChomsky.
AType0grammarhasnorestrictionsonitsproductions.Types1,2,and3aredefinedasfollows:
(1) AgrammarGissaidtobeofType1ifeveryproduction isoftheformα→βwhere|α|≤|β|orofthe form α→l.
(2) AgrammarGissaidtobeofType2ifeveryproduction isoftheformA→βwheretheleftsideAisa nonterminal.
(3) AgrammarGissaidtobeofType3ifevery productionisoftheformA→aorA→aB,thatis,where theleftsideAisasinglenonterminalandtherightsideisasingleterminaloraterminalfollowedbya
nonterminal,oroftheformS→l.
Observethatthegrammarsformahierarchy;thatis,everyType3grammarisaType2grammar,everyType
2grammarisaType1grammar,andeveryType1grammarisaType0grammar.
Grammarsarealsoclassifiedintermsofcontext-sensitive,context-free,andregularasfollows. (a) AgrammarGissaidtobecontext-sensitiveiftheproductionsareoftheform
αAα →αβα
Thename“context-sensitive”comesfromthefactthatwecanreplacethevariableAbyβinawordonly when Aliesbetweenαandα.
(b) AgrammarGissaidtobecontext-freeiftheproductionsareoftheform
A→β
Thename“context-free”comesfromthefactthatwecannowreplacethevariableAbyβregardlessof where Aappears.
(c) AgrammarGissaidtoberegulariftheproductionsareoftheform
A→a,A→aB,S→l
Observethatacontext-freegrammaristhesameasaType2grammar, andaregular grammaristhesameasa
Type3grammar.
Afundamentalrelationshipbetweenregulargrammarsandfiniteautomatafollows.
Theorem 4: Alanguage LcanbegeneratedbyaType3(regular) grammar G,ifandonlyifthereexistsa finite automatonMwhichacceptsL.
ThusalanguageLisregulariffL=L(r)whererisaregularexpressioniffL=L(M)whereMisafinite automatoniffL=L(G)whereGisaregulargrammar.(Recallthat“iff”isanabbreviationfor“ifandonlyif.”)
EXAMPLE 12 ConsiderthelanguageL={anbn|n0}. (a) Findacontext-freegrammarGwhichgeneratesL.
ClearlythegrammarGwiththefollowingproductionswillgenerateL:
S→ab,S→aSb
NotethatGiscontext-free
(b) FindaregulargrammarGwhichgeneratesL.
ByExample 8,Lisnotaregularlanguage.ThusLcannotbegeneratedbyaregulargrammar.
DerivationTreesofContext-FreeGrammars
Consideracontext-free (Type2)grammarG. AnyderivationofawordwinL(G) canberepresented graphicallybymeansofanordered,rootedtreeT,calledaderivationtreeorparsetree. Weillustratesucha derivation treebelow.
LetGbethecontext-freegrammarwiththefollowingproductions:
S→aAB,A→Bba,B→bB,B→c
Thewordw=acbabccanbederivedfromSasfollows:
S⇒aAB⇒a(Bba)B⇒acbaB⇒acba(bB)⇒acbabc
OnecandrawaderivationtreeTofthewordwasindicatedbyFig. 6.Specifically,webeginwithSasthe rootandthenaddbranchestothetreeaccording totheproductionusedinthederivation ofw. Thisyieldsthe completedtreeTwhichisshowninFig. 6(e).Thesequence ofleavesfromlefttorightinTisthederived wordw.Also,anynon-leafinTisavariable,sayA,andtheimmediatesuccessors(children)ofAformaword
αwhereA→αistheproductionofGusedinthederivationofw.
Fig. 6
Backus-NaurForm
Thereisanothernotation,calledtheBackus-Naurform,whichissometimesusedfordescribingtheproduc- tions ofacontext-free(Type2)grammar.Specifically:
(i) “::=”isusedinsteadof“→.”
(ii) Everynonterminalisenclosedinbrackets)*.
(iii) All productionswiththesamenonterminalleft-handside arecombinedinto one statementwithallthe right-hand sideslistedontherightof::=separatedbyverticalbars.
Forinstance,theproductionsA→aB, A→b,A→BCarecombinedintotheonestatement:
)A* ::=a)B*|b|)B*)C*
MachinesandGrammars
Theorem 4tellsusthattheregularlanguagescorrespondtothefinitestateautomata(FSA).Therearealso machines, morepowerfulthantheFSA,whichcorrespondtotheothergrammars.
(a) PushdownAutomata:ApushdownautomatonPissimilartoaFSAexceptthatPhasanauxiliary stack whichprovidesanunlimitedamountofmemoryforP.AlanguageLisrecognizedbyapushdownautomaton PifandonlyifLiscontext-free.
(b) LinearBoundedAutomata:AlinearboundedautomatonBismorepowerfulthanapushdownautomaton.
Suchan automatonBusesatapewhichislinearlyboundedby the lengthof the inputwordw.Alanguage
LisrecognizedbyalinearboundedautomatonBifandonlyifLiscontext-sensitive.
(c) TuringMachine:ATuringmachineM,namedaftertheBritishmathematicianAlanTuring,usesaninfinite tape;itisabletorecognizeeverylanguageLthatcanbegeneratedbyanyphase-structuregrammarG. Infact,a TuringmachineMisoneofanumberofequivalentways todefinethenotionofa“computable” function.
Thediscussionofthepushdownautomataandthelinearboundedautomataliesbeyondthescopeofthistext. We willdiscussTuringmachinesinChapter13.