Malcolm Hayward

                                                       Analysis of a Corpus of Poetry

                                              by a Connectionist Model of Poetic Meter

 

Abstract.  A corpus of 1000 lines of poetry (a 100-line sample from each of ten authors) is analyzed by a computerized connectionist model of poetic meter.  The analysis finds that poets utilize measurably distinct patterns of stress and suggests that these patterns might "fingerprint" individual writers.  In addition, the analysis shows that the variations of metrical patterns are in accord with the prevailing verse aesthetics of the period in which poets are writing.

           

Introduction

            In English poetry, the single most compelling discriminator of that genre--that which defines a poem as a poem--has traditionally been its meter.  Meter defines the length of the line, and thus the distinctive look of a poem on the page, and it sets, for the hearer of a poem, the telling regularity of a rhythm.  Whether this rhythm also carries the burden of some of a poem's meaning or whether it is used only for a conventional aesthetic effect that invites the reader to take pleasure in its regularity or variations, meter is one of the central attributes of the genre of poetry.

            While the meter of a poem may or may not be strongly attended to by the poem's audience, or its critics, metrics has always been a matter of substantial concern for poets (see Addison, 1994).  At each point in a line of poetry one factor in the decision favoring one word or syntactic pattern over another has been the metrical impact of that choice.  Moreover, the limits of choice are not merely defined by a correctness rule such as the following: All stressed positions must have stressed syllables and no unstressed positions may have a stressed syllable.  Metrical variations, resulting in what Halle and Keyser (1971), and others, have termed "metrical complexity" or "tension," are allowable and, in fact, produce much of the interest in a poem's rhythm.  Traugott (1989), for example, speaking of Auden's poetry, notes that "a complex metrical design can . . . be identified that complements and enriches the multifarious verbal icons functioning at other levels of the language" (p. 294).  In fact, poetic rhythm may only work when it destroys that very sense of design that it invokes; the extreme position is taken by Shklovsky (1917), who says, "the problem is not one of complicating the rhythm, but of disordering of the rhythm" (p. 24)--a disordering which cannot be predicted.  Such variations might not merely exist to enrich the iconic functions of language, but might create, in Eichenbaum's (1926) terms, an "independent significance" (p. 110)--independent of any meaning within the verse.

            In the past, research on metrical practice and theory has given much attention to the nature of these variations.  Traditional metrics and generative metrics alike have sought to describe and define the types of variation that are allowable or possible before a passage is described as "unmetrical" and to explore the effects of such variations on interpretations of the poem.  In general, traditional metrics has been more productive in the latter task, ascribing certain patterns of meaning or effects to metrical variation, while generative metrics has had better success in developing a theoretical basis for understanding what types of variation are possible and what syntactic and lexical constraints may be operating to limit a poet's choice of variant rhythms.  There are, of course, a number of examples of generative theories being used productively to analyze meaning patterns, most notably the work of Tarlinskaja (1984, 1987a, 1987b, 1989).  The more usual view is, however, stated by Attridge (1982), in speaking of the affective functions of rhythm: "As always, the semantic properties take the lead, and may or may not be reinforced or modified by the formal properties" (p. 297).  This is certainly the position of traditional and musical systems.  To cite, for example, what has been this century's most widely read text on verse, Cleanth Brooks and Robert Penn Warren's Understanding Poetry (1976), "The sense of vitality that we find in good verse arises from a tension between the tug of the metrical pattern toward a flat uniformity on the one hand and, on the other, the special stress on certain words that is demanded by the rhetorical pattern.  The abstract pattern of the meter sets up certain expectancies as to where the stress is to fall, but the expressive importance of this or that particular word forces us to modify or even to violate the pattern" (p. 503).
Thus each system, the traditional and the linguistic-generative, has had to develop a theoretical basis for describing and dealing with metrical variation, whether in terms of "allowability" (what metrical variations are permissible), or in interpretive terms (what does a variation mean, if anything).  The interplay of these and other theories has been analyzed interestingly by Cureton (1992; 1993). 

            Some areas of metricality have, however, not been as fully explored as they might be, and these areas may offer other options for approaching issues which have been long contested and are not yet resolved.  For example, it might be hypothesized that the types of variation a poet will introduce into his or her lines of poetry will not be random.  Rather, the poet will choose, or develop, or fall into certain regular ways of treating variations.  To some extent this pattern or regularity of variation--if it exists--will lie, it might again be hypothesized, within the prevailing verse aesthetic of the period, or at least within the aesthetic of the poet's stylistic preference.  At the same time, however, the poet's metrical choices should show a distinctiveness and individuality.  Moreover, even in cases in which the verse is regular, a stress falling in a stress position, for example, the amount of stress given to a syllable and the difference between the amount of stress on a syllable and the amount on the preceding syllable will create a characteristic metrical style or sound for the poet.  Such issues have always been accessible to careful readers attuned to, for example, the regular cadences of Pope's neo-classical lines, or the rough and ragged rhythms of Robert Browning.  Systematic comparisons and analyses have, however, been hampered by the lack of an appropriate methodology and system for describing quantitatively the metrical features that result in a poet's characteristic style.

            Such an analytical system must meet several criteria.  First, the system should take account of variations and of the positions in which variations occur.  Second, it should be finely attuned to the amount of stress at each position, rather than merely assigning a single unit of stress to a syllable.  Third, it should consider stress as a performance quality rather than a theoretical possibility.  Finally, the system must be sensitive to the context of individual variations.  An analytic system should regard the shape of the whole line, and not just the metrical foot.  Moreover, the system should consider the place of the line within larger verse units of the poem.  Stress may be used in performance to highlight words which express key images or meanings or which rhyme or alliterate with other words in the line or neighboring lines.  A system which meets these criteria should provide a useful way to explore both individual styles and verse aesthetics generally.

            The connectionist model of poetic meter was developed to meet the above criteria and, in particular, to account for the interaction of intonation, lexical stress, prosodic devices, syntactic patterns, and interpretive emphases in the production of the stress pattern that is achieved in a line of poetry (Hayward, 1991), especially in its delivery (see Jakobson, 1960).  The model has been used to analyze certain problems in generative metrics which have seemed elusive, such as measures of metricality (Hayward, 1996).  One of the primary intentions in the development of this model, however, was to provide a quantitative basis for comparisons between poets.  This paper describes the analysis of a corpus of poetry from ten authors representing different periods.  Nine of the authors are British; one is American.  Two are from the Renaissance (Jonson and Donne).  Two represent neo-classical verse (Prior and Pope).  Three are from the Romantic period (Wordsworth, Coleridge, and Keats).  Two writers are Victorians (Tennyson and Browning).  Finally, one twentieth-century American writer was chosen (Frost).  The selections were made to create a sample that was representative of a range of styles, but which included within it some pairings of writers who would seem, at least on the surface, to share identifiable qualities (such as Prior and Pope or Wordsworth and Coleridge).

            One goal for this analysis was to see whether it would be possible to differentiate among the metrical patterns developed by individual writers.  It would be useful, for example, to see whether this model could address the question of whether poets develop distinctive metrical patterns.  A second goal was to explore stylistic distinctions among periods.  Neoclassical poetry of the eighteenth century is generally categorized as regular--see, for example, Fussell (1954).  But what does "regular" mean?  Fewer variations than other verse?  Or variations only in certain places--a kind of regular irregularity?  Or does regularity indicate that the variations are not as extreme as for other writers, either in their number or in the amount of stress associated with the variation?  This study begins an analysis of such issues as these, which have seldom been based upon quantitative methods.

            For the purposes of this paper, I have framed the following hypotheses which will be tested by the model:

            1.  Authors may have unique stress patterns and the pattern may serve as a "fingerprint" for the author.

            2.  Some writers may show a greater degree of variation or regularity than other writers.

            3.  A plot of patterns of metrical activation may show systematic patterns of variation among writers in accord with the prevailing verse aesthetics of the period in which they are writing.

Method

            The connectionist model is based upon the parallel distributed processing (PDP) models of McClelland and Rumelhart (1988), in particular, the constraint satisfaction model.  In this model, metrical stress is viewed as the activation of a likelihood of stress, though as Attridge (1982) points out, stress may be realized in performance by a number of different means.  I have chosen to analyze iambic pentameter poetry (the most widely used verse form in English), which has ten positions (usually individual syllables) at which a varying amount of stress may be placed.

            In this computerized model, each of these ten positions is connected to five other units, representing possible inputs towards stress from intonation, lexical features, prosody, syntax, and interpretation.  Within the model, possible inputs (representing an increased likelihood of activation) from each of the five units are set at 0.0, 0.1, or 0.2 (for the sake of convenience, I will now treat these as whole numbers, 0, 1, and 2).  A point of rising intonation, for example, is assigned a value of 1.  At the lexical level, the stressed syllable of a bisyllable is given a value of 1 (the other syllable and all monosyllabic words receive a 0 value), while the syllables in words of more than two syllables receive a value of 2 for primary stress, 1 for secondary stress, or 0 for tertiary stress.  Prosodic features are coded by 1 for each syllable which shares an alliteration or assonance with another syllable and 1 for each case of rhyme.  The grammatical structure of a line is encoded by the assignment of 2 for the subject, active verb, or object of a verb, 1 for the object of a preposition, the subject, verb, or object within a subordinate clause, and so on.  The interpretation of the significance of a word within a poem or play is the most subjective element of this analysis; for example, a particularly compelling metaphor or image that seems to lie at the heart of the meaning of the poem might receive 2.  A word contributing to a recurrent theme might receive 1. 

            For example, consider the following line from Wordsworth's Ode, "Intimations of Immortality":

            The clouds that gather round the setting sun  (l. 195)

My coding for each of the ten syllables is as follows:

            Intonation      0000000100

            Lexical         0001000100

            Prosodic        0100010101

            Syntactic       0201000001

            Interpretive    0100000100

This coding represents a rise in intonation at "setting," stress on the first syllables of "gather" and "setting," assonance between "clouds" and "round" and alliteration between "setting" and "sun," the function of "clouds" as the subject of a sentence, with "gather" and "sun" acting as verb in a dependent clause and object of a preposition, and an interpretation of the line focussing on the "clouds" and the fact that the sun is "setting," in contrast to the bright days described in the preceding lines. 
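            As an illustration only, the coding above might be held in a small data structure and totalled by position.  The following Python sketch is not the program used in this study: the names CODING and total_input are hypothetical, and simply summing the five inputs at each position is an assumption made for the sake of the example.  The whole-number codes are converted back to the 0.0, 0.1, 0.2 scale described earlier.

```python
# Illustrative sketch only; the names here are hypothetical, not from the original program.
# Each string holds one digit (0, 1, or 2) per syllable position of the line
# "The clouds that gather round the setting sun".
CODING = {
    "intonation":   "0000000100",
    "lexical":      "0001000100",
    "prosodic":     "0100010101",
    "syntactic":    "0201000001",
    "interpretive": "0100000100",
}

def total_input(coding):
    """Sum the five coded inputs at each position (assumed, for illustration)."""
    n_positions = len(next(iter(coding.values())))
    totals = []
    for i in range(n_positions):
        digits = sum(int(row[i]) for row in coding.values())
        totals.append(digits / 10)  # whole-number codes stand for tenths
    return totals

print(total_input(CODING))
# prints [0.0, 0.4, 0.0, 0.2, 0.0, 0.1, 0.0, 0.4, 0.0, 0.2]
```

On this tally the heaviest inputs fall on "clouds" and the first syllable of "setting," the two positions singled out in the coding.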

            Each position representing metrical stress is also connected to its neighboring positions, with a negative weight to decrease the stress on adjacent syllables.  Finally, a bias for stress on even numbered syllables is built into the system.  After inputs for all connected units are assigned, the system is sent through a series of 30 cycles in which inputs and activations from and to each of the sixty nodes are measured and averaged.  What is finally achieved is a measurement of the potential activation of metrical stress for each of the ten positions for that particular line of poetry.  For example, the stress activation levels reached when these inputs are given to the program for the line from Wordsworth are:

  0.000  0.821  0.000  0.698  0.000  0.543  0.000  0.884  0.000  0.660 

Stress falls in the normal iambic positions in this line, although there is some variation in the amount of stress achieved, with the strongest stress on "clouds" (.821) and "set" (.884).  Further discussion of the model and some of the specifics of its design may be found in Hayward (1991).
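            The settling process described above can be sketched in a few lines of Python.  The sketch is an illustration in the general style of a constraint satisfaction network, not the program used in this study.  The inhibitory weight, the even-syllable bias, the update rate, the synchronous update, and the use of summed inputs in tenths are all hypothetical choices, so its output will not reproduce the activation levels reported above.

```python
# Illustrative sketch only: the weights, bias, and rate below are hypothetical,
# not the settings of the original program.
INHIBITION = -0.2   # assumed negative weight between neighboring positions
EVEN_BIAS = 0.1     # assumed bias toward stress on even-numbered syllables
RATE = 0.4          # assumed update strength per cycle
CYCLES = 30         # number of cycles reported in the text

def settle(external, cycles=CYCLES):
    """Relax the stress positions toward a stable pattern of activation."""
    n = len(external)
    act = [0.0] * n
    for _ in range(cycles):
        new_act = []
        for i in range(n):
            net = external[i]
            if (i + 1) % 2 == 0:          # even-numbered syllable
                net += EVEN_BIAS
            if i > 0:                     # inhibition from the left neighbor
                net += INHIBITION * act[i - 1]
            if i < n - 1:                 # inhibition from the right neighbor
                net += INHIBITION * act[i + 1]
            if net > 0:                   # push activation toward 1
                a = act[i] + RATE * net * (1.0 - act[i])
            else:                         # push activation toward 0
                a = act[i] + RATE * net * act[i]
            new_act.append(min(1.0, max(0.0, a)))
        act = new_act
    return act

# Summed inputs for the Wordsworth line, in tenths (an assumption of this sketch).
wordsworth = [0.0, 0.4, 0.0, 0.2, 0.0, 0.1, 0.0, 0.4, 0.0, 0.2]
print([round(a, 3) for a in settle(wordsworth)])
```

Even in this simplified form, the positions with the heaviest inputs settle at the highest activations, while positions with no input and inhibiting neighbors remain at zero.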

            To test the three hypotheses framed above, I chose a corpus of 1000 lines of iambic pentameter poetry, 100 lines from each of the 10 poets, representing a range of periods and styles.  The passages selected were at least 100 lines long; that is, each 100-line sample formed an integral, 100-line section of a poem.  I attempted to choose poems of the same type: all poems contained extended meditative and descriptive passages and all included at least some dramatic content.  These two criteria were chosen to minimize the possible effects of genre on the meter and the possibility that a poet's style may change radically over time.  A later study will explore metrical differences among genres and stylistic developments over time of individual poets.

            After inputs for all lines of poetry were determined, each line was sent through thirty cycles to achieve a final level of stress.  For example, the stress activation generated by the computer for the first five lines of the Pope selection ("Epistle to Dr. Arbuthnot") is as follows (at the thirtieth cycle):

            Shut, shut the door, good John!  (fatigued, I said),

            Tie up the knocker, say I'm sick, I'm dead.

            The Dog Star rages!  nay, 'tis past a doubt

            All Bedlam, or Parnassus, is let out:

            Fire in each eye, and papers in each hand . . .  (ll. 1-5)

0.895 0.749 0.129 0.705 0.000 0.709 0.049 0.747 0.000 0.503

0.633 0.570 0.002 0.895 0.000 0.402 0.559 0.867 0.698 0.861

0.000 0.871 0.374 0.922 0.127 0.721 0.000 0.239 0.000 0.780

0.000 0.826 0.309 0.122 0.085 0.869 0.344 0.041 0.689 0.662

0.676 0.143 0.222 0.760 0.000 0.933 0.000 0.132 0.325 0.653

The numbers range from 0, no activation, to a maximum of .999, indicating a total activation of stress.  While Pope is known for his regular meter, there is much variation afoot here.  Lines 1, 2, and 5 all show instances of poetic inversion--the first syllable of the foot is more likely to be stressed than the second syllable.  Another unusual feature occurs in lines 2 and 4, which close with spondees.  Line 4 is, in fact, highly irregular, with weak stresses on the fourth and eighth syllables (normally stressed positions).
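            The observations about these five lines can be checked mechanically.  The following Python sketch is an illustration only: the function names are hypothetical, and the cutoff of 0.5 for calling a syllable "stressed" is an assumption, not a value taken from the model.

```python
# Illustrative sketch only; the threshold and function names are hypothetical.
# Activations for the first five lines of the Pope selection, as given above.
POPE = [
    [0.895, 0.749, 0.129, 0.705, 0.000, 0.709, 0.049, 0.747, 0.000, 0.503],
    [0.633, 0.570, 0.002, 0.895, 0.000, 0.402, 0.559, 0.867, 0.698, 0.861],
    [0.000, 0.871, 0.374, 0.922, 0.127, 0.721, 0.000, 0.239, 0.000, 0.780],
    [0.000, 0.826, 0.309, 0.122, 0.085, 0.869, 0.344, 0.041, 0.689, 0.662],
    [0.676, 0.143, 0.222, 0.760, 0.000, 0.933, 0.000, 0.132, 0.325, 0.653],
]
STRONG = 0.5  # assumed cutoff for calling a syllable "stressed"

def feet(line):
    """Split ten activations into five (first, second) syllable pairs."""
    return [(line[i], line[i + 1]) for i in range(0, len(line), 2)]

def first_foot_inverted(line):
    """True when the first syllable of the line outweighs the second."""
    first, second = feet(line)[0]
    return first > second

def closes_with_spondee(line, strong=STRONG):
    """True when both syllables of the final foot exceed the cutoff."""
    first, second = feet(line)[-1]
    return first > strong and second > strong

inverted = [n for n, line in enumerate(POPE, start=1) if first_foot_inverted(line)]
spondees = [n for n, line in enumerate(POPE, start=1) if closes_with_spondee(line)]
print(inverted, spondees)
# prints [1, 2, 5] [2, 4]
```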

 

Results

            Once the activations for all positions in all lines were computed, a number of statistical tests were performed to explore the hypotheses mentioned above.  The statistical package used for the analysis was SPSS. 

 

            1.  The first hypothesis to be tested concerns the degree to which poets are unique in their patterns of stress activation.  Means and standard deviations of stress activation were computed for each poet at each syllable position.  The results of these analyses are printed in Table 1.  A multivariate repeated measures analysis for the total group found statistically significant differences among the ten poets (multivariate criterion Pillai's Trace: F(81, 8937)=2.962, p<.001).  A multivariate repeated measures analysis was also performed for all poets for each individual syllable.  Again, significant differences were found (multivariate criterion Pillai's Trace: F(9, 994)=1613.784, p<.001).  Taken as a group, then, the ten poets employ significantly different patterns of stress.

 

                                                        [TABLE 1 ABOUT HERE.]

The means of the stress activation levels appear fairly close for all poets, though there are some marked differences; one might note the emphasis found in Pope's fourth syllable (.74) or Prior's (.70), compared to that of Frost (.57), while Frost actually places a fair amount of stress on his third syllables (.19) compared to Pope (.09) or Jonson (.07).  Poets also differ from one another in their willingness to adopt relatively varied (or stable) amounts of stress at different positions in the line.  The table shows that patterns of variability are highly idiosyncratic: some--for example, Pope and Tennyson--introduce a fair amount of variability in the first syllable, while others--such as Donne and Wordsworth--tend to vary the stress on the second and sixth syllables more than the other poets do.

            A second way to consider the uniqueness of a poet's style is to look at the difference between the activation of the first and second syllables, the third and fourth, and so on.  Here I am, of course, adopting the traditional metrical measure, the poetic foot.  My analysis does not measure metricality in a traditional or generative sense, but rather looks for the amount of stress by which the two syllables in a foot differ.  This measure of "iambic difference" was computed and again a multivariate repeated measures analysis was performed for the total group.  Here too differences were significant (multivariate criterion Pillai's Trace: F(4, 999)=66.954, p<.001).  The same test was performed by poet; again differences were significant (multivariate criterion Pillai's Trace: F(4, 990)=67.855, p<.001).  The means and standard deviations of iambic difference are reported in Table 2.  In general, poets with a higher level of iambic difference might be characterized as having a more regular meter; there will be a more accentuated rhythm apparent in the lines.  Poets with lower means (and higher standard deviations) characteristically display a more irregular meter.
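            A minimal Python sketch of the iambic difference for a single line follows.  It is an illustration only: the function name is hypothetical, and taking the signed difference (second syllable minus first, so that an inverted foot comes out negative) is an assumption, since the measure could equally be taken as an absolute value.

```python
# Illustrative sketch only; the function name and the signed difference are assumptions.
def iambic_difference(line):
    """Second-syllable activation minus first-syllable activation, per foot."""
    return [round(line[i + 1] - line[i], 3) for i in range(0, len(line), 2)]

# Activations reported earlier for the Wordsworth line and for line 4 of the Pope passage.
wordsworth = [0.000, 0.821, 0.000, 0.698, 0.000, 0.543, 0.000, 0.884, 0.000, 0.660]
pope_line_4 = [0.000, 0.826, 0.309, 0.122, 0.085, 0.869, 0.344, 0.041, 0.689, 0.662]

print(iambic_difference(wordsworth))
# prints [0.821, 0.698, 0.543, 0.884, 0.66]
print(iambic_difference(pope_line_4))
# prints [0.826, -0.187, 0.784, -0.303, -0.027]
```

The regular Wordsworth line shows a large positive difference in every foot, while the irregular Pope line shows negative differences in its second, fourth, and fifth feet.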

                                                        [TABLE 2 ABOUT HERE.]

For example, Matthew Prior shows the highest degree of iambic difference within almost every foot, reflecting his highly cadenced rhythm, compared to Frost's relatively flat, almost prosaic line.  Tennyson's meters are interesting, showing a great deal of variability in the first, third, and fourth feet, compared to the other poets. 

            2.  The question of iambic difference brings up the second issue to be addressed, the measurement of the degree of variability or regularity of each individual poet's meter.  To create this measurement, I first computed the mean stress activation at each of the ten positions in the poet's lines (that is, the average stress activation for the first syllable, the second syllable, and so on).  I then computed the squared deviation of each syllable's stress activation from the mean for its position.  I then summed these squared deviations by line, and took the square root of that sum.  This created a measure of variance for each line of each poet.  Finally, I took the mean of those variances to create a measure of average variance for each poet.  Table 3 presents the means of the stress activation of all syllables for all poets and the mean line variance.
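            The steps just described can be sketched in Python as follows.  The sketch is an illustration only: the function name is hypothetical, and the five Pope lines quoted earlier stand in for a full 100-line sample, so the result is not comparable to the figures in Table 3.

```python
import math

# Illustrative sketch only; the function name is hypothetical.
def mean_line_variance(lines):
    """Return (mean of the per-line measures, list of per-line measures)."""
    n_lines = len(lines)
    n_pos = len(lines[0])
    # Step 1: mean activation at each syllable position.
    pos_means = [sum(line[p] for line in lines) / n_lines for p in range(n_pos)]
    per_line = []
    for line in lines:
        # Steps 2 and 3: squared deviations from the positional means, summed by line.
        squared = sum((line[p] - pos_means[p]) ** 2 for p in range(n_pos))
        # Step 4: square root of that sum.
        per_line.append(math.sqrt(squared))
    # Step 5: mean of the per-line measures.
    return sum(per_line) / n_lines, per_line

# First five lines of the Pope selection, as given earlier (toy sample only).
POPE = [
    [0.895, 0.749, 0.129, 0.705, 0.000, 0.709, 0.049, 0.747, 0.000, 0.503],
    [0.633, 0.570, 0.002, 0.895, 0.000, 0.402, 0.559, 0.867, 0.698, 0.861],
    [0.000, 0.871, 0.374, 0.922, 0.127, 0.721, 0.000, 0.239, 0.000, 0.780],
    [0.000, 0.826, 0.309, 0.122, 0.085, 0.869, 0.344, 0.041, 0.689, 0.662],
    [0.676, 0.143, 0.222, 0.760, 0.000, 0.933, 0.000, 0.132, 0.325, 0.653],
]
average, by_line = mean_line_variance(POPE)
print(round(average, 3), [round(v, 3) for v in by_line])
```

Because the square root is taken line by line, each per-line figure is in effect the distance of that line from the poet's average line.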

                                                        [TABLE 3 ABOUT HERE.]

There is not much that is surprising here, perhaps.  As noted previously, the mean stress activation remains fairly consistent for most poets, with the exception of Ben Jonson, who writes a relatively less stressed line than the other poets.  The means of the line variations do, however, allow conclusions to be drawn that are in line with standard metrical analyses.  Prior, Jonson, and Pope prove to be relatively regular, with a lower variance from their established patterns.  Browning and Tennyson use a fair amount of variation in their rhythms--Browning particularly so.  Wordsworth, Coleridge, Donne, Keats, and Frost form a kind of middle group.

            3.  The previous analysis has implications for the relation between patterns of stress activation and the prevailing verse aesthetic of a period, the issue of the third hypothesis.  In the neo-classical poetry of Jonson, Prior, and Pope one expects a certain regularity of rhythm, or "smooth numbers," as it was termed.  This regularity is borne out by the analysis of the results in Table 3.  I also approached the issue by grouping the standard deviations (by syllable) of three types of poets: neoclassic, represented by Jonson, Prior, and Pope; romantic, including Wordsworth, Coleridge, and Keats; and Victorian, comprising Browning and Tennyson.  Differences in the prevailing aesthetic norms are evident in Table 4.

                                                        [TABLE 4 ABOUT HERE.]

Victorian poets show a greater degree of irregularity at each point in the line except syllables 2 and 6. Neo-classical poets are far more regular, especially at the ends of their lines.

 

Discussion

            In some ways, the results of this analysis do more to confirm expectations concerning metricality than to open surprising new vistas: a poem by Frost, after all, "sounds" different than a poem by Browning; an experienced reader could certainly distinguish a couplet by Pope from one by Wordsworth.  The analysis does, however, allow the assignment of a typical pattern to individual poets--at least to the extent that these selections are representative of the poets.  And it does verify that verse aesthetics, such as a neoclassical emphasis on smoothness in numbers, is quantifiable.  But the analysis does not account for the reasons that poets closely connected in time and aesthetic goals, such as Wordsworth and Coleridge, show differences.  Prosody is closely allied to syntactic patterns, to lexical choice, and to aesthetic criteria.  These criteria are exceedingly complex and are related to the poet's own sense of form and rhythm, to the needs of the particular poem at each point in the poem, and, as the analysis suggests, to the prevailing verse aesthetic as embodied in the poet's work.  Moreover, syntax and lexical choice are also influenced by the same considerations as the aesthetic criteria.  As Tarlinskaja (1984) points out, syntax and lexicon are to a large extent generically determined; poetry is different than prose, and perhaps, given its emphasis on form, far more determined, far less free, than prose.  Perhaps the determining influence of the poetic genre, particularly the meter, which forces the poet, in the act of creation, to foreground strictly formal considerations, produces a tendency in poets to develop highly individual styles.  Poets are not, however, entirely free in this.  There are limits to metricality, after all, even for Wordsworth and Frost, for whom effective verse was to approach the sounds of well-measured prose or normal speech. 

            The issue of limits leads back to the question of what constitutes regularity.  While all poets introduce metrical variations, the average activations of stress at each point in the line are strikingly consistent.  What differentiates poets in terms of regularity is the number and range of variations that might be allowed at different points in the line.  Matthew Prior often inverts the normal stress in the first two syllables, yet is far less likely than the other poets to vary the concluding four syllables.  The first metrical foot is, however, the most common place for all poets in which a weak syllable may find a strong stress.  Prior might then be termed the most "regular" of the poets because his metrical variations are found in normal places (a regular irregularity) and because his lines overall show fewer variations than those of other poets.

            In summary, the computerized connectionist model of poetic meter was successful in determining significant differences among the ten poets analyzed.  Moreover, as expected, the analysis highlighted differences among poets working with different aesthetic standards.  To the degree that poets of a particular period work from the same aesthetic principles, it was also possible to describe a "typical" verse pattern.  To an extent these findings also confirm the connectionist model's ability to identify and analyze significant features of iambic pentameter poetry. 

            The findings point toward directions for further research, such as in attribution studies.  As poets do have distinctive stress patterns, it should be possible to compare samples of disputed authorship with known samples as at least one indication of the probability of authorship.  Again, the model points a way to analyze stylistic influences between poets.  And it may provide a means for analyzing the development or changes in verse sophistication that occur in one writer over time.
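            As one hedged illustration of how such an attribution comparison might begin, the following Python sketch matches the mean stress profile of a sample against the per-poet means in Table 1.  It is not a method used in this study: the function names are hypothetical, the use of Euclidean distance is an assumption, only three poets are included, and a realistic comparison would also need to take the standard deviations and the sample size into account.

```python
import math

# Illustrative sketch only; names and the distance measure are assumptions.
# Mean activation by syllable position, copied from Table 1 (three poets only).
PROFILES = {
    "AP": [.25, .60, .09, .74, .14, .60, .12, .63, .10, .80],
    "RF": [.22, .52, .19, .57, .14, .61, .12, .59, .14, .71],
    "WW": [.15, .58, .13, .65, .16, .61, .15, .63, .13, .71],
}

def positional_means(lines):
    """Mean activation at each syllable position across a sample of lines."""
    n = len(lines)
    return [sum(line[p] for line in lines) / n for p in range(len(lines[0]))]

def nearest_profile(sample_means, profiles=PROFILES):
    """Return the poet whose mean profile is closest in Euclidean distance."""
    return min(profiles, key=lambda poet: math.dist(sample_means, profiles[poet]))

# A single line is far too small a sample; it is used here only to show the call.
sample = [[0.000, 0.821, 0.000, 0.698, 0.000, 0.543, 0.000, 0.884, 0.000, 0.660]]
print(nearest_profile(positional_means(sample)))
```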


                                                                 WORKS CITED

Addison, C. (1994).  Once upon a time: A reader-response approach to prosody.  College English, 56, 655-687.

Attridge, D.  (1982).  The rhythms of English poetry.  London: Longman. 

Brooks, C., and Warren, R. P. (1976).  Understanding poetry.  4th ed.; New York: Holt, Rinehart and Winston.

Cureton, R. D. (1992).  Rhythmic phrasing in English verse.  London: Longman.

---.  (1993).  Aspects of verse study.  Style, 27, 521-529.

Eichenbaum, B. (1926).  The theory of the "Formal Method."  In Russian formalist criticism: Four essays.  Trans. L. T. Lemon & M. J. Reis.  (pp. 99-139).  Lincoln: U of Nebraska Press.

Fussell, P.  (1954).  Theory of prosody in eighteenth-century England.  Connecticut College Monograph No. 5; reprint Archon Books, 1966.

Halle, M., and Keyser, S. J.  (1971).  English stress: Its form, its growth, and its role in verse.  New York: Harper and Row.

Hayward, M.  (1991).  A connectionist model of poetic meter.  Poetics, 20, 303-317.

Hayward, M.  (1996).  Application of a connectionist model of poetic meter to problems in generative metrics.  Research in Humanities Computing 4.  (pp. 185-192).  Oxford: Clarendon P.

Jakobson, R.  (1960).  Closing statement: Linguistics and poetics.  In Sebeok, T. A. (Ed.).  Style in language.  (pp. 350-377).  Cambridge, MA: MIT Press.

McClelland, J. L., & Rumelhart, D. E.  (1988).  Explorations in parallel distributed processing: A handbook of models, programs, and exercises.  Cambridge: MIT Press.

Shklovsky, V.  (1917).  Art as technique.  In Russian formalist criticism: Four essays.  Trans. L. T. Lemon & M. J. Reis.  (pp. 3-24).  Lincoln: U of Nebraska Press.

Tarlinskaja, M.  (1984).  Rhythm-Morphology-Syntax-Rhythm.  Style, 18, 1-26.

---.  (1987a).  Rhythm and meaning.  Style, 21, 1-35.

---.  (1987b).  Meter and language: Binary and ternary meters in English and Russian.  Style, 21, 626-649.

---.  (1989).  Meter and Meaning: Semantic associations of the English 'Dolnik' verse form.  Style, 23, 238-260.

Traugott, E. C.  (1989).  Meter in Auden's "Streams."  In Kiparsky, P. & G. Youmans (Eds.).  Phonetics and Phonology: Volume 1, Rhythm and Meter.  (pp. 291-304).  San Diego: Academic Press.


Table 1:

 

Mean Activation Levels of Each Syllable

 

Poet    s1     s2     s3     s4     s5     s6     s7     s8     s9     s10

AP      .25    .60    .09    .74    .14    .60    .12    .63    .10    .80
AT      .25    .54    .15    .60    .17    .64    .15    .56    .14    .69
BJ      .13    .57    .07    .65    .06    .56    .06    .64    .07    .72
JD      .15    .57    .11    .63    .14    .60    .14    .69    .13    .72
JK      .20    .60    .13    .64    .14    .63    .12    .63    .13    .70
MP      .18    .58    .14    .70    .12    .61    .11    .71    .12    .81
RB      .25    .53    .18    .62    .17    .64    .16    .60    .20    .67
RF      .22    .52    .19    .57    .14    .61    .12    .59    .14    .71
SC      .22    .58    .17    .67    .14    .58    .16    .64    .15    .74
WW      .15    .58    .13    .65    .16    .61    .15    .63    .13    .71

 

Standard Deviation of Activation for Each Syllable

 

Poet

s1

s2

s3

s4

s5

s6

s7

s8

s9

s10

AP

.30